Detection of Bias in Machine Learning Models for Predicting Deaths Caused by COVID-19

Authors

  • Fatimatus Zachra, Universitas Muhammadiyah Malang, Indonesia
  • Setio Basuki, Universitas Muhammadiyah Malang, Indonesia

DOI:

https://doi.org/10.31961/eltikom.v8i1.1081

Keywords:

bias, COVID-19, DALEX, machine learning, protected attributes

Abstract

The COVID-19 pandemic has significantly impacted global health, resulting in numerous fatalities and presenting substantial challenges to national healthcare systems due to a sharp increase in cases. Key to managing this crisis is the rapid and accurate identification of COVID-19 infections, a task that can be enhanced with Machine Learning (ML) techniques. However, ML applications can also generate biased and potentially unfair outcomes for certain demographic groups. This paper introduces an ML model designed to detect both COVID-19 cases and biases associated with specific patient attributes. The model employs Decision Tree and XGBoost algorithms for case detection, while bias analysis is performed using the DALEX library, focusing on protected attributes such as age, gender, race, and ethnicity. DALEX works by creating an "explainer" object that wraps the model, enabling exploration of the model's behavior without requiring in-depth knowledge of its internal workings. This approach helps pinpoint influential attributes and uncover potential biases within the model. Model performance is assessed through accuracy metrics, with the Decision Tree algorithm achieving the highest accuracy, 99%, following Bayesian hyperparameter optimization. However, high accuracy does not ensure fairness, as biases related to protected attributes may still persist.
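
The closing point of the abstract, that high accuracy does not ensure fairness, can be illustrated with a minimal sketch. It is not the authors' code and does not call the DALEX API; it only assumes a group-wise ratio check in the spirit of the four-fifths rule. The group labels, the toy predictions, the choice of privileged group, and the threshold `epsilon=0.8` are all hypothetical.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Compute accuracy, TPR, FPR and positive-prediction rate per group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f" = prediction correct or not, "p"/"n" = predicted class
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]   # patients who actually died
        neg = c["fp"] + c["tn"]   # patients who actually survived
        total = pos + neg
        rates[g] = {
            "ACC": (c["tp"] + c["tn"]) / total,
            "TPR": c["tp"] / pos if pos else float("nan"),
            "FPR": c["fp"] / neg if neg else float("nan"),
            "STP": (c["tp"] + c["fp"]) / total,  # share predicted positive
        }
    return rates


def fairness_ratios(rates, privileged, epsilon=0.8):
    """Compare each group to the privileged group, metric by metric.

    Returns {(group, metric): (ratio, passes)}; a metric passes when the
    ratio lies within [epsilon, 1/epsilon]. Undefined ratios do not pass.
    """
    base = rates[privileged]
    report = {}
    for g, r in rates.items():
        if g == privileged:
            continue
        for metric, value in r.items():
            ratio = value / base[metric] if base[metric] else float("nan")
            report[(g, metric)] = (ratio, epsilon <= ratio <= 1 / epsilon)
    return report


# Hypothetical toy data: 1 = death, 0 = survival; age band is the protected attribute.
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
groups = ["under_65"] * 8 + ["65_plus"] * 8

rates = group_rates(y_true, y_pred, groups)
report = fairness_ratios(rates, privileged="under_65")

for (group, metric), (ratio, passes) in sorted(report.items()):
    print(f"{group} {metric}: ratio={ratio:.2f} {'ok' if passes else 'FLAGGED'}")
```

In this toy data both age bands have the same accuracy, yet the true positive rate for the `65_plus` group is only two thirds of that for `under_65`, so the check flags it even though the accuracy ratio passes.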

Published

30-06-2024

How to Cite

[1]
Zachra, F. and Basuki, S. 2024. Detection of Bias in Machine Learning Models for Predicting Deaths Caused by COVID-19. Jurnal ELTIKOM : Jurnal Teknik Elektro, Teknologi Informasi dan Komputer. 8, 1 (Jun. 2024), 26–33. DOI:https://doi.org/10.31961/eltikom.v8i1.1081.

Section

Articles