Detection of Bias in Machine Learning Models for Predicting Deaths Caused by COVID-19
DOI: https://doi.org/10.31961/eltikom.v8i1.1081

Keywords: bias, COVID-19, DALEX, machine learning, protected attributes

Abstract
The COVID-19 pandemic has significantly impacted global health, resulting in numerous fatalities and placing substantial strain on national healthcare systems due to a sharp increase in cases. Key to managing this crisis is the rapid and accurate identification of COVID-19 infections, a task that can be enhanced with Machine Learning (ML) techniques. However, ML applications can also produce biased and potentially unfair outcomes for certain demographic groups. This paper introduces an ML model designed both to detect COVID-19 cases and to detect biases associated with specific patient attributes. The model employs the Decision Tree and XGBoost algorithms for case detection, while bias analysis is performed with the DALEX library, focusing on protected attributes such as age, gender, race, and ethnicity. DALEX works by creating an "explainer" object that wraps the model, enabling exploration of the model's behavior without requiring in-depth knowledge of its internal workings. This approach helps pinpoint influential attributes and uncover potential biases within the model. Model performance is assessed using accuracy, with the Decision Tree algorithm achieving the highest accuracy of 99% after Bayesian hyperparameter optimization. However, high accuracy does not ensure fairness, as biases related to protected attributes may still persist.
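The workflow described above can be illustrated with a minimal sketch in Python using the dalex package (the Python counterpart of the DALEX library). This is not the authors' code: the file name, the column names ("deceased", "gender"), the privileged group, and the fixed Decision Tree hyperparameters are illustrative assumptions; in the paper the hyperparameters are tuned with Bayesian optimization.

```python
# Minimal sketch (not the authors' code): train a Decision Tree on a hypothetical
# COVID-19 patient table, wrap it in a DALEX "explainer", and run a fairness
# check on a protected attribute. File name, column names, and the privileged
# group are assumptions for illustration only.
import pandas as pd
import dalex as dx
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical dataset: one row per patient, binary target column "deceased".
df = pd.read_csv("covid_patients.csv")
y = df["deceased"]
X = pd.get_dummies(df.drop(columns=["deceased"]))  # encode categorical predictors

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Decision Tree classifier; hyperparameters are fixed here for brevity.
model = DecisionTreeClassifier(max_depth=8, random_state=42).fit(X_train, y_train)

# Wrap the fitted model in a DALEX explainer object.
explainer = dx.Explainer(model, X_test, y_test, label="decision_tree")

# Fairness check on a protected attribute (here: gender) relative to a
# privileged group; metrics outside the accepted ratio band flag potential bias.
protected = df.loc[X_test.index, "gender"]
fairness = explainer.model_fairness(protected=protected, privileged="male")
fairness.fairness_check(epsilon=0.8)  # flags group metric ratios outside [0.8, 1/0.8]
```

The same explainer object can also be passed to permutation-importance routines such as explainer.model_parts() to rank influential attributes, which corresponds to the attribute-influence analysis mentioned in the abstract.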