Prediction of Telkomsel 4G LTE Card Sales using The K-Nearest Neighbor Algorithm
DOI:
https://doi.org/10.31961/eltikom.v9i1.1476
Keywords:
card sales prediction, KNN, model accuracy
Abstract
Accurate sales prediction is a critical challenge in business decision-making, as factors such as data imbalance, outliers, and overfitting may compromise the reliability of predictive models. This study aims to develop a precise model for predicting card sales using the K-Nearest Neighbor (KNN) algorithm and to offer recommendations for improving prediction quality by addressing issues related to data imbalance and overfitting. The KNN algorithm is applied to analyze a card sales dataset, with preprocessing steps that include detecting missing values, handling outliers, and converting the target attribute into a categorical format. The optimal value of k is identified using the elbow method to determine the model's best accuracy. Findings indicate that the KNN model with k = 1 achieves 100% accuracy, though it shows signs of overfitting, which may hinder its generalizability to new data. Handling outliers and transforming data contributed to improving the model's performance. However, to enhance robustness, further testing with different k values and the use of cross-validation are recommended. Moreover, balancing the dataset and incorporating external variables such as promotional activities or market trends could support more reliable future predictions.
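The overfitting concern raised above can be illustrated with a minimal sketch. It is not the authors' code or data: the points are synthetic, the helper names (`knn_predict`, `accuracy`, `cross_val_accuracy`) are hypothetical, and k is compared by 5-fold cross-validation (the check the abstract recommends) rather than by the elbow method used in the study. With k = 1, each training point is its own nearest neighbor, so accuracy measured on the training data is perfect by construction, while held-out accuracy gives a more honest estimate.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def cross_val_accuracy(X, y, k, folds=5):
    """Mean accuracy over `folds` held-out folds (interleaved split)."""
    scores = []
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        tr_X = [x for i, x in enumerate(X) if i not in test_idx]
        tr_y = [t for i, t in enumerate(y) if i not in test_idx]
        te_X = [X[i] for i in sorted(test_idx)]
        te_y = [y[i] for i in sorted(test_idx)]
        scores.append(accuracy(tr_X, tr_y, te_X, te_y, k))
    return sum(scores) / folds


# Synthetic stand-in data (assumption): two overlapping classes,
# NOT the card sales dataset analyzed in the study.
rng = random.Random(0)
X, y = [], []
for _ in range(100):
    label = rng.choice(["low", "high"])
    center = 0.0 if label == "low" else 1.5
    X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
    y.append(label)

# k = 1 scored on its own training data: every point matches itself.
train_acc_k1 = accuracy(X, y, X, y, k=1)

# Held-out accuracy for several odd k values (odd k avoids two-class ties).
cv_by_k = {k: cross_val_accuracy(X, y, k) for k in (1, 3, 5, 7, 9)}
best_k = max(cv_by_k, key=cv_by_k.get)

print(f"k=1 accuracy on training data: {train_acc_k1:.2f}")
for k, score in cv_by_k.items():
    print(f"k={k}: 5-fold CV accuracy = {score:.2f}")
print("best k by CV:", best_k)
```

Comparing the training-data score for k = 1 with the cross-validated scores shows why a 100% figure alone says little about how the model will behave on new data.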
License
Copyright (c) 2025 Alfiana Fontes Martins, Yasinta Oktaviana Legu Rema, Debora Chrisinta, Alejandro Jr. V. Matute, Krisantus Jumarto Tey Seran

This work is licensed under a Creative Commons Attribution 4.0 International License.
All accepted papers will be published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. Authors retain copyright and grant the journal right of first publication. The CC BY license lets others Share (copy and redistribute the material in any medium or format) and Adapt (remix, transform, and build upon the material for any purpose, even commercially).

