Prediction of Telkomsel 4G LTE Card Sales using The K-Nearest Neighbor Algorithm

Alfiana Fontes Martins; Yasinta Oktaviana Legu Rema; Debora Chrisinta; Alejandro Jr. V. Matute; Krisantus Jumarto Tey Seran

doi:10.31961/eltikom.v9i1.1476

Authors

Alfiana Fontes Martins, Universitas Timor, Indonesia
Yasinta Oktaviana Legu Rema, Universitas Timor, Indonesia
Debora Chrisinta, Universitas Timor, Indonesia
Alejandro Jr. V. Matute, Laguna State Polytechnic University, Philippines
Krisantus Jumarto Tey Seran, universitas t

Keywords:

card sales prediction, KNN, model accuracy

Abstract

Accurate sales prediction is a critical challenge in business decision-making, as factors such as data imbalance, outliers, and overfitting may compromise the reliability of predictive models. This study aims to develop a precise model for predicting card sales using the K-Nearest Neighbor (KNN) algorithm and to offer recommendations for improving prediction quality by addressing issues related to data imbalance and overfitting. The KNN algorithm is applied to analyze a card sales dataset, with preprocessing steps that include detecting missing values, handling outliers, and converting the target attribute into a categorical format. The optimal value of k is identified using the elbow method to determine the model's best accuracy. Findings indicate that the KNN model with k = 1 achieves 100% accuracy, though it shows signs of overfitting, which may hinder its generalizability to new data. Handling outliers and transforming data contributed to improving the model's performance. However, to enhance robustness, further testing with different k values and the use of cross-validation are recommended. Moreover, balancing the dataset and incorporating external variables such as promotional activities or market trends could support more reliable future predictions.
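The gap between a perfect score at k = 1 and real generalizability can be illustrated with a small sketch. The abstract does not say how accuracy was measured, so the following assumes, for illustration only, that the 100% figure could arise from scoring the model on data it has already seen. The toy data, the "low"/"high" labels, and all function names are hypothetical and are not taken from the study. With k = 1, every training point is its own nearest neighbor, so accuracy on the training data is perfect by construction, while leave-one-out cross-validation on the same points gives a lower, more honest estimate.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k training points nearest to `query`."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def training_accuracy(X, y, k):
    """Accuracy when every point is scored against a model that already contains it."""
    hits = sum(knn_predict(X, y, x, k) == label for x, label in zip(X, y))
    return hits / len(X)


def loo_accuracy(X, y, k):
    """Leave-one-out cross-validation: each point is predicted from all the others."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


# Hypothetical one-feature toy data (NOT the study's dataset). The target is
# already categorical, and the last point is a deliberately noisy label.
X = [(1.0,), (1.2,), (1.4,), (3.0,), (3.2,), (3.4,), (1.3,)]
y = ["low", "low", "low", "high", "high", "high", "high"]

# Compare both estimates across a few odd values of k (odd to avoid tied votes).
for k in (1, 3, 5):
    print(f"k={k}: train={training_accuracy(X, y, k):.2f}  "
          f"leave-one-out={loo_accuracy(X, y, k):.2f}")
```

Plotting the cross-validated accuracy (or error) against k and choosing the point where the curve flattens is one way to apply an elbow-style selection without rewarding memorization of the training data.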
