Optimization Model for Fake Account Detection on Twitter (X) Social Media using Feature Engineering and Machine Learning Approaches

Ni Nyoman Eny Perimawati; Roy Rudolf Huizen; Dandy Pramana Hostiadi

doi:10.31961/eltikom.v9i2.1727

Authors

Ni Nyoman Eny Perimawati Institut Teknologi Dan Bisnis STIKOM Bali
Roy Rudolf Huizen Institut Teknologi Dan Bisnis STIKOM Bali
Dandy Pramana Hostiadi Institut Teknologi Dan Bisnis STIKOM Bali

DOI:

https://doi.org/10.31961/eltikom.v9i2.1727

Keywords:

fake accounts, feature engineering, machine learning, Twitter (X)

Abstract

Twitter (X) has become an important platform for community interaction, but the proliferation of fake accounts can harm users and undermine credibility. Previous studies have proposed detection methods but often lacked forensic analysis based on information extracted from account features. This study detects fake accounts on Twitter (X) by extracting behavioral information from key features and building detection models with three machine learning algorithms: Random Forest, SVM, and AdaBoost. The models are then optimized through feature engineering. The novelty of this work lies in engineered behavioral features developed through distribution analysis of the data characteristics, and in the improvement in classification performance that these features provide. Model performance is validated on labeled datasets using supervised evaluation metrics (precision, recall, and F1-score). In the initial experiment without feature engineering, Random Forest achieved the highest accuracy at 98.77%, followed by AdaBoost at 98.57% and SVM at 95.90%. After feature engineering was applied, accuracy improved for all three models: AdaBoost reached 99.94%, Random Forest 99.69%, and SVM 99.32%. The proposed model can assist system analysts in detecting fake accounts and contributes to solving forensic cybercrime challenges, particularly the identification of fake social media profiles.
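
The supervised metrics named in the abstract (precision, recall, and F1-score, alongside accuracy) can be computed from true and predicted labels as in the minimal sketch below. The function name `classification_metrics` and the example labels are illustrative assumptions, not the study's code or data; the sketch assumes binary labels with 1 marking a fake account.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for the positive class (1 = fake)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, flagged as fake
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # genuine, flagged as fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # genuine, passed

    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for illustration only (not taken from the study's dataset).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when fake and genuine accounts are imbalanced, since accuracy alone can look high even if most fake accounts are missed.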

Author Biographies

Ni Nyoman Eny Perimawati, Institut Teknologi Dan Bisnis STIKOM Bali

Graduate student in the Master's Program, Department of Information System, Institut Teknologi dan Bisnis STIKOM Bali.

Roy Rudolf Huizen, Institut Teknologi Dan Bisnis STIKOM Bali

Lecturer at the Department of Information System, Institut Teknologi dan Bisnis STIKOM Bali.

Dandy Pramana Hostiadi, Institut Teknologi Dan Bisnis STIKOM Bali

Lecturer at the Department of Information System, Institut Teknologi dan Bisnis STIKOM Bali.
