Classification Model for Bot-IoT Attack Detection Using Correlation and Analysis of Variance
Abstract
Industry 4.0 depends on secure networks, yet advances in IoT and AI compound the challenges and vulnerabilities in data security. This research focuses on detecting Bot-IoT activity using the Bot-IoT dataset (UNSW Canberra, 2018). The dataset is severely imbalanced, with 2,934,447 entries of attack activity and only 370 entries of normal activity. To address this imbalance, a data aggregation technique was applied that consolidates entries with similar patterns and trends. This approach resulted in a balanced dataset consisting of 8 attack activity points and 80 normal activity points. Feature selection using the ANOVA method identified 10 key features from a total of 17: seq, stddev, N_IN_Conn_P_SrcIP, min, state_number, mean, N_IN_Conn_P_DstIP, drate, srate, and max. Classification used the Random Forest, k-NN, Naïve Bayes, and Decision Tree algorithms, with 100 iterations and an 80:20 training-testing split. Random Forest performed best, achieving 97.5% accuracy, 97.4% precision, and 97.4% recall, with a total computation time of 11.54 seconds. Pearson correlation analysis revealed a strong positive correlation (+0.937) between N_IN_Conn_P_DstIP and seq, and a weak negative correlation (-0.224) between N_IN_Conn_P_SrcIP and state_number. The novelty of this research lies in applying a data aggregation technique to address class imbalance, which improves machine learning model performance and reduces training time. These findings contribute to the development of robust cybersecurity systems that can effectively detect IoT-related threats.
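The pipeline summarized in the abstract (ANOVA feature selection, an 80:20 split, four classifiers, and Pearson correlation) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data from `make_classification` in place of the aggregated Bot-IoT records, keeps library default hyperparameters, fits the selector on the training portion only, and runs a single split. How the "100 iterations" were applied is not specified here, so they are not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data with 17 numeric features and a binary
# label; these are not the Bot-IoT records described in the abstract.
X, y = make_classification(
    n_samples=400, n_features=17, n_informative=6, n_redundant=4, random_state=0
)

# 80:20 training-testing split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# ANOVA F-test feature selection: keep the 10 highest-scoring of 17 features.
# ASSUMPTION: the selector is fitted on the training portion only.
selector = SelectKBest(score_func=f_classif, k=10).fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# The four classifier families named in the abstract, with default settings.
models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    pred = model.predict(X_test_sel)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        # ASSUMPTION: weighted averaging; the abstract does not state the mode.
        "precision": precision_score(y_test, pred, average="weighted", zero_division=0),
        "recall": recall_score(y_test, pred, average="weighted", zero_division=0),
    }

# Pearson correlation matrix between the selected features (columns).
corr = np.corrcoef(X_train_sel, rowvar=False)

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

In `corr`, entry `[i, j]` is the Pearson coefficient between selected features `i` and `j`. With real data, the pair N_IN_Conn_P_DstIP and seq would be read from the matching row and column.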
Copyright (c) 2025 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of manuscripts published in this journal in other versions (e.g., depositing them in the author's institutional repository or publishing them in a book), with an acknowledgement that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).