Classification Model for Bot-IoT Attack Detection Using Correlation and Analysis of Variance

Firgiawan Faira; Dandy Pramana Hostiadi; Roy Rudolf Huizen

doi:10.29207/resti.v9i2.6332

Firgiawan Faira, Institut Teknologi dan Bisnis STIKOM Bali
Dandy Pramana Hostiadi, Institut Teknologi dan Bisnis STIKOM Bali
Roy Rudolf Huizen, Institut Teknologi dan Bisnis STIKOM Bali

Keywords: Aggregate Data, ANOVA, Bot-IoT, Pearson Correlation, Classification

Abstract

Industry 4.0 requires secure networks, as advances in IoT and AI exacerbate data security challenges and vulnerabilities. This research focuses on detecting Bot-IoT activity using the Bot-IoT UNSW Canberra 2018 dataset. The dataset was initially highly imbalanced, with 2,934,447 entries of attack activity and only 370 entries of normal activity. To address this imbalance, a data aggregation technique was applied to reduce similar patterns and trends. This approach resulted in a balanced dataset consisting of 8 attack activity points and 80 normal activity points. Feature selection using the ANOVA method identified 10 key features from a total of 17: seq, stddev, N_IN_Conn_P_SrcIP, min, state_number, mean, N_IN_Conn_P_DstIP, drate, srate, and max. Classification used the Random Forest, k-NN, Naïve Bayes, and Decision Tree algorithms, with 100 iterations and an 80:20 training-testing split. Random Forest performed best, achieving 97.5% accuracy, 97.4% precision, and 97.4% recall, with a total computation time of 11.54 seconds. Pearson correlation analysis revealed a strong positive correlation (+0.937) between N_IN_Conn_P_DstIP and seq, and a weak negative correlation (-0.224) between N_IN_Conn_P_SrcIP and state_number. The novelty of this research lies in applying a data aggregation technique to address class imbalance, which improved machine learning model performance and reduced training time. These findings contribute to the development of robust cybersecurity systems that can effectively detect IoT-related threats.
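
The two statistical steps named in the abstract, ANOVA-based feature ranking and Pearson correlation, can be illustrated with a minimal sketch. This is not the authors' implementation: the data below is synthetic, the helper names (anova_f, pearson, select_top_k) are hypothetical, and only three of the feature names are borrowed from the abstract for readability. It assumes a two-class label and uses only the Python standard library.

```python
import random
from statistics import fmean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand_mean = fmean(values)
    group_means = {y: fmean(g) for y, g in groups.items()}
    k, n = len(groups), len(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(g) * (group_means[y] - grand_mean) ** 2 for y, g in groups.items()
    )
    ss_within = sum(
        (v - group_means[y]) ** 2 for y, g in groups.items() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def select_top_k(features, labels, k):
    """Rank features by ANOVA F-score and keep the k highest-scoring names."""
    scores = {name: anova_f(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic toy data (an assumption for illustration, not the Bot-IoT dataset):
# 40 "normal" (0) and 40 "attack" (1) records.
rng = random.Random(0)
labels = [0] * 40 + [1] * 40
seq = [rng.gauss(10 + 5 * y, 1.0) for y in labels]  # separates the classes
features = {
    "seq": seq,
    "drate": [rng.gauss(0.0, 1.0) for _ in labels],  # pure noise here
    "N_IN_Conn_P_DstIP": [2 * s + rng.gauss(0.0, 0.5) for s in seq],
}

top = select_top_k(features, labels, k=2)
r = pearson(features["seq"], features["N_IN_Conn_P_DstIP"])
print("selected features:", top)
print("Pearson r(seq, N_IN_Conn_P_DstIP):", round(r, 3))
```

In this toy setup the noise feature is dropped because its class means barely differ, while the two constructed features are both selected and strongly correlated with each other. A high F-score and a high pairwise correlation can therefore coexist, which is why a correlation analysis is a useful complement to ANOVA ranking.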
