PCA and t-SNE Implementation for KNN Hypertension Classification Visualization
Abstract
Hypertension is a condition in which persistently high blood pressure, if left uncontrolled, can seriously damage internal organs. This study uses the K-Nearest Neighbor (KNN) algorithm together with Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) to classify four blood pressure categories: Normal, Hypertension, Stage 1 Hypertension, and Stage 2 Hypertension. The dataset, sourced from Labuang Baji Regional General Hospital, Makassar, comprises 7,794 samples with four features: age, weight, systolic blood pressure, and diastolic blood pressure. The class distribution is Normal (36.3%), Hypertension (43.12%), Stage 1 Hypertension (8.29%), and Stage 2 Hypertension (12.31%). In the experiments, the base KNN model achieved 99% accuracy, KNN with PCA reached 100%, and KNN with t-SNE attained 99%. Cross-validation, used to assess how well the models generalize, yielded accuracies of 91%, 94%, and 91%, respectively. These findings suggest that KNN, particularly when integrated with t-SNE, is highly effective in visualizing and classifying non-linear data structures. The study also shows that dimensionality reduction makes the classified hypertension data easier to interpret, which is crucial for informed decision-making by mental health committees.
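The evaluation described in the abstract, a KNN classifier scored by cross-validation, can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the synthetic two-class (systolic, diastolic) data, the Euclidean distance metric, k = 5 neighbors, and 5 folds. The helper names knn_predict, cross_val_accuracy, and make_toy_data are hypothetical, and the PCA and t-SNE steps are omitted.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Euclidean distance is an assumption; the abstract does not name the metric.
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k=5, n_folds=5, seed=0):
    """Return the per-fold accuracy of KNN under shuffled k-fold cross-validation."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_X = [X[i] for i in indices if i not in held]
        train_y = [y[i] for i in indices if i not in held]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_toy_data(n_per_class=40, seed=0):
    """Generate synthetic (systolic, diastolic) readings for two made-up classes."""
    rng = random.Random(seed)
    # Class centers are illustrative values, not clinical thresholds.
    centers = {"normal": (110.0, 70.0), "elevated": (150.0, 95.0)}
    X, y = [], []
    for label, (sys_c, dia_c) in centers.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(sys_c, 5.0), rng.gauss(dia_c, 5.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = cross_val_accuracy(X, y)
    print("fold accuracies:", [round(s, 3) for s in scores])
    print("mean accuracy:", round(sum(scores) / len(scores), 3))
```

Averaging accuracy over held-out folds, rather than scoring a single split, is what allows a cross-validated figure to differ from a single-split accuracy in the way the abstract reports.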
Copyright (c) 2025 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that the RESTI Journal (System Engineering and Information Technology) is the first publisher, under a Creative Commons Attribution 4.0 International License.
- Authors may separately arrange for the non-exclusive distribution of manuscripts published in this journal in other versions (e.g., deposit in the author's institutional repository or publication in a book), with an acknowledgement that the manuscript was first published in the RESTI (Rekayasa Sistem dan Teknologi Informasi) journal.