Discrete Wavelet Transform (DWT) and Random Forest for Cancer Detection Based on Microarray Data Classification
Abstract
Cancer is one of the leading causes of death worldwide. According to the World Health Organization (WHO), cancer caused about 9.6 million deaths in 2018. DNA microarray technology plays an important role in analyzing and diagnosing cancer. However, the accuracy of Random Forest classification on microarray data is not optimal because the data are high-dimensional. This study therefore applies the Discrete Wavelet Transform (DWT) as a feature extraction step to reduce the dimensionality of microarray data and increase accuracy. In simulation, DWT reduces the dimension and improves classification accuracy by 8% to 20%. The DWT approximation coefficients improve accuracy more than the detail coefficients, with accuracies of 100% on colon cancer, 100% on lung cancer, 100% on ovarian cancer, 80% on prostate tumor, and 83.33% on central nervous system data.
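To illustrate how DWT approximation coefficients can shrink a feature vector, the following is a minimal sketch, not the authors' implementation. It assumes the Haar wavelet (the abstract does not name the wavelet family), and the function names and the sample values are hypothetical. Each decomposition level halves the number of features while the detail coefficients are discarded.

```python
import math


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    values = list(signal)
    if len(values) % 2:
        # Assumption: pad odd-length input by repeating the last value.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def haar_approximation(signal, levels):
    """Keep only the approximation coefficients after `levels` decompositions."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_dwt_step(approx)
    return approx


# Hypothetical expression profile for one sample (8 genes, made-up values).
sample = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features = haar_approximation(sample, levels=2)
print(len(sample), "->", len(features))  # prints "8 -> 2"
```

In a pipeline like the one the abstract describes, the reduced vectors for all samples would then be passed to a random forest classifier, for example scikit-learn's `RandomForestClassifier`; that step is omitted here to keep the sketch dependency-free.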
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.