Recursive feature elimination optimization using Shapley additive explanations in software defect prediction with LightGBM classification
Abstract
A software defect is a fault that prevents software from functioning properly. Defects arise from mistakes made during the software development process. Software defect prediction is performed to help ensure that software is defect-free, and machine learning classification is used to predict defects in software. To improve a classification model, the best features must be selected from the dataset. Recursive Feature Elimination (RFE) is a feature selection method, and Shapley Additive Explanations (SHAP) is a method that can be used to optimize feature selection algorithms so that they produce better results. In this research, the popular boosting algorithm LightGBM is used as the classifier to predict software defects, and RFE-SHAP is used for feature selection to identify the best subset of features. The results show that RFE-SHAP feature selection slightly outperforms RFE, with average AUC values of 0.864 and 0.858, respectively. Moreover, RFE-SHAP produces more statistically significant results than RFE. For RFE feature selection, the t-test gives p-value = 0.039 < α = 0.05 and t-count = 3.011 > t-table = 2.776. For RFE-SHAP feature selection, the t-test gives p-value = 0.000 < α = 0.05 and t-count = 11.91 > t-table = 2.776.
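The abstract does not spell out the RFE-SHAP procedure, so the following is only a minimal sketch of the general idea: refit a model, rank features by mean absolute SHAP value, drop the weakest feature, and repeat. Everything concrete in it is an assumption made for illustration. The data is synthetic, the feature names are hypothetical, one feature is removed per round, and a small logistic regression stands in for LightGBM so that the block runs with the Python standard library only. For a linear model with independent features, the SHAP value of feature j has the closed form w_j * (x_j - mean_j), which is what the sketch uses. With LightGBM, per-feature contributions can be requested through the `pred_contrib=True` argument of `predict`.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Stand-in classifier (NOT LightGBM): logistic regression by gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - target
            gb += err
            for j in range(d):
                gw[j] += err * row[j]
        b -= lr * gb / n
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
    return w, b


def mean_abs_shap(w, X):
    """Mean |SHAP| per feature for a linear model in log-odds space.

    Uses phi_j(x) = w_j * (x_j - mean_j), exact under feature independence.
    """
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [sum(abs(w[j] * (row[j] - means[j])) for row in X) / n for j in range(d)]


def rfe_shap(X, y, names, n_keep):
    """Refit on the active features, then drop the one with the smallest mean |SHAP|."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = fit_logistic(X_sub, y)
        scores = mean_abs_shap(w, X_sub)
        weakest = min(range(len(active)), key=scores.__getitem__)
        del active[weakest]
    return [names[j] for j in active]


# Synthetic data with hypothetical feature names: only the first two
# features influence the label, the remaining three are pure noise.
random.seed(0)
names = ["loc", "complexity", "noise_a", "noise_b", "noise_c"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(300)]
y = [1 if 2.0 * r[0] + 1.5 * r[1] + random.gauss(0, 0.5) > 0 else 0 for r in X]

selected = rfe_shap(X, y, names, n_keep=2)
print(selected)
```

Plain RFE would follow the same loop but rank features by the model's built-in importance (for example, coefficient magnitude or split-based importance) instead of mean absolute SHAP value. In practice the number of features to keep would be chosen by comparing a validation metric such as AUC across subset sizes rather than fixed in advance.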
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.