Solar Radiation Prediction using Long Short-Term Memory with Handling of Missing Values and Outliers

Alfin Syarifuddin Syahab
MS Hendriyawan Achamd

Abstract

A pyranometer measures Global Horizontal Irradiance (GHI), a parameter used for analyzing and predicting weather. GHI data processed into a prediction model for photovoltaics is useful for determining the performance of solar power generation systems in distributed energy operations. However, GHI sensor data suffers from missing values and outliers caused by measurement errors. This research designs a GHI sensor data prediction model whose preprocessing imputes missing values using linear, polynomial, and Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) interpolation and eliminates outliers from the dataset using Random Sample Consensus (RANSAC). Previous research shows that Long Short-Term Memory (LSTM) can improve prediction performance compared to other machine learning methods. The LSTM prediction model is built both with and without data preprocessing. For the imputation of missing values, PCHIP performed best, with a Mean Absolute Error (MAE) of 39.708 W/m², Root Mean Square Error (RMSE) of 76.224 W/m², Normalized Root Mean Square Error (NRMSE) of 0.433, and Coefficient of Determination (R²) of 0.903; imputation after outlier elimination obtained an MAE of 44.377 W/m², RMSE of 86.738 W/m², NRMSE of 0.500, and R² of 0.886. In testing, RANSAC eliminated 100% of the outliers. The LSTM with data preprocessing performed better, with best MAE, RMSE, NRMSE, and R² on the test data of 42.863 W/m², 82.396 W/m², 0.396, and 0.918, respectively. This study contributes a GHI prediction model that can handle missing values and outliers from sensors to support solar power plants.
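
The evaluation described in the abstract can be illustrated with a minimal sketch: known readings are hidden, filled back in by interpolation, and the filled values are scored against the originals. This is not the authors' implementation. The function names `evaluate` and `impute_linear` and the sample readings are hypothetical, only the linear variant of the three interpolation methods is shown, and NRMSE is assumed here to be RMSE divided by the mean of the observations, since the abstract does not state the normalizer.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MAE, RMSE, NRMSE and R2 for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Assumption: NRMSE is RMSE divided by the mean of the observations;
    # the abstract does not say which normalizer the authors used.
    nrmse = rmse / np.mean(y_true)
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "NRMSE": nrmse, "R2": r2}

def impute_linear(values):
    """Fill NaN gaps by linear interpolation between known neighbours."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    known = ~np.isnan(values)
    return np.interp(idx, idx[known], values[known])

# Hypothetical GHI readings in W/m^2 (illustrative, not the paper's data).
ghi = np.array([0.0, 120.0, 340.0, 560.0, 700.0, 640.0, 450.0, 210.0, 30.0])

# Hide two known readings to simulate missing sensor values.
missing = np.array([2, 5])
masked = ghi.copy()
masked[missing] = np.nan

# Impute the gaps, then score only the positions that were hidden.
filled = impute_linear(masked)
scores = evaluate(ghi[missing], filled[missing])

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Scoring only the hidden positions keeps the metrics from being inflated by readings that were never missing. A PCHIP or polynomial imputer could be swapped in for `impute_linear` and compared with the same `evaluate` call.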

How to Cite
[1]
A. Syahab and M. H. Achamd, “Solar Radiation Prediction using Long Short-Term Memory with Handling of Missing Values and Outliers”, INFOTEL, vol. 16, no. 4, pp. 758-777, Dec. 2024.
Section
Informatics
