Machine Learning-Based Imputation Approaches for Efficient Electrical Load Forecasting
Conference proceedings contribution
Publication date:
2025
Citation:
(2025). Machine Learning-Based Imputation Approaches for Efficient Electrical Load Forecasting. Retrieved from https://hdl.handle.net/10446/313065
Abstract:
Effective electrical load forecasting depends on the quality of historical data and the efficiency of the forecasting algorithms. However, missing data caused by sensor errors, communication failures, and data processing anomalies is a significant problem: it compromises the integrity of the dataset and reduces forecasting accuracy. Machine learning (ML) based imputation techniques address this issue by estimating and substituting the missing values using the correlations inherent in the dataset. In this study, four ML-based imputation approaches, namely Random Forest (RF), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost), are applied to enhance the accuracy and reliability of electrical load forecasting. A synthetic linear missing data pattern is introduced into the original dataset, and the imputation methods are evaluated on how effectively they restore data integrity. The evaluation is carried out by feeding the imputed datasets into two deep learning (DL) forecasting frameworks: Recurrent Neural Network (RNN) and Gated Recurrent Unit (GRU). Predictive performance is measured with Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²), along with an analysis of computational efficiency. The comparison between the DL structures indicates that the RNN requires less computational time, whereas the GRU consistently delivers superior forecasting accuracy across all imputation methods. Among the evaluated imputation techniques, XGBoost achieves the lowest MSE at 6% missing data (894.98 with RNN; 876.62 with GRU), while RF is the most consistent, particularly at higher missing data rates (MSE: 1259.17 at 30% missingness).
These findings highlight the importance of selecting a suitable imputation technique to improve load forecasting in practical applications.
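The mask, impute, and score workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the synthetic hourly load series, the random placement of the gaps (the abstract's "linear" missing data pattern is not specified in this record), the simplified time-index nearest-neighbor imputer `knn_impute_1d`, and the helper `regression_metrics`. It also scores the imputed values directly against the withheld originals, whereas the study evaluates imputation through RNN and GRU forecasts.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "R2": 1.0 - ss_res / ss_tot,
    }


def knn_impute_1d(values, k=4):
    """Fill each NaN with the mean of the k observed points nearest in time.

    Hypothetical, simplified stand-in for a KNN imputer: distance is measured
    on the time index only, not on a multivariate feature space.
    """
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    observed = np.flatnonzero(~np.isnan(values))
    for i in np.flatnonzero(np.isnan(values)):
        nearest = observed[np.argsort(np.abs(observed - i))[:k]]
        filled[i] = values[nearest].mean()
    return filled


# Synthetic hourly load with a daily cycle (illustrative data, not the paper's).
rng = np.random.default_rng(0)
t = np.arange(24 * 14)
load = 500 + 100 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 10, t.size)

# Withhold 6% of the samples at random positions (assumed gap placement).
missing_rate = 0.06
n_missing = int(round(missing_rate * load.size))
missing_idx = rng.choice(load.size, size=n_missing, replace=False)
corrupted = load.copy()
corrupted[missing_idx] = np.nan

# Impute, then score only the positions that were withheld.
imputed = knn_impute_1d(corrupted, k=4)
scores = regression_metrics(load[missing_idx], imputed[missing_idx])
print(scores)
```

Repeating the last block with `missing_rate` set to higher values, such as 0.30, gives a rough feel for how an imputer degrades as more data is withheld.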
CRIS type:
1.4.01 Contributions in conference proceedings - Conference presentations
Authors:
Hussain, Ayaz; Giangrande, Paolo; Franchini, Giuseppe; Fenili, Lorenzo; Messi, Silvio
Book title:
2025 IEEE 13th International Conference on Smart Energy Grid Engineering, SEGE 2025
Published in: