This article presents a load forecasting model for commercial buildings based on Enhanced Dynamically Weighted Multiple Kernel Support Vector Regression (EDW-MKSVR), combined with a mini-batch gradient descent method for regularization and clustering techniques that segment different times of the day. Dataset preparation begins with preprocessing, comprising correlation analysis, scaling, and normalization, followed by initial hyperparameter tuning using a multiswarm Levy flight particle swarm optimization technique. Compared with traditional methods, EDW-MKSVR offers greater adaptability to shifting load patterns because it adjusts its kernel weights dynamically according to the data's attributes. The method is applied to the Commercial Buildings Energy Consumption Survey dataset, which records the energy use of commercial buildings together with the related climatic characteristics across a range of periods, including week, season, hour, and human behavior; this segmentation is carried out with the clustering technique. The study assesses the performance of the EDW-MKSVR model against boosted tree, random forest regression, K-nearest neighbor, support vector regression, and long short term memory models. The results imply that EDW-MKSVR outperforms existing methods in accuracy when capturing complex load patterns. This article demonstrates how reliable and accurate the EDW-MKSVR approach is in predicting energy use, helping the power industry make better judgments by providing greater precision and flexibility. Furthermore, the robustness of the proposed model is evaluated by predicting the electricity consumption of two additional datasets; the resulting performance measures demonstrate the model's ability to produce highly ranked accuracy metrics.

Effective management of electricity demand, which underpins economic and social development, is essential due to the instability of renewable energy sources and the fluctuating prices of electricity. Short-Term Electricity Load Forecasting (STELF) and smart grids are indispensable for the optimization of energy consumption, the planning of future needs, and the guarantee of efficient electricity distribution.1 By enabling early preparation and distribution, accurate forecasting models assist power companies in market interactions, procurement planning, and waste reduction. In commercial contexts, such as office buildings and production facilities, load forecasting reduces energy consumption, peak demand, and greenhouse gas emissions, resulting in both economic and environmental benefits.2 Forecasting capabilities are further improved by smart building automation systems, which facilitate sustainable energy management methods.

In order to optimize the supply of electricity, minimize expenses, and enhance system dependability, STELF involves projecting the future electrical demand of a power system. The strategy analyzes past data on power usage to identify recurrent patterns and trends in order to anticipate future demand. Authors3 have claimed that the proficient development of an Energy Anticipating Model (EAM) can enhance various building operations, such as real-time monitoring, efficient management of demand, energy transactions, prioritization of utilization, and the administration of battery backups. This proactive approach greatly lowers the chance of equipment overuse, interruptions, and energy loss, which boosts productivity and saves money.4 By accurately predicting load patterns, building managers may synchronize the production of power from renewable sources with the building's electricity demand.

Constructing an EAM for buildings encounters numerous problems, chiefly attributable to the intricacy and variability of energy usage patterns. Principal concerns encompass inadequate data quality, fragmented or erratic historical data, and the fluctuating nature of variables, including tenant behavior, meteorological circumstances, and building attributes.5 The intricate interplay among these variables, along with seasonal and meteorological variations, renders precise forecasting challenging. Ensuring models generalize effectively across various buildings, including energy-efficient technologies, and responding to real-time data introduce additional complexities.6 Addressing these problems is essential for creating precise, scalable, and flexible energy forecasting models that can enhance energy management in contemporary buildings.

Conventional techniques such as autoregressive integrated moving average (ARIMA),3 linear regression,5 and exponential smoothing7 were extensively employed historically. However, these statistical models exhibit limitations in addressing nonlinear patterns, rendering them less precise for intricate energy systems. Machine learning methodologies, including neural networks and SVM, have demonstrated superior efficacy in modeling the nonlinear, time-dependent characteristics of load data. Kalman filtering,8 autoregressive moving average (ARMA),9 and grey modeling10 are conventional models suited to data that are highly seasonal with small, irregular fluctuations. Some studies suggest a Blind Kalman Filter (BKF) algorithm to address the short-term load forecasting problem. Nevertheless, the BKF algorithm is sensitive to the choice of variables and initialization parameters, and it falls short in terms of robustness.

To be more precise, authors employ Regression Analysis (RA),11 multiple linear regression analysis,12 nonlinear regression analysis,13 and Time Series Regression Analysis (TSRA)14 quite extensively. Typically, these regression analyses utilize historical energy consumption data to identify patterns, seasonality, and repeating cycles. RA enables the integration of external variables, which improves the accuracy of load estimates. However, it may be challenging to select and integrate appropriate independent variables with precision, and the assumption of linearity or stationarity may limit their effectiveness. TSRA models and analyzes the temporal connections and patterns of load data. Numerous TSRA techniques, such as exponential smoothing methods, seasonal decomposition of time series, and ARIMA, have been implemented. TSRA is suitable for the identification of seasonal patterns, trends, and short-term load variations. However, it may face challenges in managing complex load patterns that are influenced by a variety of factors or by abrupt changes in building operations.

The advancements in artificial intelligence have led to the emergence of a vital application, Machine Learning Techniques (MLTs). Electricity load forecasts can be categorized into three time horizons: short-term, medium-term, and long-term. Numerous authors have employed supervised machine learning techniques to predict the load with precision. Commonly utilized algorithms include Support Vector Regression (SVR),15 Random Forest Regression (RFR),16 Decision Tree (DT),17 and Boosted Tree (BT).18 SVR generates hyperplanes to delineate data points according to their attributes.

The utilization of SVR for load forecasting exhibits resilience to outliers, and it is capable of managing high-dimensional datasets. SVR's structural risk minimization makes it effective even for small datasets. However, choosing the kernel function and the regularization constant is the most difficult part of SVR.19 RF is an ensemble learning method that reduces the likelihood of overfitting and enhances accuracy by integrating several DTs. A boosted tree is a variant of the decision tree in which models are trained sequentially to correct the errors made by the previous model. If not implemented correctly, this strategy may result in overfitting, despite its potential to yield a strong and accurate model. DTs are a supervised learning method utilized for classification or regression applications. The procedure entails iteratively partitioning the feature-based data into subsets until a termination criterion is met.

Ensemble approaches integrate various forecasting techniques to enhance precision and resilience. Ensemble models mitigate forecast errors and improve dependability by averaging or amalgamating the predictions produced by various models.20 Ensemble methods can exploit the advantages of several forecasting techniques, such as RA, ANN, and TSRA. Researchers presented an approach that integrates SVR, RF, and Long Short Term Memory (LSTM). Similarly, an adaptive decomposition approach was proposed to separate the data into a fundamental series and its fluctuating by-products. Furthermore, a linear regression model and a regression-based XGBOOST model were employed to optimize the sub-series by analyzing the electricity consumption patterns of industries. Nonetheless, the characteristics of the primary series are not thoroughly examined.21 Ultimately, the processing of smart meter data using machine learning to predict future demand was addressed. That work presented Convolutional Neural Networks (CNNs) in conjunction with Long Short-Term Memory (LSTM)22 and the Gated Recurrent Unit (GRU).23 CNNs identify characteristics, whereas the GRU and LSTM have superior memory capabilities. Authors24 constructed and assessed the GRU by integrating load and weather factors into an actual dataset. Furthermore, they introduced a machine learning approach that leverages data from the electricity network of Spain, utilizing six separate anticipation models. The authors advocated for the utilization of LSTM and GRU neural networks, evaluating their efficacy on the load demand in the Kurdistan region. Another paper delineates the advantages and disadvantages of employing Levy Flight Particle Swarm Optimization (LFPSO) to optimize the hyperparameters of Bi-directional LSTM (Bi-LSTM)25 for time series forecasting. The randomness inherent in the Levy flight algorithm within LFPSO may increase the likelihood of overlooking the fittest population. Elsewhere, a hybrid model was introduced that integrates the salp swarm optimization (SSO) algorithm26 and the Particle Swarm Optimization (PSO) algorithm to address issues such as local optima and premature convergence, and a unique approach was introduced to optimize the hyperparameters of Support Vector Regression (SVR). Constructing a successful ensemble necessitates meticulous selection and weighting of individual models, potentially introducing further complexity. In the literature, Hyperparameter Tuning (HT) is the principal factor influencing forecasting accuracy. Many academics have developed various HT algorithms to optimize parameters and increase accuracy. Examples include Particle Swarm Optimization (PSO), enhanced swarm intelligence approaches, Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), cuckoo search optimization, the grasshopper optimization algorithm,27 the elephant herd optimization algorithm, and the wolf optimization algorithm. Researchers have also offered hybrid algorithms such as LFPSO to enhance global exploration.28

The choice of kernels11 is a crucial element in nonlinear models in practical applications. Linear kernels predominantly tackle linear issues; even when applied to nonlinear issues, the presumption of linearity persists, ultimately affecting the precision of forecasts. Polynomial kernels are utilized for problems that are not linearly separable, but they entail considerable computational difficulties owing to their extensive parameter count. The kernel utilizing the radial basis function is suitable for nonlinear loads; however, it lacks a global learning effect. Researchers have indicated that selecting kernels for real-time applications is a highly intricate endeavor, and they have devised and executed hybrid kernel methodologies that use diverse kernel functions across many applications. The proposed strategy may improve the SVR training process and renders the algorithm flexible to any dataset. The key considerations in MKSVR are the initial hyperparameter optimization and the allocation of weights to each kernel.

The literature review demonstrates that load forecasting is crucial for energy management. It requires comprehensive data acquisition and an advanced computational approach. Forecasters rely on the end-user application and the temporal context when estimating demand. Load forecasting is a nonlinear problem characterized by a pronounced propensity for periodic fluctuations. The developed algorithm must exhibit adequate robustness to handle data of any type. Unlike singular kernel models, multiple kernel models are better suited to nonlinear problems. Moreover, employing fixed weights during the problem-solving process may reduce predictive accuracy over time. The main contributions of this work are as follows:

  • A dynamically weighted machine learning method combining EDW-MKSVR, mini-batch gradient descent updates, and a clustering technique is developed for improved adaptive scaling.

  • The suggested method is evaluated against LSTM, K-Nearest Neighbor (KNN), RF, BT, and DT across various window lengths, such as days and seasons, while considering human behavior through a clustering strategy.

  • The model is validated on the Commercial Buildings Energy Consumption Survey (CBECS) dataset, augmented with energy-use and weather-condition data.

By integrating energy consumption statistics and meteorological variables, the model is evaluated on the CBECS dataset. The algorithm's robustness is further evaluated by applying it, with their significant parameters, to Morocco's electricity consumption dataset and to data from the United Kingdom National Grid system operator.

The subsequent sections are organized as follows: The methodology section begins with a brief summary of the data description and then covers data correlation, feature extraction, hyperparameter tuning with MSELFPSO, load forecasting with EDW-MKSVR, and the evaluation criteria. Subsequently, the results and discussion section evaluates alternative methodologies, including LSTM, RF, KNN, BT, and DT. The final section presents the conclusions and prospective scope.

This paper employs data gathered by the CBECS. The survey, executed by the United States Energy Information Administration (EIA), is the most comprehensive evaluation of energy usage in commercial structures. The collection includes information from 6700 commercial buildings, providing essential insights into annual energy use trends. The survey comprises 510 features, including physical characteristics, occupational practices, types of equipment, environmental conservation factors, and meteorological data. This article presents hourly energy use and meteorological data from the CBECS spanning three years (2016–2019). The dataset has 26 303 rows and 11 columns, each denoting distinct meteorological variables. The meteorological data for the corresponding years encompassed details such as month, time, date, hourly data (HH), global radiation (Q), temperature (Temp), dew point temperature (TD), relative humidity (U), duration of precipitation (DR), wind gust (FX), and hourly precipitation sum (RH). The initial part of data processing is analyzing missing data values and outliers to guarantee precise outcomes. Thereafter, a conventional SVR is applied to the same dataset to determine the appropriate data partitioning, as sketched below. The datasets were divided into training and testing ratios of 80%–20%, 70%–30%, and 60%–40%, and the RMSE and R2 were compared. The outcomes for the 80%–20% division are an RMSE of 0.463 and an R2 of 0.823; for the 70%–30% split, an RMSE of 0.566 and an R2 of 0.646; and for the 60%–40% division, an RMSE of 0.603 and an R2 of 0.643. The 80%–20% combination was selected as the optimal choice due to its lower RMSE and higher R2 value. Furthermore, the dataset was segmented to provide adequate data for training a resilient model; the objective of the division is to alleviate overfitting and facilitate generalization. Numerous authors select this division because it is statistically meaningful and computationally efficient.
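As an illustration of this split study, the comparison could be scripted as below. This is a hedged sketch, not the authors' code: the synthetic arrays stand in for the CBECS features and target, so the printed numbers will not reproduce the reported values.

```python
# Sketch: comparing train/test split ratios with a baseline SVR.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)
X = rng.random((2000, 10))   # placeholder for the hourly weather features
y = rng.random(2000)         # placeholder for hourly energy use

for test_size in (0.20, 0.30, 0.40):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, shuffle=False)   # preserve time order
    pred = SVR().fit(X_tr, y_tr).predict(X_te)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    print(f"test={test_size:.0%}  RMSE={rmse:.3f}  R2={r2_score(y_te, pred):.3f}")
```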

Subsequently, a correlation analysis was conducted to ascertain the presence of a linear relationship between the variables. Pearson's Correlation Coefficient (PC) quantifies the linear association between two variables. The PC analysis reveals a positive correlation between temperature and energy consumption and a negative correlation between temperature and relative humidity, as clearly shown in Fig. 1. The two weather variables exhibit multicollinearity; hence, either one can be utilized for demand prediction. Spearman's correlation (SC) is used to examine the nonlinear association between energy, hour, and month. It quantifies the strength and direction of the correlation between the variables and evaluates their monotonicity. Unlike Pearson's correlation, SC does not require the datasets to follow a normal distribution. The coefficient ranges between +1 and −1, with 0 indicating no correlation. SC demonstrates that energy levels vary across the months of the year; the dataset exhibits a stronger correlation with the months than with the hours of the day.
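A minimal pandas sketch of this screening is shown below; the file name and column names are hypothetical stand-ins for the CBECS fields described above.

```python
# Sketch: Pearson and Spearman correlation screening of weather vs energy.
import pandas as pd

df = pd.read_csv("cbecs_hourly.csv")   # hypothetical file name
# Linear association among temperature, humidity, and energy use.
print(df[["Temp", "U", "energy"]].corr(method="pearson"))
# Monotonic (rank) association of energy with month and hour.
print(df[["month", "HH", "energy"]].corr(method="spearman"))
```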

FIG. 1. Correlation analysis.

1. Segmentation of time intervals

Clustering methodologies such as K-means are employed to partition the day into distinct segments, namely, morning, afternoon, evening, and midnight; the weeks into windows of 1, 2, and 4 weeks; the months into weekdays and weekends; and the seasons into summer, winter, fall, and spring. This enables the model to accurately represent the distinct load patterns that occur at different periods throughout the day.

2. Clustering feature engineering

To enhance clustering, characteristics such as the time of day, the day of the week, and the temperature are used.

3. Cluster labels

A label is assigned to each data point based on its time-of-day segment, as sketched below.
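A minimal sketch of steps 1–3 follows, assuming synthetic data; the feature names mirror the text but are illustrative.

```python
# Sketch: K-means segmentation on (hour, weekday, temperature) features,
# with the resulting cluster label attached to each sample.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "hour": rng.integers(0, 24, 1000),
    "weekday": rng.integers(0, 7, 1000),
})
# Synthetic temperature with a daily cycle, standing in for the real series.
df["temp"] = 15 + 10 * np.sin(2 * np.pi * df["hour"] / 24) + rng.normal(0, 2, 1000)

feats = StandardScaler().fit_transform(df[["hour", "weekday", "temp"]])
df["segment"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(feats)
# Each row now carries a segment label (morning/afternoon/evening/midnight-like).
```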

Scaling is then carried out to improve model performance. The most common scalers are standard, min-max, and robust. A standard scaler rescales a feature to a zero-mean, unit-variance distribution; min-max scaling rescales a feature or observation value to the 0–1 range; and robust scaling centers the median at zero and scales the data to the interquartile range. Here, the data are standardized and normalized between −2 and 1. Standard scaling is used since it achieves an R2 of 0.865 and an RMSE of 5.52, the best results among the scalers tested, including min-max (R2: 0.8545; RMSE: 6.0785) and robust (R2: 0.8617; RMSE: 5.77). Figure 2 shows the scaling output, and a comparison sketch follows.
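The scaler comparison could be scripted as follows; the data here are synthetic placeholders, so the printed scores will not match the reported ones.

```python
# Sketch: scoring standard, min-max, and robust scaling with a baseline SVR.
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(1)
X, y = rng.random((1500, 10)), rng.random(1500)   # synthetic placeholders

for name, scaler in [("standard", StandardScaler()),
                     ("min-max", MinMaxScaler()),
                     ("robust", RobustScaler())]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)
    X_tr_s = scaler.fit_transform(X_tr)   # fit on the training split only
    X_te_s = scaler.transform(X_te)
    pred = SVR().fit(X_tr_s, y_tr).predict(X_te_s)
    print(name,
          round(float(np.sqrt(mean_squared_error(y_te, pred))), 4),
          round(r2_score(y_te, pred), 4))
```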

Hyperparameter tuning is the process of finding the hyperparameters that provide the best performance of machine learning models. HT utilizes various methodologies, such as random search, grid search, Bayesian optimization, and a number of optimization algorithms. Four mechanisms are combined for tuning here: multiswarm PSO (MSPSO), Levy flight (LF), elite opposition-based learning (EOBL), and gold opposition-based learning (GOBL).

Multi-swarm PSO: This is based on the idea of using several swarms that explore different regions of the hyperparameter space. By using several swarms instead of a single one, multiswarm PSO reduces the possibility of convergence to local optima.

Levy Flight: Levy flight combines small steps with occasional long random jumps. This enhances exploration by enabling the particles to jump to faraway areas of the search space.

EOBL (Elite Opposition-Based Learning): It builds opposite positions within the search space for highly performing particles, increasing exploration and reducing the chance of early convergence.

Gold Opposition-Based Learning: In this technique, gold opposition is used to create a “golden” opposing solution, enhancing exploration around the optimal solutions obtained so far.

The four mechanisms above are combined to form the multiswarm enhanced Levy flight particle swarm optimization (MSELFPSO) algorithm, whose pseudocode is given in Table I.

TABLE I.

Pseudocode of hyperparameter tuning-MSELFPSO.

1. Initialize parameters:
   Number of swarms (N_swarm), number of particles per swarm (N_particle), number of dimensions (D) (representing the SVR hyperparameters C, epsilon, and gamma), maximum number of iterations (max_iter), inertia weight (w), cognitive coefficient (c1), social coefficient (c2), Levy flight parameter (beta), and elite opposition parameters
2. Define the fitness function:
   The fitness function is the error metric for SVR (e.g., mean squared error on validation data)
3. Initialize swarm particles:
   For each swarm (i = 1 to N_swarm):
     For each particle (j = 1 to N_particle):
       Randomly initialize position P[i][j] in D dimensions (the SVR hyperparameter space: C, epsilon, gamma)
       Randomly initialize velocity V[i][j]
       Evaluate fitness F[i][j] = SVR_error(P[i][j])
       Set personal best P_best[i][j] = P[i][j] and F_best[i][j] = F[i][j]
4. Apply elite opposition-based learning (EOBL):
   a. Calculate the opposition position O[i][j] = Opposite_position(P[i][j]) for each particle, based on the EOBL formula
   b. Evaluate fitness F_O[i][j] = SVR_error(O[i][j])
   c. If F_O[i][j] is better than F[i][j], replace P[i][j] with O[i][j] and update P_best[i][j] and F_best[i][j]
5. For each iteration (iter = 1 to max_iter):
   a. Update the global best G_best[i] of each swarm based on the best fitness in the swarm
   b. For each particle (i, j):
      Generate the Levy flight step L = Levy_flight(beta) for exploration
      Update velocity: V[i][j] = w * V[i][j] + c1 * random() * (P_best[i][j] - P[i][j]) + c2 * random() * (G_best[i] - P[i][j]) + L
      Update position: P[i][j] = P[i][j] + V[i][j]
      Ensure particles stay within the bounds of the hyperparameter space (apply constraints)
      Evaluate the new fitness F[i][j] = SVR_error(P[i][j])
      If F[i][j] < F_best[i][j], set P_best[i][j] = P[i][j] and F_best[i][j] = F[i][j]
   c. Update the global best for each swarm: if F_best[i][j] < G_best[i], set G_best[i] = P_best[i][j]
   d. Apply gold opposition-based learning (GOBL) to G_best:
      Generate the gold opposition solution G_Opp[i] based on the gold opposition rules
      Evaluate G_Opp_fitness = SVR_error(G_Opp[i])
      If G_Opp_fitness is better than G_best[i], set G_best[i] = G_Opp[i]
6. At each iteration, apply elite opposition-based learning to the top-performing (elite) particles: generate opposite positions, compare fitness, and update positions accordingly to enhance diversity and prevent stagnation.
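A compact Python sketch of the particle moves at the heart of Table I is given below, assuming the common Mantegna formulation of the Levy step; the coefficient values and function names are illustrative assumptions, not the authors' released code.

```python
# Sketch: Levy step, velocity/position update, and opposition reflection.
import numpy as np
from math import gamma, sin, pi

def levy_step(beta, size, rng):
    # Mantegna's algorithm for a Levy-distributed step of index beta.
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def update_particle(p, v, p_best, g_best, bounds, rng,
                    w=0.7, c1=1.5, c2=1.5, beta=1.5):
    # Velocity update of step 5(b), with the Levy term added for exploration.
    L = levy_step(beta, p.shape, rng)
    v = (w * v + c1 * rng.random(p.shape) * (p_best - p)
               + c2 * rng.random(p.shape) * (g_best - p) + L)
    # bounds has shape (D, 2): lower and upper limits per hyperparameter.
    p = np.clip(p + v, bounds[:, 0], bounds[:, 1])
    return p, v

def opposite_position(p, bounds):
    # Opposition reflection across the search-space midpoint (steps 4 and 6).
    return bounds[:, 0] + bounds[:, 1] - p
```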

Enhanced Dynamically Weighted Multiple Kernel Support Vector Regression (EDW-MKSVR) is an advanced machine learning technique specifically designed to address intricate, nonlinear regression problems. To effectively capture intricate patterns and adapt to diverse data distributions, the model incorporates a dynamic weighting mechanism over multiple kernel functions. By segmenting the data into clusters through clustering-based preprocessing, the method ensures that each region is represented by a kernel appropriate to its unique characteristics. This reduces data heterogeneity and enhances the model's predictive accuracy and robustness. In addition, mini-batch gradient descent improves the optimization process in EDW-MKSVR by efficiently updating the kernel weights with small subsets of the training data. This not only expedites computation but also ensures that the model scales to large datasets and reduces overfitting. EDW-MKSVR is an effective solution for real-world regression challenges, making it well suited for applications in financial forecasting, climate modeling, biomedical signal analysis, and industrial optimization due to its adaptability and efficiency. Figure 3 visualizes the development of EDW-MKSVR.

FIG. 3. Development of EDW-MKSVR.

The pseudocode of EDW-MKSVR is given in Table II.

TABLE II.

Pseudocode for dynamically weight-adjusted MKSVR algorithm.

Algorithm: Dynamic kernel weight-adjusted MKSVR
1. Input the electricity consumption data and the weather data
2. Cluster: create clusters for morning, midnoon, evening, and night; 1-, 2-, and 4-week windows; summer, fall, winter, and spring; and weekdays and weekends
   Add the cluster labels to the data
   Standardize the features
3. Split the data into training and testing sets
   For each cluster C_k (k = 1 to K):
     Initialize the weights w_{k,p} for each kernel K_p
     Initialize the SVR parameters (C, epsilon, etc.)
   Repeat until convergence:
4.   Compute the weighted kernel matrix K_combined by combining the kernels with their weights
     Solve the EDW-MKSVR optimization problem for cluster C_k using K_combined
5.   Update the kernel weights w_{k,p} based on the regression error
     Ensure the weights are non-negative and sum to 1
6. For each new data point x_test:
   Assign x_test to the nearest cluster C_k
   Predict using the trained MK-SVR model with the optimal kernel weights for C_k, then display the prediction
7. Otherwise, update the kernel weights using mini-batch gradient descent
   Continue until the minimum error is achieved
End
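A minimal Python sketch of this loop follows. It is an assumed reading of Table II rather than the authors' implementation: `combined_kernel`, `fit_cluster`, the batch size, the learning rate, and the error-driven weight update are all illustrative choices, and the kernels are passed in as Gram-matrix callables such as those defined after Eqs. (1)–(4) below.

```python
# Sketch: per-cluster SVR on a weighted kernel combination, with the weights
# nudged by a mini-batch update that favors kernels with lower batch error.
import numpy as np
from sklearn.svm import SVR

def combined_kernel(X1, X2, kernels, weights):
    # Weighted sum of the candidate kernels' Gram matrices.
    return sum(w * k(X1, X2) for k, w in zip(kernels, weights))

def fit_cluster(X, y, kernels, weights, lr=0.05, epochs=20, batch=64, seed=0):
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    for _ in range(epochs):
        idx = rng.choice(len(X), size=min(batch, len(X)), replace=False)
        Xb, yb = X[idx], y[idx]
        # Heuristic mini-batch step: measure each kernel's batch error and
        # shift weight toward the better-performing kernels.
        errs = np.array([
            np.mean((yb - SVR(kernel="precomputed")
                     .fit(k(Xb, Xb), yb).predict(k(Xb, Xb))) ** 2)
            for k in kernels])
        weights -= lr * (errs - errs.mean())
        weights = np.clip(weights, 1e-6, None)
        weights /= weights.sum()   # non-negative and summing to 1, as in step 5
    K_full = combined_kernel(X, X, kernels, weights)
    return SVR(kernel="precomputed").fit(K_full, y), weights
```

At prediction time, a test point would be assigned to its nearest cluster and scored with the Gram row built between that point and the cluster's training samples using the learned weights.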

1. Linear kernel

$K(x_i, x_j) = x_i^{T} x_j$.  (1)

Equation (1) is the linear kernel, which is primarily used for situations that can be separated linearly.

2. Polynomial kernel

$K(x_i, x_j) = (x_i^{T} x_j + c)^{d}$.  (2)

Equation (2) is the polynomial kernel, where c is a constant offset and the polynomial's order d is greater than or equal to 1.

3. RBF kernel

$K(x_i, x_j) = \exp\left(-\dfrac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right)$.  (3)

The span of the kernel function is shown in Eq. (3), where σ > 0 is the RBF bandwidth.

4. Sigmoid kernel

$K(x_i, x_j) = \tanh(\beta\, x_i^{T} x_j + \theta)$,  (4)
where Eq. (4) gives the sigmoidal kernel function, β is the kernel's slope, and θ is the kernel's intercept.
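For reference, the four candidate kernels of Eqs. (1)–(4) can be written as Gram-matrix callables compatible with the weighting sketch above; the parameter defaults are illustrative.

```python
# Sketch: the four candidate kernels as Gram-matrix functions.
import numpy as np

def linear_kernel(X1, X2):
    return X1 @ X2.T                                           # Eq. (1)

def polynomial_kernel(X1, X2, d=2, c=1.0):
    return (X1 @ X2.T + c) ** d                                # Eq. (2), d >= 1

def rbf_kernel(X1, X2, sigma=1.0):
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :] - 2 * X1 @ X2.T)    # pairwise ||xi - xj||^2
    return np.exp(-sq / (2 * sigma**2))                        # Eq. (3), sigma > 0

def sigmoid_kernel(X1, X2, beta=0.01, theta=0.0):
    return np.tanh(beta * X1 @ X2.T + theta)                   # Eq. (4)
```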

Root Mean Square Error (RMSE) uses Euclidean distance to quantify how predictions deviate from the observed values.29 The coefficient of determination (R2) is a metric that quantifies the proportion of variance in a dependent variable that can be attributed to an independent variable inside a regression model.30 
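Explicitly, for $n$ observations with actual values $y_i$, predictions $\hat{y}_i$, and mean $\bar{y}$, the two metrics take their standard forms:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}, \qquad R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}.$$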

1. Models to be compared

The model performance is evaluated in comparison to machine learning and deep learning techniques. KNN is a simple method that classifies data by analyzing the closest adjacent points, leading to ease of implementation but diminished performance with large datasets. Decision trees utilize a branching structure for decision-making, offering interpretability, but are prone to overfitting. Random forest improves this by utilizing many decision trees to reduce overfitting and increase accuracy; nevertheless, it may be more difficult to comprehend. Boosting trees, including gradient boosting, develop models incrementally, addressing the deficiencies of prior models, which leads to high accuracy but heightened vulnerability to overfitting. LSTM is a neural network architecture designed for sequential data, such as time series or text, utilizing gates to preserve important information, thereby improving its effectiveness for complex sequential datasets; however, it requires careful tuning and greater computational resources.

2. Experiment settings

The hyperparameters for our selected methodology are delineated in Table IV. This study employed the MSELFPSO model with an initial learning rate of 0.10 and a scheduler that modulates the learning rate based on validation-loss monitoring: the learning rate is halved if no improvement is seen after ten successive epochs, as sketched below. In addition, a mini-batch gradient descent method is employed to reduce forecast error by accounting for the deviations between real and expected loads. All models are implemented for several time horizons, including hour, weekdays, weekends, season, and human behavior. The search space is carefully configured, as proper initialization yields optimal results. The method evaluations are conducted in Python on a 64-bit Windows 10 machine powered by a 2.30 GHz Intel® Core i7-10510U processor, along with 16.0 GB of RAM.
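A small sketch of such a plateau-driven schedule is shown here, assuming simple best-loss bookkeeping; the class and its fields are illustrative, not the authors' code.

```python
# Sketch: halve the learning rate after ten epochs without validation improvement.
class PlateauScheduler:
    def __init__(self, lr=0.10, factor=0.5, patience=10):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best, self.stale = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.stale = val_loss, 0   # new best: reset the counter
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor            # decay by 0.5, as in the text
                self.stale = 0
        return self.lr
```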

The HT process starts with initializing a randomly generated population in a predefined hyperparameter space using the multiswarm enhanced Levy flight particle swarm optimization (MSELFPSO). The space ranges taken for analysis are given in Table III. The rationale for specifying the configuration spaces in Table III is to provide a structured and guided environment, allow for sensitivity analysis, avoid infeasible configurations, and let all the compared algorithms operate in the same search space. Table IV gives the hyperparameters of the proposed method; for the other algorithms, default values are taken as hyperparameter values. Tables V and VI compare the RMSE and R2 values of the optimizers. In nearly all cases, MSELFPSO achieves the lowest RMSE and the highest R2 (only in Case 2 does ELFPSO attain a marginally lower RMSE).

TABLE III.

Space ranges of hyperparameters.

Hyperparameters     Case 1    Case 2    Case 3
C                   0–25      25–50     50–100
E (epsilon)         0–1       0–1       0–1
Γ (gamma)           1–4       4–6       6–7
Kernel function     All four kernels available
TABLE IV.

Hyperparameters utilized.

S.No.   Model                                     Hyperparameters              Value
1.      EDW-MKSVR (initial tuning by MSELFPSO)    C (cost)                     2.4
                                                  E (epsilon)                  0.03
                                                  Γ (gamma)                    0.008 57
                                                  Number of support vectors    20 286
                                                  Learning rate                0.08
                                                  Nfeat
TABLE V.

Comparison of RMSE. Note: bolded values indicate the optimal metrics achieved.

Algorithm    Case 1    Case 2    Case 3
PSO          0.1258    0.1173    0.1226
LFPSO        0.0934    0.0998    0.0878
ELFPSO       0.0732    0.0663    0.0582
MSELFPSO     0.0672    0.0682    0.0553
TABLE VI.

Comparison of R2. Note: bolded values indicate the optimal metrics achieved.

Algorithm    Case 1    Case 2    Case 3
PSO          0.768     0.821     0.801
LFPSO        0.865     0.883     0.883
ELFPSO       0.905     0.914     0.913
MSELFPSO     0.923     0.916     0.921

Initially, the accuracy metrics of EDW-MKSVR are assessed with the actual vs predicted plot and the predicted vs residual plot. The resulting RMSE values show a balanced compromise between low and comparable errors, suggesting that the model has captured the underlying patterns in the data without overfitting or underfitting. Table VII describes the error metrics of the proposed model for different window lengths. The suggested model is evaluated by varying the duration of the windows, such as 1, 2, and 4 weeks, and by factors such as the season and human behavior; it is also contrasted with five distinct machine-learning algorithms. The error measurements R2, RMSE, and MAE are examined. EDW-MKSVR ranked highest in terms of R2 value and had the lowest RMSE values compared with the other traditional models; specifically, for the one-week window, it yields the highest R2 value. The constructed model is also evaluated on its ability to forecast energy use in four distinct seasons: summer, autumn, winter, and spring. In this assessment, the proposed model demonstrates a remarkable R2 value and a reduction in RMSE. Next, the suggested and conventional approaches are compared in terms of human behavior using two settings: weekday behavior and weekend behavior. The constructed model also yields remarkable results in this examination, as clearly shown in Table VII.

FIG. 4. Comparison of computational times.
TABLE VII.

Error metrics for different window lengths.

Metrics   Window length   DT   KNN   BT   RF   LSTM   EDWMKSVR
R2 Morning 0.88 0.91 0.93 0.924 0.93 0.946 
Midnoon 0.87 0.915 0.926 0.93 0.935 0.94 
Evening 0.86 0.92 0.935 0.932 0.936 0.95 
Night 0.865 0.92 0.94 0.931 0.938 0.96 
RMSE Morning 2.79 2.34 2.18 2.45 2.29 2.14 
Midnoon 2.76 2.36 2.17 2.46 2.28 2.13 
Evening 2.73 2.321 2.16 2.46 2.27 2.12 
Night 2.76 2.34 2.15 2.44 2.29 2.12 
MAE Morning 193 190 180 177 174 164.04 
Midnoon 191 187 179 178 175 172.02 
Evening 192 188 178 176 172 171.01 
Night 191 187 177 175 172 171 
R2 1 week 0.89 0.92 0.937 0.925 0.94 0.95 
2 weeks 0.88 0.917 0.926 0.926 0.941 0.946 
4 weeks 0.87 0.922 0.931 0.93 0.939 0.939 
RMSE 1 week 2.78 2.35 2.17 2.48 2.28 2.13 
2 weeks 2.75 2.34 2.17 2.44 2.27 2.12 
4 weeks 2.74 2.32 2.16 2.46 2.27 2.12 
MAE 1 week 192 189 179 175 172 165.04 
2 weeks 191 188 180 176 173 173.02 
4 weeks 191 188 180 176 172 172.01 
R2 Summer 0.88 0.91 0.92 0.91 0.93 0.96 
Fall 0.87 0.926 0.928 0.918 0.936 0.95 
Winter 0.88 0.918 0.93 0.92 0.946 0.956 
Spring 0.86 0.92 0.929 0.926 0.938 0.958 
RMSE Summer 2.77 2.34 2.18 2.49 2.281 2.14 
Fall 2.751 2.33 2.176 2.441 2.272 2.131 
Winter 2.743 2.32 2.169 2.46 2.26 2.126 
Spring 2.666 2.343 2.22 2.35 2.46 2.2 
MAE Summer 192.06 189.08 179.01 175.01 172.06 164.02 
Fall 191.02 188.12 180.03 176.03 173.08 171.01 
Winter 191.02 188.11 180.06 176.06 172.07 172.01 
Spring 191.03 188.10 180.03 176.05 172.03 172.03 
R2 Weekdays 0.88 0.919 0.928 0.924 0.936 0.952 
Weekends 0.89 0.92 0.927 0.926 0.935 0.941 
RMSE Weekdays 2.75 2.35 2.20 2.34 2.27 2.11 
Weekends 2.74 2.34 2.21 2.33 2.26 2.12 
MAE Weekdays 192 187 181 176 173 171.01 
Weekends 193 188 180 179 172 171.01 

Furthermore, the proposed EDW-MKSVR is compared with various existing machine learning and deep learning models in terms of computational time, as shown in Fig. 4. The suggested EDW-MKSVR model ranks highest on accuracy and error metrics. The model is also analyzed by keeping the initial weights fixed throughout the iterations; the results are reported as FW-MKSVR. Compared with FW-MKSVR, EDW-MKSVR has an increased R2 and a decreased RMSE. After FW-MKSVR, LSTM performs well in prediction, as shown in Table VII. The increased computational complexity reflects the extra time spent recalculating the hyperparameters at each iteration to reach the lowest possible prediction error. The computational times of the several methods vary over a range of 1.36–3.32 min; the elevated computational time indicates that the model spends some additional time adapting to the dataset. FW-MKSVR employs predetermined fixed weights that remain constant during training, resulting in a resilient but potentially less adaptable model, whereas the EDW-MKSVR algorithm dynamically modifies the weights to shift the model's attention across kernels based on the dataset's characteristics. This adaptability can result in enhanced effectiveness in capturing intricate data patterns.

In summary, EDW-MKSVR produces a decreased RMSE value, which shows that the model's predictions are close to the actual values, and an increased R2, which shows that the model predicts the outcomes better irrespective of the time frame. In addition, lower MAE scores indicate that the model's predictions are closer to the actual values on average. Furthermore, these observations hold across all the datasets, regardless of their type. Each iteration incurs a corresponding increase in processing time for error adjustment and weight modification; despite this substantial computational time, the model achieves the highest accuracy and the lowest error metrics.

Numerous studies have demonstrated that optimizing hyperparameters can enhance a model's performance. Our argument posits that using dynamically weighted hyperparameters at each iteration can yield considerably superior outcomes to alternative approaches; more precisely, varying the initial model weights can yield diverse models, even though those models share a similar architecture. Based on the data presented in Figs. 5 and 6, the EDW-MKSVR method correlates strongly with the actual load in comparison with the standard machine learning algorithms LSTM, RF, KNN, BT, and DT. It captures the peaks and troughs more precisely than the other algorithms, and the obtained R-value exhibits a high degree of similarity to the actual series, suggesting a high level of accuracy. To validate the performance of EDW-MKSVR, the convergence curve is shown in Fig. 7; it is plotted with the number of epochs on the x-axis and the MSE value on the y-axis.

FIG. 5. Actual vs predicted.
FIG. 6. Predicted vs residual.
FIG. 7. Convergence graph of EDW-MKSVR.

In the same way, the model captures the underlying pattern for a week, hour, month, and season in Figs. 8–11. The pattern peaks during the mid-hours compared with the early morning and late evening. In addition, weekday consumption is high compared with that on weekends. Figure 12 shows the daily distribution of electricity consumption: consumption is high on weekdays compared with holidays, and the algorithm catches the underlying pattern. Consumption starts increasing from 8 am and keeps rising until the evening, as shown in Fig. 13. Figure 10 shows the monthly electricity consumption, in which consumption rises during the summer months compared with the winter months. Figure 11 shows the seasonal electricity consumption; the graph is plotted by randomly taking weeks from summer, winter, autumn, and spring. Different seasons have varying temperatures, and the building's energy consumption varies accordingly.

FIG. 8. Weekly electricity consumption of EDWMKSVR.
FIG. 9. Hourly electricity consumption.
FIG. 10. Monthly electricity consumption.
FIG. 11. Seasonal consumption.
FIG. 12. Weekly and monthly trend pattern.
FIG. 13. Predicted trend of hourly electricity loads.

1. Reliability analysis

The stability of the proposed algorithm is further investigated on two other datasets. The first experiment uses Morocco's electricity consumption dataset: a CSV file containing three columns of electric power consumption data collected at six samples per hour (one sample every 10 min), with each of the three divisions corresponding to a distinct zone within the Moroccan city of Tétouan. Features such as time, hour, wind speed, pressure, temperature, diffuse flows, and humidity are added and analyzed. Next, the National Grid ESO, the designated electricity system operator for the United Kingdom, provides data on electricity demand in Great Britain collected since 2009. The data are recorded half-hourly, resulting in 48 entries per day, making this dataset well suited for time series forecasting. EDW-MKSVR is compared with LSTM, BT, and RF. On both additional datasets, EDW-MKSVR predicts the response with high accuracy and low error metrics. LSTM performs well, but it may suffer from the vanishing gradient problem, in which the gradients become extremely small during training, or from the exploding gradient problem. Table VIII shows that the proposed algorithm works well with all the datasets, irrespective of their size and nature. The clustering methodology splits each dataset into segments, and clustering enhances the efficacy of EDWMKSVR more markedly than that of LSTM, as evidenced by the elevated R2 and diminished RMSE values when clustering is utilized. EDWMKSVR demonstrates superior accuracy to LSTM in both circumstances.

TABLE VIII.

Experimental results of different datasets.

S.No.   Dataset           Condition            Methods     R2      RMSE
1       Morocco dataset   With clustering      EDWMKSVR    0.978   1.445
2                         Without clustering               0.967   1.845
3       Great Britain     With clustering      EDWMKSVR    0.969   1.385
4                         Without clustering               0.958   1.425
5       Morocco dataset   With clustering      LSTM        0.945   2.356
6                         Without clustering               0.934   2.212
7       Great Britain     With clustering      LSTM        0.948   2.18
8                         Without clustering               0.943   2.142
9       Morocco dataset   With clustering      RF          0.938   2.345
10                        Without clustering               0.937   2.445
11      Great Britain     With clustering      RF          0.939   2.385
12                        Without clustering               0.938   2.425
13      Morocco dataset   With clustering      BT          0.925   2.556
14                        Without clustering               0.924   2.712
15      Great Britain     With clustering      BT          0.928   2.518
16                        Without clustering               0.923   2.742

This article addresses the primary challenges associated with developing a building energy forecasting model (BEFM), including model complexity, data restrictions, uncertainty, and scalability. Effective load planning, participation in demand response programs, seamless integration of renewable energy sources, and enhanced energy efficiency are among the numerous benefits of building energy management. The BEFM can be integrated with EDW-MKSVR by employing clustering techniques. MSELFPSO improves the accuracy of the models by precisely adjusting the initial parameters of EDW-MKSVR, and the model subsequently refines the parameters at each iteration using the outputs of the mini-batch gradient descent approach. The model is then assessed on the CBECS dataset. We evaluate the forecasting model against LSTM, RFR, BT, KNN, and DT over a diverse array of window lengths. The results clearly demonstrate that EDW-MKSVR with clustering has a robust capacity for generalization and learning. The model demonstrated improved efficacy as the concentration of the model increased. We compare the predicted results for a variety of time periods, such as hours, days, weekends, weeks, and seasons; the model demonstrates its effectiveness through an increased R2 and a decreased RMSE when coping with heavily nonlinear data. In comparison with other algorithms, such as KNN, BT, DT, RFR, and LSTM, the EDW-MKSVR algorithm exhibits the lowest error metrics, thereby surpassing the other models. The model's resilience is assessed and validated on two additional datasets, each under two conditions, one with clustering and the other without; comparing the resulting R2 and RMSE values makes it evident that the model adapts to any dataset. The results also suggest that clustering produces superior results: the clustering methodology makes it feasible to comprehend the fundamental patterns and reduces the iteration time, as the larger datasets are clustered. The EDW-MKSVR model is notably beneficial for load prediction tasks, offering enhanced precision at a marginally increased computational overhead due to the additional time required for weight adjustments. The proposed framework is anticipated to be applicable to a wide range of structures and malls, regardless of the specific dataset used. The EDW-MKSVR forecasting model significantly reduces the obstacles associated with model development.

We are grateful to all those who helped us during this project. We thank the open-source platform GitHub, from which we collected data for the project, and the Coimbatore Institute of Technology for providing the resources to complete this work.

The authors have no conflicts to disclose.

C. Jeevakarunya: Data curation (equal); Formal analysis (equal); Methodology (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). V. Manikandan: Conceptualization (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal).

The datasets generated and/or analyzed during the current study are available in the energy information and industry repository https://www.eia.gov/consumption/commercial/data/.

ARIMA

Autoregressive integrated moving average

ARMA

Autoregressive moving average

BKF

Blind Kalman filter

BT

Boosted tree

CBECS

Commercial building energy consumption survey

DT

Decision trees

EAM

Energy anticipating model

EDW-MKSVR

Enhanced dynamically weighted multiple Kernel support vector regression

EOBL

Elite opposition-based learning

KNN

K-nearest neighbor

LSTM

Long short term memory

MSLFPSO

Multiswarm Levy-flight particle swarm optimization

PC

Pearson’s correlation coefficient

R2

Coefficient of determination

RFR

Random forest regression

RMSE

Root mean square error

SC

Spearman’s correlation

STELF

Short-term electricity load forecasting

SVR

Support vector regression

TSRA

Time series regression analysis


1. S. Seyedzadeh, F. P. Rahimian, I. Glesk, and M. Roper, "Machine learning for estimation of building energy consumption and performance: A review," Visualization Eng. 6(1), 5 (2018).
2. X. Li and J. Wen, "Review of building energy modeling for control and operation," Renewable Sustainable Energy Rev. 37, 517–537 (2014).
3. M. Bourdeau, X. Q. Zhai, E. Nefzaoui, X. Guo, and P. Chatellier, "Modeling and forecasting building energy consumption: A review of data-driven techniques," Sustainable Cities Soc. 48, 101533 (2019).
4. G. Zhang, C. Tian, C. Li, J. J. Zhang, and W. Zuo, "Accurate forecasting of building energy consumption via a novel ensembled deep learning method considering the cyclic feature," Energy 201, 117531 (2020).
5. A. D'Amico, G. Ciulla, L. Tupenaite, and A. Kaklauskas, "Multiple criteria assessment of methods for forecasting building thermal energy demand," Energy Build. 224, 110220 (2020).
6. J. Moon, S. Park, S. Rho, and E. Hwang, "Robust building energy consumption forecasting using an online learning approach with R ranger," J. Build. Eng. 47, 103851 (2022).
7. C. Deb, F. Zhang, J. Yang, S. E. Lee, and K. W. Shah, "A review on time series forecasting techniques for building energy consumption," Renewable Sustainable Energy Rev. 74, 902–924 (2017).
8. J. Zhang, Y. M. Wei, D. Li, Z. Tan, and J. Zhou, "Short term electricity load forecasting using a hybrid model," Energy 158, 774–781 (2018).
9. W. Zhu, H. Ma, G. Cai, J. Chen, X. Wang, and A. Li, "Research on PSO-ARMA-SVR short-term electricity consumption forecast based on the particle swarm algorithm," Wireless Commun. Mobile Comput. 2021, 6691537.
10. T. Ahmad, H. Chen, Y. Guo, and J. Wang, "A comprehensive overview on the data driven and large scale based approaches for forecasting of building energy demand: A review," Energy Build. 165, 301–320 (2018).
11. H. Yang, Z. Wang, and K. Song, "A new hybrid grey wolf optimizer-feature weighted-multiple kernel-support vector regression technique to predict TBM performance," Eng. Comput. 38(3), 2469–2485 (2022).
12. G. Ciulla and A. D'Amico, "Building energy performance forecasting: A multiple linear regression approach," Appl. Energy 253, 113500 (2019).
13. N. Zhang, Z. Li, X. Zou, and S. M. Quiring, "Comparison of three short-term load forecast models in Southern California," Energy 189, 116358 (2019).
14. H. Kim, S. Park, and S. Kim, "Time-series clustering and forecasting household electricity demand using smart meter data," Energy Rep. 9, 4111–4121 (2023).
15. L. Sehovac and K. Grolinger, "Deep learning for load forecasting: Sequence to sequence recurrent neural networks with attention," IEEE Access 8, 36411–36426 (2020).
16. S. S. Yassin and Pooja, "Road accident prediction and model interpretation using a hybrid K-means and random forest algorithm approach," SN Appl. Sci. 2(9), 1576 (2020).
17. T. Alquthami, M. Zulfiqar, M. Kamran, A. H. Milyani, and M. B. Rasheed, "A performance comparison of machine learning algorithms for load forecasting in smart grid," IEEE Access 10, 48419–48433 (2022).
18. T. Pinto, I. Praça, Z. Vale, and J. Silva, "Ensemble learning for electricity consumption forecasting in office buildings," Neurocomputing 423, 747–755 (2021).
19. M. C. Pegalajar, L. G. B. Ruiz, M. P. Cuéllar, and R. Rueda, "Analysis and enhanced prediction of the Spanish electricity network through big data and machine learning techniques," Int. J. Approximate Reasoning 133, 48–59 (2021).
20. Z. Dong, J. Liu, B. Liu, K. Li, and X. Li, "Hourly energy consumption prediction of an office building based on ensemble learning and energy consumption pattern classification," Energy Build. 241, 110929 (2021).
21. Y. Ding, L. Fan, and X. Liu, "Analysis of feature matrix in machine learning algorithms to predict energy consumption of public buildings," Energy Build. 249, 111208 (2021).
22. W. Kong, Z. Y. Dong, D. J. Hill, F. Luo, and Y. Xu, "Short-term residential load forecasting based on resident behaviour learning," IEEE Trans. Power Syst. 33(1), 1087–1088 (2018).
23. M. Cai, M. Pipattanasomporn, and S. Rahman, "Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques," Appl. Energy 236, 1078–1088 (2019).
24. H. Eskandari, M. Imani, and M. P. Moghaddam, "Convolutional and recurrent neural network based model for short-term load forecasting," Electr. Power Syst. Res. 195, 107173 (2021).
25. D. Kiruthiga and V. Manikandan, "Levy flight-particle swarm optimization-assisted BiLSTM+ dropout deep learning model for short-term load forecasting," Neural Comput. Appl. 35(3), 2679–2700 (2023).
26. T. Si, P. B. C. Miranda, and D. Bhattacharya, "Novel enhanced Salp Swarm Algorithms using opposition-based learning schemes for global optimization problems," Expert Syst. Appl. 207, 117961 (2022).
27. C. Jeevakarunya, V. Manikandan, C. Karthick, S. Seenivasan, and N. Nandhini, "Grasshopper optimized tuning of support vector regression for day ahead prognostic problem," in 2023 International Conference on Integrated Circuits and Communication Systems (ICICACS 2023) (IEEE, 2023).
28. M. Barman and N. B. Dev Choudhury, "Season specific approach for short-term load forecasting based on hybrid FA-SVM and similarity concept," Energy 174, 886–896 (2019).
29. D. S. K. Karunasingha, "Root mean square error or mean absolute error? Use their ratio as well," Inf. Sci. 585, 609–629 (2022).
30. J. Miles, "R-squared, adjusted R-squared," Encycl. Stat. Behav. Sci. 4, 1655–1657 (2005).