Rising energy demand, resource depletion, and environmental issues tied to fossil fuels demand a transition to renewable energy. Solar power, abundant and well-established, presents a promising solution to address our expanding energy requirements. The sun radiates an astonishing amount of energy every second, far more than humanity’s current and future energy needs. Accurate solar radiation prediction is crucial for optimizing solar panel design, placement, and grid integration. This paper aims to predict daily global solar radiation data for six Pakistani cities: Karachi, Lahore, Islamabad, Quetta, Peshawar, and Multan. It highlights the importance of advanced algorithms and introduces an innovative data collection method using pyranometer sensors and microcontrollers, making data storage and analysis more affordable and efficient while reducing the financial burdens associated with traditional equipment. Focusing on Pakistan’s diverse solar radiation potential, this research evaluates eight machine learning algorithms using seven key statistical metrics to understand and compare their performance in predicting solar radiation. Four algorithms, k-nearest neighbors, Random Forest Regression, Gradient Boosting Regression, and Support Vector Regression (SVR), consistently exhibit remarkable precision, achieving outstanding R2 values of up to 99%. This highlights the crucial role of algorithm selection in solar radiation prediction, with SVR emerging as the top choice. SVR’s precise and reliable forecasts empower renewable energy planning and decision-making. This study provides valuable guidance for decision-makers to optimize solar energy utilization across diverse geographical regions and contributes invaluable insights to the field of renewable energy forecasting.

AI

artificial intelligence

ANN

artificial neural network

DL

deep learning

GA

genetic algorithms

GBR

gradient boosting regression

k-NN

k-nearest neighbors

LR

linear regression

MABE

mean absolute bias error

MAPE

mean absolute percentage error

MBE

mean bias error

ML

machine learning

n

number of observations

R2

R-squared

RFR

random forest regression

RMSE

root mean square error

rRMSE

relative root mean square error

RX/TX

receive/transmit

SR

solar radiation

SVM

support vector machine

SVR

support vector regression

t-stat

T-statistic

xi

measured daily global solar radiation

x̄i

mean of measured daily global solar radiation

yi

predicted daily global solar radiation

Growing demand for energy supplies, worries over their depletion, and environmental issues related to traditional energy sources all contribute to the urgent global energy dilemma. For many years, the primary energy sources have been fossil fuels, including coal, oil, and natural gas. However, their widespread use has produced several serious problems. First, because fossil fuels are finite, they will ultimately run out, which raises questions about the security of the energy supply. These worries are made worse by the increasing demand for energy as the world’s population expands. In addition, extracting and burning fossil fuels emits greenhouse gases into the atmosphere, which contributes considerably to climate change and environmental deterioration. In light of these difficulties, renewable energy sources show promise as a remedy.1 By embracing renewables, we can mitigate the adverse impacts of fossil fuels, ensure a long-term and secure energy supply, reduce greenhouse gas emissions, and foster economic growth while addressing the urgent energy needs of our planet.2,3

Solar energy is widely recognized as one of the most abundant and mature renewable energy technologies available at present. This acknowledgment is because the Earth receives an enormous amount of energy from the sun.4–6 The sun radiates an astonishing amount of energy every second, far more than humanity’s current and future energy needs. Solar energy is undeniably a critical and progressively favored energy source, boasting substantial potential for fulfilling the world’s burgeoning energy demands.7 For the effective use of solar energy in diverse applications, it is crucial to accurately forecast solar radiation. Solar radiation forecasts are essential for estimating the potential energy output of solar panels, assisting in their ideal positioning and alignment for maximum energy output.8,9 This information is crucial for energy planning and grid integration because it helps decision-makers determine how much solar energy can be easily incorporated into current power grids.10,11 Achieving a balanced energy supply and demand, preventing overloads or underutilization, and ensuring grid stability all depend on this information.12,13 Furthermore, forecasts of solar radiation are essential for determining the potential of solar resources in certain areas, which makes it easier to analyze the viability and feasibility of solar energy projects. Developers and investors are empowered by precise projections.

The measurement of solar radiation typically depends on specialized instruments. However, the associated expenses for installation, maintenance, and calibration render them largely inaccessible to many meteorological stations worldwide. This underscores the critical importance of devising alternative methods to predict solar radiation data. Solar radiation prediction depends strongly on time, and incorporating historical solar radiation data improves predictive precision. Consequently, various models, including empirical ones based on mathematical formulas, have been developed for solar radiation forecasting.14 However, empirical models have shown limitations, particularly in accurately predicting daily global solar radiation data. In response to these challenges, the field has witnessed a surge in the use of artificial intelligence (AI) techniques such as support vector machines (SVMs), deep learning (DL), k-nearest neighbors (k-NN), artificial neural networks (ANN), genetic algorithms (GA), and more. These AI algorithms have consistently demonstrated superior accuracy when compared to traditional empirical models in various research studies.15–17

Numerous studies have investigated the performance of AI algorithms vs empirical models in predicting solar radiation, consistently favoring AI’s accuracy.14,16 Quej et al. conducted research in Mexico, employing SVM, Adaptive Neuro-Fuzzy Inference System (ANFIS), and ANN to forecast daily global solar radiation. Their best performance was achieved with SVM, resulting in RMSE = 2.578, MAE = 1.97, and R2 = 0.689.18 Marzo et al. focused on 13 stations, using only ANN to predict daily global solar radiation. By incorporating input features like minimum temperature, maximum temperature, and extraterrestrial solar radiation, they achieved a relative root mean square error (rRMSE) of 13% and an R-squared (R2) value of 0.64.19 Mehdizadeh et al. tackled daily global solar radiation in Kerman, Iran, employing Gene Expression Programming (GEP), ANN, and ANFIS models. Among these, the ANN model stood out with an R2 of 0.935.20 Tymvios et al. compared the Angstrom model with the ANN approach for forecasting global solar radiation data. The ANN method outperformed the Angstrom model.21 Meenal and Selvakumar examined daily global solar radiation using SVM, empirical, and ANN models. SVM produced outstanding results, with a correlation exceeding 0.99.21 Yildirim et al. investigated monthly global solar radiation prediction for four Turkish stations, employing regression analysis and ANN. The ANN model outperformed the other approaches, with an R2 of 0.961 and a root mean square error (RMSE) of 0.14.22 Kaba et al. predicted the monthly average daily global solar radiation at various Turkish stations using a deep learning algorithm. Their DL-based approach achieved an R2 value of 0.98 in the best results.23 Marzouq et al. investigated the application of k-NN, ANN, and empirical models in the prediction of daily global solar radiation. The k-NN model yielded an R2 of 0.96, while a hybrid k-NN and ANN model achieved an R2 of 0.97.24 Ağbulut et al. conducted a comparative analysis of four machine learning algorithms, SVM, ANN, k-NN, and DL, to predict daily solar radiation in Turkey. The study found the ANN to be the most effective of these models, with an R2 value of up to 0.936.25 These studies collectively underscore the effectiveness of AI and machine learning techniques in improving the accuracy of solar radiation predictions across diverse geographical locations.

This study adds to the body of knowledge in three ways. First, it compares eight machine learning techniques, some frequently used and some relatively infrequent, on the same dataset. Second, it forecasts solar radiation in six cities that represent the range of Pakistan’s solar radiation potential, from low to high; solar radiation is predicted with time (days) as the primary factor, and historical solar radiation data are used to enhance the accuracy of predictions. Third, it offers a comprehensive analysis of algorithm performance by combining seven metrics, namely mean bias error (MBE), RMSE, rRMSE, mean absolute bias error (MABE), mean absolute percentage error (MAPE), T-statistic (t-stat), and R2. This holistic approach to solar radiation prediction holds promise for cost-effective and precise data collection and analysis in meteorological research. The remainder of this paper is structured as follows: Sec. II provides an overview of the study regions. Section III describes the data collection process, introduces the machine learning algorithms used, and describes the statistical metrics employed for evaluation. Section IV presents the results of the statistical metrics and the prediction graphs. Finally, Sec. V discusses the study’s findings and draws its conclusion.

Pakistan lies at the meeting point of South and Central Asia, between ∼24°–37° North latitude and 60°–77° East longitude.26 The total land area of Pakistan is around 796 095 square kilometers. Pakistan has considerable solar energy potential, often surpassing that of many European nations. In many parts of the country, there are ∼7–8 h of daily sunshine, amounting to about 2300–2700 h of sunshine per year.27,28 On average, Pakistan receives a solar irradiance of 2400 kWh/m2 per year.29 Solar maps of Pakistan, created by the U.S. National Renewable Energy Laboratory, reveal that numerous regions in the country receive abundant solar irradiance, averaging between 5 and 7 kWh/m2 per day.30–32 Overall, Pakistan has a favorable environment for solar energy production. With supportive policies and increased investment, the country has the potential to harness solar energy to meet its energy needs while reducing its dependence on traditional fossil fuels.

This paper focuses on predicting daily global solar radiation in Pakistan, with particular attention to six designated cities. Table I lists key geographical information for the selected cities, whose locations can be found on the map of Pakistan in Fig. 1.33 Fig. 1 also gives an overview of the expected solar energy potential for various applications within Pakistan. The map shows the long-term averages of yearly and daily global horizontal irradiation (GHI) totals for Pakistan.

TABLE I.

Key geographical characteristics of the chosen cities.

| City | Latitude | Longitude | Elevation (m) |
| --- | --- | --- | --- |
| Karachi | 24.8607° N | 67.0011° E | 10 |
| Lahore | 31.5497° N | 74.3436° E | 217 |
| Islamabad | 33.6844° N | 73.0479° E | 540 |
| Quetta | 30.1798° N | 66.9750° E | 1680 |
| Peshawar | 34.0150° N | 71.5249° E | 359 |
| Multan | 30.1798° N | 71.5249° E | 122 |

FIG. 1.

Pakistan solar resource visualization published by the World Bank Group, funded by the Energy Sector Management Assistance Program (ESMAP), and prepared by SolarGIS.33 


This section is divided into three distinct subsections. To begin, it provides insights into data collection and preprocessing methods. It then offers a concise overview of the machine learning algorithms employed. Finally, the concluding subsection outlines the evaluation metrics utilized in the study.

This paper focuses on forecasting daily mean global solar radiation for six cities in Pakistan. As shown in Fig. 2, solar intensity data were collected with a pyranometer, a specialized device for measuring solar radiation levels. To capture and transmit these data, a MAX485 transceiver was incorporated into the setup alongside an Arduino microcontroller. The Arduino coordinated data collection and transmission through its RX (receive) and TX (transmit) communication pins. The collected data were then relayed to an ESP32 microcontroller, which processed the incoming pyranometer readings and, using its connectivity capabilities, transmitted them to the ThingSpeak server, as shown in Fig. 3. This process enabled real-time monitoring of solar radiation levels. Data collected from January 1, 2022, to December 31, 2022, were compiled into an integrated dataset that serves as the basis for the predictive modeling and analysis in this study.
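The reported quantities are daily values in MJ/m2, whereas a pyranometer typically delivers instantaneous irradiance. A minimal sketch of one way to aggregate logged readings into a daily total is given below. It assumes, purely for illustration, that the readings are in W/m2 and arrive at a fixed sampling interval; the function name, the interval, and the sample values are hypothetical and are not taken from the study.

```python
def daily_radiation_mj(irradiance_w_m2, interval_seconds):
    """Integrate fixed-interval irradiance samples (W/m2) into a daily total (MJ/m2).

    Assumption: each sample represents the average irradiance over one interval,
    so energy per unit area is the sum of (power x time), converted from J to MJ.
    """
    joules_per_m2 = sum(irradiance_w_m2) * interval_seconds
    return joules_per_m2 / 1_000_000


# Hypothetical day: a constant 500 W/m2 for 10 h, sampled every 10 min (60 samples).
samples = [500.0] * 60
daily_total = daily_radiation_mj(samples, interval_seconds=600)
print(daily_total)
```

In practice the sampling interval and any gaps in the logged series would need to be taken from the actual server export.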

FIG. 2.

IoT-based solar intensity monitoring circuit.

FIG. 3.

ThingSpeak server interface.


Machine learning (ML) is a subset of artificial intelligence (AI) that has gained immense popularity due to its applicability across a wide range of domains. ML enables systems to learn patterns from data and make predictions about unknown outcomes. The effectiveness of an ML algorithm depends heavily on the judicious selection of features and the success of its training process. This study used eight ML algorithms: Linear Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), Deep Learning (DL), Random Forest Regression (RFR), Gradient Boosting Regression (GBR), and Support Vector Regression (SVR).

All of these algorithms were implemented in Python within the Google Colab environment. The dataset was split randomly with shuffled sampling, with 80% of the data used for training and the remaining 20% reserved for testing. The same dataset was used for training and testing across all methods so that every model is evaluated against the same labels. This standardized approach allows more meaningful comparisons among the ML models.
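A minimal sketch of an 80/20 shuffled split is shown below, using only the Python standard library. The function name, the seed, and the synthetic series are illustrative assumptions; the study does not state which routine or seed was used.

```python
import random


def shuffled_split(days, radiation, test_fraction=0.2, seed=42):
    """Shuffle paired samples and split them into training and testing subsets."""
    if len(days) != len(radiation):
        raise ValueError("features and targets must have the same length")
    indices = list(range(len(days)))
    random.Random(seed).shuffle(indices)          # shuffled sampling, reproducible via the seed
    n_test = round(len(indices) * test_fraction)  # 20% of the samples by default
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(days[i], radiation[i]) for i in train_idx]
    test = [(days[i], radiation[i]) for i in test_idx]
    return train, test


# Hypothetical one-year series: day of year as the feature, placeholder targets.
days = list(range(1, 366))
radiation = [15.0 + 0.01 * d for d in days]  # synthetic values, not measured data
train, test = shuffled_split(days, radiation)
print(len(train), len(test))
```

Fixing the seed and reusing the same `train` and `test` lists for every model is one way to keep the comparison between algorithms consistent.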

1. Linear regression (LR)

Linear regression stands as a fundamental machine learning technique employed to model the connection between a dependent variable (often referred to as the target) and one or more independent variables (commonly known as features). The objective of linear regression is to determine a linear equation that effectively captures the association between certain factors (such as the day) and solar radiation.34 
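For a single feature such as the day index, the linear equation can be obtained in closed form by ordinary least squares. The sketch below illustrates this under the assumption of one feature; the data are synthetic and the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept for one feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)                         # spread of the feature
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # co-variation with the target
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic example: radiation rising by 2 MJ/m2 per day (not measured data).
days = [1, 2, 3, 4, 5]
radiation = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(days, radiation)
prediction_day_6 = slope * 6 + intercept
print(slope, intercept, prediction_day_6)
```

A single straight line in the day index cannot follow a seasonal rise and fall over a full year, which may be one reason for the low R2 values reported for LR in Table III.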

2. Artificial neural network (ANN)

For the ANN, feature standardization is first performed using mean and variance scaling so that all features share a uniform scale. An ANN regressor model is then constructed with two hidden layers, each comprising 50 neurons. A maximum iteration limit is set for training, and the random state parameter is specified to ensure reproducibility.
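Mean and variance scaling can be sketched as follows. The example assumes that the scaling parameters are estimated on the training features only and then reused for the test features, and that the population variance is used; neither detail is stated in the study, and the function names are hypothetical.

```python
import math


def fit_scaler(values):
    """Estimate the mean and standard deviation used for standardization."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)  # population variance (assumption)
    std = math.sqrt(variance) or 1.0  # fall back to 1.0 for a constant feature
    return mean, std


def transform(values, mean, std):
    """Apply z = (v - mean) / std to every value."""
    return [(v - mean) / std for v in values]


train_days = [1.0, 2.0, 3.0, 4.0, 5.0]  # synthetic feature values
test_days = [6.0]
mean, std = fit_scaler(train_days)
train_scaled = transform(train_days, mean, std)
test_scaled = transform(test_days, mean, std)  # reuse the training statistics
print(train_scaled, test_scaled)
```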

3. Support vector machine (SVM)

The Support Vector Machine (SVM) is a versatile machine learning algorithm originally designed for binary classification tasks. Its use has since extended to regression and outlier detection problems, particularly in scenarios involving nonlinear systems. SVM is a supervised learning method that relies on the principles of statistical learning theory. Its primary objective in classification tasks is to create a clear separation between two classes of data. This separation is represented as a line in two-dimensional space, a plane in three-dimensional space, or a hyperplane in higher dimensions.35,36

4. k-nearest neighbors (k-NN)

The k-nearest neighbors (k-NN) algorithm, applied here to predict solar intensity, relies on the principle of similarity: data points close to each other in the feature space are assumed to share similar target values. The procedure involves data preparation, including feature scaling; creation of a k-NN regressor model with the number of neighbors set to 5; model training; and prediction of the “MJ/m2” values. k-NN is valued for its simplicity and effectiveness, particularly in regression tasks where proximity between data points plays a pivotal role in making predictions. Proper selection of the number of neighbors is crucial for optimal model performance.37
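The similarity principle can be illustrated with a small k-NN regressor for a single feature. The sketch below uses k = 5 as stated above; the unweighted average, the absolute-difference distance, and the synthetic data are assumptions made for illustration.

```python
def knn_predict(train_x, train_y, query, k=5):
    """Predict by averaging the targets of the k training points nearest to the query."""
    if k > len(train_x):
        raise ValueError("k cannot exceed the number of training samples")
    # Sort training samples by absolute distance in the one-dimensional feature space.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


# Synthetic training data (not measured values).
train_days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
train_rad = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]
estimate = knn_predict(train_days, train_rad, query=5.2)
print(estimate)
```

Here the five nearest days to 5.2 are 5, 6, 4, 7, and 3, so the estimate is the mean of their targets.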

5. Deep learning (DL)

Deep Learning (DL) is a widely employed technique for addressing intricate problems that involve very large datasets. Deep learning can be applied to supervised, semi-supervised, and unsupervised tasks. A neural network model is constructed with three layers: an input layer with 64 neurons using the Rectified Linear Unit (ReLU) activation function, a hidden layer with 32 neurons also using ReLU, and an output layer with a single neuron. The model is compiled using the “adam” optimizer and the “mean_squared_error” loss function. The model is trained on the training data for 100 epochs with a batch size of 32. Once trained, the model predicts “MJ/m2” values using the test data.38

6. Random forest regression (RFR)

Random forest regression is an ensemble learning approach that merges numerous decision trees to enhance precision while mitigating the risk of overfitting. It is robust, handles high-dimensional data well, and captures complex relationships, which makes it effective for predicting solar intensity.39 A random forest regression model is created and trained on the training data using 100 decision trees (n_estimators = 100) for stability, with a fixed random state for reproducibility.

7. Gradient boosting regression (GBR)

Gradient boosting regression builds an additive model of decision trees, excelling in predictive accuracy and being less prone to overfitting. It is often used for regression tasks and can handle missing data. A gradient boosting regression model is created and trained on the training data using 100 estimators (n_estimators = 100) for robustness, with a predefined random state for reproducibility.
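The additive construction can be sketched with one-split trees (stumps) fitted to the residuals of the current model. This is a simplified illustration for squared-error loss and a single feature, not the configuration used in the study: only n_estimators = 100 comes from the text, while the learning rate, the stump depth, and the synthetic data are assumptions.

```python
def fit_stump(x, residuals):
    """Find the threshold split on x that minimizes the squared error of the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # every split leaves both sides non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(x, y, n_estimators=100, learning_rate=0.1):
    """Start from the mean target and repeatedly add a shrunken stump fitted to the residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, p in zip(x, predictions)
        ]
    return base, stumps, learning_rate


def predict_boosting(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if xi <= t else rm) for t, lm, rm in stumps)


# Synthetic seasonal-looking series (not measured data).
x = list(range(1, 13))
y = [8.0, 10.0, 13.0, 17.0, 21.0, 24.0, 25.0, 23.0, 19.0, 15.0, 11.0, 8.0]
model = fit_boosting(x, y)
baseline_mse = sum((yi - model[0]) ** 2 for yi in y) / len(y)
boosted_mse = sum((yi - predict_boosting(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
print(baseline_mse, boosted_mse)
```

Each round can only lower the training error, which is why a held-out test set remains necessary to judge overfitting.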

8. Support vector regression (SVR)

Support vector regression (SVR) extends SVM to regression tasks, aiming to find a function that fits the data within a certain epsilon tube while minimizing margin violations. It is effective for regression with non-linear relationships and outliers, and it is employed here to predict solar intensity.40 The data are split into features (X) and targets (y). An SVR model is created and fitted to the data using a radial basis function (RBF) kernel with hyperparameters C = 100 and gamma = “auto”. Predictions are made on the full dataset, and an array of performance metrics is calculated.41
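Two ingredients of this description, the epsilon tube and the RBF kernel, can be written out directly. The sketch below is illustrative only: the epsilon value is an assumption, and gamma = 1 / n_features reflects how scikit-learn interprets gamma = "auto"; whether the study used scikit-learn is not stated.

```python
import math


def rbf_kernel(a, b, gamma):
    """RBF similarity exp(-gamma * ||a - b||^2) between two feature vectors."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(measured, predicted, epsilon=0.1):
    """Errors inside the epsilon tube cost nothing; outside it they grow linearly."""
    return max(0.0, abs(measured - predicted) - epsilon)


n_features = 1            # a single feature such as the day index (assumption)
gamma = 1.0 / n_features  # scikit-learn's meaning of gamma="auto"

same_day = rbf_kernel([10.0], [10.0], gamma)
next_day = rbf_kernel([10.0], [11.0], gamma)
inside_tube = epsilon_insensitive_loss(10.0, 10.05)
outside_tube = epsilon_insensitive_loss(10.0, 10.5)
print(same_day, next_day, inside_tube, outside_tube)
```

With gamma = 1 and an unscaled day index, the kernel value already drops to about 0.37 one day away, so each prediction would be dominated by the nearest training days.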

Accuracy stands as the paramount criterion when evaluating the success of prediction methods. Consequently, common error metrics play a pivotal role in assessing the outcomes of prediction models and facilitating comparisons among them. In this study, a range of fundamental metrics was utilized to gauge the performance of the prediction models applied. These metrics include Mean Bias Error (MBE), Root Mean Squared Error (RMSE), Relative Root Mean Squared Error (rRMSE), Mean Absolute Bias Error (MABE), Mean Absolute Percentage Error (MAPE), t-stat, and Coefficient of Determination (R2).

Table II provides a comprehensive breakdown of these statistical metrics, complete with their respective equations and explanations. Within Table II, yi and xi represent predicted and measured daily global solar radiation, respectively. x̄i corresponds to the mean of the measured daily global solar radiation, while n signifies the total number of observations. These metrics collectively establish a robust framework for assessing and comparing the effectiveness of the prediction models employed in the study.

TABLE II.

A concise overview of the statistical metrics employed in the research.

| Metric | Equation | Description |
| --- | --- | --- |
| MBE | $\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i\right)$ | MBE stands as a crucial metric for assessing the sustained effectiveness of prediction models. A lower MBE value signifies enhanced performance in the model, with zero representing the optimal scenario.17 |
| RMSE | $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i\right)^2}$ | RMSE offers insights into the immediate performance of prediction models. It is always a positive value, and the aim is for it to approach zero.42 |
| rRMSE | $\mathrm{rRMSE} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i\right)^2}}{\bar{x}_i} \times 100$ | rRMSE is derived from RMSE and the mean value of the measured data. A low rRMSE value indicates improved performance in the employed prediction model.43 The evaluation criteria for the prediction model's success were defined as follows. Excellent: when the rRMSE is less than 10%. Good: when the rRMSE falls between 10% and 20%. Fair: when the rRMSE ranges from 20% to 30%. Poor: when the rRMSE exceeds 30%.44,45 |
| MABE | $\mathrm{MABE} = \frac{1}{n}\sum_{i=1}^{n}\left\lvert y_i - x_i\right\rvert$ | MABE represents the absolute value of the bias error and serves as an indicator of the quality of a correlation. Ideally, MABE should approach zero. MABE offers insights into the long-term effectiveness of prediction models.46,47 |
| MAPE | $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left\lvert\frac{x_i - y_i}{x_i}\right\rvert \times 100$ | MAPE computes the percentage by which the average absolute prediction errors differ from the absolute values of the actual data. A reduced MAPE value signifies enhanced model performance.42,48 The assessment of the prediction model’s success is defined as follows. High prediction accuracy: when MAPE is equal to or less than 10%. Good prediction: when MAPE falls between 10% and 20%. Reasonable prediction: when MAPE ranges from 20% to 50%. Inaccurate prediction: when MAPE exceeds 50%.49 |
| t-stat | $t\text{-stat} = \sqrt{\frac{(n-1)\,\mathrm{MBE}^2}{\mathrm{RMSE}^2 - \mathrm{MBE}^2}}$ | The t-statistic is employed to determine whether a model’s predictions hold statistical significance. A smaller t-statistic value indicates better performance of the prediction model. In this approach, a t-critic value is determined using statistical tables.45 |
| R2 | $R^2 = 1 - \frac{\sum_{i=1}^{n}\left(x_i - y_i\right)^2}{\sum_{i=1}^{n}\left(x_i - \bar{x}_i\right)^2}$ | This method offers insights into the predictive capability of a model for a given set of measured data, with its value falling within the range of 0–1. When the R2 value approaches 1, it signifies the superior performance of the model.47 |
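The metrics of Table II can be computed directly from paired measured and predicted values. The following is a minimal standard-library sketch, assuming the standard forms of the equations; the function names, the handling of values that fall exactly on a category boundary, and the sample numbers are illustrative assumptions rather than the authors' code.

```python
import math


def evaluate(measured, predicted):
    """Compute the seven metrics of Table II for measured (x) and predicted (y) values."""
    n = len(measured)
    errors = [y - x for x, y in zip(measured, predicted)]  # y_i - x_i
    mean_measured = sum(measured) / n
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mabe = sum(abs(e) for e in errors) / n
    # MAPE divides by each measured value, so zero measurements must be excluded beforehand.
    mape = sum(abs((x - y) / x) for x, y in zip(measured, predicted)) / n * 100
    spread = rmse ** 2 - mbe ** 2  # variance of the errors
    t_stat = math.sqrt((n - 1) * mbe ** 2 / spread) if spread > 0 else float("nan")
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((x - mean_measured) ** 2 for x in measured)
    return {
        "MBE": mbe, "RMSE": rmse, "rRMSE": rmse / mean_measured * 100,
        "MABE": mabe, "MAPE": mape, "t-stat": t_stat, "R2": 1 - ss_res / ss_tot,
    }


def rrmse_category(rrmse):
    """rRMSE bands from Table II; assigning exact boundary values is an assumption."""
    if rrmse < 10:
        return "excellent"
    if rrmse <= 20:
        return "good"
    if rrmse <= 30:
        return "fair"
    return "poor"


def mape_category(mape):
    """MAPE bands from Table II; assigning exact boundary values is an assumption."""
    if mape <= 10:
        return "high prediction accuracy"
    if mape <= 20:
        return "good prediction"
    if mape <= 50:
        return "reasonable prediction"
    return "inaccurate prediction"


# Synthetic values in MJ/m2 (not data from the study).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = evaluate(measured, predicted)
print(metrics, rrmse_category(metrics["rRMSE"]), mape_category(metrics["MAPE"]))
```

For this synthetic example the errors are 2, −2, 3, and −1 MJ/m2, giving MBE = 0.5, MABE = 2.0, MAPE = 10.625%, and R2 = 0.964.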

This paper evaluates the predictability of daily solar radiation on a horizontal surface in six cities across Pakistan using eight machine learning algorithms. To gauge the effectiveness of these algorithms, the study examines seven statistical metrics commonly employed in the literature. Table III lists the metric values computed for each city and algorithm. This section compares and discusses all cities and algorithms, with Table III serving as the reference point.

TABLE III.

City-wise algorithm performance comparison.

Cities  Metric  LR  ANN  SVM  k-NN  DL  RFR  GBR  SVR
Karachi  MBE (MJ/m2)  0.03  0.89  0.01  0.01  0.65  1.7  0.09
  RMSE (MJ/m2)  4.92  3.25  3.82  2.29  3.02  1.07  2.59  0.09
  rRMSE (%)  26  17.21  20.22  12.11  16.47  5.69  14  0.5
  MABE (MJ/m2)  3.96  2.14  2.33  1.44  2.26  0.65  1.7  0.09
  MAPE (%)  26.82  16.23  19.49  9.98  14.30  4.49  10.54  0.57
  t-stat  −5.13  0.20  4.47  0.01  0.04  −0.25  −0.56  −0.07
  R2  0.05  0.58  0.43  0.79  0.53  0.95  0.65  0.99
Lahore  MBE (MJ/m2)  −0.03  0.822  0.46  0.85  2.39  0.09
  RMSE (MJ/m2)  6.11  3.86  3.96  2.73  4.11  1.29  3.58  0.09
  rRMSE (%)  38.08  24.06  24.7  17  28.02  8.07  24.40  0.61
  MABE (MJ/m2)  5.06  2.96  2.87  1.86  3.10  0.85  2.39  0.09
  MAPE (%)  51.06  29.62  31.66  17.92  39.88  7.99  31.56  0.79
  t-stat  4.61  −0.19  3.96  0.01  0.97  −0.02  −1.30  −0.03
  R2  0.02  0.60  0.58  0.80  0.53  0.95  0.64  0.99
Islamabad  MBE (MJ/m2)  −0.02  0.84  0.47  1.08  2.66  0.09
  RMSE (MJ/m2)  6.72  4.2  4.36  3.41  4.45  1.57  3.80  0.09
  rRMSE (%)  39.88  24.933  25.91  20.24  28.77  9.36  24.58  0.59
  MABE (MJ/m2)  5.77  3.08  3.01  2.40  3.03  1.08  2.66  0.09
  MAPE (%)  63.23  36.56  39.45  28.14  38.63  12.87  30.12  0.81
  t-stat  −0.11  3.71  0.91  0.13  −0.69  −0.4
  R2  0.007  0.61  0.58  0.74  0.46  0.94  0.61  0.99
Quetta  MBE (MJ/m2)  0.03  0.75  −0.37  0.69  2.10  0.09
  RMSE (MJ/m2)  6.15  3.16  3.38  2.35  2.95  1.17  3.58  0.09
  rRMSE (%)  28.47  14.63  15.69  10.91  14.19  5.44  17.22  0.45
  MABE (MJ/m2)  5.26  2.12  2.07  1.46  2.18  0.69  2.10  0.09
  MAPE (%)  31.15  13.95  14.52  9.5  12.41  4.45  11.06  0.51
  t-stat  0.21  4.26  −0.02  −1.10  0.20  1.91  0.61
  R2  0.73  0.69  0.85  0.76  0.96  0.64  0.99
Peshawar  MBE (MJ/m2)  −0.02  0.89  0.38  0.95  2.19  0.09
  RMSE (MJ/m2)  6.75  3.91  4.08  3.09  4.11  1.42  2.97  0.09
  rRMSE (%)  41.89  24.28  25.30  19.15  27.93  8.82  20.19  0.61
  MABE (MJ/m2)  5.75  2.86  2.79  2.15  2.94  0.95  2.19  0.09
  MAPE (%)  59.55  31.28  34.31  22.96  30.31  10.11  21.50  0.83
  t-stat  −0.12  4.17  0.81  0.08  −0.94  −0.75
  R2  0.006  0.66  0.63  0.79  0.59  0.95  0.78  0.99
Multan  MBE (MJ/m2)  0.005  0.63  −0.14  0.73  2.10  0.09
  RMSE (MJ/m2)  5.52  2.82  2.98  2.27  2.98  1.09  2.96  0.09
  rRMSE (%)  31.23  15.97  16.85  12.83  17.65  6.18  17.57  0.55
  MABE (MJ/m2)  4.76  2.05  2.07  1.57  2.32  0.73  2.10  0.09
  MAPE (%)  36.03  15.55  16.53  11.25  16.23  5.05  14.29  0.64
  t-stat  0.03  4.06  −0.42  −0.03  0.91  −0.22
  R2  0.007  0.74  0.71  0.83  0.71  0.96  0.71  0.99

Figure 4 displays the daily solar radiation data and the magnitude of errors for Karachi, including measurements and the predictions made by each algorithm. Examining Fig. 4 together with Table III leads to several observations. In terms of rRMSE, which normalizes RMSE by the mean of the measured data, the SVR algorithm performed best with the lowest value of 0.5%, indicating relatively low prediction errors compared to the other algorithms. For MBE, the LR model achieved the smallest value, close to zero, indicating only a slight average bias in its predictions. SVR also performed best in terms of RMSE, with the smallest value of 0.0991, suggesting that it provides the most accurate predictions. The R-squared (R2) metric, which gauges how much of the variance in the target variable the model explains, showed that SVR outperforms all other algorithms with an exceptionally high value of 0.9996, indicating an almost perfect fit. For MAPE, SVR again performed best with a value of 0.5796%, meaning that its predictions are, on average, only 0.5796% off the actual values. Overall, SVR is the top performer for Karachi, given its extremely high R2 and low RMSE.

FIG. 4.

Actual data, predicted data, and error magnitude for Karachi city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 5 presents the daily data and prediction errors for Lahore for each machine learning algorithm, with the corresponding metrics listed in Table III. In terms of the relative RMSE (rRMSE), SVR records the lowest value, 0.61%, indicating much smaller errors than the other algorithms. The R-squared (R2) metric, which gauges how much variance in the target variable the model explains, reinforces this result: SVR achieves an R2 of 0.9997, an almost perfect fit that reflects its ability to capture the underlying patterns in Lahore’s data. Taken together, Fig. 5 and Table III show that Support Vector Regression (SVR) outperforms the other models across the reported metrics, combining high accuracy with minimal bias, and is therefore the preferred choice for Lahore.

FIG. 5.

Actual data, predicted data, and error magnitude for Lahore city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 6 shows the daily data and prediction errors for Islamabad, with the key metrics given in Table III. SVR is again the top-performing algorithm. Its relative Root Mean Square Error (rRMSE) is the lowest at 0.59%, and its R2 of 0.9998 indicates an almost perfect fit to Islamabad’s data. SVR also has the lowest Mean Absolute Percentage Error (MAPE), 0.8161%, meaning that its predictions deviate from the actual values by only 0.8161% on average. Based on Fig. 6 and Table III, SVR delivers high accuracy and minimal bias and is the preferred choice for precise and reliable predictions in Islamabad.

FIG. 6.

Actual data, predicted data, and error magnitude for Islamabad city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 7 shows the daily data and prediction errors for Quetta, with the key metrics given in Table III. SVR records the lowest rRMSE, just 0.46%, and an R2 of 0.9997, indicating an almost perfect fit to Quetta’s data. Its Mean Absolute Percentage Error (MAPE) of 0.5141% is also lower than that of every other algorithm. Fig. 7 and Table III therefore establish SVR as the preferred choice for Quetta, combining high accuracy with minimal bias.

FIG. 7.

Actual data, predicted data, and error magnitude for Quetta city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 8 shows the daily data and prediction errors for Peshawar. Despite having little bias, linear regression (LR) lacks accuracy, as shown by its large RMSE and low R-squared (R2) value. Artificial neural networks (ANNs) somewhat underestimate the actual values but are more accurate. The Support Vector Machine (SVM) shows some bias and delivers fair accuracy. k-Nearest Neighbors (k-NN) gives nearly unbiased predictions with good accuracy. Deep Learning (DL) displays average accuracy with a somewhat larger error margin. Random Forest Regression (RFR) stands out for its low bias, high accuracy, and small errors. Gradient Boosting Regression (GBR) shows more bias but maintains respectable accuracy. Support Vector Regression (SVR) is the top performer, providing highly accurate, nearly unbiased predictions with exceptionally low errors: an RMSE of 0.0995 MJ/m2, an R2 of 0.9998, and a MAPE of 0.8319%. These results position SVR as the preferred choice for predictive modeling in Peshawar.
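The observation that a model can have little bias and still lack accuracy can be illustrated with a short sketch. The Python code below builds a synthetic seasonal series and evaluates a deliberately naive predictor that always returns the annual mean. The series, the function names, and the sign convention for MBE (predicted minus actual) are assumptions made for illustration; this is not the study’s data, and it does not claim to reproduce the behavior of the LR model fitted in the study.

```python
import math

def mbe(actual, predicted):
    """Mean bias error (assumed convention: predicted minus actual)."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r2(actual, predicted):
    """Coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Synthetic seasonal cycle in MJ/m2 (hypothetical values, not the study's data)
days = range(1, 366)
actual = [18.0 + 7.0 * math.sin(2 * math.pi * (d - 80) / 365) for d in days]

# A naive "model" that always predicts the annual mean
flat = [sum(actual) / len(actual)] * len(actual)

print(f"MBE  = {mbe(actual, flat):.3f} MJ/m2")
print(f"RMSE = {rmse(actual, flat):.3f} MJ/m2")
print(f"R2   = {r2(actual, flat):.3f}")
```

The over- and under-predictions of the flat predictor cancel, so its MBE is zero to within rounding, yet its RMSE equals the spread of the data and its R2 is zero. This is why MBE is best read alongside RMSE and R2 rather than on its own.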

FIG. 8.

Actual data, predicted data, and error magnitude for Peshawar city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 9 presents the daily data and prediction errors for Multan, and Table III lists the corresponding metrics. Among the algorithms compared, Support Vector Regression (SVR) is the top performer, providing highly accurate and nearly unbiased predictions with an R-squared (R2) value of 0.9997 and a Root Mean Square Error (RMSE) of 0.0989 MJ/m2. These values show SVR’s capacity to capture the underlying data patterns with little error, which makes it the best option for predictive modeling in Multan, as also seen in Fig. 9.

FIG. 9.

Actual data, predicted data, and error magnitude for Multan city: (a) LR, (b) ANN, (c) SVM, (d) k-NN, (e) DL, (f) RFR, (g) GBR, and (h) SVR for the year 2022.

Figure 10 shows scatter plots of the daily global solar radiation predictions generated by the best-fit machine learning algorithms for all six cities. In the scatter plot for Islamabad, the errors exhibit considerable dispersion. This wide variation in prediction errors can be attributed to the high volatility of the city’s solar radiation data, which is accentuated by Islamabad’s location within the transitional climatic zone, as illustrated in Fig. 1.

FIG. 10.

City-wise scatterplots with best-fit ML algorithms: (a) Karachi, (b) Lahore, (c) Islamabad, (d) Quetta, (e) Peshawar, and (f) Multan.

This paper explores the effectiveness of eight machine learning algorithms (LR, ANN, SVM, k-NN, DL, RFR, GBR, and SVR) in predicting daily global solar radiation with time as the underlying variable. The analysis covers data from six Pakistani cities (Karachi, Lahore, Islamabad, Quetta, Peshawar, and Multan). The algorithms are assessed using seven metrics: Mean Bias Error (MBE), Root Mean Square Error (RMSE), relative RMSE (rRMSE), Mean Absolute Bias Error (MABE), Mean Absolute Percentage Error (MAPE), t-statistic (t-stat), and R-squared (R2). The following conclusions can be drawn:

  1. The predictive outcomes in Table III and the corresponding figures show that, among the eight models considered, four track the actual data closely: k-Nearest Neighbors (k-NN), Random Forest Regression (RFR), Gradient Boosting Regression (GBR), and Support Vector Regression (SVR).

  2. In terms of the R2 metric, these four algorithms consistently delivered favorable outcomes across all cities, with R2 values ranging from 61% to as high as 99%, depending on the city. This consistency underscores their reliability and effectiveness in predicting daily global solar radiation across diverse geographical locations.

  3. In terms of the relative Root Mean Square Error (rRMSE), the four closely fitting algorithms also produced consistently good results across all cities, with values ranging from as low as 0.45% to a maximum of 24.58%, depending on the city and algorithm.

  4. Compared with the other algorithms, SVR produces consistently low error magnitudes.

These findings underscore the significance of choosing the right machine learning algorithm for solar radiation prediction tasks. In particular, SVR emerges as a standout choice, offering precise and reliable predictions for daily global solar radiation across the studied cities. This study contributes valuable insights for applications in renewable energy forecasting and can aid decision-makers in optimizing solar energy utilization in these regions.
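To illustrate what prediction with time as the only input can look like for one of the closely fitting model families, the sketch below implements a minimal k-nearest-neighbors regressor in plain Python. It is not the implementation used in this study: the synthetic seasonal curve, the noise level, the choice of k = 5, and all function names are assumptions made for illustration.

```python
import math
import random

def circular_day_distance(day_a, day_b, period=365):
    """Distance in days between two days of the year, wrapping at year end."""
    diff = abs(day_a - day_b) % period
    return min(diff, period - diff)

def knn_predict(train_days, train_values, query_day, k=5):
    """Predict by averaging the k training values nearest in time to query_day."""
    ranked = sorted(
        zip(train_days, train_values),
        key=lambda pair: circular_day_distance(pair[0], query_day),
    )
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)

def seasonal_cycle(day):
    """Hypothetical noise-free seasonal radiation curve in MJ/m2."""
    return 18.0 + 7.0 * math.sin(2 * math.pi * (day - 80) / 365)

# Synthetic data: odd days act as noisy "measurements", even days are held out.
rng = random.Random(0)
train_days = list(range(1, 366, 2))
train_values = [seasonal_cycle(d) + rng.gauss(0.0, 0.5) for d in train_days]
test_days = list(range(2, 366, 2))

predictions = [knn_predict(train_days, train_values, d, k=5) for d in test_days]

# Compare held-out predictions with the noise-free curve
squared_errors = [(p - seasonal_cycle(d)) ** 2 for p, d in zip(predictions, test_days)]
test_rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
print(f"Held-out RMSE against the noise-free curve: {test_rmse:.3f} MJ/m2")
```

Averaging over neighboring days smooths measurement noise but also flattens sharp day-to-day changes, so the value of k trades noise reduction against responsiveness. In practice it would be tuned on validation data rather than fixed in advance.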

The authors acknowledge the support provided by the Department of Mechanical Engineering at NED University of Engineering and Technology to carry out the necessary research.

The authors have no conflicts to disclose.

All authors contributed equally to this work.

Talha Bin Nadeem: Conceptualization (equal); Data curation (equal); Investigation (equal); Methodology (equal); Project administration (equal); Supervision (equal); Validation (equal); Writing – original draft (equal). Syed Usama Ali: Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal). Muhammad Asif: Conceptualization (equal); Data curation (equal); Methodology (equal); Project administration (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal). Hari Kumar Suberi: Conceptualization (equal); Data curation (equal); Project administration (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Python was used in this study.
