The application of machine learning (ML) to power and energy systems (PES) is being researched at an astounding rate, resulting in a significant number of recent additions to the literature. As the infrastructure of electric power systems evolves, so does interest in deploying ML techniques to PES. However, despite growing interest, the limited number of reported real-world applications suggests that the gap between research and practice is yet to be fully bridged. To help highlight areas where this gap could be narrowed, this article discusses the challenges and opportunities in developing and adapting ML techniques for modern electric power systems, with a particular focus on power distribution systems. These systems play a crucial role in transforming the electric power sector and accommodating emerging distributed technologies to mitigate the impacts of climate change and accelerate the transition to a sustainable energy future. The objective of this article is not to provide an exhaustive overview of the state-of-the-art in the literature, but rather to make the topic accessible to readers with an engineering or computer science background and an interest in the field of ML for PES, thereby encouraging cross-disciplinary research in this rapidly developing field. To this end, the article discusses the ways in which ML can contribute to addressing the evolving operational challenges facing power distribution systems and identifies relevant application areas that exemplify the potential for ML to make near-term contributions. At the same time, key considerations for the practical implementation of ML in power distribution systems are discussed, along with suggestions for several potential future directions.

AI: Artificial intelligence
ANN: Artificial neural network
D-OPF: Distribution optimal power flow
DER: Distributed energy resource
DNN: Deep neural network
DSSE: Distribution system state estimation
EV: Electric vehicle
FLISR: Fault location, isolation and service restoration
GAN: Generative adversarial network
GP: Gaussian process
GNN: Graph neural network
k-NN: k-nearest neighbors
ML: Machine learning
OPF: Optimal power flow
PCA: Principal component analysis
PMU: Phasor measurement unit
PV: Photovoltaic
PES: Power and energy systems
SCADA: Supervisory control and data acquisition
SE: State estimation
SSL: Semi-supervised learning
SVM: Support vector machine
V2G: Vehicle-to-grid
WLS: Weighted least squares

Countries worldwide have set imperative targets for lowering greenhouse gas emissions to combat the worst impacts of climate change.1 As part of these climate measures, efforts to decarbonize the electric power sector rely heavily on deploying renewable energy sources—predominantly wind and solar—at scale.2 Reaching greenhouse gas emissions reduction targets also necessitates shifting transportation energy demand from fossil fuels to electricity through extensive vehicle electrification and supporting charging infrastructure, leading to a substantial increase in electricity demand.3 As the demand for electricity grows and the requirements for decarbonization persist, both the importance and the complexity of electric distribution systems are rising.

The integration of small-scale distributed energy resources (DERs), such as solar photovoltaics (PVs) and electric vehicles (EVs), into power distribution systems has increased dramatically in recent years. However, due to their intrinsic variability, uncertainty, and limited controllability, these new assets and technologies bring new challenges to the reliable and secure operations of electric distribution networks. Increasing numbers of DERs are exposing the limitations of existing analytical tools for operating and managing electric distribution networks, primarily because these tools were designed for traditional distribution networks wherein uncertainty arises only from the consumption profiles, and power is supplied predominantly from the transmission system to the distribution system. Meanwhile, the variability and uncertainty from weather-dependent renewable generation, along with limited sensing and visibility into distribution network operations, pose new challenges. These challenges will loom larger in future grids without advanced operation and planning methodologies to improve operational flexibility and guarantee network security, thereby maintaining stable voltages and frequencies.23 Artificial intelligence (AI) and, particularly, machine learning (ML) have attracted considerable research attention as promising solutions to help address these challenges. The implementation of ML solutions is envisioned to complement or potentially even replace traditional, long-established physics-based modeling approaches used in various aspects of power system analysis. In this context, ML applications cover nearly every area of interest, including generation, transmission, distribution, and consumption, and extend into coupled energy infrastructure systems, such as heating, gas, and transportation. Similarly, ML applications in power systems can cover a broad range of timescales, ranging from sub-second intervals for transient stability to decades for planning. 
Table I summarizes recent review papers on various power and energy system (PES) domains.

TABLE I.

A summary of the most recent review papers on AI/ML for electric power and energy systems in chronological order.

Ref. | Year | Topic | Power system domain (a) | AI domain
Jafari et al.4 | 2022 | Distribution automation system applications | G, D, C | Deep learning
Wang et al.5 | 2022 | Multi-energy and power systems: resilience | T, D | Various ML paradigms
Chen et al.6 | 2022 | Decision-making and control | G, T, D, C | Reinforcement learning
Zhang et al.7 | 2021 | Frequency analysis and control |  | Deep learning
Donti and Kolter8 | 2021 | Sustainable power and energy systems | G, T, D, C | Various ML paradigms
Aslam et al.9 | 2021 | Renewable energy and load forecasting | G, C | Deep learning
Stock et al.10 | 2021 | Distribution system operations |  | Various ML paradigms
Aminifar et al.11 | 2020 | Power electronics-interfaced systems: protection |  | Various ML paradigms
Farhoumandi et al.12 | 2020 | IoT-integrated smart grids | G, T, D, C | Various ML paradigms
Wu and Wang13 | 2020 | Microgrids | D, C | Deep learning, deep reinforcement learning
Yang and Wu14 | 2020 | Unit commitment |  | Supervised learning (main focus)
Cao et al.15 | 2020 | Multi-energy and power systems | G, T, D, C | Reinforcement learning
Alimi et al.16 | 2020 | Stability assessment and control |  | Classification (main focus)
Ibrahim et al.17 | 2020 | Smart grid applications: forecasting, demand-side management, FLISR, cyber-security | G, T, D, C | Various ML paradigms
Duchesne et al.18 | 2020 | Reliability management |  | Various ML paradigms
Ozcanli et al.19 | 2020 | Power system applications: forecasting, fault diagnosis, energy management, power quality disturbances detection | G, T, D, C | Deep learning
Zhang et al.20 | 2019 | Power system applications: energy management, demand response, electricity markets, operational control | G, T, D, C | Deep reinforcement learning
Cheng and Yu21 | 2019 | Smart energy and electric power systems | G, T, D, C | Various ML paradigms
Zhang et al.22 | 2018 | Smart grids | G, T, D, C | Deep learning, reinforcement learning
(a) Abbreviations used for different power system domains: G, generation; T, transmission; D, distribution; C, consumption.

A conventional notion of the ML process involves a learning algorithm using input data to find hidden patterns and structures in data, extract new insight from the data, or make predictions automatically. When a priori information (e.g., initial conditions) and knowledge (e.g., physical and mathematical principles) are incorporated into the learning process, what was previously purely data-driven (also commonly referred to as physics-agnostic or model-free) now becomes a hybrid methodology (also commonly referred to as physics-informed or scientific ML), more suitable for engineering applications (Fig. 1). A concrete example of this emerging learning methodology is the category of deep learning algorithms referred to as physics-informed neural networks.24 The latest survey by Huang and Wang25 provides a thorough review of the application of physics-informed neural networks in the domain of power systems, making it a valuable read for those interested in the topic.

FIG. 1.

ML alternatives (1) and (2) to physics-based models. Physics-informed models (2), unlike black-box models (1), need less data, which is reflected in the smaller box size pertaining to data sources.

In general, ML encompasses a wide variety of methods that can learn from (i) experimental data (in the case of PES, this is often data derived from computational simulations), (ii) observational data describing physical processes within the system (e.g., power system measurements), or (iii) both (observational and experimental/simulation data), to build predictive or explanatory models. As such, these methods may offer many advantages compared to traditional power system planning, operations, and control methodologies. Some examples include, but are not limited to: (i) reduced computational complexity for solving optimization-based problems prevalent in power systems; (ii) higher accuracy in linear modeling of nonlinear power flow equations and other linear function approximations, leading to improved solution accuracy and/or reduced computation time; (iii) the ability to process and analyze large volumes of measurements collected from heterogeneous data sources to derive valuable insight or predictions from such measured data; (iv) the capability to manage ill-posed problems; (v) more informed decision-making under growing uncertainty from DERs; (vi) intelligent support for fault diagnosis and monitoring in complex distribution networks; and (vii) learning hard-to-model functions, such as consumer behavior for demand response mechanisms that do not provide a closed system of equations. See Fig. 2 for a depiction of some potential application areas of ML in power distribution systems.

FIG. 2.

New power system operational paradigm shift with prominent ML applications in power distribution systems.

Consistent advances in AI have been fueled by the advent of massive amounts of data and the utilization of high-performance parallel computing using AI accelerator hardware like graphics processing units.26 More recent advances in ML are characterized by the development of new algorithms, particularly within deep neural networks (DNNs)27 and graph neural networks (GNNs).28 As AI/ML advances at an unprecedented rate, it will likely have a significant impact on how power systems are operated and planned. Despite active research devoted to applying state-of-the-art ML techniques to power system problems, there is often a lack of attention paid to evaluating the suitability of the technique for a specific problem, as discussed in Sec. VI. For instance, data-intensive methods like DNNs may not be well-suited for power system problems where large amounts of data are not readily available. Similarly, while GNNs show promising advantages, many obstacles remain to be overcome for their usage in power distribution systems.29 To address this gap, this article aims to provide insight into important topics concerning (i) ML for electric power systems with a particular focus on electric distribution networks, and (ii) challenges unique to this interdisciplinary research area. Unlike previous works that have either broadly covered these topics or targeted specific application areas or ML methodologies, this paper concentrates on the general ML efforts required for electric distribution systems, specifically the D and C domains from Table I. It does not aim to be a comprehensive survey of all plausible applications nor a thorough literature review, as that would be too ambitious. Instead, this paper discusses applications within the distribution system domain that can benefit from a broad range of ML concepts and techniques by identifying suitable application areas. 
The goal is to foster and stimulate interdisciplinary research between the two disciplines to guide future research and realize the potential of ML in power distribution systems.

The remainder of this paper is organized as follows: Secs. II and III provide background reading on power systems and ML, respectively, from basic concepts and theoretical fundamentals to state-of-the-art techniques. Section IV provides relevant examples from the distribution system domain in the context of each main ML paradigm. Section V overviews selected distribution system application areas where ML can realize greater potential but still needs to be explored. Finally, Sec. VI discusses open problems and recommendations on research directions, while Sec. VII concludes the work.

This section provides fundamental background reading on electric power systems, specifically focusing on power distribution systems. It is intended for readers with little familiarity with the topic and sets the stage for later discussions by considering various distribution system problems where ML can realize great potential. The section begins with an introduction to key concepts pertinent to power system operations and planning. It then proceeds to describe the basic power system modeling aspects. Additionally, it briefly touches on the advantages offered by ML in power system analysis, reserving more detailed discussions for Secs. IV and V. Finally, the section concludes with an overview of power system data.

An electric power grid is a complex system consisting of a transmission network that transmits electricity over long distances using high-voltage transmission lines and a distribution network that delivers electricity to consumers using low-voltage distribution lines. In traditional power systems, electricity flows unidirectionally from centralized, large-scale synchronous generators to loads. However, the increasing prevalence of DERs in modern power distribution systems has given rise to bi-directional power flows, resulting in a greater incidence of over-voltages, as illustrated in Fig. 2. These and other issues arising from the clean energy transition pose challenges to the operational reliability of power grids.

An essential requirement for reliable and secure power grid operations is a continuous balancing of electricity supply and demand, differences between which are manifested in changes in the grid frequency (preferably a constant 60 or 50 Hz depending on the country).30 Any imbalance leads to a change in frequency, which drops from its nominal value in the case of insufficient generation and increases with oversupply. Traditionally, generation and load balancing have been achieved by scheduling (as part of the unit commitment problem) and dispatching (as part of the economic dispatch problem) a controllable generation fleet, such as coal and natural gas power plants, one day ahead and varying their output in real-time to match the varying load (as part of automatic generation control). Unit commitment and economic dispatch comprise a production cost model formulated as a mixed-integer programming problem that minimizes the bulk power system's total operating costs while adhering to the transmission network's and other physical constraints. In contrast, the capacity expansion model—also an optimization problem but with different objectives—focuses on planning and policy aspects of bulk power systems. The so-called optimal power flow is at the heart of these and other optimization problems and is ubiquitous in power systems (see Sec. II C 2).

Security of supply, paramount to reliable and secure power grid operations, is becoming more challenging with a high share of variable renewable energy.23 These changes call for new services at all system levels, from ancillary services provided by renewable energy sources to optimal market frameworks for obtaining those services. Distribution utilities can only meet these new requirements by significantly improving current operational and planning practices, creating opportunities for active ML research in various distribution system contexts, as shown in Fig. 2.

The structure of the electric power system (known as the topology)—referring to the physical arrangement and connection of system components—can be conceptualized as a graph composed of nodes (buses) to which generators and loads are connected and edges (lines) connecting these nodes, as illustrated in Fig. 2. Depending on the problem at hand, the topology of the power system can be represented by a directed or undirected graph.31 This representation facilitates the mathematical derivation of the power system model used in all power system problems, from operations and control to planning. In general, the power system model comprises a set of equations describing the relationships between variables of interest within the timeframe under study while accounting for various components and their more-or-less simplified models, such as transformers, circuit breakers, loads, lines, and cables. Notably, the distribution network model differs from the transmission network model in that it has (i) a radial or weakly meshed network topology that frequently changes; (ii) high resistance-to-reactance ratios; and (iii) unbalanced phases due to the presence of many asymmetrical loads (single-phase and two-phase loads, primarily in North America). These network model differences imply that the distribution and transmission systems analysis also differs. For example, a direct implication of (ii) is that line resistances cannot be neglected, a simplification commonly used for traditional transmission system analysis.
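
The graph view of a feeder described above can be illustrated with a minimal sketch. The 5-bus feeder, its line resistances and reactances, and the function names below are hypothetical and chosen only for illustration; the sketch builds a branch-to-bus incidence matrix, checks whether the topology is radial, and computes the resistance-to-reactance ratio of each line.

```python
import numpy as np

# Hypothetical 5-bus radial feeder: (from_bus, to_bus, r_ohm, x_ohm)
lines = [
    (0, 1, 0.30, 0.15),
    (1, 2, 0.40, 0.20),
    (1, 3, 0.25, 0.10),
    (3, 4, 0.50, 0.25),
]
n_bus = 5


def incidence_matrix(lines, n_bus):
    """Branch-to-bus incidence matrix: +1 at the sending bus, -1 at the receiving bus."""
    A = np.zeros((len(lines), n_bus))
    for k, (i, j, _, _) in enumerate(lines):
        A[k, i] = 1.0
        A[k, j] = -1.0
    return A


def is_radial(lines, n_bus):
    """A connected graph with exactly n_bus - 1 edges is a tree, i.e., a radial network."""
    A = incidence_matrix(lines, n_bus)
    # The incidence matrix of a connected graph has rank n_bus - 1
    connected = np.linalg.matrix_rank(A) == n_bus - 1
    return bool(connected and len(lines) == n_bus - 1)


A = incidence_matrix(lines, n_bus)
rx_ratios = [r / x for (_, _, r, x) in lines]

print("radial:", is_radial(lines, n_bus))
print("R/X ratios:", rx_ratios)
```

Adding a tie line that closes a loop (for example between buses 2 and 4) makes `is_radial` return `False`, which corresponds to the weakly meshed case mentioned above.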

This section presents a brief overview of three fundamental problems in power system analysis—the power flow problem, optimal power flow (OPF), and state estimation (SE)—with a focus on challenges related to distribution system analysis. The discussion frames the challenges in the context of ML applications, thereby serving as an introduction to the use of ML in power system analysis. As such, this section is a good primer for readers interested in exploring the intersection of power systems and ML.

1. Power flow problem

The power flow problem, also known as the load flow problem, is instrumental for steady-state power system analysis. It entails determining the state of the system under normal operating conditions, including the voltage magnitudes and angles at all nodes within the network.32,33 Once the system state is known, any other relevant quantity such as the real and reactive power flow and losses on each line can be analytically calculated.32,33 This information is crucial for power system operators (hereafter referred to as operators for short) to ensure reliable and secure power system operations as it enables them to understand system behavior under different operating conditions, identify potential issues, and take corrective measures to maintain safe operating conditions. As a result, power flow analysis is a fundamental aspect of key power system operational tasks, such as OPF and SE, which will be further discussed below.

The power flow problem comprises a system of nonlinear equations represented in polar form as shown below:
$$
\begin{aligned}
P_i &= V_i \sum_{j \in \mathcal{N}} V_j \left( G_{ij} \cos(\theta_i - \theta_j) + B_{ij} \sin(\theta_i - \theta_j) \right), \quad \forall i \in \mathcal{N}, \\
Q_i &= V_i \sum_{j \in \mathcal{N}} V_j \left( G_{ij} \sin(\theta_i - \theta_j) - B_{ij} \cos(\theta_i - \theta_j) \right), \quad \forall i \in \mathcal{N},
\end{aligned}
\tag{1}
$$
where $\mathcal{N}$ is the set of all buses in the system. $P_i = P_i^G - P_i^L$ and $Q_i = Q_i^G - Q_i^L$ are the net active and reactive power injections at bus $i$, where the superscripts $G$ and $L$ denote the generation power injections and load power draws, respectively. $V_i$ and $\theta_i$ represent the magnitude and phase angle of the voltage at bus $i$. $G_{ij}$ and $B_{ij}$ are the conductance and susceptance of the line between buses $i$ and $j$, which correspond to the real and imaginary parts of the line admittance $Y_{ij}$, respectively. While (1) is commonly used, it is important to note that other forms of the power flow equations (e.g., rectangular or phasor form) may be more suitable for different applications.
In a more generic notation, the power flow equations can be expressed in a concise nonlinear algebraic equation form as follows:
$$
\mathbf{f}(\mathbf{x}, \mathbf{p}, \mathbf{A}) = \mathbf{0},
\tag{2}
$$
where x is a vector of the system unknowns, p is a vector of line parameters (e.g., resistances and reactances), and A is an incidence matrix that accounts for the network topology. The vector 0 represents the zero vector. As a closed-form solution for x does not exist, the nonlinear equation system (2) is commonly solved using iterative methods like Newton–Raphson or Gauss–Seidel.32,33 Additional information on this topic can be found in Refs. 32 and 33.
Using the above notation, (1) can be written compactly as
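$$
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} \Delta \mathbf{P}(\mathbf{x}) \\ \Delta \mathbf{Q}(\mathbf{x}) \end{bmatrix} = \mathbf{0}.
\tag{3}
$$
Here, the nonlinear function $\mathbf{f}(\cdot)$ maps the complex voltage phasor $\mathbf{x}$, comprising voltage magnitudes and phase angles, to the corresponding net power injections $\Delta \mathbf{P}$, $\Delta \mathbf{Q}$.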
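
To make the iterative solution of (1)-(3) concrete, the following is a minimal sketch of a Newton-Raphson power flow on a hypothetical 3-bus feeder (bus 0 is the slack bus; buses 1 and 2 are load buses). The network data, the load values, and the function names are assumptions made for illustration. For brevity, the Jacobian is approximated by finite differences rather than derived analytically, and $G_{ij}$, $B_{ij}$ are taken as the real and imaginary parts of the bus admittance matrix, as is standard in power flow solvers.

```python
import numpy as np

# Hypothetical 3-bus radial feeder in per unit: (from_bus, to_bus, r, x)
lines = [(0, 1, 0.02, 0.02), (1, 2, 0.03, 0.02)]
n_bus = 3

# Bus admittance matrix; G and B are its real and imaginary parts
Y = np.zeros((n_bus, n_bus), dtype=complex)
for i, j, r, x in lines:
    y = 1.0 / complex(r, x)
    Y[i, i] += y
    Y[j, j] += y
    Y[i, j] -= y
    Y[j, i] -= y
G, B = Y.real, Y.imag

pq = np.array([1, 2])             # load (PQ) buses; bus 0 is the slack bus
P_spec = np.array([-0.5, -0.3])   # specified net active injections (loads are negative)
Q_spec = np.array([-0.2, -0.1])   # specified net reactive injections


def injections(V, theta):
    """Right-hand side of Eq. (1): net P_i and Q_i for given magnitudes and angles."""
    d = theta[:, None] - theta[None, :]
    P = V * ((G * np.cos(d) + B * np.sin(d)) @ V)
    Q = V * ((G * np.sin(d) - B * np.cos(d)) @ V)
    return P, Q


def mismatch(x):
    """Eq. (3): stacked active and reactive power mismatches at the PQ buses."""
    theta = np.zeros(n_bus)       # slack angle fixed at 0
    V = np.ones(n_bus)            # slack magnitude fixed at 1 p.u.
    theta[pq] = x[:2]
    V[pq] = x[2:]
    P, Q = injections(V, theta)
    return np.concatenate([P[pq] - P_spec, Q[pq] - Q_spec])


def newton_raphson(f, x0, tol=1e-10, max_iter=20, h=1e-7):
    """Newton iteration with a forward-difference Jacobian (a simplification)."""
    x = np.asarray(x0, dtype=float).copy()
    for it in range(max_iter):
        fx = f(x)
        if np.max(np.abs(fx)) < tol:
            return x, it
        J = np.empty((fx.size, x.size))
        for k in range(x.size):
            dx = np.zeros_like(x)
            dx[k] = h
            J[:, k] = (f(x + dx) - fx) / h
        x = x - np.linalg.solve(J, fx)
    raise RuntimeError("Newton-Raphson did not converge")


x0 = np.array([0.0, 0.0, 1.0, 1.0])   # flat start: zero angles, unit magnitudes
x_sol, n_iter = newton_raphson(mismatch, x0)
theta_sol, V_sol = x_sol[:2], x_sol[2:]
print("angles (rad):", theta_sol, "magnitudes (p.u.):", V_sol, "iterations:", n_iter)
```

Once the state is known, the slack bus injection and line losses follow directly from `injections`, which mirrors the statement above that all other quantities can be computed analytically from the system state.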
Remark 1. In this paper, the power system model usually refers to an algebraic (static) model formulation akin to (2), which is adequate for describing steady-state or quasi-steady-state operations. Formally, such a static physical model can be expressed using ML terminology as
$$
\mathbf{y} = \mathbf{A}\mathbf{x} + \boldsymbol{\varepsilon},
\tag{4}
$$
where x and y are input and output vectors, ε corresponds to modeling error or noise, and A is a model matrix, which can be an incidence matrix from (2).
When ML is applied to (4), as depicted in Fig. 1, the following data-driven—often referred to as “black-box”—surrogate is obtained:
$$
\mathbf{y} = F(\mathbf{x}) + \boldsymbol{\varepsilon}.
\tag{5}
$$

To avoid physical inconsistencies or implausible outcomes resulting from imperfect data or noisy measurements that can negatively impact the performance of ML models, it is necessary to move beyond purely data-driven models like (5).34 This is where physics-informed models (Fig. 1) come into play and offer a promising research direction to pursue; see Sec. VI A 3 for a detailed discussion.
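
One common way to move beyond a purely data-driven model is to add a penalty on the violation of the governing equations to the training loss. The sketch below illustrates this idea for the power flow equations (1) on a hypothetical 2-bus system; the network data, the function names, and the `weight` parameter are assumptions for illustration, and the sketch is not the specific formulation of any cited reference.

```python
import numpy as np

# Hypothetical 2-bus system in per unit: one line between bus 0 and bus 1
y_line = 1.0 / complex(0.02, 0.02)
Y = np.array([[y_line, -y_line], [-y_line, y_line]])
G, B = Y.real, Y.imag


def injections(V, theta):
    """Net active and reactive injections from Eq. (1)."""
    d = theta[:, None] - theta[None, :]
    P = V * ((G * np.cos(d) + B * np.sin(d)) @ V)
    Q = V * ((G * np.sin(d) - B * np.cos(d)) @ V)
    return P, Q


def physics_informed_loss(V_pred, theta_pred, V_meas, P_spec, Q_spec, weight=1.0):
    """Data misfit plus a weighted penalty on the power flow residual."""
    data_term = np.mean((V_pred - V_meas) ** 2)
    P, Q = injections(V_pred, theta_pred)
    physics_term = np.mean((P - P_spec) ** 2 + (Q - Q_spec) ** 2)
    return data_term + weight * physics_term, data_term, physics_term


# A physically consistent operating point and the injections it implies
V_true = np.array([1.0, 0.97])
theta_true = np.array([0.0, -0.01])
P_spec, Q_spec = injections(V_true, theta_true)

# A prediction that matches the measured magnitudes but has a wrong angle
theta_bad = np.array([0.0, -0.05])
total_ok, data_ok, phys_ok = physics_informed_loss(V_true, theta_true, V_true, P_spec, Q_spec)
total_bad, data_bad, phys_bad = physics_informed_loss(V_true, theta_bad, V_true, P_spec, Q_spec)
print("consistent prediction:", total_ok)
print("inconsistent prediction:", total_bad)
```

In this example, the second prediction fits the voltage magnitude data perfectly, so a purely data-driven loss cannot distinguish it from the true state; the physics term is what flags it as implausible.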

2. Optimal power flow

OPF is routinely used by operators for various power system problems, from planning to operations. Generally, OPF is formulated as a constrained optimization problem
$$
\begin{aligned}
\min_{\mathbf{x}, \mathbf{u}} \quad & c(\mathbf{x}, \mathbf{u}), \\
\text{s.t.} \quad & g_i(\mathbf{x}, \mathbf{u}) = 0 \quad (i = 1, \ldots, n), \\
& h_j(\mathbf{x}, \mathbf{u}) \le 0 \quad (j = 1, \ldots, m),
\end{aligned}
\tag{6}
$$
where $c(\cdot)$ is the objective function, $\mathbf{x}$ is the vector of decision variables representing the state of the system, such as generator outputs and voltage magnitudes and angles, $\mathbf{u}$ is the vector of control variables representing the controllable elements of the system, such as generator setpoints, and $g(\cdot)$ and $h(\cdot)$ are either linear or nonlinear equality and inequality constraints, respectively, pertaining to the physical and operational constraint space. To illustrate, the equalities correspond to the $2|\mathcal{N}|$ power balance equations (1), whereas the inequalities comprise lower and upper bounds on critical system quantities like line flows and voltage magnitudes and angles:
$$
\begin{aligned}
|S_{ij}|^2 &\le (S_{ij}^{\max})^2, && \forall (i, j) \in \mathcal{L}, \\
V_i^{\min} &\le |V_i| \le V_i^{\max}, && \forall i \in \mathcal{N}, \\
\theta_i^{\min} &\le \theta_i \le \theta_i^{\max}, && \forall i \in \mathcal{N}.
\end{aligned}
$$
Here, $\mathcal{L}$ is the set of all transmission lines in the system, $S_{ij}$ and $S_{ij}^{\max}$ are the complex power flow and the maximum line rating for the line $(i, j)$, $\theta_i$ is the phase angle of the voltage phasor at bus $i$, and $V_i^{\min}$ and $V_i^{\max}$ are the lower and upper bounds on voltage magnitude at bus $i$, respectively. Notably, the objective of OPF is problem-dependent and may involve minimizing costs, maximizing system stability, maximizing the utilization of renewable energy sources, and so on. This ultimately determines the definition of $c(\cdot)$ in (6).
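
As a minimal illustration of the structure of (6), consider its simplest special case: economic dispatch with quadratic generation costs, a single lossless power balance equality, and generator output limits as the only inequalities (all network constraints are dropped). The sketch below solves this case by the classical lambda-iteration method, implemented here as a bisection on the marginal cost. The cost coefficients, limits, and demand are hypothetical values chosen for illustration.

```python
def economic_dispatch(a, b, p_min, p_max, demand, tol=1e-9):
    """Minimize sum(a_i * p_i**2 + b_i * p_i) subject to sum(p_i) = demand
    and p_min_i <= p_i <= p_max_i, by bisection on the marginal cost lambda."""
    if not sum(p_min) <= demand <= sum(p_max):
        raise ValueError("demand outside the feasible generation range")

    def outputs(lam):
        # Each unit runs where its marginal cost 2*a*p + b equals lambda, clipped to its limits
        return [min(max((lam - bi) / (2.0 * ai), lo), hi)
                for ai, bi, lo, hi in zip(a, b, p_min, p_max)]

    lam_lo = min(bi + 2.0 * ai * lo for ai, bi, lo in zip(a, b, p_min))
    lam_hi = max(bi + 2.0 * ai * hi for ai, bi, hi in zip(a, b, p_max))
    while lam_hi - lam_lo > tol:
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(outputs(lam)) < demand:
            lam_lo = lam
        else:
            lam_hi = lam
    lam = 0.5 * (lam_lo + lam_hi)
    return outputs(lam), lam


# Hypothetical two-generator system
a = [0.01, 0.02]   # quadratic cost coefficients
b = [10.0, 8.0]    # linear cost coefficients
p, lam = economic_dispatch(a, b, p_min=[0.0, 0.0], p_max=[100.0, 100.0], demand=150.0)
print("dispatch:", p, "marginal cost:", lam)

# Tightening the limit of the second unit makes an inequality constraint binding
p_lim, lam_lim = economic_dispatch(a, b, p_min=[0.0, 0.0], p_max=[100.0, 60.0], demand=150.0)
print("dispatch with binding limit:", p_lim, "marginal cost:", lam_lim)
```

A full OPF replaces the single balance equation with the nodal power flow equations (1) and adds the line flow and voltage limits listed above, which is what makes the general problem nonconvex.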

Remark 2. The OPF problem is nonlinear, nonconvex, and generally large-scale (with tens of thousands of variables involved), making it nondeterministic polynomial-time (NP-) hard to solve.35 As a result, it is common to use less computationally expensive solution approaches to this problem, including:

  • Linear approximations. This set of solution approaches is computationally efficient; however, it yields less accurate, sub-optimal solutions that may violate operational constraints.

  • Convex relaxations. Although computationally more intensive than linear approximations, the main advantage of convex relaxation approaches is that the relaxed problem can be solved to global optimality; when the relaxation is exact, this solution is also feasible and globally optimal for the original problem.

  • Black-box or hybrid solvers. This set of solution approaches is ML-based, may be data-prohibitive, and typically does not provide optimality guarantees. Within this group, we can distinguish between end-to-end learning, which is the most popular in the PES literature, and learning-to-optimize approaches for OPF, as discussed in Ref. 36. The end-to-end learning approach employs supervised learning to map an input, such as nodal net power injections, to an optimal solution, such as voltages or power generation, resulting in high-fidelity optimization proxies for OPF solutions. Conversely, the latter approach leverages ML techniques to accelerate the computational speed of existing optimization algorithms used to solve OPF; one such example is using ML to facilitate the warm starting of OPF solvers.36 As such, some of the learning-to-optimize approaches for OPF can be sub-categorized into physics-based ML models, which are also known as hybrid solvers.

For a detailed survey on approximations and relaxations of power flow equations in the context of OPF, particularly for the first two solution approaches listed above, the reader is referred to Ref. 37. More details on the third solution approach listed above can be found in Ref. 36.
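
To give a flavor of the first solution approach in a distribution setting, the sketch below implements the widely used linearized DistFlow (LinDistFlow) approximation for a radial feeder, in which line losses are neglected and the squared voltage magnitude drops linearly along each branch: $V_j^2 = V_i^2 - 2 (r_{ij} P_{ij} + x_{ij} Q_{ij})$. This particular approximation is chosen here for illustration and is not singled out in the text above; the feeder data and function name are hypothetical, and the code assumes that buses are numbered so that every parent bus has a smaller index than its children.

```python
import math

# Hypothetical radial feeder in per unit: child bus -> (parent bus, r, x)
branches = {1: (0, 0.02, 0.02), 2: (1, 0.03, 0.02)}
p_load = {1: 0.5, 2: 0.3}   # active power demand at each bus
q_load = {1: 0.2, 2: 0.1}   # reactive power demand at each bus


def lindistflow(branches, p_load, q_load, root=0, v_root=1.0):
    """Voltage magnitudes from the linearized DistFlow equations (losses neglected)."""
    buses = sorted(branches)
    # Backward sweep: the flow into bus j is its own load plus the flows into its children
    P = {j: p_load.get(j, 0.0) for j in buses}
    Q = {j: q_load.get(j, 0.0) for j in buses}
    for j in reversed(buses):
        parent = branches[j][0]
        if parent != root:
            P[parent] += P[j]
            Q[parent] += Q[j]
    # Forward sweep: squared voltage magnitude drops linearly along each branch
    v_sq = {root: v_root ** 2}
    for j in buses:
        i, r, x = branches[j]
        v_sq[j] = v_sq[i] - 2.0 * (r * P[j] + x * Q[j])
    return {bus: math.sqrt(v) for bus, v in v_sq.items()}


voltages = lindistflow(branches, p_load, q_load)
print(voltages)
```

Because the resistance term $r_{ij} P_{ij}$ is retained, this linearization remains meaningful for the high resistance-to-reactance ratios of distribution lines, although neglecting losses is the source of the reduced accuracy noted in the first bullet above.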

a. Challenges

OPF techniques have been effectively used to optimize bulk power system operations, such as economic dispatch of the generation fleet. However, applying the same OPF methods to distribution systems is challenging due to the differences outlined in Sec. II B. Moreover, the increasing presence of DERs in modern distribution systems exacerbates uncertainty, making distribution OPF (D-OPF) even more complex. As a result, operators need D-OPF solutions that can be executed in near real-time to accurately reflect system operating conditions and make timely decisions under uncertainties. Despite the significant number of published papers on the D-OPF problem, it remains an active area of research.37 

b. ML applications

The recent literature has seen many instances of ML being applied directly to OPF/D-OPF or its variants, including security-constrained OPF and probabilistic OPF that takes uncertainty factors into consideration. For a brief overview of the use of ML in OPF, refer to Ref. 38. One exciting and promising research direction involves physics-informed neural networks for solving OPF while imposing constraints on neural network optimization functions.39 The interested reader is referred to Ref. 40 and references therein for additional details.

3. State estimation

SE is a critical inference task for enabling system-wide situational awareness by determining the most-likely operating state of the power system, typically represented by complex voltage phasors comprising voltage magnitudes and phase angles. To achieve this, SE requires two key inputs: (i) telemetry data available in sufficient quantity to make the network observable, and (ii) an up-to-date network model that includes network parameters and topology. The primary objective of SE is to provide an accurate real-time estimate of the system state (denoted by $\mathbf{x}$) that is consistent with the available measurements, primarily current and voltage magnitudes in distribution systems.41 Specifically, the measurement vector $\mathbf{z}$ can be represented as $\mathbf{z} = \mathbf{h}(\mathbf{x}) + \mathbf{e}$, where $\mathbf{e}$ denotes the measurement error. The nonlinear function $\mathbf{h}(\cdot)$ relates system states to these measurements by using the previously defined (inverse) power flow equations.

In the classic formulation, a set of measurements ($\mathbf{z}$) is fed into a state estimator that produces state estimates at all nodes ($\hat{\mathbf{x}}$) by minimizing the residuals of all measurements as follows:
$$
\hat{\mathbf{x}}(\mathbf{z}) = \arg\min_{\mathbf{x}} \| \mathbf{z} - \mathbf{h}(\mathbf{x}) \|^{2}.
\tag{7}
$$

Remark 3. Measurement redundancy and accurate knowledge of the network model are prerequisites for a classic SE framework, such as weighted least squares (WLS).41,42 The WLS state estimator has been used for decades in transmission systems; however, it is unsuitable for distribution systems where the SE problem remains underdetermined due to the low observability of the network, as discussed in the Challenges section below. Furthermore, even if the network is observable, the WLS state estimator, or similar formulations, cannot guarantee a solution in all cases and may encounter convergence problems, especially for large-scale systems. Moreover, these state estimators are sensitive to measurement errors and bad data.41,42
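
The role of measurement redundancy and observability in WLS can be seen in a minimal sketch. It assumes a linear (or linearized) measurement model $\mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{e}$ in place of the nonlinear $\mathbf{h}(\cdot)$, in which case the WLS estimate has the closed form $\hat{\mathbf{x}} = (\mathbf{H}^\top \mathbf{W} \mathbf{H})^{-1} \mathbf{H}^\top \mathbf{W} \mathbf{z}$, with $\mathbf{W}$ the diagonal matrix of inverse measurement variances. The three-state example, the measurement matrix, and the noise values are hypothetical.

```python
import numpy as np


def wls_estimate(H, z, sigma):
    """WLS estimate for the linear model z = H x + e with standard deviations sigma."""
    H = np.asarray(H, dtype=float)
    if np.linalg.matrix_rank(H) < H.shape[1]:
        # Too few independent measurements: the state cannot be uniquely determined
        raise ValueError("network unobservable: measurement matrix is rank deficient")
    W = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)
    gain = H.T @ W @ H
    return np.linalg.solve(gain, H.T @ W @ np.asarray(z, dtype=float))


# Hypothetical example: three states, five measurements (two of them redundant)
x_true = np.array([1.00, 0.98, 0.97])
H = np.array([
    [1.0, 0.0, 0.0],    # direct measurement of state 0
    [0.0, 1.0, 0.0],    # direct measurement of state 1
    [0.0, 0.0, 1.0],    # direct measurement of state 2
    [1.0, -1.0, 0.0],   # difference between states 0 and 1
    [0.0, 1.0, -1.0],   # difference between states 1 and 2
])
sigma = np.array([0.01, 0.01, 0.01, 0.005, 0.005])
noise = np.array([0.001, -0.002, 0.001, 0.0005, -0.0005])   # fixed illustrative errors
z = H @ x_true + noise

x_hat = wls_estimate(H, z, sigma)
print("estimate:", x_hat)

# With only the first two measurements, the third state is unobservable
try:
    wls_estimate(H[:2], z[:2], sigma[:2])
    observable = True
except ValueError:
    observable = False
print("observable with two measurements:", observable)
```

For the nonlinear measurement function in (7), the same normal equations are solved repeatedly with $\mathbf{H}$ replaced by the Jacobian of $\mathbf{h}$ at the current iterate (a Gauss-Newton scheme), which is where the convergence issues mentioned in Remark 3 can arise.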

a. Challenges

The direct application of the classic SE framework in distribution systems is severely challenged for three reasons:41,42 (i) low observability due to scarce measurement devices and insufficient communications infrastructure; (ii) poor network models resulting from network model uncertainty, that is, imprecise network parameters and incomplete topology information; and (iii) unbalanced operations. Presently, distribution state estimators rely on so-called pseudo-measurements43 due to the difficulties in addressing (i) and (ii) without substantial upgrades to existing communications and metering infrastructures. Unfortunately, such infrastructure upgrades are economically impractical due to the vast scale of distribution systems. On the other hand, load-derived pseudo-measurements used to compensate for insufficient telemetry data can propagate significant errors through the distribution state estimator, leading to their unreliable performance.41,42 Moreover, (iii) renders the decoupled versions of state estimators used in transmission networks inadequate for distribution systems, requiring the development of three-phase state estimators. Hence, there is a growing interest in exploring techniques that enable accurate distribution system state estimation (DSSE) despite limited measurements and uncertainties in system models. Among these techniques, researchers have focused on ML44,45 and sparsity-based approaches to DSSE, which are discussed below.

b. ML applications

ML techniques have shown promising performance in various tasks directly or indirectly related to SE, such as bad data detection,46 topology identification, including transformer-to-customer mapping and phase connectivity identification,47 and generation and modeling of higher-fidelity pseudo-measurements.48–50 Furthermore, numerous ML-based state estimators have been proposed. They can be broadly classified into two categories, depending on whether they require knowledge of the distribution network model, which is difficult to obtain in practice (model-augmented), or not (model-agnostic or model-free). Notably, many ML-based state estimators proposed in the literature either assume a level of observability that exceeds what is usually available in real-world distribution systems or rely on sensor placement optimization strategies.51 Therefore, there are numerous opportunities for further research in this field.

Last, but not least, it is worth highlighting a specific methodological concept utilized within the realm of recommendation (or recommender) systems that is relevant to research in power system SE. For an introduction to recommender systems, see, for example, Ref. 52. While recommendation systems are commonly linked to platforms such as Netflix, YouTube, and Amazon, where ML is utilized to suggest items of potential interest to individual users based on data from other users, the mathematical concepts behind these systems can also be applied in the power system domain. Specifically, low-rank matrix completion has been used in the area of DSSE, particularly in the context of low-observable distribution networks. For further information, see Refs. 53 and 54, which provide examples of model-augmented and model-free approaches to voltage estimation, respectively. Apart from matrix completion techniques, the domain of sparsity-based DSSE has seen active research on other pertinent techniques, such as tensor completion.55–57 
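
As a rough illustration of the matrix completion idea, the sketch below fills a missing entry of a small data matrix under the assumption that the matrix is exactly rank 1. It uses alternating least squares on the observed entries, which is a simplification chosen for brevity and not the formulation of Refs. 53 and 54. The function name `complete_rank1` and the data are hypothetical, and every row and column is assumed to contain at least one observed entry.

```python
def complete_rank1(M, iters=200):
    """Fill missing entries (None) of M assuming M is approximately u v^T.

    Alternating least squares: fix u and fit v on the observed entries,
    then fix v and fit u. Rank 1 is an assumption made for brevity.
    """
    rows, cols = len(M), len(M[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        for j in range(cols):
            obs = [i for i in range(rows) if M[i][j] is not None]
            v[j] = sum(u[i] * M[i][j] for i in obs) / sum(u[i] ** 2 for i in obs)
        for i in range(rows):
            obs = [j for j in range(cols) if M[i][j] is not None]
            u[i] = sum(v[j] * M[i][j] for j in obs) / sum(v[j] ** 2 for j in obs)
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


# Hypothetical 3x2 data matrix with one unobserved entry (true value 6.0).
M = [[1.0, 2.0],
     [2.0, 4.0],
     [3.0, None]]
M_hat = complete_rank1(M)
```

The unobserved entry is inferred from the structure shared by the observed rows and columns, which is the same principle a recommender system uses to predict a rating a user has not yet given.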

4. General considerations on ML for power distribution system analysis

In power distribution system analysis, high-fidelity modeling is essential for obtaining trustworthy results. Traditional power distribution system modeling relies on complex physics-based models and rigorous mathematical principles. However, these models can lead to intractable optimization and simulation problems given the large scale of these systems. To address this scalability issue, power distribution system modeling is often simplified through assumptions and approximations that come in many forms, like linearization or convex relaxation. Such simplified models may be satisfactory for most operating points but can lead to sub-optimal solutions for others. This becomes more pronounced with increasing variability and uncertainty in system operating conditions due to the increased integration of DERs.

There are many examples where ML can offer an efficient alternative to conventional simplifications, especially where obtaining timely solutions is a primary concern, at the expense of solution accuracy. One such example is the use of ML to develop surrogate models—high-accuracy, computationally cheap approximations of a detailed analytical model—to accelerate analysis. Examples may include convex and nonconvex optimization problems (e.g., D-OPF, DSSE), Monte Carlo simulations, and approximation of nonlinear power flow equations embedded in optimization-based problems.

Computationally intensive Monte Carlo simulations are traditionally used to generate hundreds to thousands of scenarios pertaining to different system operating conditions (i.e., power generation, consumption, and voltage level) for various relevant studies, such as security assessment under different DER penetration levels. Identifying the most critical scenario (i.e., operating point) is highly challenging because the system state space can be vast.58 To that end, ML techniques can be leveraged to optimally search the state space, thus circumventing the need to run distribution power flow analysis for each scenario individually.

By reducing the time burden of various analyses, ML can allow for more comprehensive and/or more frequent optimization, thereby improving the efficiency and reliability of the system. Additionally, because time-intensive model solving often limits the above problems to offline applications, ML methods are promising tools for shifting model solving from offline to online settings, which are more suitable for time-varying operational conditions. It is important to note, however, that replacing the original detailed physics-based analysis with ML surrogates may forfeit certain performance guarantees, which limits the applicability of ML in high-regret applications, especially in the context of system stability.

We close this section with an overview of the various types of data sources used in power distribution systems, accompanied by a brief discussion of important considerations to keep in mind when working with these data. As depicted in Fig. 3, diverse data sources, such as supervisory control and data acquisition (SCADA), weather stations, meteorological databases, and metering devices, including micro-phasor measurement units (μ-PMUs) and smart meters, can be exploited for various ML applications. However, it is important to note that these data come with unique challenges, such as asynchronous measurements and limited access to real-world measurements.

FIG. 3.

Heterogeneous data sources in power distribution systems.

The inconsistency in the temporal resolution of distribution measurements originates from the spatiotemporal heterogeneity of sensors installed throughout the electric distribution network, such as smart meters, μ-PMUs, and field equipment monitors. These sensors have varying sampling rates and reporting times, as shown in Fig. 4, which results in measurements of different temporal resolutions. To effectively utilize these measurements in ML systems, it is necessary to standardize their resolutions. This requires downsampling the higher-resolution data to match the lower-resolution data or vice versa (the so-called upsampling). This can be done in many ways, from trivial linear interpolation methods to more advanced ML methods; we highlight recent work59 where low-resolution load profiles are upsampled into high-resolution load profiles using generative adversarial networks (GANs).
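
The two simplest resolution-matching operations mentioned above can be sketched as follows. Block averaging and linear interpolation are assumed here as the trivial baselines; the function names and sample values are hypothetical, and the GAN-based approach of Ref. 59 is not reproduced.

```python
def downsample_mean(values, factor):
    """Average consecutive blocks of `factor` samples (e.g., 1-min to 15-min).

    A trailing block with fewer than `factor` samples is discarded.
    """
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values) - factor + 1, factor)]


def upsample_linear(values, factor):
    """Insert `factor - 1` linearly interpolated points between samples."""
    out = []
    for a, b in zip(values, values[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(values[-1])
    return out


high_res = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical fast-sensor readings
low_res = [10.0, 20.0, 40.0]                # hypothetical slow-sensor readings

coarse = downsample_mean(high_res, 3)       # [2.0, 5.0]
fine = upsample_linear(low_res, 2)          # [10.0, 15.0, 20.0, 30.0, 40.0]
```

Downsampling discards information, whereas upsampling invents intermediate values; which direction is appropriate depends on the downstream ML task.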

FIG. 4.

A range of temporal resolutions of distribution measurements. Adapted from Ref. 60.

Another important consideration is the limited availability of measured data from real systems due to a lack of sensors in addition to privacy and security concerns. For these reasons, researchers are often limited to using simulated and open-source data to develop ML systems. However, simulations typically assume error-free measurements and no communication delays, which is unrealistic for real-world measurements. To address this issue, researchers may introduce noise that follows certain assumed distributions, which may not accurately reflect the noise patterns found in actual measurements. Additionally, open-source datasets may have undergone pre-processing and may lack certain features that could be relevant to the problem being studied.

This section serves as a brief introduction to the field of ML for audiences outside that community. It delves into the fundamental principles of ML and categorizes the different learning paradigms. By providing examples from a distribution systems perspective, it illustrates the concepts discussed and serves as a starting point for researchers to explore the potential of ML within the power engineering domain.

A learning task T can be described as the process of optimizing a performance measure L driven by a training experience D.26 This task typically involves three main components: the input data (also known as the training set or feature set), the model (or algorithm), and the output or prediction (also known as the label or target).26 To simplify the discussion, in the following, we will formally define the learning task assuming the supervised learning setting, which is covered in more detail in Sec. IV A. Not only has this type of learning been widely studied in the field of PES, but it also provides a sound basis for explaining the fundamental concepts of ML.

To begin, let us briefly apply the abstract ML terminology above to a concrete example within the distribution systems context. In power systems, a microgrid is defined as a small electricity network that can operate either connected to the larger grid or in an autonomous mode. Consider the example of microgrid formation to facilitate distribution network recovery during low-probability, high-impact events, such as extreme weather events. The learning task in this scenario can be formulated as segmenting the distribution network into autonomous (microgrid) zones. The performance measure to be improved in such a learning task may be the minimum time required for the distribution network's restoration or alternatively the number of critical customers facing power outages. The training experience may involve defining all possible microgrids within the observed network beforehand.

By a more formal definition, the learning task $T$ can be described as constructing a mapping $f: \mathcal{X} \to \mathcal{Y}$ using the training data $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$ to find a predictive function $h \in \mathcal{H}$, such that, for an unseen input pair $(x^{*}, y^{*})$, $h(x^{*})$ is a good model; here, $m$ is the number of training samples, $\mathcal{X}$ and $\mathcal{Y}$, respectively, denote the input and output space, and $\mathcal{H}$ is the hypothesis space of candidate models. In some cases, the function $h(\cdot)$ is represented explicitly as a parameterized functional form; in other cases, the function is implicit.26 In both cases, the function generally depends on parameters, and training corresponds to finding values for these parameters that optimize the performance metric $\mathcal{L}$.26 Mathematically, $T$ defined above can be formulated as
$$h(x^{*}) = \arg\min_{f \in \mathcal{H}} \; \mathcal{L} + \mathcal{R},$$
(8a)
where
$$\mathcal{L} = \frac{1}{m} \sum_{i=1}^{m} \ell\big(f(x^{(i)}), y^{(i)}\big).$$
(8b)

In (8), $\ell$ is a loss function used as a learning criterion for the optimization problem (8a), and $\mathcal{R}$ is the optional regularization term used to reduce the risk of overfitting.
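
The roles of the empirical loss in (8b) and the regularization term in (8a) can be seen in a minimal sketch for a one-parameter linear model. The hypothesis space f(x) = w x, the squared loss, and the regularizer R = λw² are assumptions made here for illustration, as are the function names and data; for this special case the minimizer has a closed form.

```python
def empirical_risk(w, xs, ys):
    """Average squared loss, as in (8b), for the linear model f(x) = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_ridge_1d(xs, ys, lam=0.0):
    """Minimize empirical risk + lam * w**2 over w (closed form for this model).

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + m*lam).
    """
    m = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + m * lam)


xs = [1.0, 2.0, 3.0]          # hypothetical inputs
ys = [2.0, 4.0, 6.0]          # hypothetical targets (y = 2x)

w_plain = fit_ridge_1d(xs, ys)            # no regularization: recovers w = 2
w_reg = fit_ridge_1d(xs, ys, lam=1.0)     # regularization shrinks w toward 0
```

The regularized solution trades a small increase in training loss for a smaller parameter, which is the mechanism by which the regularization term reduces the risk of overfitting.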

Remark 4. What does a good model mean in the definition above? In the field of ML for PES, the term good model is not clearly defined and is open to interpretation. Generally, a good model is one that generalizes well to previously unseen data. However, when it comes to PES, other criteria should be considered when defining a good model. For example, features of a model that address barriers to adoption in the real world should be considered. Nevertheless, the lack of a clear consensus on what constitutes a good model in PES research can lead to difficulties in the practical application of ML methods in this field. Therefore, researchers must establish clear standards and definitions for good models in PES in order to make the use of ML techniques successful in the real world. Developing realistic benchmark problems can help the community at large agree upon a consistent definition of a good model for each application of ML to PES.

1. Loss function

The accuracy of an ML algorithm is quantified using the loss function—a mathematical function that measures the difference between the predicted output (also known as the prediction or estimate) produced by the ML algorithm and the actual output (also known as the ground truth). The loss function is optimized during the training process using techniques such as gradient descent to enhance the model's predictive efficacy.61 

The definition of the loss function depends on the specific learning task.62 For regression tasks where the output is a real number ($\mathcal{Y} \subseteq \mathbb{R}$), the quadratic ($\ell_2$) loss function is often used and is defined as follows:
$$\ell(f(x), y) = (f(x) - y)^{2}.$$
(9)
Alternatively, the absolute value ($\ell_1$) loss function can be employed, which is defined as
$$\ell(f(x), y) = |f(x) - y|.$$
(10)
For classification tasks where the output belongs to a finite set of class labels, the cross-entropy loss function is a popular choice because of its probabilistic interpretation. This loss function measures the dissimilarity between the predicted and true probability distributions and is given by62
$$\ell(f(x), y) = -\sum_{k=1}^{K} y_k \log(f(x_k)),$$
(11)
where $K$ is the number of classes, $y_k$ is the one-hot encoded ground truth label for class $k$, and $f(x_k)$ is the predicted probability of the sample belonging to class $k$.
The zero-one loss (0–1 loss) is frequently used for binary classification tasks. It penalizes misclassifications by assigning a value of 1 and rewards correct classifications with a value of 0, expressed as
$$\ell(f(x), y) = \delta_{f(x) \neq y},$$
(12a)
where
$$\delta_{f(x) \neq y} = \begin{cases} 1, & \text{if } f(x) \neq y, \\ 0, & \text{otherwise}. \end{cases}$$
(12b)
Due to its nondifferentiability, 0–1 loss is generally used as an evaluation metric rather than as a loss function during the training process, unlike the cross-entropy loss. For more detailed information on this subject, please refer to a comprehensive survey of 31 loss functions found in Ref. 62.
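
The four loss functions in (9)–(12) translate directly into code. The sketch below is a plain per-sample implementation with hypothetical function names; it assumes that the predicted probability of the true class is strictly positive.

```python
import math


def squared_loss(pred, target):
    """Quadratic loss, Eq. (9)."""
    return (pred - target) ** 2


def absolute_loss(pred, target):
    """Absolute value loss, Eq. (10)."""
    return abs(pred - target)


def cross_entropy_loss(probs, one_hot):
    """Cross-entropy loss, Eq. (11).

    probs: predicted class probabilities; one_hot: one-hot encoded true label.
    Terms with y_k = 0 are skipped, so log(0) is never evaluated for them.
    """
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot) if y > 0)


def zero_one_loss(pred_label, true_label):
    """Zero-one loss, Eq. (12): 1 for a misclassification, 0 otherwise."""
    return 1 if pred_label != true_label else 0


# Hypothetical three-class prediction whose true class is the first one.
ce = cross_entropy_loss([0.7, 0.2, 0.1], [1, 0, 0])
```

Because the label is one-hot encoded, the cross-entropy reduces to the negative log-probability assigned to the true class, so confident correct predictions are penalized least.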

2. Data requirements

As discussed above, the learning task is driven by the training experience, or in other words, the data. However, raw data are rarely used in the learning process; they often require prior processing and formatting. These steps generally fall under the umbrella of data engineering and are vital to developing quality ML systems.

The first step (data pre-processing) may include some, if not all, of the following sub-steps:

  • Data cleaning, which is used to detect and correct or remove corrupt observations—missing data and outliers;

  • Feature selection, which involves selecting an “optimal” subset of independent variables (features) from among many less useful ones; this implies ignoring other features as irrelevant. These techniques are instrumental when there are many features and comparatively few data observations;

  • Feature scaling, which involves normalizing the range of data features. Examples include mean normalization, min–max normalization, and standardization.
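
As a small illustration of the feature scaling sub-step, the sketch below implements min–max normalization and standardization for a single feature. The function names and the voltage values are hypothetical, and the feature is assumed to be non-constant (otherwise the denominators vanish).

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]


voltages = [0.95, 1.00, 1.05]   # hypothetical per-unit voltage magnitudes
scaled = min_max_scale(voltages)
z_scores = standardize(voltages)
```

In practice, the scaling parameters (minimum, maximum, mean, standard deviation) are computed on the training data only and then reused unchanged on validation and test data.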

The second step (data formatting) is to ensure that the input data are well formatted. The most common is the tabular format (the so-called vector–matrix data representation) where each row and column of the table represents a particular example (also called instance) and feature (also called covariate), respectively. Let us consider a single distribution feeder and the corresponding past measurements at each node as an illustrative example of how this system can be represented in tabular form. In such a table, each row corresponds to a time instance when the measurement snapshots were made, and columns correspond to different measured variables at each node (e.g., voltages, injected complex power, consumed complex power).

Remark 5. In the context of ML applications within the distribution system domain, the data utilized are primarily structured in a tabular format. There are, however, certain exceptions to this, such as satellite imagery that can be used to estimate behind-the-meter solar generation or thermal images from distribution lines and transformers inspections that can be used for predictive maintenance.

3. Model performance evaluation

The evaluation of an ML model's performance should take place on an independent test dataset. In general, the full dataset is split into two (train/test) or three (train/validation/test) distinct sets:

  • Training set, typically the largest sample of the data, is used to learn a predictive ML model;

  • Validation set is a smaller data sample used to evaluate the model's performance during training and fine-tune its hyperparameters to prevent overfitting or underfitting to the training data;

  • Test set is an independent dataset that is not used in the training or hyperparameter tuning process. Instead, it is used to estimate the performance of the predictive ML model on new, unseen data.
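
A three-way split can be sketched as follows. The chronological (unshuffled) split and the 70/15/15 proportions are assumptions chosen here because distribution measurements are typically time series; random sampling is an equally valid choice for data without temporal ordering. The function name is hypothetical.

```python
def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split time-ordered samples into train/validation/test without shuffling."""
    n = len(samples)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]   # whatever remains is the test set
    return train, val, test


snapshots = list(range(20))  # stand-in for 20 time-ordered measurement snapshots
train, val, test = chronological_split(snapshots)
```

Keeping the test set strictly later in time than the training set prevents information from the future leaking into the model during training.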

Another technique for evaluating and selecting ML models is cross-validation.63 In this approach, the data are partitioned into subsamples (folds); each fold is held out in turn for evaluation while the model is trained on the remaining folds, and the error rate is estimated as the mean of the error rates obtained on the held-out folds.63 
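
A minimal sketch of k-fold cross-validation with contiguous folds is given below. The helper names, the trivial "predict the training mean" model, and the data are hypothetical and serve only to show how the held-out errors are averaged.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_error(xs, ys, k, fit, error):
    """Mean held-out error over k folds for user-supplied fit/error callables."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors.append(error(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return sum(errors) / k


def fit_mean(xs, ys):
    """Trivial model: always predict the mean of the training targets."""
    return sum(ys) / len(ys)


def mean_abs_error(model, xs, ys):
    return sum(abs(model - y) for y in ys) / len(ys)


cv_mae = cross_val_error([0, 1, 2, 3], [1.0, 1.0, 3.0, 3.0], k=2,
                         fit=fit_mean, error=mean_abs_error)
```

Because every sample is used for evaluation exactly once, cross-validation makes better use of scarce data than a single train/test split, at the cost of training the model k times.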

The field of ML encompasses various learning paradigms, including supervised, unsupervised, and reinforcement learning. These paradigms have been extensively studied and widely adopted across various application domains.26 A summary of their key characteristics and related algorithms is presented in Fig. 5, color-coded to help differentiate them in Sec. IV, where they will be discussed individually in more detail. In Secs. IV A–IV C, we will focus on these main learning paradigms, including semi-supervised learning (SSL) in Sec. IV D. For a comprehensive overview of recent advancements in the field, including transfer, multitask, and multiview learning, the interested reader is referred to Refs. 64 and 65.

FIG. 5.

Main characteristics and representative algorithms of the four ML paradigms of interest.

Among the different ML paradigms, a noteworthy distinction is the difference between offline and online learning. In offline learning, a batch of data samples is processed simultaneously, while in online learning, data samples are processed sequentially as they arrive over time. Although supervised and unsupervised learning methods can be implemented using either offline or online learning strategies, these methods are traditionally associated with offline realization. In contrast, reinforcement learning, which relies on sequential interactions between the agent and the environment, primarily operates online. Nonetheless, there are also less represented forms of reinforcement learning, such as batch reinforcement learning that decouples data collection and policy training processes, thereby updating an agent's policy with a batch of pre-collected data.66 Another such form is pure batch (or offline) reinforcement learning, which aims to learn policies solely from a suitably diverse and sizable dataset, devoid of any online interactions.67 

Another distinction is between discriminative and generative models. Discriminative models learn the conditional probability distribution $p(y \mid x)$, while generative models learn the joint probability distribution $p(x, y)$.68 While discriminative models are primarily used for classification tasks, they can also be used for other types of supervised learning tasks, such as regression and structured prediction, wherein the output is a structured object, such as a sequence or a tree.69 On the other hand, generative models are primarily used for unsupervised learning tasks of generating new data samples that resemble the training data. Generative models are particularly promising for applications where the training dataset is small or difficult to obtain (see Sec. IV B).

This section provides a succinct overview of the primary learning paradigms outlined in Sec. III B and displayed in Fig. 5. Each subsection closes with examples that demonstrate how the methods can be utilized in the PES domain. Most of the selected examples, including but not limited to predictive maintenance, phase identification, and voltage control, have been extensively studied over the years and are therefore well-suited to illustrate the considered ML paradigms in the context of power distribution systems.

In supervised learning, the goal is to learn a model from labeled data $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$ to generate an output $y$ for each input $x$ (or a probability distribution over $y$ given $x$), which can then be used to make predictions on new, unseen (test) data. There are two primary tasks within supervised learning: classification, where the output $y$ is categorical, and regression, where the output $y$ is numerical:

  • Classification can be further divided into two tasks. The first is binary classification, which aims to determine the class (out of two possible classes) to which a given data instance belongs. The second task is multiclass classification, which assigns a data instance to one of K categories. In the context of power systems, this could include determining if a fault is an open-circuit fault, symmetrical short-circuit fault, or asymmetrical short-circuit fault. Some popular classification algorithms include logistic regression, support vector machines (SVM), k-nearest neighbors (k-NN), and decision trees. Ensemble variants of decision trees, such as random forests, are also commonly used.

  • Regression is a task that involves predicting a numerical outcome by learning the relationship between a dependent variable and one or more independent variables. In the context of power systems, an example of a dependent variable could be the generated power from a PV system, while independent variables could include factors such as solar irradiance, ambient temperature, and PV system specifications. Common regression algorithms include linear regression, polynomial regression, SVM, neural networks, and random forests. These algorithms are widely used in the PES field to make predictions and projections based on historical data.
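
To illustrate multiclass classification, the sketch below applies a k-NN classifier to the fault-type example. The two features, their values, and the function name `knn_predict` are hypothetical and chosen only so that the three classes are visibly separated; real fault classifiers rely on richer, properly scaled features.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Hypothetical features: (minimum phase voltage, maximum phase current), in p.u.
train_x = [(1.00, 0.0), (0.98, 0.1), (0.99, 0.0),   # open-circuit fault
           (0.10, 8.0), (0.15, 7.5), (0.12, 9.0),   # symmetrical short circuit
           (0.60, 4.0), (0.55, 4.5), (0.65, 3.8)]   # asymmetrical short circuit
train_y = ["open"] * 3 + ["symmetrical"] * 3 + ["asymmetrical"] * 3

label = knn_predict(train_x, train_y, query=(0.58, 4.2))
```

k-NN has no explicit training phase: the labeled examples themselves act as the model, and a new sample inherits the label that dominates its neighborhood.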

It is important to note that many supervised learning techniques, including decision trees, random forests, and artificial neural networks (ANNs), can describe complex, nonlinear relationships between input–output pairs. Among them, ANNs have been widely recognized as universal nonlinear function approximators, capable of approximating arbitrary functions with high accuracy.70 Recent advances in DNNs have further improved the capabilities of ANNs. DNNs utilize multiple layers of nodes to capture increasingly complex data relationships, making them one of the most powerful ML techniques and the subject of extensive study across various research fields, including PES.

Supervised learning is arguably the most extensively studied learning paradigm in PES research. Two selected examples showcase the application of supervised learning in forecasting tasks related to distribution systems (Example 1) and for predictive maintenance (Example 2).

Example 1: Forecasting.

Forecasting is crucial in mitigating the increasing uncertainty in distribution system operations and planning. Supervised learning can be employed to forecast relevant system variables, such as distribution locational marginal prices71 and distribution system states,72,73 by learning from historical time series data. These forecasts can provide valuable information for system operators and decision-makers. For instance, accurate forecasts of renewable energy generation can assist operators in managing grids with a high penetration of distributed generation more economically and reliably, alleviating operational uncertainties. Depending on the forecasting horizon, long-term, short-term, and very short-term forecasting (often referred to as “nowcasting” in the ML community) can be distinguished. The first is used in planning studies with timescales ranging from months to years ahead, while the latter two are used in near- and real-time operations, with relevant timescales as depicted in Fig. 6.

Example 2: Predictive maintenance.

As the distribution infrastructure ages, more advanced maintenance practices are required to ensure its reliable operations. One such practice is equipment predictive (condition-based) maintenance.74,75 Supervised learning can be utilized to construct predictive models for this purpose. By utilizing health indicator data of network equipment, such as insulation degradation in transformers, and other readily available data, such as transformer specifications and historical failures, a model can be trained to detect potential equipment failures, thereby allowing operators to take timely preventive measures. This approach can greatly increase equipment longevity and help avoid unplanned equipment failures that can lead to power outages and other service disruptions, thereby increasing the overall reliability of the distribution infrastructure.

FIG. 6.

Illustration of important distribution system timescales for Example 1: Forecasting pertaining to supervised learning application.

Unsupervised learning analyzes unlabeled data $\mathcal{D} = \{x^{(i)}\}_{i=1}^{m}$ with the aim of identifying underlying structures or uncovering patterns that may be present in the data. Some examples of tasks that fall under unsupervised learning include:

  • Clustering is a task within unsupervised learning that groups unlabeled data into clusters based on their similarities. Standard clustering algorithms are Gaussian mixture models, k-means, hierarchical, and spectral clustering. For a comprehensive overview of clustering algorithms, see Ref. 76.

  • Dimensionality reduction is a technique that is frequently employed in the pre-processing stage of data analysis to represent high-dimensional data with significantly fewer dimensions while maintaining the integrity of the data. There are various dimensionality reduction techniques that can be used, with the most popular being principal components analysis (PCA),77 and autoencoders.78 

  • Anomaly detection is a commonly employed technique for identifying observations that deviate from the norm, also known as outliers or anomalies. Among the various algorithms for anomaly detection, two notable examples are one-class SVM and isolation forest.79 In the context of PES, anomaly detection can identify abnormal patterns in measured data that may indicate a deviation from expected system behavior, such as diagnosing faults using anomaly detection algorithms on voltage data.

  • New sample generation involves generating new samples or scenarios with distributions that are representative of the actual distribution of the given data. It is important to note, however, that the quality of the generated scenarios is contingent upon the quality of the input data. One prominent algorithm employed for this task is the GAN, which consists of a pair of neural networks: a generator and a discriminator.80 The generator is trained to produce data samples that closely resemble the training data, while the discriminator is trained to accurately distinguish between the generated samples and the actual training data.80 This process is conducted in an adversarial manner where both models are trained simultaneously.80 Unlike traditional scenario generation methods like Monte Carlo simulations, GANs do not require a priori assumptions about the probability distributions of the input data, making them a more attractive option for PES applications. As a result, GANs have gained considerable attention in recent years, and various studies have demonstrated their efficacy in generating near-realistic scenarios for a wide range of applications. These include profiles for electricity demand, solar PV, and wind generation.81,82

Unsupervised learning has received less interest in PES research due to its lower accuracy compared to its supervised counterpart. Nonetheless, unsupervised learning can be highly useful, particularly for analyzing unlabeled data or in situations where there is not enough labeled data to use supervised learning techniques. In the context of power distribution systems, two common applications of unsupervised learning techniques are phase identification and consumer segmentation. Two selected examples (Example 1: Phase identification and Example 2: Consumer segmentation) are presented below to illustrate these applications.

Example 1: Phase identification.

The identification of phase connections in secondary distribution networks presents a significant challenge due to the vast size of these networks. Unlike transmission and primary distribution networks, the phase connectivity data in secondary distribution networks are often unknown, incomplete, or inaccurate. To facilitate hosting capacity analysis for accommodating new DERs in secondary distribution networks (e.g., residential and commercial PV installations), it is essential to correctly identify the phase connectivity of each customer. To address this challenge, unsupervised learning techniques can be utilized to identify phase connections using smart meter data, specifically voltage or power consumption measurements.83–87 By analyzing and extracting valuable insight from these data, customers with similar characteristics can be grouped into representative clusters (seven in total, assuming single-phase, two-phase, and three-phase loads), each corresponding to the same phase or combination of phases (i.e., phase A, phase B, phase C, and combinations thereof as illustrated in Fig. 7). However, one potential drawback is the sensitivity of the model to the distribution feeders' levels of unbalanced phases. Further information can be found in Ref. 88, which presents a review of the current phase identification methods in the literature.
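
The clustering step can be sketched with a plain k-means routine applied to customer voltage profiles. For brevity, the sketch assumes only single-phase customers (three clusters instead of seven), uses made-up four-sample voltage profiles, and relies on a deterministic farthest-point initialization; none of this is taken from Refs. 83–87.

```python
import math


def k_means(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


# Hypothetical per-unit voltage profiles (four time steps) of six customers;
# customers on the same phase are assumed to see similar voltage variations.
profiles = [
    (1.00, 0.98, 1.01, 0.99), (1.01, 0.98, 1.00, 0.99),
    (0.97, 1.02, 0.98, 1.01), (0.97, 1.01, 0.98, 1.02),
    (1.03, 1.03, 0.96, 0.96), (1.03, 1.02, 0.96, 0.97),
]
labels = k_means(profiles, k=3)
```

The cluster indices themselves carry no physical meaning; mapping each cluster to phase A, B, or C requires at least one customer per cluster whose phase is known.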

Example 2: Consumer segmentation.

Consumer segmentation is a methodology for characterizing consumers based on their load profiles, which are determined by the characteristics of their interruptible (e.g., computers, refrigerators) and adjustable appliances (e.g., air conditioners, washing machines), as well as their individual preferences and consumption patterns. This process enables distribution utilities to develop targeted customer recruitment strategies, such as personalized pricing and incentives, to effectively engage demand response program participants. One approach for achieving this is through the use of unsupervised learning techniques, specifically clustering methods, to analyze smart meter data and extract consumption patterns. By doing so, utilities can gain valuable insight that can assist in managing demand response under various conditions, including weather, social activities, and holidays. For further information on this topic, the interested reader is referred to Ref. 89 and the references therein.

FIG. 7.

Illustration of Example 1: Phase identification pertaining to unsupervised learning application. Each distribution line (lateral) is colored according to its phase.

Reinforcement learning is a popular ML paradigm for learning an optimal policy or set of actions through a trial-and-error search and a reward system.90 The process, diagrammed in Fig. 8, involves an agent interacting with a dynamic environment through a series of actions, each of which leads to a state transition and affects the agent's subsequent actions, all aimed at maximizing a cumulative reward function. In the context of distribution voltage control, exemplified in Fig. 8, reinforcement learning is used to govern the power output of a PV system, thereby minimizing voltage fluctuations and maintaining acceptable voltage levels across the distribution grid. This topic is covered in more detail in Sec. V B 1.

FIG. 8.

A distribution voltage control framework exemplifying reinforcement learning tasks. At each discrete time step $t$, the agent, such as a smart PV inverter, receives a representation of the current environment's state $s_t$, which includes the voltage level. Based on this observation, the agent selects an action $a_t$, such as regulating the power injection level. One time step later, at $t+1$, the environment perceives this action and transitions to the next state $s_{t+1}$ while providing a numerical reward $r_{t+1}$ to the agent, which affects its next action $a_{t+1}$. For example, in the case of a voltage violation where $|V|$ exceeds 1.05 p.u., the agent may initiate power curtailment.

A key distinction between reinforcement learning and supervised learning is the type of feedback provided during training. In supervised learning, the training data provide direct feedback in the form of the actual output for a given input (the ground truth). In contrast, reinforcement learning relies on evaluative feedback: a reward signal that scores the action taken without revealing which action would have been correct. Algorithmic formulations of reinforcement learning include Q-learning, deep variational learning, policy gradient, actor-critic, and state–action–reward–state–action (SARSA), among others.
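
For concreteness, two of the formulations named above, Q-learning and SARSA, are commonly written as the update rules below. The state, action, and reward notation follows Fig. 8; the learning rate $\alpha$, the discount factor $\gamma$, and the action-value function $Q$ are symbols introduced here for illustration and are not defined elsewhere in this article.

```latex
\begin{align}
  % Q-learning (off-policy): bootstraps from the best next action
  Q(s_t, a_t) &\leftarrow Q(s_t, a_t)
    + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right], \\
  % SARSA (on-policy): bootstraps from the next action actually taken
  Q(s_t, a_t) &\leftarrow Q(s_t, a_t)
    + \alpha \left[ r_{t+1} + \gamma\, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right].
\end{align}
```

The only difference between the two rules is the bootstrapping term, which is why SARSA takes its name from the tuple $(s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1})$.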

Reinforcement learning has been extensively studied in recent years for decision-making and control problems in power systems. Within this framework, power system problems are typically modeled as Markov decision processes or variants thereof, such as partially observable Markov decision processes. Broadly, the potential applications of reinforcement learning in power systems can be categorized into two main groups, namely, game-based studies and search-based studies, as indicated in Ref. 58. Game-based studies cover various tasks, such as demand-side management (e.g., Example 1: V2G), strategic bidding for different electricity markets, and power system control tasks that involve coordination over multiple agents (e.g., Example 2: Voltage control). On the other hand, search-based studies include58 system fault diagnosis, security assessment, and cascading outage prediction, to name a few. For a detailed analysis of the advantages and limitations of reinforcement learning in decision-making and control applications for power systems, we recommend referring to the latest review paper by Chen et al.6 

Example 1: V2G.

The issue of optimal management of EV fleets in the context of demand-side management is of growing importance. Not only do EVs impose an extra load on the local distribution network, but they also offer a distributed form of potential energy storage through the use of their built-in batteries, referred to as vehicle-to-grid (V2G) technology. To efficiently schedule EV charging and discharging activities, which enables operators to smooth out fluctuating supply from highly variable renewable generation and achieve desired consumption profiles, the literature suggests the application of reinforcement learning techniques. A recent survey on this topic can be found in Ref. 91.

Example 2: Voltage control.

Maintaining distribution voltage magnitudes within acceptable operating limits, as defined by standards such as ANSI C84.1,92 is becoming increasingly challenging due to the growing number of customers adopting DERs. This has prompted the development of reinforcement learning-based voltage control solutions.93–96 The definitions of environment, state, action, and reward system may vary depending on the specific problem setting. In one such setting, the learning agent assumes the role of the system operator responsible for maintaining voltage magnitudes within the normal operating range ($0.95~\text{p.u.} \le |V| \le 1.05~\text{p.u.}$). The environment, on the other hand, comprises the aggregate of DERs, with the dynamic aspect referring to changes in system states and associated controls, such as adjustments of renewable generation outputs. The underlying physical model of the distribution network is regarded as the unknown environment. Within this framework, states correspond to bus voltage magnitudes and actions may entail regulating the distributed generation outputs. The reward can be defined as positive if the voltage magnitude remains within the upper and lower thresholds and negative if a violation occurs, that is, if the voltage deviates by more than ±5% from the nominal value of 1 per unit.
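
The setting above can be illustrated with a deliberately small tabular Q-learning sketch. Everything specific in it is an assumption made for illustration: a single bus whose voltage depends only on five discrete PV injection levels (the `VOLTAGES` table), three actions, and the hyperparameters in `train`. It is not the method of Refs. 93–96; only the reward rule (positive inside 0.95 to 1.05 p.u., negative outside) mirrors the description in the text.

```python
import random

# Hypothetical bus voltage (p.u.) at each discrete PV injection level.
VOLTAGES = [0.92, 0.96, 1.00, 1.04, 1.08]
ACTIONS = [-1, 0, +1]  # curtail, hold, increase injection

def step(level, action):
    """Toy environment: move to the next injection level and return the reward."""
    nxt = min(max(level + action, 0), len(VOLTAGES) - 1)
    v = VOLTAGES[nxt]
    reward = 1.0 if 0.95 <= v <= 1.05 else -1.0
    return nxt, reward

def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in VOLTAGES]
    for _episode in range(episodes):
        s = rng.randrange(len(VOLTAGES))  # random initial operating point
        for _step in range(steps):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])  # exploit
            s2, r = step(s, ACTIONS[a])
            # Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

def greedy_action(q, level):
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[level][i])]

q_table = train()
policy = {v: greedy_action(q_table, i) for i, v in enumerate(VOLTAGES)}
print(policy)
```

In this toy model the learned greedy policy curtails injection in the overvoltage state and raises it in the undervoltage state, without the agent ever being given the voltage table explicitly.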

We conclude this section with semi-supervised learning (SSL), which is conceptually situated between supervised and unsupervised learning.97 This learning paradigm is particularly relevant when labeled data are scarce or costly to obtain, as it allows for the utilization of unlabeled data to improve model performance. In SSL, the model is trained upon a combination of labeled and unlabeled data, $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m} \cup \{x^{(j)}\}_{j=1}^{m'}$, the ratio of which usually significantly favors the latter (i.e., $m' \gg m$). The labeled data are utilized to establish the relationships between input and output variables, while the unlabeled data are employed to enhance the model's generalization capabilities.97 The most popular and widely used SSL methods include:

  • Self-training98 uses an initial set of a small amount of labeled data to train a supervised learning model and then applies that model to classify the unlabeled data. The most confident predicted labels are then added to the labeled dataset and used to retrain the model.

  • Co-training99 is a variation of self-training where two or more classifiers are trained on different views of data, and their predictions are combined to predict labels for the unlabeled data.

  • Transductive SVM100 is a variation of SVM specifically designed for SSL that leverages the unlabeled data to find the decision boundary that best separates the labeled data.

  • Pseudo-labeling101 involves training an initial model to predict “pseudo-labels” for previously unlabeled data. The pseudo-labeled and original labeled data are then combined, and the model is retrained on this larger training set. For example, consider the task of classifying a distribution line as either undergoing an outage or operating normally using historical measurements. If only a small portion of the dataset is annotated with the status of the line, the labeled dataset alone is likely insufficient to robustly train a supervised learning model. In this situation, pseudo-labeling can augment the original labeled dataset with pseudo-labeled data, potentially increasing the accuracy of the resulting outage detection model.

  • MixMatch102 combines the concepts of data augmentation and “mixup” on both labeled and unlabeled data to generate new examples, and then trains a model on the mixed data using a combination of supervised and unsupervised loss functions, thereby improving the model's performance.
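
The pseudo-labeling loop from the list above can be sketched on the line-outage example. The nearest-centroid classifier, the margin-based confidence score, the 0.5 threshold, and the two-feature measurements (voltage and current in p.u.) are all assumptions chosen to keep the example short; they are not taken from Refs. 98–102.

```python
def centroids(X, y):
    """Per-class mean of the feature vectors."""
    out = {}
    for label in set(y):
        pts = [x for x, lab in zip(X, y) if lab == label]
        out[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return out

def predict_with_margin(cents, x):
    """Nearest-centroid label plus a confidence score in [0, 1]."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
        for label, c in cents.items()
    )
    (d0, label), (d1, _) = dists[0], dists[1]
    margin = (d1 - d0) / (d1 + d0 + 1e-12)  # 1 means very confident
    return label, margin

def pseudo_label(X_lab, y_lab, X_unl, threshold=0.5, rounds=5):
    """Iteratively absorb confidently predicted unlabeled points."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        cents = centroids(X, y)
        remaining, added = [], False
        for x in pool:
            label, margin = predict_with_margin(cents, x)
            if margin >= threshold:
                X.append(x)
                y.append(label)  # accept the pseudo-label
                added = True
            else:
                remaining.append(x)  # too uncertain, keep unlabeled
        pool = remaining
        if not added:
            break
    return centroids(X, y), pool

# Hypothetical (voltage, current) measurements in p.u.
X_labeled = [[1.00, 0.80], [0.02, 0.00]]
y_labeled = ["normal", "outage"]
X_unlabeled = [[0.98, 0.75], [1.02, 0.90], [0.05, 0.01], [0.00, 0.00], [0.50, 0.40]]

model, leftover = pseudo_label(X_labeled, y_labeled, X_unlabeled)
print(model, leftover)
```

The confidence threshold is the main design choice: set too low, wrong pseudo-labels reinforce themselves; set too high, little unlabeled data is used.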

SSL presents a significant area of research potential within the field of PES, particularly when labeled data are scarce. One example is the classification of rare events that disrupt normal power system operations due to extreme weather events or cyber-events. In such scenarios, traditional supervised learning solutions may result in sub-optimal decision boundaries. However, SSL can overcome this limitation by incorporating both labeled and unlabeled data into the learning process to aid in such classification tasks. Specifically, the inclusion of unlabeled data allows the SSL algorithm to infer close-to-optimal decision boundaries, thereby improving classification accuracy. Previous studies have applied SSL techniques for fault diagnosis and event detection,103,104 detection of cyber-attacks such as false data injection attacks,105 electricity theft detection,106 and nonintrusive load monitoring.107,108 However, the full potential of SSL has yet to be explored in various distribution system applications. As an illustration of one such application, the detection of voltage harmonic distortions is chosen, although it should be noted that SSL can also be used for the detection of other power quality disturbances.

Example: Voltage harmonic distortion detection.

The proliferation of power electronics-based household appliances, or nonlinear loads, has resulted in an upsurge of voltage harmonic distortion within secondary distribution networks.109 The identification of instances of single even or odd voltage harmonic distortion can be a time-consuming process. A potential solution to this problem is to utilize the semi-supervised approach for the classification of voltage harmonic distortion. The proposed approach involves conducting simulations on power systems with varying levels of nonlinear loads to generate voltage profiles, and subsequently, utilizing these profiles to construct a classifier through SSL. By leveraging the combination of labeled and unlabeled data, the classifier can achieve improved performance in identifying instances of single even or odd voltage harmonic distortion within a subset of the labeled data. It is important to note that a sufficient amount of representative data is needed to ensure the robustness of the classifier.

Remark 6. The distribution system examples presented in this section assume a particular learning paradigm. However, the formulation of certain tasks can be approached from alternative perspectives, thereby enabling the potential utilization of learning paradigms other than those previously demonstrated. For instance, supervised learning techniques may be utilized for phase identification tasks,110 while unsupervised learning approaches may be employed for predictive maintenance. Additionally, SSL can be utilized for customer segmentation tasks, such as identifying household profiles,111 as an alternative to unsupervised learning methods. Furthermore, many problems encountered in distribution systems exhibit a high degree of complexity, which can be mitigated by decomposing them into smaller sub-problems, akin to the divide-and-conquer algorithm approach. Subsequently, various learning paradigms can be utilized to address these sub-problems.

This section aims to identify application areas in power distribution systems that could greatly benefit from ML contributions in the future. The identified areas are organized into subsections according to the shared challenges that make them suitable for ML applications. Figure 9 provides an illustrative representation of these categories, aiding readers in comprehending the potential of ML in each application domain. Although this section does not intend to comprehensively survey all relevant ML applications in distribution systems, the insights presented herein offer valuable perspectives on how ML can improve the operational and planning practices of distribution systems, especially in cases where conventional tools are inadequate or inapplicable.

FIG. 9.

Selected ML applications according to the advantages they offer.

The following examples showcase the use of ML in tackling the task of constructing models pertaining to distribution system operations that are difficult to develop based on first principles alone. ML-based models can help predict system behavior and enhance system resiliency in response to evolving consumer and environmental factors, where traditional first-principles models are not available. Here, we can distinguish between data-driven models that leverage the vast amount of data to capture complex behaviors that are difficult to model and agent-based models, which allow for the representation of complex interactions between system components and agents.

1. Modeling consumer impacts on distribution systems

A highly promising area of ML applications in the power distribution system domain involves modeling consumer behavior for demand-response programs. For example, intelligent building control systems rely on accurate modeling of occupant behavior in buildings. However, the complexity of consumer behavior—driven primarily by human habits and weather—renders it challenging to model explicitly using mathematical equations. Likewise, modeling the impacts of household-owned batteries and EV charging on power distribution systems is also confronted with similar challenges. Mathematical equations alone may not fully capture the complex consumer behavior regarding energy consumption patterns, whereas ML techniques can excel in capturing such unknown complex relationships. More accurate models can, in turn, facilitate the development of coordination strategies for DERs to provide efficient demand-response. At present, research in this area is largely focused on reinforcement learning and deep reinforcement learning. Interested readers are directed to Refs. 112 and 113 for an in-depth review of occupant behavior modeling in buildings and EV charging management using reinforcement learning techniques, respectively.

Given the many successful examples of ML predicting consumer behavior in other application areas (e.g., recommender systems, targeted advertising), there is a powerful set of techniques available that could be extended and applied to understanding, predicting, and shaping consumer impacts on the distribution system. Nevertheless, leveraging ML techniques to model consumer behavior and/or impacts requires careful consideration of equity and fairness. Reference 114 provides a comprehensive overview for readers seeking a general understanding of the increasing concerns surrounding bias, fairness, and equity in ML-driven applications. There are numerous steps that can be taken to mitigate bias and ensure that ML systems are not only accurate but also fair and equitable.114 This matter is particularly critical for equitable market predictions in future electricity markets with a plethora of diverse and small-scale participants (commonly known as prosumers), or when executing demand-response initiatives to guarantee an equitable and unbiased outcome for all parties involved.

2. Modeling climate impacts on distribution systems

While many of the detailed power system analyses rightfully focus on analyzing the power system in isolation, in some cases, the power system is dramatically impacted by external systems. Of particular interest is modeling and predicting the impact of natural disasters on the distribution system as such events become more common and more destructive due to the effects of climate change. ML is a promising approach to capture such complex interactions and take advantage of the rich and varied weather datasets to build predictive models relevant to the power system.115 Wildfires are another example of a natural system that is bidirectionally coupled to the distribution system (wildfires can cause power outages, and power lines can ignite wildfires) and that is becoming more prevalent due to climate change. Recent works have begun to address how wildfire risk should factor into power system operations.116,117

The subsequent application areas of ML in power distribution systems underscore its capacity to speed up computations or simplify modeling and simulation. To avoid repetition, this section will not reiterate on other pertinent topics that were previously discussed, such as the utilization of ML to accelerate D-OPF by optimizing the candidate space or linearizing power flow equations.

1. Voltage control in power electronics-dominated distribution systems

Power distribution systems worldwide are witnessing an increasing adoption of power electronics in various forms, chiefly inverter-interfaced renewable generators (such as PV systems and wind turbines) and power electronics loads (such as power electronics-backed air conditioners). Accordingly, the traditional approach to voltage control in power electronics-dominated distribution networks is becoming increasingly challenged.118 Instead, the active participation of distributed local power electronics will play a key role in various system-wide control mechanisms (e.g., voltage and frequency control), further increasing the complexity of these problems. Furthermore, conventional voltage regulation devices, such as transformers' on-load tap changers and capacitor banks, may not be adequate to respond to rapidly changing operating conditions due to their slow response times, which typically range from seconds to minutes. Moreover, traditional optimization-based methods for voltage control, rooted in the D-OPF framework, may experience slow computational speed and convergence difficulties.

Given these challenges, ML and data-driven techniques—in particular reinforcement learning and deep reinforcement learning—have emerged as a promising avenue for research that can help to (i) enable faster convergence of existing voltage control methods for real-time control or (ii) develop novel control methods adapted to the use of local power electronics controls. The basic concepts of the application of reinforcement learning to the voltage control problem were presented previously in Sec. IV C, Example 2, and are therefore omitted here for brevity. Reinforcement learning has proven effective in this area to deal with the complexity of the voltage control problem without requiring detailed system models (which are often not available) and in a distributed manner that can scale to large systems. Recent work in this area has focused on algorithms that not only learn good policies for nominal conditions, but also provide some guarantee that safety constraints will be met96 or that the system will remain stable.119–121 These works can serve as an example for other application areas in providing guarantees that will aid in the adoption of data-driven techniques (see Sec. VI B 4).

2. Adaptive relay protection for active distribution systems

Traditionally, protection relay schemes in distribution systems have been programmed for unidirectional power flows.122,123 The most common is overcurrent protection, which is designed to operate when the short-circuit (fault) current exceeds some predefined threshold (i.e., $I_f > I_{\min}$). However, the increasing displacement of synchronous generators with inverter-based resources may lead to the miscoordination of overcurrent protection schemes.123 This is due to these resources' notably lower fault current contributions than those in traditionally operated distribution networks. In addition to potential miscoordination, reduced fault currents in power electronics-dominated distribution networks can make fault detection more difficult. Another concern is related to more frequent reverse power flows that can adversely affect normal network operations by causing relay protection to malfunction (i.e., unwanted tripping). This calls for new protection concepts that should gradually replace the functionality of traditional protection systems to ensure stable and safe distribution system operations during both normal and contingency operations.123
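
A small numerical sketch shows why a fixed overcurrent threshold can fail once inverter-based resources dominate. All values are hypothetical: the 600 A pickup setting, the 3000 A fault current of a conventionally fed feeder, and the assumption that an inverter-dominated source is limited to roughly 1.2 times its 400 A rated current during a fault.

```python
def overcurrent_trips(fault_current_a, pickup_current_a):
    """Basic overcurrent criterion: operate when I_f exceeds the pickup threshold."""
    return fault_current_a > pickup_current_a

PICKUP_A = 600.0  # hypothetical relay setting

# Conventionally fed feeder: large fault current (hypothetical value).
conventional_fault_a = 3000.0

# Inverter-dominated feeder: fault current capped near rated current
# (the 1.2 multiplier is an illustrative assumption).
inverter_rated_a = 400.0
inverter_fault_a = 1.2 * inverter_rated_a

trips_conventional = overcurrent_trips(conventional_fault_a, PICKUP_A)
trips_inverter = overcurrent_trips(inverter_fault_a, PICKUP_A)
print(trips_conventional, trips_inverter)
```

Under these assumed numbers the same fault goes undetected in the inverter-dominated case, which is the detection problem that motivates settings that adapt to the operating condition.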

The idea of adaptive protection has emerged as a potential solution;122 however, it is still in its infancy, as is the application of ML for protection.124–126 Nevertheless, ML has the potential to play a crucial role in advancing the field of relay protection in distribution systems. Specifically, ML techniques can aid in developing adaptive relay protection schemes, making it an exciting future research direction. By utilizing look-ahead search techniques, such as Monte Carlo Tree Search,127 the accuracy and speed of fault detection and isolation can be improved. Moreover, ML-based solutions can replace the conventional lookup tables used for protection relay settings, thereby enabling the establishment of more accurate and adaptive protection settings. However, implementing such solutions requires extensive offline analysis of all possible future outcomes, which can be prohibitively expensive. Despite this, the advantages that ML can bring to relay protection for active distribution systems make it a worthwhile topic for future research and development.

3. Dynamic modeling of distribution systems

The power system applications discussed to this point consider a static or quasi-static model of the power system (4). On the other hand, the application of ML to dynamic environments described by a time-varying (dynamic) model (13) is a challenging open research problem:
$$\dot{y} = f(t, y, x). \tag{13}$$
Dynamic modeling of the power system, and especially transient stability analysis, is particularly computationally expensive for large systems, given that existing methods involve numerical integration of large systems of differential equations. ML techniques are a promising approach to predict dynamic behavior, thereby removing the computationally expensive step of numerical integration.128 These techniques, however, have to date been applied on small test systems and must be improved upon to scale to realistically sized systems. In addition, whereas solution tolerance and step size can be used to control the accuracy of the solution with numerical integration techniques, there are no such guarantees with ML-based approaches for predicting the dynamic response, which is a potential barrier for safety-critical applications (see Sec. VI B 4 for an extended discussion on performance guarantees).
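
The point about step size controlling accuracy can be made concrete with a forward Euler integration of a model in the form of Eq. (13). The scalar dynamics $f(t, y, x) = -x\,y$, the parameter value, and the step counts are illustrative assumptions; practical transient stability tools use far larger systems and more sophisticated integrators.

```python
import math

def euler(f, y0, x, t_end, n_steps):
    """Forward Euler integration of dy/dt = f(t, y, x) from t = 0 to t_end."""
    h = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y += h * f(t, y, x)
        t += h
    return y

def f(t, y, x):
    # Toy linear dynamics; x plays the role of a fixed decay-rate parameter.
    return -x * y

exact = math.exp(-1.0)  # analytical solution y(1) for y0 = 1, x = 1
err_coarse = abs(euler(f, 1.0, 1.0, 1.0, 10) - exact)
err_fine = abs(euler(f, 1.0, 1.0, 1.0, 1000) - exact)
print(err_coarse, err_fine)
```

Shrinking the step reduces the error in a predictable way, at the price of more function evaluations. A learned surrogate avoids that cost but offers no comparable knob for trading computation against accuracy.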

Another potential application is in developing data-driven models that capture the aggregate dynamic behavior of an active distribution system for use in transmission-level analysis (modeling both domains in detail is computationally prohibitive). Future research should focus on how ML-based dynamic models can be integrated seamlessly into existing dynamic simulations, which to date are centered around physics-based models.

The following application areas demonstrate how ML can be employed to alleviate issues related to missing, incomplete, or nonexistent data. In addition to the topics covered below, several other ML applications are relevant to this category, including synthetic data generation for creating pseudo load profiles used in DSSE (discussed in Sec. II C 3) and topology and phase identification (see Example 1 in Sec. IV B).

1. Building registers of distributed PV for demand-side management

Advanced demand-side management can help mitigate new operational challenges facing system operators by allowing them to shape and modulate flexible loads optimally. However, systematically missing or out-of-date information (e.g., size and location) on installed solar PV panels on residential and commercial rooftops makes demand-side management difficult at scale. Some reasons for this lack of data include older installations made prior to regulations being enacted, lack of owners' awareness of permitting rules, unauthorized installations to avoid permit fees, and discrepancies between reported and installed configurations.129 At the same time, registration of unregistered PVs can be highly challenging due to their large numbers in poorly visible secondary distribution networks.

Because accurate and up-to-date repositories of distributed PVs and other flexible loads are a prerequisite to effectively employing demand response, ML-based solutions are gaining research attention. In one line of research, ML has been proposed to classify and segment rooftop PVs using aerial and satellite imagery, thereby enabling the creation of more accurate registers (databases) according to the spatial scale of interest (e.g., street or city level). An overview of opportunities and challenges on this topic can be found in Ref. 130. Other data-driven solutions proposed in the literature rely on smart meter measurements (time-series data, as opposed to images) to detect abnormal energy consumption behaviors, including unauthorized PV installations. An exemplary work in this category is Ref. 131, where the corresponding solution is developed using a change-point detection algorithm.
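
To illustrate the smart meter route, the sketch below locates a single shift in the mean of a consumption series by least squares. This is a generic textbook change-point formulation chosen for brevity and is not the algorithm of Ref. 131; the daily midday net-consumption values and the function name `single_change_point` are made up for the example.

```python
def single_change_point(series, min_seg=3):
    """Index k that best splits the series into two constant-mean segments."""
    n = len(series)
    best_idx, best_cost = None, float("inf")
    for k in range(min_seg, n - min_seg + 1):
        left, right = series[:k], series[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        # Total squared deviation from the two segment means.
        cost = (sum((v - mean_l) ** 2 for v in left)
                + sum((v - mean_r) ** 2 for v in right))
        if cost < best_cost:
            best_idx, best_cost = k, cost
    return best_idx

# Hypothetical daily midday net consumption (kWh); the drop after day 6
# mimics the effect of a newly installed, unregistered PV system.
net_load = [5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0]

change_day = single_change_point(net_load)
print(change_day)
```

A persistent midday drop such as this one is only a hint; in practice it would need to be distinguished from other causes, such as changes in occupancy or appliance replacement.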

2. Situational awareness for distribution system operations and planning

The traditional operations and “fit-and-forget” approach to distribution network planning (hosting capacity analysis) are becoming inadequate with the widespread adoption of DERs. The necessary changes in operational and planning practices rely on increasing the grid's last-mile visibility, which has historically been difficult to achieve due to a lack of metering devices and insufficient communications infrastructure.41,42 However, the massive rollout of smart meters has significantly increased the number of metering devices installed at the grid's edge.132 In addition to smart meters, new untapped sensing potential comes from broadband cable television networks, which have sensors deployed across countries and their own high-speed, low-latency communications network.133 Unlocking the potential of these sensors can enable greater insight into the operations of last-mile networks, thus allowing operators to react promptly.

Despite the influx of new data, missing measurements due to metering sensor errors or communication disruptions will become an inevitable challenge. However, leveraging the low-rank property of the streaming data matrix can allow for the utilization of low-rank matrix completion and tensor completion techniques, as well as certain ML methods, to recover these missing measurements.134–136 Although data recovery is an important topic, the lack of visibility into distribution system operations currently presents more pressing challenges. For instance, outage detection in distribution systems remains a nontrivial task for operators, who still rely on customer calls to identify power outages and then send crews to locate and eliminate the cause. To overcome this, fault location, isolation, and service restoration (FLISR) is a promising application in the advanced management of distribution systems, but it requires improved observability. To improve situational awareness at the distribution system level, the PES community has recognized the numerous opportunities that ML can offer, in addition to those discussed in Sec. II C 3 in the context of DSSE. For example, ML can be applied for outage detection,137,138 fault location,139 and service restoration,140,141 among others. Readers interested in further information on ML for FLISR are encouraged to refer to Ref. 10, which provides a comprehensive literature review on AI applications for distribution system operations.
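
The low-rank idea for recovering missing measurements can be sketched in its simplest form, a rank-1 model fitted by alternating least squares. The rank-1 restriction, the function name `complete_rank1`, and the small meter-by-time matrix are illustrative assumptions; the methods in Refs. 134–136 handle higher ranks, noise, and streaming data.

```python
def complete_rank1(matrix, iters=200):
    """Fill None entries assuming matrix[i][j] is close to u[i] * v[j]."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iters):
        # Fix v, solve a least-squares problem for each row factor u[i].
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] is not None]
            u[i] = (sum(matrix[i][j] * v[j] for j in obs)
                    / sum(v[j] ** 2 for j in obs))
        # Fix u, solve a least-squares problem for each column factor v[j].
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] is not None]
            v[j] = (sum(matrix[i][j] * u[i] for i in obs)
                    / sum(u[i] ** 2 for i in obs))
    return [
        [matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
         for j in range(n_cols)]
        for i in range(n_rows)
    ]

# Hypothetical readings: rows are meters, columns are time steps.
# Each meter follows the same daily shape scaled by its size (exactly rank 1).
readings = [
    [1.0, None, 2.0, 0.5],
    [2.0, 3.0, 4.0, 1.0],
    [3.0, 4.5, 6.0, None],
]

filled = complete_rank1(readings)
print(filled)
```

Both gaps in this example have a true value of 1.5, since the underlying matrix is the outer product of the meter scales (1, 2, 3) and the time shape (1, 1.5, 2, 0.5).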

ML methods and implementations must be characterized by well-posedness, reliability, and robustness in order to be accepted for practical application in distribution systems beyond the research community. Furthermore, their outcomes must be readily verifiable, validated, and reproducible. To attain this, several key challenges must be effectively addressed, including: (i) data availability, quality, nonstationarity, and privacy; (ii) robustness, scalability, and performance guarantees of selected models; (iii) adherence to physical principles and laws; (iv) explainability and visualization of obtained solutions (i.e., interpretability); and (v) a unified evaluation methodology for ML performance, encompassing well-defined metrics and structured baselines for comparison. Figure 10 outlines the identified research directions aimed at addressing these critical challenges through the lens of applied ML. The discussion in this section is centered on challenges and opportunities, emphasizing the critical research directions necessary for making ML a credible and trusted methodology in distribution systems. Nevertheless, we argue that the presented discussion can be generalized to other systems within the PES domain, including bulk power systems.

FIG. 10.

Main research themes for ML applications in power distribution systems.

FIG. 10.

Main research themes for ML applications in power distribution systems.

Close modal

The first research theme is centered around three main strategies: (i) leveraging domain knowledge (i.e., expert feedback) on data acquisition and model selection; (ii) incorporating physical principles and governing laws into ML models; and (iii) domain-aware decision-making or policy-making. Each of these strategies comprises multiple challenges that will be discussed in subsequent subsections VI A 1–VI A 4.

1. Data acquisition

The centrality of data in ML highlights the fundamental importance of data adequacy for successful ML implementations. Data adequacy refers to data availability, quality, and nonstationarity, each of which may significantly constrain ML applications in distribution systems. ML can provide novel and valuable perspectives on these data adequacy challenges, guiding data acquisition and ensuring the adequacy and quality of data for distribution system applications.

a. Data availability

ML relies heavily on a substantial amount of data—often hundreds or thousands of samples—to train its algorithms. The fastest-growing area of ML, deep learning, requires even more extensive data. In practice, the data used for learning ML models in distribution systems are often obtained through computational simulations, as observational data may be scarce. However, generating these data through simulations can be costly. Therefore, it is of interest to design data generation and collection procedures that minimize the amount of data required and its associated costs. Research is needed to identify the most relevant data to produce or gather to improve model results. Using unsupervised and semi-supervised methods with small samples can also prove to be difficult. Gaussian processes (GPs)142 and their variants (e.g., sparse approximations143) offer a potential solution in these cases. Another approach to overcome the scarcity of data is through data augmentation using GANs, which can synthesize additional training samples from the existing data to supplement the original sparse datasets. Nevertheless, maintaining diversity in the synthesized data and accurately representing out-of-sample or rare events are major ongoing challenges when using generative techniques.

b. Data quality

Data quality is vital to the efficiency of the learning process and encompasses various aspects of the data, such as completeness, consistency in representation, informativeness, and trustworthiness. Real-world distribution system measurements and operational data are often plagued by issues such as missing values, noise, and nonuniform sampling rates, requiring extensive pre-processing for effective utilization. Additionally, the available data may not accurately reflect the overall distribution of measurements, leading to unrepresentative results. Therefore, ML methods must be able to accommodate the presence of noise, communication delays, and missing data in real-world scenarios. Furthermore, while ML models are often trained and tested on data obtained from high-fidelity simulators and accurate distribution system models, these simulators may contain modeling errors and simplifications themselves and may not always have access to the exact parameters and topology of the distribution network. Therefore, it is essential to incorporate domain knowledge to guide data acquisition and ensure its quality and adequacy.

c. Data drift

Data drift in the field of ML refers to changes in the distribution of data over time relative to the data distribution upon which the model was originally trained. Within the context of PES, data drift can result from changes in the topology of distribution networks, such as reconfigurations of loads and the addition of DERs, as well as transitions between grid-connected and islanded modes in microgrids. The current literature on this issue generally suggests retraining the ML model in response to each topological change, but this may not be a feasible solution for rapidly changing operational settings with a large number of network topology combinations. Further research is thus necessary to fully understand and address the challenges posed by data drift in power distribution systems.
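
One simple way to check whether incoming data still resemble the training data is a two-sample Kolmogorov–Smirnov statistic, sketched below. Using this statistic as a retraining trigger, the voltage samples, and the function name `ks_statistic` are illustrative assumptions rather than a practice reported in the literature discussed here.

```python
def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples (0 to 1)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical bus voltage magnitudes (p.u.).
training = [0.98, 0.99, 1.00, 1.01, 1.02]
after_reconfiguration = [1.02, 1.03, 1.04, 1.05, 1.06]

no_drift = ks_statistic(training, training)
drift = ks_statistic(training, after_reconfiguration)
print(no_drift, drift)
```

A value near 0 indicates that the two samples follow similar distributions, while a value near 1 indicates a pronounced shift; where to place the alarm threshold is a design decision that depends on sample size and on how costly retraining is.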

d. Data privacy

In the PES community, there are growing concerns about the accessibility of unencrypted consumer electricity usage data collected by smart meters and other data sources.144 These concerns call for the deployment of solutions that prioritize and preserve consumer privacy. The adoption of cloud services is a viable option for addressing these challenges. For instance, software running on a distribution utility's server, which collects data from various end-users, may transmit only aggregated or anonymized statistics to the cloud, thereby allowing for data analysis without exposing individual consumer data. While this approach may suit certain problems, its usefulness may be limited in demand-response applications, where granular data are required to generate personalized incentives. In addition, major hurdles lie in getting utilities to embrace cloud services and in overcoming the inherent cybersecurity risks associated with this approach.
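
The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any utility's software: the record layout `(customer_id, feeder_id, kwh)`, the function name `aggregate_by_feeder`, and the minimum group size are all hypothetical, and suppressing small groups is an added safeguard that the text does not prescribe.

```python
from collections import defaultdict


def aggregate_by_feeder(readings, min_customers=3):
    """Reduce per-customer readings to feeder-level totals.

    readings: iterable of (customer_id, feeder_id, kwh) tuples (assumed layout).
    Feeders with fewer than `min_customers` distinct customers are dropped so
    that a total cannot be traced back to one or two households.
    """
    totals = defaultdict(float)
    customers = defaultdict(set)
    for customer_id, feeder_id, kwh in readings:
        totals[feeder_id] += kwh
        customers[feeder_id].add(customer_id)
    # Customer identifiers never leave this function; only counts and sums do.
    return {
        feeder: {"customers": len(customers[feeder]),
                 "total_kwh": round(totals[feeder], 3)}
        for feeder in totals
        if len(customers[feeder]) >= min_customers
    }


# Hypothetical smart-meter readings for one interval.
readings = [
    ("c1", "F1", 1.2),
    ("c2", "F1", 0.8),
    ("c3", "F1", 2.0),
    ("c4", "F2", 1.5),  # the only customer on F2, so F2 is suppressed
]
print(aggregate_by_feeder(readings))
```

Aggregation of this kind reduces exposure but is not a formal privacy guarantee, and it removes exactly the granularity that demand-response applications need.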

2. Model selection

In the PES field, applying new ML methodologies, particularly DNNs and GNNs, requires caution. DNNs and GNNs are rapidly evolving and growing research areas within ML, so new models are frequently developed and made available. In turn, PES research is often focused on applying these state-of-the-art neural networks, potentially at the expense of more pressing needs. For example, the reliance on large amounts of data, a crucial requirement for deep learning, is frequently overlooked in distribution system research. This shift toward more data-intensive ML methods has also diverted attention away from traditional, less data-intensive ML methods that may be better suited to the problems at hand. To address these concerns, it is imperative to carefully consider the suitability of state-of-the-art ML methods for distribution systems and make informed decisions on their use. This requires a collaborative effort between researchers and practitioners from both disciplines, focused on understanding the context and needs of the distribution system domain.

3. Hybrid and physics-informed solutions

In response to the increasing complexity of the distribution system, researchers have proposed a variety of ML-based alternatives to existing tools. As discussed in examples throughout this paper, these ML algorithms can effectively deal with complexity, reduce the computational burden of analysis, or leverage existing datasets, to name a few of the many potential benefits. At the same time, decades of progress in PES have produced traditional physics-based models and techniques that are governed by domain knowledge and theoretical foundations. In many cases, the best solution is not simply choosing one technique over the other, but rather combining the two in novel ways to achieve the benefits of each while limiting the drawbacks of each (Fig. 11). Such hybrid approaches have numerous benefits, including reduced data requirements and enhanced performance; for example, models can be trained with fewer data points or can converge faster to optimal solutions.145

FIG. 11.

Merits of hybrid models relative to purely data-driven or model-based solutions.

Incorporating physics into ML models can be achieved through the enforcement of soft and/or hard constraints during the training process, prediction time, or both. Hard constraints involve transforming an optimization problem into a constrained one, while soft constraints involve modifying the objective function (i.e., the loss function) by incorporating additional physics-based terms.145 This results in the creation of physics-constrained neural networks and physics-informed neural networks, respectively, which are characterized by their increased capability for generalization and enhanced interpretability.145 The interested reader is referred to Ref. 34 for a list of software libraries specifically designed for physics-informed ML. Further research is necessary to explore hybrid and physics-informed approaches to achieve better performance and explainability in various distribution system applications.
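
To make the distinction between soft and hard constraints concrete, the following Python sketch uses a deliberately simple physical rule: in a toy lossless network, the net power injections at all buses must sum to zero. The rule, the penalty weight, and the function names are assumptions chosen for illustration only; real physics-informed models would use the power flow equations and a differentiable ML framework.

```python
def mse(predicted, measured):
    """Ordinary data-fit term: mean squared error."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted)


def balance_residual(injections):
    """Power-balance mismatch of a toy lossless network (should be zero)."""
    return sum(injections)


def soft_constrained_loss(predicted, measured, weight=10.0):
    """Soft constraint: add a physics-based penalty term to the loss function.
    The weight is a hypothetical tuning parameter."""
    return mse(predicted, measured) + weight * balance_residual(predicted) ** 2


def project_to_balance(predicted):
    """Hard constraint at prediction time: return the closest vector
    (in the least-squares sense) whose entries sum exactly to zero."""
    shift = sum(predicted) / len(predicted)
    return [p - shift for p in predicted]


# Hypothetical per-unit bus injections.
measured = [1.0, -0.4, -0.6]
predicted = [1.1, -0.4, -0.6]  # violates the balance by 0.1 p.u.

print(mse(predicted, measured))
print(soft_constrained_loss(predicted, measured))
print(balance_residual(project_to_balance(predicted)))
```

The soft version only discourages violations, so a trained model may still break the rule slightly, whereas the projection guarantees that the rule holds for every output at the cost of shifting the prediction away from the unconstrained estimate.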

4. Operator in-the-loop

The core principles of ML assume minimal human intervention, which often contradicts traditional practices in distribution system management. Despite the growing trend toward digitization and automation in power distribution systems, the adoption of ML-based solutions in these systems requires expert input, particularly in the areas of operations, control, and planning. It is thus unrealistic to expect that ML alone can revolutionize the operations of power distribution systems. A more viable approach is to leverage ML-based solutions in conjunction with expert input, improving system operations by offloading certain tasks and allowing operators to focus on optimizing operations. Combining ML with human expertise in this way would reduce the need for human intervention and specialized expertise without eliminating either.

The second research theme is centered around three main strategies: making ML results (i) explainable, (ii) understandable, and (iii) repeatable. For a general treatment of these concepts, the reader is referred to Ref. 146.

1. Interpretability

Traditionally, power engineers' confidence in the suitability of a model for a particular application is closely tied to their understanding of how the model works, its adherence to fundamental physical principles, and the degree to which its outputs facilitate effective decisions. However, this can be challenging with ML models, which are often hard to understand due to their lack of interpretability. To overcome this issue, research into methods that incorporate domain knowledge is important to make these models understandable and trustworthy. This ties into the research theme of physics-informed ML, which has been previously discussed.

Many ML frameworks used in distribution systems employ data-driven methodologies that produce black-box solutions with restricted information, making them difficult to explain. Although these techniques have exhibited good performance, their lack of interpretability renders them unfit for deployment in safety-critical systems. It is therefore essential to incorporate interpretability to establish trust in ML models and to effectively communicate their outcomes to system operators and decision-makers. However, as ML models become more complex (a trend also evident in PES research), they become harder to explain, let alone understand. This leads to a trade-off between complexity and interpretability. One approach is to quantify this trade-off and focus on developing interpretable solutions tailored to specific distribution system applications, so that operators can understand and trust the results of the models. Despite the extensive body of literature on interpretable ML,147,148 much progress still needs to be made in creating interpretable ML models tailored specifically to the PES domain.

2. Visualization

Visualization is a frequently overlooked yet crucial aspect of decision-making for operators in control centers. Apart from facilitating their comprehension of system processes, it also enables them to swiftly identify critical states in the system and take prompt action. When it comes to ML systems, it is imperative to take into account the needs and preferences of decision-makers and system operators concerning the visualization of both the learning process and the resulting outcomes. To advance research in this area, the focus should be on developing advanced but intuitive explanatory approaches that incorporate domain knowledge. Such practices would aid operators in understanding and interacting with complex ML models. A promising avenue for future research is therefore the development of post-hoc visualization and explanatory techniques, which visualize and explain the results of ML models after they have been trained rather than during the training process. Although showcasing the accuracy of a learning model during training is useful, it is even more useful to identify the cases where the predictions were incorrect and focus on them. Another prospective area for future development is how to properly visualize the metrics used to evaluate the performance of ML models, a topic discussed next.

3. Operator-friendly model evaluation

Researchers tend to prioritize finding the ML model that optimizes a chosen performance metric. However, relying solely on a single evaluation metric to judge model performance can be misleading. For instance, when dealing with an unbalanced dataset in which classes are not equally represented, a performance metric like accuracy can provide a skewed perspective of the model's performance. Consider a classification task that aims to detect three-phase faults, the rarest variety, in power distribution systems. If the dataset used for this task contains 100 000 fault currents, with only 100 of them being the faults of interest and the rest being single- and two-phase faults, a trivial algorithm that classifies all faults as single- and two-phase faults (i.e., “non-three-phase faults”) would still have an accuracy of 99.9%. This could lead to the incorrect conclusion that the classifier is highly accurate, even though it misses every fault of interest. It is therefore crucial to rigorously quantify the performance of learning systems, including the estimation of prediction quality and effective confidence bounds, to increase the reliability and credibility of ML in distribution systems. Additionally, efforts must be made to make performance measures understandable to operators by aligning them with operators' expectations and translating them into more user-friendly metrics.
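
The arithmetic of the fault-detection example can be reproduced directly. The Python sketch below builds the same class balance (100 three-phase faults among 100 000 fault records), applies the trivial classifier that never predicts a three-phase fault, and reports accuracy alongside recall, i.e., the fraction of the faults of interest that are actually detected. Recall is used here as one common complementary metric; the label encoding (1 for three-phase, 0 otherwise) is an assumption of this sketch.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of the positive (rare) class that the classifier detects."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    detected = sum(p == positive for p in predictions_for_positives)
    return detected / len(predictions_for_positives)


# 1 = three-phase fault (the class of interest), 0 = any other fault type.
y_true = [1] * 100 + [0] * 99_900
# Trivial classifier: always predicts "non-three-phase fault".
y_pred = [0] * 100_000

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.999
print(f"recall   = {recall(y_true, y_pred):.3f}")    # recall   = 0.000
```

Reporting both numbers side by side makes the failure obvious: the classifier is right 99.9% of the time and still detects none of the faults it was built to find.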

4. Performance guarantees

For certain high-regret applications in power systems (e.g., determining system stability), the costs associated with failure are so large that even a very reliable model will not be trusted by operators without performance guarantees. An important area of future research will therefore be to augment ML approaches with performance guarantees or to adopt hybrid approaches that combine ML with traditional analysis that can provide guarantees. For example, in Refs. 120 and 121, techniques from robust control theory and Lyapunov stability theory, respectively, are combined with reinforcement-learning controllers to achieve stability guarantees. In the supervised learning domain, some efforts have been made to provide worst-case performance guarantees when learning OPF solutions.149 Despite these promising examples, there remain many proposed applications of ML in power systems that would benefit from work in this area. By incorporating guarantees into ML techniques, researchers can help remove a major barrier to adoption. In cases where strict guarantees are not possible, probabilistic ML methods can help to accurately convey the uncertainty inherent in the model (see Sec. VI C).

Finally, it is worth noting that although ML robustness could be a subsection of its own, we cover it briefly here, as the notion of ML model robustness is closely related to this topic. Namely, many data-driven models assume that the inputs to the model are unaltered; however, those designed for power system applications may be susceptible to various perturbations, both adversarial and nonadversarial.150 Therefore, another important area of research will be analyzing and quantifying the robustness of the ML-based solutions against altered data instances. A relevant work on this particularly important topic is Ref. 150.

5. Benchmarks

Many of the most successful applications of ML, such as image classification, speech recognition, and natural language processing, have improved steadily because each new method is measured against prior works on well-established benchmarks (e.g., those available at the UCI repository of ML databases151). In contrast, the lack of a unified and well-established set of baselines or benchmarks in the existing PES literature makes it difficult to accurately compare and evaluate the performance of ML models and to reproduce reported results. This results in a fragmented approach to model evaluation, with individual works often presenting specific quality tradeoffs without a direct comparison to prior works. To improve the characterization of methodological performance and enable a fair comparison, it is essential to establish a systematic and standardized framework for defining datasets and benchmarks for each ML application area. This will provide a more robust foundation for evaluating the performance of models and facilitate meaningful comparisons. Moreover, research is needed to develop quantitative measures that are both meaningful and interpretable, or to replace existing measures with qualitative characteristics that are better suited to the intended audience.

The last research theme focuses on probabilistic ML and includes two primary strategies: (i) probabilistic ML modeling and (ii) translating predictions with uncertainty estimates into actionable decisions. In contrast to traditional ML methods that typically provide deterministic predictions, probabilistic ML models generate predictions in the form of prediction intervals or probability distributions, encompassing not only a single-point estimate but also a measure of uncertainty. To gain a better understanding of probabilistic ML, refer to Ref. 152 for a general discussion.

Regarding (i), the use of ML in distribution system applications necessitates incorporating probabilistic modeling to account for three intrinsic types of uncertainty. These uncertainties include those arising from the physics involved (see Sec. VI C 1 for examples), the data used to train ML models, and the model itself. The second type of uncertainty typically involves both aleatoric uncertainty resulting from noise in data and epistemic uncertainty resulting from limited data and knowledge, as discussed in Ref. 34.

1. Probabilistic forecasting and estimation

Modern power distribution systems must be better equipped to handle uncertainty from various sources (such as measurement errors, load fluctuations, weather-dependent renewable generation, and network parameter uncertainty) to guarantee secure operations. One of the key ways to achieve this is through reliable forecasting. Historically, energy forecasting, including demand, wind, and solar, relied on deterministic or single-valued forecasting methods.153 However, probabilistic energy forecasting, which provides a range of probable values in addition to single-point estimates, is now considered more suitable for power distribution systems and is expected to become a fundamental aspect of their effective planning and operations. Despite the growing popularity of probabilistic ML techniques, particularly in the field of probabilistic forecasting (see, e.g., Ref. 154), the majority of forecasting models in practice still rely on deterministic modeling approaches. The transition from deterministic to probabilistic forecasting in power distribution systems has yet to occur, making it crucial at this early stage to develop best-practice guidelines for the design of probabilistic forecasting models and to standardize evaluation metrics that are suitable for them. In addition to the emphasis on probabilistic forecasting, there is also a significant focus on developing probabilistic estimation methods, such as probabilistic SE using Bayesian networks,155,156 which offer many advantages over traditional deterministic SE methods.
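
As one example of an evaluation metric suited to probabilistic forecasts, the Python sketch below computes the pinball (quantile) loss, a widely used score for quantile forecasts. It is offered as an illustration of what such a metric looks like, not as the metric recommended by the works cited above; the load values and function names are hypothetical.

```python
def pinball_loss(actual, quantile_forecast, q):
    """Pinball loss of a single forecast of the q-th quantile (0 < q < 1).
    Under-forecasts are weighted by q, over-forecasts by (1 - q)."""
    diff = actual - quantile_forecast
    return q * diff if diff >= 0 else (q - 1) * diff


def mean_pinball_loss(actuals, forecasts, q):
    """Average pinball loss over a set of forecast/observation pairs."""
    return sum(pinball_loss(a, f, q) for a, f in zip(actuals, forecasts)) / len(actuals)


# Hypothetical feeder load (kW) and two competing 90th-percentile forecasts.
actual_load = [100.0, 100.0]
forecast_high = [110.0, 110.0]  # sits above the observations
forecast_low = [90.0, 90.0]     # sits below the observations

print(mean_pinball_loss(actual_load, forecast_high, q=0.9))
print(mean_pinball_loss(actual_load, forecast_low, q=0.9))
```

For a high quantile such as 0.9, a forecast that falls below the observation is penalized nine times more heavily than one that exceeds it by the same amount, which matches the intent that the 90th percentile should rarely be exceeded.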

2. Uncertainty quantification

In the face of growing uncertainty in power distribution systems, relying solely on deterministic forecasting and point estimation is not sufficient for reliable decision-making and operational risk assessment. Additionally, the predictions made by ML models are often subject to a high degree of uncertainty. To enhance the credibility and reliability of ML-based solutions, it is crucial to incorporate uncertainty quantification and provide quantifiable confidence measures for decision-makers to use. This will allow them to make more informed decisions based on a better understanding of the uncertainty involved. This issue has received much attention in recent years, with traditional analytical methods being supplemented by stochastic formulations to deal with uncertainty. The most common techniques for quantifying uncertainty in PES research are Monte Carlo simulations,157 Bayesian inference, quantile regression, and polynomial chaos expansion,158 with methods such as Bayesian deep learning,159 Monte Carlo dropout,160 and (deep) GPs142,161 gaining in popularity. Future research in this area should focus on developing statistical learning methods that provide the requisite uncertainty quantification in ML predictions and are specifically tailored to distribution system applications. This also entails characterizing and quantifying all sources of uncertainty during model development and determining the factors that contribute most to uncertainty. Providing insight into sources of uncertainty, and ideally quantifying their contributions, can significantly increase the chances of making ML systems more accessible to system operators and decision-makers.
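
Of the techniques listed above, Monte Carlo simulation is the simplest to sketch. The Python example below propagates an uncertain load through a toy voltage-drop model and reports an empirical 90% interval for the resulting voltage. The linearized single-segment model, the Gaussian load assumption, and all parameter values are hypothetical simplifications chosen for illustration; a real study would sample from measured or forecast distributions and run a full power flow.

```python
import random


def voltage_pu(load_pu, source_voltage=1.0, resistance=0.05):
    """Toy linearized voltage drop along one feeder segment (per unit)."""
    return source_voltage - resistance * load_pu


def monte_carlo_interval(n_samples=10_000, mean_load=1.0, load_std=0.2,
                         coverage=0.9, seed=0):
    """Sample the uncertain load, push each sample through the model, and
    return the empirical central interval of the output voltage."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    voltages = sorted(
        voltage_pu(rng.gauss(mean_load, load_std)) for _ in range(n_samples)
    )
    lower_index = round((1 - coverage) / 2 * n_samples)
    upper_index = round((1 + coverage) / 2 * n_samples) - 1
    return voltages[lower_index], voltages[upper_index]


lower, upper = monte_carlo_interval()
print(f"90% of simulated voltages fall between {lower:.4f} and {upper:.4f} p.u.")
```

An interval of this kind is the sort of quantifiable confidence measure that can be handed to a decision-maker in place of a single point estimate, although its quality depends entirely on how well the assumed input distribution reflects reality.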

3. Decision-making under uncertainty

In light of the prior discussion, it is evident that deterministic ML is not best suited for decision-making under uncertainty. Instead, probabilistic ML is more appropriate and can enhance the trustworthiness and robustness of ML model predictions in power distribution systems. The challenge, however, is translating the outputs of ML models (i.e., predictive uncertainty estimates) into effective decisions.162 Currently, the use of ML methods for decision-making under uncertainty is extensively researched within the PES domain. A prime example of this is cost-oriented ML, which endeavors to improve the practical value of probabilistic forecasting in power system decision-making procedures.163 Similarly, future research should prioritize evaluating the value of probabilistic ML models in the context of decision-making processes within distribution systems rather than just examining the accuracy and precision of their outputs. Another important area for future research is exploring ways to seamlessly integrate probabilistic forecasts and uncertainty estimates into existing decision-making practices. One potential approach is to utilize the upper and lower bounds of uncertainty estimates as inputs into planning frameworks, considering them as the most favorable and unfavorable scenarios.164 Nevertheless, there is still ample opportunity for improvement in incorporating the full uncertainty estimates into decision-making processes rather than just the distribution tails. Effective visualization that is specifically tailored for distribution system control centers is important in this context, and it relates to the topic of visualization discussed earlier.
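
The idea of feeding the upper and lower bounds of an uncertainty estimate into a planning framework as the most unfavorable and most favorable scenarios can be sketched as follows. The Python example below is a hypothetical min-max capacity choice, not the method of Ref. 164: the cost coefficients, the candidate capacities, and the function names are all assumptions made for illustration.

```python
def scenario_costs(capacity_kw, lower_kw, upper_kw,
                   capacity_cost=50.0, shortfall_cost=400.0):
    """Cost of one capacity choice under the favorable (lower-bound) and
    unfavorable (upper-bound) demand scenarios. Cost rates are hypothetical."""
    def cost(demand_kw):
        shortfall = max(0.0, demand_kw - capacity_kw)
        return capacity_cost * capacity_kw + shortfall_cost * shortfall

    return {"favorable": cost(lower_kw), "unfavorable": cost(upper_kw)}


def pick_capacity(candidates_kw, lower_kw, upper_kw):
    """Min-max rule: choose the candidate that is cheapest in the
    unfavorable scenario."""
    return min(
        candidates_kw,
        key=lambda c: scenario_costs(c, lower_kw, upper_kw)["unfavorable"],
    )


# Hypothetical prediction interval for peak demand on a feeder (kW).
lower_bound, upper_bound = 700.0, 1000.0
candidates = [800.0, 1000.0, 1200.0]

for c in candidates:
    print(c, scenario_costs(c, lower_bound, upper_bound))
print("chosen capacity:", pick_capacity(candidates, lower_bound, upper_bound))
```

Because only the two bounds enter the decision, everything the forecast says about how likely each outcome is goes unused, which is the limitation the text points to when it calls for incorporating the full uncertainty estimate.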

Emerging from the previous discussion are several key questions:

  • How can ML models' interpretability, robustness, and performance guarantees be enhanced so that they can be utilized confidently in complex distribution systems?

  • How can the results of ML models be presented effectively so that system operators and decision-makers understand them better?

  • What steps need to be taken to make the performance evaluation of complex ML models more accessible to system operators?

  • What measures should be taken to validate ML findings and translate them into actionable insight for decision-makers?

  • What methods should be employed to combine actual distribution system operational data with simulated data to enhance sparse datasets while ensuring data quality standards are met?

  • How can the uncertainty in ML model predictions be converted into quantifiable confidence measures for system operators and decision-makers?

The answers to these questions will vary depending on the application and require careful consideration to guide future research. Nonetheless, concerning future published works, four key pieces of information are indispensable for developing robust and reliable ML models for distribution system applications:

  • A comprehensive list of assumptions, parameters, and algorithmic choices used;

  • A complete description of the raw input data and data processing steps utilized;

  • A complete description of the model implementation to enable independent replication;

  • Verification and validation of the model implementation to ensure accuracy and reliability.

In addition to the information above, careful consideration should be given to standardizing model performance evaluation across benchmark datasets and problems and developing appropriate benchmarks in novel application areas if they do not already exist, thereby encouraging model-consistent testing. Additionally, a concerted effort by researchers will be needed to devise meaningful metrics that test the accuracy of newly proposed ML-based solutions, especially if the assessment of model fidelity at rare events (i.e., tails of distributions) is also factored into such evaluations. Last, but not least, by including the code used to generate results in conjunction with papers, researchers can make the methods more accessible (and comparable) to the community to encourage further improvements.

In this paper, we summarize the current state of ML for power distribution systems in a form that is accessible to readers from different engineering backgrounds in order to encourage interdisciplinary research on this topic. While a vast body of literature has focused on reviewing novel ML techniques and their applications in PES, our attention is directed toward power distribution systems. These systems are at the forefront of the ongoing transformation of the electric power sector and pose distinctive data-related challenges for ML applications. In addition to providing background material on power systems, with a specific emphasis on distribution systems, and on relevant ML theory, we also explore the current capabilities and limitations of ML for power distribution system problems. We highlight various ML applications, ranging from distribution system operations and control to planning. Finally, we discuss potential research directions that can facilitate the adoption of ML beyond the research community and academic circles and promote its implementation in real-world applications. One of the key takeaways from our discussion is that the new data-intensive methods widely explored in PES research are not yet practical for power distribution systems because of the limited availability of data in distribution system planning and, in particular, operations. Consequently, until data availability increases and the data-related issues discussed in this paper are addressed, it is advisable to redirect research efforts toward ML approaches that require less data or enable trustworthy augmentation of existing limited data. In this regard, physics-based ML appears to offer viable and promising research directions to pursue.

This work was authored in part by the National Renewable Energy Laboratory (NREL), operated by the Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes. This work was also partially funded by the Climate Change AI Innovation Grants program, hosted by Climate Change AI with the support of the Quadrature Climate Foundation, Schmidt Futures, and the Canada Hub of Future Earth.

The authors have no conflicts to disclose.

Marija Markovic: Conceptualization (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Matthew Bossart: Conceptualization (supporting); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal). Bri-Mathias Hodge: Supervision (lead); Writing – original draft (supporting); Writing – review & editing (equal).

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. J. Rockström, O. Gaffney, J. Rogelj, M. Meinshausen, N. Nakicenovic, and H. J. Schellnhuber, “A roadmap for rapid decarbonization,” Science 355, 1269–1271 (2017).
2. J. H. Williams, A. DeBenedictis, R. Ghanadan, A. Mahone, J. Moore, W. R. Morrow III, S. Price, and M. S. Torn, “The technology path to deep greenhouse gas emissions cuts by 2050: The pivotal role of electricity,” Science 335, 53–59 (2012).
3. E. M. Bibra, E. Connelly, M. Gorner, C. Lowans, L. Paoli, J. Tattini, and J. Teter, Global EV Outlook 2021: Accelerating Ambitions Despite the Pandemic [International Energy Agency (IEA), 2021].
4. M. Jafari, A. Kavousi-Fard, M. Dabbaghjamanesh, and M. Karimi, “A survey on deep learning role in distribution automation system: A new collaborative learning-to-learning (L2L) concept,” IEEE Access 10, 81220 (2022).
5. J. Wang, P. Pinson, S. Chatzivasileiadis, M. Panteli, G. Strbac, and V. Terzija, “On machine learning-based techniques for future sustainable and resilient energy systems,” IEEE Trans. Sustainable Energy 14, 1230 (2022).
6. X. Chen, G. Qu, Y. Tang, S. Low, and N. Li, “Reinforcement learning for selective key applications in power systems: Recent advances and future challenges,” IEEE Trans. Smart Grid 13, 2935 (2022).
7. Y. Zhang, X. Shi, H. Zhang, Y. Cao, and V. Terzija, “Review on deep learning applications in frequency analysis and control of modern power system,” Int. J. Electr. Power Energy Syst. 136, 107744 (2022).
8. P. L. Donti and J. Z. Kolter, “Machine learning for sustainable energy systems,” Annu. Rev. Environ. Resour. 46, 719–747 (2021).
9. S. Aslam, H. Herodotou, S. M. Mohsin, N. Javaid, N. Ashraf, and S. Aslam, “A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids,” Renewable Sustainable Energy Rev. 144, 110992 (2021).
10. S. Stock, D. Babazadeh, and C. Becker, “Applications of artificial intelligence in distribution power system operation,” IEEE Access 9, 150098–150119 (2021).
11. F. Aminifar, S. Teimourzadeh, A. Shahsavari, M. Savaghebi, and M. S. Golsorkhi, “Machine learning for protection of distribution networks and power electronics-interfaced systems,” Electr. J. 34, 106886 (2021).
12. M. Farhoumandi, Q. Zhou, and M. Shahidehpour, “A review of machine learning applications in IoT-integrated modern power systems,” Electr. J. 34, 106879 (2021).
13. T. Wu and J. Wang, “Artificial intelligence for operation and control: The case of microgrids,” Electr. J. 34, 106890 (2021).
14. Y. Yang and L. Wu, “Machine learning approaches to the unit commitment problem: Current trends, emerging challenges, and new strategies,” Electr. J. 34, 106889 (2021).
15. D. Cao, W. Hu, J. Zhao, G. Zhang, B. Zhang, Z. Liu, Z. Chen, and F. Blaabjerg, “Reinforcement learning and its applications in modern power and energy systems: A review,” J. Mod. Power Syst. Clean Energy 8, 1029–1042 (2020).
16. O. A. Alimi, K. Ouahada, and A. M. Abu-Mahfouz, “A review of machine learning approaches to power system security and stability,” IEEE Access 8, 113512–113531 (2020).
17. M. S. Ibrahim, W. Dong, and Q. Yang, “Machine learning driven smart electric power systems: Current trends and new perspectives,” Appl. Energy 272, 115237 (2020).
18. L. Duchesne, E. Karangelos, and L. Wehenkel, “Recent developments in machine learning for energy systems reliability management,” Proc. IEEE 108, 1656–1676 (2020).
19. A. K. Ozcanli, F. Yaprakdal, and M. Baysal, “Deep learning methods and applications for electrical power systems: A comprehensive review,” Int. J. Energy Res. 44, 7136–7157 (2020).
20. Z. Zhang, D. Zhang, and R. C. Qiu, “Deep reinforcement learning for power system applications: An overview,” CSEE J. Power Energy Syst. 6, 213–225 (2019).
21. L. Cheng and T. Yu, “A new generation of AI: A review and perspective on machine learning technologies applied to smart energy and electric power systems,” Int. J. Energy Res. 43, 1928–1973 (2019).
22. D. Zhang, X. Han, and C. Deng, “Review on the research and practice of deep learning and reinforcement learning in smart grids,” CSEE J. Power Energy Syst. 4, 362–370 (2018).
23. L. Bird, M. Milligan, and D. Lew, “Integrating variable renewable energy: Challenges and solutions,” Report No. NREL/TP-6A20-60451 [National Renewable Energy Lab. (NREL), Golden, CO, 2013].
24. M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” J. Comput. Phys. 378, 686–707 (2019).
25. B. Huang and J. Wang, “Applications of physics-informed neural networks in power systems - A review,” IEEE Trans. Power Syst. 38, 572–588 (2022).
26. M. I. Jordan and T. M. Mitchell, “Machine learning: Trends, perspectives, and prospects,” Science 349, 255–260 (2015).
27. W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing 234, 11–26 (2017).
28. J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” AI Open 1, 57–81 (2020).
29. W. Liao, B. Bak-Jensen, J. R. Pillai, Y. Wang, and Y. Wang, “A review of graph neural networks and their applications in power systems,” J. Mod. Power Syst. Clean Energy 10, 345–360 (2021).
30. J. Machowski, Z. Lubosny, J. Bialek, and J. Bumby, Power System Dynamics: Stability and Control (Wiley, 2020).
31. M. Farivar and S. H. Low, “Branch flow model: Relaxations and convexification - Part I,” IEEE Trans. Power Syst. 28, 2554–2564 (2013).
32. A. Bergen and V. Vittal, Power Systems Analysis (Prentice Hall, 2000).
33. J. Glover, M. Sarma, and T. Overbye, Power System Analysis and Design (Cengage Learning, 2011).
34. G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, and L. Yang, “Physics-informed machine learning,” Nat. Rev. Phys. 3, 422–440 (2021).
35. K. Lehmann, A. Grastien, and P. Van Hentenryck, “AC-feasibility on tree networks is NP-hard,” IEEE Trans. Power Syst. 31, 798–801 (2015).
36. P. Van Hentenryck, “Machine learning for optimal power flows,” Tutorials Oper. Res. 62–82 (2021).
37. D. K. Molzahn and I. A. Hiskens et al., “A survey of relaxations and approximations of the power flow equations,” Found. Trends Electr. Energy Syst. 4, 1–221 (2019).
38. F. Hasan, A. Kargarian, and A. Mohammadi, “A survey on applications of machine learning for optimal power flow,” in IEEE Texas Power and Energy Conference (TPEC) (IEEE, 2020).
39. R. Nellikkath and S. Chatzivasileiadis, “Physics-informed neural networks for AC optimal power flow,” Electr. Power Syst. Res. 212, 108412 (2022).
40. S. Chatzivasileiadis, A. Venzke, J. Stiasny, and G. Misyris, “Machine learning in power systems: Is it time to trust it?,” IEEE Power Energy Mag. 20, 32–41 (2022).
41. A. Primadianto and C.-N. Lu, “A review on distribution system state estimation,” IEEE Trans. Power Syst. 32, 3875–3883 (2016).
42. K. Dehghanpour, Z. Wang, J. Wang, Y. Yuan, and F. Bu, “A survey on state estimation techniques and challenges in smart distribution systems,” IEEE Trans. Smart Grid 10, 2312–2322 (2018).
43. A. Abur and A. Expósito, Power System State Estimation: Theory and Implementation, Power Engineering (Willis) (CRC Press, 2004).
44. K. R. Mestav, J. Luengo-Rozas, and L. Tong, “Bayesian state estimation for unobservable distribution systems via deep learning,” IEEE Trans. Power Syst. 34, 4910–4920 (2019).
45. A. S. Zamzam and N. D. Sidiropoulos, “Physics-aware neural networks for distribution system state estimation,” IEEE Trans. Power Syst. 35, 4347–4356 (2020).
46. M. Cramer, P. Goergens, and A. Schnettler, “Bad data detection and handling in distribution grid state estimation using artificial neural networks,” in IEEE Eindhoven PowerTech (IEEE, 2015).
47. D. Gotti, H. Amaris, and P. L. Larrea, “A deep neural network approach for online topology identification in state estimation,” IEEE Trans. Power Syst. 36, 5824–5833 (2021).
48. E. Manitsas, R. Singh, B. C. Pal, and G. Strbac, “Distribution system state estimation using an artificial neural network approach for pseudo measurement modeling,” IEEE Trans. Power Syst. 27, 1888–1896 (2012).
49. K. Dehghanpour, Y. Yuan, Z. Wang, and F. Bu, “A game-theoretic data-driven approach for pseudo-measurement generation in distribution system state estimation,” IEEE Trans. Smart Grid 10, 5942–5951 (2019).
50. Z. Cao, Y. Wang, C.-C. Chu, and R. Gadh, “Robust pseudo-measurement modeling for three-phase distribution systems state estimation,” Electr. Power Syst. Res. 180, 106138 (2020).
51. W. Wang, N. Yu, F. Rahmatian, and S. Pandey, “Where to install distribution phasor measurement units to obtain accurate state estimation results?,” in IEEE Power & Energy Society General Meeting (PESGM) (IEEE, 2022).
52. D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich, Recommender Systems: An Introduction (Cambridge University Press, 2010).
53. P. L. Donti, Y. Liu, A. J. Schmitt, A. Bernstein, R. Yang, and Y. Zhang, “Matrix completion for low-observability voltage estimation,” IEEE Trans. Smart Grid 11, 2520–2530 (2019).
54. M. Marković, A. Florita, and B.-M. Hodge, “Matrix completion for improved observability in low-voltage distribution grids,” in IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) (IEEE, 2021).
55. R. Madbhavi, H. S. Karimi, B. Natarajan, and B. Srinivasan, “Tensor completion based state estimation in distribution systems,” in IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT) (IEEE, 2020).
56. R. Madbhavi, B. Natarajan, and B. Srinivasan, “Enhanced tensor completion based approaches for state estimation in distribution systems,” IEEE Trans. Ind. Inf. 17, 5938–5947 (2020).
57. Y. Liu, A. S. Zamzam, and A. Bernstein, “Multiarea distribution system state estimation via distributed tensor completion,” IEEE Trans. Smart Grid 13, 4887–4898 (2022).
58. F. Li and Y. Du, “From AlphaGo to power system AI: What engineers can learn from solving the most complex board game,” IEEE Power Energy Mag. 16, 76–84 (2018).
59. L. Song, Y. Li, and N. Lu, “ProfileSR-GAN: A GAN based super-resolution method for generating high-resolution load profiles,” IEEE Trans. Smart Grid 13, 3278 (2022).
60. M. Razanousky and K. Morrissey, “Fundamental research challenges for distribution state estimation to enable high-performing grids,” NYSERDA Report No. 18-37 [New York State Energy Research and Development Authority (NYSERDA), New York, 2018].
61. E. Bisong, “Optimization for machine learning: Gradient descent,” in Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners (Springer, 2019), pp. 203–207.
62. Q. Wang, Y. Ma, K. Zhao, and Y. Tian, “A comprehensive survey of loss functions in machine learning,” Ann. Data Sci. 9, 187–212 (2020).
63. B. Efron and G. Gong, “A leisurely look at the bootstrap, the jackknife, and cross-validation,” Am. Stat. 37, 36–48 (1983).
64. F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu, H. Zhu, H. Xiong, and Q. He, “A comprehensive survey on transfer learning,” Proc. IEEE 109, 43–76 (2020).
65. F. Emmert-Streib and M. Dehmer, “Taxonomy of machine learning paradigms: A data-centric perspective,” Wiley Interdiscip. Rev. 12, e1470 (2022).
66. S. Lange, T. Gabel, and M. Riedmiller, “Batch reinforcement learning,” in Reinforcement Learning (Springer, 2012), pp. 45–73.
67. R. Agarwal, D. Schuurmans, and M. Norouzi, “An optimistic perspective on offline reinforcement learning,” in International Conference on Machine Learning (PMLR, 2020), pp. 104–114.
68. T. Jebara, Machine Learning: Discriminative and Generative (Springer Science & Business Media, 2012), Vol. 755.
69. G. BakIr, T. Hofmann, A. J. Smola, B. Schölkopf, and B. Taskar, Predicting Structured Data (MIT Press, 2007).
70. M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,” Neural Networks 6, 861–867 (1993).
71. J.-F. Toubeau, T. Morstyn, J. Bottieau, K. Zheng, D. Apostolopoulou, Z. De Grève, Y. Wang, and F. Vallée, “Capturing spatio-temporal dependencies in the probabilistic forecasting of distribution locational marginal prices,” IEEE Trans. Smart Grid 12, 2663–2674 (2020).
72. T. Zufferey, S. Renggli, and G. Hug, “Probabilistic state forecasting and optimal voltage control in distribution grids under uncertainty,” Electr. Power Syst. Res. 188, 106562 (2020).
73. D. Mukherjee, S. Chakraborty, and S. Ghosh, “Power system state forecasting using machine learning techniques,” Electr. Eng. 104, 283–305 (2022).
74. Q. T. Tran, K. Davies, L. Roose, P. Wiriyakitikun, J. Janjampop, E. Riva Sanseverino, and G. Zizzo, “A review of health assessment techniques for distribution transformers in smart distribution grids,” Appl. Sci. 10, 8115 (2020).
75. M. A. Mahmoud, N. R. Md Nasir, M. Gurunathan, P. Raj, and S. A. Mostafa, “The current state of the art in research on predictive maintenance in smart grid distribution network: Fault's types, causes, and prediction methods–a systematic review,” Energies 14, 5078 (2021).
76. A. E. Ezugwu, A. M. Ikotun, O. O. Oyelade, L. Abualigah, J. O. Agushaka, C. I. Eke, and A. A. Akinyelu, “A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects,” Eng. Appl. Artif. Intell. 110, 104743 (2022).
77. S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemom. Intell. Lab. Syst. 2, 37–52 (1987).
78. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science 313, 504–507 (2006).
79. F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in Eighth IEEE International Conference on Data Mining (IEEE, 2008), pp. 413–422.
80. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” Commun. ACM 63, 139–144 (2020).
81. Y. Chen, Y. Wang, D. Kirschen, and B. Zhang, “Model-free renewable scenario generation using generative adversarial networks,” IEEE Trans. Power Syst. 33, 3265–3275 (2018).
82. J. Li, J. Zhou, and B. Chen, “Review of wind power scenario generation methods for optimal operation of renewable energy systems,” Appl. Energy 280, 115992 (2020).
83. F. Olivier, A. Sutera, P. Geurts, R. Fonteneau, and D. Ernst, “Phase identification of smart meters by clustering voltage measurements,” in Power Systems Computation Conference (PSCC) (IEEE, 2018).
84. L. Blakely, M. J. Reno, and W.-C. Feng, “Spectral clustering for customer phase identification using AMI voltage timeseries,” in IEEE Power and Energy Conference at Illinois (PECI) (IEEE, 2019).
85. Z. S. Hosseini, A. Khodaei, and A. Paaso, “Machine learning-enabled distribution network phase identification,” IEEE Trans. Power Syst. 36, 842–850 (2020).
86. N. Zaragoza and V. Rao, “Phase identification of power distribution systems using hierarchical clustering methods,” in North American Power Symposium (NAPS) (IEEE, 2021).
87. H. P. Lee, M. Zhang, M. Baran, N. Lu, P. Rehm, E. Miller, and M. Makdad, “A novel data segmentation method for data-driven phase identification,” in IEEE Power & Energy Society General Meeting (PESGM) (IEEE, 2022).
88. F. Therrien, L. Blakely, and M. J. Reno, “Assessment of measurement-based phase identification methods,” IEEE Open Access J. Power Energy 8, 128–137 (2021).
89. Y. Wang, M. Jia, N. Gao, L. Von Krannichfeldt, M. Sun, and G. Hug, “Federated clustering for electricity consumption pattern extraction,” IEEE Trans. Smart Grid 13, 2425–2439 (2022).
90. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, 2018).
91. D. Qiu, Y. Wang, W. Hua, and G. Strbac, “Reinforcement learning for electric vehicle applications in power systems: A critical review,” Renewable Sustainable Energy Rev. 173, 113052 (2023).
92. ANSI, “For electric power systems and equipment-voltage ratings (60 Hz),” Standard No. C84.1.1-2006 (ANSI, 2006).
93. Q. Yang, G. Wang, A. Sadeghi, G. B. Giannakis, and J. Sun, “Two-timescale voltage control in distribution grids using deep reinforcement learning,” IEEE Trans. Smart Grid 11, 2313–2323 (2019).
94. S. Wang, J. Duan, D. Shi, C. Xu, H. Li, R. Diao, and Z. Wang, “A data-driven multi-agent autonomous voltage control framework using deep reinforcement learning,” IEEE Trans. Power Syst. 35, 4644–4654 (2020).
95. H. Liu, W. Wu, and Y. Wang, “Bi-level off-policy reinforcement learning for two-timescale volt/VAR control in active distribution networks,” IEEE Trans. Power Syst. 38, 385–395 (2022).
96. H. Li and H. He, “Learning to operate distribution networks with safe deep reinforcement learning,” IEEE Trans. Smart Grid 13, 1860–1872 (2022).
97. O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning, Adaptive Computation and Machine Learning Series (MIT Press, 2010).
98. I. Triguero, S. García, and F. Herrera, “Self-labeled techniques for semi-supervised learning: Taxonomy, software and empirical study,” Knowl. Inf. Syst. 42, 245–284 (2015).
99. L. Didaci, G. Fumera, and F. Roli, “Analysis of co-training algorithm with very small training sets,” in Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshop (SSPR&SPR 2012), Hiroshima, Japan, 7–9 November 2012 (Springer, 2012), pp. 719–726.
100. T. Joachims, “Transductive inference for text classification using support vector machines,” in ICML (Morgan Kaufmann Publishers Inc., 1999), Vol. 99, pp. 200–209.
101. D.-H. Lee, “Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks,” in Workshop on Challenges in Representation Learning, ICML (ICML, 2013), Vol. 3, p. 896.
102. D. Berthelot, N. Carlini, I. Goodfellow, N. Papernot, A. Oliver, and C. A. Raffel, “MixMatch: A holistic approach to semi-supervised learning,” in Advances in Neural Information Processing Systems 32 (NeurIPS, 2019).
103. T. S. Abdelgayed, W. G. Morsi, and T. S. Sidhu, “Fault detection and classification based on co-training of semisupervised machine learning,” IEEE Trans. Ind. Electron. 65, 1595–1605 (2017).
104. F. Yang, Z. Ling, Y. Zhang, X. He, Q. Ai, and R. C. Qiu, “Event detection, localization, and classification based on semi-supervised learning in power grids,” IEEE Trans. Power Syst. (published online 2022).
105. Y. Zhang, J. Wang, and B. Chen, “Detecting false data injection attacks in smart grids: A semi-supervised deep learning approach,” IEEE Trans. Smart Grid 12, 623–634 (2020).
106. Z. Aslam, F. Ahmed, A. Almogren, M. Shafiq, M. Zuair, and N. Javaid, “An attention guided semi-supervised learning mechanism to detect electricity frauds in the distribution systems,” IEEE Access 8, 221767–221782 (2020).
107. J. M. Gillis and W. G. Morsi, “Non-intrusive load monitoring using semi-supervised machine learning and wavelet design,” IEEE Trans. Smart Grid 8, 2648–2655 (2016).
108. D. Li and S. Dick, “Residential household non-intrusive load monitoring via graph-based multi-label semi-supervised learning,” IEEE Trans. Smart Grid 10, 4615–4627 (2018).
109. P. Rodríguez-Pajarón, A. H. Bayo, and J. V. Milanović, “Forecasting voltage harmonic distortion in residential distribution networks using smart meter data,” Int. J. Electr. Power Energy Syst. 136, 107653 (2022).
110. B. Foggo and N. Yu, “A comprehensive evaluation of supervised machine learning for the phase identification problem,” Int. J. Comput. Syst. Eng. 11, 419–427 (2018).
111. F. Wang, X. Lu, X. Chang, X. Cao, S. Yan, K. Li, N. Duić, M. Shafie-Khah, and J. P. Catalão, “Household profile identification for behavioral demand response: A semi-supervised learning approach using smart meter data,” Energy 238, 121728 (2022).
112. M. Han, J. Zhao, X. Zhang, J. Shen, and Y. Li, “The reinforcement learning method for occupant behavior in building control: A review,” Energy Built. Environ. 2, 137–148 (2021).
113. H. M. Abdullah, A. Gastli, and L. Ben-Brahim, “Reinforcement learning based EV charging management systems—A review,” IEEE Access 9, 41506–41531 (2021).
114. N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan, “A survey on bias and fairness in machine learning,” ACM Comput. Surv. 54, 1–35 (2021).
115. P. L. Watson, A. Spaulding, M. Koukoula, and E. Anagnostou, “Improved quantitative prediction of power outages caused by extreme weather events,” Weather Clim. Extremes 37, 100487 (2022).
116. A. Kody, A. West, and D. K. Molzahn, “Sharing the load: Considering fairness in de-energization scheduling to mitigate wildfire ignition risk using rolling optimization,” in IEEE 61st Conference on Decision and Control (CDC) (IEEE, 2022), pp. 5705–5712.
117. N. Rhodes, L. Ntaimo, and L. Roald, “Balancing wildfire risk and power outages through optimized power shut-offs,” IEEE Trans. Power Syst. 36, 3118–3128 (2021).
118. H. Sun, Q. Guo, J. Qi, V. Ajjarapu, R. Bravo, J. Chow, Z. Li, R. Moghe, E. Nasr-Azadani, U. Tamrakar et al., “Review of challenges and research opportunities for voltage control in smart grids,” IEEE Trans. Power Syst. 34, 2790–2801 (2019).
119. W. Cui, J. Li, and B. Zhang, “Decentralized safe reinforcement learning for inverter-based voltage control,” Electr. Power Syst. Res. 211, 108609 (2022).
120. P. Donti, M. Roderick, M. Fazlyab, and J. Z. Kolter, “Enforcing robust control guarantees within neural network policies,” in International Conference on Learning Representations, 2021.
121. Y. Shi, G. Qu, S. Low, A. Anandkumar, and A. Wierman, “Stability constrained reinforcement learning for real-time voltage control,” in American Control Conference (ACC) (IEEE, 2022), pp. 2715–2721.
122. R. Jain, D. L. Lubkeman, and S. M. Lukic, “Dynamic adaptive protection for distribution systems in grid-connected and islanded modes,” IEEE Trans. Power Delivery 34, 281–289 (2018).
123. A. Sajadi, J. Rañola, R. Kenyon, B. Hodge, and B. Mather, “Electric power industry challenges due to increasing shares of inverter-based resources in power systems,” in IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe) (IEEE, 2022).
124. D. Wu, X. Zheng, D. Kalathil, and L. Xie, “Nested reinforcement learning based control for protective relays in power distribution systems,” in IEEE 58th Conference on Decision and Control (CDC) (IEEE, 2019), pp. 1925–1930.
125. C. B. Jones, A. Summers, and M. J. Reno, “Machine learning embedded in distribution network relays to classify and locate faults,” in IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT) (IEEE, 2021).
126. P. Zeng, S. Cui, C. Song, Z. Wang, and G. Li, “A multiagent deep deterministic policy gradient-based distributed protection method for distribution network,” Neural Comput. Appl. 35, 2267–2278 (2022).
127. C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A survey of Monte Carlo tree search methods,” IEEE Trans. Comput. Intell. AI Games 4, 1–43 (2012).
128. G. S. Misyris, A. Venzke, and S. Chatzivasileiadis, “Physics-informed neural networks for power systems,” in IEEE Power & Energy Society General Meeting (PESGM) (IEEE, 2020).
129. B. C. Erdener, C. Feng, K. Doubleday, A. Florita, and B.-M. Hodge, “A review of behind-the-meter solar forecasting,” Renewable Sustainable Energy Rev. 160, 112224 (2022).
130. J. de Hoog, S. Maetschke, P. Ilfrich, and R. R. Kolluri, “Using satellite and aerial imagery for identification of solar PV: State of the art and research opportunities,” in Proceedings of the Eleventh ACM International Conference on Future Energy Systems (ACM, 2020), pp. 308–313.
131. X. Zhang and S. Grijalva, “A data-driven approach for detection and estimation of residential PV installations,” IEEE Trans. Smart Grid 7, 2477–2485 (2016).
132. Y. Wang, Q. Chen, T. Hong, and C. Kang, “Review of smart meter data analytics: Applications, methodologies, and challenges,” IEEE Trans. Smart Grid 10, 3125–3148 (2018).
133. M. Marković, A. Sajadi, A. Florita, R. Cruickshank III, and B.-M. Hodge, “Voltage estimation in low-voltage distribution grids with distributed energy resources,” IEEE Trans. Sustainable Energy 12, 1640–1650 (2021).
134. P. Gao, M. Wang, S. G. Ghiocel, J. H. Chow, B. Fardanesh, and G. Stefopoulos, “Missing data recovery by exploiting low-dimensionality in power system synchrophasor measurements,” IEEE Trans. Power Syst. 31, 1006–1013 (2015).
135. C. Genes, I. Esnaola, S. M. Perlaza, L. F. Ochoa, and D. Coca, “Robust recovery of missing data in electricity distribution systems,” IEEE Trans. Smart Grid 10, 4057–4067 (2018).
136. D. Osipov and J. H. Chow, “PMU missing data recovery using tensor decomposition,” IEEE Trans. Power Syst. 35, 4554–4563 (2020).
137. Y. Yuan, K. Dehghanpour, F. Bu, and Z. Wang, “Outage detection in partially observable distribution systems using smart meters and generative adversarial networks,” IEEE Trans. Smart Grid 11, 5418–5430 (2020).
138. H. Wu, X. Meng, M. M. Danziger, S. P. Cornelius, H. Tian, and A.-L. Barabási, “Fragmentation of outage clusters during the recovery of power distribution grids,” Nat. Commun. 13, 7372 (2022).
139. K. Chen, J. Hu, Y. Zhang, Z. Yu, and J. He, “Fault location in power distribution systems via deep graph convolutional networks,” IEEE J. Sel. Areas Commun. 38, 119–131 (2019).
140. A. Zidan and E. F. El-Saadany, “A cooperative multiagent framework for self-healing mechanisms in distribution systems,” IEEE Trans. Smart Grid 3, 1525–1539 (2012).
141. P. A. Parra, D. Celeita, G. Ramos, W. Martínez, and G. Chaffey, “Reinforcement learning for service restoration algorithms in distribution networks,” in IEEE Industry Applications Society Annual Meeting (IAS) (IEEE, 2022).
142. C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, Adaptive Computation and Machine Learning (MIT Press, 2006).
143. H. Liu, Y.-S. Ong, X. Shen, and J. Cai, “When Gaussian process meets big data: A review of scalable GPs,” IEEE Trans. Neural Networks Learn. Syst. 31, 4405–4423 (2020).
144. M. R. Asghar, G. Dán, D. Miorandi, and I. Chlamtac, “Smart meter data privacy: A survey,” IEEE Commun. Surv. Tutorials 19, 2820–2835 (2017).
145. K. Kashinath, M. Mustafa, A. Albert, J. Wu, C. Jiang, S. Esmaeilzadeh, K. Azizzadenesheli, R. Wang, A. Chattopadhyay, A. Singh et al., “Physics-informed machine learning: Case studies for weather and climate modelling,” Philos. Trans. R. Soc. A 379, 20200093 (2021).
146. W. Samek, G. Montavon, S. Lapuschkin, C. J. Anders, and K.-R. Müller, “Explaining deep neural networks and beyond: A review of methods and applications,” Proc. IEEE 109, 247–278 (2021).
147. W. Samek, G. Montavon, A. Vedaldi, L. Hansen, and K. Müller, Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Lecture Notes in Computer Science (Springer International Publishing, 2019).
148. C. Molnar, Interpretable Machine Learning (Leanpub, 2020).
149. A. Venzke, G. Qu, S. Low, and S. Chatzivasileiadis, “Learning optimal power flow: Worst-case guarantees for neural networks,” in IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm) (IEEE, 2020).
150. C. Ren, X. Du, Y. Xu, Q. Song, Y. Liu, and R. Tan, “Vulnerability analysis, robustness verification, and mitigation strategy for machine learning-based power system stability assessment model under adversarial examples,” IEEE Trans. Smart Grid 13, 1622–1632 (2021).
151. C. Blake, see http://www.ics.uci.edu/mlearn/MLRepository.html for “UCI repository of machine learning databases” (1998).
152. Z. Ghahramani, “Probabilistic machine learning and artificial intelligence,” Nature 521, 452–459 (2015).
153. T. Hong, P. Pinson, Y. Wang, R. Weron, D. Yang, and H. Zareipour, “Energy forecasting: A review and outlook,” IEEE Open Access J. Power Energy 7, 376–388 (2020).