Authors:
(1) Mengshuo Jia, Department of Information Technology and Electrical Engineering, ETH Zürich, Physikstrasse 3, 8092, Zürich, Switzerland;
(2) Gabriela Hug, Department of Information Technology and Electrical Engineering, ETH Zürich, Physikstrasse 3, 8092, Zürich, Switzerland;
(3) Ning Zhang, Department of Electrical Engineering, Tsinghua University, Shuangqing Rd 30, 100084, Beijing, China;
(4) Zhaojian Wang, Department of Automation, Shanghai Jiao Tong University, Dongchuan Rd 800, 200240, Shanghai, China;
(5) Yi Wang, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong, China;
(6) Chongqing Kang, Department of Electrical Engineering, Tsinghua University, Shuangqing Rd 30, 100084, Beijing, China.
2. Evaluated Methods
3. Review of Existing Experiments
4. Generalizability and Applicability Evaluations and 4.1. Predictor and Response Generalizability
4.2. Applicability to Cases with Multicollinearity and 4.3. Zero Predictor Applicability
4.4. Constant Predictor Applicability and 4.5. Normalization Applicability
5. Numerical Evaluations and 5.1. Experiment Settings
The discussions and simulations presented in this paper, alongside [6], point to a range of open problems and potential areas for future research. Note that [6] has already examined the limitations of specific methods, revealing potential open problems from the perspective of individual methods. Hence, this section adopts a broader and complementary perspective: it not only summarizes the overarching issues and directions for future research from a general standpoint, but also highlights the inconsistency between expected capabilities and actual simulation outcomes for several specific approaches.
Data Pollution
How to handle data noise and outliers is still an open question. Current approaches lack reliable methods for effectively dealing with outliers. As discussed above, while SVR-related and Huber-loss-related methods easily detected the outliers in the voltage data, they did not perform well in identifying outliers within the branch flow data. Similarly, existing methods to mitigate noise, such as LS_TOL, do not yield satisfactory results. A more robust solution is needed, one that can filter noise and outliers without the impracticalities of traditional methods like Kalman filtering, which demands assumptions about distribution and fluctuation that are not readily applicable in the dynamic environment of power systems. The challenge lies in developing an approach that not only detects outliers and noise in branch flow data more accurately but is also adaptable and practical for real-world application.
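To make the Huber-loss idea concrete, the following is a minimal sketch of a Huber-loss linear regression solved by iteratively reweighted least squares. It is not the implementation of any method evaluated in this paper: the function name `huber_irls`, the threshold `delta`, the flagging rule, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def huber_irls(X, y, delta, n_iter=50):
    """Fit y ~ X @ beta under a Huber loss via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least-squares start
    weights = np.ones(len(y))
    for _ in range(n_iter):
        abs_res = np.maximum(np.abs(y - X @ beta), 1e-12)  # avoid division by zero
        # Huber weights: 1 in the quadratic zone, delta / |r| in the linear zone.
        weights = np.where(abs_res <= delta, 1.0, delta / abs_res)
        sw = np.sqrt(weights)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta, weights

# Synthetic data (an assumption, not a power system measurement set).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])          # intercept + one predictor
true_beta = np.array([2.0, 3.0])
y = X @ true_beta + rng.normal(0.0, 0.01, n)  # small measurement noise
outlier_idx = rng.choice(n, size=10, replace=False)
y[outlier_idx] += 5.0                         # inject gross errors

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_huber, weights = huber_irls(X, y, delta=0.1)
# Flag samples whose residual exceeded 2 * delta in the final reweighting step.
flagged = np.flatnonzero(weights < 0.5)
```

In this toy setting the outliers are large and isolated, so down-weighting works well. The sketch says nothing about the branch flow case discussed above, where the simulations showed that such methods struggle.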
Suboptimality
All DPFL methods may exhibit suboptimal performance for several reasons. First, the inherent hyperparameters in many DPFL techniques pose a challenge for parameter optimization. Cross-validation is commonly employed for tuning hyperparameters, but this approach is time-intensive and potentially suboptimal because the search is limited to predefined hyperparameter ranges. Second, DPFL methods exhibit evident modular characteristics, allowing for the integration of different modules to address different issues, such as multicollinearity and inherent nonlinearity. In methods without conventional hyperparameters, the selection or combination of modules can also be seen as a hyperparameter in a general sense, which is more complex to tune and prone to suboptimality. Thus, theoretically defining optimality in DPFL method design and identifying optimal hyperparameters (both conventional and modular) without significantly sacrificing computational efficiency remain critical yet unresolved areas. Note, again, that the PLS_CLS method, designed simply to highlight DPFL’s modular nature, already demonstrates considerable potential. The performance of a DPFL method thoughtfully designed to leverage these modular advantages is likely to be substantial.
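The limitation of a predefined hyperparameter range can be illustrated with a small k-fold cross-validation sketch for ridge regression. Everything here is assumed for illustration (the helper names `ridge_fit`, `kfold_cv_error`, and `grid_search`, the grid values, and the synthetic data); it is not the tuning procedure of any specific DPFL method.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: (X^T X + lam * I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def kfold_cv_error(X, y, lam, k=5):
    """Mean validation MSE over k contiguous folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errors = []
    for val_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        beta = ridge_fit(X[train_mask], y[train_mask], lam)
        errors.append(np.mean((y[val_idx] - X[val_idx] @ beta) ** 2))
    return float(np.mean(errors))

def grid_search(X, y, grid, k=5):
    """Return the grid value with the lowest cross-validation error."""
    scores = [kfold_cv_error(X, y, lam, k) for lam in grid]
    return grid[int(np.argmin(scores))], scores

# Synthetic data with two nearly collinear columns (an assumption).
rng = np.random.default_rng(1)
n = 60
base = rng.normal(size=(n, 3))
X = np.column_stack([base,
                     base[:, 0] + 1e-3 * rng.normal(size=n),
                     base[:, 1] + 1e-3 * rng.normal(size=n)])
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=n)

grid = [1e-4, 1e-2, 1.0, 1e2]   # predefined range: the search cannot leave it
best_lam, scores = grid_search(X, y, grid)
# If the winner sits on the edge of the grid, the true optimum may lie outside.
hits_boundary = best_lam in (grid[0], grid[-1])
```

The cost grows as (number of grid points) × (number of folds) model fits, and the result can never be better than the best point in `grid`. Those are the two drawbacks noted above.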
Computation Efficiency
Over 20 methods are highly time-consuming, especially for large power systems with over 1000 buses. Even though some of these methods are quite accurate, they cannot satisfy the time requirements of applications such as real-time dispatch. These time-consuming methods are mainly optimization-based, nonlinear-model-fitting-based, or iteration-based approaches. Therefore, enhancing the efficiency of optimization problem-solving, improving the fitting of nonlinear models, accelerating the convergence of specific algorithms, and effectively managing large-scale datasets are critical strategies for reducing the computational burden. These measures can significantly contribute to the practical applicability and performance of DPFL methods in large-scale scenarios. However, how to achieve such acceleration remains a largely open area of investigation.
Inherent Nonlinearity
In addressing the inherent nonlinearity within AC power flow models, coordinate transformation strategies such as LS_LIFX, LS_LIFXi, and RR_VCS have shown promise. However, they compromise the practical applicability of DPFL models for optimization or control purposes, as they render the model a linear function of transformed variables that lack clear physical interpretation. Furthermore, applying nonlinear kernels to project the AC model into another space, despite its theoretical appeal, raises the challenge of identifying a kernel under which the AC power flow model becomes highly linear in the transformed space. The commonly used SVR_POL with a third-degree polynomial kernel falls short of achieving such an effective power flow representation. Moreover, piecewise linear models, developed through clustering-based DPFL approaches such as RR_KPC and PLS_CLS, demonstrate improved accuracy in cases where the original power flow model exhibits significant nonlinearity. Yet, again, they are challenging to use in practical applications because they introduce integer decision variables, making the programming problem difficult to solve. Consequently, finding an optimal balance between model accuracy and practical applicability when handling the inherent nonlinearity of AC power flow models remains a significant challenge that needs further exploration.
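The trade-off of clustering-based piecewise linear models can be sketched as follows. This is a generic k-means plus per-cluster least-squares illustration, not the RR_KPC or PLS_CLS algorithm. The helper names, the number of clusters, and the quadratic toy function standing in for a nonlinear power flow relation are all assumptions.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center for every sample (the discrete region choice)."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iterations; an emptied cluster keeps its previous center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fit_piecewise(X, y, k):
    """One affine least-squares model per cluster."""
    centers = kmeans(X, k)
    labels = assign(X, centers)
    A = np.column_stack([np.ones(len(X)), X])
    coefs = np.zeros((k, A.shape[1]))
    for j in range(k):
        mask = labels == j
        if np.any(mask):
            coefs[j] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
    return centers, coefs

def predict_piecewise(X, centers, coefs):
    labels = assign(X, centers)   # selecting a region is a discrete decision
    A = np.column_stack([np.ones(len(X)), X])
    return np.sum(A * coefs[labels], axis=1)

# Toy nonlinear relation (an assumption, not a power flow equation).
X = np.linspace(-2.0, 2.0, 300)[:, None]
y = X[:, 0] ** 2

A = np.column_stack([np.ones(len(X)), X])
mse_linear = np.mean((y - A @ np.linalg.lstsq(A, y, rcond=None)[0]) ** 2)
centers, coefs = fit_piecewise(X, y, k=3)
mse_piecewise = np.mean((y - predict_piecewise(X, centers, coefs)) ** 2)
```

The piecewise model fits the curve more closely than a single line, but `predict_piecewise` must first decide which region a point belongs to. When the operating point is itself a decision variable in an optimization problem, that region selection is what gives rise to the integer variables mentioned above.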
Physical Knowledge
Several DPFL approaches incorporate various forms of physical knowledge, such as boundary operating conditions (LCP_JGD), network topologies (LCP_COU and RR_VCS), line admittances (LCP_BOX and LCP_JGD), and PPFL model formulations (DLPF_C and DC_LS). Yet, accessing this physical data may not always be straightforward in real-world scenarios. Future studies need to address this challenge, evaluating the accessibility of such information and reaching a consensus on its availability. Should certain physical data prove difficult to obtain, assessing the associated costs becomes crucial. This evaluation will help determine the practicality and value of incorporating physical knowledge into DPFL models to enhance their accuracy, considering the potential implications for system operators.
Additionally, integrating physical knowledge into DPFL poses several open questions due to its inherent limitations. First, the process might inadvertently omit critical data by narrowing the focus to certain variables, such as excluding known voltages from the predictors, which could lead to a significant loss of information. Second, the approach faces challenges with normalized datasets, especially when variables are not scaled uniformly, potentially hindering the training process. Third, the effectiveness of this integration heavily depends on the accuracy of the physical knowledge itself, which is not always guaranteed to be precise or universally applicable, as seen with the use of the Jacobian matrix, which is just an approximation of reality. This introduces doubts about the overall enhancement of DPFL model accuracy through physical knowledge, as shown in the simulations. Fourth, the constraints on selecting predictors and responses due to the integration of physical knowledge limit the DPFL model’s application scope. The above limitations highlight the need for a critical examination of the role and implementation of physical knowledge in DPFL models, questioning the balance between enrichment and potential data exclusion, and the need for more inclusive and accurate modeling approaches.
Remedial Action
In practical scenarios, system operation often requires various remedial measures, such as topology modifications or adjustments in phase-shifting angles. Models trained on data collected prior to these adjustments may lose relevance post-implementation. Thus, incorporating remedial actions as predictors rather than parameters during the model training phase could enhance applicability. However, how to integrate such actions as predictors remains an unresolved question. Furthermore, updating models in light of frequent remedial interventions presents an additional challenge, particularly when transitional data is sparse. With the expected increase in the frequency of remedial actions due to greater integration of renewable energy sources, there is a pressing need for a more robust solution. This solution should be grounded in solid theoretical principles to effectively incorporate remedial actions into DPFL models.
Bus-type Variation
Bus-type variation is a notable issue in power flow models. Yet, in DPFL, the current solution to this issue, particularly the bundle strategy employed in methods like PLS_BDL, PLS_BDLY2, and PLS_JGD, faces significant challenges. One major issue is the potential non-invertibility of a key matrix within this strategy’s computational framework. Situations where this matrix becomes non-invertible, often due to zero or constant predictors, severely limit the methods’ applicability and effectiveness, as evidenced in simulation results. How to effectively address bus-type variations remains a critical area for improvement, especially the need for a more robust solution to ensure the reliability and utility of DPFL approaches.
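The effect of zero or constant predictors on invertibility can be reproduced in a few lines. The key matrix of the bundle strategy is not specified in this section, so the sketch uses a generic least-squares design matrix with an intercept as a stand-in. The variable names, the values, and the minimum-norm workaround are assumptions for illustration, not a remedy proposed in this paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x_var = rng.uniform(-1.0, 1.0, n)   # an ordinary, varying predictor
x_const = np.full(n, 1.05)          # constant predictor (hypothetical fixed setpoint)
x_zero = np.zeros(n)                # zero predictor (hypothetical bus with no injection)

# Design matrix with an intercept column: the constant predictor is collinear
# with the intercept, and the zero predictor contributes an all-zero column.
X = np.column_stack([np.ones(n), x_var, x_const, x_zero])
y = 0.5 + 2.0 * x_var

# lstsq returns the minimum-norm least-squares solution and the numerical rank.
beta_min_norm, _, rank_X, _ = np.linalg.lstsq(X, y, rcond=None)

# rank(X^T X) equals rank(X), so a rank below the column count means the
# Gram matrix X^T X has no inverse and the normal equations cannot be solved
# by direct inversion.
gram_is_singular = rank_X < X.shape[1]
fit_error = np.max(np.abs(X @ beta_min_norm - y))
```

A minimum-norm solution still reproduces the training data here, but the individual coefficients of the collinear columns are no longer uniquely determined. This is one reason why such predictors limit a method's applicability.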
Limited Observability
In scenarios where the system is not fully observed, current DPFL methods are constrained to modeling only those variables that are measured, leading to models that are truncated and limited in their comprehensiveness. The challenge of developing a DPFL model that covers the entire system with only partial observations remains unsolved. Identifying a strategy that overcomes this limitation is crucial, especially for distribution grids where the data availability is often limited.
Test Standardization
The field of DPFL is currently hindered by the absence of standard testbeds grounded in real-world measurements. To conduct a reliable analysis of DPFL techniques, it is essential to establish a standardized system based on data from actual power grids. Such a resource would be invaluable for both researchers and practitioners in the DPFL domain, allowing for the evaluation and comparison of various methods in a consistent framework that mirrors real-world conditions. Such a dataset can advance DPFL research and ensure its applicability and effectiveness in practical settings.
DPFL Toolbox
Beyond the need for standard test sets, the ability to benchmark against established DPFL methods is also vital for researchers to validate their findings. Moreover, straightforward access to established DPFL methods would allow researchers to obtain accurate linear models for their projects with little effort. Classic PPFL methods such as DC or PTDF are already built into many toolboxes, e.g., MATPOWER. However, replicating existing DPFL benchmarks is still a significant challenge due to the lack of open-source code: over 95% of the DPFL literature does not provide accessible code. This gap emphasizes the need for a comprehensive, open-source DPFL toolbox. Such a toolbox should not only facilitate the application of all DPFL methods but also enable easy customization of tunable hyperparameters. It should offer a suite of features including data generation, processing, method evaluation, and diverse visualization options. Despite its broad functionality, the toolbox must prioritize user-friendliness; ideally, complex tasks such as method comparison and hyperparameter tuning could be executed with minimal coding effort. The creation of this toolbox would represent a significant leap forward, providing a robust platform for DPFL research and applications.
This paper is available on arXiv under the CC BY-NC-ND 4.0 (Attribution-NonCommercial-NoDerivs 4.0 International) license.