Preliminaries on Kernel-based Learning


Chapter 3 Preliminaries on Kernel-based Learning

Given pairs $\{(z_s, y_s)\}_{s=1}^{S}$ of features $z_s$ belonging to a measurable space $\mathcal{Z}$ and target values $y_s \in \mathbb{R}$, kernel-based learning aims at finding a function or mapping $f : \mathcal{Z} \to \mathbb{R}$. From all possible options of arbitrarily complex functions, one needs to select a specific family. Kernel-based learning postulates that $f$ lies in the function space [28]

$$\mathcal{H}_K := \Big\{ f(z) = \sum_{s=1}^{\infty} K(z, z_s)\, a_s,\ a_s \in \mathbb{R} \Big\}. \tag{3.1}$$
This is the space of functions that can be expressed as linear combinations of a given kernel (basis) function $K : \mathcal{Z} \times \mathcal{Z} \to \mathbb{R}$ evaluated at arbitrary points $z_s$. When $K(\cdot, \cdot)$ is a symmetric positive definite function, $\mathcal{H}_K$ becomes a reproducing kernel Hilbert space (RKHS). The function $f$ is then learned by solving the regularized fitting problem (3.2). Examples of the data-fitting loss $L$ include the least-squares (LS) fit $(y_s - f(z_s) - b)^2$, or the $\epsilon$-insensitive loss $[y_s - f(z_s) - b]_\epsilon$, where $[x]_\epsilon := \max\{0, |x| - \epsilon\}$. The second term in (3.2) ensures $f \in \mathcal{H}_K$ and facilitates generalization over unseen data [29]. A positive regularization parameter balances fitting versus generalization, and is tuned via cross-validation: i) problem (3.2) is solved for a specific parameter value using 4/5 of the data; ii) the learned function is validated on the unused 1/5 of the data; iii) the process is repeated 5 times to calculate the average fitting error for this value; and iv) the value attaining the best fit is selected; see [28] for details.
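The cross-validation steps i) to iv) can be sketched as follows for the LS loss. This is a minimal illustration rather than the implementation used in this work: the Gaussian kernel and its width `gamma`, the 1/S scaling of the fitting term, the squared RKHS norm as regularizer, the name `mu` for the regularization weight, and all function names are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A (n x d) and B (m x d)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ls(Z, y, mu, gamma=1.0):
    """Minimize (1/S)||y - K a - b 1||^2 + mu a^T K a over (a, b).

    Assumed form of the LS problem; the optimality conditions reduce to
    (K + mu*S*I) a + b 1 = y together with 1^T a = 0.
    """
    S = len(y)
    K = gaussian_kernel(Z, Z, gamma)
    A = np.zeros((S + 1, S + 1))
    A[:S, :S] = K + mu * S * np.eye(S)
    A[:S, S] = 1.0
    A[S, :S] = 1.0
    sol = np.linalg.solve(A, np.append(y, 0.0))
    return sol[:S], sol[S]  # kernel coefficients a, intercept b

def cross_validate_mu(Z, y, mus, n_folds=5, gamma=1.0, seed=0):
    """Return the candidate mu with the smallest average validation error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    avg_err = []
    for mu in mus:
        errs = []
        for k in range(n_folds):
            val = folds[k]  # the unused 1/5 of the data
            trn = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            a, b = fit_kernel_ls(Z[trn], y[trn], mu, gamma)
            pred = gaussian_kernel(Z[val], Z[trn], gamma) @ a + b
            errs.append(np.mean((y[val] - pred) ** 2))
        avg_err.append(float(np.mean(errs)))
    return mus[int(np.argmin(avg_err))], avg_err
```

A typical call would pass a short logarithmic grid such as `mus = [1e-4, 1e-2, 1.0, 100.0]`; the grid itself is a design choice not specified in the text.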
The advantage of confining $f$ to lie in the RKHS $\mathcal{H}_K$ is that the functional optimization of (3.2) can be equivalently posed as a minimization problem over a finite-dimensional vector: the celebrated Representer’s Theorem asserts that the solution to (3.2) admits the form [28]

$$f(z) = \sum_{s=1}^{S} K(z, z_s)\, a_s.$$
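To illustrate how the problem becomes finite-dimensional, consider the LS loss and assume, for this sketch only, that (3.2) averages the loss over the $S$ samples and weighs the squared RKHS norm by a parameter written here as $\mu$; the exact scaling used in this work may differ. Let $\mathbf{K} \in \mathbb{R}^{S \times S}$ be the kernel matrix with entries $K(z_s, z_{s'})$. Substituting an expansion over the $S$ training points and using the reproducing property gives:

```latex
\begin{align*}
f(z) = \sum_{s=1}^{S} K(z, z_s)\, a_s
  \quad &\Longrightarrow \quad
  [f(z_1)\ \cdots\ f(z_S)]^\top = \mathbf{K} a,
  \qquad \|f\|_K^2 = a^\top \mathbf{K} a, \\
\min_{f \in \mathcal{H}_K,\, b}\ \frac{1}{S}\sum_{s=1}^{S} \big(y_s - f(z_s) - b\big)^2 + \mu \|f\|_K^2
  \quad &\Longleftrightarrow \quad
  \min_{a \in \mathbb{R}^S,\, b \in \mathbb{R}}\ \frac{1}{S}\, \| y - \mathbf{K} a - b\,\mathbf{1} \|_2^2 + \mu\, a^\top \mathbf{K} a .
\end{align*}
```

The unknown is now the vector $a$ of $S$ coefficients plus the intercept $b$, regardless of how rich the function space is.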

Chapter 4 Kernel-based Control Policies

The reactive injection by inverter $n$ is modeled by the rule $q_n^g(z_n) = f_n(z_n) + b_n$ (4.1), whose ingredients $(f_n, z_n, b_n)$ are explained next.
Control inputs: Vector $z_n \in \mathcal{Z}_n \subseteq \mathbb{R}^{M_n}$ is the input to the control rule for inverter $n$. This vector may include load, solar generation, and/or line flow measurements collected locally or remotely. For a purely local rule, this input can be selected as in (4.2), where the first entry $\bar{q}_n^g$ relates to the apparent power constraint and has been defined in (2.2). The voltage $v_n$ could also be appended to $z_n$; however, the stability of the resultant control loop is hard to analyze even when $f_n$ is linear; see e.g., [31], [17], [18], [22], [1].
Selecting the controller structure, i.e., the content of each $z_n$, can critically affect the performance of this control scheme. Ideally, each inverter rule would be fed all uncertain quantities, that is, the three numbers on the right-hand side of (4.2) across all buses. In that case, the input vectors $z_n$ become all equal and of size $3N$. However, this incurs the communication burden of broadcasting $3N$ values in real time. Hybrid setups with $z_n$'s carrying a combination of local and remote data can be envisioned. To set aside this trade-off between communications and performance, this work assumes that the content of the $z_n$'s is prespecified. The task of input selection could possibly be pursued along the lines of sparse linear or polynomial regression [22], [1], [32], and automatic relevance determination [33, Sec. 6.4].
Control function: Selecting the form of $f_n$ is the second design task. To leverage kernel-based learning, the inverter rule $f_n$ is postulated to lie in the RKHS $\mathcal{H}_{K_n}$ defined by a kernel function $K_n$ as in (3.1).

Learning rules from scenario data

The rules of (4.1) can be learned from scenario data indexed by $s \in \mathcal{S}$ with $\mathcal{S} := \{1, \ldots, S\}$. Scenario $s$ consists of the control inputs $z_{n,s}$ for $n \in \mathcal{N}$, and the associated vector $y_s := R(p^g_s - p^c_s) - X q^c_s$ defined in (2.4). Evaluating rule $n$ of (4.1) under scenario $s$ yields the inverter response $q^g_{n,s} := q^g_n(z_{n,s})$. Let us collect the outputs $q^g_{n,s}$ from all inverters into vector $q^g_s$. Note that the goal is not to fit $y_s$ by $q^g_s$, but to minimize the voltage deviations $X q^g_s + y_s$. The control functions $\{f_n\}_{n=1}^{N}$ and the intercepts $\{b_n\}_{n=1}^{N}$ accomplishing this goal can be found via the functional minimization (4.4), where $\Delta$ is a voltage regulation objective [cf. (2.6)–(2.7)].

Remark 4.1.

The proposed approach is related to [1]–[2], where inverter rules are also trained using machine learning. However, the aforementioned works proceed in two steps. First, they solve a sequence of OPF problems similar to (2.5) to find the optimal inverter setpoints $\tilde{q}^g$ under different scenarios. Second, they learn the mapping between controller inputs $\{z_{n,s}\}_{s \in \mathcal{S}}$ and optimal setpoints $\{\tilde{q}^g_{n,s}\}$ decided by the OPF problems. During this process, they also select which inputs are more effective to be communicated to inverters. The mapping is learned via linear or kernel-based regression. On the other hand, the approach proposed here consolidates the OPF and the learning steps into a single step. The advantage is that the OPF decisions of (4.4) are taken under the explicit practical limitation that $q_n^g$ can only be a function of $z_n$, since inverter $n$ will not have access to the complete grid conditions. To get some intuition, suppose one designs linear control rules of known input structure using the single-step approach of (4.4) with the regularization weight set to zero and the two-step approach of [1]–[2]. The single-step approach yields rules $R_1$, and the two-step approach yields rules $R_2$. Let us evaluate $R_1$ and $R_2$ on the training scenarios. Rules $R_2$ are not necessarily feasible per scenario $s \in \mathcal{S}$, whereas rules $R_1$ are. Moreover, rules $R_2$ do not necessarily coincide with the minimizers of (2.5). For the sake of comparison, let us assume that rules $R_2$ turn out to be feasible per scenario, and hence feasible for (4.4). Being the minimizers of (4.4), rules $R_1$ attain equal or smaller voltage deviation cost compared to $R_2$ over the training data. Numerical tests in chapter 6 corroborate the advantage of $R_1$ over $R_2$ for a positive regularization weight and during the operational phase as well.
Different from (3.2), the optimization in (4.4) entails learning multiple functions (one per inverter). Since inverter injections affect voltages feeder-wise, inverter rules are naturally coupled through ∆ in (4.4). Similar multi-function setups can be found in collaborative filtering or multi-task learning [29], [34].
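Fortunately, Representer’s Theorem can be applied successively over $n$ in (4.4). Therefore, each rule $n$ is written as

$$f_n(z_n) = \sum_{s=1}^{S} K_n(z_n, z_{n,s})\, a_{n,s}. \tag{4.5}$$

Example 1: Linear rules. For linear rules $f_n(z_n) = w_n^\top z_n$, given scenario data $z_{n,s}$ and $y_s$ for $n \in \mathcal{N}$ and $s \in \mathcal{S}$, we would like to find $\{w_n, b_n\}_n$ through (4.4). Collect the input data for inverter $n$ in the $M_n \times S$ matrix $Z_n := [z_{n,1}\ \cdots\ z_{n,S}]$. According to Representer’s Theorem, the optimal $w_n$ can be expressed as $w_n = Z_n a_n$ for some $a_n \in \mathbb{R}^S$. Evaluating the control rule for any input $z_{n,s}$ yields $f_n(z_{n,s}) = z_{n,s}^\top Z_n a_n = \sum_{s'=1}^{S} (z_{n,s}^\top z_{n,s'})\, a_{n,s'}$, which is the form (4.5) with the linear kernel $K_n(z_n, z_n') = z_n^\top z_n'$ and kernel matrix $K_n = Z_n^\top Z_n$.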
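The equivalence between the weight-vector form and the coefficient form of a linear rule can be checked numerically. The sketch below uses random placeholder data; the sizes `M` and `S`, the intercept value, and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_rule_primal(z, w, b):
    """Evaluate the rule as w^T z + b."""
    return float(w @ z + b)

def linear_rule_kernel(z, Z, a, b):
    """Evaluate the same rule through inner products with the scenario inputs."""
    return float((Z.T @ z) @ a + b)

rng = np.random.default_rng(0)
M, S = 3, 30                      # hypothetical: 3 inputs, 30 scenarios
Z = rng.standard_normal((M, S))   # columns play the role of the scenario inputs
a = rng.standard_normal(S)        # placeholder coefficients
b = 0.1                           # placeholder intercept

w = Z @ a                         # weight vector implied by the coefficients
z_new = rng.standard_normal(M)    # an input not among the scenarios

q_primal = linear_rule_primal(z_new, w, b)
q_kernel = linear_rule_kernel(z_new, Z, a, b)
print(np.isclose(q_primal, q_kernel))
```

Both evaluations agree because $w^\top z = a^\top Z^\top z$; only the amount of data needed to store the rule differs.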
Example 2: Non-linear rules. For non-linear rules, transform the input $z_{n,s}$ to the vector $\phi_{n,s} := \phi_n(z_{n,s})$ via a non-linear mapping $\phi_n$ defined on $\mathbb{R}^{M_n}$. The entries of $\phi_{n,s}$ could be, for example, all the first- and second-order monomials formed by the entries of $z_{n,s}$. The dimension of $\phi_{n,s}$ can be finite (e.g., polynomial kernels) or infinite (Gaussian kernels) [33], [28]. Then, the control function $f_n(z_n) = w_n^\top \phi_n(z_n)$, with $w_n$ of the same dimension as $\phi_{n,s}$, is non-linear in $z_n$. The developments of Example 1 carry over to Example 2 by using $K_n = \Phi_n^\top \Phi_n$ and replacing $Z_n$ by $\Phi_n := [\phi_{n,1}\ \cdots\ \phi_{n,S}]$. The critical point is that $f_n$ does not depend on the $\phi_{n,s}$'s directly, but only on their inner products $\phi_{n,s}^\top \phi_{n,s'}$ for any $s$ and $s'$. These products can be easily calculated through the kernel function as $\phi_{n,s}^\top \phi_{n,s'} = K_n(z_{n,s}, z_{n,s'})$.
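The statement that inner products of transformed inputs can be computed through the kernel alone can be verified on a small case. The sketch below assumes a second-degree polynomial kernel $(1 + z^\top z')^2$ as one possible choice; the $\sqrt{2}$ scaling of the monomials is introduced here so that the match is exact, and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def poly2_features(z):
    """Explicit map: constant, first-order, and second-order monomials of z."""
    z = np.asarray(z, dtype=float)
    cross = [np.sqrt(2.0) * z[i] * z[j]
             for i, j in combinations(range(len(z)), 2)]
    return np.concatenate(([1.0], np.sqrt(2.0) * z, z**2, cross))

def poly2_kernel(z1, z2):
    """Kernel value computed without forming the feature vectors."""
    return (1.0 + np.dot(z1, z2)) ** 2

rng = np.random.default_rng(0)
z1, z2 = rng.standard_normal(3), rng.standard_normal(3)

lhs = poly2_features(z1) @ poly2_features(z2)  # inner product in feature space
rhs = poly2_kernel(z1, z2)                     # kernel evaluation in input space
print(np.isclose(lhs, rhs))
```

For three inputs the explicit map already has 10 entries, while the kernel evaluation needs only one inner product of length 3; for a Gaussian kernel the explicit map would be infinite-dimensional, so the kernel route is the only practical one.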
Since the constraints in (4.4) are enforced only for the scenario data, the learned rules do not necessarily satisfy these constraints for all $z_{n,s}$ with $s \notin \{1, \ldots, S\}$. This limitation appears also in scenario-based and chance-constrained designs [20]. Once a control rule is learned, its output at real time $t$ can be heuristically projected onto $[-\bar{q}^g_{n,t}, +\bar{q}^g_{n,t}]$, that is, set to the nearest endpoint whenever it falls outside this interval.
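The heuristic projection amounts to clipping the rule output to the symmetric interval set by the current reactive power limit. A minimal sketch, with the function name and argument names chosen here for illustration:

```python
import numpy as np

def project_setpoint(q_rule, q_max):
    """Clip the rule output to the interval [-q_max, +q_max].

    q_rule: output of the learned rule at time t (f_n(z) + b_n).
    q_max:  reactive power limit of the inverter at time t (non-negative).
    """
    return float(np.clip(q_rule, -q_max, q_max))
```

Outputs already inside the interval pass through unchanged, so the projection only alters the rule under conditions not covered by the training scenarios.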

Implementing reactive control rules

Our control scheme involves four steps; see also Fig. 4.1:
T1) The utility collects scenario data $z_{n,s}$ for all $n$ and $s$.
T2) The utility designs rules by solving (4.4); see chapter 5.
T3) Each inverter $n$ receives the $S + 1$ values $(a_n, b_n)$ from the utility, which describe $f_n$.
T4) Over the next 30 minutes and at real time $t$, each inverter $n$ collects $z_{n,t}$ and applies the rule.
The aforesaid process is explicated next. Regarding T1), scenario data should be as representative as possible of the grid conditions anticipated over the following 30-min control period. One option would be to use load and solar generation forecasts. A second option would be to use historical data from the previous day and same time, if they are representative of today’s conditions. A third alternative would be to use the most recent grid conditions known to the utility. For example, if smart meter data are collected every 30 min anyway, they can be used in lieu of forecasts for the next control period.
Figure 4.1: Implementing reactive power control rules. Left: Data are collected from buses. Center: utility designs rules and downloads rules to inverters. Right: Inverters follow control rules fed by local and/or remote data.
The numerical tests of chapter 6 adopt the third option and use the minute-based grid conditions observed over the last 30 minutes as $S = 30$ scenarios to train the inverter rules for the upcoming 30-minute interval. The number of training scenarios $S$ does not have to coincide with the length of the control period measured in minutes. These two parameters depend on loading conditions; feeder details; availability and quality of scenario data; and communication and computational resources. Selecting their optimal values goes beyond the scope of this work.
During T4), inverter $n$ has already received $(a_n, b_n)$ and $\{z_{n,s}\}_{s=1}^{S}$ during T3). Each $z_n$ may consist of local data and a few active flow readings collected from major lines or transformers. If the entries of $z_n$ are all local, the rule can be applied with no communication. Otherwise, the non-local entries of $z_n$ have to be sent to inverter $n$. If non-local inputs are shared among inverters, broadcasting protocols can reduce the communication overhead.
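The real-time evaluation in T4) can be sketched as a small object that stores what the inverter received in T3) and evaluates the kernel expansion for each new input. The Gaussian kernel, its width `gamma`, and the class name `InverterRule` are assumptions made for this illustration; the kernel actually used is a design choice.

```python
import numpy as np

class InverterRule:
    """Stores scenario inputs, coefficients, and intercept for one inverter."""

    def __init__(self, Z_train, a, b, gamma=1.0):
        self.Z_train = np.asarray(Z_train, dtype=float)  # S x M scenario inputs
        self.a = np.asarray(a, dtype=float)              # S kernel coefficients
        self.b = float(b)                                # intercept
        self.gamma = gamma                               # assumed kernel width

    def __call__(self, z_t):
        """Return sum_s K(z_t, z_s) a_s + b for the current input z_t."""
        d2 = ((self.Z_train - np.asarray(z_t, dtype=float)) ** 2).sum(axis=1)
        k = np.exp(-self.gamma * d2)  # Gaussian kernel against each scenario
        return float(k @ self.a + self.b)

# Usage with two placeholder scenarios and two inputs per scenario.
rule = InverterRule(Z_train=[[0.0, 0.0], [1.0, 1.0]], a=[1.0, -1.0], b=0.5)
q_t = rule([0.0, 0.0])
```

Each evaluation costs one kernel computation per stored scenario, so the real-time burden grows linearly with $S$.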
Remark 4.2. Suppose each inverter $n$ knows the training data $z_{n,s}$ for $s \in \mathcal{S}$. Function $f_n$ can be described in two ways: either through (4.5) using the data described under T3), or through (4.8)–(4.9) via $w_n$. For the second way, vector $w_n$ has $M_n$ entries in the linear case and as many entries as the feature vector $\phi_{n,s}$ in the nonlinear case. For the linear case, if $M_n < S + 1$, representing $f_n$ through (4.8) by $w_n$ is more parsimonious. Representation (4.5) becomes advantageous only when the dimension of $\phi_{n,s}$ is much larger than $S + 1$ in the nonlinear case.

1 Introduction
2 Reactive Power Control
3 Preliminaries on Kernel-based Learning
4 Kernel-based Control Policies
5 Support Vector Reactive Power Control
6 Numerical Tests
7 Conclusions
Voltage Regulation of Smart Grids using Machine Learning Tools