
**Chapter 3** **Preliminaries on Kernel-based Learning**

Given pairs $\{(z_s, y_s)\}_{s=1}^{S}$ of features $z_s$ belonging to a measurable space $\mathcal{Z}$ and target values $y_s \in \mathbb{R}$, kernel-based learning aims at finding a function or mapping $f : \mathcal{Z} \to \mathbb{R}$. From all possible options of arbitrarily complex functions, one needs to select a specific family. Kernel-based learning postulates that $f$ lies in the function space [28]

$$\mathcal{H}_K := \Big\{ f(\cdot) = \sum_{s'} a_{s'} K(\cdot, z_{s'}) \ : \ a_{s'} \in \mathbb{R},\ z_{s'} \in \mathcal{Z} \Big\}$$

This is the space of functions that can be expressed as linear combinations of a given kernel (basis) function $K : \mathcal{Z} \times \mathcal{Z} \to \mathbb{R}$ evaluated at arbitrary points $z_s$. When $K(\cdot, \cdot)$ is a symmetric positive definite function, then $\mathcal{H}_K$ becomes a reproducing kernel Hilbert space (RKHS). The sought function and intercept are found as the minimizers of the regularized fit

$$\min_{f \in \mathcal{H}_K,\ b} \ \sum_{s=1}^{S} L\big(y_s - f(z_s) - b\big) + \lambda \|f\|_{\mathcal{H}_K}^{2} \qquad (3.2)$$

Typical choices for the fitting loss $L$ include the least-squares (LS) fit $\big(y_s - f(z_s) - b\big)^2$, or the $\epsilon$-insensitive loss $\big[\,|y_s - f(z_s) - b|\,\big]_{\epsilon}$. The second term in (3.2) ensures $f \in \mathcal{H}_K$ and facilitates generalization over unseen data [29]. Parameter $\lambda > 0$ balances fitting versus generalization, and is tuned via cross-validation: *i)* problem (3.2) is solved for a specific $\lambda$ using 4/5 of the data; *ii)* the learned function is validated on the unused 1/5 of the data; *iii)* the process is repeated 5 times to calculate the average fitting error for this $\lambda$; and *iv)* the $\lambda$ attaining the best fit is selected; see [28] for details.
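The tuning steps *i)–iv)* can be sketched as a short cross-validation loop. This is a minimal illustration assuming the LS loss and a Gaussian kernel, with the intercept $b$ omitted for brevity; the function names, the toy data, and the candidate $\lambda$ grid are all invented for the example and are not from the thesis:

```python
import numpy as np

def gaussian_gram(Z, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||z_i - z_j||^2) for columns z_i of Z."""
    sq = np.sum(Z**2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2 * Z.T @ Z
    return np.exp(-gamma * np.maximum(d2, 0.0))

def cv_error(Z, y, lam, folds=5, gamma=1.0):
    """Steps i)-iii): average validation error of the LS-fit rule for a given lam."""
    S = y.size
    idx = np.arange(S)
    errs = []
    for f in range(folds):
        val = idx[f::folds]                       # hold out ~1/5 of the data
        trn = np.setdiff1d(idx, val)
        K = gaussian_gram(Z[:, trn], gamma)
        a = np.linalg.solve(K + lam * np.eye(trn.size), y[trn])
        # kernel values between validation and training points
        Kv = np.exp(-gamma * np.sum((Z[:, val, None] - Z[:, None, trn])**2, axis=0))
        errs.append(np.mean((y[val] - Kv @ a)**2))
    return float(np.mean(errs))

# Step iv): keep the lambda attaining the best average validation fit.
rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 40))                  # 40 feature vectors in R^2
y = np.sin(Z[0]) + 0.1 * rng.standard_normal(40)  # toy targets
lams = [1e-3, 1e-2, 1e-1, 1.0]
best_lam = min(lams, key=lambda lam: cv_error(Z, y, lam))
```

In practice one would refine the $\lambda$ grid around the selected value; the loop structure stays the same.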

The advantage of confining $f$ to lie in the RKHS $\mathcal{H}_K$ is that the functional optimization in (3.2) can be equivalently posed as a minimization over a finite-dimensional vector: the celebrated Representer's Theorem asserts that the solution to (3.2) admits the form [28]

$$\hat f(z) = \sum_{s=1}^{S} a_s K(z, z_s).$$
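For the LS loss, for instance, this reduction can be made concrete: substituting $f(z) = \sum_s a_s K(z, z_s)$ into (3.2) yields a linear system in the coefficient vector. The sketch below assumes a Gaussian kernel and omits the intercept $b$; the toy data and variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
S = 30
z = rng.uniform(-2, 2, S)                       # scalar features z_s
y = np.tanh(z) + 0.05 * rng.standard_normal(S)  # toy targets y_s

# Gram matrix of a Gaussian kernel K(z, z') = exp(-(z - z')^2)
K = np.exp(-(z[:, None] - z[None, :])**2)

# With f = sum_s a_s K(., z_s), problem (3.2) under the LS loss becomes
#   min_a ||y - K a||^2 + lam * a' K a,   solved by (K + lam I) a = y.
lam = 0.1
a = np.linalg.solve(K + lam * np.eye(S), y)

# The learned function is evaluated anywhere through kernel values alone.
f = lambda znew: np.exp(-(znew - z)**2) @ a
```

Evaluating the learned $f$ at a new point requires only $S$ kernel evaluations, never an explicit feature representation.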

**Chapter 4** **Kernel-based Control Policies**

The reactive injection by inverter $n$ is modeled by the rule

$$q_n^g(z_n) = f_n(z_n) + b_n \qquad (4.1)$$

whose ingredients $(f_n, z_n, b_n)$ are explained next.

Control inputs: Vector $z_n \in \mathcal{Z}_n \subseteq \mathbb{R}^{M_n}$ is the input to the control rule for inverter $n$. This vector may include load, solar generation, and/or line flow measurements collected locally or remotely. For a purely local rule, this input can be selected as in (4.2), where the first entry $\bar q_n^g$ relates to the apparent power constraint and has been defined in (2.2). The voltage $v_n$ could also be appended to $z_n$; however, the stability of the resultant control loop is hard to analyze even when $f_n$ is linear; see e.g., [31], [17], [18], [22], [1].

Selecting the controller structure, i.e., the content of each $z_n$, can critically affect the performance of this control scheme. Ideally, each inverter rule would be fed all uncertain quantities, that is, the three numbers on the right-hand side of (4.2) across all buses. In that case, the input vectors $z_n$ become all equal and of size $3N$. However, this incurs the communication burden of broadcasting $3N$ values in real time. Hybrid setups with $z_n$'s carrying a combination of local and remote data can be envisioned. To sidestep this trade-off between communications and performance, this work assumes that the content of the $z_n$'s is prespecified. The task of input selection could possibly be pursued along the lines of sparse linear or polynomial regression [22], [1], [32], or automatic relevance determination [33, Sec. 6.4].

Control function: Selecting the form of $f_n$ is the second design task. To leverage kernel-based learning, the inverter rule $f_n$ is postulated to lie in the RKHS $\mathcal{H}_{K_n}$ induced by a kernel $K_n$ chosen by the designer.

**Learning rules from scenario data**

The rules of (4.1) can be learned from scenario data indexed by $s \in \mathcal{S}$ with $\mathcal{S} := \{1, \ldots, S\}$. Scenario $s$ consists of the control inputs $z_{n,s}$ for $n \in \mathcal{N}$, and the associated vector $y_s := R(p_s^g - p_s^c) - X q_s^c$ defined in (2.4). Evaluating rule $n$ of (4.1) under scenario $s$ yields the inverter response $q_{n,s}^g := q_n^g(z_{n,s})$. Let us collect the outputs $q_{n,s}^g$ from all inverters into the vector $q_s^g$. Note that the goal is not to fit $y_s$ by $q_s^g$, but to minimize the voltage deviations $\|X q_s^g + y_s\|$. The control functions $\{f_n\}_{n=1}^{N}$ and the intercepts $\{b_n\}_{n=1}^{N}$ accomplishing this goal can be found via the functional minimization

$$\min_{\{f_n \in \mathcal{H}_{K_n},\ b_n\}_{n=1}^{N}} \ \sum_{s=1}^{S} \Delta\big(X q_s^g + y_s\big) + \lambda \sum_{n=1}^{N} \|f_n\|_{\mathcal{H}_{K_n}}^{2} \quad \text{s.to}\ \ |q_{n,s}^g| \le \bar q_{n,s}^g \ \ \forall n, s \qquad (4.4)$$

where $\Delta$ is a voltage regulation objective [cf. (2.6)–(2.7)].
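To make the structure of (4.4) concrete, the sketch below solves a simplified instance with linear rules $q_{n,s}^g = w_n^\top z_{n,s} + b_n$, the squared Euclidean norm playing the role of $\Delta$, and the capability constraints dropped; under these assumptions (4.4) reduces to ridge-regularized least squares in the stacked unknowns. The random data and every variable name are illustrative, not from the thesis:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, S = 4, 3, 30                     # inverters, local inputs each, scenarios
X = 0.1 * (rng.random((N, N)) + N * np.eye(N))   # reactance-like sensitivity matrix
Z = rng.standard_normal((N, M, S))     # Z[n, :, s] = z_{n,s}
Y = rng.standard_normal((N, S))        # Y[:, s] = y_s
lam = 0.1

# Unknowns theta = [w_1; b_1; ...; w_N; b_N]; q_s = A_s @ theta, where row n of
# A_s carries z_{n,s}' in inverter n's weight slot and a 1 in its intercept slot.
P = N * (M + 1)
G = np.zeros((P, P))
r = np.zeros(P)
for s in range(S):
    A = np.zeros((N, P))
    for n in range(N):
        A[n, n*(M+1):n*(M+1)+M] = Z[n, :, s]
        A[n, n*(M+1)+M] = 1.0
    B = X @ A                          # contribution of scenario s to X q_s
    G += B.T @ B                       # normal equations for sum_s ||X q_s + y_s||^2
    r -= B.T @ Y[:, s]
# Ridge penalty on the weights w_n only (not on the intercepts b_n).
D = np.zeros(P)
D[np.arange(P) % (M + 1) != M] = 1.0
theta = np.linalg.solve(G + lam * np.diag(D), r)
```

Comparing the voltage-deviation cost at the minimizer against $\theta = 0$ (no reactive control) shows the learned rules shrink the deviations on the training scenarios.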

**Remark 4.1.**

The proposed approach is related to [1]–[2], where inverter rules are also trained using machine learning. However, the aforementioned works proceed in two steps: They first solve a sequence of OPF problems similar to (2.5) to find the optimal inverter setpoints $\tilde q^g$ under different scenarios. Secondly, they learn the mapping between the controller inputs $\{z_{n,s}\}_{s \in \mathcal{S}}$ and the optimal setpoints $\{\tilde q_{n,s}^g\}$ decided by the OPF problems. During this process, they also select which inputs are most effective to communicate to inverters. The mapping is learned via linear or kernel-based regression. On the other hand, the approach proposed here consolidates the OPF and the learning steps into a single step: The advantage is that the OPF decisions of (4.4) are taken under the explicit practical limitation that $q_n^g$ can only be a function of $z_n$, since inverter $n$ will not have access to the complete grid conditions. To get some intuition, suppose one designs linear control rules of known input structure using the single-step approach of (4.4) with $\lambda = 0$ and the two-step approach of [1]–[2]. The single-step approach yields rules $R_1$, and the two-step approach yields rules $R_2$. Let us evaluate $R_1$ and $R_2$ on the training scenarios. Rules $R_2$ are not necessarily feasible per scenario $s \in \mathcal{S}$, whereas rules $R_1$ are. Moreover, rules $R_2$ do not necessarily coincide with the minimizers of (2.5). For the sake of comparison, let us assume that rules $R_2$ turn out to be feasible per scenario, and hence feasible for (4.4). Being the minimizers of (4.4), rules $R_1$ attain equal or smaller voltage deviation cost compared to $R_2$ over the training data. Numerical tests in chapter 6 corroborate the advantage of $R_1$ over $R_2$ for $\lambda > 0$ and during the operational phase as well.

Different from (3.2), the optimization in (4.4) entails learning multiple functions (one per inverter). Since inverter injections affect voltages feeder-wise, inverter rules are naturally coupled through ∆ in (4.4). Similar multi-function setups can be found in collaborative filtering or multi-task learning [29], [34].

Fortunately, Representer's Theorem can be applied successively over $n$ in (4.4). Therefore, each rule $n$ can be written as

$$f_n(z_n) = \sum_{s=1}^{S} a_{n,s} K_n(z_n, z_{n,s}) \qquad (4.5)$$

**Example 1:** *Linear rules.* For linear rules $f_n(z_n) = w_n^\top z_n$: given scenario data $z_{n,s}$ and $y_s$ for $n \in \mathcal{N}$ and $s \in \mathcal{S}$, we would like to find $\{w_n, b_n\}$ through (4.4). Collect the input data for inverter $n$ in the $M_n \times S$ matrix $Z_n := [z_{n,1} \cdots z_{n,S}]$. According to Representer's Theorem, the optimal $w_n$ can be expressed as $w_n = Z_n a_n$ for some $a_n$. Evaluating the control rule for any input $z_{n,s}$ then yields $f_n(z_{n,s}) = w_n^\top z_{n,s} = a_n^\top Z_n^\top z_{n,s}$.
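This primal/dual equivalence for linear rules is easy to verify numerically; the dimensions and data below are arbitrary and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
Mn, S = 5, 12
Zn = rng.standard_normal((Mn, S))   # Z_n = [z_{n,1} ... z_{n,S}]
an = rng.standard_normal(S)         # dual coefficients a_n
wn = Zn @ an                        # Representer form: w_n = Z_n a_n

z = rng.standard_normal(Mn)         # any new input
primal = wn @ z                     # w_n' z
dual = an @ (Zn.T @ z)              # a_n' Z_n' z = sum_s a_{n,s} <z_{n,s}, z>
assert np.isclose(primal, dual)
```

The dual form only touches the inner products $z_{n,s}^\top z$, which is exactly what generalizes to nonlinear kernels in Example 2.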

**Example 2:** *Non-linear rules.* For non-linear rules, transform the input $z_{n,s}$ to the vector $\phi_{n,s} := \phi_n(z_{n,s})$ via a non-linear mapping $\phi_n : \mathbb{R}^{M_n} \to \mathbb{R}^{L_n}$. The entries of $\phi_{n,s}$ could be, for example, all the first- and second-order monomials formed by the entries of $z_{n,s}$. The dimension $L_n$ of $\phi_{n,s}$ can be finite (e.g., polynomial kernels) or infinite (Gaussian kernels) [33]. Then, the control function with $w_n \in \mathbb{R}^{L_n}$ is *non-linear* in $z_n$. The developments of Example 1 carry over to Example 2 by using $K_n = \Phi_n^\top \Phi_n$ and replacing $Z_n$ by $\Phi_n := [\phi_{n,1} \cdots \phi_{n,S}]$. Depending on the mapping $\phi_n$, the vectors $\phi_{n,s}$ may be of finite or infinite length [28]. The critical point is that $f_n$ does not depend on the $\phi_{n,s}$'s directly, but only on their inner products $\phi_{n,s}^\top \phi_{n,s'}$ for any $s$ and $s'$. These products can be easily calculated through the kernel function as $\phi_{n,s}^\top \phi_{n,s'} = K_n(z_{n,s}, z_{n,s'})$.

Since the constraints in (4.4) are enforced only for the scenario data, the learned rules do not necessarily satisfy these constraints for all $z_{n,s}$ with $s \notin \{1, \ldots, S\}$. This limitation appears also in scenario-based and chance-constrained designs [20]. Once a control rule is learned, its output at real time $t$ can be heuristically projected onto $[-\bar q_{n,t}^g, +\bar q_{n,t}^g]$.

^{g}**Implementing reactive control rules**

Our control scheme involves four steps; see also Fig. 4.1:

*T1)* The utility collects scenario data $z_{n,s}$ for all $n$ and $s$.

*T2)* The utility designs rules by solving (4.4); see chapter 5.

*T3)* Each inverter $n$ receives the $S + 1$ numbers $(a_n, b_n)$ from the utility, which describe $f_n$.

*T4)* Over the next 30 minutes and at real time $t$, each inverter $n$ collects $z_{n,t}$ and applies the rule.

The aforesaid process is explicated next. Regarding *T1)*, scenario data should be as representative as possible of the grid conditions anticipated over the following 30-min control period. One option would be to use load and solar generation forecasts. A second option would be to use historical data from the previous day at the same time, if they are representative of today's conditions. A third alternative would be to use the most recent grid conditions known to the utility. For example, if smart meter data are collected every 30 min anyway, they can be used in lieu of forecasts for the next control period.

Figure 4.1: Implementing reactive power control rules. *Left:* Data are collected from buses. *Center:* The utility designs rules and downloads them to inverters. *Right:* Inverters follow control rules fed by local and/or remote data.

The numerical tests of chapter 6 adopt the third option and use the minute-based grid conditions observed over the last 30 minutes as $S = 30$ scenarios to train the inverter rules for the upcoming 30-minute interval. Obviously, the number of training scenarios $S$ does not have to coincide with the length of the control period measured in minutes. These two parameters relate to loading conditions; feeder details; availability and quality of scenario data; and communication and computational resources. Selecting their optimal values goes beyond the scope of this work.

During *T4)*, inverter $n$ has already received $(a_n, b_n)$ and $\{z_{n,s}\}_{s=1}^{S}$ during *T3)*. Each $z_n$ may consist of local data and a few active flow readings collected from major lines or transformers. If the entries of $z_n$ are all local, the rule can be applied with no communication. Otherwise, the non-local entries of $z_n$ have to be sent to inverter $n$. If non-local inputs are shared among inverters, broadcasting protocols can reduce the communication overhead.

**Remark 4.2.** Suppose each inverter $n$ knows the training data $z_{n,s}$ for $s \in \mathcal{S}$. Function $f_n$ can be described in two ways: either through (4.5) using the data described under T3); or through (4.8)–(4.9) via $w_n$. For the second way, vector $w_n$ has $M_n$ entries in the linear case and $L_n$ entries in the nonlinear case. For the linear case, if $M_n < S + 1$, representing $f_n$ through (4.8) by $w_n$ is more parsimonious. Representation (4.5) becomes advantageous only when $L_n \gg S + 1$ in the nonlinear case.

**1 Introduction
2 Reactive Power Control
3 Preliminaries on Kernel-based Learning
4 Kernel-based Control Policies
5 Support Vector Reactive Power Control
6 Numerical Tests
7 Conclusions**


Voltage Regulation of Smart Grids using Machine Learning Tools