## Abstract

To address phasor measurement unit (PMU) vulnerability to spoofing, we propose the use of a set-valued state estimation technique known as stochastic reachability (SR)-based distributed Kalman filter (DKF) that computes secure global positioning system (GPS) timing across a network of receivers. Utilizing SR, we estimate not only GPS time but also its stochastic reachable set, which is parameterized by probabilistic zonotope (p-Zonotope). While requiring known measurement error bounds in only non-spoofed conditions, we designed a two-tiered approach. We first performed measurement-level spoofing mitigation via deviation of a measurement innovation from its expected p-Zonotope. We then performed state-level timing risk analysis via a determination of the intersection probability of the estimated p-Zonotope with an unsafe set that violates IEEE C37.118.1a-2014 standards. Finally, we validated our SR-DKF algorithm by subjecting it to a simulated receiver network to coordinate signal-level spoofing. We demonstrate improved timing accuracy and successful spoofing mitigation via the use of our SR-DKF algorithm. We also validated the robustness of the estimated timing risk as the number of receivers were varied.

- distributed Kalman filter
- GPS
- phasor measurement units
- probabilistic zonotope
- spoofing mitigation
- stochastic reachability
- timing risk

## 1 INTRODUCTION

Modern power systems analyze real-time grid stability using a widely-dispersed network of advanced devices known as phasor measurement units (PMUs) (Al-Hammouri et al., 2012). As per the American Recovery and Reinvestment Act of 2009, more than 2500 PMUs were deployed across North America as of March 2015 (Tholomier et al., 2009). Figure 1 shows a map of the dense network of these spatially-distributed PMUs (Dagle, 2018). The North American synchrophasor initiative network (NASPInet), which is a collaborative effort by the United States Department of Energy and the North American Electric Reliability Corporation, provides a secure and decentralized infrastructure for communication among the PMUs (Myrda & Koellner, 2010). Each PMU is equipped with a Global Positioning System (GPS) receiver that precisely time-stamps the voltage and current phasor values at critical power substations. A network of geographically distributed GPS receivers facilitates robust time authentication by internal cross-checking, thereby eliminating the need for an external reference receiver (Heng et al., 2015). The networked approach in power systems overlays the existing grid infrastructure and can therefore be implemented without the need for additional hardware.

PMUs are susceptible to GPS spoofing because of the low signal power and unencrypted structure of civilian GPS/L1 signals (Hofmann-Wellenhof et al., 1992). As validated in prior literature (Shepard et al., 2012), this vulnerability can be exploited by attackers who can mislead the GPS timing at one or more critical substations and cause cascading system failure. The various spoofing attacks that have the potential to disrupt GPS timing vary from simple meaconing to more sophisticated signal-level spoofing and coordinated attacks. A spoofer executing meaconing (or time jump) broadcasts earlier recorded authentic GPS signals which are then used to induce a constant lag in the victim receiver’s computed time compared to the true time. In signal-level spoofing (or time walk), the spoofer broadcasts the simulated GPS signals and will gradually steer the computed time of victim receiver away from its true time. Coordinated attack occurs when multiple spoofers manipulate the GPS signals received at one or more victim receivers. We refer readers to our earlier work (Bhamidipati & Gao, 2019) for a detailed overview of these types of attacks.

Secure GPS timing is of paramount importance for reliable PMU operations. This requires ensuring that the timing accuracy satisfies the IEEE C37.118.1a-2014 standard “Synchrophasors for Power Systems” (Martin, 2015). The IEEE C37.118.1a-2014 standard defines a metric called total vector error (TVE) which is the square root of the difference squared between the real and imaginary parts of the theoretical phasor and the estimated phasor to the ratio of magnitude of the theoretical phasor and presented as a percentage. An error in estimated timing presents as an error in the phase angle of voltage and current phasors. Furthermore, without magnitude errors, a phase angle error of 0.573° ≈ 0.01 rad corresponds to a 1% TVE, which is the maximum allowable TVE by the IEEE C37.118.1a-2014 Standard. To ensure that the phase angle error is less than 0.01 rad, we require the error in GPS timing, Δ*T*, to satisfy 2*πf*_{samp}Δ*T* < 0.01 rad, where *f*_{samp} is the PMU’s data logging frequency. For a PMU operating at *f*_{samp} = 60 Hz, which is a commonly used frequency for phasor data logging (Martin, 2015), less than 1% TVE corresponds to a GPS timing error |Δ*T*| ≤ 26.5 *μ*s.

To address concerns regarding GPS vulnerability against spoofing in power systems, accounting for timing accuracy as a standalone metric will not be sufficient. It is equally important to quantify the risk associated with the estimated timing. While there are different ways (Georgescu et al., 2019; Ochieng et al., 2003) to quantify risk, for this work, we define risk as the probability of an estimation error that exceeds a **safety alert limit (AL)**. For instance, claiming that the estimated timing is 1 *μ*s accurate does not suffice, unless this is accompanied by a certain risk, say 10^{−3}. In this work, we ensure secure GPS timing in power systems by performing not only **spoofing attack mitigation** but also **timing risk analysis**. However, there are no defined standards for acceptable risk in power systems at this time. Therefore, we set a conservative *AL* = ±1 *μ*s, as compared to the aforementioned IEEE C37.118.1a-2014 standard. This is done to ensure that the estimated GPS time complies with the commonly desired timing accuracy (Heng et al., 2015) and never exceeds the IEEE C37.118.1a-2014 threshold of ±26.5 *μ*s without raising an alert.

The state-of-the-art prior work (Bhamidipati et al., 2018; Khalajmehrabadi et al., 2018; Risbud et al., 2018) on networked anti-spoofing methods falls under the category of point-valued state estimation approaches. A point-valued state estimate denotes the expected value of the state vector for a given set of measurements. The existing literature focused on of point-valued anti-spoofing methods (Bhamidipati & Gao, 2019; Günther, 2014; Schmidt et al., 2016) includes studies that discuss signal quality monitoring, angle-of-arrival, and time-difference-of-arrival-based analyses of the incoming satellite signals, validation of the encryption codes, and attack magnitude estimation. Unlike the point-valued techniques, a set-valued approach utilizes the pre-computed (or known) error bounds on system dynamics and received measurements to propagate the set of state estimates (Shi et al., 2015). An illustration of the difference between point-valued and set-valued propagation is shown in Figure 2(a).

There is much prior available literature that focuses on set-valued theory (Shi et al., 2015; Shiryaev & Podivilova, 2015). Our current work was inspired by a set-valued formulation designed for stochastic processes known as stochastic reachability (Bujorianu, 2012). While stochastic reachability is widely used for path planning and collision avoidance in robotics applications, it has not been applied in the field of power systems. As shown in Figure 2(b), stochastic reachability computes a set of stochastic reachable states given an initial stochastic set of states and known stochastic sets of system disturbances and measurement errors. In this context, each state included in the stochastic reachable set is associated with a measure that indicates its maximum probability of occurrence. The use of stochastic reachability ensures secure state estimation while attempting to account for difficult-to-model and explicitly neglected process or measurement errors in the receiver clock dynamics, communication latency, and spoofed GPS measurements (i.e., when the associated error distribution is not known). For brevity, these errors will be referred to as unmodeled biases.

In this work, we propose a set-valued state estimation technique that can be applied to a network of GPS receivers, termed **Stochastic Reachability-Based Distributed Kalman Filter (SR-DKF)**. While the stochastic reachability formulation requires known error bounds, the attack errors introduced by GPS spoofing are unbounded and time-varying in nature. To account for this, we consider the error bounds in GPS measurements to be known factors only in non-spoofed (i.e., authentic) conditions. We pre-compute these bounds via offline empirical analysis of historical data (Bhamidipati & Gao, 2018). We parameterize the stochastic reachable set via an enclosing probabilistic hull known as probabilistic Zonotope (p-Zonotope) (Althoff et al., 2009) that can efficiently model stochastic biases. This work is based on our recent ION GNSS+2020 conference paper (Bhamidipati & Gao, 2020). Note that we modified the mathematical notations compared to those presented in our prior work (Bhamidipati & Gao, 2020) to ensure consistency with the standard notations utilized in estimation theory (Crassidis & Junkins, 2011; Simon, 2007).

In a conventional point-valued DKF approach (Talebi & Werner, 2018), each receiver compares the deviation of the received measurements against those that are expected and thereafter fuses the measurements obtained from different receivers to compute a point-valued state estimate. However, this approach is designed for use under nominal fault-free conditions with the initial states, process, and measurement noise modeled to follow a Gaussian distribution. Therefore, this approach cannot account directly for unmodeled biases in the system. Other point-valued DKF approaches (Li et al., 2013; Najafi et al., 2019) account for non-zero biases by adaptively estimating the process and measurement noise covariances. However, approximating unmodeled biases using a zero-mean Gaussian distribution can become restrictive, thereby limiting the practical applicability of these approaches. Of note, these point-valued techniques must be fused with independent modules, i.e., fault mitigation and integrity monitoring, in order to detect anomalies and provide a measure of timing risk. Our proposed SR-DKF describes an alternative approach to previously-identified methods that focus on point-valued anti-spoofing. In contrast to the point-valued techniques, our proposed SR-DKF algorithm performs secure state estimations of GPS time and spoofing attack mitigation by propagating these unmodeled biases (i.e., those with known bounds but unknown distribution) in the set-valued domain. Therefore, we estimate not only the point-valued mean but also the stochastic reachable set of GPS times that are later utilized to compute the associated risk of violating the pre-defined safety AL. Essentially, our proposed SR-DKF algorithm combines the three important steps, namely, secure state estimation, fault mitigation, and timing risk analysis into a single integrated pipeline, thus ensuring tolerance against anomalies. Essentially, our proposed SR-DKF algorithm provides a sequential estimate of stochastic timing output that remains valid even in the presence of anomalies, wherein the validity is quantified based on the estimated measure of timing risk. To the best of our knowledge, there is no prior literature that addresses methods used to perform secure state estimation using stochastic reachability nor are there any that focus on quantifying the timing risk associated with the estimated GPS time.

### 1.1 Design Aspects of the Proposed SR-DKF Algorithm

Here we present the three main design aspects of the proposed SR-DKF algorithm, including i) a spatially-distributed network of static GPS receivers, ii) a distributed processing framework, and iii) the set representation via p-Zonotopes, which are described as follows:

#### 1.1.1 A spatially-distributed network of static GPS receivers

We utilize the existing grid infrastructure shown earlier in Figure 1 to consider a small group of spatially-distributed GPS receivers (approximately four to eight). This framework provides two advantages. First, by considering a spatially-distributed network, we increase the measurement redundancy while minimizing the probability of multiple receivers being simultaneously spoofed. Second, given that the power substations are static, we can utilize the known receiver positions to aid the proposed SR-DKF algorithm.

#### 1.1.2 A distributed processing framework

We opted for distributed processing to minimize the reaction time to attacks and also to localize the extent of failure. This is justified because the GPS reference time for synchronizing the PMUs is no longer reliable during spoofing. Each receiver in our distributed network asynchronously broadcasts the latest GPS measurement data to its neighbors, thereby reducing communication latency.

We consider a combination of an independent clock model at each power substation and communication links between various power substations to provide short-term time synchronization within the network. By contrast, GPS receiver algorithms (in this work, our proposed SR-DKF algorithm) play a key role in maintaining a bound on both the synchronization errors between any two receivers and the absolute time at each receiver in the network. This indicates a standard implementation of PMUs/power grids since the development of newer clocks, for example, the chip-scale atomic clock (CSAC) and communication protocols such as network-time protocol (NTP) and precise-time protocol (PTP) that can provide up to tens of microsecond-level accuracy over shorter time spans (Allnutt et al., 2017; Wang et al., 2008; Zhan et al., 2017).

#### 1.1.3 A set representation via p-Zonotopes

Among all available set representations (Althoff & Dolan, 2014) such as zonotope, ellipsoid, polytope, and others, our choice of p-Zonotope was justified because it can efficiently enclose the stochastic biases in GPS measurements. A p-Zonotope can enclose a variety of distributions such as mixture models, distributions with uncertain/time-dependent mean, and non-Gaussian/unknown distributions, among others. A one-dimensional (1D) example of a p-Zonotope that encloses multiple Gaussian distributions with varying mean and covariance is illustrated in Figure 3(a). The probabilistic nature of p-Zonotopes blends well with the Gaussian-centric framework of DKF.

Unlike the conventional techniques that propagate point-valued state estimates, the proposed SR-DKF algorithm utilizes pre-computed error bounds on the receiver clock dynamics and measurements to propagate the p-Zonotope of the state estimate recursively. An example of a two-dimensional (2D) p-Zonotope is shown in Figure 3(b). In this work, the 2D p-Zonotope is used to represent the state estimate that is introduced later in Section 2.

### 1.2 Contributions

The main contributions of this paper are as follows:

Utilizing an efficient set representation known as p-Zonotope (Althoff et al., 2009) to parameterize the stochastic reachable set and account for stochastic biases, we develop a

**set-valued DKF framework**to compute secure GPS time for a network of receivers.Considering offline-estimated measurement error bounds in non-spoofed (i.e., authentic) conditions, we design a two-tier approach that performs:

**Measurement-level spoofing attack mitigation**by adaptive evaluation of the deviation of point-valued GPS measurements from the set-valued domain of measurement innovation. Therefore, the proposed SR-DKF algorithm is resilient to a variety of GPS attacks, including meaconing, signal-level spoofing, and coordinated attacks, among others.**State-level timing risk analysis**by estimating the intersection probability of the estimated set of stochastic reachable states (represented by p-Zonotope) with an unsafe set, i.e., a set that violates the pre-defined safety AL (Martin, 2015).

For a simulated setup, we demonstrate the successful mitigation of both simple meaconing and sophisticated signal-level spoofing attacks. During a simulated case of coordinated attack that affects multiple receivers, we show an improved state estimation accuracy via the proposed SR-DKF algorithm compared to conventional point-valued and adaptive DKF. We also demonstrate the robustness of the proposed SR-DKF algorithm by analyzing the variation in timing risk with respect to the number of receivers and the magnitude of spoofing attacks.

The remaining sections of the paper are organized as follows: Section 2 describes the preliminaries of p-Zonotopes; Section 3 outlines the system model; Section 4 describes the proposed SR-DKF algorithm; Section 5 experimentally validates the improved state estimation accuracy of GPS time computed via the proposed SR-DKF algorithm as compared to that estimated by point-valued DKF techniques, and also computes a robust measure of timing risk associated with the estimated GPS time; and Section 6 concludes the paper.

## 2 PRELIMINARIES OF P-ZONOTOPES

A p-Zonotope , as seen in Equation (1) and shown earlier in Figure 3(a), is characterized by three parameters (Althoff et al., 2009), namely, a) the mean of its center, denoted by , b) the uncertainty in the center, represented by the generator matrix , and c) the covariance of Gaussian tails, denoted by .

1

Intuitively, a p-Zonotope with a certain center mean, i.e., zero center uncertainty *G* = **0**, represents a Gaussian distribution that overbounds multiple probability density functions (PDFs) with the same center mean *c*. This is represented by in Equation (2).

2

where {·} denotes the set representation, denotes pairwise independent Gaussian distributed random variables, and denotes the associated generator vectors of , such that with [·;·] as the horizontal concatenation operator. Note that, the covariance ∑ of a p-Zonotope defined earlier in Equation (1) is equivalent to .

Similarly, a p-Zonotope with zero covariance Σ = **0** simplifies to a zonotope (Althoff & Dolan, 2014) that encompasses all the possible values of the center, such that the generator matrix of *Z* is *G* = [*g*^{1}, ⋯, ** g^{e}**]. This case is mathematically represented by

*Z*in Equation (3).

3

where denotes independent numbers. Note that the parameter *G* defined in Equations (1) and (3) is not the same as defined in Equation (2).

The p-Zonotope is a combination of both sets, i.e., , where is a set-addition operator. Alternatively, , where represents the PDF of distributions represented by and denotes the expectation operator. From Equations (1)–(3), note that depends on both *G* and . Additionally, unlike a PDF, the area enclosed by a p-Zonotope is not equal to one. More details regarding the characteristics of Zonotopes and p-Zonotopes can be found in the prior literature (Alanwar et al., 2019; Althoff et al., 2009).

For example, the 1D p-Zonotope shown in Figure 3(a) is formulated by considering the mean of distributions as lying within [−2, 2] and the covariance as lying between [0, 4]. Thereafter, we formulate the three parameters of a 1D p-Zonotope as follows: . Similarly, the 2D p-Zonotope shown in Figure 3(b) is formulated by considering the bounds on the mean and covariance of the first state to be [−2, 2] and [0, 4], respectively, and of the second state to be [−10, 10] and [0, 6], respectively. Mathematically, we represent this 2D p-Zonotope by and .

To perform set operations on stochastic reachable sets using p-Zonotopes, we require the following important properties (Althoff et al., 2009; Girard, 2005):

**Minkowski sum of addition:**The addition of two p-Zonotopes and is given by4a

4b

where ⊕ is known as the Minkowski sum operator.

**Linear map:**Given a matrix and a p-Zonotope ,4c

**Translation:**Given a vector and a p-Zonotope ,4d

**Order reduction:**For any p-Zonotope with and a given maximum allowable order of*e*_{max}, the p-Zonotope after order reduction is given by4e

where denotes the generator matrix after order reduction (Girard, 2005). To achieve this order reduction, the generators of

*G*_{1}are first sorted based on the Frobenius norm. Thereafter, one holds the first (*e*_{max}− 1) columns of*G*_{1}which are denoted by*G*_{>}, and encloses the remaining columns of*G*_{1}, which are denoted by*G*_{<}, into a smallest box (interval hull), such that where*rs*(·) denotes the row-sum of a matrix.γ

**-confidence:**For with and*Z*= 〈,**c***G*〉, transforming a p-Zonotope to a confidence zonotope, which is denoted by*Z*, is given by_{γ}4f

where γ denotes the pre-defined parameter at the threshold of the tail of a p-Zonotope. This property is used to estimate the size of the p-Zonotope for a given γ. Note that, is well-known, where erf(·) denotes the error function (Oldham et al., 2008). Given

*l*probabilistic generators in , the probability of lying outside the γ-confidence set is . When the state vector is considered as clock bias and clock drift, this variableis analogous to the term**r***timing risk*associated with the γ-confidence set. As a simple example, for a predefined parameter γ = 3.5 and number of generators*l*= 2, the timing risk*r*= 1.9*e*^{−3}. Extending this further, Section 4.4 of our proposed SR-DKF algorithm computes the timing risk associated with violating the pre-defined safety AL. For a more theoretical explanation regarding risk calculation for p-Zonotopes, we refer the readers to previous work (Althoff et al., 2009).

## 3 SYSTEM MODEL

Our system model comprises a network of power substations. Each power substation is equipped with a clock, PMU, GPS a receiver antenna, a communication transmit/receive antenna and a processing unit. In this work, for simplicity, we use the terms GPS receiver and power substation interchangeably. In the proposed SR-DKF algorithm, we mathematically represent the inter-connected network of power substations, each mounted with one GPS receiver, as an un-directed graph , where *M* indicates the set of nodes and *E* denotes the set of connections between the nodes. Each node represents a GPS receiver and each edge indicates a secure bidirectional communication link between the power substations. The total number of receivers in the network are |*M*|, where |·| denotes the cardinality of a set. Our SR-DKF algorithm does not require the network of power substations to be fully connected, hence,

The neighborhood set of any *i*th receiver ∀** i** ∈

**is represented by**

*M***and indicates the subset of nodes among the**

*M*_{i}**nodes that can communicate with the**

*M**i*th receiver, including itself. Therefore, the associated cardinality of the neighborhood set is denoted by |

**| where |**

*M*_{i}**| ≥**

*M*_{i}**1**∀

**∈**

*i***. Every receiver in the network has prior knowledge regarding the pre-computed static position of all the receivers in its neighborhood set.**

*M*We define the true global clock time (which is expressed in GPS time) at the *i*th receiver and at the *k*th time iteration to be . Furthermore, based on prior work (Coleman & Beard, 2020; Khalife & Kassas, 2017), we define a local state vector at each *i*th receiver as , where *cδt _{i,k}* denotes the receiver clock bias, denotes the receiver clock drift and

*c*represents the speed of light. The global clock time and the local state vector

*are related as*

**x**_{i,k}5

where denotes the local measured time output (known) at the *i*th receiver. Note that, while the local measured time can be determined from the power substation at each time iteration, the global clock time is unknown and needs to be estimated. In practice, one way to estimate local measured time is to propagate the previously authenticated global time (e.g., when estimated timing risk from our proposed SR-DKF algorithm is less than the user-acceptable value) at a past time iteration over shorter time spans using independent clock model at each power substation or via communication links between base stations.

As explained in Section 1, the estimate of global clock time at each *i*th receiver is used by the PMU for time-tagging the recorded phasor values. Therefore, the accuracy of the global clock time depends on the accuracy of the estimated receiver’s clock bias *cδt _{i,k}*, which in-turn depends on the estimated clock drift . When a receiver is spoofed, the estimate of local state vector

*x*is manipulated, thereby providing a misleading estimate of global clock time. Given this direct dependency, we investigate a distributed state estimation framework that considers

_{i,k}*to be the state vector at the*

**x**_{i,k}*i*th receiver.

The state-space model representing the true system at the *i*th receiver ∀*i* ∈ *M* is given by

6

with

where

–

denotes the 2**z**_{i,k}*N*× 1 GPS measurement vector and_{i}*N*denotes the number of GPS satellites at the_{i}*i*th receiver. Here, and denote the received pseudorange and Doppler value from the*n*th satellite ∀*n*∈ {1, ⋯,*N*}, respectively. Also,_{i}and denote the known position and zero velocity of the**p**_{i}*i*th static receiver, respectively, andand denote the position and velocity of the**p**^{n}*n*th satellite computed from the broadcast navigation message, respectively.–

*F*denotes the 2 × 2 state transition matrix, such that , where_{i}*δt*denotes the update rate.–

*H*denotes the 2_{i}*N*× 1 GPS measurement model, such that , where_{i}denotes an**I**_{a}*a*×*a*identity matrix and**0**_{a}denotes an*a*×*a*zero matrix;–

*ω*denotes the 2 × 1 process noise vector that includes all the clock-related errors that depict its short-term and long-term stability characteristics;_{i k}–

*ν*denotes the 2_{i,k}*N*× 1 measurement noise vector that is GPS pseudoranges and Doppler values, caused by errors in satellite clock corrections and ionospheric and tropospheric effects, among others._{i}

At any *i*th receiver, GPS measurements are received from all receivers in its neighborhood set *M _{i}* along with their estimated attack status, which quantifies their confidence in the authenticity of their received measurements. A detailed explanation regarding estimations of attack status is provided later in Section 4.3. In addition, each

*j*th receiver

*j*∈

*M*sends the known local measured time to the

_{i}*i*th receiver, which represents the time at which the message is transmitted.

Based on this, the state-space model representing the relationship between the global clock time of the *i*th receiver ∀*i* ∈ *M* and any *j*th receiver belonging to its neighborhood set ∀*j* ∈ *M _{i}* is given by

7

where *D _{ij,k}* denotes the average transmission delay in sending the data packet from any

*j*th receiver at its local measured time to the

*i*th receiver. The average transmission delay

*D*can be estimated from the two-way exchange of messages over time. Also,

_{ij,k}*ω*denotes the zero-mean random term associated with the transmission delay between the

_{ij,k}*i*th and the

*j*th receivers. Without loss of generality, we can consider the data packet from each

*j*th receiver in the neighborhood set

*j*∈

*M*to be received at the

_{i}*i*th receiver at its own local measured time with

*k*∈[

_{j}*k*,

*k*+ 1). For detailed explanation regarding the clock modeling for distributed framework, the reader is referred to previous work (Bhamidipati et al., 2022; Mo et al., 2013)

From Equations (5) and (7), we formulate a true system that depicts the relationship between the clock biases at the *i*th receiver and the *j*th receiver ∀*j* ∈ *M _{i}* as

8

At any *i*th receiver, we formulate the relationship between the state vector * x_{i,k}* and the measurement model of the data sent from any

*j*th receiver ∀

*j*∈

*M*as follows:

_{i}9

where denotes the measurement vector, such that , which can be observed from Equations (6), (7), and (8). Also, *H _{ij}* denotes the associated measurement model such that with

*N*as the number of visible GPS satellites at the

_{j}*j*th receiver and

*v*denotes the noise in the measurement vector

_{ij,k}*that includes noise in the*

**z**_{ij,k}*j*th receiver GPS pseudorange and the transmission delay

*ω*. Therefore, the

_{ij,k}*i*th receiver constructs the measurement vector

*using the data packet sent from the*

**z**_{ij,k}*j*th receiver comprising and .

Intuitively, the measurement vector * z_{ij,k}* provides a noisy estimate of the

*i*th receiver’s clock bias given the GPS measurements of the

*i*th receiver. On the other hand, the measurement vector

*provides a noisy estimate of the*

**z**_{ij,k}*i*th receiver’s clock bias given the GPS measurements of the

*j*th receiver. At any

*i*th receiver, the measurement model

*H*depicts that only the GPS pseudoranges from the

_{ij}*j*th receiver are utilized in our proposed DKF formulation, while Doppler values are discarded. This is because the clock drift at the

*i*th receiver is independent of the clock drift at any

*j*th receiver; thus, we cannot utilize only one data packet sent from any

*j*th receiver to the

*i*th receiver to construct a noisy estimate of the

*i*th receiver’s clock drift given the

*j*th receiver’s GPS measurements. However, these additional measurements can be constructed by sending two (rather than one) consecutive data packets from the

*j*th receiver to the

*i*th receiver and thereafter by estimating the variation in the difference between the associated local measured times at both the receivers across the data packets. This procedure is explained in Section 3.2 of Mo et al., 2013 and is left as a potential future extension of our work.

For the system model described in Equations (5)–(9), we refer the readers to Appendix A for the mathematical formulation of the point-valued DKF comprising of two steps, i.e., time update and measurement update. The two point-valued DKF techniques (conventional and adaptive) are used as benchmarks for validating the proposed SR-DKF algorithm, which is discussed via simulated experiments in Section 5.

In this work, we feature several specific design considerations/assumptions. First, we consider the power substations as independent units that are installed in open-sky areas and geographically separated by large distances. Also, we make no assumption regarding whether the power substations in the network have the same hardware specifications, i.e., they can have GPS receivers of varying quality. Therefore, we do not consider the receiver clock drift across the network to be the same. We also do not account for correlations in measurement errors across receivers, i.e., we model them as pairwise independent. Given the open-sky conditions, we also assume that the multipath effects have a negligible impact on received measurements. Furthermore, our SR-DKF algorithm does not account for the presence of nominal orbit errors or Signal-In-Space (SIS) anomalies in the broadcast navigation message that can potentially affect multiple receivers and cause correlated errors. These assumptions are justified because there exists rich literature (Jiang & Groves, 2012; Ochieng et al., 2003; Xu et al., 2020) that addresses other sources of measurement error, including SIS anomalies, atmospheric effects, multipath, and others that can be independently added as a pre-processing step to the proposed SR-DKF algorithm.

## 4 THE PROPOSED SR-DKF: A SET-VALUED SECURE STATE ESTIMATION ALGORITHM

Based on the design aspects discussed in Section 1.1, we begin by outlining the architecture of the proposed SR-DKF algorithm and later describe the detials in a step-by-step fashion. While our current mathematical formulation considers only GPS measurements, the proposed SR-DKF can be easily generalized to incorporate multiple Global Navigation Satellite System (GNSS) constellations. This will require the system model to be updated to incorporate clock bias associated with all GNSS constellations to be considered with the state transition matrix and the measurement model modified accordingly. The proposed SR-DKF algorithm can also be updated to incorporate higher-order clock error models (Coleman & Beard, 2020) by considering additional states and modifying the state transition matrix accordingly.

Figure 4 presents a flowchart of the proposed SR-DKF algorithm that is executed independently at the *i*th receiver/power substation, and comprises four main modules, namely time update of set-valued DKF, spoofing attack mitigation, measurement update of set-valued DKF, and timing risk analysis.

First, we perform a set-valued time update to compute the predicted p-Zonotope of the local state vector. Next, at each *i*th receiver, we independently analyze the received measurements from visible satellites in the set-valued domain by comparing them against the predicted p-Zonotope and known measurement error bounds in authentic conditions. Based on this, we estimate a metric termed as **attack status** that lies between 0 − 1, where 0 indicates that the received measurements comply with the known measurement error bounds in authentic conditions and thus are likely to be authentic whereas 1 indicates that the measurements are likely spoofed. In an intuitive sense, the attack status adaptively scales the known p-Zonotope of measurement error bounds in authentic conditions to account for the received measurements. Thereafter, the GPS measurements and their estimated attack status are asynchronously broadcast across receivers within the network. In the measurement update at *i*th receiver, the GPS measurements and their attack status from neighbors, i.e., *j* ∈ *M _{i}*, are collectively processed to compute the corrected p-Zonotope of the local state vector. The center mean of the corrected p-Zonotope along with the measured time at the

*i*th receiver is used to estimate the point-valued GPS time that is later provided to the PMUs. The center uncertainty and covariance of the corrected p-Zonotope are further analyzed for the probability of their intersection with an unsafe set to compute the associated timing risk. In this context, we define

**unsafe set**by a set of states, i.e., receiver clock bias and clock drift that violate the pre-defined safety AL that was defined earlier in Section 1. Thereafter, the process then repeats for next iteration.

In our proposed SR-DKF algorithm, we chose to perform spoofing attack mitigation; however, another potential approach includes attack detection/exclusion, provided there are sufficient number of non-spoofed measurements in the receiver network.

### 4.1 During Initialization: Pre-Defined Process and Measurement Error Bounds in Non-Spoofed Conditions

As explained earlier in Section 1 and from prior literature (Bhamidipati & Gao, 2018), it is justified to consider the error statistics to be known and bounded during non-spoofed (i.e., authentic) conditions. We can perform offline empirical analysis to calculate the process and measurement noise bounds for each GPS receiver under authentic conditions by analyzing the associated historical GPS data. Specifically, using the three-parameter representation of p-Zonotopes we can efficiently represent the different non-zero mean but bounded stochastic errors in a process and measurement model. Some examples of error sources were discussed earlier in Equation (6). Additional details regarding process and measurement error bounds considered in our experimental validation are provided in Section 5. For all receivers in the network, i.e., ∀*i* ∈ *M*, we model the process and measurement noise bounds as p-Zonotopes using Equation (1), which are represented by and , respectively, as follows:

10

where and denote the generator and covariance matrices of p-Zonotope that represents process noise bounds. Similarly, and denote the generator and covariance matrices of p-Zonotope that represents measurement noise bounds. If one considers the center of p-Zonotope for the process and measurement noise bounds to be a zero vector, the biases in errors associated with the empirical analysis are modeled via and , respectively. Furthermore, the covariance matrices and represent the characteristics of p-Zonotopes and are different from the point-valued process noise covariance, which is denoted by *Q _{i,k}*, and point-valued measurement covariance, which is denoted by

*R*. More details regarding the matrices

_{i,k}*Q*and

_{i,k}*R*are provided in Appendix A.

_{i,k}### 4.2 Set-Valued DKF

We formulate the set-valued DKF by applying stochastic reachability to the point-valued DKF as described in Appendix A. Similar to point-valued DKF, the set-valued DKF also performs time and measurement updates at each *i*th receiver to compute the predicted and corrected stochastic reachable sets of the state vector * x_{i,k}*, respectively. To perform spoofing attack mitigation and timing risk analysis across a network of GPS receivers, we analyze the predicted and corrected stochastic reachable sets of the state estimation error, respectively. To represent the predicted and corrected point-valued state estimates at the

*k*th iteration by and , respectively, the associated estimation error is given by and , where and .

We represent the predicted and corrected stochastic reachable sets of state estimation error via predicted and corrected p-Zonotopes, respectively. The predicted and corrected p-Zonotopes of the state estimation error denoted by and , respectively, are given by

11

where and denote the generator and covariance matrices of the corrected p-Zonotope of the state estimation error. The corrected p-Zonotope is estimated during the measurement update of set-valued DKF, which is explained later in Section 4.2.2. Similarly, and denote the generator and covariance matrices of the predicted p-Zonotope of the state estimation error. This predicted p-Zonotope is estimated during the time update of set-valued DKF, which is explained later in Section 4.2.1. Note that the aforementioned covariance matrices and represent the Gaussian characteristics of the p-Zonotopes and are different from the predicted point-valued state covariance matrix, which is denoted by , and the corrected point-valued state covariance matrix, which is denoted by . More details regarding the matrices and are provided in Appendix A.

At the *k*th iteration, we utilize the translation set property from Equation (4d) to represent the predicted and corrected p-Zonotopes of the local state vector and , respectively, as and . Therefore, the predicted and corrected p-Zonotopes of the local state vector are given by Equation (12a) and Equation (12b), respectively. Note that translating the center of p-Zonotope as seen in Equation (12) does not change the generator and covariance matrix. Therefore, the generator *G* and covariance matrix ∑ of the predicted (or corrected) p-Zonotope of the state estimate * x_{i,k}* and state estimation error Δ

*are the same. From Equations (12a)–(12b) and prior literature (Althoff et al., 2009; Shi et al., 2015), it has been established that the center mean of predicted and corrected p-Zonotopes is equivalent to their point-valued DKF estimates.*

**x**_{i,k}12a

12b

#### 4.2.1 Predicted p-Zonotope of the Local State Vector via Time Update

At the *k*th iteration, we perform time update of set-valued DKF independently at each *i*th receiver in the network ∀*i* ∈ *M* to compute the predicted p-Zonotope of the local state vector. As explained earlier in Equation (12a), the center mean of the predicted p-Zonotope of the local state vector is the same as the point-valued estimate of the conventional DKF, whose mathematical formulation is described in Appendix A. Thereafter, we estimate the generator and covariance matrices of the predicted p-Zonotope of the state estimation error by utilizing the corrected p-Zonotope of the previous (*k* – 1) th iteration and the p-Zonotope of the process noise bound. To execute these steps, we first formulate the point-valued state estimation error as

13

Given that are independent random variables, we then apply the properties of p-Zonotopes discussed earlier in Equations (4a)–(4b) for converting the point-valued representation of Equation (13) to estimate the predicted p-Zonotope as seen in Equations (14a)–(14b). We refer readers to relevant literature (Althoff & Dolan, 2014; Althoff et al., 2009) for examples of the conversion of point-valued state space Equations to set-valued formulations.

14a

such that

14b

For the example shown in Figure 5, the predicted p-Zonotope obtained after the Minkowski sum in Equation (14a) is indicated by the Parula colormap. The predicted p-Zonotope of the state estimation error is used to perform spoofing attack mitigation as described in Section 4.3.

#### 4.2.2 Corrected p-Zonotope of the State Estimation Error via Measurement Update

In the measurement update executed at *i*th receiver, we compute the corrected p-Zonotope of the local state vector. As explained earlier in Equation (12b), the center mean of the corrected p-Zonotope of the local state vector is the same as the point-valued estimate of the conventional DKF, whose mathematical formulation is described in Appendix A. Thereafter, we estimate the generator and covariance matrices of the corrected p-Zonotope of the state estimation error by utilizing the p-Zonotopes of measurement noise bounds associated with all the receivers in the neighborhood set *M _{i}* and predicted p-Zonotope of the state estimation error. In our distributed framework, as explained earlier in Section 3, each

*i*th receiver utilizes both pseudoranges and Doppler values received at its location but utilizes only the pseudoranges received at other receivers in the neighborhood set (

*M*).

_{i}– iGiven that we only know the measurement error bounds under authentic conditions, we designed an adaptive framework to perform the measurement update of the set-valued DKF. The adaptive gain matrix associated with *i*th receiver measurement is denoted by and that associated with received measurements from neighborhood set *j* ∈ (*M _{i} – i*) is denoted by . Further details regarding the adaptive gain matrix formulation are provided in the spoofing attack mitigation module, which is explained later in Section 4.3. To execute this function, we first formulate the point-valued state estimation error as

15

where * z_{i,k}* and

*H*denotes the received measurements from

_{i}*i*th receiver in the network and the associated measurement model, respectively, as defined earlier in Equation (6). As described in Appendix A,

*and*

**z**_{ij,k}*H*denote the received measurements from other receivers in the neighborhood set of the

_{ij,k}*i*th receiver and the associated measurement model, respectively, as defined earlier in Equation (9).

Similar to the time update of set-valued DKF explained earlier in Section 4.2.1, we observe that are independent random variables. Based on this observation, we the apply the properties of p-Zonotopes discussed earlier in Equations (4a)–(4b) for converting the point-valued representation of Equation (15) to set-valued stochastic reachable sets as seen in Equation (16a).

16a

We further simplify using Equation (4c) to obtain Equation (16b). Intuitively, the estimated generator and covariance matrices, i.e., and , provide bounds on the unknown state estimation error associated with the point-valued state estimate .

16b

In the example shown in Figure 6, the corrected p-Zonotope obtained after the Minkowski sum in Equation (16a) is indicated by a Parula colormap. Also, given that the size of the generator matrix shown in Equation (16b) is increasing during each measurement update, we apply the set property of p-Zonotope discussed earlier in Equation (4e) to reduce the order of the generator matrix of the corrected p-Zonotope of the state estimation error . The corrected p-Zonotope of the state estimation error is analyzed to perform timing risk analysis as described in Section 4.4.

### 4.3 Spoofing attack mitigation and data transfer

Before performing measurement updates at the *i*th receiver and *k*th iteration, we independently utilized the predicted p-Zonotope of the state estimation error from Equation (14) and the pre-computed measurement noise bound to estimate the p-Zonotope of expected measurement innovation. For this application, we utilized the point-valued measurement innovation given by . More details are discussed in Appendix A. We then apply the set properties of p-Zonotopes shown in Equations (4a) and (4c) on the point-valued measurement innovation *ϵ*_{i,k} to compute the p-Zonotope of expected measurement innovation, which is denoted by , as follows:

17

The p-Zonotope of expected measurement innovation maps each point-valued measurement innovation *ϵ*_{i,k} to a scalar value that is denoted by *α _{i,k}*. Based on this, at each

*i*th receiver, we independently compute a metric termed as attack status by normalizing the obtained value

*α*with the value corresponding to the center mean of the p-Zonotope and subtracting this value from one. Thus, the estimated attack status lies between [0, 1]. As seen earlier in Equation (10), the pre-computed process and measurement noise bounds are defined for non-spoofed conditions. Therefore, in a non-spoofed condition, such as that shown in Figure 7(a), the point-valued measurement innovation

_{ij,k}

*ϵ*_{i,k}and its p-Zonotope of expected measurement innovation are in close agreement, and low attack status is obtained. However, during spoofing, as shown in Figure 7(b), the point-valued measurement innovation does not comply with this estimated p-Zonotope, and thus a high value of is observed. Note that our aforementioned analysis on spoofing attack mitigation is conducted in vector domain. Therefore, even when a subset of visible GPS satellites is affected by spoofing, the estimated attack status remains high at , indicating low overall confidence in the

*i*th receiver.

Each *i*th receiver in the network *i* ∈ *M* broadcasts its received GPS measurement * z_{ij,k}* as was defined earlier in Equation (15), as well as the estimated attack status to all the receivers in the neighborhood set

*j*∈ (

*M*). The set (

_{i}– i*M*) simply implies that the

_{i}– i*i*th receiver need not broadcast GPS measurements to itself. Similarly, each

*i*th receiver also receives the GPS measurements

*and their associated attack status from one or more receivers in its neighborhood set*

**z**_{ij,k}*j*∈ (

*M*). In an intuitive sense, the attack status obtained from each

_{i}– i*j*th receiver in the neighborhood set indicates its belief/confidence in the received GPS measurement

*. Thereafter, at the*

**z**_{ij,k}*i*th receiver, the attack status is used to compute the adaptive gain matrices and as follows:

18

where *R _{ij,k}* denotes the measurement covariance associated with the received measurements

*at the*

**z**_{ij,k}*i*th receiver from

*j*th receiver ∀

*j*∈ (

*M*). The mathematical formulation of

_{i}– i*,*

**z**_{ij,k}*R*and

_{ij,k}*H*were described earlier in Equation (15). More details regarding the mathematical formulation of the point-valued state covariance matrix can be found in Appendix A. Formulated adaptive gain matrices and are utilized to perform the set-valued measurement update of the SR-DKF algorithm, as was described earlier in Equation (16).

_{ij,k}By comparing Equation (18) and Equation (16), the adaptive gain matrices scale the known p-Zonotope of measurement error bounds in authentic conditions adjusts to comply with the received measurements. This is reflected in Figure 7(a) that represents non-spoofed conditions, wherein the adaptive scaled p-Zonotope remains close to its original size (i.e., before scaling) with the received measurement innovation captured by the center uncertainty (i.e., generator matrix). Similarly, in Figure 7(b) which represents the spoofed conditions, the size of the adaptively-scaled p-Zonotope increases significantly with the center uncertainty accounting for the corresponding received measurement innovation. Note that, unlike the adaptive point-valued DKF techniques discussed earlier in Section 1 that accounts for stochastic biases as a point-valued covariance, the proposed SR-DKF algorithm models stochastic biases in set-valued domains using center uncertainty (via the generator matrix).

### 4.4 Timing Risk Analysis

We evaluate the probability that the corrected p-Zonotope of state estimation error derived in Equation (16) falls within an unsafe set, i.e., a set of states that violate the pre-defined safety *AL* = ±1 *μ*s. As shown in both Figures 8(a) and 8(b), the unsafe set, which is denoted by , is represented by two gray strips, i.e., and .

We perform timing risk analysis by over-approximating the corrected p-Zonotope using a finite series of polytopes (Althoff et al., 2009). To over-approximate the corrected p-Zonotope efficiently, we first apply the γ-confidence set property of p-Zonotope that was discussed earlier in Equation (4f) to compute a series of (*L* +1) confidence zonotopes, which are denoted by , for the corresponding pre-defined values of γ = *m*_{0}, *m*_{1}, *m*_{2}, ⋯, *m _{L}* where

*m*

_{0}>

*m*

_{1}>

*m*

_{2}> ⋯ >

*m*. Note that the confidence zonotope

_{L}*Z*

_{m0}represents the upper bound at the threshold of the tail of the corrected p-Zonotope , and therefore, the larger the value of

*m*

_{0}, the more exact the representation of the corrected p-Zonotope by a confidence zonotope will be. Furthermore, as the value of

*L*increases, the finite series of polytopes more closely approximate the corrected p-Zonotope, but at the expense of an increased computational cost.

Using the (*L* +1) confidence zonotopes *Z*_{m0} to *Z _{mL}*, we formulate a series of

*L*polytopes, which are denoted by

*D*

_{1}to

*D*, as follows: , where the set operation for any given sets

_{L}*S*and

_{a}*S*. Further details regarding the over-approximation of p-Zonotopes by a series of polytopes are discussed in Althoff et al., 2009. An example of the over-approximation of p-Zonotope by four polytopes

_{b}*D*

_{2}to

*D*

_{4}is shown in Figure 8(a). We then compute the area of intersection of each

*l*th polytope

*D*∀

_{l}*l*∈ {1, ⋯,

*L*} with the unsafe set . Figure 8(b) shows the top-view of an over-approximated series of polytopes and an example of the intersection region of

*D*

_{3}polytope with the unsafe set .

In the context of p-Zonotopes (Althoff et al., 2009; Georgescu et al., 2019), the timing risk at *i*th receiver, which is denoted by *r _{i,k}*, is theoretically defined as

19

For computational tractability, we re-wrote the timing risk *r _{i,k}*, which is seen in Equation (20), as a summation of two terms. The first term represents the risk associated with the confidence zonotope

*Z*

_{m0}as exceeding the unsafe set and is formulated based on the

*γ*-confidence property of p-Zonotope described earlier in Equation (4f). The second term denotes the sum of the intersection area across polytopes

*D*, which is denoted by , with the unsafe set times the maximum probability of

_{l}*D*th polytope, which is denoted by

_{l}*p*

_{max,l}. The maximum probability of

*D*th polytope is at its top face and equals the probability of the corrected p-Zonotope at that level. Note that, since the area under the p-Zonotope does not equal one, the associated timing risk

_{l}*r*described in Equation (20) also does not necessarily need to be less than one. Therefore, based on the theoretical formulation explained in Althoff et al., 2009, if

_{i,k}*r*< 1, the estimated timing exceeds the safety AL with a risk of

_{i,k}*r*whereas if

_{i,k}*r*≥ 1 then the estimated timing exceeds the safety AL with a risk of 1.

_{i,k}20

### 4.5 Implementation Details

Algorithm 1 summarizes the implementation steps of the proposed SR-DKF algorithm described earlier in Sections 4.2, 4.3, and 4.4. To compute the adaptive gain matrix at each *k*th iteration, we also propagate the point-valued state covariance matrix with the predicted estimate denoted by and the corrected estimate denoted by . More details regarding the formulation of predicted and corrected point-valued state covariance can be found in Appendix A. During initialization, i.e., at iteration *k* = −1, we denote the initial point-valued mean and covariance of the state vector by and , respectively. Furthermore, we consider all receivers to be authentic during initialization. Based on this, we represent the initial p-Zonotope of the state estimation error to be zero-mean center, with a non-zero generator and over-bounding covariance matrices. For simplicity and without the loss of generality, we process the measurements from different receivers in a single neighborhood set simultaneously. Alternatively, for a more generalized framework, one can process the measurements in an asynchronous manner, i.e., execute the proposed SR-DKF algorithm at every time sub-iteration when measurements are received from any *j*th receiver in the neighborhood set.

## 5 RESULTS AND ANALYSIS

We validate the robustness of the proposed SR-DKF algorithm to not only mitigate the effects of spoofing but also to estimate the secure GPS timing and its associated timing risk across all the receivers in a network. We perform set operations of stochastic reachability and transform across various set representations, including polytopes, zonotopes, p-Zonotopes, and others, using a publicly-available MATLAB toolbox, known as the COntinuous Reachability Analyzer (CORA) (Althoff, 2016).

As shown in Figure 9, we designed a simulated network of seven GPS receivers, *|M*| = 7, that are spatially distributed across the US. The network of GPS receivers indexed from *Rx* : 1 to *Rx* : 7 are located in 1) Stanford, CA; 2) Atlanta, GA; 3) Urbana, IL; 4) Boulder, CO; 5) Austin, TX; 6) Boston, MA; and 7) Auburn, AL. We utilized a C++ language-based software-defined GPS simulator known as GPS-SIM-SDR (Bhamidipati & Gao, 2019; Ebinuma, 2018) to simulate the raw GPS signals received at each location in the network. We then post-processed the simulated GPS signals using a MATLAB-based software-defined radio known as SoftGNSS (Paul, 2011). In our simulation setup, we set the average transmission delay to a pre-surveyed/constant value, which is also considered in modeling the measurement vector *z _{ij,k}* in the system model of our proposed SR-DKF algorithm.

We defined the simulated errors observed in the system during authentic conditions as having the following characteristics:

For the process noise in local state vector (the first state

*cδt*), the mean and covariance of the time-varying Gaussian distribution lies between [−36 m, 36 m] and [0 m, 60 m], respectively; similarly, for the process noise in its clock drift (the second state ), the time-varying mean and covariance lies between [−0.024 m/s, 0.024 m/s] and [0 m/s, 0.03 m/s], respectively._{i,k}For the measurement noise in GPS pseudoranges, the mean and covariance of the time-varying Gaussian distribution lies between [−2.5 m, 2.5 m] and [0 m, 3.68 m], respectively; similarly, for the measurement noise in GPS Doppler, the time-varying mean and covariance lies between [−0.33 Hz, 0.33 Hz] and [0 Hz, 0.05 Hz], respectively. We consider all the receivers in this simulated setup to have the same error bounds. We refer to prior studies (Hetet, 2000; Kuusniemi et al., 2004) for information on deriving the empirical measurement noise covariance of pseudoranges and Doppler values as a function of carrier-to-noise density (C/N0). These values are formulated by setting

*C*/*N*0 = 38 dB-Hz to depict authentic conditions.For the initial state error bounds of local state vector (the first state

*cδt*), the mean and covariance of the time-varying Gaussian distribution lies between [−0.01_{i,k}*μ*s, 0.01*μ*s] and [0*μ*s, 0.04*μ*s], respectively; similarly, for the initial state error bounds in the second state , the time-varying mean and covariance lies between [−0.02 ns/s, 0.02 ns/s] and [0 ns/s, 0.06 ns/s], respectively.

As discussed earlier in Algorithm 1, we initialized our SR-DKF algorithm by pre-defining the following p-Zonotopes ∀*i* ∈ *M* including process noise , measurement noise , and initial state estimation error . We formulate these p-Zonotopes from the aforementioned errors bounds in the same manner as the 2D example discussed in Section 2. Additional complexities associated with the simulation setup and exhaustive sensitivity analysis for different ranges of hyperparameters will be explored in our future work.

### 5.1 During a Coordinated Signal-Level Spoofing Attack

In the first set of experiments, we consider sparse connectivity among the seven receivers in the network. Figure 10(a) showcases the connectivity matrix where a green tick indicates the presence of a bidirectional communication link while a red cross indicates otherwise. Note that each receiver is connected to itself by default.

We induce a simulated coordinated signal-level spoofing attack in the simulated GPS measurements received at two GPS receivers in the network, i.e., *Rx* : 1, which is located in Stanford, CA and *Rx* : 5, which is located in Austin, TX. Between the time duration *k* = 40 −1040 s, the first spoofing attack manipulates the GPS time at *Rx* : 5 to deviate at a rate of 100 ns/s. The second spoofing attack, which occurs between *k* = 800 −1300 s, deviates the GPS time of *Rx* : 1 at a rate of 400 ns/s.

We validated the performance of the proposed SR-DKF algorithm against two benchmark point-valued state estimation techniques, namely adaptive DKF, which is shown in Figure 10(c), and conventional DKF, which is shown in Figure 10(d). The mathematical formulations for these point-valued techniques are provided in Appendix A; the measurement noise covariance *R _{i,k}* is estimated by setting the forgetting factor at

*ψ*= 0.3 in the case of adaptive DKF and

*ψ*= 1 in the case of conventional DKF. The quantitative maximum error values of the receiver network for the aforementioned state estimation approaches are listed in Table 1. The conventional DKF shown in Figure 10(d) exhibits high state estimation errors in the receivers that are being spoofed, i.e.,

*Rx*: 5 and

*Rx*: 1, with maximum clock bias errors of 69.24

*μ*s and 13.77

*μ*s, respectively. The conventional DKF also corrupts the timing at other non-spoofed receivers in the network, e.g.,

*Rx*: 3 and

*Rx*: 2 exhibit large state estimation errors in clock bias, whose maximum values are 100.07

*μ*s and 42.88

*μ*s, respectively. Furthermore, while the adaptive DKF, which is shown in Figure 10(c), demonstrates higher state estimation accuracy than the conventional DKF, this point-valued technique still affects the receivers that are not directly spoofed, e.g.,

*Rx*: 3 exhibits state estimation errors with a maximum value of 2.47

*μ*s. Based on the maximum error values listed in Table 1, we observe that both point-valued state estimation techniques violate the pre-defined safety

*AL*= ±1

*μ*s. By contrast, our proposed SR-DKF algorithm, which is shown in Figure 10(b), provides a secure estimate of GPS timing with a maximum clock bias error of 0.42

*μ*s and clock drift error of 2.23 ns/s across all receivers; thus, the SR-DKF algorithm complies with the acceptable timing standards that are commonly practiced in the power community.

In Figure 11, we plot the attack status of the GPS measurements * z_{i,k}* at each receiver in the network. In accordance with Section 4.3, we demonstrate that the proposed SR-DKF algorithm successfully detects and mitigates spoofing in the GPS measurements reported by

*Rx*: 1 and 5 while accurately categorizing other receivers, i.e.,

*Rx*: 2, 3, 4, 6, 7 as non-spoofed with attack status at all times. We also observe that, irrespective of the drift rate of spoofing, i.e., 400 ns/s in

*Rx*: 1 and 100 ns/s in

*Rx*:5, both attacks were quickly mitigated, thereby ensuring high robustness. Even after the spoofing ends, i.e., at

*k*= 1040 s in

*Rx*:5 and at

*k*= 1300 s in

*Rx*: 1, the GPS measurements still exhibit errors as their tracking loops have not yet converged to the authentic satellite signals. This effect is also captured by our proposed SR-DKF algorithm which has successfully assigned an attack status of 1 for both

*Rx*:5 and

*Rx*: 1.

In Figure 12, we showcase the timing risk associated with the estimated GPS timing, i.e., the probability of estimated clock bias that violates the pre-defined safety *AL* = ±1 *μ*s. This is computed by analyzing the intersection of the corrected p-Zonotope with the unsafe set , as discussed in Section 4.4. An increase or jump in measure of timing risk is based on the adaptive gain matrix that in turn depends on the estimated attack status of the receiver. By comparing results shown in Figure 10(a) and Figure 12, we observe a direct correlation between the value of timing risk to the number of reliable measurements available for the calculation of the corrected p-Zonotope. For instance, since none of the four neighbors *of Rx* : 4 are spoofed, the estimated timing risk is consistently low, i.e., ≈ 10^{−7}, for the entire duration of the experiment. By contrast, *Rx* : 3 has a highly fluctuating timing risk because two of its three neighbors have been spoofed. Similarly, when only one of three neighbors at *Rx* : 1 is spoofed, the order of timing risk magnitude is 10^{−3}; by contrast, when only one of two neighbors are spoofed at *Rx* :5, the timing risk exhibits a higher order magnitude at 0.25. Therefore, our SR-DKF algorithm estimates a robust measure of the timing risk across all receivers in the network.

### 5.2 Robustness Analysis of the Proposed SR-DKF Algorithm

In the second set of experiments, we demonstrate the robustness of the proposed SR-DKF algorithm as the number of receivers in the neighborhood set |*M*| varies. Considering a total experiment duration of 100 s and the network of seven receivers shown in Figure 9, we induced a simulated meaconing attack between *k* = 10 −100 s in the GPS measurements that were received at *Rx* : 1 located at Stanford, CA.

In Figure 13, we analyzed the variation in the timing risk associated with the estimated GPS time of *Rx* : 1 as the number of receivers in the network *M* increased. For each receiver network size that ranges from |*M*| = 2 to |*M*| = 7, we consider the presence of a fully-connected bidirectional communication network and perform 50 Monte-Carlo runs for four meaconing magnitudes of 30 *μ*s, 45 *μ*s, 60 *μ*s, and 100 *μ*s. We successfully demonstrated that, as the number of receivers interacting with the *Rx* :1, i.e., |*M*_{1}|, increases, the associated timing risk decreases. Particularly for |*M*_{1}| > 5, the decrease in timing risk is on the order of ≤ 10^{−5} and hence considered insignificant. Similarly, the effect of meaconing magnitude on the timing risk decreases as the number of interacting receivers associated with *Rx* : 1 increases. Therefore, the proposed SR-DKF algorithm requires only a handful of distributed GPS receivers to achieve robust performance.

## 6 CONCLUSIONS

In summary, we developed a Stochastic Reachability-Based Distributed Kalman Filter (SR-DKF) algorithm to perform set-valued state estimation using a network of GPS receivers that estimate not only the point-valued mean but also the stochastic reachable set of the local state vector (i.e., receiver clock bias and receiver clock drift). By parameterizing the stochastic reachable set via probabilistic Zonotopes (p-Zonotopes), we derived both time and measurement update steps of the set-valued DKF to estimate the predicted and corrected p-Zonotope of timing error, respectively. To compute secure GPS time for a network of receivers, we designed a two-tiered approach. First, we performed spoofing mitigation at measurement-level via deviation of measurement innovation from its expected stochastic set (represented by the p-Zonotope). Second, we performed timing risk analysis at state-level via the probably of an intersection of the corrected p-Zonotope with an unsafe set, i.e., a set that violates the pre-defined safety alert limit (AL). To the best of our knowledge, no prior literature exists that addresses secure state estimation via stochastic reachability nor on the use of timing risk quantification during spoofing.

Under a simulated coordinated signal-level spoofing that affects two members of a network of seven sparsely-connected GPS receivers, the proposed SR-DKF algorithm demonstrated a higher timing accuracy, successfully mitigated the spoofing attacks, and estimated a robust measure of timing risk. While our algorithm showed a maximum timing error of only 0.42 *μ*s, both the point-valued state estimation techniques, i.e., conventional DKF and adaptive DKF, showed high estimation errors in clock bias that violate the pre-defined safety *AL* = ±1 μs, which is the commonly accepted timing accuracy for PMUs. Furthermore, by varying the number of receivers and attack magnitudes, we performed Monte Carlo runs to validate our finding that only a handful of GPS receivers (approximately five) are necessary to estimate GPS time with a low timing risk of 10^{−5}.

## HOW TO CITE THIS ARTICLE

Bhamidipati, S., & Gao, G. (2023). GPS spoofing mitigation and timing risk analysis in networked phasor measurement units via stochastic reachability. *NAVIGATION, 70*(3). https://doi.org/10.33012/navi.574

## ACKNOWLEDGEMENTS

We would like to thank Ashwin Kanhere, Siddharth Tanwar, and Tara Mina for their help in revising the paper. We would also like to thank all members of the NAV Lab at Stanford University for their thoughtful input on this work.

This material is based upon work supported by the Department of Energy under Award Number DE-OE0000780.

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

## APPENDIX A PRELIMINARY FINDINGS OF THE POINT-VALUED DKF

Based on Talebi and Werner, 2018, the general framework of a point-valued DKF implementation performed independently at the *i*th receiver is shown in Equations (A.1)–(A.3). With as the expectation operator, we denote the point-valued mean and covariance during initialization, i.e., *k* = −1, by and , respectively, which are defined as follows:

A.1

The process and measurement noise at each *i*th receiver are modeled via a zero-mean Gaussian distribution as and , respectively, where *Q _{i,k}* and

*R*represent the process and measurement noise covariance matrix, respectively. After initialization, the conventional point-valued DKF recursively executes two steps, including the time update and the measurement update.

_{i,k}During the time update, as seen in Equation (A.2), the corrected variables and from the previous time, i.e., at the (*k* – 1)th iteration, are propagated forward-in-time via a state transition matrix *F _{i}* to predict the estimates for the next

*k*th iteration, which are given by and , respectively.

A.2a

A.2b

At the *k*th iteration, each *i*th receiver receives the GPS measurements and measurement noise covariance matrix from the other receivers in the neighborhood set, i.e., *j* ∈ *M _{i}* and processes them sequentially during the measurement update. In the measurement update, the expected measurements (obtained via the predicted estimates) and the received measurements are weighted via an estimated Kalman gain to obtain the corrected estimates of the point-valued mean and covariance of the state vector, i.e., and , respectively. In this context,

*and*

**z**_{i,k}*H*denotes the received measurements from the

_{i}*i*th receiver itself and the associated measurement model, respectively, while

*,*

**z**_{ij,k}*H*and

_{ij,k}*R*denotes the measurements received from other receivers in the neighborhood set of the

_{ij,k}*i*th receiver and the associated measurement model and measurement noise covariance matrix, respectively. The optimal Kalman gain denoted by

*K*and

_{i,k}*K*∀

_{ij,k}*j*∈ (

*M*) is shown in Equation (A.3).

_{i}– iA.3a

A.3b

where gain matrix and . The mathematical formulation of *H _{i}*,

*z*and for this work is discussed in Equation (5) and Equation (6) of Section 3. Similarly, the mathematical formulation of

_{i,k}*H*and is discussed in Equations (7), (8) and (9) of Section 3.

_{ij}, z_{j,k}For an adaptive implementation of point-valued DKF, additional parameters, namely, forgetting factor *ψ* and pre-measurement residuals *ϵ _{i,k}* and

*ϵ*, are considered to provide an adaptive estimate the measurement noise covariance matrix

_{ij,k}*R*and

_{i,k}*R*at each time iteration. Here, with and with . Of note, we obtain the conventional point-valued DKF by setting

_{ij,k}*ψ*= 1.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.