Residual-based multi-filter methodology for all-source fault detection, exclusion, and performance monitoring

NAVIGATION: Journal of the Institute of Navigation, September 2020, 67(3), 493; DOI: https://doi.org/10.1002/navi.384

Abstract

All-source navigation has become increasingly relevant over the past decade with the development of viable alternative sensor technologies. However, as the number and type of sensors informing a system increases, so does the probability of corrupting the system with sensor modeling errors, signal interference, and undetected faults. Though the latter of these has been extensively researched, the majority of existing approaches have constrained faults to biases and designed algorithms centered around the assumption of simultaneously redundant, synchronous sensors with valid measurement models, none of which are guaranteed for all-source systems. As part of an overall all-source assured or resilient navigation objective, this research contributes a fault- and sensor-agnostic fault detection and exclusion method that can provide the user with performance guarantees without constraining the statistical distribution of the fault. The proposed method is compared against normalized solution separation approaches using Monte Carlo simulations in a 2D non-GPS navigation problem.

1 INTRODUCTION

All-source navigation and Assured Position Navigation and Timing (APNT) have become increasingly important research areas over the past two decades, especially as alternative navigation sensor technologies (e.g., vision (Venable, 2016), radio (Curro & Raquet, 2016), magnetic (Canciani & Raquet, 2016), etc.) have been matured and integrated into navigation systems (Grejner-Brzezinska et al., 2016). However, each additional sensor allowed into a navigation system introduces another opportunity for corrupting the navigation solution with errors in sensor modeling, unexpected signal interference, or undetected sensor faults. Of these challenges, the latter has been extensively researched (Bhatti, 2006; Bhatti, Ochieng, & Feng, 2007a, 2007b; Brenner, 1996, 1990; Brumback & Srinath, 1987; Call, Ibis, McDonald, & Vanderwerf, 2006; Joerger, Chan, & Pervan, 2014; Kerr, 1980; Lee et al., 1986; Parkinson & Axelrad, 1988; Sturza, 1988; van Graas & Farrell, 1993; Young & Mcgraw, 2003) as a multi-sensor fault detection problem where each satellite in the Global Positioning System (GPS) constellation is regarded as a different (albeit identical in nature and synchronous) sensor in the multi-sensor system, and the “fault” is defined as an unmodeled bias that is assumed to only affect one of the sensors (satellites) at any given time. As shown in Jurado, Raquet, and Schubert Kabban (2019, 2020), our overall research motivation is to create a resilient sensor management system that provides APNT through the online detection and self-correction (i.e., auto-tuning) of sensor models that do not match observed measurements. 
In support of this overall effort, the specific developments presented in this paper seek to determine when any of the above sources of corruption are present by detecting any general mismatches between a sensor’s stated model (i.e., measurement function, function parameters, and error covariance matrix) and its observed measurements, where an unmodeled bias is simply one specific case of a mismatch. Additionally, our research shifts away from identical and synchronous sensors, such as GPS satellites, and focuses on all-source multi-domain (e.g., position, velocity, etc.) and asynchronous sensors. In the following section, we discuss traditional techniques generally used in fault detection and exclusion and highlight the novel developments and adaptations we have made in order to achieve our research objectives.

2 BACKGROUND

Multi-sensor Fault Detection and Exclusion (FDE) research to date has generally approached the FDE problem through the use of either least-squares (Sturza, 1988; van Graas & Farrell, 1993) or filtered (Joerger et al., 2014; Young & Mcgraw, 2003) approaches. In general, least-squares approaches rely on the availability of redundant measurements and perform the FDE function on a sample-by-sample basis. This approach is useful for systems where simultaneously redundant measurements are available by design, and the sensors in question measure similar quantities (e.g., the GPS constellation). In contrast, filtered approaches rely on estimation filters, such as the Kalman Filter (KF), Extended Kalman Filter (EKF), and Unscented Kalman Filter (UKF), to integrate measurements from various sensor types into a consolidated state-space estimate that can then be tested for faults using a variety of test statistics. In this case, the fault exclusion function is often provided by employing a series of subfilters, each excluding the measurements from a subset of sensors in order to guarantee the existence of a fault-free solution. One of the most prolific filtered approaches, referred to as Normalized Solution Separation (Young & Mcgraw, 2003), is based on testing the statistical distribution of the difference between the state-space estimate obtained by a filter informed by all available sensors and each of the various exclusion subfilters previously described. Based on our all-source research objective, filtered approaches appear most useful since, by definition, we seek to integrate information from sensors of various types (i.e., position, velocity, pseudorange, etc.). However, as later shown, Normalized Solution Separation methods were found ineffective for detecting sensor model mismatches (i.e., faults) that do not result in significant differences in state-space estimates between the main filter and the subfilters. 
One example of such a fault is an incorrectly stated sensor measurement error covariance matrix, as later shown in Section 4. Additionally, as part of our research objective, we also seek to provide users with a measure of system performance guarantee beyond the filter-computed estimation error covariance since filter-estimated error statistics are not guaranteed to be consistent in the presence of undetected faults. While the majority of FDE research has solved this problem via the computation of system integrity figures such as Horizontal Protection Level (HPL) and associated alert limits, the lack of an assumed fault-present statistical distribution prevents us from estimating quantities such as a probability of missed detection, which are the basis for the majority of integrity computations.

Given the developments required to enable our all-source research objective, our proposed method, henceforth referred to as Sensor-Agnostic All-source Residual Monitoring (SAARM), provides a significant contribution to the state-of-the-art in that it: enables reliable fault-agnostic and sensor-agnostic FDE across a larger variety of fault types when compared to Normalized Solution Separation and provides a method of establishing system performance guarantees without defining a fault-present condition or distribution, or a probability of missed detection. In this work, we also demonstrate our FDE approach using “all-source” sensors across various domains and with different update rates, which directly addresses the emerging all-source APNT challenge.

The remainder of this paper is divided into three additional sections. Section 3 develops the necessary multi-filter multi-sensor notation, the residual-based test statistic, the fault detection and exclusion process, and the system performance assumptions and guarantees. In Section 4, the detection performance of the proposed method is compared against a normalized solution separation method in a variety of simulated all-source navigation problems. Finally, Section 5 summarizes the research contributions and provides ideas for future work.

3 METHODOLOGY

3.1 Multi-sensor multi-filter notation

This section expands the conventional Kalman filter (Kalman, 1960) notation from Maybeck (1982, 1984) to include estimates from multiple filters as well as measurements from multiple non-identical sensors. The notation and underlying considerations will be crucial in the later development of the residual-space test statistic and the resulting fault exclusion process. Consider a (possibly) nonlinear dynamic system of the form

ẋ(t) = f[x(t), u(t), t] + G(t) w(t)    (1)

where x is the N × 1 navigation state vector containing the system states, u is the control input vector, G is an N × W linear operator, and w is a W × 1 white Gaussian noise process with a W × W continuous process noise strength matrix Q. Suppose the discretized (Van Loan, 1978) system states are estimated by J separate filters. Then at time t = tk, the system state estimate vector and corresponding state estimation error covariance matrix from filters j = 1,…, J are given by x̂j(tk) and Pj(tk), respectively. Next, each of the J filters can be informed by any, all, or a subset of I sensors. At time tk, the ith sensor (i = 1,…, I) provides (possibly) multidimensional Zi × 1 measurements of the form

z[i](tk) = h[i][x(tk), tk] + v[i](tk)    (2)

where h[i] is a (possibly) nonlinear measurement function, and v[i](tk) is a Zi × 1 discrete white Gaussian noise process with covariance matrix R[i](tk). Immediately prior to a measurement update, the estimated measurement for sensor i from filter j, ẑ[i,j](tk), is generated using

ẑ[i,j](tk) = h[i][x̂j(tk−), tk]    (3)

while its estimated covariance matrix, Pẑ[i,j](tk), is generated in a manner that depends on the type of filtering algorithm. For example, in a linearized filter (such as an Extended Kalman Filter), it can be computed using

Pẑ[i,j] = H[i] Pj H[i]ᵀ    (4)

where the time index tk is omitted for simplicity, and H[i] represents the Jacobian of h[i] evaluated about the point x̂j(tk−). For information on generating Pẑ[i,j] in a UKF, the reader is referred to Wan and Van Der Merwe (2000). Finally, the (so-called) pre-update residual vector computed between sensor i and filter j, r[i,j], and its associated covariance matrix, Pr[i,j], are given by

r[i,j](tk) = z[i](tk) − ẑ[i,j](tk)    (5)

Pr[i,j](tk) = Pẑ[i,j](tk) + R[i](tk)    (6)
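As a minimal numerical sketch of Equations (3) through (6) for a linearized filter, the residual and its covariance can be computed as follows; the measurement function, matrices, and values are illustrative, not taken from the paper:

```python
import numpy as np

def preupdate_residual(z_i, h_i, H_i, x_hat_j, P_j, R_i):
    """Pre-update residual r[i,j] (Eq. 5) and its covariance Pr[i,j] (Eq. 6)
    for a linearized (EKF-style) filter, where Pz = H P H^T (Eq. 4)."""
    z_hat = h_i(x_hat_j)          # estimated measurement (Eq. 3)
    P_z = H_i @ P_j @ H_i.T       # estimated measurement covariance (Eq. 4)
    r = z_i - z_hat               # pre-update residual (Eq. 5)
    P_r = P_z + R_i               # residual covariance (Eq. 6)
    return r, P_r

# Illustrative example: a 2D position sensor observing the first two states.
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
x_hat = np.array([10.0, 5.0, 1.0, 0.5])
P = np.eye(4)
R = 0.25 * np.eye(2)
r, P_r = preupdate_residual(np.array([10.3, 4.8]), lambda x: H @ x, H, x_hat, P, R)
```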

3.2 Fault detection test statistic

Having derived the residual vector, r[i,j](tk), and its associated covariance matrix, Pr[i,j](tk), in Equations (5) and (6), we now define a residual-space test statistic to determine if a set of observed residuals between a specific sensor-filter pair adheres to its expected distribution. Since our goal is to limit the assumptions on the type of fault (i.e., the fault could be a bias, an incorrectly stated noise covariance matrix, or incorrect calibration of measurement function parameters), we did not model two competing distributions, as would be needed to employ a Likelihood Ratio Test (LRT) (Kay, 1998). Instead, we focused on the distribution resulting from summing the squared Mahalanobis distance (De Maesschalck, Jouan-Rimbaud, & Massart, 2000) across a sequence of pre-update residuals and selecting a threshold based on a desired probability of false alarm, Pf.

Given a Zi-dimensional Gaussian distribution with mean μ and covariance matrix Σ, the squared Mahalanobis distance, d², between an observation y and the centroid of the distribution is given by

d² = (y − μ)ᵀ Σ⁻¹ (y − μ)    (7)

Additionally, d² is known (Casella & Berger, 2002; De Maesschalck et al., 2000) to follow a Chi-Square distribution with Zi degrees of freedom. Moreover, the sum of M independent d² distances is also known to follow a Chi-Square distribution with M × Zi degrees of freedom. As proven in Maybeck (1982) and Young and Mcgraw (2003), Kalman filter pre-update residuals form a zero-mean white sequence. As such, we let y = r[i,j](tk) from Equation (5), Σ = Pr[i,j](tk) from Equation (6), and μ = 0. Subsequently, we can compute the fault detection test statistic, q[i,j](tk), using

q[i,j](tk) = Σ r[i,j](tm)ᵀ [Pr[i,j](tm)]⁻¹ r[i,j](tm),  m = k − M + 1, …, k    (8)

where M is the number of trailing samples in the residual sequence. A fault is declared if

q[i,j](tk) > χ²(1 − α; M · Zi)    (9)

where χ²(1 − α; M · Zi) denotes the (1 − α) quantile of the Chi-Square distribution with M · Zi degrees of freedom,

and α is derived from the overall desired system Pf, which is further discussed in Section 3.3. It is important to note the test above can only determine if any of the I sensors providing measurement updates to filter j is faulty. In order to determine the actual sensor(s) within filter j that are faulty, additional assumptions and computations must be made, as shown in the next section.
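The statistic and threshold in Equations (8) and (9) can be sketched as follows; the window length, covariance, and residual values are illustrative, and the Chi-Square quantile is obtained from SciPy:

```python
import numpy as np
from scipy.stats import chi2

def fault_test(residuals, covariances, alpha):
    """Sum of squared Mahalanobis distances over a trailing window (Eq. 8),
    compared against the (1 - alpha) Chi-Square quantile with M * Zi
    degrees of freedom (Eq. 9)."""
    q = sum(float(r @ np.linalg.solve(P, r)) for r, P in zip(residuals, covariances))
    dof = sum(len(r) for r in residuals)   # M * Zi for M samples of a Zi-dim sensor
    return q, q > chi2.ppf(1.0 - alpha, dof)

# M = 30 two-dimensional residuals with unit covariance, per-test alpha = 1e-3:
P_r = np.eye(2)
small = [np.array([0.1, -0.1])] * 30    # healthy-looking residuals: q = 0.6
biased = [np.array([2.0, 2.0])] * 30    # residuals with a large bias: q = 240.0
q0, f0 = fault_test(small, [P_r] * 30, 1e-3)
q1, f1 = fault_test(biased, [P_r] * 30, 1e-3)
```

With 60 degrees of freedom the 0.999 quantile is roughly 99.6, so the healthy sequence passes while the biased sequence declares a fault.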

3.3 Fault identification process

Up to this point, we have defined how a time sequence of residual vectors from a specific sensor-filter combination may be tested for likelihood, without making assumptions on the domain of the sensor measurement or the type of fault. Here, it is important to emphasize that a fault detection derived from a set of residual vectors from a particular sensor-filter pair (i, j) does not imply that sensor i is faulty. It is only an indication of an inconsistency between the information provided by sensor i and the rest of the sensors informing filter j. In other words, low-likelihood residuals can be caused either by faulty measurements, z[i], or by faulty estimated measurements, ẑ[i,j], the latter of which is influenced by all sensors informing filter j whose state-space overlaps with that of sensor i. Therefore, in order to identify and exclude the faulty measurements, we developed a “fault consensus” process that associates the presence of a sensor with the presence of a fault in order to determine the sensor most probably associated with the faulty results. Though the proposed method is not limited to single sensor faults, it is best to begin our discussion with this case before scaling to the generalized case of multiple simultaneous faults. The next two sections develop the single and multiple fault cases, respectively.

3.3.1 Single serial faults

As described in Section 2, a commonly assumed fault scenario is a single sensor fault per testing epoch (i.e., during a single M-sample test window in our case). Multiple faults are still considered but are restricted to occur serially. In this case, we set up our fault identification process by creating J = I filters, each informed by a unique set of I − 1 sensors. In other words, each filter excludes one of the I sensors. Here, it is important to note two points. First, since we expect all-source sensors to be non-identical, some states may become unobservable within a particular filter if the only sensor that has observability over them is excluded from that filter. To prevent potential numerical issues with the covariance of unobservable states, we can remove unobservable states from each subfilter or perform a stochastic observability test (Bageshwar, Gebre-Egziabher, Garrard, & Georgiou, 2009) to detect unbounded growth in a subfilter’s position error covariance matrix. Second, as with other parallel-filter methods, a “main filter” informed by all sensors is also created, but in our method, we do not use its information for solution separation comparisons, thereby eliminating the need for computing the cross-covariance terms between it and all other filters. Designing the set of filters in this manner guarantees that, under the assumption that at most one sensor can fail at a time, at least one of the filters will be completely unaffected by faulty measurements. As shown below, we can then use this axiom in conjunction with the full set of (i, j) residual test results to determine the culprit sensor.

We begin the fault identification process by populating the I × J (which becomes I × I in this single-fault case) test results matrix, T, using

T(i, j) = 1 if sensor i informs filter j and Equation (9) declares a fault for the pair (i, j); T(i, j) = 0 otherwise    (10)

Figure 1 illustrates the information from each sensor-filter pair needed to populate T in the case where the jth filter excludes the jth sensor. In the figure, each of the i = 1,…,I rows corresponds to the measurement, z[i], and its associated error covariance matrix, R[i], obtained from the ith sensor. These two parameters define the modeled distribution of the sensor measurement and make up the first half of Equations (5) and (6), respectively. Next, each of the j = 1, …, J (J = I in this single-fault case) columns corresponds to the estimated measurement, ẑ[i,j], and its associated error covariance matrix, Pẑ[i,j]. These two parameters make up the remainder of Equations (5) and (6) and define the modeled distribution of the estimated sensor measurement. As shown in the figure, these last two parameters are influenced by all sensors informing the filter in the jth column, which corresponds to all sensors except the jth sensor.

FIGURE 1

Illustration of the multi-sensor multi-filter test statistic matrix, T

A fault is declared when T contains any non-zero entries. Here, it is important to highlight that, based on Equation (10), T(i, j) = 0 when sensor i does not inform filter j, which forces the diagonal of T to zero in the J = I case. This means a fault can be declared when any of the I² − I test statistics from Equation (9) contained in T results in a fault. Given each row in T is informed by a particular sensor, we expect some interdependence among the “cells” in T; however, since this dependence is not predictable a priori, we can upper-bound the family-wise error rate on the entire set of tests found in T using a Bonferroni correction (Bonferroni, 1936), which is not affected by such dependence (Goeman & Solari, 2014). Therefore, in order to guarantee a maximum system (i.e., family-wise) Pf of αmax, we compute each α from Equation (9) in T using

α = αmax / (I² − I)    (11)

Table 4 summarizes the individual α values resulting from a series of desired αmax rates in the range αmax ∈ [1 × 10⁻³, 1 × 10⁻¹], along with the actual Pf rates achieved in the Monte Carlo simulations detailed in Section 4.
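A minimal sketch of the Bonferroni correction in Equation (11); the function name and example values are illustrative:

```python
def per_test_alpha(alpha_max, num_sensors):
    """Bonferroni-corrected per-test alpha (Eq. 11): T holds I**2 - I active
    tests, since the diagonal entries are forced to zero when J = I."""
    return alpha_max / (num_sensors**2 - num_sensors)

# For I = 4 sensors there are 12 active tests, so a desired family-wise
# rate of 1e-2 yields a per-test alpha of 1e-2 / 12:
alpha = per_test_alpha(1e-2, 4)
```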

Under our proposed method, if a fault is declared, the culprit sensor may only be identified if a consensus is reached. That is, since each sensor is excluded from one filter, we can identify a faulty sensor if only a single filter (namely the filter that excluded it) remains fault free. Mathematically, we first compute the fault scores vector, s, whose dimension is equal to the number of filters, J, using

s(j) = Σ T(i, j),  i = 1, …, I    (12)

which produces a sum across the rows (sensors) for each column (filter) in T. Once computed, we have four possible scenarios:

  1. If s contains all zeros (i.e., [0 0 0 0]), then no fault has been detected.

  2. If s contains at least one non-zero and more than one zero (i.e., [0 0 1 0]), then a fault is declared, but the culprit is not yet identified. However, the SAARM position estimate is still bounded by the Guaranteed Position Zone (GPZ) defined in Section 3.4.

  3. If s(j) is the only zero remaining in s (i.e., [0 1 1 1]), then a fault is declared and the culprit sensor is the sensor that was excluded from the jth column in T, or the jth filter, if constructed according to Figure 1.

  4. Finally, if s contains no zeros (i.e., [1 1 1 1]), then more than one sensor is faulty, and the assumptions of the test have been violated.

Each of these “states” can be used in conjunction with the performance guarantee computations presented in Section 3.4 in order to continuously inform users of their APNT protection status. Depending on the type and dynamics of the fault, as well as the set and type of sensors in the system, the results in T may continue to change during every epoch and eventually lead to a culprit. If and when a culprit is determined, the corresponding fault-free filter is used as the new main filter, and a new set of I − 1 filters is initialized using its state-space estimate and associated error covariance matrix. The process can then be repeated sequentially for multiple serial faults with the assumption that a second fault does not occur during the first M samples after having re-spawned the filter set, which will be addressed in the next section.
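The consensus logic above can be sketched as follows; the test matrix shown is a hypothetical single-fault example in which only the faulty sensor's residual tests trip, and the function name is illustrative:

```python
def fault_consensus(T):
    """Fault scores s(j) as column sums of T (Eq. 12), mapped onto the four
    possible outcomes of the single-fault consensus process."""
    num_filters = len(T[0])
    s = [sum(row[j] for row in T) for j in range(num_filters)]
    zero_cols = [j for j, v in enumerate(s) if v == 0]
    if len(zero_cols) == num_filters:
        return s, None, "no fault"
    if len(zero_cols) == 1:
        # the lone fault-free filter is the one that excluded the culprit
        return s, zero_cols[0], "culprit identified"
    if len(zero_cols) == 0:
        return s, None, "single-fault assumption violated"
    return s, None, "fault detected, culprit unknown"

# Hypothetical case: sensor 0 is faulty, so only filter 0 (which excludes
# sensor 0) remains fault-free, and only row 0 of T contains non-zero entries.
T = [[0, 1, 1, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
s, culprit, state = fault_consensus(T)
```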

3.3.2 Simultaneous faults

The serial-fault methodology described above can be easily scaled to enable detection of a secondary fault occurring during the first M samples after an initial fault, as well as multiple simultaneous faults. To do so, we first redefine the number of filters required, J, the structure of the associated test results matrix, T, and the dimension of the fault scores vector, s, as functions of the assumed maximum number of simultaneous faults, which we define as a “layer.” In general, the number of additional filters required, JN, for layer N is given by

JN = C(I, N) = I! / [N! (I − N)!]    (13)

As shown in Section 3.3.1, in layer one, we assumed N = 1 simultaneous fault was possible and created

J1 = C(I, 1) = I    (14)

filters each excluding one sensor, which were then used to populate T1 ∈ ℝI×J1, detect the fault, and identify the single culprit using s1. If we now assume N = 2 simultaneous faults are possible, we require an additional

J2 = C(I, 2) = I(I − 1)/2    (15)

filters each excluding two sensors, which are then used to populate T2 ∈ ℝI×J2. Using this two-layer configuration, the culprit sensor in a single fault scenario can continue to be identified as previously described, using T1 and s1. In the case of a simultaneous fault, s1 indicates that the single fault assumption has been violated (no zeros remain), which prompts the system to use T2 and s2 to identify the two culprits. In the case of a secondary fault during the first M samples after an initial fault, the subset of filters in the J2 layer that excluded the first culprit corresponds exactly to the new J1 layer of filters needed after re-spawning, which enables uninterrupted fault detection.

For example, consider a system with I = 5 sensors where up to two simultaneous sensor faults are assumed. The first layer consists of J1 = 5 filters, and each filter excludes one of the sensors, as shown in Table 1. Using Equation (15), the second layer consists of J2 = 10 filters, and each filter excludes two of the sensors, as shown in Table 2. Suppose Sensor 3 experiences a fault. In this case, Filter 3 is uncorrupted by any faulty measurements, and its corresponding column in T1 ∈ ℝ5×5 uniquely contains all zeros. After determining Sensor 3 is the culprit via Equation (12), Sensor 3 is taken offline and a new set of J1 = I −1 = 4 filters, each excluding one of the remaining four sensors, is spawned. Without a J2 layer, this would mean the system could not detect a subsequent fault while it repopulates the new T1 ∈ ℝ4×4. However, having the J2 layer already running, we can see the new J1 layer of filters is actually equivalent to the subset of J2 filters that had also excluded Sensor 3, which guarantees uninterrupted fault detection after detecting an initial fault. Finally, suppose both Sensor 3 and Sensor 5 experience a simultaneous fault. In this case, every single filter in the J1 layer (i.e., every column in Table 1) would be corrupted and no column in T1 would contain all zeros. However, the J2 filter that excluded both Sensor 3 and Sensor 5 would be guaranteed to be uncorrupted by faulty measurements, and its corresponding column in T2 ∈ ℝ5×10 would uniquely contain all zeros. In principle, this process can be scaled up to any number of layers, corresponding to any number of simultaneous faults. It is important, however, to consider the (previously mentioned) stochastic observability of each state as we exclude additional sensors and the trade-off in computational power required to support the growing number of required filters for each additional layer.
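The layer construction above can be sketched with standard combinatorics; the sensor labels and function name are illustrative:

```python
from itertools import combinations
from math import comb

def exclusion_layers(sensors, max_faults):
    """Layer n holds comb(I, n) subfilters (Eq. 13), each identified by the
    n-subset of sensors it excludes."""
    return {n: [frozenset(c) for c in combinations(sensors, n)]
            for n in range(1, max_faults + 1)}

# I = 5 sensors, assuming up to N = 2 simultaneous faults:
layers = exclusion_layers(range(1, 6), 2)

# After sensor 3 is identified as the culprit and taken offline, the layer-2
# filters that had already excluded sensor 3 become the new layer 1 for the
# remaining four sensors, enabling uninterrupted detection:
new_layer1 = [s - {3} for s in layers[2] if 3 in s]
```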

TABLE 1

Sensor-filter configuration for layer J1, I = 5 sensors

TABLE 2

Sensor-filter configuration for layer J2, I = 5 sensors

3.4 Performance assumptions and guarantees

Having defined an all-source fault detection, identification, and exclusion process, we now turn our attention to providing users with a measure of system performance guarantee without constraining the nature of the fault-present condition. Though FDE literature often provides a rigorous definition of system “Integrity,” here we aim to simply provide a guarantee of system performance under a particular set of assumptions. As already mentioned, under our all-source fault-agnostic goal, we are unable to define the nature of the fault-present condition or the distribution of the fault-present test statistic, thereby precluding any computations involving the probability of missed detection or missed alert. However, given the multi-filter FDE mechanism already in place, we can still provide users with a GPZ, or a position error ellipsoid based on both the filter-computed position error statistics and the assumption that one of the filters in the system is guaranteed to be fault-free and therefore consistent.

In general, the GPZ is constructed under one condition: assuming at least one of the filters is informed entirely by properly modeled, uncorrupted sensors, then at least one filter contains consistent state estimation error statistics. In other words, since it is assumed that one of the filters is fault-free (based on properly designing the set of filters), the estimated error statistics from one of the filters truly describe the actual errors committed by that filter. Defining αI as the acceptable error bound, we can then derive an accurate 100(1 − αI)% error ellipse on the horizontal position using the uncorrupted filter’s horizontal position estimate and its associated error covariance matrix. Given the uncorrupted filter is not identifiable prior to determining a culprit, we union the 100(1 − αI)% horizontal position error ellipses from all of the filters, thereby guaranteeing the true horizontal position is contained within the union with at least 100(1 − αI)% probability, since unioning can only grow the containment region. This guarantee is valid regardless of the status of the underlying fault detection and identification process.

To clarify the difference between the GPZ and other ellipsoids found in integrity literature, we note a few important points. The GPZ derivation begins with each subfilter’s estimate of position. For each subfilter, the acceptable error bound, αI, represents the probability of the true vehicle position lying outside the error ellipsoid centered on the subfilter’s position estimate and derived from the subfilter’s position error covariance matrix. If an undetected fault is present in the system, only one subfilter is guaranteed to have consistent error statistics, meaning only one 100(1 − αI)% error-bound ellipsoid is guaranteed to accurately describe the probability of containing the true vehicle position. Since the uncorrupted subfilter is not knowable prior to identifying a culprit, we cannot choose a single correct error-bound ellipsoid to display to the user. Therefore, we union all error-bound ellipsoids from all subfilters, thereby guaranteeing the “correct one” is contained. Finally, since the correct one is guaranteed to contain the true vehicle position with 100(1 − αI)% probability, unioning additional areas can only increase the probability of capturing the true vehicle position. As later shown, this union does not produce a contiguous shape centered about a mean position estimate; it is simply the union of all individual ellipsoids, each centered on its respective subfilter position estimate. Note the main filter position estimate and error statistics are not used in the computation of the GPZ.
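A membership test for the GPZ union can be sketched as follows, assuming 2D Gaussian position errors so that the 100(1 − αI)% ellipse corresponds to a Chi-Square quantile with two degrees of freedom; the subfilter estimates shown are hypothetical:

```python
import numpy as np
from scipy.stats import chi2

def in_ellipse(p, center, cov, alpha_i=0.05):
    """True if p lies inside the 100(1 - alpha_i)% horizontal error ellipse
    defined by a subfilter position estimate and its 2x2 error covariance."""
    d = np.asarray(p, dtype=float) - np.asarray(center, dtype=float)
    return float(d @ np.linalg.solve(cov, d)) <= chi2.ppf(1.0 - alpha_i, df=2)

def in_gpz(p, subfilters, alpha_i=0.05):
    """The GPZ is the union of all subfilter ellipses, so containment in any
    single ellipse implies containment in the GPZ."""
    return any(in_ellipse(p, c, P, alpha_i) for c, P in subfilters)

# Two hypothetical subfilters with different position estimates/covariances:
subfilters = [(np.zeros(2), np.eye(2)),
              (np.array([10.0, 0.0]), 4.0 * np.eye(2))]
```

Note the resulting region is generally not contiguous, which matches the behavior described above.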

To illustrate our GPZ, consider a 2D navigation problem using I = 4 sensors. The sensor suite is the same as used in the simulations in Section 4 and is summarized in Table 3. As shown, the sensor suite is composed of two 2D position sensors: POS1, POS2, and two 2D velocity sensors: VEL1, VEL2. The system dynamics are propagated using a 2D kinematics model driven by 2D First Order Gauss-Markov (FOGM) acceleration as described in Section 4. In order to best visualize the effects of faults on GPZ, the fault has been defined as a growing velocity bias starting from 0 [m/s] at tk = 40 [s], growing at a rate of 0.1 [m/s/s], and applied to the x-dimension measurements from the VEL1 sensor. Figures 2 through 5 illustrate a time sequence of events along a sample instantiation of the simulation. As shown in the figures, an error bound of αI = 0.05 was used to produce 95% error ellipses. The GPZ derived from the union of the 95% horizontal position error ellipses from all subfilters is guaranteed to contain the true vehicle position at least 95% of the time, regardless of the presence of a fault, ability to detect, or ability to determine a culprit. In all examples, the main filter is informed by all sensors but its states and their error covariances are not used in the detection of faults or the computation of GPZ. Finally, though these sample illustrations were limited to a single fault and two dimensions, the underlying axiom and assumptions are still valid for multiple faults, and the error ellipses can be scaled to 3D error ellipsoids, if desired.

TABLE 3

Sensor and fault configuration for Monte Carlo simulations

TABLE 4

False alarm rate summary for SAARM simulations, Scenario 1 (I = 4)

FIGURE 2

Example SAARM GPZ: No fault present. In this example, there is no fault induced into any of the four sensors, and no fault has been detected (all entries in s are zero), which is shown to the user as a green GPZ. All filters are uncorrupted and the main filter is consistent. The GPZ is comprised of the union of the 95% position error ellipses from all filters and contains the true position at least 95% of the time

FIGURE 3

Example SAARM GPZ: Undetected fault. In this example, a 1.0 [m/s] bias is affecting the VEL1 sensor, but no fault has been detected yet (all entries in s are still zero), which is shown to the user as a green GPZ. All filters except Filter 1 are corrupted and potentially inconsistent. The GPZ is comprised of the union of the 95% position error ellipses from all filters, and it is guaranteed to contain the true position at least 95% of the time since one of the filters is guaranteed to be uncorrupted

FIGURE 4

Example SAARM GPZ: Unidentified culprit. In this example, a 3.0 [m/s] bias is affecting the VEL1 sensor, and a fault has been detected (at least one entry in s is non-zero), but no culprit has been identified (there is more than one zero entry in s), which is shown to the user as an orange GPZ. All filters except Filter 1 (“No VEL1”) are corrupted and the main filter is clearly inconsistent. The GPZ is comprised of the union of the 95% position error ellipses from all filters, and it is guaranteed to contain the true position at least 95% of the time since one of the filters is guaranteed to be uncorrupted

FIGURE 5

Example SAARM GPZ: Culprit identified. In this example, a 3.1 [m/s] bias is affecting the VEL1 sensor, a fault has been detected and the culprit has been identified (there is a single zero-entry in s), which is shown to the user as a red GPZ. All filters except Filter 1 (“No VEL1”) are corrupted, and the main filter is clearly inconsistent. The GPZ is comprised of the union of the 95% position error ellipses from all filters, and it is guaranteed to contain the true position at least 95% of the time since one of the filters is guaranteed to be uncorrupted. Immediately after this time step, the VEL1 sensor is taken offline, and a new set of filters is re-spawned from Filter 1

4 SIMULATION RESULTS

The proposed method was evaluated via a series of Monte Carlo simulations using two vehicles, each informed by the same set of all-source sensors described below and summarized in Table 3. The FDE function was provided by the proposed SAARM algorithm in the first vehicle while the normalized horizontal position solution separation test statistic as implemented in Young and Mcgraw (2003) was used for the second vehicle. For all simulations, the true system dynamics were driven by a 2D kinematic model given by

ẋp(t) = xv(t)
ẋv(t) = xa(t)
ẋa(t) = −(1/τa) xa(t) + w(t)    (16)

where xp is the vehicle’s 2D position in [m], xv is the 2D velocity in [m/s], and xa is the 2D acceleration in [m/s²], propagated as a FOGM process with time constant τa = 10 [s] and variance σ²a = 0.012 [m²/s⁴], making w(t) a 2D white Gaussian noise process with E[w(t)w(t + τ)ᵀ] = Qδ(τ) and

Q = (2σ²a / τa) I₂    (17)
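As an illustration, the continuous model above can be discretized with the Van Loan (1978) method referenced in Section 3.1; this is a sketch for one axis of the 2D model, with a hypothetical helper name and the FOGM strength taken as 2σ²a/τa:

```python
import numpy as np
from scipy.linalg import expm

def van_loan(F, G, Q, dt):
    """Van Loan (1978) discretization of dx/dt = F x + G w with
    E[w(t) w(t + tau)^T] = Q delta(tau): returns (Phi, Qd)."""
    n = F.shape[0]
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n] = -F
    M[:n, n:] = G @ Q @ G.T
    M[n:, n:] = F.T
    E = expm(M * dt)
    Phi = E[n:, n:].T      # discrete state transition matrix
    Qd = Phi @ E[:n, n:]   # discrete process noise covariance
    return Phi, Qd

# One axis of the position/velocity/FOGM-acceleration chain:
tau_a, var_a, dt = 10.0, 0.012, 0.5
F = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, -1.0 / tau_a]])
G = np.array([[0.0], [0.0], [1.0]])
Q = np.array([[2.0 * var_a / tau_a]])   # FOGM strength
Phi, Qd = van_loan(F, G, Q, dt)
```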

Each vehicle in the simulation was informed by two velocity sensors and two position sensors for a total of four (I = 4) sensors. Sensor 1 (“VEL1”) measurements were modeled as

\[
\mathbf{z}_k^{[1]} = \mathbf{x}_v(t_k) + \mathbf{v}_k^{[1]} \tag{18}
\]

\[
\mathbf{v}_k^{[1]} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}^{[1]}\right) \tag{19}
\]

and its update rate was set to 0.5 [s] or 2 [Hz]. Sensor 2 (“POS1”) measurements were modeled as

\[
\mathbf{z}_k^{[2]} = \mathbf{x}_p(t_k) + \mathbf{v}_k^{[2]} \tag{20}
\]

\[
\mathbf{v}_k^{[2]} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}^{[2]}\right) \tag{21}
\]

and its update rate was set to 1.0 [s] or 1 [Hz]. Sensor 3 (“VEL2”) measurements were modeled as

\[
\mathbf{z}_k^{[3]} = \mathbf{x}_v(t_k) + \mathbf{v}_k^{[3]} \tag{22}
\]

\[
\mathbf{v}_k^{[3]} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}^{[3]}\right) \tag{23}
\]

and its update rate was set to 1.5 [s] or 0.67 [Hz], and finally, Sensor 4 (“POS2”) measurements were modeled as

\[
\mathbf{z}_k^{[4]} = \mathbf{x}_p(t_k) + \mathbf{v}_k^{[4]} \tag{24}
\]

\[
\mathbf{v}_k^{[4]} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{R}^{[4]}\right) \tag{25}
\]

and its update rate was set to 2.0 [s] or 0.5 [Hz]. It is important to note here that the different update rates among the sensors lead to a different number of samples (M), and hence different Chi-Square degrees of freedom, captured for each sensor within the fixed SAARM monitoring time window discussed below.
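The mapping from update rate and monitoring window to sample count can be sketched numerically. The helper names below are illustrative, and the degrees-of-freedom expression assumes each 2D residual sample contributes two degrees of freedom:

```python
def samples_in_window(update_period_s, window_s=30.0):
    """Number of residual samples M a sensor contributes to the window."""
    return int(window_s / update_period_s)

def chi_square_dof(update_period_s, meas_dim=2, window_s=30.0):
    """Chi-Square degrees of freedom accumulated over the window,
    assuming each residual sample contributes meas_dim degrees."""
    return samples_in_window(update_period_s, window_s) * meas_dim

# Update periods from Table 3: VEL1 0.5 s, POS1 1.0 s, VEL2 1.5 s, POS2 2.0 s
for name, period in [("VEL1", 0.5), ("POS1", 1.0), ("VEL2", 1.5), ("POS2", 2.0)]:
    # Yields M = 60, 30, 20, 15 for a 30 s window, matching the text
    print(name, samples_in_window(period), chi_square_dof(period))
```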

The fault-free (H0) condition was characterized using 10,000 Monte Carlo trials in which no faults were introduced into the system and all measurements were drawn from their modeled distributions. Next, two fault-present (H1) scenarios were simulated, each using an additional 10,000 Monte Carlo trials. The two fault-present scenarios are summarized in Table 4. In the first scenario, measurements from Sensor 1 (“VEL1”) were corrupted with an unmodeled constant 1.0 [m/s] x-velocity bias, starting at tk = 40 [s]. In the second scenario, the measurements from Sensor 1 (“VEL1”) were corrupted by scaling R[1] by a factor of 2×, again starting at tk = 40 [s] and without specifying the change in the filter measurement model.
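The two fault-injection mechanisms can be sketched as a measurement-corruption wrapper. The 1-sigma noise value and function name are illustrative assumptions (the paper's R[1] values are not reproduced here); only the bias magnitude, the 2× covariance scaling, and the 40 [s] onset come from the scenarios above.

```python
import random

def measure_vel1(true_vel, t, scenario, rng,
                 r_std=0.5, bias=(1.0, 0.0), fault_start=40.0):
    """Generate a VEL1 measurement under a fault-free or faulted model.

    scenario: "H0" (fault-free), "bias" (unmodeled constant x-velocity
    bias after fault_start), or "rscale" (noise covariance scaled 2x
    after fault_start, without updating the filter's stated model).
    r_std is an illustrative 1-sigma value, not the paper's.
    """
    std = r_std
    offset = (0.0, 0.0)
    if t >= fault_start:
        if scenario == "bias":
            offset = bias
        elif scenario == "rscale":
            std = r_std * 2 ** 0.5   # 2x covariance => sqrt(2)x sigma
    return tuple(v + o + rng.gauss(0.0, std)
                 for v, o in zip(true_vel, offset))
```

The filter is never told about either change, which is precisely what makes the second scenario invisible to a position-domain test.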

For each trial, the initial state estimation error covariance matrix was set to P0 as given in Equation (26), the initial state estimate, x̂0, was set to zeros, and the true initial state was drawn from a N(x̂0, P0) distribution. Each trial was propagated using Δtk = 0.5 [s], starting at tk = 0 [s]. The SAARM monitoring period was set to 30 [s], yielding M = 60 for Sensor 1, M = 30 for Sensor 2, M = 20 for Sensor 3, and M = 15 for Sensor 4 in Equation (9). All trials were terminated at tk = 70 [s], at which point both the SAARM and solution separation test statistics were recorded. It is important to note that, at the time each trial was terminated, the associated SAARM test matrix, T, was based on the sum of residual-based Chi-Square samples over the trailing 30 [s] (i.e., the monitoring period), whereas the normalized solution separation test statistic was based on the single solution-separation vector associated with the final measurement update prior to termination.
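A minimal sketch of the trailing-window residual statistic, assuming 2D residuals with a diagonal covariance R = r·I (an illustrative value, not the paper's): under H0 the accumulated sum is Chi-Square distributed with 2M degrees of freedom, so for VEL1 (M = 60) its expected value is 120.

```python
import random

def residual_chi_square_sum(residuals, r_var):
    """Sum of normalized squared residuals over a monitoring window.

    Each residual is a 2D vector with (assumed) diagonal covariance
    r_var * I; under H0 the sum is Chi-Square with 2*M degrees of
    freedom, M being the number of samples in the window.
    """
    return sum((rx * rx + ry * ry) / r_var for rx, ry in residuals)

# One H0 window for VEL1: M = 60 two-dimensional residuals, so the
# expected value of the statistic is 2*M = 120.
rng = random.Random(0)
M, r_var = 60, 0.25
window = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(M)]
print(residual_chi_square_sum(window, r_var))
```

By contrast, the solution separation statistic uses only the single separation vector at the final update, which is why the two detectors respond so differently to covariance-type faults.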

Figure 6 compares the detection performance between the SAARM and normalized solution separation test statistics for Scenario 1 (1.0 [m/s] bias). Meanwhile, Figures 7 and 8 illustrate the test statistic distributions for the H0 and H1 conditions in the SAARM T matrix and each solution separation subfilter, respectively. As shown, in the case of a bias-type fault, both SAARM and normalized solution separation provide useful detection performance, with solution separation outperforming SAARM by approximately 17% at the optimal Neyman-Pearson (Kay, 1993) detection point. The results from Figure 7 indicate SAARM detection performance was mostly derived from the VEL1 vs. subfilter 3 (“No VEL2”) residual test statistic. Meanwhile, the results from Figure 8 indicate solution separation detection performance was mostly derived from the subfilter 1 (“No VEL1”) solution separation test statistic.

FIGURE 6

Detection performance comparison, SAARM vs. Sol. Sep., 1.0 [m/s] bias. Though both methods appear to provide useful detection performance, the solution separation detector outperforms SAARM by approximately 17% at the optimal Neyman-Pearson detection point

FIGURE 7

Chi-Square test statistic distributions in the SAARM T matrix. The expected H0 degrees of freedom are derived using Equation (9) and are based on the sensor update rate, measurement dimension, and the SAARM monitoring time window. In this case, the residual test statistic formed between VEL1 and subfilter 3 (“No VEL2”) provided the best detection performance

FIGURE 8

Chi-Square solution separation test statistic distributions for each of the required subfilters. The expected H0 degrees of freedom (i.e., 2) are based on testing the 2D horizontal position solution separation vector. In this case, subfilter 1 (“No VEL1”) provided the best detection performance

Figure 9 compares the detection performance between the SAARM and normalized solution separation test statistics for Scenario 2 (2 × R scale). Similarly, Figures 10 and 11 illustrate the test statistic distributions for the H0 and H1 conditions in the SAARM T matrix and each solution separation subfilter, respectively. In the case of a misstated measurement noise covariance, both SAARM and normalized solution separation provide some detection capability; however, in this case, the solution separation performance was marginal, with SAARM outperforming it by approximately 81% at the optimal detection point. Again, the results from Figure 10 indicate SAARM detection performance was mostly derived from the VEL1 vs. subfilter 3 (“No VEL2”) residual test statistic, while the results from Figure 11 indicate solution separation detection performance was mostly derived from the subfilter 1 (“No VEL1”) solution separation test statistic.

FIGURE 9

Detection performance comparison, SAARM vs. Sol. Sep., 2 × R scale. The solution separation detector performs marginally and is outperformed by SAARM by approximately 81% at the optimal Neyman-Pearson detection point

FIGURE 10

Chi-Square test statistic distributions in the SAARM T matrix. The expected H0 degrees of freedom are derived using Equation (9) and are based on the sensor update rate, measurement dimension, and the SAARM monitoring time window. In this case, the residual test statistic formed between VEL1 and subfilter 3 (“No VEL2”) provided the only detection performance, which was marginal

FIGURE 11

Chi-Square solution separation test statistic distributions for each of the required subfilters. The expected H0 degrees of freedom (i.e., 2) are based on testing the 2D horizontal position solution separation vector. In this case, subfilter 1 (“No VEL1”) provided the best detection performance

Table 5 summarizes the performance of the proposed GPZ error bound. Each row of Table 5 is based on 10,000 Monte Carlo trials and displays the percentage of time the true horizontal position was contained within the 95% error bound derived from either the main filter position error covariance matrix or the SAARM GPZ. As shown, the 95% error bound derived from the main filter position error covariance matrix underestimated the true error in both fault-present scenarios. Meanwhile, the 95% error bound derived from the SAARM GPZ successfully bounded the true error at least 95% of the time, as expected.

TABLE 5

Horizontal position error consistency, 95% error bounds
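The containment check behind Table 5 can be sketched as a membership test against the union of the filters' 95% position error ellipses; 5.991 is the 95% quantile of the Chi-Square distribution with two degrees of freedom, and the function names are illustrative:

```python
CHI2_95_2DOF = 5.991  # 95% quantile of Chi-Square with 2 dof

def in_ellipse(point, mean, cov):
    """True if point lies inside the 95% error ellipse of N(mean, cov)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    a, b = cov[0][0], cov[0][1]
    c, d = cov[1][0], cov[1][1]
    det = a * d - b * c
    # Mahalanobis squared distance via the explicit 2x2 inverse
    m2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return m2 <= CHI2_95_2DOF

def in_gpz(point, filters):
    """GPZ membership: inside at least one filter's 95% ellipse.

    filters is a list of (mean, covariance) pairs, one per filter.
    """
    return any(in_ellipse(point, mu, P) for mu, P in filters)
```

Because at least one filter in the bank is uncorrupted, counting how often the truth satisfies `in_gpz` over the Monte Carlo trials yields the containment percentages reported in Table 5.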

The results from the above scenarios indicate that fault types which systematically corrupt the state-space estimate (namely, the position estimate) of the subfilters are more easily detected via solution separation methods than via sensor-specific (velocity, in this case) residual test statistics. In contrast, fault types that do not systematically affect the subfilter solutions are not easily detected via solution separation methods, while sensor-specific residual test statistics provide excellent detection performance. It is also important to note that, based on these two cases, the SAARM approach generally provides the best detection performance when the fault type is unknown or not constrained to a bias, which exactly meets the original research objective laid out in Section 1. Additionally, though not shown here for brevity, a third set of Monte Carlo trials with a larger velocity bias (2.0 [m/s]) led to identical detection performance between SAARM and solution separation (i.e., when the bias is large enough, both detectors perform equally well). The motivation for using the proposed method is further highlighted by its ability to provide performance guarantees without a fault-present model via the proposed GPZ.

5 CONCLUSIONS

This research has proposed a novel method for fault detection and exclusion in all-source navigation systems. The proposed method, referred to as Sensor-Agnostic All-source Residual Monitoring (SAARM), was shown to provide reliable FDE in cases where the fault type is unknown or unconstrained, such as when using emerging all-source sensors with measurement models that are not well understood. When compared to traditional filtered approaches such as normalized solution separation, the proposed method was shown to be approximately 17% less sensitive in detecting bias-type faults, but outperformed such methods by approximately 81% when the fault was defined by changes in covariance. Finally, the mechanization of the proposed method lent itself to a practical and robust guarantee of performance for the user in the form of a so-called Guaranteed Position Zone. The proposed GPZ ellipsoid error bound was shown to contain the true vehicle position within the specified error rate during all phases of the FDE process, without the need to constrain the fault-present distribution or the probability of missed detection. This research directly enables self-correcting, plug-and-play, open-architecture navigation systems as well as APNT in the challenging application of all-source multi-domain navigation. Future work in this area will focus on experimental results using real-world data, further modeling and simulation of multiple simultaneous faults, and research into how to compare the GPZ ellipsoid to a predetermined alert limit in order to warn users of degraded performance.

HOW TO CITE THIS ARTICLE

Jurado J, Raquet J, Schubert Kabban CM, Gipson J. Residual-based multi-filter methodology for all-source fault detection, exclusion, and performance monitoring. NAVIGATION. 2020;67: 493–509. https://doi.org/10.1002/navi.384

DISCLAIMER

The views expressed in this paper are those of the authors, and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government.

Footnotes

  • At the time of writing, he was a Ph.D. student at the Air Force Institute of Technology, OH 45324, USA.

  • At the time of writing, he was a Professor of Electrical Engineering at the Air Force Institute of Technology, OH 45324, USA.

  • Funding information

    Air Force Materiel Command

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

REFERENCES

  1. Bageshwar, V. L., Gebre-Egziabher, D., Garrard, W. L., & Georgiou, T. T. (2009). Stochastic observability test for discrete-time Kalman filters. Journal of Guidance, Control, and Dynamics, 32(4), 1356–1370. https://doi.org/10.2514/1.38128
  2. Bhatti, U. I. (2006). An improved sensor level integrity algorithm for GPS/INS integrated system. In Proceedings of ION GNSS 19th International Technical Meeting, pp. 3012–3023. Retrieved from https://www.ion.org/publications/abstract.cfm?articleID=7063
  3. Bhatti, U. I., Ochieng, W. Y., & Feng, S. (2007a). Integrity of an integrated GPS/INS system in the presence of slowly growing errors. Part I: A critical review. GPS Solutions, 11(3), 173–181. https://doi.org/10.1007/s10291-006-0048-2
  4. Bhatti, U. I., Ochieng, W. Y., & Feng, S. (2007b). Integrity of an integrated GPS/INS system in the presence of slowly growing errors. Part II: Analysis. GPS Solutions, 11(3), 183–192. https://doi.org/10.1007/s10291-006-0049-1
  5. Bonferroni, C. E. (1936). Teoria statistica delle classi e calcolo delle probabilità. Libreria internazionale Seeber.
  6. Brenner, M. (1990). Implementation of a RAIM monitor in a GPS receiver and an integrated GPS/IRS. In Proceedings of the 3rd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GPS 1990), pp. 397–406. Retrieved from https://www.ion.org/publications/abstract.cfm?articleID=5044
  7. Brenner, M. (1996). Integrated GPS/inertial fault detection availability. NAVIGATION, 43(2), 111–130. https://doi.org/10.1002/j.2161-4296.1996.tb01920.x
  8. Brumback, B., & Srinath, M. (1987). A chi-square test for fault-detection in Kalman filters. IEEE Transactions on Automatic Control, 32(6), 552–554. https://doi.org/10.1109/TAC.1987.1104658
  9. Call, C., Ibis, M., McDonald, J., & Vanderwerf, K. (2006). Performance of Honeywell’s Inertial/GPS Hybrid (HIGH) for RNP operations. In 2006 IEEE/ION Position, Location, And Navigation Symposium, p. 244. https://doi.org/10.1109/PLANS.2006.1650610
  10. Canciani, A., & Raquet, J. (2016). Absolute positioning using the earth’s magnetic anomaly field. NAVIGATION, 63(2), 111–126. https://doi.org/10.1002/navi.138
  11. Casella, G., & Berger, R. L. (2002). Statistical inference (2nd ed.). Pacific Grove, CA: Duxbury.
  12. Curro, J., & Raquet, J. (2016). Navigation using VLF environmental features. In 2016 IEEE/ION Position, Location and Navigation Symposium (PLANS), pp. 373–379. https://doi.org/10.1109/PLANS.2016.7479723
  13. De Maesschalck, R., Jouan-Rimbaud, D., & Massart, D. L. (2000). The Mahalanobis distance. Chemometrics and Intelligent Laboratory Systems, 50(1), 1–18. https://doi.org/10.1016/S0169-7439(99)00047-7
  14. Goeman, J. J., & Solari, A. (2014). Multiple hypothesis testing in genomics. Statistics in Medicine, 33(11), 1946–1978. https://doi.org/10.1002/sim.6082
  15. Grejner-Brzezinska, D. A., Toth, C. K., Moore, T., Raquet, J. F., Miller, M. M., & Kealy, A. (2016). Multisensor navigation systems: A remedy for GNSS vulnerabilities? Proceedings of the IEEE, 104(6), 1339–1353. https://doi.org/10.1109/JPROC.2016.2528538
  16. Joerger, M., Chan, F.-C., & Pervan, B. (2014). Solution separation versus residual-based RAIM. NAVIGATION, 61(4), 273–291. https://doi.org/10.1002/navi.71
  17. Jurado, J., Raquet, J., & Schubert Kabban, C. M. (2019). Autonomous and resilient management of all-source sensors for navigation. In Proceedings of the ION 2019 Pacific PNT Meeting, pp. 142–159. https://doi.org/10.33012/2019.16800
  18. Jurado, J. D., Raquet, J. F., & Schubert Kabban, C. M. (2020). Single-filter finite fault detection and exclusion methodology for real-time validation of plug-and-play sensors. IEEE Transactions on Aerospace and Electronic Systems. Advance online publication. https://doi.org/10.1109/TAES.2020.3010394
  19. Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35–45. Retrieved from https://www.cs.unc.edu/∼welch/kalman/media/pdf/Kalman1960.pdf
  20. Kay, S. M. (1993). Fundamentals of statistical signal processing, volume I: Estimation theory. Upper Saddle River, NJ: Prentice Hall.
  21. Kay, S. M. (1998). Fundamentals of statistical signal processing, volume II: Detection theory. Upper Saddle River, NJ: Prentice Hall.
  22. Kerr, T. (1980). Statistical analysis of a two-ellipsoid overlap test for real-time failure detection. IEEE Transactions on Automatic Control, 25(4), 762–773. https://doi.org/10.1109/TAC.1980.1102423
  23. Lee, Y. C. (1986). Analysis of range and position comparison methods as a means to provide GPS integrity in the user receiver. In Proceedings of the 42nd Annual Meeting of the Institute of Navigation, pp. 1–4. Retrieved from https://www.ion.org/publications/abstract.cfm?articleID=12197
  24. Maybeck, P. S. (1982). Stochastic models, estimation, and control: Volume 1. New York, NY: Academic Press.
  25. Maybeck, P. S. (1984). Stochastic models, estimation, and control: Volume 2. New York, NY: Academic Press.
  26. Parkinson, B. W., & Axelrad, P. (1988). Autonomous GPS integrity monitoring using the pseudorange residual. NAVIGATION, 35(2), 255–274. https://doi.org/10.1002/j.2161-4296.1988.tb00955.x
  27. Sturza, M. A. (1988). Navigation system integrity monitoring using redundant measurements. NAVIGATION, 35(4), 483–501. https://doi.org/10.1002/j.2161-4296.1988.tb00975.x
  28. van Graas, F., & Farrell, J. L. (1993). Baseline fault detection and exclusion algorithm. In Proceedings of the 49th Annual Meeting of The Institute of Navigation, pp. 413–420. Retrieved from https://www.ion.org/publications/abstract.cfm?articleID=4504
  29. Van Loan, C. F. (1978). Computing integrals involving the matrix exponential. IEEE Transactions on Automatic Control, 23(3), 395–404. https://doi.org/10.1109/TAC.1978.1101743
  30. Venable, D. T. (2016). Improving real world performance of vision aided navigation in a flight environment (PhD dissertation). Air Force Institute of Technology WPAFB.
  31. Wan, E. A., & Van Der Merwe, R. (2000). The unscented Kalman filter for nonlinear estimation. In Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (Cat. No.00EX373), pp. 153–158. https://doi.org/10.1109/ASSPCC.2000.882463
  32. Young, R. S., & Mcgraw, G. A. (2003). Fault detection and exclusion using normalized solution separation and residual monitoring methods. NAVIGATION, 50(3), 151–169. https://doi.org/10.1002/j.2161-4296.2003.tb00326.x