Research Article | Original Article
Open Access

Data-driven protection levels for camera and 3D map-based safe urban localization

Shubh Gupta and Grace Gao
NAVIGATION: Journal of the Institute of Navigation September 2021, 68 (3) 643-660; DOI: https://doi.org/10.1002/navi.445
Shubh Gupta, Stanford University
Grace Gao, Stanford University
  • For correspondence: [email protected]

Abstract

Reliably assessing the error in an estimated vehicle position is integral for ensuring the vehicle’s safety in urban environments. Many existing approaches use GNSS measurements to characterize protection levels (PLs) as probabilistic upper bounds on position error. However, GNSS signals might be reflected or blocked in urban environments, and thus additional sensor modalities need to be considered to determine PLs. In this paper, we propose an approach for computing PLs by matching camera image measurements to a LiDAR-based 3D map of the environment. We specify a Gaussian mixture model probability distribution of position error using deep neural-network-based data-driven models and statistical outlier weighting techniques. From the probability distribution, we compute PL by evaluating the position error bound using numerical line-search methods. Through experimental validation with real-world data, we demonstrate that the PLs computed from our method are reliable bounds on the position error in urban environments.

KEYWORDS
  • deep learning
  • image registration
  • integrity monitoring
  • localization safety
  • protection level
  • vision-based localization

1 INTRODUCTION

In recent years, research on autonomous navigation for urban environments has been garnering increasing attention. Many publications have targeted different aspects of navigation such as route planning (Delling et al., 2017), perception (Jensen et al., 2016), and localization (Caselitz et al., 2016; Wolcott & Eustice, 2017). For trustworthy operation in each of these aspects, assessing the level of safety of the vehicle from potential system failures is critical. However, few works have examined the problem of safety quantification for autonomous vehicles.

In the context of satellite-based localization, safety is typically addressed via integrity monitoring (IM) (Spilker Jr. et al., 1996). Within IM, protection levels (PLs) specify a statistical upper bound on the error in an estimated position of the vehicle, which can be trusted to enclose the position errors with a required probabilistic guarantee. To detect an unsafe estimated vehicle position, these protection levels are compared with the maximum allowable position error value, known as the alarm limit.

Various methods (Cezón et al., 2013; Jiang & Wang, 2016; Tran & Lo Presti, 2019) have been proposed over the years for computing protection levels; however, most of these approaches focus on GNSS-only navigation. They do not directly apply to GNSS-denied urban environments, where visual sensors are becoming increasingly preferred (Badue et al., 2021). Among the visual sensors available on the market, cameras are inexpensive, lightweight, and widely employed in industry. For quantifying localization safety in GNSS-denied urban environments, there is thus a need to develop new ways of computing protection levels using camera image measurements.

Since protection levels are bounds over position error, computing them from camera image measurements requires a model that relates the measurements to position error in the estimate of the vehicle location. Furthermore, since the lateral, longitudinal, and vertical directions are well-defined with respect to a vehicle’s location on the road, the model must estimate the maximum position error in each of these directions for computing protection levels (Reid et al., 2019). However, characterizing such a model is not so straightforward. This is because the relation between a vehicle location in an environment and the corresponding camera image measurement is complex, depending on identifying and matching structural patterns in the measurements with prior known information about the environment (Caselitz et al., 2016; Kim et al., 2018; Taira et al., 2021; Wolcott & Eustice, 2017).

Recently, data-driven techniques based on deep neural networks (DNN) have demonstrated state-of-the-art performance in determining the state of the camera sensor, comprised of its position and orientation, by identifying and matching patterns in images with a known map of the environment (Cattaneo et al., 2019; Lyrio et al., 2015; Oliveira et al., 2020) or an existing database of images (Sarlin et al., 2019; Taira et al., 2021).

By leveraging data sets consisting of multiple images with known camera states in an environment, these approaches can train a DNN to model the relationship between an image and the corresponding state. However, the model characterized by the DNN can often be erroneous or brittle. For instance, recent research has shown that the output of a DNN can change significantly with minimal changes to the inputs (Recht et al., 2019). Thus, for using DNNs to determine position error, uncertainty in the output of the DNN must also be addressed.

DNN-based algorithms consider two types of uncertainty (Kendall & Gal, 2017; Loquercio et al., 2020). Aleatoric or statistical uncertainty results from the noise present in the inputs to the DNN, because of which a precise output cannot be produced. For camera image inputs, sources of noise include illumination changes, occlusion, or the presence of visually ambiguous structures, such as windows tessellated along a wall (Kendall & Gal, 2017). On the other hand, epistemic or systematic uncertainty exists within the model itself. Sources of epistemic uncertainty include poorly determined DNN model parameters as well as external factors that are not considered in the model (Kiureghian & Ditlevsen, 2009), such as environmental features which might be ignored by the algorithm while matching the camera images to the environment map.

While aleatoric uncertainty is typically modeled as the input-dependent variance in the output of the DNN (Kendall & Gal, 2017; McAllister et al., 2017; Yang et al., 2020), epistemic uncertainty relates to the DNN model and, therefore, requires further deliberation. Existing approaches approximate epistemic uncertainty by assuming a probability distribution over the weight parameters of the DNN to represent ignorance about the correct parameters (Blundell et al., 2015; Gal & Ghahramani, 2016; Kendall & Cipolla, 2016).

However, these approaches assume that a correct value of the parameters exists and that the probability distribution over the weight parameters captures the uncertainty in the model, neither of which necessarily holds in practice (Smith & Gal, 2018). This inability of existing DNN-based methods to properly characterize uncertainty limits their applicability to safety-critical applications, such as the localization of autonomous vehicles.

In this paper, we propose a novel method for computing protection levels associated with a given vehicular state estimate (position and orientation) from camera image measurements and a 3D map of the environment. This work is based on our recent ION GNSS+ 2020 conference paper (Gupta & Gao, 2020) and includes additional experiments and improvements to the DNN training process.

Recently, high-definition 3D environment maps in the form of LiDAR point clouds have become increasingly available through industry players such as HERE, TomTom, Waymo, and NVIDIA, as well as through projects such as USGS 3DEP (Lukas & Stoker, 2016) and OpenTopography (Krishnan et al., 2011). Furthermore, LiDAR-based 3D maps are more robust to noise from environmental factors, such as illumination and weather, than image-based maps (Wang et al., 2020). Hence, we use LiDAR-based 3D point cloud maps in our approach.

Previously, CMRNet (Cattaneo et al., 2019) has been proposed as a DNN-based approach for determining the vehicular state from camera images and a LiDAR-based 3D map. In our approach, we extend the DNN architecture proposed in Cattaneo et al. (2019) to model the position error and the covariance matrix (aleatoric uncertainty) in the vehicular state estimate.

To assess the epistemic uncertainty in position error, we evaluate the DNN position error outputs at multiple candidate states in the vicinity of the state estimate, and combine the outputs into samples of the state estimate position error. Figure 1 shows the architecture of our proposed approach.

Given a state estimate, we first select multiple candidate states from its neighborhood. Using the DNN, we then evaluate the position error and covariance for each candidate state by comparing the camera image measurement with a local map constructed from the candidate state and 3D environment map. Next, we linearly transform the position error and covariance outputs from the DNN with relative positions of candidate states into samples of the state estimate position error and variance. We then separate these samples into the lateral, longitudinal, and vertical directions and weight the samples to mitigate the impact of outliers in each direction. Subsequently, we combine the position error samples, outlier weights, and variance samples to construct a Gaussian mixture model probability distribution of the position error in each direction, and numerically evaluate its intervals to compute protection levels.

FIGURE 1

Architecture of our proposed approach for computing protection levels. Given a state estimate, multiple candidate states are selected from its neighborhood and the corresponding position error and the covariance matrix for each candidate state are evaluated using the DNN. The position errors and covariance are then linearly transformed to obtain samples of the state estimate position error and variance, which are then weighted to determine outliers. Finally, the position error samples, outlier weights, and variance are combined to construct a Gaussian mixture model probability distribution, from which the lateral, longitudinal, and vertical protection levels are computed through numerical evaluation of its probability intervals

Our main contributions are as follows:

  1. We extend the CMRNet (Cattaneo et al., 2019) architecture to model both the position error in the vehicular state estimate and the associated covariance matrix. Using the 3D LiDAR-based map of the environment, we first construct a local map representation with respect to the vehicular state estimate. Then, we use the DNN to analyze correspondence between the camera image measurement and the local map for determining the position error and the covariance matrix

  2. We develop a novel method for capturing epistemic uncertainty in the DNN position error output. Unlike existing approaches which assume a probability distribution over DNN weight parameters, we directly analyze different position errors that are determined by the DNN for multiple candidate states selected from within a neighborhood of the state estimate. The position error outputs from the DNN corresponding to the candidate states are then linearly combined with the candidate states’ relative position from the state estimate to obtain an empirical distribution of the state estimate position error

  3. We design an outlier weighting scheme to account for possible errors in the DNN output at inputs that differ from the training data. Our approach weighs the position error samples from the empirical distribution using a robust outlier detection metric known as a robust Z-score (Iglewicz & Hoaglin, 1993), along the lateral, longitudinal, and vertical directions individually

  4. We construct the lateral, longitudinal, and vertical protection levels as intervals over the probability distribution of the position error. We model this probability distribution as a Gaussian mixture model (Lindsay, 1995) from the position error samples, DNN covariance, and outlier weights

  5. We demonstrate the applicability of our approach in urban environments by experimentally validating the protection levels computed from our method using real-world data with multiple camera images and different state estimates

The remainder of this paper is structured as follows: Section 2 discusses related work; Section 3 formulates the problem of estimating protection levels; Section 4 describes the two types of uncertainties considered in our approach; Section 5 details our algorithm; Section 6 presents the results from experimentation with real-world data; and we conclude the paper in Section 7.

2 RELATED WORK

Several methods have been developed over the years which characterize protection levels in the context of GNSS-based urban navigation. Jiang and Wang (2016) computed horizontal protection levels using an iterative search-based method and test statistic based on the bivariate normal distribution. Cezón et al. (2013) analyzed methods which utilize the isotropy of residual vectors from the least-squares position estimation to compute the protection levels. Tran and Lo Presti (2019) combined advanced receiver autonomous integrity monitoring (ARAIM) with Kalman filtering, and computed the protection levels by considering the set of position solutions which arise after excluding faulty measurements.

These approaches compute the protection levels by deriving the mathematical relation between measurement and position domain errors. However, such a relation is difficult to formulate with camera image measurements and a LiDAR-based 3D map, since the position error in this case depends on various factors such as the structure of buildings in the environment, available visual features, and illumination levels.

Previous works have proposed IM approaches for LiDAR- and camera-based navigation where the vehicle is localized by associating identified landmarks with a stored map or a database. Joerger and Pervan (2019) developed a method to quantify integrity risk for LiDAR-based navigation algorithms by analyzing failures of feature extraction and data association subroutines. Zhu et al. (2020) derived a bound on the integrity risk in camera-based navigation using EKF caused by incorrect feature associations.

However, these IM approaches have been developed for localization algorithms based on data association and cannot be directly applied to many recent camera and LiDAR-based localization techniques which use deep learning to model the complex relation between measurements and the stored map or database. Furthermore, these IM techniques do not estimate protection levels, which are the focus of our work.

Deep learning has been widely applied to determine position information from camera images. Kendall et al. (2015) trained a DNN using images from a single environment to learn the relationship between image and the camera 6-DOF pose. Taira et al. (2021) learned image features using a DNN to apply feature extraction and matching techniques to estimate the 6-DOF camera pose relative to a known 3D map of the environment. Sarlin et al. (2019) developed a deep learning-based 2D-3D matching technique to obtain a 6-DOF camera pose from images and a 3D environment model. However, these approaches do not model the corresponding uncertainty associated with the estimated camera pose, or account for failures in DNN approximation (Smith & Gal, 2018), which is necessary for characterizing safety measures such as protection levels.

Some recent works have proposed to estimate the uncertainty associated with deep learning algorithms. Kendall and Cipolla (2016) estimate the uncertainty in DNN-based camera pose estimation from images by evaluating the network multiple times through dropout (Gal & Ghahramani, 2016). Loquercio et al. (2020) propose a general framework for estimating uncertainty in deep learning as variance computed from both aleatoric and epistemic sources. McAllister et al. (2017) suggest using Bayesian deep learning to determine uncertainty and quantify safety in autonomous vehicles by placing probability distributions over DNN weights to represent the uncertainty in the DNN model. Yang et al. (2020) jointly estimate the vehicle odometry, scene depth, and uncertainty from sequential camera images.

However, the uncertainty estimates from these algorithms do not take into account the inaccuracy of the trained DNN model, or the influence of the underlying environment structure on the DNN outputs. In our approach, we evaluate the DNN position error outputs at inputs corresponding to multiple states in the environment, and utilize these position errors for characterizing uncertainty both from inaccuracy in the DNN model as well as from the environment structure around the state estimate.

To the best of our knowledge, our approach is the first that applies data-driven algorithms for computing protection levels by characterizing the uncertainty from different error sources. The proposed method seeks to leverage the high-fidelity function modeling capability of DNNs and combine it with techniques from robust statistics and integrity monitoring to compute robust protection levels using camera image measurements and 3D maps of the environment.

3 PROBLEM FORMULATION

Consider the scenario of a vehicle navigating in an urban environment using measurements acquired by an onboard camera. The 3D LiDAR map of the environment ℳ that consists of points p ∈ ℝ3 is assumed to be pre-known from either openly available repositories (Krishnan et al., 2011; Lukas & Stoker, 2016) or simultaneous localization and mapping algorithms (Cadena et al., 2016).

The vehicular state st = [xt, ot] at time t is a seven-element vector comprising its 3D position xt = [xt,yt,zt]T ∈ ℝ3 along the x, y, and z dimensions as well as its 3D orientation unit quaternion ot = [o1,t,o2,t,o3,t,o4,t] ∈ SU(2). The vehicle state estimates over time are denoted as {st : t = 1, …, Tmax}, where Tmax denotes the total time in a navigation sequence. At each time t, the vehicle captures an RGB camera image It ∈ ℝl × w × 3 from the onboard camera, where l and w denote the number of pixels along the length and width dimensions, respectively.

Given an integrity risk specification IR, our objective is to compute the lateral protection level PLlat,t, longitudinal protection level PLlon,t, and vertical protection level PLvert,t at time t, which denote the maximal bounds on the position error magnitude with a probabilistic guarantee of at least 1 − IR. Considering x, y, and z dimensions in the rotational frame of the vehicle:

Embedded Image 1

where Embedded Image denotes the unknown true vehicle position at time t.
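
The extracted text does not reproduce Equation (1). A plausible reconstruction from the surrounding description, stated here in our own notation as an assumption rather than the authors' exact formula, defines each protection level as the tightest bound that holds with probability at least 1 − IR:

```latex
% Hedged reconstruction of Equation (1); x_t^*, y_t^*, z_t^* denote the
% unknown true position and IR is the integrity risk specification.
\begin{aligned}
PL_{\mathrm{lat},t}  &= \inf\left\{\rho \ge 0 : \mathbb{P}\left(\lvert x_t - x_t^{*}\rvert \le \rho\right) \ge 1 - IR\right\}\\
PL_{\mathrm{lon},t}  &= \inf\left\{\rho \ge 0 : \mathbb{P}\left(\lvert y_t - y_t^{*}\rvert \le \rho\right) \ge 1 - IR\right\}\\
PL_{\mathrm{vert},t} &= \inf\left\{\rho \ge 0 : \mathbb{P}\left(\lvert z_t - z_t^{*}\rvert \le \rho\right) \ge 1 - IR\right\}
\end{aligned}
```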

4 TYPES OF UNCERTAINTY IN POSITION ERROR

Protection levels for a state estimate st at time t depend on the uncertainty in determining the associated position error Δxt = [Δxt,Δyt,Δzt] between the state estimate position xt and the true position Embedded Image from the camera image It and the environment map ℳ. We consider two different kinds of uncertainty, which are categorized by the source of inaccuracy in determining the position error Δxt: aleatoric uncertainty and epistemic uncertainty.

4.1 Aleatoric uncertainty

Aleatoric uncertainty refers to the uncertainty from noise present in the camera image measurements It and the environment map ℳ, due to which a precise value of the position error Δxt cannot be determined. Existing DNN-based localization approaches model the aleatoric uncertainty as a covariance matrix with only diagonal entries (Kendall & Gal, 2017; McAllister et al., 2017; Yang et al., 2020) or with both diagonal and off-diagonal terms (Liu et al., 2018; Russell & Reale, 2019).

Similar to the existing approaches, we characterize the aleatoric uncertainty by using a DNN to model the covariance matrix Σt associated with the position error Δxt. We consider both nonzero diagonal and off-diagonal terms in Σt to model the correlation between x-, y-, and z-dimension uncertainties, such as along the ground plane.

Aleatoric uncertainty by itself does not accurately represent the uncertainty in determining position error. This is because aleatoric uncertainty assumes that the noise present in training data also represents the noise in all future inputs and the DNN approximation is error-free. These assumptions fail in scenarios when the input at evaluation time is different from the training data or when the input contains features that occur rarely in the real world (Smith & Gal, 2018). Thus, relying purely on aleatoric uncertainty can lead to overconfident estimates of the position error uncertainty (Kendall & Gal, 2017).

4.2 Epistemic uncertainty

Epistemic uncertainty relates to the inaccuracies in the model for determining the position error Δxt. In our approach, we characterize the epistemic uncertainty by leveraging a geometrical property of the position error Δxt, where for the same camera image It, Δxt can be obtained by linearly combining the position error Embedded Image computed for any candidate state Embedded Image and the relative position of Embedded Image from the state estimate st (Figure 2). Hence, using known relative positions and orientations of NC candidate states Embedded Image from st, we transform the different position errors Embedded Image determined for the candidate states into samples of the state estimate position error Δxt. The empirical distribution comprised of these position error samples characterizes the epistemic uncertainty in the position error estimated using the DNN.

FIGURE 2

Position error Δxt in the state estimate position xt is a linear combination of the position error Embedded Image in position Embedded Image of any candidate state Embedded Image and the relative position vector between Embedded Image

5 DATA-DRIVEN PROTECTION LEVELS

This section details our algorithm for computing data-driven protection levels for the state estimate st at time t, using the camera image It and environment map ℳ. First, we describe the method for generating local representations of the 3D environment map ℳ with respect to the state estimate st. Then, we illustrate the architecture of the DNN. Next, we discuss the loss functions used in DNN training. We then detail the method for selecting multiple candidate states from the neighborhood of the state estimate st.

Using the position errors and covariance matrix evaluated from the DNN for each of these candidate states, we then illustrate the process for transforming the candidate state position errors into multiple samples of the state estimate position error. To mitigate the impact of outliers on the computed position error samples in each of the lateral, longitudinal, and vertical directions, we then detail the procedure for computing outlier weights. Next, we characterize the probability distribution over position error in lateral, longitudinal, and vertical directions. Finally, we detail the approach for determining protection levels from the probability distribution by numerical methods.

5.1 Local map construction

A local representation of the 3D LiDAR map of the environment captures the environment information in the vicinity of the state estimate st at time t. By comparing the environment information captured in the local map with the camera image It ∈ ℝl × w × 3 using a DNN, we estimate the position error Δxt and covariance Σt in the state estimate st.

For computing local maps, we utilize the LiDAR-image generation procedure described in Cattaneo et al. (2019). Similar to their approach, we generate the local map L(s, ℳ) ∈ ℝl × w associated with vehicle state s and LiDAR environment map ℳ in two steps.

  1. First, we determine the rigid-body transformation matrix Hs in the special Euclidean group SE(3) corresponding to the vehicle state s:

    Embedded Image 2

    where

    • – Rs denotes the rotation matrix corresponding to the orientation quaternion elements o = [o1,o2,o3,o4] in the state s

    • – Ts denotes the translation vector corresponding to the position elements x = [x,y,z] in the state s

    Using the matrix Hs, we rotate and translate the points in the map ℳ to the map ℳs in the reference frame of the state s:

    Embedded Image 3

    where I denotes the identity matrix. For maintaining computational efficiency in the case of large maps, we use the points in the LiDAR map ℳs that lie in a subregion around the state s and in the direction of the vehicle orientation.

  2. In the second step, we apply the occlusion estimation filter presented in Pintus et al. (2011) to identify and remove occluded points along rays from the camera center. For each pair of points (p(i),p(j)) where p(i) is closer to the state s, p(j) is marked occluded if the angle between the ray from p(j) to the camera center and the line from p(j) to p(i) is less than a threshold. Then, the remaining points are projected to the camera image frame using the camera projection matrix K to generate the local depth map L(s, ℳ). The i-th point p(i) in ℳs is projected as:

    Embedded Image 4

    where

    • – px,py denote the projected 2D coordinates with scaling term c

    • – [L(s, ℳ)](px, py) denotes the (px,py) pixel position in the local map L(s, ℳ)

The local depth map L(s, ℳ) for state s visualizes the environment features that are expected to be captured in a camera image obtained from the state s. However, the obtained camera image It is associated with the true state Embedded Image that might be different from the state estimate st. Nevertheless, for reasonably small position and orientation differences between the state estimate st and true state Embedded Image, the local map L(s, ℳ) contains features that correspond with some of the features in the camera image It that we use to estimate the position error.
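
A concrete sketch of this local map construction is given below, assuming a pinhole camera model with intrinsic matrix K. It is a minimal illustration rather than the authors' implementation: the occlusion estimation filter of Pintus et al. (2011) is replaced by a simple z-buffer, and all names and defaults are our own.

```python
import numpy as np

def local_depth_map(points, R_s, T_s, K, img_h, img_w, max_range=100.0):
    """Render a local depth map L(s, M) for a state s (minimal sketch).

    points   : (N, 3) LiDAR map points in the world frame
    R_s, T_s : rotation matrix and translation of the state s
    K        : (3, 3) camera intrinsic matrix
    """
    # Transform the map points into the reference frame of the state s
    # (row-vector form of R_s^T (p - T_s)).
    pts_s = (points - T_s) @ R_s

    # Keep points in front of the camera and within a range limit.
    mask = (pts_s[:, 2] > 0.1) & (np.linalg.norm(pts_s, axis=1) < max_range)
    pts_s = pts_s[mask]

    # Project onto the image plane: [c*px, c*py, c]^T = K p.
    proj = pts_s @ K.T
    px = (proj[:, 0] / proj[:, 2]).astype(int)
    py = (proj[:, 1] / proj[:, 2]).astype(int)
    depth = proj[:, 2]

    # Rasterize, keeping the closest depth per pixel (simple z-buffer
    # instead of the angle-based occlusion filter described above).
    L = np.zeros((img_h, img_w))
    inside = (px >= 0) & (px < img_w) & (py >= 0) & (py < img_h)
    for x, y, d in sorted(zip(px[inside], py[inside], depth[inside]),
                          key=lambda t: -t[2]):
        L[y, x] = d
    return L
```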

5.2 DNN architecture

We use a DNN to estimate the position error Δxt and associated covariance matrix Σt by implicitly identifying and comparing the positions of corresponding features in camera image It and the local depth map L(st, ℳ) associated with the state estimate st.

The architecture of our DNN is given in Figure 3. Our DNN comprises two separate modules: one for estimating the position error Δxt and the other for estimating the parameters of the covariance matrix Σt. The first module for estimating the position error Δxt is based on CMRNet (Cattaneo et al., 2019).

FIGURE 3

Architecture of our deep neural network (DNN) for estimating translation and rotation errors as well as parameters of the covariance matrix. The translation and rotation errors are determined using CMRNet (Cattaneo et al., 2019), which employs correlation layers (Dosovitskiy et al., 2015) for comparing feature representations of the camera image and the local depth map. Using a similar architecture, we design CovarianceNet, which produces the parameters of the covariance matrix associated with the translation error output

CMRNet was originally proposed as an algorithm to iteratively determine the position and orientation of a vehicle using a camera image and 3D LiDAR map, starting from a provided initial state. For determining position error Δxt using CMRNet, we use the state estimate st as the provided initial state and the corresponding DNN translation Embedded Image and rotation Embedded Image error output for transforming the state st towards the true state Embedded Image. Formally, given any state s and camera image It at time t, the translation error Embedded Image and rotation error Embedded Image are expressed as:

Embedded Image 5

CMRNet estimates the rotation error Embedded Image as a unit quaternion. Furthermore, the architecture determines both the translation error Embedded Image and rotation error Embedded Image in the reference frame of the state s. Since the protection levels depend on the position error Δx in the reference frame from which the camera image It is captured (the vehicle reference frame), we transform the translation error Embedded Image to the vehicle reference frame by rotating it with the inverse of Embedded Image:

Embedded Image 6

where Embedded Image is the 3 × 3 rotation matrix corresponding to the rotation error quaternion Embedded Image.

In the second module, we determine the covariance matrix Σ associated with Δx by first estimating the covariance matrix Embedded Image associated with the translation error Embedded Image obtained from CMRNet and then transforming it to the vehicle reference frame using Embedded Image.

We model the covariance matrix Embedded Image by following a similar approach to Russell & Reale (2019). Since the covariance matrix is both symmetric and positive-definite, we consider the decomposition of Embedded Image into diagonal standard deviations σ = [σ1,σ2,σ3] and correlation coefficients η = [η21,η31,η32]:

Embedded Image 7

where i, j ∈ {1, 2, 3} and j < i. We estimate these terms using our second DNN module (referred to as CovarianceNet), which has a network structure similar to CMRNet but with 256 and 6 artificial neurons in the last two fully connected layers to prevent overfitting.

For stable training, CovarianceNet outputs the logarithm of the standard deviation, which is converted to the standard deviation by taking its exponent. Additionally, we use the tanh function to scale the correlation coefficient outputs η of CovarianceNet between ±1. Formally, given a vehicle state s and camera image It at time t, the standard deviation σ and correlation coefficients η are approximated as:

Embedded Image 8

Using the constructed Embedded Image from the obtained σ, η, we obtain the covariance matrix Σ associated with Δx as:

Embedded Image 9

We keep the aleatoric uncertainty restricted to position domain errors in this work for simplicity, and thus treat Embedded Image as a point estimate. The impact of errors in estimating Embedded Image on protection levels is taken into consideration as epistemic uncertainty and discussed in more detail in Sections 5.5 and 5.7.

The feature extraction modules in CovarianceNet and CMRNet are separate since the two tasks are complementary; for estimating position error, the DNN must learn features that are robust to noise in the inputs while the variance in the estimated position error depends on the noise itself.
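
The covariance parameterization described above can be sketched as follows. This reflects our reading of Equation (7), which is not reproduced in the extracted text, so the exact construction may differ from the authors'; the log-standard-deviation and tanh mappings are applied as stated in the text.

```python
import torch

def covariance_from_dnn_outputs(log_sigma, corr_raw):
    """Build the 3x3 covariance matrix from CovarianceNet outputs (sketch).

    log_sigma : (..., 3) log standard deviations
    corr_raw  : (..., 3) unconstrained values mapped to correlations
                (eta_21, eta_31, eta_32) via tanh
    """
    sigma = torch.exp(log_sigma)   # strictly positive standard deviations
    eta = torch.tanh(corr_raw)     # correlation coefficients in (-1, 1)

    s1, s2, s3 = sigma.unbind(-1)
    e21, e31, e32 = eta.unbind(-1)

    # Sigma_ii = sigma_i^2 and Sigma_ij = eta_ij * sigma_i * sigma_j.
    row1 = torch.stack([s1 * s1, e21 * s1 * s2, e31 * s1 * s3], dim=-1)
    row2 = torch.stack([e21 * s1 * s2, s2 * s2, e32 * s2 * s3], dim=-1)
    row3 = torch.stack([e31 * s1 * s3, e32 * s2 * s3, s3 * s3], dim=-1)
    return torch.stack([row1, row2, row3], dim=-2)
```

Note that this direct construction does not by itself guarantee positive definiteness for every combination of correlations; a Cholesky-style parameterization is a common alternative when that guarantee is required.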

5.3 Loss functions

The loss function for training the DNN must penalize position error outputs that differ from the corresponding ground truth present in the dataset, as well as penalize any covariance that overestimates or underestimates the uncertainty in the position error predictions. Furthermore, the loss must incentivize the DNN to extract useful features from the camera image and local map inputs for predicting the position error. Hence, we consider three additive components in our loss function ℒ(·):

Embedded Image 10

where:

  • – Embedded Image, Embedded Image denote the vector-valued translation and rotation errors, in the reference frame of the state estimate s, to the unknown true state s*

  • – ℒHuber(·) denotes the Huber loss function (Huber, 1992)

  • – ℒMLE(·) denotes the loss function for the maximum likelihood estimation of the position error Δx and covariance Embedded Image

  • – ℒAng(·) denotes the quaternion angular distance from Cattaneo et al. (2019)

  • – αHuber,αMLE,αAng are coefficients for weighting each loss term

We employ the Huber loss ℒHuber(·) and quaternion angular distance ℒAng(·) terms from Cattaneo et al. (2019). The Huber loss term ℒHuber(·) penalizes the translation error output Embedded Image of the DNN:

Embedded Image 11

where δ is a hyperparameter for adjusting the penalty assignment to small error values. In this paper, we set δ = 1. Unlike the more common mean squared error, the penalty assigned to higher error values is linear in Huber loss instead of quadratic. Thus, Huber loss is more robust to outliers and leads to more stable training as compared to squared error. The quaternion angular distance term ℒAng(·) penalizes the rotation error output Embedded Image from CMRNet:

Embedded Image 12

where:

  • – qi denotes the i-th element in quaternion q

  • – Δr− 1 denotes the inverse of the quaternion Δr

  • – q × r here denotes element-wise multiplication of the quaternions q and r

  • – atan2(·) is the two-argument version of the arctangent function.

Including the quaternion angular distance term ℒAng(·) in the loss function incentivizes the DNN to learn features that are relevant to the geometry between the camera image and the local depth map. Hence, it provides additional supervision to the DNN training as a multi-task objective (Zeng & Ji, 2015), and is important for the stability and speed of the training process.

The maximum likelihood loss term ℒMLE(·) depends on both the translation error Embedded Image and covariance matrix Embedded Image estimated from the DNN. The loss function is analogous to the negative log-likelihood of the Gaussian distribution:

Embedded Image 13

If the covariance output from the DNN has small values, the corresponding translation error is penalized much more than the translation error corresponding to a large-valued covariance. Hence, the maximum likelihood loss term ℒMLE(·) incentivizes the DNN to output small covariance only when the corresponding translation error output has high confidence, and otherwise output large covariance.
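
A minimal sketch of the combined training loss is given below. Since Equations (10) to (13) are not reproduced in the extracted text, the Gaussian negative log-likelihood and the quaternion angular distance are written in standard forms, and the interface is our own assumption rather than the authors' code.

```python
import torch
import torch.nn.functional as F

def total_loss(pred_trans, pred_rot, pred_cov, gt_trans, gt_rot,
               a_huber=1.0, a_mle=1.0, a_ang=1.0):
    """Three-term loss: Huber + Gaussian NLL + quaternion angular distance.

    pred_trans, gt_trans : (B, 3) translation errors
    pred_rot, gt_rot     : (B, 4) unit quaternions
    pred_cov             : (B, 3, 3) covariance of the translation error
    """
    # Huber loss on the translation error (delta = 1, as in the paper).
    l_huber = F.huber_loss(pred_trans, gt_trans, delta=1.0)

    # Gaussian negative log-likelihood with a full covariance matrix.
    resid = (gt_trans - pred_trans).unsqueeze(-1)                  # (B, 3, 1)
    mahal = resid.transpose(-1, -2) @ torch.linalg.solve(pred_cov, resid)
    l_mle = 0.5 * (mahal.squeeze() + torch.logdet(pred_cov)).mean()

    # A standard quaternion angular distance between the predicted and
    # ground-truth rotations (the paper uses an atan2-based form).
    dot = torch.abs((pred_rot * gt_rot).sum(dim=-1)).clamp(max=1.0)
    l_ang = (2.0 * torch.acos(dot)).mean()

    return a_huber * l_huber + a_mle * l_mle + a_ang * l_ang
```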

5.4 Multiple candidate state selection

To assess the uncertainty in the DNN-based position error estimation process as well as uncertainty from environmental factors, we evaluate the DNN output at NC candidate states Embedded Image in the neighborhood of the state estimate st.

For selecting the candidate states Embedded Image, we randomly generate multiple values of translation offset {t1, …,tNC} and rotation offset {r1, …,rNC} about the state estimate st, where NC is the total number of selected candidate states. The i-th translation offset ti ∈ ℝ3 denotes translation in the x, y, and z dimensions and is sampled from a uniform probability distribution over the range ±tmax in each dimension.

Similarly, the i-th rotation offset ri ∈ SU(2) is obtained by uniformly sampling between ±rmax angular deviations about each axis and converting the resulting rotation to a quaternion. The i-th candidate state Embedded Image is generated by rotating and translating the state estimate st by ri and ti, respectively. Corresponding to each candidate state Embedded Image, we generate a local depth map Embedded Image using the procedure laid out in Section 5.1.
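
The candidate-state selection can be sketched as below. The per-axis uniform sampling follows the text; the quaternion ordering and the frame in which the offsets are applied are our assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def sample_candidate_states(position, quaternion, n_candidates,
                            t_max, r_max_deg, rng=None):
    """Sample candidate states around the state estimate (sketch).

    position   : (3,) position of the state estimate
    quaternion : (4,) orientation in scipy's (x, y, z, w) convention
    t_max      : translation range in meters (per axis)
    r_max_deg  : rotation range in degrees (per axis)
    """
    rng = np.random.default_rng() if rng is None else rng
    base_rot = Rotation.from_quat(quaternion)

    candidates = []
    for _ in range(n_candidates):
        t_i = rng.uniform(-t_max, t_max, size=3)
        r_i = Rotation.from_euler(
            "xyz", rng.uniform(-r_max_deg, r_max_deg, size=3), degrees=True)

        # Candidate state: translated and rotated version of the estimate.
        cand_position = position + t_i
        cand_quaternion = (r_i * base_rot).as_quat()
        candidates.append((cand_position, cand_quaternion, t_i))
    return candidates
```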

5.5 Linear transformation of position errors

Using each local depth map Embedded Image and camera image It for the i-th candidate state Embedded Image as inputs to the DNN in Section 5.2, we evaluate the candidate state position error Embedded Image and covariance matrix Embedded Image. From the known translation offset ti between the candidate state Embedded Image and the state estimate st and the DNN-based rotation error Embedded Imaget in st, we compute the transformation matrix Embedded Image for converting the candidate state position error Embedded Image to the state estimate position error Δxt in the vehicle reference frame:

Embedded Image 14

where I3 × 3 denotes the identity matrix and Embedded Image is the 3 × 3 rotation matrix computed from the DNN-based rotation error Embedded Image between the state estimate st and the unknown true state Embedded Image. Note that the rotation offset ri is not used in the transformation, since we are only concerned with the position errors from the true state Embedded Image to the state estimate st, which are invariant to the orientation of the candidate state Embedded Image. Using the transformation matrix Embedded Image, we obtain the i-th sample of the state estimate position error Embedded Image:

Embedded Image 15

We use parentheses in the notation Embedded Image for the transformed samples of the position error between the true state Embedded Image and the state estimate st to differentiate from the position error Embedded Image between Embedded Image and the candidate state Embedded Image. Next, we modify the candidate state covariance matrix Embedded Image to account for uncertainty in DNN-based rotation error Embedded Image. The resulting covariance matrix Embedded Image in terms of the covariance matrix Embedded Image for Embedded Image, Embedded Image and ti is:

Embedded Image 16

Assuming small errors in determining the true rotation offsets between state estimate st and the true state Embedded Image, we consider the random variable Embedded Image where R′ represents the random rotation matrix corresponding to small angular deviations (Barfoot et al., 2011). Using Embedded Image, we approximate the covariance matrix Embedded Image as:

Embedded Image 17

where Embedded Image represents the i-th row vector in R′ − I. Since errors in Embedded Image depend on the DNN output, we specify R′ through the empirical distribution of the angular deviations in Embedded Image as observed for the trained DNN on the training and validation data, and precompute the expectation Qi′j′ for each (i′,j′) pair.

The samples of state estimate position error Embedded Image represent both inaccuracy in the DNN estimation as well as uncertainties due to environmental factors.

If the DNN approximation fails at the input corresponding to the state estimate st, the estimated position errors at candidate states would lead to a wide range of different values for the state estimate position errors. Similarly, if the environment map ℳ near the state estimate st contains repetitive features, the position errors computed from candidate states would be different and hence indicate high uncertainty.
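
The transformation of the candidate-state errors into samples of the state-estimate position error can be sketched as below. Equations (14) and (15) are not reproduced in the extracted text, so the exact frame handling, in particular the direction in which the DNN rotation error is applied to the known offset, is an assumption; the sketch only encodes the geometric relation of Figure 2.

```python
import numpy as np

def position_error_samples(cand_errors, cand_offsets, R_rot_err):
    """Combine candidate errors with their known offsets (sketch).

    cand_errors  : (N, 3) DNN position errors for the candidate states
    cand_offsets : (N, 3) known translation offsets t_i of the candidates
    R_rot_err    : (3, 3) rotation matrix from the DNN rotation error at s_t
    """
    samples = []
    for err, t_i in zip(cand_errors, cand_offsets):
        # State-estimate error = candidate error + candidate offset,
        # with the offset rotated into the vehicle frame (the use of the
        # transpose here is an assumption about the frame convention).
        samples.append(err + R_rot_err.T @ t_i)
    return np.asarray(samples)
```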

5.6 Outlier weights

Since the candidate states Embedded Image are selected randomly, some position error samples may correspond to the local depth map and camera image pairs for which the DNN performs poorly. Thus, we compute outlier weights Embedded Image corresponding to the position error samples Embedded Image to mitigate the effect of these erroneous position error values in determining the protection levels.

We compute outlier weights in each of the x, y, and z dimensions separately, since the DNN approximation might not necessarily fail in all of its outputs. An example of this scenario would be when the input camera image and local map contain features such as building edges that can be used to robustly determine errors along certain directions but not others.

For computing the outlier weights Embedded Image associated with the i-th position error value Embedded Image, we employ the robust Z-score-based outlier detection technique (Iglewicz & Hoaglin, 1993). The robust Z-score is used in a variety of anomaly detection approaches due to its resilience to outliers (Rousseeuw & Hubert, 2018). We apply the following operations in each dimension X = x,y, and z:

  1. We compute the median absolute deviation statistic (Iglewicz & Hoaglin, 1993) MADX using all position error values Embedded Image:

    Embedded Image 18

  2. Using the statistic MADX, we compute the robust Z-score Embedded Image for each position error value Embedded Image:

    Embedded Image 19

    The robust Z-score Embedded Image is high if the position error Δx(i) deviates from the median error by a large amount relative to the median deviation.

  3. We compute the outlier weights Embedded Image from the robust Z-scores Embedded Image by applying the softmax operation (Goodfellow et al., 2016) such that the sum of weights is unity:

    Embedded Image 20

    where γ denotes the scaling coefficient in the softmax function. We set γ = 0.6745, the approximate value of the inverse cumulative distribution function of the standard normal distribution evaluated at 3/4, to make the scaling in the statistic consistent with the standard deviation of a normal distribution (Iglewicz & Hoaglin, 1993). A small value of the outlier weight Embedded Image indicates that the position error Embedded Image is an outlier.

For brevity, we extract the diagonal variances associated with each dimension for all position error samples:

Embedded Image 21
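
A minimal sketch of the outlier weighting steps 1 to 3 above, applied to the position error samples of a single dimension, is given below.

```python
import numpy as np

def outlier_weights(errors, gamma=0.6745):
    """Robust Z-score outlier weights for one dimension (sketch)."""
    errors = np.asarray(errors, dtype=float)

    # Step 1: median absolute deviation (MAD) of the samples.
    med = np.median(errors)
    mad = max(np.median(np.abs(errors - med)), 1e-9)  # guard against MAD = 0

    # Step 2: robust Z-score of each sample.
    z = np.abs(errors - med) / mad

    # Step 3: softmax with scaling coefficient gamma; larger deviations from
    # the median get smaller weights, and the weights sum to one.
    w = np.exp(-gamma * z)
    return w / w.sum()
```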

5.7 Probability distribution of position error

We construct a probability distribution in each of the X = x, y, and z dimensions from the previously obtained samples of position errors Embedded Image, variances Embedded Image, and outlier weights Embedded Image. We model the probability distribution using the Gaussian mixture model (GMM) distribution (Lindsay, 1995):

Embedded Image 22

where:

  • – ρX,t denotes the position error random variable

  • – Embedded Image is the Gaussian distribution with mean μ and variance σ2

The probability distributions ℙ(ρx,t), ℙ(ρy,t), and ℙ(ρz,t) incorporate both aleatoric uncertainty from the DNN-based covariance and epistemic uncertainty from the multiple DNN evaluations associated with different candidate states. Both the position error and covariance matrix depend on the rotation error point estimate from CMRNet for transforming the error values to the vehicle reference frame.

Since each DNN evaluation for a candidate state estimates the rotation error independently, the epistemic uncertainty incorporates the effects of errors in the DNN-based estimation of both rotation and translation. The epistemic uncertainty is reflected in the multiple GMM components and their weight coefficients, which represent the different possible position error values that may arise from the same camera image measurement and the environment map. The aleatoric uncertainty enters as the variance of the individual components, each of which represents the noise in one possible value of the position error.
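
The per-direction Gaussian mixture model can be assembled as sketched below, with each position error sample as a component mean, the corresponding DNN-based variance as the component variance, and the outlier weight as the mixture weight; the returned callable is the mixture CDF used in the next subsection.

```python
import numpy as np
from scipy.stats import norm

def gmm_cdf(err_samples, variances, weights):
    """Cumulative distribution function of the per-direction GMM (sketch).

    err_samples : (N,) position error samples in one direction
    variances   : (N,) corresponding DNN-based variances
    weights     : (N,) outlier weights (summing to one)
    """
    means = np.asarray(err_samples, dtype=float)
    sigmas = np.sqrt(np.asarray(variances, dtype=float))
    weights = np.asarray(weights, dtype=float)

    def cdf(rho):
        # The mixture CDF is the weighted sum of the component Gaussian CDFs.
        return float(np.sum(weights * norm.cdf(rho, loc=means, scale=sigmas)))

    return cdf
```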

5.8 Protection levels

We compute the protection levels along the lateral, longitudinal, and vertical directions using the probability distributions obtained in the previous section. Since the position errors are in the vehicle reference frame, the x, y, and z dimensions coincide with the lateral, longitudinal, and the vertical directions, respectively. First, we obtain the cumulative distribution function CDF(·) for each probability distribution:

Embedded Image 23

where Φ(·) is the cumulative distribution function of the standard normal distribution.

Then, for a specified value of the integrity risk IR, we compute the protection level PL in the lateral, longitudinal, and vertical directions from Equation 1 using the CDF as the probability distribution. For the numerical optimization, we employ a simple interval-halving line search, i.e., the bisection method (Burden & Faires, 2011). To account for both positive and negative errors, we perform the optimization using both the CDF (supremum) and 1 − CDF (infimum) with IR/2 as the integrity risk, and use the maximum absolute value as the protection level.

The computed protection levels consider heavy-tails in the GMM probability distribution of the position error that arise because of the different possible values of the position error that can be computed from the available camera measurements and environment map. Our method computes large protection levels when many different values of position error may be equally probable from the measurements, resulting in larger tail probabilities in the GMM, and small protection levels only if the uncertainty from both aleatoric and epistemic sources is small.
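
A sketch of the numerical protection level computation is shown below; it applies bisection to the mixture CDF from the previous sketch, bounding the positive and negative tails with IR/2 each and taking the larger magnitude. The search range and tolerance are illustrative choices, not values from the paper.

```python
def protection_level(cdf, integrity_risk, search_max=50.0, tol=1e-3):
    """Protection level from a mixture CDF via bisection (sketch)."""
    half_ir = integrity_risk / 2.0

    def bisect(target, lo, hi):
        # Find rho with cdf(rho) close to target (cdf is non-decreasing).
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        return hi

    # Positive tail: smallest rho with CDF(rho) >= 1 - IR/2.
    upper = bisect(1.0 - half_ir, -search_max, search_max)
    # Negative tail: rho with CDF(rho) close to IR/2.
    lower = bisect(half_ir, -search_max, search_max)

    return max(abs(upper), abs(lower))
```

For example, a lateral protection level could then be obtained as protection_level(gmm_cdf(samples_x, variances_x, weights_x), IR), using the mixture CDF sketched in the previous subsection.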

6 EXPERIMENTAL RESULTS

6.1 Real-world driving dataset

We use the KITTI visual odometry dataset (Geiger et al., 2012) to evaluate the performance of the protection levels computed by our approach. The dataset was recorded around Karlsruhe, Germany, over multiple driving sequences and contains images recorded by multiple onboard cameras, along with ground truth positions and orientations.

Additionally, the dataset contains LiDAR point cloud measurements, which we use to generate the environment map corresponding to each sequence. Since our approach for computing protection levels requires only a monocular camera, we use the images recorded by the left RGB camera in our experiments. We use sequences 00, 03, 05, 06, 07, 08, and 09 from the dataset based on the availability of a LiDAR environment map. We use sequence 00 for validating our approach and the remaining sequences for training our DNN. The experimental parameters are provided in Table 1.

TABLE 1

Experimental parameters

6.2 LiDAR environment map

To construct a precise LiDAR point cloud map ℳ of the environment, we exploit the openly available position and orientation values for the dataset computed via simultaneous localization and mapping (Caselitz et al., 2016). Similar to Cattaneo et al. (2019), we aggregate the LiDAR point clouds across all time instances. Then, we detect and remove sparse outliers within the aggregated point cloud by computing the Z-score (Iglewicz & Hoaglin, 1993) of each point in a 0.1 m local neighborhood and discarding points with a Z-score higher than 3. Finally, the remaining points are downsampled into a voxel map of the environment ℳ with a resolution of 0.1 m. The corresponding map for sequence 00 in the KITTI dataset is shown in Figure 4. For storing large maps, we divide the LiDAR point cloud sequences into multiple overlapping parts and construct separate maps of roughly 500 megabytes each.
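
A minimal sketch of the voxel downsampling step is shown below; it averages the points that fall into each 0.1 m voxel and omits the Z-score-based outlier removal for brevity.

```python
import numpy as np

def voxel_downsample(points, voxel_size=0.1):
    """Average LiDAR points within each voxel of the grid (sketch)."""
    points = np.asarray(points, dtype=float)

    # Integer voxel index of each point.
    idx = np.floor(points / voxel_size).astype(np.int64)

    # Group the points by voxel and average them.
    _, inverse, counts = np.unique(idx, axis=0, return_inverse=True,
                                   return_counts=True)
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]
```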

FIGURE 4

3D LiDAR environment map from KITTI dataset sequence 00 (Geiger et al., 2012)

6.3 DNN training and testing datasets

We generate the training dataset for our DNN in two steps. First, we randomly select a state estimate st at time t from within a 2 m translation and a 10° rotation of the ground truth positions and orientations in each driving sequence. The translation and rotation used for generating the state estimate are used as the ground-truth position error Embedded Image and orientation error Embedded Image.

Then, using the LiDAR map ℳ, we generate the local depth map L(st, ℳ) corresponding to the state estimate st and use it as the DNN input along with the camera image It from the driving sequence data. The training dataset comprises camera images from 11,455 different time instances, with the state estimate selected at runtime so as to have different state estimates for the same camera images in different epochs.

Similar to the data augmentation techniques described in Cattaneo et al. (2019), we:

  1. Randomly changed contrast, saturation, and brightness of images

  2. Applied random rotations in the range of ±5° to both the camera images and local depth maps

  3. Horizontally mirrored the camera image and computed the local depth map using a modified camera projection matrix

All three of these data augmentation techniques are used in training CMRNet in the first half of the optimization process. However, for training CovarianceNet, we skip the contrast, saturation, and brightness changes during the second half of the optimization so that the DNN can learn real-world noise features from camera images.

We generate the validation and test datasets from sequence 00 in the KITTI odometry dataset, which is not used for training. We follow a similar procedure as the one for generating the training dataset, except we do not augment the data. The validation dataset comprises 100 randomly selected time instances from sequence 00, while the test dataset contains the remaining 4,441 time instances in sequence 00.

6.4 Training procedure

We train the DNN using stochastic gradient descent. Directly optimizing via the maximum likelihood loss term ℒMLE(·) might suffer from instability caused by the interdependence between the translation error Embedded Image and covariance Embedded Image outputs (Skafte et al., 2019). Therefore, we employ the mean-variance split training strategy proposed in Skafte et al. (2019): First, we set (αHuber = 1, αMLE = 1, αAng = 1) and only optimize the parameters of CMRNet until the validation error stops decreasing. Next, we set (αHuber = 0, αMLE = 1, αAng = 0) and optimize the parameters of CovarianceNet. We alternate between these two steps until the validation loss stops decreasing.

Our DNN is implemented using the PyTorch library (Paszke et al., 2019) and takes advantage of the open-source implementation available for CMRNet (Cattaneo et al., 2019) as well as the available pre-trained weights for initialization. Similar to CMRNet, all the layers in our DNN use the leaky ReLU activation function with a negative slope of 0.1. We train the DNN using a single NVIDIA Tesla P40 GPU with a batch size of 24 and a learning rate of 10−5 selected via grid search.

6.5 Metrics

We evaluated the lateral, longitudinal, and vertical protection levels computed with our approach using the following three metrics (with subscript t dropped for brevity):

  1. Bound gap measures the difference between the computed protection levels PLlat, PLlon, PLvert, and the true position error magnitude during nominal operations (protection level is less than the alarm limit and greater than the position error):

    Embedded Image 24

    where:

    • – BGlat, BGlon, and BGvert denote bound gaps in lateral, longitudinal, and vertical dimensions respectively

    • – avg(·) denotes the average computed over the test dataset for which the value of protection level is greater than the position error and less than the alarm limit

    A small bound gap value BGlat, BGlon, and BGvert is desirable because it implies that the algorithm both estimates the position error magnitude during nominal operations accurately and has low uncertainty in the prediction. We only consider the bound gap for nominal operations since the estimated position is declared unsafe when the protection level exceeds the alarm limit.

  2. Failure rate measures the total fraction of time instances in the test data sequence for which the computed protection levels PLlat,PLlon, and PLvert are smaller than the true position error magnitude:

    Embedded Image 25

    where:

    • – FRlat, FRlon, and FRvert denote failure rates for lateral, longitudinal, and vertical protection levels, respectively

    • – Embedded Image denotes the indicator function computed using the protection level and true position error values at time t. The indicator function evaluates to 1 if the event in its argument holds true, and otherwise evaluates to 0

    • – Tmax denotes the total time duration of the test sequence

    The failure rate FRlat, FRlon, and FRvert should be consistent with the specified value of the integrity risk IR to meet the safety requirements.

  3. False alarm rate is computed for specified alarm limits ALlat, ALlon, and ALvert in the lateral, longitudinal, and vertical directions and measures the fraction of time instances in the test data sequence for which the computed protection levels PLlat, PLlon, and PLvert exceed the alarm limits ALlat, ALlon, and ALvert while the position error magnitude is within the alarm limits. We first define the following integrity events:

    Embedded Image 26

The complement of each event is denoted by Embedded Image. Next, we define the counts for false alarms NX,FA, true alarms NX,TA, and the number of times the position error exceeds the alarm limit NX,PE with X = lat, lon, and vert:

Embedded Image 27

Finally, we compute the false alarm rates FARlat, FARlon, and FARvert after normalizing the total number of position error magnitudes lying above and below the alarm limit AL:

Embedded Image 28
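
These metrics can be computed per direction as sketched below. Since Equations (24) to (28) are not reproduced in the extracted text, the exact normalization of the false alarm rate here is our assumption.

```python
import numpy as np

def integrity_metrics(pl, err, alarm_limit):
    """Failure rate, bound gap, and false alarm rate for one direction (sketch).

    pl  : array of protection levels over the test sequence
    err : array of true position error magnitudes
    """
    pl, err = np.asarray(pl, dtype=float), np.asarray(err, dtype=float)

    # Failure rate: fraction of epochs where the PL fails to bound the error.
    failure_rate = np.mean(pl < err)

    # Bound gap: average PL - error over nominal epochs (err <= PL < AL).
    nominal = (pl >= err) & (pl < alarm_limit)
    bound_gap = float(np.mean(pl[nominal] - err[nominal])) if nominal.any() else float("nan")

    # False alarm rate: PL exceeds the alarm limit while the error does not,
    # normalized by the number of epochs with error within the alarm limit.
    false_alarms = np.sum((pl >= alarm_limit) & (err < alarm_limit))
    false_alarm_rate = false_alarms / max(np.sum(err < alarm_limit), 1)

    return failure_rate, bound_gap, false_alarm_rate
```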

6.6 Results

Figure 5 shows the lateral and longitudinal protection levels computed by our approach on two 200 s subsets of the test sequence. For clarity, protection levels are computed at every 5th time instance. Similarly, Figure 6 shows the vertical protection levels along with the vertical position error magnitude in a subset of the test sequence.

FIGURE 5

Lateral and longitudinal protection level results on the test sequence in real-world dataset. We show protection levels for two subsets of the total sequence, computed at 5 s intervals. The protection levels successfully enclose the state estimates in ∼ 99% of the cases

FIGURE 6

Vertical protection level results on the test sequence in real-world dataset. We show protection levels for a subset of the total sequence. The protection levels successfully enclose the position error magnitudes with a small bound gap

As can be seen from both the figures, the computed protection levels successfully enclose the position error magnitudes at a majority of the points (∼ 99%) in the visualized subsequences. Furthermore, the vertical protection levels are observed to be visually closer to the position error as compared to the lateral and longitudinal protection levels. This is due to the superior performance of the DNN in determining position errors along the vertical dimension, which is easier to determine since all the camera images in the dataset are captured by a ground-based vehicle.

Figure 7 displays the integrity diagrams generated following the Stanford-ESA integrity diagram proposed for SBAS integrity (Tossaint et al., 2007). The diagram is generated from 15,000 samples of protection levels corresponding to randomly selected state estimates and camera images within the test sequence.

FIGURE 7

Integrity diagram results for the lateral, longitudinal, and vertical protection levels. The diagram contains protection levels evaluated across 15,000 different state estimates and camera images randomly selected from the test sequence. A majority of the samples are close to and greater than the position error magnitude, validating the applicability of the computed protection levels as a robust safety measure

For protection levels in each direction, we set the alarm limit (Table 1) based on the specifications suggested for mid-size vehicles in Reid et al. (2019), beyond which the state estimate is declared unsafe to use. The lateral, longitudinal, and vertical protection levels are greater than the position error magnitudes in ∼ 99% cases, which is consistent with the specified integrity requirement. Furthermore, a large fraction of the failures is in the region where the protection level is greater than the alarm limit and thus the system has been correctly identified to be under unsafe operation.

We conducted an ablation study to numerically evaluate the impact of our proposed epistemic uncertainty measure and outlier weighting method in computing protection levels. We evaluated protection levels in three different cases: Incorporating DNN covariance, epistemic uncertainty, and outlier weighting (VAR+EO); incorporating just the DNN covariance and epistemic uncertainty with equal weights assigned to all position error samples (VAR+E); and only using the DNN covariance (VAR).

For VAR, we constructed a Gaussian distribution using the DNN position error output and diagonal variance entries in each dimension. Then, we computed protection levels from the inverse cumulative distribution function of the Gaussian distribution corresponding to the specified value of integrity risk IR. Table 2 summarizes our results.

TABLE 2: Evaluation of lateral, longitudinal, and vertical protection levels from our approach. We compare protection levels computed by our trained model using the DNN covariance, epistemic uncertainty, and outlier weighting (VAR+EO); the DNN covariance and epistemic uncertainty (VAR+E); and only the DNN covariance (VAR). Incorporating epistemic uncertainty results in a lower failure rate, while incorporating outlier weights reduces the bound gap and false alarm rate.

Incorporating the epistemic uncertainty in computing protection levels improved the failure rate from 0.05 (lateral), 0.05 (longitudinal), and 0.03 (vertical) to within 0.01 in all cases. This is because the covariance estimate from the DNN provides an overconfident measure of uncertainty, which is corrected by our epistemic uncertainty measure. Furthermore, incorporating outlier weighting reduced the average nominal bound gap by about 0.02 m (lateral), 0.05 m (longitudinal), and 0.05 m (vertical), and reduced the false alarm rate by about 0.02 in each direction, while keeping the failure rate within the specified integrity risk requirement.
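The scalar metrics compared in Table 2 can be computed along the following lines. The definitions below are a reasonable reading of failure rate, false alarm rate, and nominal bound gap, stated per direction; they are a sketch, not verbatim formulas from our evaluation code.

```python
import numpy as np

def pl_metrics(pos_err, pl, alarm_limit):
    """Failure rate, false alarm rate, and mean nominal bound gap for one
    direction (illustrative definitions)."""
    err = np.abs(np.asarray(pos_err))
    pl = np.asarray(pl)
    failure_rate = float(np.mean(err > pl))                                       # PL fails to bound the error
    false_alarm_rate = float(np.mean((pl > alarm_limit) & (err <= alarm_limit)))  # alarm without a hazardous error
    nominal = (err <= pl) & (pl <= alarm_limit)
    mean_bound_gap = float(np.mean(pl[nominal] - err[nominal]))                   # looseness of the bound in nominal cases
    return failure_rate, false_alarm_rate, mean_bound_gap
```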

The mean bound gap between the lateral protection levels computed by our approach and the position error magnitudes in the nominal cases is smaller than a quarter of the width of a standard US lane. The bound gap is somewhat larger in the longitudinal direction, since fewer visual features are present along the road for determining the position error with the DNN, and smaller in the vertical direction, owing to the DNN's superior performance in estimating vertical position errors and uncertainty. This demonstrates the applicability of our approach to urban roads.

For an integrity risk requirement of 0.01, the protection levels computed by our method exhibit a failure rate of at most 0.01 as well. However, further lowering the integrity risk requirement in our experiments either did not produce a corresponding improvement in the failure rate or caused a significant increase in the bound gaps and the false alarm rate.

A possible reason is that the uncertainty approximated by our approach, through both the aleatoric and epistemic measures, fails to provide an accurate uncertainty representation for integrity risk requirements smaller than 0.01. Future research will consider larger and more varied training data, better strategies for selecting candidate states, and different DNN architectures to meet smaller integrity risk requirements.

A shortcoming of our approach is the large false alarm rate exhibited by the computed protection levels in Table 2. This large value results both from the inherent noise in the DNN-based estimation of position and rotation error and from frequently selecting candidate states that produce large outlier error values. A direction for future work is to explore strategies for selecting candidate states and mitigating outliers to reduce the false alarm rate.

A key advantage of our approach is its applicability to scenarios in which a direct analysis of the error sources in the state estimation algorithm is difficult, such as when feature-rich visual information is processed by a machine learning algorithm to estimate the state. In such scenarios, our approach computes protection levels separately from the state estimation algorithm, both by evaluating a data-driven model of the position error uncertainty and by characterizing the epistemic uncertainty in the model outputs.
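As an illustration of the second ingredient, one common way to characterize epistemic uncertainty in a DNN's outputs is Monte Carlo dropout (Gal & Ghahramani, 2016). The sketch below shows the general idea for a position-error regressor; the model interface and the choice of Monte Carlo dropout are illustrative assumptions, not the specific architecture or epistemic measure used in our approach.

```python
import torch

def mc_dropout_uncertainty(model, image, map_input, n_samples=30):
    """Illustrative Monte Carlo dropout: keep dropout active at test time, run
    several stochastic forward passes, and use the spread of the predictions as
    an epistemic uncertainty estimate. The (image, map_input) interface is a
    placeholder, not the DNN described in this paper."""
    model.train()                      # keep dropout layers stochastic
    with torch.no_grad():
        preds = torch.stack([model(image, map_input) for _ in range(n_samples)])
    model.eval()
    mean = preds.mean(dim=0)           # averaged position-error prediction
    epistemic_var = preds.var(dim=0)   # sample variance across passes
    return mean, epistemic_var
```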

7 CONCLUSION

In this work, we presented a data-driven approach for computing lateral, longitudinal, and vertical protection levels associated with a given state estimate from camera images and a 3D LiDAR map of the environment. Our approach estimates both aleatoric and epistemic measures of uncertainty for computing protection levels, thereby providing robust measures of localization safety.

We demonstrated the efficacy of our method on real-world data in terms of bound gap, failure rate, and false alarm rate. The results show that the lateral, longitudinal, and vertical protection levels computed by our method enclose the position error magnitudes with a failure probability of 0.01 and a bound gap of less than 1 m in all directions, demonstrating that our approach is applicable to GNSS-denied urban environments.

HOW TO CITE THIS ARTICLE

Gupta S, Gao G. Data-driven protection levels for camera and 3D map-based safe urban localization. NAVIGATION. 2021;68:643–660. https://doi.org/10.1002/navi.445

ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under award #2006162.

Footnotes

  • Funding information

    NSF, Grant/Award Number: #2006162

  • Received October 19, 2020.
  • Revision received April 13, 2021.
  • Accepted July 15, 2021.
  • © 2021 Institute of Navigation

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

REFERENCES

1. Badue, C., Guidolini, R., Carneiro, R. V., Azevedo, P., Cardoso, V. B., Forechi, A., Jesus, L., Berriel, R., Paixão, T. M., Mutz, F., de Paula Veronese, L., Oliveira-Santos, T., & De Souza, A. F. (2021). Self-driving cars: A survey. Expert Systems with Applications, 165. https://doi.org/10.1016/j.eswa.2020.113816
2. Barfoot, T., Forbes, J. R., & Furgale, P. T. (2011). Pose estimation using linearized rotations and quaternion algebra. Acta Astronautica, 68(1-2), 101–112. https://doi.org/10.1016/j.actaastro.2010.06.049
3. Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight uncertainty in neural network. In F. Bach & D. Blei (Eds.), Proc. of the 32nd International Conference on Machine Learning, Vol. 37 of Proceedings of Machine Learning Research (pp. 1613–1622). Lille, France: PMLR.
4. Burden, R. L., & Faires, J. D. (2011). Numerical Analysis. Cengage Learning.
5. Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., Reid, I., & Leonard, J. J. (2016). Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 32(6), 1309–1332. https://doi.org/10.1109/TRO.2016.2624754
6. Caselitz, T., Steder, B., Ruhnke, M., & Burgard, W. (2016). Monocular camera localization in 3D LiDAR maps. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, South Korea, 1926–1931. https://doi.org/10.1109/IROS.2016.7759304
7. Cattaneo, D., Vaghi, M., Ballardini, A. L., Fontana, S., Sorrenti, D. G., & Burgard, W. (2019). CMRNet: Camera to LiDAR-map registration. 2019 IEEE Intelligent Transportation Systems Conference (ITSC), 1283–1289. https://doi.org/10.1109/ITSC.2019.8917470
8. Cezón, A., Cueto, M., & Fernández, I. (2013). Analysis of multi-GNSS service performance assessment: ARAIM vs. IBPL performances comparison. Proc. of the 26th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2013), Nashville, TN, 2654–2663. https://www.ion.org/publications/abstract.cfm?articleID=11407
9. Delling, D., Goldberg, A. V., Pajor, T., & Werneck, R. F. (2017). Customizable route planning in road networks. Transportation Science, 51(2), 566–591. https://doi.org/10.1287/trsc.2014.0579
10. Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Smagt, P. v. d., Cremers, D., & Brox, T. (2015). FlowNet: Learning optical flow with convolutional networks. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2758–2766. https://doi.org/10.1109/ICCV.2015.316
11. Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In M. F. Balcan & K. Q. Weinberger (Eds.), Proc. of the 33rd International Conference on Machine Learning, Vol. 48 of Proceedings of Machine Learning Research, New York, NY, 1050–1059. https://dl.acm.org/doi/10.5555/3045390.3045502
12. Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, 3354–3361. https://doi.org/10.1109/CVPR.2012.6248074
13. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
14. Gupta, S., & Gao, G. X. (2020). Data-driven protection levels for camera and 3D map-based safe urban localization. Proc. of the 33rd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2020), 2483–2499. https://doi.org/10.33012/2020.17698
15. Huber, P. J. (1992). Robust estimation of a location parameter. In S. Kotz & N. L. Johnson (Eds.), Breakthroughs in Statistics, Springer Series in Statistics, New York, NY, 492–518. https://doi.org/10.1007/978-1-4612-4380-9_35
16. Iglewicz, B., & Hoaglin, D. C. (1993). How to Detect and Handle Outliers. The ASQC Basic References in Quality Control: Statistical Techniques. ASQC Quality Press.
17. Jensen, M. B., Philipsen, M. P., Mogelmose, A., Moeslund, T. B., & Trivedi, M. M. (2016). Vision for looking at traffic lights: Issues, survey, and perspectives. IEEE Transactions on Intelligent Transportation Systems, 17(7), 1800–1815. https://doi.org/10.1109/TITS.2015.2509509
18. Jiang, Y., & Wang, J. (2016). A new approach to calculate the horizontal protection level. The Journal of Navigation, 69(1), 57–74. https://doi.org/10.1017/S0373463315000545
19. Joerger, M., & Pervan, B. (2019). Quantifying safety of laser-based navigation. IEEE Transactions on Aerospace and Electronic Systems, 55(1), 273–288. https://doi.org/10.1109/TAES.2018.2850381
20. Kendall, A., & Cipolla, R. (2016). Modelling uncertainty in deep learning for camera relocalization. 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 4762–4769. https://doi.org/10.1109/ICRA.2016.7487679
21. Kendall, A., & Gal, Y. (2017). What uncertainties do we need in Bayesian deep learning for computer vision? https://arxiv.org/abs/1703.04977
22. Kendall, A., Grimes, M., & Cipolla, R. (2015). PoseNet: A convolutional network for real-time 6-DOF camera relocalization. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2938–2946. https://doi.org/10.1109/ICCV.2015.336
23. Kim, Y., Jeong, J., & Kim, A. (2018). Stereo camera localization in 3D LiDAR maps. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–9. https://doi.org/10.1109/IROS.2018.8594362
24. Kiureghian, A. D., & Ditlevsen, O. (2009). Aleatory or epistemic? Does it matter? Structural Safety, 31(2), 105–112. https://doi.org/10.1016/j.strusafe.2008.06.020
25. Krishnan, S., Crosby, C., Nandigam, V., Phan, M., Cowart, C., Baru, C., & Arrowsmith, R. (2011). OpenTopography: A services oriented architecture for community access to LIDAR topography. Proc. of the 2nd International Conference on Computing for Geospatial Research & Applications (COM.Geo '11), Washington, DC, 1–8. https://doi.org/10.1145/1999320.1999327
26. Lindsay, B. G. (1995). Mixture Models: Theory, Geometry, and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics, IMS.
27. Liu, K., Ok, K., Vega-Brown, W., & Roy, N. (2018). Deep inference for covariance estimation: Learning Gaussian noise models for state estimation. 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, 1436–1443. https://doi.org/10.1109/ICRA.2018.8461047
28. Loquercio, A., Segu, M., & Scaramuzza, D. (2020). A general framework for uncertainty estimation in deep learning. IEEE Robotics and Automation Letters, 5(2), 3153–3160. https://doi.org/10.1109/LRA.2020.2974682
29. Lukas, V., & Stoker, J. M. (2016). 3D Elevation Program—Virtual USA in 3D. US Geological Survey, Reston, VA. https://doi.org/10.3133/fs20163022
30. Lyrio, L. J., Oliveira-Santos, T., Badue, C., & De Souza, A. F. (2015). Image-based mapping, global localization and position tracking using VG-RAM weightless neural networks. 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, 3603–3610. https://doi.org/10.1109/ICRA.2015.7139699
31. McAllister, R., Gal, Y., Kendall, A., van der Wilk, M., Shah, A., Cipolla, R., & Weller, A. (2017). Concrete problems for autonomous vehicle safety: Advantages of Bayesian deep learning. Proc. of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 4745–4753. https://doi.org/10.24963/ijcai.2017/661
32. Oliveira, G. L., Radwan, N., Burgard, W., & Brox, T. (2020). Topometric localization with deep learning. In N. M. Amato, G. Hager, S. Thomas, & M. Torres-Torriti (Eds.), Robotics Research, Vol. 10, The 18th International Symposium ISRR, Springer Proceedings in Advanced Robotics. https://doi.org/10.1007/978-3-030-28619-4_38
33. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.
34. Pintus, R., Gobbetti, E., & Agus, M. (2011). Real-time rendering of massive unstructured raw point clouds using screen-space operators. Proc. of the 12th International Conference on Virtual Reality, Archaeology and Cultural Heritage, 105–112. https://dl.acm.org/doi/10.5555/2384495.2384513
35. Recht, B., Roelofs, R., Schmidt, L., & Shankar, V. (2019). Do ImageNet classifiers generalize to ImageNet? In K. Chaudhuri & R. Salakhutdinov (Eds.), Proc. of the 36th International Conference on Machine Learning, Vol. 97 of Proceedings of Machine Learning Research, 5389–5400.
36. Reid, T. G. R., Houts, S. E., Cammarata, R., Mills, G., Agarwal, S., Vora, A., & Pandey, G. (2019). Localization requirements for autonomous vehicles. SAE International Journal of Connected and Automated Vehicles, 2(3), 173–190. https://doi.org/10.4271/12-02-03-0012
37. Rousseeuw, P. J., & Hubert, M. (2018). Anomaly detection by robust statistics. WIREs Data Mining and Knowledge Discovery, 8(2). https://doi.org/10.1002/widm.1236
38. Russell, R. L., & Reale, C. (2021). Multivariate uncertainty in deep learning. https://arxiv.org/pdf/1910.14215.pdf
39. Sarlin, P., Cadena, C., Siegwart, R., & Dymczyk, M. (2019). From coarse to fine: Robust hierarchical localization at large scale. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12708–12717. https://doi.org/10.1109/CVPR.2019.01300
40. Skafte, N., Jørgensen, M., & Hauberg, S. (2019). Reliable training and estimation of variance networks. 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, 6326–6336.
41. Smith, L., & Gal, Y. (2018). Understanding measures of uncertainty for adversarial example detection. https://arxiv.org/pdf/1803.08533.pdf
42. Spilker, J. J., Jr., Axelrad, P., Parkinson, B. W., & Enge, P. (Eds.) (1996). Global Positioning System: Theory and Applications, Volume I. Washington, DC: American Institute of Aeronautics and Astronautics. https://doi.org/10.2514/4.866388
43. Taira, H., Okutomi, M., Sattler, T., Cimpoi, M., Pollefeys, M., Sivic, J., Pajdla, T., & Torii, A. (2021). InLoc: Indoor visual localization with dense matching and view synthesis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(4), 1293–1307. https://doi.org/10.1109/TPAMI.2019.2952114
44. Tossaint, M., Samson, J., Toran, F., Ventura-Traveset, J., Hernandez-Pajares, M., Juan, J., Sanz, J., & Ramos-Bosch, P. (2007). The Stanford-ESA integrity diagram: A new tool for the user domain SBAS integrity assessment. NAVIGATION, 54(2), 153–162. https://doi.org/10.1002/j.2161-4296.2007.tb00401.x
45. Tran, H. T., & Lo Presti, L. (2019). Kalman filter-based ARAIM algorithm for integrity monitoring in urban environment. ICT Express, 5(1), 65–71. https://doi.org/10.1016/j.icte.2018.05.002
46. Wang, C., Wen, C., Dai, Y., Yu, S., & Liu, M. (2020). Urban 3D modeling with mobile laser scanning: A review. Virtual Reality & Intelligent Hardware, 2(3), 175–212. https://doi.org/10.1016/j.vrih.2020.05.003
47. Wolcott, R. W., & Eustice, R. M. (2017). Robust LIDAR localization using multiresolution Gaussian mixture maps for autonomous driving. The International Journal of Robotics Research, 36(3), 292–319. https://doi.org/10.1177/0278364917696568
48. Yang, N., Stumberg, L., Wang, R., & Cremers, D. (2020). D3VO: Deep depth, deep pose and deep uncertainty for monocular visual odometry. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, 1278–1289. https://doi.org/10.1109/CVPR42600.2020.00136
49. Zeng, T., & Ji, S. (2015). Deep convolutional neural networks for multi-instance multi-task learning. 2015 IEEE International Conference on Data Mining, Atlantic City, NJ, 579–588. https://doi.org/10.1109/ICDM.2015.92
50. Zhu, C., Joerger, M., & Meurer, M. (2020). Quantifying feature association error in camera-based positioning. 2020 IEEE/ION Position, Location and Navigation Symposium (PLANS), 967–972. https://doi.org/10.1109/PLANS46316.2020.9109919