Abstract
3D lidar-based simultaneous localization and mapping (SLAM) is a well-recognized solution for mapping and localization applications. However, the typical 3D lidar sensor (e.g., Velodyne HDL-32E) only provides a very limited field of view vertically. As a result, the vertical accuracy of pose estimation suffers. This paper aims to alleviate this problem by detecting the absolute ground plane to constrain vertical pose estimation. Different from the conventional relative plane constraint, this paper employs the absolute plane distance to refine the position in the z-axis and the norm vector of the ground plane to constrain the attitude drift. Finally, relative positioning from lidar odometry, constraint from ground plane detection, and loop closure are integrated under a proposed factor graph-based 3D lidar SLAM framework (AGPC-SLAM). The effectiveness is verified using several data sets collected in Hong Kong.
1 INTRODUCTION
Accurate mapping and localization (Dill & Uijt de Haag, 2016) are significant for autonomous systems such as autonomous driving vehicles (ADVs; Huang et al., 2019) and indoor mobile robotics (Hess et al., 2016). Great efforts have been devoted to achieving accurate simultaneous localization and mapping (SLAM) using 3D light detection and ranging (lidar; Hess et al., 2016) sensors due to their robustness compared with vision-based SLAM methods (Qin et al., 2018). Vision-based SLAM relies on a passive sensor, the camera, which is sensitive to illumination and viewpoint changes. Conversely, an active sensor like 3D lidar provides distance measurements of the surrounding environment that are invariant to illumination. This outstanding robustness and precision make 3D lidar an indispensable sensor for large-scale mapping and localization.
3D lidar-based SLAM (Koide et al., 2019; Lin & Zhang, 2020; Shan & Englot, 2018; Wen et al., 2020; Zhang & Singh, 2014; Zhao et al., 2019) has been extensively studied over the past decades. In general, a 3D lidar SLAM algorithm can be divided into two parts, the front-end and the back-end (Grisetti et al., 2010). The front-end focuses on point cloud registration (Pang et al., 2019; Saarinen et al., 2013). The back-end integrates multiple constraints, such as positioning from the front-end, positioning from the global navigation satellite system (GNSS; Dow et al., 2009; Shetty & Gao, 2019), and loop closure (Magnusson et al., 2009).
The accuracy of point cloud registration, which is the major part of the front-end, dominates the performance of 3D lidar SLAM. Therefore, numerous methods have been proposed to improve the point cloud registration process, such as the iterative closest point (ICP; Kuramachi et al., 2015), the normal distribution transform (NDT; Magnusson et al., 2007), and lidar odometry and mapping (LOAM; Zhang & Singh, 2017). The LOAM algorithm held top accuracy on the KITTI odometry benchmark (Geiger et al., 2012) until December 2020, owing to its feature extraction and effective data association structure. Unlike point-wise registration methods (e.g., ICP and NDT), the LOAM algorithm extracts meaningful edge and planar features from raw 3D point clouds, leading to a lower computational load and decreased sensitivity to local minima. Instead of simply finding the transformation between consecutive frames of point clouds (Low, 2004; Pang et al., 2019; Saarinen et al., 2013) in a scan-to-scan manner, the LOAM algorithm decouples the registration problem into two parts, coarse odometry and fine mapping.
First, the coarse odometry conducts scan-to-scan matching of the edge and planar features, respectively, to estimate a coarse relative transformation. Second, the innovative fine-mapping process maps the current frame of point clouds to the continuously generated global map (scan-to-map) based on an initial guess derived from the coarse odometry. The mapping process helps to mitigate the accumulated relative drift of the lidar odometry in the first step. In short, the LOAM algorithm obtains better performance than the listed ICP and NDT methods. However, due to the limited field of view (FOV; typically +10° to −30° vertically and 0° to 360° horizontally) of typical 3D lidar, the available features in the vertical direction are significantly fewer than those in the horizontal direction. As a result, 3D lidar SLAM inevitably drifts in the vertical direction. To identify the major challenges of 3D lidar SLAM in urban canyons, we extensively evaluated its performance in diverse urban canyons (Wen et al., 2018). The results showed that the vertical drift caused by the limited FOV of 3D lidar was one of the major problems to be solved for the application of 3D lidar SLAM in urban canyons.
According to our evaluations (Wen et al., 2018), in numerous urban scenarios, the ground surface is usually visible to ground vehicles equipped with 3D lidar. Furthermore, the 3D lidar scans provide sufficient ground points even in dense traffic scenes, which makes ground surface detection possible. Inspired by this remarkable feature, this paper proposes to detect and apply the Absolute Ground Plane Constraint (AGPC) to constrain the state of the vehicle and further mitigate the vertical drift. In other words, the AGPC is employed to improve the geometry of the constraints. To mitigate the overall drift over time, a loop closure detection method is applied to exploit the loop closure constraint. Finally, a factor graph-based 3D lidar SLAM framework (AGPC-SLAM) is proposed to fuse relative positioning from lidar odometry, the constraint from the AGPC, and loop closure.
The major contributions of this paper are listed as follows:
We propose to exploit the AGPC to mitigate the vertical drift of 3D lidar SLAM.
We propose a general 3D SLAM framework (AGPC-SLAM) to integrate the constraint of relative positioning from lidar odometry, the AGPC, and the constraint from the loop closure. The proposed AGPC-SLAM is a general framework that can easily fuse additional information from sensors such as the GNSS receiver and magnetometers.
The proposed method is validated using two challenging large-scale data sets collected in Hong Kong. We believe that the proposed method can have a positive impact on both the academic and industrial fields.
The remainder of the paper is structured as follows. The related work is reviewed in Section 2. Then, the overview of the proposed method is introduced in Section 3. The detail of the methodology is introduced in Section 4 before the experimental evaluation presented in Section 5. The conclusion and future work are given in Section 6.
3 OVERVIEW OF THE PROPOSED METHOD
3.1 Notations and Coordinates
Matrices are denoted as uppercase bold letters. Vectors are denoted as lowercase bold letters. Variable scalars are denoted as lowercase italic letters. Constant scalars are denoted as lowercase letters. To make the proposed framework clearer, the following notations are defined and used throughout the rest of this paper.
The local world frame ($W\{X_W, Y_W, Z_W\}$) is fixed to the world at the start point of the vehicle; it can also be fixed to the GNSS absolute frame when GNSS is available.
The local base frame ($L\{X_L, Y_L, Z_L\}$) originates at the starting point of the local lidar odometry.
The vehicle body frame ($B\{X_B, Y_B, Z_B\}$) is fixed at the center of the 3D lidar sensor.
The considered coordinate systems are shown in Figure 1. The transformation $\mathbf{T}_{B}^{L}$ denotes the transformation from the body frame to the local base frame, encoding the information of the lidar odometry during the experiment. The transformation $\mathbf{T}_{L}^{W}$ denotes the transformation between the local base frame and the local world frame, which is the drift compensation estimated by the additional constraints (e.g., the loop closure constraints and the AGPC). In other words, the local base frame coincides with the local world frame if there are no ground and loop closure constraints. Note that this definition of the coordinates follows the work of Mascaro et al. (2018), which is commonly used in the SLAM field with loop closure constraints.
FIGURE 1 Overview of the applied coordinate systems
Therefore, the state of the ego-vehicle at epoch $k$ is encoded by the transformation from the body frame to the local world frame, $\mathbf{T}_{B,k}^{W}$, as follows:

$$\mathbf{T}_{B,k}^{W} = \left(\mathbf{p}_{k}^{W}, \mathbf{q}_{k}^{W}\right) \tag{1}$$

where $\mathbf{p}_{k}^{W}$ represents the position and $\mathbf{q}_{k}^{W}$ represents the orientation in the local world frame in quaternion form.
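For illustration, the following minimal Python sketch (assuming SciPy's x, y, z, w quaternion convention; the function names are ours) converts the (position, quaternion) state of Equation (1) into a homogeneous transformation matrix:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pose_to_matrix(p_w: np.ndarray, q_w: np.ndarray) -> np.ndarray:
    """Build the 4x4 homogeneous transform T^W_{B,k} from
    position p_w = [x, y, z] and quaternion q_w = [qx, qy, qz, qw]."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_quat(q_w).as_matrix()  # orientation block
    T[:3, 3] = p_w                                   # translation block
    return T

# Example: vehicle 1.5 m above the local world origin, yawed 90 degrees
q = Rotation.from_euler("z", 90, degrees=True).as_quat()
T_wb = pose_to_matrix(np.array([0.0, 0.0, 1.5]), q)
```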
3.2 Overview of the Proposed AGPC-SLAM
The overview of the proposed AGPC-SLAM framework is shown in Figure 2. The input of the AGPC-SLAM is the 3D lidar point cloud (Sk, the raw point cloud at epoch k). The outputs of the proposed AGPC-SLAM are the vehicular trajectory and the consistent point cloud map.
FIGURE 2 Overview of the proposed AGPC-SLAM; the input is the 3D point cloud from the 3D lidar. The outputs include the point map and the pose estimation of the vehicular trajectory.
The AGPC-SLAM framework can be divided into two major parts, the front-end (the light green shaded boxes in Figure 2), and the back-end (the light blue shaded boxes in Figure 2). The light red shaded box denotes the AGPC proposed in this paper. Finally, the constraints from the front-end and the AGPC are integrated using factor graph optimization in the back-end. The details of the major parts are presented in the following sections.
4 METHODOLOGY
4.1 Front-End
The performance of the front-end is dominated by lidar odometry, whose objective is to make the best use of consecutive point clouds to estimate the relative motion. Lidar odometry is implemented based on the LOAM algorithm proposed in Zhang and Singh (2017), which includes three steps: feature extraction, coarse odometry, and fine mapping. Although lidar odometry itself is not a contribution of this paper, we briefly present the formulation for the sake of completeness.
4.1.1 Feature Extraction
Different from the point-wise registration methods (e.g., NDT and ICP), the feature definition and extraction proposed in Zhang and Singh (2014) explores representative primitives before the data association process. The input of the feature extraction is the raw point cloud $\mathbf{P}_k = \{\mathbf{P}_{k,1}, \mathbf{P}_{k,2}, \ldots, \mathbf{P}_{k,i}, \ldots, \mathbf{P}_{k,N}\}$, where $N$ denotes the number of points inside a frame of point cloud. A point is classified as a planar or an edge point depending on the roughness of its local region. The roughness of the local region is determined as follows (Zhang & Singh, 2017):

$$c_{k,i} = \frac{1}{|S| \cdot \left\|\mathbf{P}_{k,i}\right\|} \left\| \sum_{j \in S,\, j \neq i} \left(\mathbf{P}_{k,i} - \mathbf{P}_{k,j}\right) \right\| \tag{2}$$

where $c_{k,i}$ represents the roughness of the point. The set $S$ denotes the small local region near the given point $\mathbf{P}_{k,i}$; usually, five points are involved in the local region. $\mathbf{P}_{k,j}$ indicates a point belonging to the local region $S$. If the calculated roughness is larger than a pre-determined threshold $t_c$, the point is classified as an edge point; a point with roughness smaller than the threshold is classified as a planar point. The output of the feature extraction process is the feature set $\mathbf{F}_k = \{\mathbf{F}_k^{p}, \mathbf{F}_k^{e}\}$, where $\mathbf{F}_k^{p}$ and $\mathbf{F}_k^{e}$ represent the sets containing all the planar and edge points, respectively. Meanwhile, the authors (Zhang & Singh, 2017) also proposed careful feature point selection strategies to avoid unreliable points. For example, points on the boundary of occluded regions should not be selected, as they can be unobservable in the coming epochs. Likewise, points that lie on local planar surfaces roughly parallel to the lidar beams should not be selected. We strictly follow these strategies to select feature points from the raw 3D point clouds.
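The following is a minimal Python sketch of the roughness computation in Equation (2) and the resulting edge/planar split; the window size, threshold value, and function names are our illustrative assumptions rather than the tuned parameters of the paper:

```python
import numpy as np

def classify_features(scan: np.ndarray, t_c: float = 0.1, half_window: int = 5):
    """Split one ordered lidar scan line into edge and planar points.

    scan: (N, 3) array of points ordered along the scan line.
    Returns (edge_indices, planar_indices).
    """
    n = scan.shape[0]
    edges, planars = [], []
    for i in range(half_window, n - half_window):
        # Neighbors on both sides of point i form the local region S
        neighbors = np.r_[scan[i - half_window:i], scan[i + 1:i + 1 + half_window]]
        diff = np.sum(scan[i] - neighbors, axis=0)  # sum of (P_i - P_j)
        c = np.linalg.norm(diff) / (len(neighbors) * np.linalg.norm(scan[i]))
        (edges if c > t_c else planars).append(i)   # Equation (2) threshold test
    return np.array(edges), np.array(planars)
```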
4.1.2 Coarse Odometry
Based on the features extracted in the last section, coarse odometry is performed to efficiently estimate the relative motion between consecutive frames of point clouds. The relative motion is calculated by conducting point-to-edge and point-to-plane scan matching. In short, the objective is to find the corresponding features for points in $\mathbf{F}_k$ from the feature point set $\mathbf{F}_{k-1}$. Detailed steps can be found in Zhang and Singh (2017). We formulate the point cloud registration process as follows:

$$\mathbf{T}_{k-1,k}^{B} = \mathrm{PCR}\left(\mathbf{F}_{k-1}, \mathbf{F}_{k}\right) \tag{3}$$

where $\mathrm{PCR}(\cdot)$ denotes the point cloud registration function. The output of the point cloud registration process is the coarse relative motion $\mathbf{T}_{k-1,k}^{B}$, which is the motion between frames $k-1$ and $k$. Since $\mathbf{T}_{k-1,k}^{B}$ is estimated based on scan-to-scan matching, efficiency is guaranteed. Meanwhile, the coarse motion estimate is applied as the initial guess for the fine mapping in the next section.
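Since LOAM's point-to-edge and point-to-plane feature matching is not available as a single library call, the following sketch substitutes point-to-plane ICP from Open3D as a stand-in for the PCR function of Equation (3); the correspondence distance and normal estimation parameters are illustrative assumptions:

```python
import numpy as np
import open3d as o3d

def pcr(source_pts: np.ndarray, target_pts: np.ndarray,
        init: np.ndarray = np.eye(4)) -> np.ndarray:
    """Scan-to-scan registration returning a coarse relative motion."""
    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(source_pts))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(target_pts))
    # Point-to-plane residuals require normals on the target cloud
    target.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=1.0, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        source, target, 1.0, init,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation  # 4x4 relative transform T^B_{k-1,k}
```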
4.1.3 Fine Mapping
To refine the relative motion estimation, the fine-mapping process is applied to refine $\mathbf{T}_{k-1,k}^{B}$. The principle of the mapping process is that the extracted feature set $\mathbf{F}_k$ is matched against the incrementally built map (scan-to-map) to refine the motion estimate $\mathbf{T}_{k-1,k}^{B}$. Note that the map here is generated incrementally; this was one of the major contributions of the work by Zhang and Singh (2017) and provides impressive local performance. The output of the mapping process is the refined relative motion $\bar{\mathbf{T}}_{k-1,k}^{B}$. However, the mapping process runs at a low frequency due to its computational cost. Therefore, a local transform integration is applied to combine the high-frequency but coarse relative motion estimate $\mathbf{T}_{k-1,k}^{B}$ and the low-frequency but locally accurate motion estimate $\bar{\mathbf{T}}_{k-1,k}^{B}$. Details of the mapping process can be found in Zhang and Singh (2014).
4.2 AGPC Detection
This section presents the detection of the AGPC based on the raw 3D point clouds from the 3D lidar. For a given epoch $k$, the point cloud $\mathbf{P}_k$ usually contains abundant points scanned from the ground surface for ground vehicles equipped with 3D lidar. For a typical 3D lidar sensor installed on the roof of a vehicle, the sensor height relative to the ground surface is approximately known. Numerous works (Choi et al., 2014; Yang & Förstner, 2010) have addressed extracting the ground plane from raw 3D point clouds. Random sample consensus (RANSAC) is one of the most efficient algorithms for estimating the ground plane parameters from raw 3D point clouds. In this paper, we define the model parameters of a ground plane as: a) the Euclidean distance from the center of the 3D lidar to the detected ground surface; and b) the norm vector of the surface, as follows:
$$\mathbf{G}_k = \left(\mathbf{n}_k, d_k\right), \quad \mathbf{n}_k = \left(a_k, b_k, c_k\right)^{T} \tag{4}$$

where $\mathbf{n}_k$ represents the norm vector and $d_k$ denotes the Euclidean distance. Therefore, the ground plane detection problem using RANSAC can be defined as follows:

$$\mathbf{G}_k^{*} = \arg\min_{\mathbf{G}_k} \sum_{\mathbf{P}_{k,i} \in \mathbf{P}_k^{g}} f\left(\mathbf{G}_k, \mathbf{P}_{k,i}\right) \tag{5}$$

where $\mathbf{G}_k^{*}$ denotes the optimal parameters of the ground plane and $\mathbf{P}_k^{g}$ denotes the set of given points on the ground surface. The operator $f(\cdot)$ is employed to evaluate the distance between a given point $\mathbf{P}_{k,i}$ and the plane $\mathbf{G}_k$. Given a point $\mathbf{P}_{k,i} = (x_{k,i}, y_{k,i}, z_{k,i})$ and ground plane $\mathbf{G}_k$, the distance can be calculated as follows:

$$f\left(\mathbf{G}_k, \mathbf{P}_{k,i}\right) = \frac{\left|a_k x_{k,i} + b_k y_{k,i} + c_k z_{k,i} + d_k\right|}{\sqrt{a_k^{2} + b_k^{2} + c_k^{2}}} \tag{6}$$

where $f(\mathbf{G}_k, \mathbf{P}_{k,i})$ represents the Euclidean distance. The details of the plane detection are given in Algorithm 1. Note that $\mathbf{P}_k$ is filtered before being input to the algorithm by keeping only the points with z-axis values ranging from −2.5 to 2.5 meters. The parameters required in Algorithm 1 include the minimum number of points ($t_{np}$) required to detect the ground plane, the maximum number of iterations ($t_{iter}$) allowed in the RANSAC, a distance threshold ($t_{dis}$) for judging whether a point fits the ground plane, and a threshold on the number of inlier points ($t_{a}$) indicating that the estimated ground plane model fits the points well.
ALGORITHM 1 Ground Plane Detection Using RANSAC
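A minimal Python sketch of the RANSAC loop described above follows, using the parameter names from the text; the three-point sampling scheme and default values are our assumptions:

```python
import numpy as np

def detect_ground_plane(points, t_np=100, t_iter=100, t_dis=0.1, t_a=500,
                        rng=np.random.default_rng(0)):
    """RANSAC ground plane detection on a pre-filtered point cloud.

    points: (N, 3) array, already cropped to z in [-2.5, 2.5] m.
    Returns (n, d) with unit normal n and offset d (n.p + d = 0),
    or None if no acceptable plane is found.
    """
    if points.shape[0] < t_np:
        return None
    best, best_inliers = None, 0
    for _ in range(t_iter):
        # 1. Hypothesize a plane from three random points
        p1, p2, p3 = points[rng.choice(points.shape[0], 3, replace=False)]
        n = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(n)
        if norm < 1e-9:                 # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n.dot(p1)
        # 2. Count inliers within the distance threshold of Equation (6)
        dist = np.abs(points @ n + d)
        inliers = int((dist < t_dis).sum())
        # 3. Keep the best-supported hypothesis
        if inliers > best_inliers:
            best, best_inliers = (n, d), inliers
    return best if best_inliers >= t_a else None
```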
The ground detection results based on Algorithm 1 are illustrated in Figure 3 which shows the ground detection with data collected from typical intersections and roadways. Figure 3(a) shows ground detection results in an intersection case. The white points denote the non-ground points and the red points denote the ground points. For the roadway case in Figure 3(b) with the dynamic vehicles located on both sides, the ground points in the front and back of the ego-vehicle are also effectively detected and annotated with red points.
FIGURE 3 Illustration of ground detection using Algorithm 1 (the white points represent the raw point cloud from the 3D lidar and the red points denote the detected ground point cloud)
4.3 Back-End: Loop Closure Detection
Loop closure (Magnusson et al., 2009) is an effective approach for reducing the accumulated error over time. The principle of loop closure is to detect frames of the point cloud with a large percentage of overlap. The accumulated driving distance from the first epoch to the current one is denoted as $D_k$. Loop closure detection is performed only when $D_k$ exceeds a threshold $t_D$ that is tuned heuristically. Then, the loop closure is checked by matching the current keyframe against historical keyframes using the NDT (Magnusson et al., 2009). A loop closure is found when the fitness score (Magnusson et al., 2009) is smaller than a given threshold $t_F$. The output of the loop closure is the relative constraint between the current frame and the historical keyframe as follows:
$$\mathbf{T}_{i,j}^{B} = \left(\mathbf{t}_{i,j}, \mathbf{q}_{i,j}\right) \tag{7}$$

where $\mathbf{T}_{i,j}^{B}$ denotes the transformation between the two keyframes corresponding to the detected loop closure, frames $i$ and $j$; $\mathbf{t}_{i,j}$ denotes the translation and $\mathbf{q}_{i,j}$ represents the rotation in quaternion form.
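The following sketch illustrates this distance-gated loop closure check; since NDT is not available in Open3D, ICP registration quality serves as a stand-in for the NDT fitness score, and the gating radius and thresholds are illustrative assumptions:

```python
import numpy as np
import open3d as o3d

def try_loop_closure(cur_kf, history, d_k, t_d=30.0, t_f=0.5):
    """Distance-gated loop closure check between keyframes.

    cur_kf/history entries: dicts with 'cloud' (o3d.geometry.PointCloud)
    and 'pos' (np.ndarray xyz). ICP inlier RMSE stands in for the NDT
    fitness score of the paper (lower is better in both cases).
    Returns (matched keyframe index, 4x4 relative transform) or None.
    """
    if d_k < t_d:        # only search after sufficient driving distance
        return None
    for idx, kf in enumerate(history):
        if np.linalg.norm(kf['pos'] - cur_kf['pos']) > 15.0:  # coarse gate
            continue
        result = o3d.pipelines.registration.registration_icp(
            cur_kf['cloud'], kf['cloud'], 2.0, np.eye(4),
            o3d.pipelines.registration.TransformationEstimationPointToPoint())
        if result.inlier_rmse < t_f and result.fitness > 0.3:
            return idx, result.transformation
    return None
```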
4.4 Back-End: Factor Graph-Based Optimization
This section presents the factor graph construction based on the previously derived constraints and its optimization. To effectively integrate measurements from lidar odometry (Section 4.1), AGPC detection (Section 4.2), and loop closure (Section 4.3), we make use of the state-of-the-art factor graph (Indelman et al., 2012) to formulate sensor fusion as a non-linear optimization problem. Conventional filtering-based sensor fusion, such as the extended Kalman filter and its variants (Li et al., 2015), exploits the first-order Taylor expansion only once and is therefore prone to sub-optimal solutions. The major advantages of the factor graph are re-linearization and iteration, which enable the optimized states to approach the optimum iteratively. The graph structure of the proposed AGPC-SLAM is shown in Figure 4, which lists the three kinds of constraints and the ego-vehicle states. Note that the state of the ego-vehicle at epoch $k$ is represented by $\mathbf{T}_{B,k}^{W}$ based on Equation (1), which encodes the transformation between the body frame and the local world frame.
FIGURE 4 Graph structure of the proposed AGPC-SLAM
Following Indelman et al. (2012), the objective of factor graph optimization is to minimize the error function between the observations and the states. The error function for the lidar-odometry-derived relative measurements is expressed as:

$$\mathbf{e}_{k,k+1}^{O} = \left(\left(\mathbf{T}_{B,k}^{W}\right)^{-1} \mathbf{T}_{B,k+1}^{W}\right) \ominus \mathbf{T}_{k,k+1}^{B} \tag{8}$$

where $\mathbf{e}_{k,k+1}^{O}$ represents the error function for the relative motion between nodes $k$ and $k+1$ from lidar odometry (front-end), and $\mathbf{\Omega}_{k,k+1}^{O}$ denotes the information matrix of the error function, which is tuned experimentally. $\mathbf{T}_{B,k}^{W}$ and $\mathbf{T}_{B,k+1}^{W}$ denote the states at epochs $k$ and $k+1$, and $\mathbf{T}_{k,k+1}^{B}$ represents the motion derived from lidar odometry in Section 4.1. The operator $\ominus$ is the minus operation on homogeneous matrices in SE(3).
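As an illustration of the $\ominus$ operation under a common SE(3) convention (our parameterization, not necessarily the paper's exact one), the residual of Equation (8) can be sketched as follows:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def se3_minus(T_pred: np.ndarray, T_meas: np.ndarray) -> np.ndarray:
    """6-vector residual T_pred (-) T_meas: rotation part as a rotation
    vector (axis-angle), translation part as a plain difference."""
    dT = np.linalg.inv(T_meas) @ T_pred           # discrepancy transform
    rot_err = Rotation.from_matrix(dT[:3, :3]).as_rotvec()
    return np.concatenate([rot_err, dT[:3, 3]])

def odometry_residual(T_w_k, T_w_k1, T_odom):
    """Equation (8): predicted relative motion vs. lidar odometry."""
    predicted = np.linalg.inv(T_w_k) @ T_w_k1
    return se3_minus(predicted, T_odom)
```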
Similarly, the error function for the loop closure between nodes $i$ and $j$ is as follows:

$$\mathbf{e}_{i,j}^{LC} = \left(\left(\mathbf{T}_{B,i}^{W}\right)^{-1} \mathbf{T}_{B,j}^{W}\right) \ominus \mathbf{T}_{i,j}^{B} \tag{9}$$

where $\mathbf{\Omega}_{i,j}^{LC}$ denotes the information matrix for the loop closure constraint $\mathbf{T}_{i,j}^{B}$ between nodes $i$ and $j$.
Regarding the constraint from ground plane detection, the measurement contains the distance ($d_k$) from the center of the lidar to the detected ground surface and the norm vector of the ground surface. The distance $d_k$ does not suffer from drift over time, as it is measured relative to the absolute ground plane. Note that this holds under the assumption that the vehicle is driving on almost planar ground.
Given the pose estimation at a given epoch $k$, $\mathbf{G}_k$ can be transformed into the local world frame as $\mathbf{G}_k^{W}$:

$$\mathbf{G}_k^{W} = \left(\mathbf{n}_k^{W}, d_k^{W}\right) \tag{10}$$

where the following are satisfied:

$$\mathbf{n}_k^{W} = \mathbf{R}_{B,k}^{W} \mathbf{n}_k \tag{11}$$

$$d_k^{W} = d_k - \left(\mathbf{n}_k^{W}\right)^{T} \mathbf{p}_k^{W} \tag{12}$$

with $\mathbf{R}_{B,k}^{W}$ being the rotation matrix corresponding to $\mathbf{q}_k^{W}$.
After transforming $\mathbf{G}_k$ into the local world frame, this paper applies the minimum parameterization proposed in Ma et al. (2016), $\tau(\mathbf{G}_k^{W}) = [\theta, \phi, d]^{T}$, where $\theta$, $\phi$, and $d$ are the azimuth angle, elevation angle, and distance with respect to $\mathbf{G}_k^{W}$, respectively. $\tau(\mathbf{G}_k^{W})$ can be derived as follows (Koide et al., 2019; Ma et al., 2016):

$$\tau\left(\mathbf{G}_k^{W}\right) = \begin{bmatrix} \arctan\left(n_{k,y}^{W} / n_{k,x}^{W}\right) \\ \arcsin\left(n_{k,z}^{W}\right) \\ d_k^{W} \end{bmatrix} \tag{13}$$
Therefore, the error function for the ground plane constraint can be expressed as follows:

$$\mathbf{e}_{k}^{G} = \tau\left(\mathbf{G}_k^{W}\right) - \tau\left(\mathbf{G}^{W,0}\right) \tag{14}$$

where $\mathbf{\Omega}_{k}^{G}$ is the information matrix of $\mathbf{e}_{k}^{G}$, which is experimentally determined, and $\tau(\mathbf{G}^{W,0})$ is the expected prior concerning the ground constraint.
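A minimal Python sketch of Equations (10) through (14) follows, assuming a horizontal prior plane at a known lidar-to-ground distance $d_0$ (our symbol for the expected prior):

```python
import numpy as np

def ground_plane_residual(T_w_b: np.ndarray, n_b: np.ndarray, d_b: float,
                          d0: float = 1.8) -> np.ndarray:
    """Transform the detected plane (n_b, d_b) into the local world frame,
    apply the minimal (azimuth, elevation, distance) parameterization, and
    compare against a horizontal prior plane."""
    R, p = T_w_b[:3, :3], T_w_b[:3, 3]
    n_w = R @ n_b                      # Equation (11)
    d_w = d_b - n_w.dot(p)             # Equation (12)

    def tau(n, d):                     # Equation (13)
        # Note: the azimuth term is ill-conditioned for an exactly
        # vertical normal; real implementations guard against this.
        return np.array([np.arctan2(n[1], n[0]),
                         np.arcsin(np.clip(n[2], -1.0, 1.0)),
                         d])

    prior = tau(np.array([0.0, 0.0, 1.0]), d0)  # flat-ground prior at d0
    return tau(n_w, d_w) - prior                # Equation (14)
```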
In this way, we formulate three kinds of error functions for the constraints. The optimal state set can then be solved as follows:

$$\mathcal{X}^{*} = \arg\min_{\mathcal{X}} \sum_{k=0,\ldots,K} \left( \left\|\mathbf{e}_{k,k+1}^{O}\right\|_{\mathbf{\Omega}_{k,k+1}^{O}}^{2} + \left\|\mathbf{e}_{i,j}^{LC}\right\|_{\mathbf{\Omega}_{i,j}^{LC}}^{2} + \left\|\mathbf{e}_{k}^{G}\right\|_{\mathbf{\Omega}_{k}^{G}}^{2} \right) \tag{15}$$

where $\mathcal{X}$ denotes the state set to be estimated and $K$ denotes the number of epochs involved in the optimization. The summation runs over all the error functions, and $\mathbf{\Omega}_{k,k+1}^{O}$, $\mathbf{\Omega}_{i,j}^{LC}$, and $\mathbf{\Omega}_{k}^{G}$ denote the information matrices of the three listed types of constraints.
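Equation (15) can be solved by any non-linear least-squares pose graph library. As one possible realization (our choice, not the authors' implementation), the following sketch assembles the odometry and loop closure factors of Figure 4 in GTSAM; the ground plane term of Equation (14) would enter as an additional unary factor on each pose (e.g., via gtsam.CustomFactor), and the noise values shown are illustrative:

```python
import numpy as np
import gtsam

def build_and_solve(odom_poses, loops):
    """Pose graph with lidar odometry and loop closure constraints.

    odom_poses: list of 4x4 relative transforms from lidar odometry.
    loops: list of (i, j, 4x4 transform) from loop closure detection.
    """
    graph = gtsam.NonlinearFactorGraph()
    initial = gtsam.Values()
    # GTSAM Pose3 sigma order: rotation (rad) first, then translation (m)
    odom_noise = gtsam.noiseModel.Diagonal.Sigmas(
        np.array([0.02, 0.02, 0.02, 0.1, 0.1, 0.1]))
    loop_noise = gtsam.noiseModel.Diagonal.Sigmas(
        np.array([0.01, 0.01, 0.01, 0.05, 0.05, 0.05]))

    # Anchor the first pose at the local world origin
    graph.add(gtsam.PriorFactorPose3(
        0, gtsam.Pose3(),
        gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-6))))
    initial.insert(0, gtsam.Pose3())

    pose = gtsam.Pose3()
    for k, T in enumerate(odom_poses):      # Equation (8) factors
        delta = gtsam.Pose3(T)
        graph.add(gtsam.BetweenFactorPose3(k, k + 1, delta, odom_noise))
        pose = pose.compose(delta)
        initial.insert(k + 1, pose)         # dead-reckoned initial guess

    for i, j, T in loops:                   # Equation (9) factors
        graph.add(gtsam.BetweenFactorPose3(i, j, gtsam.Pose3(T), loop_noise))

    params = gtsam.LevenbergMarquardtParams()
    return gtsam.LevenbergMarquardtOptimizer(graph, initial, params).optimize()
```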
4.5 Global Mapping and Global Transformation Integration
The point cloud map relative to the local world frame can be obtained by registering all the frames of point clouds based on the state set optimized in Section 4.4.
To guarantee real-time performance, factor graph optimization was conducted at a frequency of 1 Hz. Each time the optimization finished, the transformation between the local world frame and the local base frame could be derived as follows:

$$\mathbf{T}_{L,k}^{W} = \mathbf{T}_{B,k}^{W} \left(\mathbf{T}_{B,k}^{L}\right)^{-1} \tag{16}$$

Note that $\mathbf{T}_{B,k}^{L}$ was obtained by tracking the relative motion from lidar odometry at a frequency of 10 Hz (the frequency of the raw 3D point clouds). Once $\mathbf{T}_{B,l}^{L}$ is obtained at epoch $l$ with $l > k$, $\mathbf{T}_{B,l}^{W}$ is derived as follows:

$$\mathbf{T}_{B,l}^{W} = \mathbf{T}_{L,k}^{W} \mathbf{T}_{B,l}^{L} \tag{17}$$

Therefore, $\mathbf{T}_{B,l}^{W}$ can be obtained at a frequency of 10 Hz, which is significant for applications with real-time performance requirements.
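A compact sketch of this 1 Hz / 10 Hz transform integration (Equations (16) and (17)) follows; the class and variable names are ours:

```python
import numpy as np

class TransformIntegrator:
    """Fuses low-rate optimized poses with high-rate lidar odometry."""

    def __init__(self):
        self.T_w_l = np.eye(4)   # world <- local base (drift compensation)

    def on_optimized_pose(self, T_w_b: np.ndarray, T_l_b: np.ndarray):
        """1 Hz: update the drift compensation from the latest optimized
        pose and the odometry pose at the same epoch (Equation (16))."""
        self.T_w_l = T_w_b @ np.linalg.inv(T_l_b)

    def on_odometry(self, T_l_b: np.ndarray) -> np.ndarray:
        """10 Hz: project the newest odometry pose into the local world
        frame using the cached compensation (Equation (17))."""
        return self.T_w_l @ T_l_b
```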
5 EXPERIMENT RESULTS AND DISCUSSIONS
To validate the performance of the proposed AGPC-SLAM, we conducted two real experiments in Hong Kong. The first location was Nathan Road, a typical urban scene in Hong Kong. The road was almost flat throughout the test, and the driving distance was about 4.2 kilometers. We were fully aware that the proposed method relies on the assumption that the ground is almost flat during operation. Therefore, we conducted the other experimental validation in another location (where the driving distance was about 2.1 kilometers with a partial ramp road) to further evaluate how the proposed method performs.
During the experiments, a 3D lidar (Velodyne HDL-32E) was used to collect 3D point clouds. In addition, the NovAtel SPAN-CPT, a GNSS (GPS, GLONASS, and BeiDou) RTK/INS (fiber-optic gyroscope, FOG) integrated navigation system, was used to provide the positioning ground truth. The gyro bias in-run stability of the FOG was 1 degree per hour and its angular random walk was 0.067 degrees per √hour. The baseline between the rover and the GNSS base station was within 7 km.
All the data were collected and synchronized using the Robot Operating System (ROS; Quigley et al., 2009). All the data were post-processed using a desktop computer (Intel Core i7-9700K CPU, 3.60 GHz). Please note that the performance of the proposed method was evaluated by aligning the poses from SLAM and the NovAtel SPAN-CPT to the east, north, and up (ENU) coordinates. The extrinsic transformation matrix between the ENU frame and the local world frame was provided by the NovAtel SPAN-CPT. The parameters used in the proposed AGPC-SLAM are given in Table 1.
TABLE 1 Parameters Used During Experimental Validation
5.1 Experimental Validation in Location 1
The experimental setup is shown in Figure 5. During the evaluation, we compared the proposed AGPC-SLAM framework with the state-of-the-art LeGO-LOAM (Shan & Englot, 2018).
FIGURE 5 (a) Data collection vehicle with all the sensors installed in a compact sensor kit; (b) tested scenarios; and (c) trajectory with a driving distance of 4.2 km
5.1.1 Evaluation Metrics
As Figure 5(b) shows, a fixed solution was hard to obtain using the NovAtel SPAN-CPT throughout Location 1. This was caused by low satellite visibility due to the dense, tall buildings. However, we found that the positioning status of the NovAtel SPAN-CPT was healthy at several road intersections (see A, B, C, D, E, and F in Figure 5(c)) where the satellite geometry was better. We carefully checked the positioning results of the NovAtel SPAN-CPT at the six selected intersection points; at the very least, a float solution was obtained. Therefore, we propose to evaluate the performance of the proposed method based on the following metrics.
Ground control point (GCP) error: Evaluating the accuracy at six selected GCPs (A–F in Figure 5); both absolute translation and rotation errors were evaluated.
Accumulated error: Evaluating the accumulated error after driving a loop with/without the loop closure constraints; without loss of generality, the accumulated error metric was also adopted in the KITTI data set (Geiger et al., 2012) for performance evaluation of lidar odometry.
5.1.2 Performance Analysis
The mean errors of the rotation and translation at the six selected GCPs are shown in Tables 2 and 3, respectively. Regarding the mean rotation error (Table 2), LeGO-LOAM introduced a large mean error in pitch angle (6.93°). The errors in yaw and roll were 2.57° and 2.16°, respectively. With the help of the proposed AGPC, the error in pitch angle decreased to 1.35°, which shows that the proposed method can effectively mitigate the drift in pitch angle. The performance in the yaw and roll angles also improved slightly.
TABLE 2 Performance of Rotation Estimation at Location 1
TABLE 3 Performance of Translation Estimation at Location 1
Regarding the mean translation error, the errors of LeGO-LOAM in the east and north directions were 3.89 and 11.58 meters, respectively. Meanwhile, the error in the altitude (up) was about 43.83 meters. The total mean error in the east, north, and up directions was 45.49 meters. The accumulated error (the error at the last epoch) reached 34.80 meters. The last column reports the mean error as a percentage of the total driving distance. With the help of the proposed AGPC-SLAM, the mean error decreased to 3.02 meters and the accumulated error was only 1.89 meters. The significant improvement in the vertical direction shows that the proposed AGPC can effectively constrain the drift. Meanwhile, the detailed performance at the six GCPs can be found in the Appendix of this paper (see Figure 12 and Figure 13).
After applying the loop closure constraint, LeGO-LOAM could not successfully detect the loop closure due to its large drift, which can be seen in the top panel of Figure 6. Therefore, the loop closure result for LeGO-LOAM is not provided in Tables 2 and 3. In contrast, the rotation estimation of the proposed method improved with the help of loop closure, with a mean rotation error of 2.33°, and the resulting map can be seen in the bottom panel of Figure 6. With the help of the loop closure, the mean translation error decreased to 2.73 meters and the accumulated error was only 0.52 meters. The percentage error was only 0.065%. Although loop closure detection is not a major contribution of this paper, the results show that the proposed AGPC can further enhance loop closure detection.
FIGURE 6 Illustration of the maps generated by the two methods; the color of the map is annotated by the value of the z-axis (height) of each point
Figure 7 shows the details of the generated point map with the color annotated by reflectivity. The lane lines and building boundaries are clear, as can be seen in Figure 7(a) and Figure 7(b). A detailed video of the proposed AGPC-SLAM in Location 1 can be found at the following link: https://www.youtube.com/watch?v=3nS895StJUo&feature=youtu.be.
FIGURE 7 Details of the generated map with reflectivity using the proposed AGPC-SLAM in Location 1
5.2 Experimental Validation in Location 2
5.2.1 Evaluation Metrics
To further evaluate how the proposed method performs in a scenario with a partial ramp road, we performed the other experiment in Tsim Sha Tsui, Hong Kong. The location is shown in Figure 8, where the red dots denote the evaluated trajectory. The ramp road starts at an altitude of 0 meters (the height is relative to the start point), rises to a maximum altitude of 2 meters, and returns to 0 meters by the end of the ramp. More importantly, fixed solutions were frequently available during this test. Therefore, the pose from the NovAtel SPAN-CPT was directly used as the ground truth for Location 2.
FIGURE 8 (a) Environmental condition at the start point (blue shaded circle); (b) the scene with the partial ramp road (red shaded area)
Lidar SLAM only provides relative positioning with respect to the starting point; therefore, we applied the popular odometry evaluation toolkit EVO (Grupp, 2017) to evaluate the relative pose error (RPE) for translation and rotation. The evaluation metrics are as follows:
RMSE: Root-mean-square error of the relative translation and rotation
MEAN: Mean error of the relative translation and rotation
To further show the effectiveness of the proposed method in mitigating the vertical and overall drift, we added three additional evaluation metrics as follows:
Altitude: the accumulated altitude drift by the end of the trajectory
AE: the accumulated error of translation (meters) and rotation (degrees) by the end of the trajectory
%: calculated by the [accumulated drift]/[total driving distance]; this was adopted to evaluate the overall drift of the SLAM algorithm.
5.2.2 Performance Analysis
Table 4 shows the performance of the translation and rotation estimation using LeGO-LOAM and the proposed AGPC-SLAM, respectively. Note that loop closure is not applied in the results of Table 4, in order to evaluate the contribution of the standalone AGPC. Regarding the translation, both the MEAN and RMSE were reduced with the help of the AGPC using the proposed method. Although LeGO-LOAM estimates the motion in the z-axis and pitch angle using ground points, its final altitude drift still reached 7.36 meters by the end of the test. Fortunately, the drift in altitude decreased to only 0.21 meters with the help of the AGPC, which shows the effectiveness of the proposed AGPC. Meanwhile, the overall drift was also reduced using the proposed AGPC-SLAM (see the AE and % columns in Table 4).
TABLE 4 Performance of the Translation and Rotation Estimation in Location 2
The continuous 3D trajectories of LeGO-LOAM, AGPC-SLAM, and the ground truth are shown in Figure 9. Specifically, the altitude during the test is shown in Figure 10, where the black curve denotes the altitude provided by the NovAtel SPAN-CPT. We can see that the altitude estimation using LeGO-LOAM deviated significantly from the ground truth during epochs 150~240, which correspond to the red shaded area in Figure 9. During this period, one side of the road was filled with building surfaces, making it a feature-scarce stretch. The other side had dense foliage with numerous tree leaves that increased the difficulty of finding correct LOAM correspondences. As a result, LeGO-LOAM drifted significantly in the altitude direction. With the help of the AGPC, the altitude drift was mitigated accordingly.
FIGURE 9 Trajectories of LeGO-LOAM (red curve), AGPC-SLAM (blue curve), and the ground truth (black curve)
FIGURE 10 Altitude of LeGO-LOAM (red curve), AGPC-SLAM (blue curve), and the ground truth (black curve) in the testing area with the ramp road
The RPEs of the translation and rotation estimations are provided in Figure 11. Improvements can be seen in the figure for both translation and rotation. The red and blue dashed curves denote the translation errors of LeGO-LOAM and AGPC-SLAM, respectively. The black and cyan dashed curves denote the rotation errors of LeGO-LOAM and AGPC-SLAM, respectively.
FIGURE 11 Relative translation and rotation errors for LeGO-LOAM and the proposed AGPC-SLAM, respectively
As shown in Figure 8, Location 2 involved a ramp road with an altitude varying from 0 to 2 meters, which corresponds to the area annotated by the red rectangle in Figure 9. There, the estimated altitude using AGPC-SLAM deviates from the ground truth altitude (black curve). This was caused by the violation of the flat-ground assumption required by the proposed method. In the future, we will study ramp road surface identification and remove the AGPC when a slope surface is detected. A detailed video of the test in Location 2 can be found at the following link: https://www.youtube.com/watch?v=mgLxxlhlscY&feature=youtu.be.
The performance of the translation and rotation angle estimation of AGPC-SLAM with loop closure is presented in Table 5. Although the loop closure was enabled in LeGO-LOAM, the loop was not successfully detected due to the large drift in the z-axis of LeGO-LOAM. However, the loop closure was detected using the proposed AGPC-SLAM as the vertical drift was improved. We believe that this is another contribution of the AGPC which improves the detection of the loop closure.
TABLE 5 Performance of Translation and Rotation Estimation with Loop Closure
6 CONCLUSION AND FUTURE WORK
To mitigate the drift of lidar SLAM in the vertical direction caused by the limited field of view of the lidar sensor, this paper proposed AGPC-SLAM, which achieves improved accuracy compared with the state-of-the-art LeGO-LOAM. This paper innovatively employs absolute ground plane detection to mitigate the drift in z-axis-related states. Moreover, the better-constrained positioning in the z-axis and pitch angle also improves the positioning in the x-axis and y-axis. We tested the proposed method in two typical scenarios in Hong Kong. The results show that the proposed method can effectively mitigate the drift in the z-axis in both evaluated scenarios.
One of the limitations of our work is that the proposed method is effective only when a flat ground plane is available. In some extreme cases, the ground can be fully occluded by surrounding dynamic vehicles which can lead to misidentification of the ground surface. The ground surface can also be a slope that cannot be simply modeled using a plane function. In the future, we will study the scene with slope and derive the corresponding constraints to alleviate the drift of 3D lidar SLAM in urban canyons.
HOW TO CITE THIS ARTICLE
Hsu, L.-T., & Wen, W. (2022). AGPC-SLAM: Absolute ground plane constrained 3D lidar SLAM. NAVIGATION, 69(3). https://doi.org/10.33012/navi.527
APPENDIX: PERFORMANCE OF AGPC-SLAM AT 6 GCPS
FIGURE 12 Errors of the roll, pitch, and yaw angles for LeGO-LOAM and the proposed AGPC-SLAM (the x-axis denotes the ID of the GCPs from 1 to 6 and the y-axis denotes the errors)
FIGURE 13 Errors of the up (altitude) and ENU directions for LeGO-LOAM and the proposed AGPC-SLAM, respectively (the x-axis denotes the ID of the GCPs from 1 to 6 and the y-axis denotes the errors)