RT Journal Article
SR Electronic
T1 Improving GNSS Positioning Correction Using Deep Reinforcement Learning with an Adaptive Reward Augmentation Method
JF NAVIGATION: Journal of the Institute of Navigation
JO NAVIGATION
FD Institute of Navigation
SP navi.667
DO 10.33012/navi.667
VO 71
IS 4
A1 Tang, Jianhao
A1 Li, Zhenni
A1 Hou, Kexian
A1 Li, Peili
A1 Zhao, Haoli
A1 Wang, Qianming
A1 Liu, Ming
A1 Xie, Shengli
YR 2024
UL https://navi.ion.org/content/71/4/navi.667.abstract
AB High-precision global navigation satellite system (GNSS) positioning for automatic driving in urban environments remains an unsolved problem because of the impact of multipath interference and non-line-of-sight reception. Recently, methods based on data-driven deep reinforcement learning (DRL), which are adaptable to nonstationary urban environments, have been used to learn positioning-correction policies without strict assumptions about model parameters. However, the performance of DRL relies heavily on the amount of training data, and high-quality, available GNSS data collected in urban environments are insufficient because of issues such as signal attenuation and large stochastic noise, resulting in poor performance and low training efficiency for DRL. In this paper, we propose a DRL-based positioning correction method with an adaptive reward augmentation method (ARAM) to improve the GNSS positioning accuracy in nonstationary urban environments. To address the problem of insufficient training data in the target domain environment, we leverage sufficient data collected in source domain environments to compensate for insufficient training data, where the source domain environments can be in different locations than the target environment. We then employ ARAM to achieve domain adaptation that adaptively modifies data matching between the source domain and target domain by a simple modification to the reward function, thus improving the performance and training efficiency of DRL.
Hence, our novel DRL model can achieve an adaptive dynamic-positioning correction policy for nonstationary urban environments. Moreover, the proposed positioning-correction algorithm can be flexibly combined with different model-based positioning approaches. The proposed method was evaluated using the Google smartphone decimeter challenge data set and the Guangzhou GNSS measurement data set, with results demonstrating that our method can obtain an improvement of approximately 10% in positioning performance over existing model-based methods and 8% over learning-based approaches.