## Abstract

A server satellite that is parked in a stable distant retrograde orbit (DRO) in the vicinity of the Moon can provide navigation service for satellites in the Earth– Moon system. In this study, X-ray pulsar navigation (XPNAV) is investigated to determine the server’s orbital position and velocity, and maintain its onboard timing autonomously to ensure the autonomy of the server. First, the feasibility and potential advantages of XPNAV in DRO are analyzed qualitatively based on the unique properties of DRO: long-term orbital stability and short-term, nearly uniform-velocity rectilinear motion. Second, high-fidelity X-ray photon timestamps are simulated to serve as the original measurements for XPNAV. A classical extended Kalman filter is used to estimate the orbital states (i.e., position and velocity) and the clock timing of the satellite in DRO, wherein the pulse phase and Doppler frequency are estimated as intermediate measurements. Finally, Monte Carlo simulations are conducted to assess the variation in the XPNAV performance with the DRO orbital period, clock noise, and detector effective area. The simulation results show that XPNAV in DRO can achieve considerably improved position accuracy than that in low Earth orbit using the same hardware while maintaining the long-term stability of the satellite onboard clock, thus demonstrating the capability of XPNAV to be used in an operational spaceflight mission around or beyond the Moon.

## 1 INTRODUCTION

The satellite navigation system based on distant retrograde orbit (DRO, a type of three-body-dynamics orbit in the vicinity of the Moon) is proposed by Wang et al. (2019). This navigation system can provide navigation services to satellites in cislunar space (e.g., Earth, lunar, and Earth–Moon transfer orbits), as shown in Figure 1. It consists of server and arbitrary user satellites.

The server satellite in a stable DRO (Broucke, 1968) transmits navigation signals through a satellite-to-satellite tracking link. Conversely, the user satellite resolves the ranging measurements and server ephemeris. Accordingly, it autonomously determines its orbital position and velocity, as well as the relative clock difference to the server, utilizing linked, autonomous, interplanetary satellite orbit navigation (LiAISON; Hill, 2007).

In this configuration, the DRO-based satellite navigation system is expected to be autonomous such that its dependence on ground supports can be substantially reduced. The DRO-based navigation system might provide a broader coverage scope and higher signal intensities in cislunar space than those of the current Global Navigation Satellite System (Winternitz et al., 2019); thus, it does not increase the complexity of the user’s receiver. In addition, the server satellite does not require orbit maintenance maneuvers because DRO is characterized by long-term orbital stability (Bezrouk & Parker, 2014; Turner, 2016).

X-ray pulsar navigation (XPNAV) for satellites dates back to 1974 when Downs (1974) proposed the concept of navigation by observing pulsars in very high frequency (VHF) radio bands. Later, Chester and Butman (1981) investigated satellite navigation by observing pulsars in X-ray bands, for which a considerably smaller detector aperture could be used. Since then, research on the principles (Sheikh et al., 2006), systems (Shuai et al., 2009), and algorithms (Emadzadeh & Speyer, 2011) for XPNAV has continuously advanced. Then, a series of applications of XPNAV in space missions were proposed (Buist et al., 2011; Graven et al., 2008; Sala et al., 2004; Sheikh et al., 2011).

Recently, XPNAV experiments have been conducted in low Earth orbit (LEO) space missions, such as POLAR (Zheng et al., 2017), XPNAV-I (Zhang et al., 2017), SEX-TANT onboard the International Space Station (Mitchell et al., 2018), and Insight-HXMT (Zheng et al., 2019). In the future, XPNAV will be utilized beyond LEO (Stupl et al., 2018), where XPNAV might play a more important role in autonomous navigation. To date, the best orbit determination achieves a 5-km-level accuracy in SEXTANT using a state-of-the-art X-ray detector (Mitchell et al., 2018).

XPNAV timekeeping utilizes the stability of pulsar rotation to maintain the onboard clock accuracy. The long-term stability of pulsar rotation is comparable to the currently used timescale International Atomic Time (Rawley et al., 1987) and supports the implementation of a new timescale (Chen, 2018; Piriz et al., 2019). Such a timescale is capable of utilizing the short- and long-term stabilities of a hydrogen maser and the observed pulsar rotation, respectively. In particular, XPNAV timekeeping has been demonstrated to be feasible on the basis of the ground experiment SEXTANT using a relatively low-cost maser clock (Winternitz et al., 2018).

The existing research does not demonstrate the capability of XPNAV to achieve comparable performance to the Deep Space Network (DSN) in a practical deep space mission. By applying the same simulation settings used by Mitchell et al. (2018), Chen et al. (2017) showed that XPNAV can achieve an accuracy of about 7.5 km in an Earth-to-Jupiter interplanetary mission. This is worse than the accuracy of ground-based radio systems (up to 1.5 km).

This study has two motivations: 1) to investigate the feasibility and advantages of XPNAV in DRO, and 2) to evaluate the positioning and timing accuracy obtained by XPNAV in DRO. To explore these two unknowns, first, we analyze the unique properties of DRO that might improve the efficiency of XPNAV. Second, high-fidelity X-ray photon timestamps are simulated, and an extended Kalman filtering (EKF) estimator is proposed. Finally, we evaluate the XPNAV performance using the Monte Carlo method.

For the first time, it is revealed that XPNAV, based on the observation of highly stable pulsars, is an effective approach for autonomous orbit determination and timekeeping in an Earth–Moon system DRO spaceflight mission. This also implies that the application scope of XPNAV can be extended to other stable DROs in the Sun– Earth system and Jupiter–Europa system (Lam & Whiffen, 2005) to support improved mission autonomy.

The remainder of this paper is organized as follows: Section 2 introduces the technical background of XPNAV measurements, properties of DRO, and the predicted XPNAV performance in DRO; Section 3 presents the models, parameters, and procedures involved in generating and validating the original measurements—the recorded X-ray photon timestamps; Section 4 discusses the XPNAV algorithm and analyzes the noise of the intermediate measurements—pulse phase and Doppler frequency; Section 5 presents the XPNAV performance assessed via the Monte Carlo method considering several factors; and finally, conclusions are presented in Section 6.

## 2 TECHNICAL BACKGROUND

This section describes the low SNR feature of the original XPNAV measurements and potential approaches to improve it that also eventually improves the overall XPNAV performance. Thereafter, the long-term and short-term stability properties of DRO are introduced in detail, and the advantage of conducting XPNAV in DRO is discussed.

### 2.1 Low SNR feature of photon timestamps

X-ray photons emitted by pulsars (often also with light, radio waves, and other high-energy radiation) may travel through the solar system and be received by sensitive detectors onboard artificial satellites. The satellite clock stamps each receive photons, and these timestamps are used in XPNAV as original measurements. In reality, the original measurements suffer from a low signal-to-noise ratio (SNR) feature.

Most of the pulsars suitable for navigation feature a low flux radiation. Therefore, the radiated X-ray photons (e.g., the source flux) become extremely weak after traveling several light-years to reach the onboard detector. However, noisy photons from other sources (e.g., background flux) might be two or more times stronger than those from pulsars, thereby obscuring the source flux.

In a typical XPNAV procedure, to improve the SNR, the satellite needs to continuously observe a pulsar with a predefined observation period interval to collect sufficient X-ray photons (Sheikh & Pines, 2006). Theoretically, the SNR is proportional to the root square of the product of the detector effective area 𝐴 and the observation period interval length 𝑇 (referred to as the area-time product), as expressed by Equation (1):

1

As can be observed in Equation (1), the SNR can be increased by either enlarging 𝐴 or lengthening 𝑇. However, 𝑇 often has an upper bound beyond which the SNR drops because of the enlarged orbital errors caused by unpredictable orbital evolution.

For example, 𝑇 is limited to approximately 1,800 s (approximately one-third of a full cycle of LEO) in SEX-TANT for XPNAV in LEO (Anderson & Pines, 2014). The effective area of the detector is designed to be 1,800 cm^{2} to obtain a moderate navigation accuracy. However, this requires a heavy payload: the detector has a cubic shape with a side length of 1 m, weight of 263 kg, and working power of 337 W when it is in nominal operation (Ray et al., 2017). This would be an enormous burden for a regular satellite.

With the future advancement of X-ray telescope technology, the size and weight of the detector payload can be reduced. Focusing-type optic telescopes (Hong & Romaine, 2016) are more beneficial to XPNAV than collimator-type telescopes mainly because of the narrower field-of-view (FoV) of the former.

The narrow FoV filters out bright cosmic X-ray background sources and causes photon timestamps to record only photons from the target pulsar and its surrounding background. However, a narrow FoV typically implies a long focal length and large-sized detector, and it is impractical to observe multiple pulsars simultaneously. Therefore, a typical procedure of observing a pulsar is by pointing to each pulsar direction sequentially according to the predefined plan.

### 2.2 DRO and its unique properties for XPNAV

DRO in the Earth–Moon system is a periodic orbit circum-navigating the Moon in the circular, restricted three-body problem (CRTBP). It appears to be an ellipse-like retrograde orbit around the Moon, when viewed in the Earth– Moon Rotating Frame (EMRF), and non-Keplerian prograde orbit around the Earth, when viewed in the Earth-centered inertial frame (Figure 2).

The EMRF is defined as follows: the origin is at the barycenter of the Earth–Moon system, the +X axis points in the direction from the Earth to the Moon, the +Z-axis is perpendicular to the Moon orbital plane and points in the north celestial direction, and the Y-axis is determined by the right-hand principle. The Z-axis of the EMRF coincides with that of the inertial frame. *p*:*q* DRO is a DRO that completes *p* (integer) cycles around the Moon; every *q* (integer) cycles, the Moon orbits the Earth–Moon barycenter. In this study, 2:1 DRO, 3:1 DRO, and 4:1 DRO were selected as the parking orbits for the server satellite.

If a satellite continuously observes a pulsar in a DRO, the observation period interval length 𝑇 in Equation (1) can be much longer than that in LEO. Thus, the SNR of XPNAV in DRO can be enhanced. This fact relies on two unique properties of DRO: long-term orbital stability and short-term nearly uniform-velocity rectilinear motion.

*Long-term orbital stability* refers to the fact that a few DROs are stable and maintain nearly unchanged orbital shapes and orientations for years and even centuries in a realistic gravitational force environment (Bezrouk & Parker, 2014; Turner, 2016). Considering this feature, the National Aeronautics and Space Administration (NASA) has proposed the Asteroid Redirect Mission (Strange et al., 2013) wherein a piece of asteroid boulder will be redirected and stored in a stable DRO. With respect to XPNAV, the amplitude of orbital errors does not diverge rapidly. However, it remains nearly unchanged during each period of the DRO, which is beneficial for lengthening the observation period interval 𝑇.

The nearly uniform velocity rectilinear motion implies that the spacecraft motion can be approximated by simple parametric models with as few parameters as possible. This approximation holds for more than 10,000 s in DRO, at least an order of magnitude longer than that in LEO (e.g., 1,800 s), and it substantially reduces the computational complexity of parameter estimation that eventually facilitates estimating the spacecraft orbital states and timing efficiently and accurately, as demonstrated by Winternitz et al. (2015) and Xue et al. (2015).

In summary, the long-term orbital stability and short-term, nearly uniform-velocity rectilinear motion support the enhancement of the SNR for XPNAV in DRO. This would also facilitate achieving an improved XPNAV performance in DRO, compared with that in LEO using the same detector, or achieving the same-level navigation accuracy using a much smaller detector than that in LEO. Therefore, satellite parking in a DRO does not need to carry the large, weighty, and high-power detector used in the SEXTANT mission (Mitchell et al., 2018). This advantage strongly supports the autonomy of the server satellite used in this study.

## 3 SIMULATING PHOTON TIMESTAMPS

This section describes a method for generating high-fidelity simulated photon timestamps—the original measurements of XPNAV. The simulation is based on the elaborate models presented in Figure 3.

The photon timestamp relies on the models of pulsar timing (Edwards et al., 2006; Razzano, 2007), satellite orbit, and satellite clock. Section 3.1 introduces these models and their parameter settings, and Section 3.2 describes the generation and validation of the simulated photon timestamps.

### 3.1 Photon timestamp model

An X-ray photon timestamp 𝑠_{𝑘} is an onboard clock readout when the *k*-th photon from the observed pulsar (and its near background) arrives at the detector. It is modeled as an event of a non-homogeneous Poisson process (NHPP; Anderson & Pines, 2014), with the rate function 𝜆(⋅) expressed by Equation (2):

2

where 𝐴 is the detector effective area and 𝛼𝛽 denote the source and background flux count rates of the observed pulsar, respectively. The function ℎ(⋅) defines the unique pulse shape (also referred to as *template profile* or *light curve*) of a specific pulsar and describes the variation of the pulsar radiating flux to an observer in the solar system. Further, ℎ(⋅) is a nonnegative, continuous, and piecewise differentiable periodic real function and has a unit area such that , and it also has a period in cycles such that ℎ(𝜙 + 𝑛) = ℎ(𝜙), where *n* is an integer.

Figure 4 illustrates the pulse shapes (flux vs. pulse phase) for four pulsars (i.e., B1937+21, B1821-24, J0218+4232, and J0437-4715) that are obtained from long-term astronomical observation data. The amplitudes are normalized to the maximum flux rate of each pulsar (Strange et al., 2013). The *k*-th photon received by the satellite at time instant 𝑠_{𝑘} corresponds to a pulse phase value when this photon is emitted, and the pulse phase is denoted by 𝜙_{r}(𝑠_{𝑘}), which is related to the pulsar timing model.

#### 3.1.1 Pulsar timing model

The pulsar timing model describes the pulsar rotation phase 𝜙_{SSB}(𝑡^{TCB}) that is presumably observed at the solar system barycenter (SSB) with respect to time measured in the barycentric coordinate time (TCB), as shown in Equation (3):

3

where , and are the pulse phase, frequency, and time derivative of the frequency at an initial epoch , respectively. Further, 𝜂_{𝑃} is the timing uncertainty, and it is modeled as Gaussian white noise, as expressed in Equation (4), where 𝜎_{𝑃} denotes the standard deviation (Deneva et al., 2019).

4

Table 1 shows the pulsar timing model parameters for four X-ray pulsars with high stability, which are cited from Vivekanand (2020) and Perera et al. (2019).

Figure 5 shows the Hadamard deviations of the pulse phase series computed using Equation (3), with a period of 1,500 days and 1-Hz data sampling. Rather than using the Allan deviation, Hadamard deviation is used here because the former diverges when the time derivative of frequency exists. This demonstrates that pulsar rotations become more stable with an increase in averaging time. It should be noted that the rotation frequency of the brightest crab pulsar changes with unknown glitches, making it unsuitable for high-fidelity XPNAV (Ray et al., 2017); thus, it is not considered here.

Using the pulsar timing model, the pulse phase evolution function 𝜙_{r}(⋅) at the detector with respect to the onboard clock readout 𝑠_{𝑘} can be computed by Equation (5):

5

where denotes the time measured in the TCB when the *k*-th photon presumably arrives at the SSB. The relationship between 𝑠_{𝑘} and is modeled as Equations (6) and (7):

6

7

Here, the unit vector 𝒏 denotes the direction of the observing pulsar in the J2000 inertial reference frame and is expressed by the right ascension and declination. is the corresponding time instant given by the terrestrial time (TT) that is not affected by the clock noise and is obtained by iteratively solving Equation (7). is the position vector of the Earth mass center relative to the SSB and is computed by interpolating the planetary ephemeris, and is the position vector of the satellite relative to the Earth mass center.

Moreover, is the clock offset relative to TT. and are interpolated from the assumed true satellite orbit (see Section 3.1.2) and the true satellite clock offset series (see Section 3.1.3), respectively. denotes the delays, including the Einstein, Shapiro, and binary pulsar system delays (if the binary companion exists); however, it does not include the solar system or interstellar dispersion because X-ray band photons are immune to them (Deneva et al., 2019). Models for Δ_{other} can be found in Edwards et al. (2006); thus, they are not addressed here. Finally, *c* is the speed of light.

#### 3.1.2 Satellite orbit model

The assumed true satellite orbit is generated by numerically integrating a set of ordinary differential equations (ODEs) in Equation (8), given the initial orbital states at a specific epoch:

8

where 𝒀(𝑡^{TT}) = [ 𝒙(𝑡^{TT})^{T} 𝒗(𝑡^{TT})^{T} ]^{T} denotes the satellite position and velocity and 𝒂(𝑡^{TT},𝒙) denotes the acceleration experienced by the satellite comprised of the solar radiation pressure 𝒂_{𝑆𝑅𝑃} and the gravitational attraction of the Earth 𝒂_{𝐸}, Moon 𝒂_{𝑀}, and Sun 𝒂_{𝑆}. This is expressed by Equation (9):

9

To compute the gravitational attractions, the Earth and the Moon are modeled as non-spherical shapes, the Sun is modeled as a mass point, and their positions are read from the planetary ephemeris. The solar radiation pressure is modeled as Equation (10):

10

where P is the solar radiation pressure at one astronomical unit (AU), 𝐶_{𝑅} is the radiation pressure coefficient, is the area-to-mass ratio, and 𝐬 is the vector pointing from the satellite to the Sun (unit is AU). Table 2 provides the force model and integrator parameters to obtain high-fidelity orbits.

Three DROs were generated as shown in Figure 6. They are no longer perfectly periodic in the high-fidelity environment, but still hold shapes similar to those presented in Figure 2.

Table 3 shows their initial states at the start of Feb 1, 2018 TT, and the time labels for the generated assumed true satellite orbital states are expressed in TT.

#### 3.1.3 Satellite clock model

The satellite clock model describes the clock states (the clock offset, clock drift, and clock aging) at . With a given initial state 𝛿𝒕_{0}, the clock states at equally spaced TT time epochs are obtained using a recursive model (Zucca & Tavella, 2005) given by Equation (11):

11

where 𝛿𝜍 is the recursive step size, 𝜺_{𝑘} = [ 𝜀_{1} 𝜀_{2} 𝜀_{3} ]^{T} is a stochastic vector that represents the innovation of the stochastic process, and 𝜺_{𝑘} is distributed as a Gaussian random vector with zero mean and covariance matrix expressed by Equation (12).

12

where 𝑞_{1}, 𝑞_{2}, and 𝑞_{3} represent the process noise caused by frequency white noise, frequency random walk noise, and frequency random run noise in the clock, respectively (Hutsell, 1995).

We propose three types of clocks with different stability properties. *Clock 1* corresponds to a highly stable hydrogen maser in a laboratory environment (Piriz et al., 2019), *Clock 2* is an ultra-stable oscillator (USO) used for lunar exploration mission LRO (Dirkx, 2015), and *Clock 3* is the USO that is commonly used in near-Earth space missions (Winternitz et al., 2018). Table 4 provides the parameters for the three clocks.

The left panel of Figure 7 shows 10 series of simulated clock offsets for Clock 1, Clock 2, and Clock 3. These clock offsets are generated by setting all the initial states to zero (𝛿𝒕_{0} = **0**), starting at the start of Feb 1, 2018 TT, and spanning 1,500 days. The right panel of Figure 7 shows their analytical Hadamard deviations, calculated using the Hadamard-Q equation (Hutsell, 1995), given by Equation (13):

13

The Hadamard deviation of the rotation of PSRB1937+21 is also provided for comparison. The results indicate that the stabilities of Clock 2 and Clock 3 are worse than the pulsar rotation when the averaging times are larger than 3 × 10^{5} s and 2 × 10^{4} s, respectively. Thus, the long-term stability of these clocks can be improved by exploiting the stability of pulsar rotation.

### 3.2 Generation and validation of the simulated photon timestamps

Employing the photon timestamp model, the thinning method (Lewis & Shedler, 1979; Pasupathy, 2011) is used to generate a series of photon timestamps 𝑠_{𝑘} based on the following steps:

Initialization—set 𝑘 = 0 and , where is the start epoch of the simulation; thereafter, convert to using Equation (6).

Generate two independent and uniformly distributed random variables 𝑢

_{1},𝑢_{2}∼ 𝑈(0, 1).Update with ‒ ln𝑢

_{1}∕𝜆_{𝑢}and compute the NHPP rate function value using Equation (2).If , convert to 𝑠

_{𝑘}in terms of Equations (6) and (7), and then update 𝑘 with 𝑘+1. Else, ignore.Repeat the process from Step (2) until the end of the simulation time span.

To validate the correctness of the method for simulating photon timestamps, a test scenario was designed with the necessary parameters shown in Table 5 and Table 6. The simulation parameters are designed to ensure that each pulsar in the target list is consequently observed with an equal observation period interval of 1,800 s. The pulsar parameters are set according to those mentioned in Winternitz et al. (2015); they are assumed to be the ground truth for simulating photon timestamps.

The photon timestamps from each pulsar are output to a standard FITs format file and are processed by the Fermi plug-in of Tempo2 software (Ray et al., 2011). A pulse phase value is calculated for each photon timestamp. Subsequently, all pulse phases are folded into histograms by assigning them to 100 evenly distributed phase bins, as shown in Figure 8.

Here, the expected rate functions (denoted by red lines) are re-scaled in terms of the observation period interval and the bin size, yielding 𝜆(⋅) × 1800∕100. The consistency between the histograms and rate functions verifies the correctness of the models for simulating high-fidelity photon timestamps.

## 4 XPNAV ALGORITHMS

This section first describes the framework and procedures of the EKF and then analyzes the covariance for intermediate measurement noise that is essential for XPNAV. The EKF is devised because its concise form makes it easier to implement real-time computation, compared with other nonlinear filters (Chen et al., 2020), and the EKF serves as a baseline approach to conduct an assessment of the XPNAV performance.

### 4.1 EKF framework

Figure 9 shows the framework of the EKF. It consists of three main procedures: time update (prediction), intermediate measurement estimation (processing), and measurement update (correction).

First, the a priori states at an epoch 𝑡_{𝑖‒1} are used to predict the filter states (i.e., position, velocity, and clock offset) between 𝑡_{𝑖‒1}and 𝑡_{𝑖}, while the photons from a single pulsar are accumulated to form a batch of timestamps. Thereafter, the intermediate measurements, pulse phase, and Doppler frequency at 𝑡_{𝑖} are estimated from batched timestamps.

Finally, the predicted filter states at 𝑡_{𝑖} are corrected using the intermediate measurements to determine the estimated states. The estimated states serve as the *a priori* states for the next-time update procedure. Note that the filter epoch 𝑡_{𝑖} is expressed in the onboard clock readout, and this is the case for all the time labels used afterward, if not specifically noted.

The filter states at 𝑡_{𝑖} are expressed as Equation (14):

14

where denotes the orbital states (i.e., position and velocity) of the satellite, 𝐶_{𝑅𝑖} is the solar radiation pressure coefficient, and 𝛿𝐭_{𝑖} = [Δ_{𝐶}(𝑡_{𝜍𝑘}) Δ′_{𝐶}(𝑡_{𝜍𝑘}) Δ″_{𝐶}(𝑡_{𝜍𝑘})]^{T} represents the clock states (i.e., offset, drift, and aging) at the initial epoch of the 𝜍_{𝑘}-th clock arc interval [𝑡_{𝜍𝑘},𝑡_{𝜍𝑘+1}).

Figure 10 illustrates the concept of the filter step (𝑡_{𝑖‒1},𝑡_{𝑖}), observation period interval, clock arc interval, and simulation span [𝑡_{0},𝑡_{f}). The clock arc length is denoted by 𝜍 = 𝑁_{𝑐}(𝑡_{𝑖} ‒𝑡_{𝑖‒1}), where 𝑁_{𝑐} is a positive integer, whose value is determined according to the historical clock offset data.

### 4.2 Time update (prediction)

The time update procedure obtains the predicted filter states 𝐗_{𝑖|𝑖‒1} at 𝑡_{𝑖} using the *a priori* filter states 𝐗_{𝑖‒1} at 𝑡_{𝑖‒1} as input. Specifically, the predicted orbital states 𝐘_{𝑖|𝑖‒1} are solved by integrating the ODEs expressed in Equation (8) with 𝐘_{𝑖‒1} as the initial value, in terms of constant 𝐶_{𝑅} and the onboard predicted TT time that is computed using Equation (15):

15

where 𝑡 is the onboard clock readout and is the predicted clock offset computed by Equation (16):

16

The clock states are predicted by Equation (17):

17

where 𝚽_{𝐶} is the clock state transition matrix. It is an identity matrix if 𝑡_{𝑖} does not reach the end epoch of the clock arc; otherwise, 𝚽_{𝐶} is given by Equation (18):

18

In addition to the filter states, the corresponding covariance matrix is propagated using Equation (19), with 𝐏_{0} as the initial value that is a diagonal matrix, and its elements are given by the uncertainties of the initial states.

19

where 𝚽_{𝑖|𝑖‒1} is the filter state transition matrix and is denoted by Equation (20):

20

Here, the orbital state transition matrix 𝚽_{𝑌}(𝑡_{𝑖},𝑡_{𝑖‒1}) is obtained by solving the following ODEs shown by Equation (21):

21

with the initial value 𝚽_{𝑌}(𝑡_{0},𝑡_{0}) set as the identity matrix (Montenbruck & Gill, 2011).

In Equation (19), 𝐐_{𝑖} is the covariance matrix of the process noise, which is expressed as Equation (22):

22

where 𝑞_{𝑐} is a tuning parameter used to obtain the best representation of the orbital process noise and 𝚪_{𝑌} is the orbital process noise transform matrix, as expressed in Equation (23), with 𝑇 as the length of the filter step, which equals the observation period interval length; that is, 𝑇 = 𝑡_{𝑖} ‒𝑡_{𝑖‒1}.

23

Further, 𝐐_{𝐶} is the covariance of the clock process noise (Hutsell, 1995; Zucca & Tavella, 2005), which is a zero matrix before the filter epoch 𝑡_{𝑖} reaches the end epoch of the clock arc. Otherwise, 𝐐_{𝐶} is given by resetting 𝛿𝜍 for 𝐐_{clc𝑘} expressed in Equation (12), as given by Equation (24):

24

### 4.3 Intermediate measurement estimation (processing)

#### 4.3.1 Estimation of the pulse phase and Doppler frequency

Before the measurement update, the pulse phase and its derivative (i.e., Doppler frequency) at 𝑡_{𝑖} should be estimated using the batched photon timestamps collected during the observation period interval. As proposed by Winternitz et al. (2015), 𝜙_{r}(𝑡) is approximated by the sum of the predicted and a time-varying function 𝑒(𝑡) as given by Equation (25):

25

Further, the predicted pulse phase function is computed using Equations (5) through (7). Here, the satellite position is given by the predicted orbital positions , and the clock offset is computed as . The position of the Earth and delays required in Equation (6) can be computed using , and these delays are thus assumed to be known during the observation period interval. The analytical form of 𝑒(𝑡) can be expressed as given by Equation (26):

26

where and are the unknown predicted errors of the orbital position and the clock offset, respectively.

𝛿𝒙(𝑡) in DRO can be approximated as a linear model during an observation period interval. Thus, we have Equation (27):

27

where 𝛿𝒙(𝑡_{𝑖‒1}) and 𝛿𝒗(𝑡_{𝑖‒1}) are the unknown constant position and velocity estimation errors at 𝑡_{𝑖‒1}, respectively. The approximation error can be evaluated from the distance between the state transition matrix for a DRO and that for a uniform-velocity rectilinear motion as Equation (28):

28

Figure 11 illustrates the evolution of the approximation error with respect to the observation period interval length in LEO (altitude of 500 km), MEO (20,000 km), and GEO (36,000 km). This shows that the small value of the approximation error in DRO is maintained for a much longer time (10,000 s) compared with that in LEO (1,000 s), justifying the approximation.

Moreover, 𝛿Δ_{𝐶}(𝑡) in Equation (26) could be approximated by Equation (29):

29

where 𝛿Δ_{𝐶}(𝑡_{𝜍𝑘}) and are unknown constant estimation errors for the clock states at the start epoch of the 𝜍_{𝑘}-th clock arc. Therefore, 𝑒(𝑡) can be modeled by a linear polynomial and expressed as Equation (30):

30

where *q*_{1} and *q*_{2} are the constant unknown parameters. Here, we change the reference epoch at the end epoch of the observation period interval to accommodate the filter update epoch 𝑡_{𝑖}.

Once 𝑒(𝑡) or, equivalently, the pulse phase function 𝜙_{r}(𝑡) is parameterized, the maximum likelihood estimation (MLE) method is used to estimate *q*_{1} and *q*_{2} from the photon timestamps. The likelihood function (Winternitz et al., 2015) of the batched photon timestamps can be expressed as Equation (31):

31

The optimal value can be obtained by maximizing the log likelihood function as Equation (32):

32

Here, the exponential part is dropped because it is constant for a specified length of the observation period interval, and Ω is the parameter search grid. The search range for 𝑞_{1} is determined from the sum of the position and clock offset diagonal elements of the covariance matrix, and the search range for 𝑞_{2} is determined from that of the velocity and the clock rate components, all converted to the corresponding unit.

With the estimated parameters, the intermediate measurements at 𝑡_{𝑖} can be obtained from Equation (33):

33

where and are the estimated pulse phase and Doppler frequency, respectively, which are deemed to be the intermediate measurements of XPNAV.

#### 4.3.2 Covariance of the intermediate measurement noise

The covariance matrix 𝐑_{𝑖} of the intermediate measurement noise is required in the EKF. Poisson noise is the most significant part of the measurement noise, and the covariance of Poisson noise can reach the Cramer–Rao lower bound (CRLB; Emadzadeh & Speyer, 2010; Winternitz et al., 2015) as shown in Equation (34), given a sufficiently large SNR:

34

where 𝑇 and 𝐴 are the observation period interval length and detector effective area, respectively, and 𝐼_{𝑃} is defined by Equation (35). A narrow and sharp pulse shape provides a large 𝐼_{𝑃} value and, thus, a higher estimation accuracy for the intermediate measurements.

35

When the Doppler effect is sufficiently small, the CRLB of the estimated pulse phase decreases by half (Kay, 1993; Ray et al., 2017), relative to that of the linear model in Equation (34), which is beneficial for improving the XPNAV performance. Specifically, when the amplitude of *q*_{2} is smaller than a specific criterion, the evolution of 𝑒(𝑡) during the observation period interval is approximated by Equation (36):

36

In other words, the difference between the predicted pulse phase function and the real counterpart is constant during the observation period interval, and it is caused by the position and clock offset estimation error.

For example, when the sum of uncertainties from the velocity 𝛿𝒗(𝑡_{𝑖‒1}) and clock rate is less than 0.1 m/s (corresponding to the Doppler prediction error), the constant model can be used. This can only be realized in DRO because the velocity error in LEO is much larger, as presented later in Table 12.

Furthermore, for a specific pulsar, when the area-time product value (equivalent to the SNR) is less than the criterion noted by 𝑞_{𝑝}, the Poisson noise can diverge drastically (Golshan & Sheikh, 2007; Hanson et al., 2008; Runnels & Gebre-Egziabher, 2017; Sheikh et al., 2011), making the CRLB invalid. We used the Monte Carlo simulation method to determine the criterion 𝑞_{𝑝} for each pulsar.

Supposing the state errors 𝛿𝐱(𝑡_{𝑖‒1}), 𝛿𝐯(𝑡_{𝑖‒1}), 𝛿Δ_{𝐶}(𝑡_{𝜍𝑘}), and are all set to zero implies that we are performing the MLE on the true satellite orbit with an ideal onboard clock. The constant model in Equation (36) can be used to model the pulse phase, and the true value *q*_{1} = 0 holds during the observation period interval. Subsequently, for each simulation, different combinations of areas and observation period interval lengths are selected to generate simulated photon timestamps that are processed using the aforementioned MLE method.

The searching space for *q*_{1} is manually set to the ±0.9 cycle of the observing pulsar rotational period that is sufficiently large, ensuring the reliability of the searching results. For each combination, 50 simulations and estimations were performed. Figure 12 shows the root-mean-squares (RMSs) of these estimation results, where each MC point represents the RMS result of the simulations with the same area-time product. The analytical CRLB results are also provided for comparison.

The area-time product criterion 𝑞_{𝑝} is defined as the area-time product value when the Monte Carlo simulation results are within ±10% the analytical CRLB that are shown in Table 7.

This indicates that PSRB1821-24 has a larger 𝐼_{𝑃} than PSRJ0218+4232, but they have a similar flux ratio (the ratio between the source and the background flux rate). Therefore, a much smaller 𝑞_{𝑝} exists for the former.

PSRB1937+21 has a larger 𝐼_{𝑃} and smaller flux ratio than PSRJ0437-4715, and they have the same 𝑞_{𝑝} value. Therefore, it is understood that 𝑞_{𝑝} exhibits an inverse relationship with 𝐼_{𝑃} and the flux ratio of the pulsar. The specific value of 𝑞_{𝑝} may vary for different simulation settings (Deneva et al., 2019; Sheikh et al., 2011), and this may become a concern when a small-effective-area detector is onboard.

### 4.4 Measurement update (correction)

The estimated intermediate measurements and their noise covariance matrix are subsequently sent to the EKF to complete the measurement update step given by Equation (37):

37

Here, 𝐊_{𝑖} is the Kalman gain and 𝐇_{𝑖} is the design matrix given by Equation (38):

38

where 𝐧 is the direction vector of the observed pulsar. It is noted that is assumed to be zero in the design matrix because it is in the order of 10^{‒14} Hz/s and causes an extremely small-term approximating error during the observation period interval (up to 10^{4} s).

Repeating the aforementioned time update, intermediate measurement estimation, and measurement update procedures provides the state estimation 𝐗_{𝑖} and the covariance matrix 𝐏_{𝑖} at each epoch of the filter.

## 5 MONTE CARLO SIMULATIONS AND PERFORMANCE ASSESSMENT

In this section, the XPNAV performance in DRO is assessed through Monte Carlo simulations, considering three key factors: the DRO orbital period, onboard clock noise, and detector-effective areas. In Sections 5.2 through 5.4, one of the key factors is assessed in several scenarios, assuming that the other factors remain unchanged. For each scenario in which a set of three factors are specified, 100 trials of EFK runs are performed to assess the XPNAV performance.

### 5.1 Navigation settings and test

For all the trials of the EKF runs, we specify five aspects of simulation parameter settings commonly used in the following Monte Carlo simulations.

#### 5.1.1 Parameters of the pulsar timing model

The direction unit vector 𝐧, pulse shape ℎ(⋅), and flux rates 𝛼,𝛽 are all set to be the same as those used in simulating photon timestamps. This implies that uncertainties in these parameters (called *astrometric errors*) are ignored in this study.

#### 5.1.2 Uncertainty of the initial orbital states

The uncertainty of the initial orbital state, relative to the assumed true orbit, is randomly generated using zero-mean Gaussian noise models, with the standard deviations provided in Table 8. Such initial uncertainty in the position and velocity can be obtained autonomously using only an onboard optical sensor (Qian, 2013).

#### 5.1.3 Uncertainty of the orbit model parameters

The orbit model parameters used in the EKF runs are set differently from those used in simulating the photon timestamps. Table 9 shows the different items. As one of the orbit model parameters, 𝐶_{𝑅}, is also among the filter states, its initial value is set to that of the true 𝐶_{𝑅}, plus 20% uncertainty and is corrected during the navigation. The value of the process noise parameter 𝑞_{𝑐} in Equation (22) is tuned according to the navigation performance.

#### 5.1.4 Uncertainty of the clock model parameters

The uncertainty of the initial clock state, relative to the true clock state, is randomly selected from the zero-mean Gaussian white noise with the standard deviations shown in Table 10. The * q* parameters for each clock used in Equation (24) can be pre-determined from the historical clock readout data. Thus, they are assumed to be known and set to the same value as those used for simulating the photon timestamps.

The clock arc length 𝜍 for each clock is also determined by analyzing the simulated clock offset data to ensure that the clock offset revolves in a polynomial form within the clock arc interval. Specifically, the simulated clock offset series from each clock are fitted with an equally spaced piece-wise polynomial model, and the size of the piece is adjusted to minimize the RMS of the fitting errors. Table 10 shows the best-fitted pieces that are set as the clock arc lengths.

Figure 13 shows the fitting errors for these clock offset series. Further, the amplitudes of the fitting errors for the three clocks are all within 10 ns (3 m), which is negligible, compared with the amplitude of Poisson noise (kilometer level), and justifies the setting of the corresponding arc lengths.

#### 5.1.5 Observation schedule

When the angle between the line-of-sight of the pulsar and the Sun is less than the prescribed Sun avoidance angle, the pulsar is overwhelmed. Hence, the simulated photon timestamps are not available. The Sun avoidance angle is set to 40 degrees in the simulations, according to Branduardi-Raymont et al. (2011) and the NuSTAR team (2014). Such a value makes PSRB1937+21 unblocked all the time for satellites in DRO, PSRB1821-24 blocked from November through January, and PSRJ0218+4232 blocked from March through May every year.

Meanwhile, sheltering from the Earth and the Moon is not considered in this study because of their relatively short-term periods. The start epoch of the simulation is set to the start of Feb 1, 2018 TT, to avoid the obstacle of PSRB1821-24 at the start of the simulation. Moreover, the satellite is assumed to carry a single detector, and thus, a single pulsar is observed at a time. For each scenario, each pulsar in the target list is observed sequentially for a given fixed observation period interval and this observation process is restarted upon reaching the end of the target list.

The accuracy of the XPNAV performance is evaluated by comparing the estimated and true states. For a specific scenario, during each of its 100 trial EKF runs, the difference between the estimated and true values was calculated and recorded. Thereafter, the RMS of 100 recorded differences at the same epoch was computed, which is considered the error of this epoch. Finally, the errors for all epochs formed a series to illustrate the performance of this scenario.

When the amplitude of position errors converges within 1 km, the length of the interval from the start to the epoch is recorded as the convergence time for this scenario. The RMS of the 3D position errors (i.e., square root of the sum of the squared position errors in x, y, and z directions), 3D velocity errors, and clock offset errors after the filter converge, is called the *position accuracy*, *velocity accuracy*, and *clock offset accuracy*, respectively. The diagonal components of the state covariance matrix 𝐏_{𝑖} provide the precision estimation of the filter states.

We investigated XPNAV in a LEO demo scenario to validate the algorithms; thus, the parameters were set as close as possible to those used in Winternitz et al. (2015). The 3D position and velocity error series are shown in Figure 14.

The position accuracy is 3.25 km, the velocity accuracy is 2.9 m/s, and the 3D position errors converge below 5 km after 1.1 days. This result conforms to those in Winternitz et al. (2015) and verifies the correctness of the EKF algorithm. In addition, the position errors are well within the uncertainty given by the diagonal component of the covariance matrix, justifying the precision estimation of the filter.

The pre-fit residuals [𝐙_{𝑖} ‒ 𝐡(𝐗_{𝑖|𝑖‒1}) in Equation (37)] for all the pulsars observed are illustrated in Figure 15, which provides an inner consistency check for the EKF performance. After the convergence of the filter, the RMS of the errors for each pulsar is 3.1, 2.9, 12.7, and 44.6 km, respectively, which are close to the Poisson noise bounded by the CRLB. This confirms the effectiveness of the MLE and validates the convergence of the filter.

### 5.2 Sensitivity to the DRO orbital period

Three scenarios with different DRO orbital periods are described here. Table 11 shows the settings necessary for these scenarios. In addition to the effective area and the observation period interval, two other settings make the simulation conditions as close as possible to those in LEO.

First, an *ideal clock* that maintains TT time is used to provide a configuration comparable to that in Winternitz et al. (2015), where the onboard clock is synchronized to an ideal clock maintaining (GPS time and equivalent to) TT time at the onboard GPS receiver. Second, the simulation period is designed to avoid obstacles from the Sun, ensuring that the targets are all visible during the navigation.

Figure 16 shows the position error series for these scenarios and that the convergence time is proportional to the orbital period of the DRO.

Table 12 indicates that XPNAV in DRO achieves a considerably higher position and velocity accuracy than that in LEO. However, the convergence speed of XPNAV in DRO is much slower than that in LEO. In addition, the position accuracy is similar for different orbital periods, and the EKF convergence time is proportional to the DRO orbital periods, while the velocity accuracy is inversely proportional.

### 5.3 Sensitivity to clock noise

Three scenarios with different clocks are designed based on the settings provided in Table 13. Figure 17 presents the 3D position and clock offset errors for these scenarios, and Table 14 shows the navigation performance.

Evidently, if Clock 1 is used rather than Clock 2, the clock offset accuracy increases from 0.3 *μ*sto0.08 *μ*s, while the position accuracy remains comparable. However, if the less-stable Clock 3 is used, the 3D position errors cannot converge to a level below 1 km, and the position and clock accuracy degenerate drastically compared to those when Clock 2 is used.

As shown in Figure 18, the stability of the estimated clock offset provided by the onboard EKF is analyzed using the Hadamard deviation. For comparison, the Hadamard deviations for the simulated clock offset and the pulse phase data series of PSRB1937+21 are also shown.

Clearly, Clock 1 is more stable than pulsars, and the XPNAV estimated clock offset is less stable than the original clock offset that also appears at the small averaging time interval when Clock 2 or Clock 3 is used. This can be explained by the short-term unstable Poisson noise in XPNAV measurements.

The stability of the estimated clock offsets using Clock 2 and Clock 3 approaches the stability of pulsar rotation at a large averaging time interval and becomes more stable than the original clock offset, demonstrating the feasibility of improving onboard clock stability employing XPNAV.

To quantitatively compare the stabilities, we define the improvement in the frequency stability as Equation (39):

39

where 𝜎_{𝐻,𝑆}(𝜏) and 𝜎_{𝐻,𝐸}(𝜏) represent the Hadamard deviation for the simulated and estimated clock offsets, respectively. Improvements in clock stability are shown in Table 15. This implies that the less stable the onboard clock, the more likely that the XPNAV can improve its stability.

### 5.4 Sensitivity to the detector effective area

Five scenarios are designed using detectors with different effective areas. Table 16 presents the specific simulation settings.

For each scenario, the observation period for each pulsar is set to 𝑇 = 𝑞_{𝑝}∕𝐴, according to the area-time product criterion given in Table 7. A longer observation period is not pursued here because it also causes an increase in the predicted orbital position error and finally decreases the overall navigation performance. Table 17 shows the accuracy for these five scenarios.

Evidently, when the performance of effective area decreases, the position accuracy and the clock offset accuracy also decrease while the convergence time increases. As shown in Figure 19, a roughly linear relationship exists between the position accuracy, convergence time, and the square root of the inversed effective area.

Further, the velocity accuracy and the clock offset show a similar pattern. This is possibly because Poisson noise dominates the measurement noise after the convergence, considering that the amplitude of the noise is proportional to the squared root of the inverse effective area obtained when using Equation (34).

## 6 CONCLUSIONS

The following conclusions can be drawn from this study: 1) XPNAV in DRO is feasible, investigated by simulating high-fidelity photon timestamps, devising navigation algorithms, and performing Monte Carlo simulations; 2) The position accuracy in DRO (under 100 m) is considerably improved compared to that in LEO (approximately 5 km), using the same detector, clock, and observing pulsars and these results hold for several DROs with different orbital periods; and 3) The long-term frequency stability of the onboard clock can be improved and maintained by the XPNAV. The less stable the clock, the more likely that XPNAV can improve its stability.

Neglecting astrometric errors during the navigating Monte Carlo simulations may overstate the XPNAV performance in this study. However, the following two pieces of evidence justify the neglect. First, the simulation results lead to a comparable position accuracy as that of the on-orbit test in LEO (Mitchell et al., 2018), implying that astrometric errors are not significant, relative to other errors. Second, Liu et al. (2010) show that the astrometric error causes slow time-varying errors in the direction of the pulsar, and this error can be estimated and eliminated.

By augmenting a piece-wise constant state to the navigation states, the position accuracy increases from approximately 800 m to sub-100 m. What’s more, modeling the pulsar timing model based on Gaussian white noise may ignore the potential red noise, such as the potential existing spinning red noise and the noise caused by errors in the pulsar period or the pulsar direction. Considering that red noise often has a period longer than 1 year, it may prevent the onboard clock stability from being improved by XPNAV. However, the feasibility of timekeeping is not affected.

Although the obtained navigation accuracy is still worse than that of the traditional ground-based radio positioning system, XPNAV provides an autonomous navigation method for satellites in the Earth–Moon DRO. Its significance lies in the achievement of autonomy and reduction of the burden on the ground system.

This means that XPNAV can be used to support Earth– Moon system missions with the currently available technology, which has not been realized so far. In addition, it also implies that the application scope of XPNAV can be extended to other stable DROs in the Sun–Earth system and Jupiter–Europa system, etc., to support improved mission autonomy.

## HOW TO CITE THIS ARTICLE

Liu J, Wang W, Zhang H, Shu L, Gao Y. Autonomous orbit determination and timekeeping in lunar distant retrograde orbits by observing X-ray pulsars. *NAVIGATION*. 2021;*68:*687–708. https://doi.org/10.1002/navi.451

## ACKNOWLEDGMENT

This work was supported by the Chinese Academy of Sciences Joint Fund (Grant No. 6141A01011703) and the Key Research Program of the Chinese Academy of Sciences (Grant No. ZDRW-KT-2019-1). The authors wish to thank Chao Peng for his constructive feedback on this paper.

## Footnotes

**Funding information**Chinese Academy of Sciences Joint Fund, Grant/Award Number: 6141A01011703; Key Research Program of the Chinese Academy of Sciences, Grant/Award Number: ZDRW-KT-2019-1

- Received September 16, 2020.
- Revision received September 3, 2021.
- Accepted September 25, 2021.

- © 2021 Institute of Navigation

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.