Spatiotemporal Deep Learning Network for High-Latitude Ionospheric Phase Scintillation Forecasting

In this paper


INTRODUCTION
Our modern society has become increasingly reliant on the services provided by global navigation satellite systems (GNSSs). In addition to traditional functions, such as navigation and positioning, other high-impact applications, such as power grids, financial services, communications, and network systems, also rely on the precise timing service provided by GNSSs. However, GNSS receivers are vulnerable to disturbances because of the weak GNSS signal power. Ionospheric scintillation is one type of such disturbance that refers to rapid amplitude and/or phase fluctuations of the GNSS signals propagating through ionospheric irregularities (Breitsch et al., 2020;Jiao & Morton, 2015;Jiao et al., 2013;Liu et al., 2018). The occurrence of scintillation may impair the receiver signal tracking loop, resulting in position, navigation, and timing service degradation, discontinuities, and/or even outages (Kintner et al., 2007;R. Yang et al., 2019). Therefore, it is important to forecast the occurrence of scintillation in advance so that appropriate measures can be taken to alleviate potential impacts.
Ionospheric scintillation is difficult to forecast because of the complex physics governing the generation of ionospheric irregularities, which are driven by coupling among solar wind, the magnetosphere, and the ionosphere (Prikryl et al., 2012;Wernik et al., 2003;. Extensive study has been devoted to investigating the connections between scintillation and coupling effects at both equatorial (Carter et al., 2014;Costa et al., 2011;de Lima et al., 2015;Rezende et al., 2010;Secan et al., 1995;Taabu et al., 2016;Z. Yang & Liu, 2016 and high-latitude regions (Jin et al., 2015;Prikryl et al., 2012Prikryl et al., , 2014Prikryl et al., , 2013Secan et al., 1997). In this work, we focus on phase scintillation forecasting at high-latitude regions, where space weather has a more direct impact on the occurrence of scintillation (Cowley, 2000;McGranaghan et al., 2018). At high-latitude regions, phase scintillation is primarily caused by steep ionospheric charged particle density gradients and irregularities associated with auroral and cusp precipitation and polar cap patches, where solar wind disturbances have been closely linked to the occurrence of phase scintillation (Prikryl et al., 2014). Both diffractive scintillation due to signal scattering and refractive effects due to rapid horizontal drift of irregularities across GNSS signal paths may lead to phase scintillation (McCaffrey & Jayachandran, 2019;Morton et al., 2021). Because of these complexities, it is very difficult to forecast scintillation based on a physical model of the ionospheric structure.
Recently, data-driven machine learning methods have emerged as an alternative approach for phase scintillation forecasting (Lamb et al., 2019;McGranaghan et al., 2018). In McGranaghan et al. (2018), the author applied a machine learning algorithm known as a support vector machine (SVM) to high-latitude phase scintillation forecasting. Measurements from the GNSS receiver, geomagnetic activity, particle precipitation data, and solar wind parameters are used as features. The results in that work showed that the SVM-based forecasting model outperformed the persistence method. The authors in Lamb et al. (2019) used a deep learning model to forecast the magnitude of phase scintillation. A novel convolutional architecture and loss function were designed to tackle the problem in which features consist of solar wind parameters, geomagnetic activity indices, etc.
The above work focused on forecasting scintillation using historical measurements from a GNSS monitoring station and measurements related to potential drivers of scintillation (solar wind parameters, geomagnetic activity indices, etc). These methods did not incorporate spatial information. Moreover, the models in previous methods are not ideal choices for time-series forecasting tasks when they fail to capture temporal information. In this study, we incorporate both spatial and temporal information to forecast scintillation. GNSS measurements at both target and surrounding GNSS stations and external features, such as solar wind parameters, are utilized. A spatiotemporal deep learning (STDL) network is implemented to employ spatial fusion and temporal fusion modules to adaptively incorporate information for achieving optimal performance. To the best of our knowledge, this is the first work to incorporate both spatial and temporal information in a deep learning network for the task of scintillation forecasting.
The remainder of this paper is organized as follows: Section 2 discusses the methodology, including the features and model structure. Section 3 describes the data set, followed by a performance evaluation in Section 4. Finally, concluding remarks and future work are discussed in Section 5.

METHODOLOGY
We formulate the high-latitude scintillation prediction as a time-series forecasting problem. Let us suppose that we have a sequence of evenly spaced time steps { , , , , , } t t t t n k n k n n m 1   , where t n is the current time, t n k − represents a time that is k steps in the past, and t n m + is a time that is m steps in the future. The objective is to apply features collected at time steps { , , , } t t t n k n k n 1  to a machine learning algorithm to forecast scintillation at the time step t n m + . In this work, the forecasting label is binary, indicating either the presence of scintillation or no disturbance. An illustration of the machine-learning-based forecasting procedure is shown in Figure 1. In the following subsections, we will discuss the target label, feature engineering, and STDL network machine learning algorithm architecture.

Definition of Phase Scintillation Occurrence
The objective, as illustrated in Figure 1, is to forecast the occurrence of phase scintillation at the target station. We define the scintillation label (phase scintillation occurrence) by setting a threshold on the average value of V I from all satellites in view (elevation mask: 30°), where an average V I that is higher than the threshold denotes the occurrence of scintillation and vice versa. Here, V I is defined as the standard deviation of the detrended carrier phase measurements (Jiao & Morton, 2015). The threshold is empirically set to 5° by not excluding the weak scintillation cases (Jiao et al., 2013). A different threshold can be set depending on the design requirement. For example, if strong scintillation is the target, we can increase the threshold. It should be noted that a change in threshold results in re-training of the model, including hyperparameter tuning.

Feature Engineering
Feature engineering is the process of using domain knowledge of the data to create features that can provide information for the forecasting task. In this work, we utilize three types of features that contribute collaboratively to the forecasting task, i.e., local GNSS measurements, external features, and surrounding GNSS measurements. Here, "collaboratively" indicates that the spatiotemporal information is incorporated. This feature selection follows the work in Wu & Liu (2021). Similar to most machine learning models, normalization is required to preprocess all of the features to follow a normal distribution, which prevents certain features from dominating the prediction (Bishop, 2006).

Local GNSS Measurements
In this work, a high-latitude GNSS receiver located at Poker Flat, Alaska (65.1° N, 147.4° W) was selected as the target station (blue icon in Figure 2) FIGURE 1 Illustration of the machine-learning-based forecasting procedure (Wu & Liu, 2021). In a time-series forecasting task, historical labeling of the forecasting state is critical. As a result, the binary scintillation labels 1 (indicating whether phase scintillation occurred) of the target GNSS station from t n k − to t n are utilized as features. In addition, the average S 4 , average V I , and average signal-to-noise ratio (SNR) from all satellites in view (election mask: 30°) at the target station are also employed. Here, S 4 and V I are the commonly used amplitude and phase scintillation indicators, respectively (Jiao et al., 2013). A summary of the features in local GNSS measurements is provided in Table 1.

External Features
Solar wind parameters observed at Lagrange Point L1 from t n k − to t n are used as features, which are denoted as space-based observations. As it usually takes around 1 h or longer for the solar wind measured at Lagrange Point L1 to impact the Earth (Jensen, 2013), these observations potentially offer lead-time information. In addition, ground-based magnetic activity indices, geomagnetic storm indices, etc. are also employed (Mandea & Korte, 2010). These space-based and ground-based solar-geomagnetic activity observations are denoted as external features in this work. A summary of the features used is given in Table 2. It should be noted that the Lyman alpha feature is measured by geosynchronous-orbit satellites and is categorized as a space-based observation (Machol et al., 2019). 1 The definition of the scintillation label will be discussed in Section 2.1.

Measurements from Surrounding GNSS Stations
Plasma irregularities in the ionosphere, which cause scintillation, drift over time (Wang et al., 2018). A plasma irregularity near the target station may impact the target station at a later time if the irregularity moves toward the target station. One way to detect these nearby irregularities is through the GNSS monitoring stations in the proximity of the target station. Therefore, historical measurements from surrounding GNSS stations are used as features to offer spatial information regarding plasma irregularities. Low-rate (every 30 s per sample) GNSS station receivers from the UNAVCO network are utilized (Pritchard et al., 2012). The surrounding GNSS stations are marked as yellow icons in Figure 2. Because of the unavailability of V I in low-rate GNSS receivers, another commonly used disturbance indicator known as the rate of total electron content (TEC) index (ROTI) is employed (McCaffrey et al., 2018;Pi et al., 1997). The rate of TEC (ROT) is also included as a feature in this study. A summary of the features is given in Table 3.

Machine Learning Algorithm-STDL Network
Let us assume that we have a data set with m samples D is the corresponding binary forecast label (scintillation or no disturbance). The objective of the machine learning model is to learn a mapping function = ( ) y f x such that ŷ matches the ground truth y as accurately as possible. Concatenating all features extracted from local GNSS measurements, external features, and measurements from surrounding GNSS stations into a feature vector may potentially lead to a loss of temporal information from historical measurements as well as spatial information from surrounding GNSS stations.
To address this problem, we apply an STDL network that adaptively captures spatiotemporal information (Liang et al., 2018). A block diagram of the STDL network is shown in Figure 3. Local GNSS measurements are shown as blue dots, while measurements from surrounding GNSS stations are denoted as yellow dots. Historical GNSS measurements from t n k − to t n are employed. In addition, external features are also employed. The breakdown of the STDL network is as follows: 1. At each time step, both local and surrounding GNSS measurements are first passed to a spatial fusion module that adaptively incorporates spatial information and constructs a fused feature vector. Here, the distance from a surrounding station to the target station is also employed in the spatial fusion module (more details can be found in Equation (4) in Liang et al. (2018)). Details regarding the implementation of a spatial fusion module can be found in Section 3.1 in Liang et al. (2018). The spatial module takes information regarding the distance between stations into consideration, reflected as the similarity term P in the equation. In addition, the attention mechanism is also applied to adaptively learn the correlations between stations. Eventually, a weighted combination controlled by lambda is obtained. 2. The fused feature vector at each time step is passed to the long short-term memory (LSTM) module to construct a hidden state that incorporates the temporal information. LSTM is a popular recurrent neural network (Hochreiter & Schmidhuber, 1997). 3. The hidden states produced by LSTM are passed to a temporal fusion module to adaptively incorporate information from all time steps. The temporal fusion module is implemented based on the attention mechanism (Vaswani et al., 2017). More details on the implementation can be found in Section 3.2 in Liang et al. (2018). Here, the attention mechanism learns the optimal weighted combination across all hidden states, where the weight gamma is learned via an attention mechanism. 4. The temporally fused feature vector is concatenated with the external features and passed to a fully connected neural network (FCNN). Here, the FCNN is a basic deep learning architecture that conducts multilayer nonlinear mappings on the input vector (Bishop, 2006;Liu et al., 2019). This network is also known as a multilayer perceptron. 5. The output layer of the FCNN, which is also the output of STDL, is a single value that ranges from 0 to 1. This layer is designed to represent the probability of scintillation occurrence at the forecasted time. 6. To determine whether scintillation will occur, a threshold is determined beforehand and applied to the output. Any probability that is higher than the threshold denotes that scintillation will occur and vice versa. Here, the threshold is usually determined by the application requirements. For example, if a specific false positive rate is desired, a threshold that produces that false positive rate will be selected. In this work, we will evaluate the performance across all thresholds, as discussed in Section 4.1. The implementation of STDL follows the work in Liang et al. (2018), and the corresponding source code can be found at https://github.com/yoshall/GeoMAN. We use the same hyperparameters (e.g., learning rate, dropout rate) employed in Liang et al. (2018) to train the model. The original work in Liang et al. (2018) deals with a regression problem, where the loss function is the mean squared error. To adapt this network to our application, we modify the network to handle classification problems by applying a sigmoid function to the output of the original network. In addition, the loss function is changed to the focal loss, which deals with classification problems with an imbalanced data set (Lin et al., 2017). To incorporate station distance information into the network, Equation (4) in Liang et al. (2018) is implemented. The tunable parameter λ in Equation (4) is selected as 0.9 (Bishop, 2006;Liang et al., 2018). Here, a grid search of λ (step size of 0.1) from 0 to 1 is conducted to select the λ value that achieves the best performance in the validation set. P i j , in Equation (4), defined as the correlation between station i and j, is set to the reciprocal of the geographical distance between these two stations. This setup follows the intuition that two nearby stations are more correlated than two stations that are far from each other.

DATA SET DESCRIPTION
In this work, we use the 2015 data set utilized in Wu & Liu (2021). A GNSS station equipped with a Septentrio PolaRx5S receiver located at Poker Flat, Alaska (65.1° N, 147.4° W) is used as the target station, which collects high-rate data, with 50 − Hz signal intensities and 100 − Hz phase measurements. The surrounding GNSS stations belong to the UNAVCO GNSS network, where the station ID is given next to the yellow icons in Figure 2 (Pritchard et al., 2012). More details can be found in https://kb.unavco.org/kb/article/unavco-resources-permanent-gps-gnss -stations-634.html. The external features are extracted from the National Aeronautics and Space Administration (NASA)/Goddard Space Flight Center (GSFC) OMNI data set through OMNIWeb (J. H. King & Papitashvili, 2005). Because the solar wind parameters in OMNI have been time-shifted to the Bow-shock nose, we process the data by shifting back in time to locate the data measured at Lagrange Point L1 (J. King & Papitashvili, n.d.). For the data set, occasionally missing values are interpolated, whereas data containing large numbers of consecutive missing values are discarded.
Historical measurements from the past 6 h with a time interval of 10 min (36 time steps) are used as features, and the forecasting lead time is 1 h. Measurements with a high sampling frequency (sampling intervals smaller than 10 min, i.e., S 4 , V I ) are down-sampled, and measurements with low sampling frequency (sampling intervals greater than 10 min, i.e., sunspot number) are interpolated by propagating the last valid observation forward. The data set contains approximately 400,000 samples, where 13% of the samples are labeled as scintillation. Thus, this data set is an imbalanced data set, where a specific loss function known as the focal loss is utilized (discussed in Section 2.3). The first 70%, the next 10%, and the last 20% of data are used for training, validation, and testing, respectively.

PERFORMANCE EVALUATION
In this section, we first discuss the performance evaluation metrics, followed by a performance comparison against five baseline methods and an investigation of the importance of external features.

Evaluation Metrics
As discussed in Section 2.3, the STDL forecasts the probability of scintillation occurrence. To determine whether scintillation will occur, a threshold is applied. A comprehensive performance evaluation is conducted by varying the threshold from 0 to 1 with a step size of 10 −5 . For each threshold value, the STDL output for each sample test data set is converted to a binary label (scintillation/quiet time) based on a comparison with the threshold. The resulting labels are used as the forecasted labels for this threshold. The performance is then evaluated by comparing the forecasted labels with the truth.
Because our forecasting data set is imbalanced, the metric of accuracy can be misleading. Instead, we use the following metrics to evaluate the model performance, where positive samples denote scintillation: For each threshold, a detection rate, a false positive rate, and a precision are obtained. A comprehensive performance evaluation is conducted by representing these three metrics for all thresholds on two curves: the receiver operating characteristic (ROC) curve and the precision-detection rate curve (Boyd et al., 2013;Brown & Davis, 2006). Both curves take the detection rate into consideration and show dependence on other metrics. The ROC curve focuses on the false positive rate, whereas the precision-detection rate curve emphasizes precision. A comprehensive performance evaluation is established based on analyses of both curves.

Performance Comparison
The STDL is evaluated using the data set discussed in Section 3. For comparison purposes, we also evaluate the forecasting performance of five baseline methods on the same data set. A grid search of hyperparameters of these models is conducted on the validation data set: • Naïve method: The method takes the scintillation labels for the previous 36 time steps and computes the percentage of positive outcomes. This percentage is denoted as the probability of occurrence of scintillation for the future time step. This method serves as a baseline benchmark for all other methods. • Logistic regression (LR) (Bishop, 2006): LR models the forecast label y as a logistic sigmoid acting on a linear function of the feature vector . w ∈  n and b ∈  are the trainable weights and bias, respectively. Because LR only utilizes a feature vector, all features are concatenated to form this vector by ignoring the spatiotemporal information.
• Gradient boosting decision tree (GBDT) (Friedman, 2002;Mason et al., 1999;Wu & Liu, 2021): A GBDT conducts the classification by combining several weak learners, where each learner is a decision tree. The GBDT is known to be one of the most effective machine learning algorithms for a variety of applications. Similar to LR, the input feature vector is obtained by concatenating all features. • FCNN (Bishop, 2006;Liu et al., 2019): The FCNN is a multilayer deep learning structure. Each layer consists of a linear mapping followed by nonlinear activation functions. Its input is the same feature vector that is used for LR. In contrast to LR and GBDT, nonlinearity is introduced in the model. • LSTM (Hochreiter & Schmidhuber, 1997): LSTM is a type of recurrent neural network designed to handle time-series data by preserving temporal information. All features at each time step are concatenated to form a feature vector and passed to LSTM.
To ensure a fair comparison against STDL, all baseline methods (except the naïve method) utilize the exact same features. The only difference lies in the way each model organizes these features. In addition, all baseline methods forecast the probability of scintillation occurrence rather than the binary scintillation label, which ensures that each method can produce both the ROC curve and the precisiondetection rate curve.
ROC curves for all methods are shown in Figure 4. Here, a curve that is closer to the top left corner has a higher detection rate and a lower false positive rate, which indicates a better overall performance. The STDL achieves the best performance by adaptively incorporating both spatial and temporal information. The LR, GBDT, FCNN, and LSTM show slightly inferior performance compared with the STDL, as these four models are not capable of incorporating spatial information. Finally, the naïve method shows the worst performance, as it only utilizes historical scintillation labels and does not apply a machine learning model. The area under the curve (AUC), which is a metric used to quantify the overall performance of a model's ROC curve by integrating the AUC, is also listed in the legend of Figure 4. This metric corroborates our observations that STDL has the best performance. It should be noted that the training time for STDL is at least three-fold greater than that of other methods, which is a tradeoff that must be considered in model selection.

FIGURE 4 Performance comparison of ROC curves
Precision-detection rate curves are shown in Figure 5. Again, the STDL demonstrates the best overall performance. A precision improvement of approximately 5% over the second-best model can be observed near a detection rate of 70%.
It should be noted that this evaluation provides an overall picture of the performance across thresholds from 0 to 1. As mentioned in Section 2.3, the final threshold is usually determined by the application requirements.

Investigation on the Importance of External Features
STDL utilizes not only GNSS measurements, but also external features. An experiment with STDL excluding external features was conducted to study the effectiveness of utilizing external features. The ROC and precision-detection rate curves are shown in Figure 6 and Figure 7, respectively. The results demonstrate the importance of external features, where excluding these features leads to performance FIGURE 5 Performance comparison of precision-detection rate curves FIGURE 6 Performance comparison of ROC curves between STDL with external features (STDL-w/Ext) and without external features (STDL-noExt) degradation. This degradation occurs because space-based observations, such as solar wind parameters, offer lead-time information. In addition, ground-based observations are drivers of scintillation occurrence. Inputting these observations into the model provides additional information that aids in the forecasting task.

CONCLUSIONS AND FUTURE WORK
In this work, an STDL network was presented to forecast GNSS phase scintillation at high latitudes. The STDL achieves the best performance by incorporating spatial and temporal information and external features obtained by space-based and ground-based instruments that measure solar wind parameters and geomagnetic field disturbances. Several future efforts can be pursued to improve the forecasting performance. First, the current method employs all available external features, which may impair the performance by including irrelevant features. By following the work in Wu & Liu (2021), only important features can be used to investigate whether a better performance can be achieved. In addition, measurements from GNSS receivers on low Earth orbit satellites can also be incorporated into the network to enhance the spatial resolution of the surrounding information. Finally, we are working on image-based representations of scintillation occurrence over an entire region, similar to the manner in which meteorological weather forecasting is conducted. This method offers continuous indicators for any user-specified locations within the region.

a c k n o w l e d g m e n t s
This work is supported by a Defense Advanced Research Projects Agency (DARPA) contract (AWD-102938-G3) under the DARPA Space Environment Exploitation program. The data used in this study were collected at the global ionospheric scintillation monitoring network established by the Satellite Navigation and Sensing (SeNSe) Lab at the University of Colorado Boulder and the GNSS monitoring networks by UNAVCO. We acknowledge the use of NASA/GSFC's Space Physics Data Facility's OMNIWeb (or CDAWeb or ftp) service and OMNI data. In addition, we thank Professor William Bristow for providing the cross cap potential data.