Abstract
The ionospheric refraction of GNSS signals can have an impact on positioning accuracy, especially in cases of single-frequency observations. Ionosphere models that are broadcasted by the satellite systems (e.g., Klobuchar, NeQuick-G) do not include enough details to permit them to correct single-frequency observations with sufficient accuracy. To address this issue, regional ionosphere models (RIMs) have been developed in several countries in the western Balkans based on dense Continuous Operating Reference Stations (CORS) observations. Subsequently, a RIM for the western Balkans was built using an artificial neural network that combined regional ionosphere parameters estimated from the CORS data with spatiotemporal (latitude, longitude, hour of day), solar (F10.7) and geomagnetic (Kp, Dst) parameters. The RIMs were tested at the solar maximum (March 2014), a geomagnetic storm (March 2015), and the solar minimum (March 2018). The new RIMs mimic the integrated electron density much more effectively than the Klobuchar model. Furthermore, RIMs significantly reduce the ionospheric effects on single-frequency positioning, indicating their necessity for use in positioning applications.
- artificial neural network
- ionosphere delay modeling
- machine learning
- regional ionosphere model
- single-frequency positioning
- vertical total electron content
1 INTRODUCTION
The ionized upper part of the Earth’s atmosphere known as the ionosphere affects the propagation of radio waves that are generated by communication and navigation systems. Consequently, ionospheric refraction can affect the accuracy and reliability of positioning applications that rely on global navigation satellite systems (GNSS). Dual-frequency GNSS observations facilitate the estimation of ionospheric effects by forming the geometry-free linear combination (L4) as well as reductions of most of the ionospheric range error via an ionosphere-free linear combination (L3) (Hofmann-Wellenhof et al., 2001). However, the ionosphere remains one of the major sources of error in single-frequency positioning, where the first-order ionospheric term accounts for more than 99.9% of the total ionospheric delay associated with phase and code GNSS measurements (Hernández-Pajares et al., 2011). Because mass-market GNSS receivers commonly operate on a single frequency, the ionospheric range error needs to be corrected or at least mitigated by deploying models that minimize the ionospheric effects.
Due to the complex nature of these processes as well as the solar-terrestrial coupling system, different approaches for modeling the ionosphere have been developed. Current models of the ionosphere can be categorized as physical, empirical, or mathematical (Farzaneh & Forootan, 2018). State-of-the-art methods utilize artificial intelligence, specifically machine learning techniques, to identify nonlinear relationships among the variables to improve forecasting, especially factors related to space-weather processes (Camporeale et al., 2018; Natras & Schmidt, 2021). Physical ionosphere models are based on physical and chemical processes in the ionosphere, as shown by the Global Assimilation of Ionospheric Measurements (GAIM) model (Schunk et al., 2004) and the Global Ionosphere-Thermosphere Model (GITM) (Ridley et al., 2006). These models rely on observations and mathematical representations of physical laws and can thus be quite complicated, as they require formidable numerical procedures with high computational costs. By contrast, empirical models describe the ionosphere with mathematical functions derived from historical observational data and statistics (Radicella & Nava, 2020). These models represent average conditions and regular variations of the ionosphere (i.e., its “climate”). Examples of such climatological empirical models are the International Reference Ionosphere (IRI) (Bilitza, 2018) and NeQuick (Nava et al., 2008).
To correct the ionospheric delay in single-frequency observations, navigation satellite systems broadcast coefficients within the navigation message that are based on these empirical approaches. These models have been widely applied largely due to their simplicity. For instance, the well-known Klobuchar model (Klobuchar, 1987) that was adopted in the global positioning system (GPS) eliminates at least 50% of the ionospheric range delay error. Similarly, a special version of the NeQuick model denoted as NeQuick-G that has been implemented in Galileo can correct approximately 70% of the ionospheric code delay (Orus Perez et al., 2018).
The ionospheric refraction of the single-frequency observations can be modeled more precisely using GNSS observations to estimate the total electron content (TEC) along the signal path within the ionosphere; this value is proportional to the ionospheric refraction range. These models are typically based on estimations of the vertical TEC (VTEC) on a global scale using different mathematical approaches. For example, global ionosphere maps (GIMs) are routinely generated by Ionosphere Associated Analysis Centers (IAACs) of the International GNSS Service (IGS) including CODE (Center for Orbit Determination in Europe, Astronomical Institute, University of Bern, Switzerland), ESOC/ESA (European Space Operations Center from European Space Agency, Darmstadt, Germany), JPL (Jet Propulsion Laboratory, Pasadena, California, USA), UPC (Universitat Politècnica de Catalunya; Technical University of Catalonia, Spain), NRCan (Canadian Geodetic Survey of Natural Resources Canada), WHU (Wuhan University, China), CAS (Chinese Academy of Sciences, China) and the OPTIMAP-Group, DGFI-TUM (Deutsches Geodätisches Forschungsinstitut der Technischen Universität München; German Geodetic Research Institute of the Technical University of Munich, Germany). The global VTEC distribution can be represented mathematically in the CODE maps by spherical harmonics (Schaer, 1999) with a spatial sampling of 2.5° × 5° in latitude and longitude, respectively, and a temporal resolution of two hours until the year 2015 and one hour thereafter. The GIM products of the IAACs are used to generate a combined solution of the IGS (Hernández-Pajares et al., 2009), which generally has global relative errors of 10% to 20% compared to the VTEC estimated from topography experiment (TOPEX) satellite altimeter missions observations (Orús et al., 2002). The GNSS-derived ionosphere VTEC models and maps are useful external sources that provide information for single-frequency GNSS users seeking to mitigate the first-order ionospheric delay and directly reduce the ionospheric range error (IERS conventions, 2010). The final GIM products commonly feature a time delay of 1–2 weeks, while the rapid GIMs are generated with a latency of 1–2 days (Li et al., 2020; Liu et al., 2021). The latency of GIM products can limit their use in real-time positioning applications.
In contrast to these models, regional ionosphere VTEC models (RIMs) may be more accurate as they have higher spatial and in some cases, also higher temporal resolution because they can incorporate observations from dense GNSS networks. The RIMs are based on a different set of mathematical approaches for VTEC representation. For example, in Europe, the Royal Observatory of Belgium (ROB) generates RIMs in near real-time by applying thin plate spline interpolation (Bergeot et al., 2014). Other approaches for regional VTEC mapping in Europe include series expansions in tensor products of polynomial B-spline functions (Goss et al., 2020), weighting functions that take into account the location and the magnitude of the global electron maximum (Magnet, 2019), approximations of the ionosphere single layer model with Taylor expansions of degree two (Boisits et al., 2020), among others. It is critical to recognize that most of the RIMs were developed for large regions, such as the European continent.
Regarding the temporal resolution of the ionospheric products, use of the slant TEC (dSTEC) RMS can result in improvements of approximately 13% and 20% when the temporal sampling is increased from 2 hrs to 1 hr for low and high-resolution global ionosphere B-spline models, respectively, (Goss et al., 2019). However, increasing the temporal sampling from 1h to 10 min resulted in minimal improvements of only 3% to 4% for low and high-resolution global ionosphere B-spline products, respectively. Further investigation of global VTEC products in different latitudinal ranges (Liu et al., 2021) revealed no differences in the standard deviations of the GIMs at 1 hr, 45 min, 30 min, and 15 min temporal resolutions in the mid-latitudinal range (30° N to 50° N) with regard to VTEC from the Joint Altimetry Satellite Oceanography Network (JASON) satellite altimeter mission. By increasing the temporal resolution from 1 h to 15 min, the RMS of the dSTEC assessment decreased by less than 0.01 TECU in the mid-latitudinal range (30° N to 50° N). By contrast, an improvement of approximately 0.2 TECU in standard deviation and of 0.5 TECU in the RMS was obtained when the temporal resolution was increased from two hrs to one hr.
Recently, there has been increasing interest in state-of-the-art machine learning applications for global and regional ionosphere modeling. Among these studies, a feed-forward neural network was applied for regional VTEC modeling over Nigeria (Okoh et al., 2016) that was based on inputs of geomagnetic latitude and longitude, year, day of the year, the hour of the day, Dst index, sunspot number, and critical plasma frequency (foF2) from the International Reference Ionosphere (IRI) model. Other applications include regional VTEC modeling over Brazil using inputs of latitude and longitude (Leandro & Santos, 2007), global VTEC modeling with inputs of latitude and longitude, day of the year, F10.7 and Kp indices (Orus Perez, 2019), regional VTEC modeling over South Africa using the day of the year, the hour of the day, a four-month running mean of sunspot number and the running mean of the previous eight hours of the Ap index (Habarulema et al., 2009), as well as others. Zhao et al. (2021) used spherical harmonics with a neural network based on an extreme machine-learning technique for real-time modeling of the ionospheric delay. Similarly, Zhang et al. (2019) successfully combined the machine learning algorithm of a support vector machine with the regional VTEC polynomial model based on the CORS observations. Liu et al. (2020) utilized a type of recurrent neural network known as long short-term memory (LSTM) to forecast global VTEC maps; the LSTM model processes sequences of past observations of spherical harmonic coefficients, extreme ultraviolet (EUV) flux, and Dst index, among others to understand temporal-dependent relationships. Kaselimi et al. (2020) also applied an LSTM model with inputs that included satellite position coordinates, the azimuth and the elevation angles of the satellite, and the output of the VTEC. The wavelet neural network for VTEC modeling (El-Diasty, 2017) combines the inputs of the day of the year, the hour of the day, latitude, and longitude of the ionosphere pierce point (IPP) with the estimated Klobuchar model at the same IPP point and VTEC output data for the model training estimated from the CORS stations observations. Another application of support vector machines extrapolates VTEC in regions where GNSS observations are not available (Kim & Kim, 2019). The inputs include the hour of the day, the day of the year, F10.7 and Kp indices, sunspot number, and VTEC from areas where GNSS observations were available to provide estimates for VTEC for regions devoid of GNSS observations (Kim & Kim, 2019). Also, an ensemble of tree-based meta-estimators has been developed by combining random forest, adaptive boosting (AdaBoost), and extreme gradient boosting (XGBoost) methods for VTEC forecasting during both calm and stormy geomagnetic conditions (Natras et al., 2022a) and estimating forecast uncertainties as an ensemble spread (Natras et al., 2022b). Various machine-learning approaches and applications have been proposed. The input data can include spatial information (latitude and longitude of the station or the IPP or the satellite position, among others), temporal information (year, day of year, hour of day), solar activity (sunspot number, F10.7, EUV flux, and others), geomagnetic activity (Kp and Dst) as well as ionosphere information (IRI, Klobuchar, GNSS-derived VTEC, VTEC polynomial model, and spherical harmonic coefficients). Most previous studies employed a fully-connected feed-forward neural network. Regarding the size of the training dataset, different lengths of time were used in previous studies, including one day (Zhang et al., 2019; Zhao et al., 2021), four days (El-Diasty, 2017; Kaselimi et al., 2020), ten days (Leandro & Santos, 2007), one to two years (Kim & Kim, 2019; Liu et al., 2020; Natras et al., 2022a), and four years (Habarulema et al., 2009; Okoh et al., 2016; Orus Perez, 2019) among others. Results from previous studies demonstrated the feasibility of machine learning for VTEC estimation, with a focus on artificial neural networks and also when applied to a limited dataset of several days. While large datasets are typically perceived as essential for training neural networks, it is possible to train these models with relatively small amounts of data (Motamedi et al., 2021; Rajpurkar et al., 2020). The lack of large datasets will need to be compensated by high-quality data. This represents the core of a data-centric approach in machine learning (Motamedi et al., 2021). In these cases, data quality is more important than the dataset size, i.e., shifting from big data to good data. This facilitates effective training of neural networks with smaller datasets.
Most of the aforementioned ionosphere models rely on GPS or GPS + GLONASS observations from the IGS, the European EUREF Permanent Network (EPN), and/or the CORS network as a source of VTEC information. The yellow area in Figure 1 shows the approximate location of the Balkan Peninsula (36° N to 48° N, 13° E to 30° E). The distribution of IGS and EPN tracking ground stations is poorer in this region compared to the rest of Europe. However, most countries in this region operate dense CORS networks. This is also a case in the countries in the western part of the Balkan Peninsula, whose observations have not been previously used to generate ionosphere VTEC models. These networks provide denser GNSS observation coverage for VTEC modeling in the western Balkans.
This research aims to extend current knowledge in the field of RIM development based on GNSS observations by generating ionosphere models for a region that is much smaller than most continents using information available from dense CORS networks. With this in mind, this paper describes the development, validation, and applicability of regional GNSS-based ionosphere models while accounting for ionosphere effects in positioning applications. We will determine how to use data from CORS networks for regional/national VTEC modeling and identify the advantages that VTEC models developed for small regions can bring to positioning applications, using the western Balkans as a test case example. Therefore, the aim of the study is not to describe the physical and/or chemical processes associated with the ionosphere constituents, but instead to generate models that take into account and mitigate the ionospheric effects that hinder GNSS/GPS positioning utilizing available CORS data. Until now, no regional ionosphere model based on the national GNSS infrastructure has been established or developed in the countries in the western Balkans region. To address this knowledge gap, this paper presents three new regional ionosphere models based primarily on the observations from the CORS networks: (i) a first model developed for a small region inside one country; (ii) a second model developed for several countries within the western Balkans; and (iii) a third model that covers the entire western region of the Balkan Peninsula. As part of the third model, we propose an approach based on state-of-the-art machine learning techniques including the use of an artificial neural network. These models will be evaluated in single-frequency precise point positioning. To the best of our knowledge, this is one of the first studies to use observations from the CORS networks located in the western part of the Balkan Peninsula to generate GNSS-based ionosphere models and to evaluate them in positioning applications to mitigate the effects of ionospheric refraction.
2 METHODOLOGY
The impact of the ionosphere on the propagation of radio waves propagation can be described by the STEC in Equation (1):
1
where Ne(s) is the electron density along the signal ray path between the satellite i and the receiver k. STEC is measured in TEC units (TECU), where 1 TECU= 1016 electrons/m2. The vertical TEC (VTEC) can be expressed as shown in Equation (2):
2
where z’ is the zenith angle of the signal path in a mean altitude H of the ionospheric shell. The ionosphere is approximated as the single-layer model (SLM), which assumes that all free electrons are concentrated within a shell of infinitesimal thickness. The SLM mapping function F(z) can be written as shown in Equation (3):
3
where z is the zenith angle at the height of the receiver, R is the mean Earth radius and H is the aforementioned height of the SLM above the Earth’s surface (Schaer, 1999). The SLM height typically ranges between 350 km to 450 km (Jiang et al., 2017; Mannucci et al., 1998; Schaer, 1999). IGS analysis centers have adopted a thin layer height of 450 km for the ionosphere products (Feltens, 2003). In this study, we also adopted the SLM height of 450 km above the Earth’s surface.
2.1 Selection of Study Region and Data
Figure 2 shows the locations of stations that belong to the CORS (blue dots) and the EPN (red dots) networks whose dual-frequency GPS observations were used in this study to generate VTEC models. Two VTEC models were developed based on the regions covered, namely, RIM IONO_BH and RIM IONO_WB.
To estimate the RIM IONO_BH, the station network was selected to include nine Bosnia and Herzegovina Positioning Service (BIHPOS) stations each located approximately 80 km from the central EPN station in Sarajevo. An orange ellipse on the map marks this area. The RIM IONO_WB was derived from GPS observations obtained from the following CORS networks: the Albanian GNSS Permanent Stations (AlbGNSS), BIHPOS, the Croatian Positioning System (CROPOS), the Macedonian Positioning System (MAKPOS), and the Slovenia-Geodesy-Navigation-Location (SIGNAL). We also used observations from eight EPN stations within this region. Observations from the CORS networks located in Serbia and Kosovo were not available from the network providers and are not freely available to the public; thus this information was not used in this study. Also, historical data from the CORS network in Montenegro were not preserved for these study periods and thus not available for use in this study. Consequently, CORS observations from these countries were not included in the RIM IONO_WB. The network selected for the RIM IONO_WB included approximately 80 CORS and EPN stations between approximately 40° N to 47° N and 13° E to 23° E. RIM IONO_WB was developed as an independent model that would be applicable for each of the participating countries mentioned above. Therefore, the RIM IONO_WB includes files with the extension “ION” that contain separate sets of ionospheric corrections for each country. The reference point of the Taylor series expansion was set approximately in the mid-range of each country included in the study. Permanent stations that were used to validate the newly-developed RIMs are highlighted on the map with inner black and white circles (Figure 2). These stations were not used for estimating the regional models. The following stations shown on the map were used to validate the RIM IONO_BH: EPN SRJV and BIHPOS SARA (which are very close to one another), BIHPOS SEKO, and, additionally, two stations outside the modeled region, EPN POZE and EPN DUB2. Stations used for the RIM IONO_WB and the RIM IONO_WB_AI validation included EPN GSR1, EPN POZE, EPN SRJV, IGEWE TIRA, and EPN ORID.
2.2 Regional Ionosphere Modeling with Bernese GNSS Software
Bernese GNSS Software version v.5.2 (Dach et al., 2015) was used to process GNSS data, estimate ionosphere models and perform positioning. Before generating ionosphere models, several pre-processing steps must be carried out. Table 1 summarizes routines and the processing steps according to the order of their execution. Additional details on each routine and processing step in the Bernese GNSS software are described by Dach et al. (2015). The ionosphere models were estimated within the IONEST routine using the Bernese-recommended values (Dach et al., 2015) listed in Table 2. This routine works only on the assumption of GPS zero-difference observations.
Ionosphere mapping was performed on the undifferenced (zero-difference) level by analyzing the so-called geometry-free linear combination (L4) of phase observations which are formed by subtracting observables at different frequencies. This means that all frequency-independent effects such as the satellite-receiver geometrical range, clock errors, and tropospheric delay, among others, were canceled out, while the ionospheric effect remained (Ciraolo et al., 2007). Thus, the geometry-free linear combination includes the ionospheric delay and can be used to estimate the ionosphere model. With this approach, the carrier phases are not fitted to code (pseudo-range) observations; this means that no code-phase leveling was applied.
The observation equation for zero-difference phase observations can be outlined as shown in Equation (4):
4
where L4 is the geometry-free phase observable, α = 4.03 · 1017 ms−2TECU−1 is a constant, f1, f2 are the frequencies associated with the carriers L1 and L2, β · ϕ represents the wind-up term associated with the right-handed polarized GPS signal (typically a centimeter-level term), B4 = λ1 · B1 – λ2 · B2 is a constant phase bias caused by the initial phase ambiguities (i.e, unknown integer number of cycles) B1 and B2 with their corresponding wavelengths λ1 and λ2. The initial phase ambiguities B1 and B2 are estimated as real-value parameters within the initial least-squares adjustment using the phase observations L1 and L2 on both frequencies. The initial phase ambiguity has the same value provided that no loss of signal lock occurs. At least one parameter (B4) has to be solved for each receiver and satellite pass since it contains phase ambiguities. For cycle slip detection, differences between two satellites for the same epoch are formed from the measurements in the observation files. If cycle slips are detected, the geometry–free linear combination L4 is checked in order to determine their size in both frequencies. A new ambiguity is set up at the corresponding epoch of the detected cycle slip. Pre-processing options were set in the IONEST routine to define a new ambiguity parameter B4 for each detected cycle slip. The resulting geometry-free linear combination in Equation (4) allows us to estimate VTEC as a function of geographic latitude ϕIPP and the Sun-fixed longitude sIPP at the intersection point of the line-of-sight between the satellite and the receiver with the ionospheric layer. This intersection point is known as the ionospheric pierce point (IPP).
The regional VTEC model is based on the two-dimensional (2-D) Taylor series expansion shown in Equation (5):
5
where n, m are the degree values of the 2-D Taylor series expansion in geographic latitude and Sun-fixed longitude, nmax, mmax are the maximum degree values of the Taylor series expansion in geographic latitude and Sun-fixed longitude (Table 2), ϕ0, s0 denote the coordinates of the origin of the Taylor expansion and cnm stands for the unknown coefficients of the Taylor series expansion, i.e., the regional ionosphere model parameters to be estimated (Dach et al., 2015; Wild, 1994).
The Sun-fixed longitude s is related to the local solar time (LT) as shown in Equation (6):
6
where UT is an abbreviation for the Universal Time, LT is local time, and λIPP denotes the geographical longitude of the IPP; all values are in radians.
The latitude ϕIPP and the longitude λIPP as shown in Equation (7) and Equation (8):
7
8
where ϕk, λk are the latitude and longitude of the receiver k, Az stands for the azimuth from the receiver to the satellite, and ψ is the angle between the lines joining the center of the Earth with the IPP and receiver location. It can be calculated as shown in Equation (9):
9
where El stands for the elevation angle.
Ionosphere models were derived with a temporal sampling of one hour with the Bernese default extension ION and later combined into a 24-hour file for each day, namely the RIM IONO_BH and RIM IONO_WB, as described in Section 2.1.
When applying the ionosphere model presented in Equation (5), the ionospheric range corrections (in meters) for the zero-difference GPS observations of the i-th frequency can be computed as shown in Equation (10):
10
where the negative sign is used for phase observations and the positive sign for code observations.
2.3 Regional Ionosphere Modeling with Machine Learning
Machine learning represents a branch of artificial intelligence (AI) that can approximate the function between inputs and outputs based on rules/relationships that an AI system has “learned” from the data during the learning/training phase. This characteristic distinguishes machine learning from traditional modeling/programming approaches which require an extensive list of rules describing relationships between inputs and outputs to be explicitly specified (Natras & Schmidt, 2021). More specifically, machine learning addresses the problem of finding an approximation function that maps inputs (called predictors, features, or independent variables) to one or more outputs (called responses or dependent variables). This study focuses on the estimation of a single output (VTEC). Therefore, the output is presented below as a single-column vector. To start, a dataset was prepared that contained the measurements of the input feature and the output y for a set of observations i(i = 1, 2, …, n) as indicated in Equation (11):
11
Artificial neural networks (ANNs) represent a state-of-the-art technique with widespread applications in many fields. The ANN usually consists of an input layer with input variables, an output layer with output variables, and one or more hidden layers between them. Hidden layers allow the model to create a complex nonlinear mapping function between the network inputs and outputs by introducing a nonlinear activation function (Sharma et al., 2020). To estimate the values of the neurons in a current layer, parameters (weights) need to be added to the values of the neurons from the previous layers. The weight matrices can be defined as shown in Equation (12):
12
Where the matrix W contains all weights in the neural network. W(k–1) are weight matrices assigned to p neurons from the previous layer (k–1) in order to estimate values of L neuron(s) in the current hidden or output layer (k) with k = (2, 3, …, K), where k = 1 corresponds to the input layer. wl, o is the bias term (l = 1, 2…, L). In the matrix X we added an element xi,o = 1 to each row. It is convenient to include this element, because of the bias term wo as this will permit us to present Equation (13) in vector form as a scalar product. Thus, the linear model will predict the output yi given a vector of input features xi and weights wl with l = 1, i.e., the model consists of an input and an output layer, which can be expressed as shown in Equation (13):
13
where ei is an error. The weights in W are determined during the training phase using the method of least squares to minimize the sum of the squares of the error ei and as shown in Equation (14):
14
In a fully connected network, all neurons in one layer are connected to all neurons in the next layer; this is also known as a multilayer perceptron network (Ramchoun et al., 2016). A feed-forward network indicates that the connections between the neurons are all in one direction (from input to output) and hence the information is fed-forward from one layer to the next. Afterward, the backpropagation is used to adjust the weights of the ANN using a stochastic gradient descent method (Bottou, 1991) to minimize R(W), as shown in Equation (14). In this study, we tested different setups of an ANN architecture and inputs. These specifications are presented in Table 3. In total, six ANN and two Random Forest (RF) models have been developed. One ANN model does not contain a hidden layer (ANN1); all other ANN models have hidden layers. The ANN1 model is an example of the linear model, where the outputs are estimated as described by Equation (13). The architecture of the ANN5 model is shown in Figure 3. ANN5 is a fully-connected feed-forward network with an input layer, three hidden layers, and an output layer. The number of neurons in the first layer corresponds to the length of the vector xj. Neurons in each layer can be referred to as activation neuron vector . The input layer can be expressed as . Hidden layers compute the derived features from to . The last layer provides the output . Activation neurons in the hidden layer can be calculated as described in Equation (15):
15
where q is the activation function (Hastie et al., 2009). The activation function used in hidden layers is known as a ReLu (Rectified Linear unit) function (Sharma et al., 2020). ReLu is a non-linear activation function that is widely used in deep learning models because of its simplicity and effectiveness. It is defined in Equation (16):
16
meaning that the function will generate a value of zero for any negative value of z; for any positive value, it will return that value. The network output can be defined as shown in Equation (17):
17
The optimal hyper-parameters (number of neurons, number of hidden layers, learning rate for the stochastic gradient descent, and so on) were selected by testing different values within a certain range and observing the changes in R(W). Hyperparameters were searched within the following intervals: number of hidden units = [5, 10, 20, 30, 40, 60], number of hidden layers: [1, 2, 3, 4], epochs = [100, 200, 300, 400, 500], batch size = [50, 100, 200, 300, 400, 500]. Batch size refers to the number of training examples propagated through the network in one iteration (forward/backward pass). The number of epochs represents the number of complete passes through all the training examples. Hyperparameters that minimize R(W) were selected. The optimal number of hidden units was identified as 10 with three hidden layers. This architecture was kept constant in all ANN models from Table 3 tested; additional fine-tuning resulted in no significant differences. The addition of more neurons and more layers to the ANN increases the complexity of the interactions between layers and neurons. While this can be perceived as a positive, too much complexity can lead to overfitting the training data as the machine learns from its noise and is thus not capable of generalizing information from new examples. Choosing the appropriate complexity of an ANN requires careful attention when optimizing the model. Additionally, the Random Forest (RF) algorithm (Breiman, 2001), which builds an ensemble of decision trees, can be used to compare results and determine which input features are relevant. The RF algorithm provides the possibility of easily estimating the relative importance or contribution of each input feature (Breiman, 2001). A detailed explanation of how a decision tree and RF are built to address the VTEC problem can be found in Natras et al. (2022a). The RF model includes 300 decision trees; the quality of splits within the trees is measured with the mean squared error.
Most of the input data for these models consists of the coefficients cnm, i.e. regional ionosphere parameters from Equation (5), which is estimated for each hour in each country within the study region (Section 2.1) by processing the GNSS observations in the Bernese GNSS Software. During the training phase, the geographical coordinates belonging to the origin of the coefficient expansion were provided with the output data of the VTEC at these positions. Information on latitude and longitude was then added to extract spatial interactions. Additional input data were introduced to each new model. The information regarding time was added to extract temporal dependencies and features. For the ANN models, the sine and cosine components are calculated to preserve their cyclic significance, as shown in Equation (18):
18
where HoD is the hour of the day. Decision tree-based algorithms, such as RF, do not require time information to be split into sine and cosine components (Boussard et al., 2017). Also, RF considers just one input feature at a time as the splitting variable when building a tree (Breiman, 2001). Therefore, it will fail to process sine and cosine components simultaneously, although they are expected to be considered as one system. Data on solar and geomagnetic activity such as the solar flux F10.7 and the geomagnetic indices of Kp and Dst were also introduced gradually. Eventually, the models (ANN6 and RF2) were trained solely on information regarding time, location, and solar and geomagnetic activity. The data were shuffled and randomly divided into training (80%) and validation datasets (20%). Using the training data, we built models that can approximate the function between inputs and outputs and estimate the outcome for new inputs. Using validation data, values for the hyperparameters were selected. All input features were scaled (i.e. standardized) to obtain data with a mean value of zero and a standard deviation of one. This is standard procedure in many learning algorithms that are sensitive to the scale of the input features (Zheng & Casari, 2018). This ensures that all inputs are treated equally, even if the variables have different scales. Also, gradient descent can converge faster, meaning that the optimal parameters for each neuron can be located more quickly. Feature standardization is defined in Equation (19):
19
where xi,j is the data point i from j-th input feature, is the mean of the j-th input feature (over all data points from test and validation datasets together), σj is the standard deviation of the j-th input feature, is the resulting standardized data point i from j-th input feature. Scaling is performed on each input feature independently. Mean and standard deviation values are then stored for later data to perform the consistent transformation.
A correlation heatmap is created from a 2D correlation matrix of the input features and the output (Figure 4). Coefficients cnm of the regional ionosphere are presented as Cnm in the heatmap. Five coefficients were estimated for each country, as explained in Section 2.1. Mean correlations for coefficients are presented. The coefficient c00 has the strongest positive correlation to VTEC, followed by c10 and c02, both of which have a moderate negative relationship to VTEC. The F10.7 index has a positive moderate relationship, while the Dst and Kp indices have a weaker relationship with VTEC. The lowest correlations involve HoD, and latitude and longitude, which were added to extract temporal and spatial features.
Figure 5 and Table 4 show that the temporal, solar, and geomagnetic activity information increase the ANN model accuracy, respectively, in terms of the RMSE and the correlation coefficients (CCs) between the VTEC model output and VTEC from Equation (5). These input data are useful for deriving temporal, solar, and geomagnetic features and relationships. Moreover, removing the regional ionosphere coefficients from the AI models (ANN6 and RF2) increases the RMSE approximately three times and decreases the CCs by approximately 0.1. The ANN6 and RF2 models are not as accurate as the linear model with the regional ionosphere coefficients (ANN1). These results indicate that estimating the VTEC using only the spatiotemporal (latitude, longitude, and HoD), geomagnetic, and solar parameters (F10.7, Kp, and Dst) will not be sufficient. These data alone cannot provide an accurate description of VTEC. However, in conjunction with more influential, descriptive parameters of VTEC variations, such as regional ionosphere coefficients, one can significantly improve the model. Based on these results, the ANN5 model was selected as optimal and henceforth referred to as the RIM IONOWB_AI model.
Knowing the underlying relationships between the inputs and the output of the AI model is useful for interpreting what the AI model has learned and which input features have been selected as relevant. The relative importance of the input features was calculated as a root square of the sum of the squared improvements over the nodes in a tree generated by RF when a specific input feature xj was selected (Breiman, 2001). The importance of the input features was estimated using the RF1 and RF2 models (Figure 6). The relative importance of each coefficient is calculated as the mean importance of the regional ionosphere coefficients estimated for five countries. For the RF1 model, the first coefficient c00 contributes most to the VTEC output, followed by HoD and F10.7 index. Contributions from other coefficients were at similar levels. Smaller contributions were provided by the Dst index, longitude, latitude, and Kp index. These parameters extract spatial dependence and provide additional geomagnetic-dependent VTEC variations and relationships, but do not have a dominant influence in defining the VTEC variations with respect to the other inputs. Because the coefficients were calculated from the CORS and EPN observations, they already contain regional ionosphere information related to the origin of the Taylor series expansion and therefore, they have a large impact on the model output. When training a model with no regional ionosphere parameters, the highest contribution (nearly 70%) comes from HoD. The other most influential inputs are the F10.7, Dst, and Kp indices. The contributions of these inputs are higher in the RF2 model than in the RF1 model. This is because the RF2 model does not include input parameters that describe the regional ionosphere. Regional ionosphere coefficients correlate much more closely with the VTEC than any of the other input features. The least critical contributions come from latitude and longitude. This may be because the F10.7, Kp, and Dst indices are not spatially dependent. These results provide us with some intuition of how the AI models learn from these datasets.
Regional ionosphere parameters estimated from the CORS and EPN observations for each country included in the development of RIM IONO_WB were combined with spatiotemporal (latitude, longitude, and HoD), solar (F10.7 index), and geomagnetic (Kp and Dst indices) parameters via an AI technique using the ANN algorithm; this has resulted in the RIM IONOWB_AI model (Figure 7). This model can estimate VTEC for any location in the western Balkans.
2.4 Selection of the Study Period
Three time periods were selected for the study:
March 20–26, 2014,
March 15–20, 2015, and
March 20–26, 2018.
These periods were chosen because they included different phases of the solar cycle and varying ionosphere activity. Specifically, the solar maximum occurred in 2014, the declining phase of the solar cycle was in 2018, and a severe geomagnetic storm occurred in March 2015. There were large ionosphere disturbances over the study region that resulted in VTEC changes of 60% to 150% on the day of the storm in March 2015 and between 50% and 80% on the following days compared to its regular variability (Natras et al., 2023). Furthermore, all periods evaluated were around the spring equinox, when one can expect to find the highest number of electrons within the ionosphere over the study region (Natras et al., 2023).
The RIM IONOBH model was estimated for the first and third study periods, while the RIM IONOWB was generated for the first and second study periods. Therefore, while both models were established for the period of the solar maximum in 2014, different time frames were chosen for the second study period, including the solar minimum period (March 2018) for the RIM IONOBH and the geomagnetic storm period (March 2015) for the RIM IONOWB. This was because the models were developed independently of each other as part of two different projects implemented at different research institutes with different objectives, i.e., to study RIMs in the period of solar maximum (2014) and minimum (2018), and to analyze RIMs during a severe geomagnetic storm (2015). Furthermore, data were not available from all CORS networks for all periods examined; this resulted in different second study periods.
2.4.1 Overview of Solar and Geomagnetic Activity
Solar and geomagnetic indices are presented in Figure 8 for periods March 20–26, 2014 (left), March 15–20, 2015 (middle), and March 20-26, 2018 (right). These indices include the sunspot number R, radio flux F10.7 of the Sun’s emission at the 10.7 cm wavelength (Covington, 1969), disturbance storm time Dst (Sugiura, 1964), and Kp (Chapman & Bartels, 1962) obtained from the NASA/GSFC OMNI data set via OMNIWeb (https://omniweb.gsfc.nasa.gov/form/dxl.html). The first study period represents a solar maximum that reached its peak in April 2014. Therefore, the number of sunspots was increased throughout, ranging from 112 to 151, and F10.7 had values larger than 150 sfu. By contrast, the geomagnetic conditions were mostly quiet with Kp values below 4 with small fluctuations in the Dst index at approximately zero. The second investigation period included the strongest geomagnetic storm of the solar cycle 24, known as the St. Patrick’s Day Storm, that occurred on March 17, 2015. This severe storm was characterized by a main phase in which the Kp index reached a value of 8 and the Dst index was less than −200 nT. This was followed by a recovery phase that began on March 18, 2015, and lasted for several days until the geomagnetic field returned to normal conditions. It is worth mentioning that both the number of sunspots (from 20 to 53) and the solar radio flux (about 110 sfu) were significantly lower compared to those measured during the first period. The third study period featured low levels of solar activity during the decline phase of the solar cycle toward its minimum. This period was characterized by no sunspots and an F10.7 less than 70 sfu. The geomagnetic activity was calm to active, with no geomagnetic storms.
2.5 Validation of the Regional Ionosphere Models
The RIMs IONOBH and IONOWB were validated against other ionosphere models, including:
GIMs final products from CODE and IGS (from https://cddis.nasa.gov/archive/gnss/products/ionex/)
European RIM from the GIOMO model, based on weighting functions and developed at the Department of Geodesy and Geoinformation, Vienna University of Technology (Magnet, 2019)
European RIM based on polynomial B-spline functions developed at DGFI-TUM as a two-step VTEC model (TSM-Product 2c) named as OTHR model (Goss et al., 2020).
NeQuick2 (computed via https://t-ict4d.ictp.it/nequick2/nequick-2-web-model)
GPS broadcasted model Klobuchar. VTEC values were calculated from broadcast ephemeris data (https://cddis.nasa.gov/archive/gnss/data/daily/).
The climatological empirical model NeQuick2 (Nava et al., 2008) is the latest version of the NeQuick ionosphere electron density model developed for the computation of slant electron density profiles and TEC in the ionosphere at a given height, geocentric latitude, and geocentric longitude. Information on solar activity is provided to the model by the daily solar radio flux F10.7. The NeQuick2 model takes into account the contribution of the plasmasphere to VTEC up to 40,000 km. Although the electron density in the plasmasphere is much lower than in the ionosphere, under some conditions, such as at night and during periods of low solar activity, the contribution of the plasmasphere may represent a larger proportion of the VTEC. The relative global contribution of the plasmasphere to the VTEC depends on latitude and solar activity, with a minimum contribution of about 10% during daytime hours and a maximum of up to 60% at night.
The VTEC data of the RIM GIOMO model were provided for the study periods in March 2014 and March 2018 during the project that led to the development of the IONOBH. By contrast, VTEC data from the RIM OTHR model were available for all study periods.
The ionosphere models were also evaluated for precise point positioning (PPP) by processing single-frequency (L1) observations for selected GNSS stations using the Bernese GNSS Software. The following cases of L1 PPP solutions were carried out:
L1 positioning solution without ionospheric corrections,
L1 positioning solution with ionospheric corrections from the final GIM CODE, and
L1 positioning solution with ionospheric corrections from the RIMs IONOBH and IONOWB.
However, there is no interface in the Bernese software that can be used to process broadcasted ionospheric delay models. Therefore, we could not apply corrections from the Klobuchar model. The following processing steps were implemented for the PPP method: data preprocessing (RNXSMT), import of data into the Bernese format (RXOBV3), preparing orbit and Earth’s orientation information (POLUPD, PRETAB, ORBGEN), data preprocessing 2 (CODSPP, GPSEST, RESRMS, SATMRK), performing a solution for epoch parameters, and/or creation of normal equations NEQ (GPSEST) and an NEQ-based final session solution (ADDNEQ2). More details on each step are included in Dach et al. (2015). Positioning errors were estimated as differences between “true” positions and the single-frequency positioning results expressed as north, east, and up components. The weekly combined EPN solutions (https://www.epncb.oma.be/ftp/product/combin/) were used as “true” positions for the EPN stations. For the CORS stations, “true” positions were estimated with the dual-frequency PPP method in the Bernese GNSS software. For the analysis, errors were expressed as 1-D RMS vertical and 2-D RMS horizontal position errors, as well as, 3-D RMS position errors as shown in Equation (20):
20a
20b
20c
where ΔEi, ΔNi, ΔUi are errors in the east, north, and up (vertical) components, respectively, of the i-th position estimate sample and n is the total number of position estimate samples.
3 RESULTS
Section 3.1. compares the new RIMs with other ionosphere models as described in Section 2.5. The developed RIMs were then applied to correct the ionospheric range error in single-frequency positioning. The results of this procedure are presented in Section 3.2.
3.1 VTEC Results
The results shown in Figure 9 include the VTEC time series for the EPN station SRJV estimated from the RIM IONOBH for March 2014 (first panel) and March 2018 (third panel); differences between the RIM IONOBH and other ionosphere models are also shown (second and fourth panels). The IONOBH VTEC variability was at least five times higher in March 2014 (solar maximum) than in March 2018 (solar minimum). The VTEC values from the GIMs are mostly higher than the VTEC values from the RIM IONOBH. The largest differences between the GIMs and the RIM IONOBH occur during the night, with differences of up to 10 TECU in March 2014 and up to approximately 5 TECU in March 2018. During the day, the differences between the RIM IONOBH and the GIMs are reduced by a factor of 2 (i.e., mostly below 5 TECU in March 2014 and 2 TECU in March 2018). Higher differences were observed with respect to the GIOMO model, up to 20 TECU in 2014 and primarily up to 5 TECU in 2018. Interestingly, the GIOMO corresponds more effectively to the RIM IONOBH in 2018 and at night. Most of the results from RIM OTHR are in better agreement with the RIM IONOBH with differences smaller than those observed from the GIM CODE in March 2014 and March 2018. The models NeQuick2 and Klobuchar underestimate the VTEC in March 2014, while the Klobuchar model overestimates VTEC in March 2018. Differences from the Klobuchar model are up to 20 TECU. In March 2018, the results from the RIM IONOBH were in better agreement with those from the NeQuick2 model with differences up to 5 TECU during the day and mostly below 2 TECU at night. These results correspond to results reported by Wang et al. (2017) and Shi et al. (2019). In their studies, they show that climatological models, such as NeQuick2 and IRI2016, provide extremely underestimated values for VTEC during periods of high solar activity; however, these models are consistent with the GIM IGS in mid-latitudes and during periods of low solar activity (Shi et al., 2019). Differences between the RIMs IONOBH and IONOWB_AI are below 5 TECU for March 2014, with higher differences observed primarily at night.
The results shown in Figure 10 illustrate the differences between VTEC of the RIM IONOWB and VTEC of the other models, as well as VTEC time series from the RIMs IONOWB for the location of the EPN station SRJV for March 2014 (top two panels) and March 2015 (bottom two panels). In March 2014, the VTEC differences between the RIM IONOWB and other models were comparable to the results obtained with the RIM IONOBH (Figure 9, top).
In March 2015, the regular maximum VTEC daily values during March 15–16 (i.e., without disturbances from a geomagnetic storm) were about 20 TECU lower than in March 2014. During these days, the differences in VTEC from the RIM IONOWB with respect to the GIMs and the OTHR model were mostly less than 5 TECU; by contrast, differences from the NeQuick2 model were as high as 10 TECU during the daytime and below 3 TECU at night. On the day of the St. Patrick’s geomagnetic storm (March 17), the RIM IONOWB showed two VTEC peaks, including one around local noon (greater than 60 TECU) and another in the evening (greater than 40 TECU). The maximum differences detected on this day were 15 TECU. During the daytime, the differences from GIMs were mostly below 5 TECU, and almost zero during occurrences of the two aforementioned peaks. The higher VTEC differences between the RIM IONOWB and the other models occurred later in the evening. By contrast, the largest differences (up to 40 TECU) were those associated with the climatological model during the main storm phase; these findings correspond to results reported by Wang et al. (2017). The VTEC decreased during the recovery phase of the storm (March 18–21). Differences in the VTEC were mostly below 4 TECU compared to those of the GIMs and the OTHR model. With respect to the NeQuick2 model, the differences were below 10 TECU during the daytime and below 3 TECU at night. A comparison of the RIM IONOWB and IONOWB_AI reveals differences of up to 5 TECU, mainly occurring at night. During the daytime of the main phase of the storm (March 17), the differences were mostly around 0 with maximum differences of less than 4 TECU. Differences between the RIM IONOWB and the Klobuchar model were as high as 10 TECU before the storm, near 30 TECU during the main storm phase, and up to 20 TECU during the recovery.
A comparison between the newly-developed RIMs IONOBH and IONOWB and the other models shows that the smallest differences during daytime are between the new RIMs and the final GIM CODE. At night, the new RIMs agree most frequently with the NeQuick2 model, as it takes into account the contributions of the plasmasphere electron content to the ionospheric VTEC.
Figure 11 presents the RMSE between VTECs of the newly-developed RIMs and those from GIM CODE, RIM OTHR, RIM GIOMO, and the Klobuchar (Klob.) model for the study periods March 2014, March 2015, and March 2018. The RMSE is provided for the entire day from 00:00 to 23:00 UTC (upper left) and during day-time hours from 6:00 to 16:00 UTC (upper right). Correlation coefficients are provided for the entire day (bottom) together with the mean RMSE and the mean correlation coefficients.
An analysis of the RIMs IONOWB and IONOWB_AI was performed for the stations EPN GSR1, EPN POZE, EPN SRJV, IGEWE TIRA, and EPN ORID. An analysis of RIM IONOBH was done for the stations EPN SRJV, BIHPOS SEKO, EPN POZE, and EPN DUB2. The RMSE values were lower in the daytime than at night, with values mostly below 4 TECU, except for the Klobuchar model. The highest RMSE values were compared to the Klobuchar model, while the lowest were compared to the GIM CODE and the RIM OTHR. The RMSE values with respect to Klobuchar were about 7.5 TECU for all three study periods; the highest discrepancies were during daytime when the RMSE approached 10 TECU. A comparison of the RMSE values between the new RIMs and GIOMO revealed values of approximately 7 TECU in March 2014, including values below 5.5 TECU during the daytime. In March 2018, the RMS errors were reduced by a factor of 2, i.e., to below 3 TECU. In 2014, RMSE values compared to those from the GIM CODE and RIM OTHR were about 5 TECU for the entire day, and about 3.5 to 4 TECU during the daytime alone. In 2015, the RMSE values were below 4 TECU, while in 2018, the RMSE values were below 1.5 TECU during daytime and below 3 TECU at night when compared to the GIM CODE and RIM OTHR. In March 2014, the RMSE with respect to the RIM OTHR were approximately 0.3 TECU lower than RMSE for the GIM CODE; during the daytime, the RMSE values differed by less than 0.1 TECU. In March 2015, values from the newly-developed RIMs corresponded with those from the GIM CODE at 0.3 TECU level during the entire day, and 0.5 TECU better during the daytime than those provided by the RIM OTHR. In March 2018, the RMSE from the RIM OTHR were about 0.35 TECU lower than those from the GIM CODE, where their differences during the daytime can be neglected (i.e., 0.03 TECU). Correlations were the highest between new RIMs, and the GIM CODE and RIM OTHR.
Figure 12 presents VTEC maps for March 21, 2014, which was a day with high solar activity (F10.7 = 153 sfu) and quiet geomagnetic conditions (Kp = 2) at 12 UTC. The VTEC maps for March 17, 2015, the day of the severe geomagnetic storm at 12 UTC (F10.7 = 139 sfu, Kp = 8) and for March 18, 2015, the day after the main storm phase at 12 UTC (F10.7 = 114 sfu and Kp = 5) are shown in Figure 13 and Figure 14, respectively. VTECs from the RIM IONOWB_AI, RIM OTHR, GIM CODE, and the Klobuchar model were estimated on the 1° × 1° grid in latitude and longitude in the range from 40°N to 47°N and 13°E to 23°E. VTEC maps of RIM IONOWB_AI, RIM OTHR, and GIM CODE show the lowest daily ionization, mostly from 46°N to 47°N (upper parts of the maps), and the highest ionization from 40°N to 41°N (lower parts of the maps).
During solar maximum (Figure 12), RIMs IONOWB_AI and OTHR differed by less than 3 TECU with a mean difference of 0.9 TECU (Table 5). The largest differences between the RIMs were from 43°N 21°E to 44°N 23°E (3 TECU) and along 40°N (about 2 TECU). The smallest differences were in the areas covering Slovenia, Croatia, Bosnia-Herzegovina (BH; except around 44° latitude), and Montenegro with differences of less than 1 TECU and a mean difference of 0.4 TECU. Differences between the IONOWB_AI and the GIM CODE are up to approximately 4 TECU with a mean difference of 1.5 TECU. The largest differences were observed in the southern part of the map, where only a few IGS/EPN stations were used to estimate the GIM for the study area. By contrast, regions in which no GNSS stations were used to estimate the RIM IONOWB showed smaller differences. For example, the regions of Serbia and Kosovo (19°E–23°E, 42°N–46°N) showed a mean difference of 1.2 TECU compared to the RIM OTHR and 0.9 TECU compared to the GIM CODE. The average VTEC differences for the region of Montenegro (18°E–20°E and 42°N–44°N) were 0.8 TECU compared to RIM OTHR and 1.3 TECU compared to the GIM CODE. The VTEC difference maps between the RIM IONOWB_AI, RIM OTHR, and GIM CODE reveal unique VTEC features from the IONOWB_AI model with nonlinear spatial variations. The Klobuchar model underestimated VTEC by more than 20 TECU.
During the main phase of the storm (Figure 13), the ionization increased across the western Balkans. VTEC differences of RIM IONOWB_AI compared to RIM OTHR and GIM CODE show higher values in the northern part of the maps (up to 4.4 TECU). No GNSS observations from this region nor any of the more northern countries were used to generate the RIMs in this study; this led to the larger differences observed during the storm. The mean differences for the western Balkans regions, where GNSS observations were not used, were up to 1.2 TECU (Serbia, Kosovo) and 0.5 TECU (Montenegro) with respect to the RIM OTHR, and 1.4 TECU (Serbia, Kosovo) and 0.6 TECU (Montenegro) with respect to the GIM CODE. The VTEC values of the countries Slovenia, Croatia, and BH correspond better to the RIM OTHR with a mean difference of 0.5 TECU. The VTEC values in the southern part of this region correspond more closely with those from the GIM CODE with a mean difference of 0.7 TECU. The Klobuchar model underestimated VTEC by more than 24 TECU and failed to approximate the sudden VTEC increase during the main storm phase.
On the day after the main storm phase, the ionization was twice as low as on the previous day (Figure 14). The highest differences between the RIM IONOWB_AI, and the RIM OTHR and GIM CODE were identified in a region that included Italy, which was not part of this study. The average differences in the western Balkans countries were 0.6 TECU and 0.7 TECU between IONOWB_AI and the RIM OTHR and the GIM CODE, respectively. The mean differences over the western Balkans regions, with GNSS observations that were not used in this study, were 0.5 TECU (Serbia, Kosovo) and 0.6 TECU (Montenegro) compared to the RIM OTHR, and 0.6 TECU (Serbia, Kosovo) and 0.7 TECU (Montenegro) compared to the GIM CODE. The largest differences with regard to the RIM OTHR were along 44°N latitude (up to 1 TECU), while differences up to 0.6 TECU were detected for the same location compared to the GIM CODE. By contrast, the RIMs IONOWB_AI and OTHR showed better agreement in the area 18°E–23°E and 42°N–40°N with a mean difference of 0.6 TECU; the mean difference was twice as high. at 1.2 TECU compared to the GIM CODE. The Klobuchar model overestimated VTEC by more than 10 TECU, and therefore, cannot be used to approximate the decline in VTEC during the recovery phase of the storm.
Our results (Figure 9–Figure 14) reveal that results from the Klobuchar model deviate significantly from those generated by the newly-developed RIMs. Findings generated by the new RIMs correspond much more closely to the GIM CODE, which is considered the most accurate and precise source of VTEC information that is currently available. These results suggest that the new RIMs might replace the broadcasted Klobuchar model for applications involving single-frequency positioning.
3.2 Single-Frequency Positioning Solutions
To assess the RIM IONOBH, vertical and horizontal RMS position errors were calculated from 24-hour solutions for March 2014 and March 2018 for selected stations (Figure 15).
Positioning solutions without ionospheric corrections revealed RMS vertical position errors of approximately 5.5 m and 1.3 m in March 2014 and March 2018, respectively. As expected, single-frequency positioning solutions with ionospheric corrections represented a significant improvement, especially with respect to the vertical component. When applying the final GIM CODE, the vertical RMS errors were between 0.7 and 1 m in March 2014 and between 0.1 and 0.3 m in March 2018. The application of the new RIM IONOBH led to a higher vertical position accuracy for all stations with an RMS error between 0.4 and 0.7 m in March 2014 and 0.1 and 0.2 m in March 2018. The horizontal RMS errors between the GIM CODE and the RIM IONOBH were similar, at approximately 0.5 m for the RIM IONOBH and 0.6 m for the GIM CODE in March 2014 and 0.2 to 0.3 m in March 2018.
Figure 16 presents both vertical and horizontal RMS errors determined in March 2014 and March 2015 for selected stations that were used to assess the RIMs IONOWB and IONOWB_AI. Positioning solutions without ionospheric corrections had RMS vertical position errors of approximately 3.5 m in March 2015. Once the final GIM CODE was applied, the vertical RMS errors were between 0.6 and 1.2 m in March 2014 and 0.3 and 0.6 m in March 2015. After applying the new RIMs IONOWB and IONOWB_AI, the vertical RMS errors were between 0.5 and 0.8 m in March 2014 and 0.3 and 0.4 m in March 2015. The horizontal RMS errors were approximately 0.6 m for the CODE GIM and from 0.5 to 0.6 m for the RIMs IONOWB and IONOWB_AI in March 2014. In March 2015, the horizontal RMS errors were between 0.3 and 0.4 m for the CODE GIM, the RIMs IONO_WB, and IONOWB_AI. In March 2014, positioning errors after applying the IONOWB and IONOWB_AI corrections were at the same level as the GIM CODE (station GSR1) or better (for all the other stations). A significant improvement in position accuracy when using the RIMs IONOWB and IONOWB_AI was observed for stations SRJV, ORID, and TIRA in March 2014. In March 2015, both the vertical and horizontal position accuracy for the stations ORID and TIRA was improved by applying the RIMs IONOWB and IONOWB_AI. These stations are located in the lower part of the study region where larger VTEC differences between IONOWB_AI and GIM CODE were observed (Figure 9 and Figure 11). These differences may be attributed to the fact that information from very few stations was used to estimate the GIM CODE in this area; information from these stations was used in the newly-developed RIMs. Therefore, RIMs IONOWB and IONOWB_AI outperform the GIM CODE in this region. However, better positioning results were obtained with the GIM CODE in March 2015 stations located in the upper parts of the study region (i.e., GSR1 and POZE). There, larger VTEC differences between the RIM IONOWB_AI and GIM CODE were observed during the main storm phase (Figure 10) due to the larger VTEC differences in the neighboring countries, which are located outside the region examined in this study. Since a storm can induce significant variability in the ionosphere, the RIM for Slovenia would most likely benefit from including stations from other neighboring countries to improve position accuracy during a storm.
When using the L1 frequency without ionospheric range delay corrections, the vertical position errors are approximately three to four times higher than the horizontal position errors (Figure 12 and Figure 13). After applying ionospheric corrections from the GIM CODE and the RIMs, the vertical accuracy was improved by 80 to 90% and the horizontal accuracy was improved by 50 to 60%. This is consistent with findings reported by Wang et al. (2013) who showed that ionospheric corrections reduce ionospheric vertical delay errors more effectively than horizontal delay errors. Also, the ionosphere models can produce biases with a common sign in the line-of-sight slant observations that may be gathered from negative elevations. The ionosphere model projection bias in the north-south and the east-west directions can be partially compensated by one another. In this study, the elevation angle was set to 15° to gather more observations covering wider regions while reducing observations from the negative elevations to avoid positioning error bias.
Table 6 depicts RMS vertical, 2D horizontal, and 3D positional errors averaged over all stations evaluated; the improvement in RMS 3D error is compared to the solutions without ionosphere corrections. The newly-developed RIMs improved the 3D positional accuracy by 85% or more during the solar maximum (March 2014); by contrast, the improvement observed with GIM CODE was approximately 80%. During the geomagnetic storm, the new RIMs improved the 3D positional accuracy by approximately 84%, which was similar to that resulting from the GIM CODE. Interestingly, the new RIMs improve the vertical positional accuracy more effectively than the GIM CODE for all study periods. At times of low solar activity (March 2018), improvements of approximately 74% and 75% were observed for the GIM CODE and the new RIMs, respectively.
Since the mean differences and RMS errors between the new RIMs and the Klobuchar model are at least two to three times higher than those between the new RIMs and the GIM CODE (Section 3.1), the expected positioning error using the Klobuchar model is at least twice as large. For example, the RMS position error is expected to be more than 2 m during a solar maximum. In March 2014, the VTEC of the Klobuchar model was on average approximately 15 TECU lower than the VTEC of the new RIMs. Considering that 1 TECU is equivalent to 0.162 m of L1 signal delay, a difference of 15 TECU results in a remaining L1 signal delay of approximately 2.4 m. This is about two or three times higher than the 3D position solutions obtained using the GIM CODE or the new RIMs, respectively. Even larger position errors could be expected during the March 2015 geomagnetic storm.
4 DISCUSSION AND CONCLUSIONS
Regional VTEC models for BH (IONOBH), for specific countries in the western part of the Balkan peninsula (IONOWB), and the western Balkans region (IONOWB_AI) have been developed using the GNSS observations that were available from local CORS and the EPN networks in these countries. The IONOBH and IONOWB models are based on regional ionosphere coefficients of a Taylor series expansion estimated from the CORS observations processed in the Bernese GNSS Software. Machine learning techniques were also utilized to develop the IONOWB_AI model. The IONOWB_AI model includes coefficients of Taylor series expansion, spatial (latitude, longitude), temporal (time of day), solar (F10.7), and geomagnetic (Kp and Dst) parameters to estimate the VTEC at any specified position and time by applying a feed-forward neural network with backpropagation to minimize errors. These newly-developed RIMs were validated against GIMs, European RIMs, the NeQuick2 climatological empirical model, and the broadcasted Klobuchar model for different study periods including those featuring high and low solar cycle phases and quiet and disturbed geomagnetic periods. The spring equinox, when the highest VTEC values during a year can be expected in this study region, has also been covered. The new RIMs were applied in single-frequency positioning to evaluate their ability to mitigate the ionospheric refraction effect on precise positioning applications.
During the solar maximum, the differences between all examined models were higher than at the solar minimum, as expected. Differences between the VTEC values from the RIMs IONOBH and IONOWB and the GIMs were minor during the daytime and higher at night in March 2014. GIMs usually deliver higher VTEC values than the RIMs IONOBH and IONOWB. The mean difference in the solar maximum (March 2014) was approximately 5 TECU, while the mean difference at the solar minimum (March 2018) was approximately 2 TECU (i.e., reduced by more than a factor of 2). The largest differences were observed at night when the plasmaspheric contributions to VTEC prevail; agreement during the daytime is at least two times better. During the days before the storm (March 15–16, 2015), the mean differences were approximately 3 TECU. On the day of the severe storm (March 17, 2015), the mean differences were approximately 5 TECU, while during the recovery phase, the mean differences were below 2 TECU. Thus, we conclude that the differences between VTEC values from the RIMs IONOBH and IONOWB and those from the GIMs are dependent on the solar cycle phase and geomagnetic activity.
The analysis with respect to the European RIMs shows better agreements with the OTHR model than with the GIOMO model, especially during the maximum of the solar cycle phase (March 2014). Differences between the new RIMs and the OTHR are often smaller than differences between GIM CODE and the RIMs. However, differences with respect to the OTHR and the GIM CODE are quite similar to one another with similar trends, indicating small differences between these two models.
The largest discrepancy was between the newly-developed RIMs and the Klobuchar model. The Klobuchar model underestimated VTEC by more than 20 TECU during the daytime during the solar maximum and the main phase of the geomagnetic storm when compared to results from the new RIMs. In the recovery phase of the geomagnetic storm and at the solar minimum, the Klobuchar model overestimated VTEC by up to 20 TECU. The new RIMs correspond at least 4 to 10 times better to the GIM CODE than to the Klobuchar model, with mean differences from 2 to 5 TECU.
The NeQuick2 climatological model underestimated VTEC most of the time. During the solar minimum (March 2018), the NeQuick2 model was in much better agreement with the new RIMs than during the solar maximum (March 2014). During the main storm phase, differences with respect to the NeQuick2 model were the highest. These differences can be explained by the fact that the NeQuick2 provides a climatological description of the electron density in the ionosphere and its performance depends significantly on both solar and geomagnetic activity. Our results are consistent with those reported by Shi et al. (2019) and Wang et al. (2017).
The RIM IONOWB_AI, which is based on an ANN and the IONOWB regional ionosphere, spatiotemporal, solar, and geomagnetic parameters, can estimate VTEC for any location in the western part of the Balkan peninsula. In areas where no GNSS observations were used to estimate regional ionosphere parameters, the RIM IONOWB_AI provides VTEC with sufficient accuracy compared to the RIM OTHR and GIM CODE. Moreover, VTEC maps reveal non-linear VTEC signatures and regional ionosphere variations not seen in the GIM CODE. During the daily VTEC peaks (at 12 UTC) the mean differences in the study region are mostly below 1.50 TECU compared to findings generated by the RIM OTHR and GIM CODE. During the main storm phase, when the largest VTEC values were detected, the RIM IONOWB_AI estimated VTEC with differences that were primarily below 2 TECU (at 12 UTC) with respect to the RIM OTHR and the GIM CODE. During the recovery phase, the observed differences were even lower. By contrast, the Klobuchar model failed to describe the VTEC variations associated with geomagnetic activity. Collectively, our results demonstrate that the VTEC estimated by RIM IONOWB_AI provides a much better estimate of the integrated electron density than can be achieved using the Klobuchar model during the solar maximum or a geomagnetic storm.
Single-frequency positioning solutions show a similar or in some cases better positional accuracy when applying ionosphere corrections from the new RIMs (i.e., IONOBH, IONOWB, and IONOWB_AI) rather than from the final GIM CODE. Vertical positional accuracy can be achieved with these newly-developed RIMs at the decimeter level. Compared to the final GIM CODE, improvements in positioning with the new models are particularly effective when observed during the solar maximum. Also, the improvements are more significant for all study periods for stations located in the southern part of the region. This may be because of the higher ionization observed over those stations, and because no GNSS stations are used in this area to produce the GIM CODE. As a result, the RIMs show better performance in these regions. During the geomagnetic storm (March 2015), the positional accuracy achieved when applying RIMs IONOBH, IONOWB, and IONOWB_AI were within a similar range as that obtained with the final GIM CODE. Slightly better performance with the GIM CODE was achieved during March 2015 for stations located in the northern part of the study region (GSR1 and POZE). This may be because the GNSS stations from neighboring countries in this region were not used to generate the new RIMs. VTEC maps show larger differences when comparing outcomes from the RIM IONOWB_AI and GIM CODE in this northern area as a result of drastic changes in conditions in the ionosphere during the main phase of the severe geomagnetic storm. Given the fact that this was the strongest storm of the previous 11-year solar cycle and that modeling such intense ionospheric variations remains a significant challenge, the results obtained using the new RIMs are satisfactory. As the results indicate, positional accuracy depends strongly on solar and geomagnetic activity. Moreover, the newly developed IONOBH, IONOWB, and IONOWB_AI models facilitate the correction of the ionospheric refraction in a way that leads to increased positioning accuracy; improvements in 3D positioning of greater than 80% were observed during periods of solar maximum and severe geomagnetic storm.
The results presented here demonstrate that VTEC models generated for small regions from dense observations from the CORS networks can provide much more accurate estimates of VTEC than can be achieved using the Klobuchar model, which is currently used as the standard for single-frequency positioning. Additionally, a machine learning approach enabled us to generate spatiotemporal RIMs that can estimate VTEC for areas in which no GNSS observations were available. The results reveal that the new RIMs are in much better agreement with the GIM CODE than with the Klobuchar model. Taking into account that the GIM CODE is considered the most accurate and precise source of VTEC information currently available, the new RIMs might surpass this standard and significantly improve ionosphere modeling compared to the Klobuchar model. The new RIMs correct single-frequency range errors with an accuracy that is similar to that of the final GIM products and in some cases even better. Furthermore, the final GIMs are produced with a latency of several days or even weeks, and thus they are not well suited for real (or even near-real) time positioning applications. The available real-time Klobuchar model fails to provide an accurate description of the ionosphere and, consequently, it cannot precisely reduce the ionospheric signal refraction. The RIMs developed as part of this study can be generated from GNSS observations from the CORS network in near-real time (within approximately three hours) and corrections can be provided to the single-frequency GNSS users. The findings suggest that these new models may properly address the needs of single-frequency GNSS users and provide an effective ionospheric correction.
Recommendations and plans for future work include:
Models with a higher temporal resolution may be beneficial for use in capturing sudden VTEC variations due to geomagnetic storms, traveling ionosphere disturbances, and other features.
The addition of dense CORS observations from other countries within the study region can improve the accuracy of the RIMs and fill and/or reduce the observation gaps.
Observations from the EPN stations near and further away from the border of the study region can be introduced to fill in the observation gaps between the IPPs if additional CORS observations are not available.
GNSS data should be processed over a longer period to retrain the ANN model and thus improve its accuracy and applicability.
The machine learning approach can be improved by automatically extracting features in the spatial domain using a convolutional neural network and those in the temporal domain using a recurrent neural network. It would be interesting to compare this combination with the approach presented in this paper.
It may also be interesting to test the RIMs over a longer period.
To facilitate their application in national positioning services, models should be implemented in near-real time.
HOW TO CITE THIS ARTICLE
Natras, R., Goss, A., Halilovic, D., Magnet, N., Mulic, M., Schmidt, M., & Weber, R. (2023). Regional ionosphere delay models based on CORS data and machine learning. NAVIGATION, 70(3). https://doi.org/10.33012/navi.577
FUNDING
This research was funded by OeAD-GmbH (ICM-2017-06548) within the framework of Ernst Mach Grant Worldwide; a Short-Term Research Grant from Deutscher Akademischer Austauschdienst (DAAD); and Technical University of Munich (TUM) within the framework of Open Access Publishing Program within the DEAL agreement.
CONFLICT OF INTEREST / COMPETING INTERESTS
The authors declare no conflict of interest.
AUTHORS’ CONTRIBUTION
Conceptualization: R.N.; data curation: R.N., N.M, A.G., M.M.; methodology: R.N.; formal analysis: R.N., software: R.N., N.M.; validation: R.N., Dz.H.; investigation: R.N.; visualization: R.N.; writing – original draft: R.N.; writing - review and editing: all authors; supervision: M.S., R.W.; funding acquisition: R.N.
ACKNOWLEDGMENTS
The authors are grateful to the providers of CORS National permanent GNSS networks, including the Republic Administration for Geodetic and Property Law Affairs and Federal Administration for Geodetic and Property Affairs of Bosnia-Herzegovina for BIHPOS GNSS data; State Geodetic Administration of the Republic of Croatia for CROPOS GNSS data; Institute of Geosciences, Energy, Water, and Environment Tirana and Prof. B. Nurce for AlbGNSS GNSS data; Agency for the Real Estate Cadastre Republic of North Macedonia for MAKPOS GNSS data; and the Geodetic Institute of Slovenia for SIGNAL GNSS data. We also thank the EUREF network for EPN GNSS data, CODE for the GIM data, the IRI Working Group for the IRI-online computation service; NASA/GSFC’s Space Physics Data Facility OMNIWeb service and CDDIS, which is NASA’s Archive of Space Geodesy data. We also extend our thanks to N. Hanna for estimating VTEC from the Klobuchar model and we acknowledge the Open Source Geographic Information System (QGIS) for generating one of the figures. Finally, we thank the reviewers whose valuable comments and suggestions helped to clarify our thinking and improve the quality of this manuscript.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.