High-Precision Vision Localization System for Autonomous Guided Vehicles in Dusty Industrial Environments

State Key Laboratory of Tribology, Tsinghua University, Beijing, China

Autonomous guided vehicles (AGVs) are widely used in ports and warehouses during the ongoing Fourth Industrial Revolution (Industry 4.0). AGVs can transfer cargo to the correct destination or carry industrial robots to work from various directions. In cases like Arthur et al. (2008) and Hofacker (1993), engineers put a robot on an AGV to perform aircraft manufacturing tasks instead of using a fixed robotic system such as a gantry crane or a guide rail on the ground. An AGV avoids structural transformation of the workspace and has high adaptability in various applications. However, highly accurate AGV localization is required for safe driving and precise robotic operations. High-precision localization methods for AGVs have been investigated for decades. A classical method involves setting magnetic or electric cables underground for tracking; AGVs follow the cable path by electromagnetic induction (Chen et al., 2004) using special sensors. This method is still widely used in modern factories and can maintain a position accuracy of around 10 mm. However, it damages the original ground and is easily affected by magnetic and electric fields. Some studies used a monocular camera (Wu et al., 2019) to track colored stripes and quick response (QR) codes; with high-resolution pictures, the positioning error in those studies could be reduced to 10 mm. These fixed-path tracking methods are stable for industrial use. However, they limit the flexible mobility of AGVs and cannot be easily modified for complex tasks.
Various path-free navigation methods have been researched to meet increasing localization demands. Schulze and Wullner (2006) set up reflectors and calculated the position with a laser range tracker on a robot. According to Ronzoni et al. (2011), efficient laser tracking can achieve a small localization error within an accuracy of 3 mm. However, the cost of building such a system is prohibitively high for most applications. Alternatively, lidar can be used to guide mobile robots; however, present lidar localization methods are mainly used in large-scale road environments. Zou et al. (2021) compared different indoor lidar localization methods for autonomous vehicles, and the best one achieved a translation error of 10 cm, which is not accurate enough for indoor robot operations requiring mm-level positioning.
Substantial numbers of wireless localization methods have been introduced in recent years, including wireless fidelity (WiFi; Zhang et al., 2020), ZigBee (Bianchi et al., 2019), radio frequency identification (RFID; Tao et al., 2021), and ultra wideband (UWB; Shule et al., 2020). These wireless tools are used in large-scale situations, and they provide an accuracy between 1 cm and 10 cm. The methods of Ahmadinia et al. (2013, 2014) focus on energy consumption and on improving performance with data aggregation. It should be noted that even though wireless localization is convenient, it is excluded from our project for safety reasons.
With the development of computer vision, there has been a rising trend of utilizing cameras for localization. Vision methods are of two types: marker-free methods and visual-marker methods. Tang et al. (2021) built a vision odometer using convolutional neural networks for 2D-3D matching. Red-green-blue-depth (RGBD) cameras can perform better with extra depth information, which is popular in simultaneous localization and mapping (SLAM; Mur-Artal & Tardós, 2017). Karkus et al. (2021) proposed a differentiable SLAM network for indoor robot navigation with a mapping error of 20 cm. However, vision localization that relies only on environmental features can be easily disturbed by image loss and is not accurate enough. Artificial visual targets attached to a wall or ceiling can help improve localization precision (Ortiz-Fernandez et al., 2021) and stability. With well-calibrated targets, the relative positioning error can be reduced to 4 mm.
Owing to the various advantages and disadvantages of single-sensor methods, multi-sensor localization methods have also been examined in the past (Li et al., 2019; Song et al., 2017). Nguyen et al. (2021) presented a tightly-coupled camera-IMU-UWB system focusing on drift reduction, and achieved an accuracy of 10 cm. The combination of different data sources provides a more robust and accurate position.
The Kalman filter (Kalman, 1960) method is a classical linear filtering method for data fusion. Several nonlinear filtering methods have been developed based on this filter. Extended Kalman filters (EKF; Jazwinski, 2007) use analytical approximation to approach nonlinear filtering, and unscented Kalman filters (UKF; Julier & Uhlmann, 1997) select state sampling. EKF and UKF are the most frequently used methods owing to their good performance and generalization ability.
In practical applications, the state covariance matrix might lose positive definiteness due to rounding in computer calculations. Methods presented by Holmes et al. (2008) used the square root for covariance matrix transmission to solve this problem. Li et al. (2020) proposed an improved maximum correntropy cubature Kalman filter to eliminate the measurement outliers in the cooperative localization of autonomous underwater vehicles (AUVs). Bai et al. (2020) utilized mixture variant Gaussian distributions to build a robust Kalman filter for AUVs.
In our laboratory, a mobile spraying system was built for aircraft manufacturing, as shown in Figure 1(c). It includes an 8-m-long spraying robot mounted on a 5-m-long AGV. The workspace coordinate system was measured previously, and the AGV stations were obtained from offline spraying planning. A small angle error of 0.2° at the robotic base implies a 14 mm position error at the endpoint, which is dangerous for spraying path compensation. Thus, the AGV must arrive at the right station with high angular accuracy. However, the spraying house is hostile to several precise localization methods, as shown in Figure 1(a).
Wireless techniques can be considered in this condition, but they have been banned in our project due to safety concerns. Moreover, localization methods with laser trackers are significantly more expensive. The lidar method or non-marker vision odometer is sensitive to noise and not accurate enough. Therefore, the best alternative in our project was putting artificial markers (such as QR codes) to guide AGVs.
Common visual markers are susceptible to light, and the brightness in industrial environments is not stable enough. Some researchers attached printed targets on the ground, but these were easily stained with paint, as shown in Figure 1(b). Factory ceilings are too high for markers, and walls are not allowed to be pasted with markers. Therefore, we made movable targets (introduced in Section 2.1) with infrared light emitting diode (LED) lamp beads to build a vision localization system. LED light can efficiently penetrate paint mist and grinding dust; therefore, this system can run in spraying environments. The camera positioning result arrives at a low frequency and might occasionally be disturbed by obstacles, so we integrated an odometer and inertial measurement unit (IMU) to improve the publishing rate. Moreover, we optimized the vision results to reduce errors.
The main contributions of the introduced vision localization solution are summarized as follows:
1. A vision localization system using LED light targets was designed for wireless-denied aircraft manufacturing factories. The system is proven to have high adaptability and stability in dusty spraying environments in comparison to other localization methods.
2. A fast and convenient least-squares calibration method between the camera and the AGV control center is introduced.
3. The characteristics of vision localization are analyzed geometrically, and its noise is modeled using a Gaussian distribution.
4. A novel multi-sensor fusion algorithm was created to estimate a more robust and precise pose at 10 Hz. The algorithm contains a vision localization pre-optimization technique called VLPO, and a square root central difference Kalman filter (SR-CDKF; Nørgaard et al., 2000). The system obtained a position accuracy within 10 mm and an angle accuracy within 0.052°, thereby exhibiting superior performance.

FIGURE 1 (a) The factory is always dusty during manufacturing; (b) the ground will be stained with paint after spraying; and (c) the mobile spraying system. An 8-m-long robotic arm was set on a 5-m-long AGV. It required high angle accuracy for spraying.
The remainder of this paper is organized as follows: The vision localization system and multi-sensor fusion operation are introduced in Section 2; Section 3 discusses the experiments conducted to verify the performance of our localization method; and Section 4 concludes this work.

LOCALIZATION SYSTEM
In this section, an overview of the proposed AGV navigation system and its localization algorithm is presented. The AGV navigation system is illustrated in Figure 2. The camera is fixed on a rotating platform that is placed at the front of the AGV. The LED target was designed by Chen et al. (2019) from our laboratory; infrared LED lights are specially arranged on it for vision positioning. The camera tracks the LED target while the system is running. The valid distance between the target and the camera is 3.5~10 m. The pose of the AGV can be obtained using the algorithm detailed in the following subsections.
The concrete structure of the system is shown in Figure 3. We used a personal computer (PC) with a quad-core Intel Core i7-7700HK CPU at 3.6 GHz, 16 GB of RAM, and a 256-GB solid state drive for processing the localization algorithm. The AGV used in this study was a four-wheel steering vehicle with a length and width of 5,660 mm and 1,835 mm, respectively. It has a Siemens S2000 programmable logic controller (PLC) to control itself and an ethernet cable to communicate with the navigation PC. The AGV sends the four wheel angles and speeds to the PC at 10 Hz, similar to a car odometer. The camera is a complementary metal oxide semiconductor (CMOS) sensor with a resolution of 2560 × 2048. Image processing to compute the vision localization pose requires 0.2 s. The rotating platform has an angle accuracy of 0.01°, and it communicates with the PC at 10 Hz. Additionally, an IMU operating at 60 Hz provides high-precision yaw angle information for the AGV over short periods.

AGV
LED target (850 nm infrared light), camera (with infrared penetrating filter), rotating platform. FIGURE 2 Overview of the experimental AGV navigation system; the length and width of the AGV are 5 and 1.8 m, respectively. The camera is located at the front of the AGV, and the LED target should be 3.5~10 m away from the camera. An infrared penetrating filter is mounted on the camera lens. Distributed LED lights placed on the target emit infrared light at 850 nm. The camera tracks the LED target through rotations. Thus, the pose of the AGV can be computed whenever the camera has a full view of the LED target.
The camera, rotating platform, and IMU are put into a cabinet for explosion protection. They are connected with the navigation PC through USB ports. The LED target has a height and width of 560 mm and 320 mm, respectively. We attached the target to a tripod and adjusted it to the same height as the camera. The workings of the system are illustrated in Figure 4. When the system is initialized, the navigation PC controls the rotating platform and the camera captures the LED target. The PC computes a priori vision results. Then the localization filter progress launches with the data of the camera, IMU, and odometer. As the AGV moves, the camera tracks the target.

Vision Positioning Using LED Target
The core of our localization method is the vision positioning system, which consists of a camera and the LED target. The camera is set to a short exposure time of 300 μs and low light gain. We installed an infrared penetrating filter on the camera lens; thus, only the 850 nm LED lights are visible to the camera, as shown in Figure 5. This worked efficiently, and the image stayed almost unchanged during an entire day of testing in the factory. No brightness adjustment was needed, which is an advantage over conventional vision methods.
When the camera obtained a full sight of the LED lights on the target, it detected all 2-dimensional (2D) blobs and matched them with the real 3-dimensional (3D) points. The LED light blobs were recognized by a simple algorithm called SimpleBlobDetector from OpenCV (Bradski, 2000) using the following steps:
1. The source image was converted to binary images (values from 0 to 255) with several gray-scale thresholds. Here, the neighboring thresholds start from 50 to 140 with an interval of 15.
2. In each binary image, the connected white pixels were grouped together as white blobs.
3. The blob centers were computed, and the blobs of all binary images located closer than 4 pixels were merged.
4. The blobs with fewer than 16 or more than 60 pixels were filtered out.
The final centers of the blobs were returned, and we sorted them by height in correspondence to the correct LED lights. SimpleBlobDetector is faster than alternatives such as the Hough circle detector. However, the detection time varied from 0.14 to 0.19 s, which effectively limits the vision publishing rate.
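The four detection steps above can be sketched as a minimal NumPy re-implementation (OpenCV's SimpleBlobDetector performs the same multi-threshold sweep internally; the synthetic image in the usage below and the exact grouping details are illustrative only):

```python
import numpy as np

def detect_blobs(gray, thresholds=range(50, 141, 15),
                 merge_dist=4.0, min_area=16, max_area=60):
    """Sketch of the four-step blob detection described in the text."""
    h, w = gray.shape
    candidates = []
    for t in thresholds:                       # step 1: multi-threshold binarization
        binary = gray >= t
        seen = np.zeros((h, w), dtype=bool)
        for y, x in zip(*np.nonzero(binary)):
            if seen[y, x]:
                continue
            stack, pixels = [(y, x)], []       # step 2: group connected white pixels
            seen[y, x] = True
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        stack.append((ny, nx))
            candidates.append((np.mean(pixels, axis=0), len(pixels)))
    groups = []                                # step 3: merge nearby centers across levels
    for center, area in candidates:
        for g in groups:
            if np.linalg.norm(g[0][0] - center) < merge_dist:
                g.append((center, area))
                break
        else:
            groups.append([(center, area)])
    blobs = []                                 # step 4: filter by pixel count
    for g in groups:
        if min_area <= np.mean([a for _, a in g]) <= max_area:
            blobs.append(np.mean([c for c, _ in g], axis=0))
    return sorted(blobs, key=lambda c: c[0])   # sorted by image row (height)
```

On a synthetic frame, a 25-pixel bright square is kept while a 4-pixel one is filtered out, mirroring the area filter of step 4.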
The carbon fiber target in Figure 5 was calibrated using a coordinate measuring machine (CMM). We controlled the CMM probe to touch the 16 LED lights and recorded their coordinates P_light^C in the CMM coordinate system. We also measured the surface of the laser tracker target ball attached to the seats in Figure 5; the ball probe seat center coordinates P_ball^C were obtained through spherical center fitting. A Leica AT901 laser tracker was used in the project to define a world coordinate system for robotic spraying planning, and the LED target can be transferred into the world coordinate system with simple operations: we attached the laser tracker ball to the ball probe seats of the LED target and obtained the new coordinates P_ball^w in the world coordinate system, which yield the rotation R and translation t satisfying P_ball^w = R P_ball^C + t, and hence the LED light coordinates P_light^w in the world frame. These 3D points can be related to the 2D image points using the perspective projection model Π and the camera intrinsic parameter matrix A:

s_i [u_i, v_i, 1]^T = A Π [R_cam, t_cam; 0, 1] [P_light^w; 1], (1)

where [R_cam, t_cam; 0, 1] is the desired pose of the camera in the world coordinate system, in which there are 12 unknown parameters.

FIGURE 5 LED light target (left) and its image captured by the camera (right); the LED lights fixed on the bracket are non-coplanar. There are five ball probe seats on the target that can hold the laser tracker target ball. The camera is set to dark mode with an infrared penetrating filter; thus, only the white LED light blobs are visible.
We used all 16 LED lights for the computation (i.e., we solved a system of 16 projection equations). The camera intrinsic parameters (f_x, f_y, c_x, c_y) were known, and camera distortion was accounted for in the program. The calculation was performed using Efficient Perspective-n-Point Camera Pose Estimation (ePnP; Lepetit et al., 2009). ePnP expresses the control points (the 16 LED lights) as a weighted sum of four virtual control points; the coordinates of these virtual control points become the unknowns of the problem. The final pose of the camera can then be obtained from these control points, as seen in Figure 6. The ePnP code from OpenCV is efficient and costs less than 0.2 ms.

Calibration Between the Camera and AGV
The camera has a horizontal viewing angle of 37°. We installed it on a rotating platform to widen the measurement field, as shown in Figure 3. The naive vision result calculated by the ePnP method yields the camera pose (R_cam, t_cam) in the world frame. It is necessary to transfer the camera pose to the rotating center. In Figure 7, we set up two calibrated LED targets and fixed the AGV.

LED target 1 LED target 2 Rotate
Rotating center FIGURE 7 This is an illustration of a calibration experiment that transfers the camera pose to the rotating center. Two calibrated LED targets are fixed on the ground while the AGV remains still in the middle. We rotate the platform to ensure that the camera views each target and computes the vision results.
When the camera obtains a full view of a target, the pose matrix (R_cam, t_cam) can be obtained, and the rotating center satisfies X_cen^w = R_cam X_cen^cam + t_cam, where X_cen^cam, the rotating center in the camera coordinate system, is unchanged because the rotating center is fixed with respect to the camera. The rotating platform is rotated so that the camera views each target and yields its own (R_cam, t_cam); the shared unknown X_cen^cam can then be solved from the stacked equations. The navigation system operates on the 2D plane; thus, only x_cen^w and y_cen^w are useful. The camera angle θ_cam minus the rotating platform angle provides the yaw angle θ_agv of the AGV.
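The shared-unknown structure of this calibration can be sketched on the 2D navigation plane; the pose values below are synthetic placeholders, assuming each platform angle yields a camera pose (R_i, t_i) from ePnP while the rotating center stays fixed in the camera frame:

```python
import numpy as np

def rot(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

# Synthetic ground truth (illustrative values only).
X_c_true = np.array([0.02, -0.15])   # rotating center in the camera frame
center_w = np.array([1.0, 2.0])      # rotating center in the world frame

# Camera poses at three platform angles; t_i follows from the invariant
# center_w = R_i @ X_c + t_i, since the center is fixed in both frames.
Rs = [rot(a) for a in (0.0, 0.5, 1.0)]
ts = [center_w - R @ X_c_true for R in Rs]

# Eliminate the unknown world point: (R_0 - R_i) X_c = t_i - t_0,
# then solve the stacked system by least squares.
A = np.vstack([Rs[0] - R for R in Rs[1:]])
b = np.concatenate([t - ts[0] for t in ts[1:]])
X_c, *_ = np.linalg.lstsq(A, b, rcond=None)
center_est = Rs[0] @ X_c + ts[0]     # recovered (x_cen^w, y_cen^w)
```

Because I - R(θ) is invertible on the 2D plane for any nonzero rotation, two distinct platform angles already determine X_c; extra views simply overdetermine the least-squares fit.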
For AGV navigation, the pose at the AGV control center (center of gravity) is required, as shown in Figure 8. The relative position transformation between the camera rotating center and the AGV control center remains unchanged. ρ_t and θ_t denote the position bias and angle bias in the polar coordinate system, as shown in Figure 9. We can estimate the optimal biases ρ̂_t and θ̂_t by least squares over repeated measurements.

Error Distribution of Static Vision Localization Result
The LED blob in the camera image is fitted as a circle to compute the center. The AGV vibrates when the electric and mechanical systems are running; thus, the white light blob jitters in the camera image, as demonstrated in Figure 10. The blob center recognized from the image moves among adjacent pixels. These vibrations cause changes in u_i and v_i, as demonstrated in Equation (1). Correspondingly, there is noise in (R_cam, t_cam), which in turn produces errors in the vision results. To improve the accuracy of localization, the error distribution had to be determined. The angle error was approximately 0.02°, which we determined was precise enough. The position error was approximately 2 mm, and it had to be reduced. It is difficult to parse out the error directly from the computations used in the localization process. However, we could analyze the image vibrations geometrically. The vertical bias of the blob center in the image affects the height of the vision result; however, the navigation is on the 2D plane without height, so the vertical vibration can be ignored.

FIGURE 9 Position transformation between the camera rotating center and AGV control center.
The horizontal bias introduces a vision pose variation on the navigation plane. In Figure 11(a), the camera was fixed and heading toward the target. We recorded the vision result of the camera rotating center and obtained the purple dots in Figure 11(b). Similar operations were performed at different places in the LED valid region. We found that the vision results could be plotted as an ellipse, as shown in Figure 11(b), which is suitable for a Gaussian distribution approximation. The shape of the ellipse is related to the distance d_cam and the angle α_cam. The covariance matrix of the vision results, Σ_vo, can be calculated using Equation (10). The ellipse angle β_cam is roughly equal to the angle α_cam, as shown in Figure 11(b). The numerical model of the ellipse axes, fitted with the MATLAB optimization tool, is presented in Equation (11). We can obtain α_cam and d_cam with simple computations so that Σ_vo is available for the later vision optimization. Although Σ_vo is computed in static situations, the error distribution is still helpful in kinematic localization.
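The Gaussian approximation of the scattered static results can be illustrated as follows; the standard deviations and the 30° orientation are made-up stand-ins for the measured d_cam- and α_cam-dependent values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic static vision results: an elongated Gaussian cloud whose major
# axis is rotated by alpha, mimicking the ellipse observed in Figure 11(b).
alpha = np.deg2rad(30.0)
R = np.array([[np.cos(alpha), -np.sin(alpha)],
              [np.sin(alpha),  np.cos(alpha)]])
pts = rng.normal(0.0, [2e-3, 0.5e-3], (500, 2)) @ R.T + [1.0, 2.0]

Sigma_vo = np.cov(pts, rowvar=False)            # covariance of the vision results
evals, evecs = np.linalg.eigh(Sigma_vo)         # semi-axes ~ sqrt of eigenvalues
beta = np.arctan2(evecs[1, -1], evecs[0, -1]) % np.pi  # ellipse angle beta_cam
```

The eigendecomposition recovers the ellipse orientation (here approximately alpha) and axis lengths, which is the shape information that Equation (11) parameterizes in terms of d_cam and α_cam.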

Data Fusion
AGVs in this aircraft painting application require pose messages at no less than 10 Hz for navigation. The vision localization data are obtained at a low frequency, and they are sensitive to environmental vibrations. Moreover, the camera cannot work if its view is obstructed. Therefore, it is necessary to use other sensors to improve performance.
The odometer and IMU are reliable for short-time measurements. When the camera data are missing, the odometer and IMU can still predict the pose for a short duration until the target is found again. If the vision element works accurately, the odometer and IMU contribute to the Kalman filter. Filtering can reduce the noise such that the result is maintained in a reasonable range. Additionally, a high-frequency Kalman filter loop can provide a stable localization solution. The fusion process is illustrated in Figure 12.
FIGURE 12 The structure of data fusion. The vision result of the camera rotating center p_i = [x_i, y_i, θ_i] is sent to the optimization process, VLPO. The odometer and IMU provide the transformation constraint p̂_i = [Δx_i, Δy_i, Δθ_i] to improve p_i and obtain a more accurate result p_opt. The square root central difference Kalman filter (SR-CDKF; Nørgaard et al., 2000) runs at 10 Hz to estimate the current AGV pose z_k. The SR-CDKF receives the measurement vector y_k (AGV control center pose), the control input vector u_k = [v, w], and the pose z_{k−1} from the last frame. The pose is sent to the navigation block for AGV trajectory tracking control.

Vision Localization Pre-Optimization
The Kalman filter is based on the Markov model. Therefore, the estimate of the current frame depends only on the last frame and the constraint between the two frames; the information in earlier frames is wasted.
In this study, the nearest frames and constraints are collected to optimize the sum of the re-projection errors; we call this process VLPO. Vision localization provides the camera rotating center pose p_i = [x_i, y_i, θ_i]^T and its covariance Σ_vo^i (Section 2.3) for the i-th frame. The constraint p̂_i from the odometer and IMU indicates the movement between two poses around the i-th frame. Because vision localization already provides an accurate angle measurement, the cost function is built only with the position errors: the inverse square root of the covariance matrix multiplies the residual to give the cost function F(p) over k frames in Equation (12). We solve the least-squares problem to obtain the optimal position vector p_opt. Linearizing F(p) around the current estimate p̂ with the Jacobian matrix J(p) turns each iteration into a new least-squares problem in the correction Δp, which amounts to solving the normal equation H(p)Δp = −g(p), where H(p) = J(p)^T J(p) is the approximate Hessian and g(p) is the gradient vector. The problem is sparse when several frames are involved, so the most efficient strategy is to decompose H(p) = R(p)^T R(p) with Cholesky factorization, where R is an upper triangular matrix, and back-substitute for Δp, as calculated in Equation (16). The correction Δp is multiplied by a scale factor μ to limit the step, and p_opt = p̂ + μΔp. This process is iterated until convergence. The problem was solved using the Ceres solver (Agarwal et al., n.d.). After pre-optimization (i.e., VLPO), a better localization sequence is obtained, and the latest optimal frame is transferred to the AGV control center for the following filtering process.
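Because a position-only cost with linear odometry constraints is (in this simplified form) a linear least-squares problem, the normal-equation-plus-Cholesky step can be sketched directly; the window length, weights, and noise levels below are invented for illustration, and the real VLPO cost is solved iteratively with Ceres:

```python
import numpy as np

k = 5
rng = np.random.default_rng(2)
true = np.cumsum(rng.uniform(-0.1, 0.1, (k, 2)), axis=0)  # true 2-D positions
z = true + rng.normal(0.0, 2e-3, (k, 2))                  # noisy vision fixes
d = np.diff(true, axis=0)                 # odometer/IMU constraints (noise-free here)

# Stack the weighted residuals of a VLPO-style window:
#   r_vis,i = p_i - z_i,   r_odo,i = (p_{i+1} - p_i) - d_i
A = np.zeros((2 * k + 2 * (k - 1), 2 * k))
b = np.zeros(A.shape[0])
w_vis, w_odo = 1.0 / 2e-3, 1.0 / 1e-4     # inverse standard deviations
for i in range(k):
    A[2*i:2*i+2, 2*i:2*i+2] = w_vis * np.eye(2)
    b[2*i:2*i+2] = w_vis * z[i]
for i in range(k - 1):
    r = 2 * k + 2 * i
    A[r:r+2, 2*i:2*i+2] = -w_odo * np.eye(2)
    A[r:r+2, 2*i+2:2*i+4] = w_odo * np.eye(2)
    b[r:r+2] = w_odo * d[i]

# Normal equation H p = g solved via Cholesky factorization, as in the text.
H = A.T @ A
g = A.T @ b
L = np.linalg.cholesky(H)
p_opt = np.linalg.solve(L.T, np.linalg.solve(L, g)).reshape(k, 2)
```

With the stiff odometry weight, the window behaves like a nearly rigid chain that is shifted to average out the vision noise, so the optimized positions are closer to the truth than any single vision fix.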

Filtering Loop
The optimized vision localization technique yields poses at 5 Hz, which is not sufficient for the 10-Hz AGV navigation requirement mentioned above. Moreover, the latest frame is not the current frame due to the computational time cost. The fusion algorithm therefore uses a square root central difference Kalman filter to provide better localization. A central difference Kalman filter (CDKF) requires only one tuning parameter, which makes it simpler than the UKF while remaining as accurate. Vision localization is treated as an observation.
The odometer and IMU data provide the input message in the state equation following the motion model. The state estimation problem for the system is described in Equation (17), where the process and measurement noises have their corresponding covariance matrices.
The filter is initialized with the state mean vector ẑ_0 = E[z_0] and the square root of its covariance, S_0 = chol{E[(z_0 − ẑ_0)(z_0 − ẑ_0)^T]}, where E[·] calculates the expectation and chol{·} calculates the Cholesky factorization. Then, for k = 1, 2, ..., the SR-CDKF obtains the sigma points from the current mean and covariance square root, where the scaling parameter h is optimally chosen with h² = 3 for Gaussian priors, w^T = [0, 0, 0], and diag{·} indicates the diagonal matrix.
The current state is predicted by propagating each sigma point through the motion model. Next, the sigma points for the measurement update are calculated, and the Kalman gain matrix K_k is computed. Subsequently, the state mean and covariance of the latest frame are updated, where cholupdate{·} is a Cholesky factor updating method. During the vision data intervals, the system only executes Equation (19) with ẑ_k to estimate the current ψ_{k+1/k} as the AGV pose. Thus, the filtering process can estimate ẑ_k at 10 Hz with optimal vision observations y_k at 5 Hz. When the LED target is lost, the fusion can still operate for 5 s before the error exceeds 10 mm.
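The central-difference sigma-point mechanics can be sketched on a toy linear model. This is a simplified, non-square-root variant: the full SR-CDKF additionally propagates Cholesky factors with QR and cholupdate steps, and the model matrices F, H, Q, R below are invented for the sketch:

```python
import numpy as np

h2 = 3.0                  # optimal h^2 for Gaussian priors (Norgaard et al., 2000)
h = np.sqrt(h2)

def sigma_points(x, P):
    """Central-difference sigma-point set: x, and x +/- h * columns of chol(P)."""
    S = np.linalg.cholesky(P)
    n = x.size
    return np.array([x] + [x + h * S[:, i] for i in range(n)]
                        + [x - h * S[:, i] for i in range(n)])

def cd_mean_cov(Y, noise_cov):
    """Central-difference mean and covariance of propagated sigma points."""
    n = (Y.shape[0] - 1) // 2
    y0, yp, ym = Y[0], Y[1:n + 1], Y[n + 1:]
    mean = ((h2 - n) / h2) * y0 + (yp.sum(0) + ym.sum(0)) / (2 * h2)
    d1 = yp - ym                 # first-order central differences
    d2 = yp + ym - 2 * y0        # second-order central differences
    cov = d1.T @ d1 / (4 * h2) + ((h2 - 1) / (4 * h2 * h2)) * (d2.T @ d2) + noise_cov
    return mean, cov, d1

# Toy linear constant-velocity model (invented matrices, for illustration only).
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])              # position-only measurement
Q = 1e-4 * np.eye(2)                    # process noise covariance
R = np.array([[1e-2]])                  # measurement noise covariance

def cdkf_step(x, P, y):
    # Time update: propagate sigma points through the motion model.
    xm, Pm, _ = cd_mean_cov(sigma_points(x, P) @ F.T, Q)
    # Measurement update: new sigma points around the prediction.
    X2 = sigma_points(xm, Pm)
    ym_, Pyy, d1y = cd_mean_cov(X2 @ H.T, R)
    Pxy = np.linalg.cholesky(Pm) @ d1y / (2 * h)   # cross covariance
    K = Pxy @ np.linalg.inv(Pyy)                   # Kalman gain
    return xm + K @ (y - ym_), Pm - K @ Pyy @ K.T

# Run the filter on simulated measurements.
rng = np.random.default_rng(3)
x_true = np.array([0.0, 1.0])
x_est, P = np.array([0.5, 0.0]), np.eye(2)
for _ in range(100):
    x_true = F @ x_true
    y = H @ x_true + rng.normal(0.0, 0.1, 1)
    x_est, P = cdkf_step(x_est, P, y)
```

On a linear model the central-difference weights reproduce the ordinary Kalman filter exactly (the second-order differences vanish), so the estimate converges to the true state despite the deliberately wrong initialization.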

EXPERIMENT
To evaluate the validity of the vision localization system and the performance of the data fusion method, experiments were performed. The world coordinate system was defined using the laser tracker. As shown in Figure 13, we installed two ball probe seats on the AGV to represent the AGV pose. Owing to equipment limitations, we could only measure one target ball at a time. Consequently, we obtained the ground truth pose only when the AGV stopped; the real-time angle cannot be measured via the laser tracker during driving. The AGV was driven to 20 different places to test the vision system. The positioning results were recorded for a while at each station, and the mean value was taken as the estimated AGV pose. The ground truth of the AGV pose was measured at the two ball probe seats by the laser tracker. Figure 14 shows the results of the experiment. Table 1 lists the position and angle errors of our vision results. The system reached an accuracy of 6 mm and 0.052° for static localization, which is equivalent to the state of the art in AGV positioning.
The kinematic localization experiment was performed as shown in Figure 15(a). The AGV moved along the path in various modes. The laser tracker tracked the target ball attached to the AGV control center to obtain the ground truth of the AGV pose. Our data fusion technique focused on introducing pre-optimization during the vision process instead of improving filtering. Thus, we compared our data fusion method with pure SR-CDKF to observe the proposed method.
As shown in Figure 15(c), the AGV drives backward during 0~20 s, rotates during 20~40 s, and drives forward during the remaining time. The positioning results are almost the same during rotation because the angles of the AGV wheels remain unchanged in this mode; the positioning errors are smaller than 8 mm, similar to static situations. The wheels turn to follow the path when driving backward or forward, which leads to a bigger localization error. The traditional SR-CDKF has a position accuracy of approximately 15 mm; the fusion of pre-optimization and SR-CDKF improves the accuracy to 10 mm. The yaw angle errors cannot be obtained during motion; however, it can be observed that the angle results are quite smooth for both methods.
The computation cost of SR-CDKF is approximately 0.018 s, while that of our additional pre-optimization method is less than 0.006 s. Moreover, the optimization process runs on a separate thread on the multi-core CPU, implying that our method does not increase the computing time of data fusion.

Some localization systems use the visible light positioning (VLP; Zhuang et al., 2019) signal strength. These low-cost methods have an average position accuracy of 10 cm for indoor AGVs. Wireless positioning methods are now common in AGV localization, including GPS, cellular communication, ZigBee, RFID, antennae, and UWB (He et al., 2020).
These methods are suitable for dusty spraying factories. However, their position accuracy is always over 2 cm, and they are banned in our project for safety reasons. High precision laser sensors can provide positions with 10 mm errors (Cho et al., 2018); however, the building cost of such systems is significant. Visual methods can achieve the best accuracy with general cost for AGV navigation.
The most accurate sample was obtained by Tang et al. (2020), which aimed to guide an aircraft transport AGV. This method placed QR codes on the floor for dual cameras to read and decode; its distance and angle errors were within 10 mm and 0.15°, respectively. QR codes were considered in our system owing to their convenience and performance. However, normal image recognition is considerably affected by the environmental brightness, and printed QR codes stain easily in spraying works. Boutteau et al. (2020) used distributed LED lights attached to a robot for localization, which is the inverse arrangement of our method. However, they obtained a localization accuracy of only 30 mm and 1° on a small robot. This may be because their camera was not in dark mode and they only used four LED lights for computation. Our experiments suggest that the proposed vision localization system achieves a positioning error within 10 mm and 0.052°. To the best of our knowledge, this accuracy is higher than that of other methods. Our vision localization system includes a camera, rotating platform, IMU, and LED targets, which is affordable for industrial AGVs. An expensive laser tracker was used to transfer the LED target to the global coordinate system for aircraft spraying; however, the laser tracker had already been used in the project to measure the workpiece, so it does not increase the cost of localization.
Irrespective of the method used, the coordinate system must be transferred by the laser tracker. The positioning method used in this study ensures high accuracy and flexibility for industrial use. LED targets can be conveniently placed on the ground and other devices can be installed on the AGV. Therefore, no structural reforms are required in factories. The camera works in the dark mode and can only see the LED blobs as mentioned in Section 2.1.
The good penetration ability of infrared LED light ensures that the system can work in dusty spraying factories. Consequently, our system exhibits strong adaptability in harsh industrial environments. Practical tests proved that localization can be performed all day during aircraft manufacturing with high accuracy. Table 2 compares the latest well-performing localization methods that can be applied to industrial AGVs. Considering the general cost and high adaptability, the proposed method obtained the best angle accuracy, which plays an important role in large mobile spraying systems.
It should be noted that our method still requires several improvements, particularly a fast and stable LED blob detection mechanism. We are developing a blob tracking algorithm to improve detection performance. The visual process cannot continue if an obstacle is present between the camera and the target; this system uses an IMU and odometer to ensure that localization continues for a short duration in such cases. We are also investigating an auto-fill algorithm that can obtain similarly accurate results with only partial LED blobs visible. Additionally, we plan to install two cameras at the front and back of the AGV. The fusion of double vision localization is another important topic for future work.

CONCLUSION
In this paper, we presented a highly accurate vision localization system, which was implemented for industrial applications. The system obtained the pose by processing the LED target image captured by a camera installed on a rotating platform. To provide high frequency and stable positioning, we introduced a Kalman filter loop that integrated the IMU and odometer data.
Our method can provide an affordable localization method for AGVs in wireless-banned industrial environments. The proposed system could achieve a position and angle accuracy of 10 mm and 0.052°, respectively, which denote state-of-the-art results in AGV navigation among the known methods. The system was tested in a real spraying experiment, and it satisfied the localization requirements at high precision and adaptability. We will continue to improve the accuracy and efficiency of the proposed system in future work.

CONFLICT OF INTEREST
The authors declare no potential conflict of interests.