UWB Indoor Positioning: Solving the 'Hall of Mirrors' Problem
Source Publication: IEEE Transactions on Neural Networks and Learning Systems
Primary Authors: Xiao, Yao, Gao et al.

The Marco Polo Effect
Imagine playing a game of Marco Polo. But you are not in a swimming pool. You are in a pitch-black warehouse packed with metal shelves, glass dividers, and concrete pillars. You shout "Marco!".
Your friend shouts "Polo!". But the sound does not travel in a straight line. It bounces. It ricochets off a steel beam, hits the ceiling, slaps against a wall, and finally reaches your ears from the left. Your brain is tricked. You think your friend is to your left, but they are actually standing straight ahead. In the world of radio signals, this is called the multipath effect.
This is the primary obstacle for UWB indoor positioning. Ultra-wideband (UWB) technology locates objects by measuring the time it takes for a radio pulse to travel from a tag to an anchor, then converting that time into a distance. In an open field with a clear line of sight, it works very well. In a cluttered factory, the signals bounce like our warehouse echoes. Traditional geometry-based systems get confused. They overestimate distances because the signal they measured took the long, bouncing route rather than the direct one.
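To make the overestimate concrete, here is a minimal sketch of time-of-flight ranging. The 5 m direct path and the 6.5 m bounce path are made-up numbers for illustration, not values from the study, and the function names are hypothetical.

```python
# Speed of light in metres per second (exact, by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def time_of_flight(path_length_m: float) -> float:
    """Time a radio pulse needs to cover a path of the given length."""
    return path_length_m / SPEED_OF_LIGHT_M_PER_S


def distance_from_time_of_flight(tof_seconds: float) -> float:
    """Convert a one-way time of flight into a distance in metres."""
    return SPEED_OF_LIGHT_M_PER_S * tof_seconds


# Hypothetical geometry: the tag is 5 m from the anchor in a straight line,
# but a shelf blocks that path and the detected pulse arrives via a 6.5 m bounce.
direct_path_m = 5.0
reflected_path_m = 6.5

# The receiver only sees the arrival time, so it converts the longer delay
# into a longer distance.
measured_tof = time_of_flight(reflected_path_m)
estimated_distance_m = distance_from_time_of_flight(measured_tof)
ranging_bias_m = estimated_distance_m - direct_path_m

extra_delay_ns = (measured_tof - time_of_flight(direct_path_m)) * 1e9
print(f"extra delay: {extra_delay_ns:.2f} ns")
print(f"estimated distance: {estimated_distance_m:.2f} m (bias {ranging_bias_m:+.2f} m)")
```

An extra delay of roughly 5 nanoseconds is enough to push the estimate out by 1.5 m, which is why the timing alone cannot be trusted in a cluttered hall.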
How PosFormer Untangles the Mess
To fix this, researchers developed a new model called PosFormer. It does not just look at when the signal arrives; it looks at the shape of the signal wave to figure out which parts are real and which are echoes.
Think of the incoming radio signal as a messy handwriting sample. The PosFormer system uses a dual-brain approach to read it:
- The Magnifying Glass (CNN): First, a Convolutional Neural Network looks at the tiny, local spikes in the signal data. It analyses the sharp edges and immediate distortions. If the signal has a specific jagged shape, the CNN recognises it as a direct hit.
- The Wide Lens (Transformer): Simultaneously, a Transformer module looks at the entire timeline of the signal. It checks for long-distance dependencies. If a loud spike is followed by a specific pattern of quieter echoes 50 nanoseconds later, the Transformer understands the context of the room.
The system fuses these two views. If the CNN sees a sharp lead spike and the Transformer confirms the echo pattern matches a metal obstacle, then the system calculates the true position, discarding the bounce data.
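The two views can be illustrated with a toy sketch in plain Python. This is not the PosFormer architecture: the signal values, the difference kernel, the identity attention projections, and the fusion by concatenation are all assumptions chosen to show the general idea of a local convolution branch and a global self-attention branch. A real model would learn its weights from data.

```python
import math


def conv1d_same(signal, kernel):
    """Local branch: slide a small kernel over the signal (zero padding)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Global branch: single-head scaled dot-product self-attention.

    Queries, keys and values are the tokens themselves (identity projections);
    a trained model would use learned weight matrices instead.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ])
    return outputs


# Hypothetical received signal: a sharp first spike followed by echoes.
cir = [0.0, 0.1, 0.9, 0.3, 0.1, 0.6, 0.2, 0.05]

# Local view: a difference kernel gives a large positive value on a sharp rise.
local_features = conv1d_same(cir, kernel=[-1.0, 0.0, 1.0])

# Global view: each token is (amplitude, normalised time index), and every
# time step attends to every other time step.
tokens = [[a, i / (len(cir) - 1)] for i, a in enumerate(cir)]
global_features = self_attention(tokens)

# Fusion by concatenation: one three-value feature vector per time step.
fused = [[loc] + glob for loc, glob in zip(local_features, global_features)]
print(fused[1])
```

In a trained system, a fused vector like this would feed a final layer that outputs the position estimate. Here it only shows that each time step ends up carrying both a local edge measure and a summary of the whole signal.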
Results: Pinpoint Accuracy
The team tested this in industrial halls full of metal obstacles. The model achieved a mean absolute error (MAE) of just 17.64 cm, beating comparison models such as Long Short-Term Memory (LSTM) networks.
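For readers unfamiliar with the metric, the sketch below shows one common way to compute a mean absolute positioning error. It assumes the error for each sample is the straight-line distance between the estimated and true 2D position. The coordinates are invented for illustration and are not data from the study.

```python
import math


def mean_absolute_error_cm(estimated, actual):
    """Average straight-line distance between estimated and true positions."""
    errors = [math.dist(e, a) for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)


# Hypothetical positions in centimetres: (x, y) pairs.
true_positions = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
estimated_positions = [(3.0, 4.0), (100.0, 12.0), (94.0, 92.0)]

# Per-sample errors are 5 cm, 12 cm and 10 cm, so the mean is 9 cm.
mae = mean_absolute_error_cm(estimated_positions, true_positions)
print(f"MAE: {mae:.2f} cm")  # prints "MAE: 9.00 cm"
```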
Furthermore, the study introduced a transfer learning framework. Usually, robots need to 're-learn' everything when moved to a new building. Here, the pretrained model held its positioning error to 35.92 cm in a new environment using only 20% of the usual training data. This suggests that UWB indoor positioning could soon be deployed rapidly across different factories without starting from scratch every time.