Author: Tong Boon Tang - March 2023
Indoor localization is an active area of research dominated by traditional machine-learning techniques. Deep learning-based systems have achieved exceptional results over the past decade, especially the Transformer network within the natural language processing (NLP) and computer vision domains. We propose the hyper-class Transformer (HyTra), an encoder-only Transformer with multiple classification heads (one per class) and learnable embeddings, to investigate the effectiveness of Transformer-based models for received signal strength (RSS) based WiFi fingerprinting. HyTra leverages learnable embeddings and the self-attention mechanism to determine the relative positions of wireless access points (WAPs) within a high-dimensional embedding space, improving the prediction of user location. From an NLP perspective, we treat a fixed-order sequence of all observed WAPs as a sentence and the RSS value(s) captured for each WAP at a given reference point from a given user as words. We test our proposed network on public and private datasets of different sizes, showing that the quality of the learned embeddings and the overall accuracy improve as the number of samples increases.
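The sentence/word framing above can be illustrated with a minimal tokenization sketch: each RSS reading becomes a discrete "word" id that an embedding layer could look up. The bin count, the quantization scheme, and the padding token are illustrative assumptions, not the paper's actual embedding design; only the +100 no-signal sentinel follows the UJIIndoorLoc convention.

```python
# Sketch: a fixed-order WAP scan as a "sentence" of RSS "words".
# NUM_BINS, the linear quantization, and PAD_TOKEN are assumptions
# for illustration; HyTra's real embedding scheme may differ.

NO_SIGNAL = 100          # UJIIndoorLoc marks an unheard WAP with +100
RSS_MIN, RSS_MAX = -104, 0
NUM_BINS = 32            # hypothetical vocabulary size for RSS "words"
PAD_TOKEN = NUM_BINS     # extra token id for WAPs that were not observed

def rss_to_tokens(scan):
    """Map one RSS scan (fixed WAP order) to a sequence of token ids."""
    tokens = []
    for rss in scan:
        if rss == NO_SIGNAL:
            tokens.append(PAD_TOKEN)
        else:
            # clamp, then quantize linearly into NUM_BINS buckets
            r = min(max(rss, RSS_MIN), RSS_MAX)
            tokens.append(int((r - RSS_MIN) / (RSS_MAX - RSS_MIN) * (NUM_BINS - 1)))
    return tokens

scan = [-45, -87, 100, -60]   # RSS for 4 WAPs; the third WAP is not heard
print(rss_to_tokens(scan))    # [17, 5, 32, 13]
```

Each token id would then index into a learnable embedding table, after which self-attention can relate WAPs to one another regardless of their distance in the fixed sequence order.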
We evaluate our proposed Transformer model on the public UJIIndoorLoc dataset (UJI), for comparison with existing techniques, and on a larger private dataset with approximately ten times more training and validation samples. We discuss our results from training and testing on both the public and private datasets but describe only the public dataset in this section. The private datasets were constructed in a similar way but differed in the number of samples, unique locations and WAPs. UJIIndoorLoc [15] is the largest and most widely referenced dataset within the indoor localization literature and is easily accessible from the UC Irvine machine learning repository. The UJI dataset was compiled in 2013 at Jaume I University, Castelló de la Plana, Valencian Community, Spain, and is partitioned into a training and a validation set comprising 19,937 and 1,111 records, respectively. Twenty (20) users with twenty-five (25) different Android devices took measurements across three (3) buildings, each with four (4) floors on average, spanning a space of 110,000 m². The training and testing (addressed as validation in UJI) sets were generated four months apart to ensure data independence.
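A record in the UJIIndoorLoc CSV layout consists of one RSS column per WAP followed by location labels such as FLOOR and BUILDINGID. The loader below is a hedged sketch of that parsing step; the two data rows are synthetic and truncated to three WAP columns purely for illustration.

```python
import csv
import io

# Tiny synthetic sample in the UJIIndoorLoc column layout (WAP columns
# followed by label columns). Values are made up for illustration and
# truncated to 3 of the dataset's 520 WAP columns.
sample = """WAP001,WAP002,WAP003,FLOOR,BUILDINGID
-64,100,-82,2,1
100,-55,100,0,0
"""

def load_fingerprints(fp):
    """Parse CSV rows into (rss_vector, floor, building) triples."""
    rows = []
    for rec in csv.DictReader(fp):
        rss = [int(v) for k, v in rec.items() if k.startswith("WAP")]
        rows.append((rss, int(rec["FLOOR"]), int(rec["BUILDINGID"])))
    return rows

data = load_fingerprints(io.StringIO(sample))
print(data[0])   # ([-64, 100, -82], 2, 1)
```

In the real dataset the same code would read the training and validation files separately, since UJI ships them as two CSVs collected four months apart.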
Advancement in Indoor Localization Technology: The proposed hyper-class Transformer (HyTra) introduces a novel approach to indoor localization using Transformer-based models, departing from traditional machine-learning techniques.
Integration of Deep Learning in WiFi Fingerprinting: Integrating deep learning, specifically the Transformer architecture, into WiFi fingerprinting for indoor localization represents a shift towards more sophisticated and powerful modeling techniques.
Application of NLP Concepts to Wireless Networks: This approach may leverage the strengths of NLP models in understanding sequential data, potentially enhancing the accuracy of location prediction based on the sequence of observed WAPs.
Scalability and Generalization: If the model performs well across datasets of varying sizes, the HyTra approach is not limited to specific conditions or dataset characteristics, making it more adaptable to real-world scenarios.
Indoor Navigation and Positioning: The emphasis on classifying complex indoor environments suggests potential applications in indoor navigation, location-based services, and facility management.
Transformer-Based Approach: The use of a Transformer-based Encoder-Only network sets it apart, offering a potentially more effective solution for indoor environment classification.
SPOT and UJI Datasets: Testing on both a private (SPOT) and a public (UJI) dataset, with performance improving as training samples increase, indicates a thorough evaluation and potential scalability.
Performance Metrics: The reported floor accuracy of 96.47% on the UJI dataset suggests strong performance compared to existing deep-learning-based techniques.
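Floor accuracy, the metric cited above, is simply the fraction of samples whose predicted floor label matches the ground truth. A minimal sketch, with made-up label lists for the usage example:

```python
def floor_accuracy(predicted, actual):
    """Fraction of samples whose predicted floor matches the ground truth."""
    assert len(predicted) == len(actual) and actual
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Toy example: 3 of 4 floor predictions correct
print(floor_accuracy([0, 1, 2, 3], [0, 1, 2, 0]))  # 0.75
```

On UJI's 1,111 validation records, the reported 96.47% floor accuracy corresponds to roughly 1,072 correctly classified floors (0.9647 × 1,111 ≈ 1,071.8).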