Visual indoor localisation using a 3D building model
Document Type: PhD thesis
Access Status: Open Access
© 2020 Debaditya Acharya
With the emergence of global navigation satellite systems (GNSS), the performance of outdoor localisation has improved to an excellent level over the years. Applications such as navigation, location-based services and augmented reality demand seamless localisation capabilities in all environments. However, no single technology that can satisfy the needs of localisation in indoor spaces exists to date. The major limiting factor in the large-scale adoption of indoor localisation systems is the overhead cost of installing and maintaining dedicated local infrastructure. Consequently, infrastructure-independent indoor localisation has become a focus of research during the past decade. The ubiquity of smartphones with integrated cameras has led to a renewed interest in infrastructure-independent visual localisation approaches for indoor environments. However, the existing visual approaches face two challenges that restrict their wide applicability. The first challenge concerns current visual simultaneous localisation and mapping (SLAM) approaches, which suffer from drift caused by the accumulation of errors. Loop closing can eliminate this drift, but it limits the practical application of visual positioning, as revisiting the same location may be impractical for navigation purposes. The second challenge for the existing visual approaches is the requirement of an initial location. The existing visual approaches that are independent of an initial location require either the construction of a large database of images with known locations or a 3D reconstruction of the indoor environment in the form of depth images or 3D point clouds. However, creating such a database of images with known locations and acquiring additional data for large indoor spaces is challenging due to the cost, time and post-processing involved.
This research presents approaches to address the two above-mentioned challenges of the existing visual approaches by using a 3D building model, which is usually available through the building information modelling (BIM) process or can be generated with little effort from existing 2D plans. The motivation for using BIM in this research comes from the fact that BIMs of modern buildings are readily available, as they are jointly maintained by contractors and facility managers. The research can be broadly divided into two parts. The first part proposes a novel 3D model-based visual tracking approach called BIM-Tracker. BIM-Tracker uses the 3D building model to perform drift-free localisation and thus addresses the challenge of error accumulation. Localisation is performed by integrating image sequences captured by a camera with the 3D building model. A comprehensive evaluation of the approach with photo-realistic synthetic datasets shows the robustness of the localisation approach under challenging conditions. Additionally, the approach is evaluated on real data captured by a smartphone, and achieves an accuracy of ten centimetres. Like many visual approaches, BIM-Tracker depends on the availability of an initial location. The second part of the research therefore proposes a deep learning-based method called BIM-PoseNet to estimate the initial location. The requirement of image-based reconstruction of the indoor environment is eliminated by using a 3D building model, thereby addressing the second challenge of the existing visual approaches that estimate the initial location. BIM-PoseNet trains a convolutional neural network (CNN) on synthetic images rendered from the 3D indoor model to regress the location of a real image taken by a camera. In addition, the uncertainties of camera location estimates are modelled by adopting a Bayesian CNN, as uncertainty provides an indication of confidence and trust in an estimated location in the absence of ground truth.
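CNN-based camera pose regressors in the PoseNet family are typically trained with a weighted combination of a position error and an orientation error. The abstract does not state the exact loss used by BIM-PoseNet, so the sketch below shows only the standard PoseNet-style objective as an assumed illustration; `pose_loss` and its `beta` weight are hypothetical names, not identifiers from the thesis.

```python
import numpy as np

def pose_loss(x_pred, q_pred, x_true, q_true, beta=250.0):
    """PoseNet-style regression loss (assumed, not the thesis's exact loss).

    x_*: 3-vector camera positions (metres).
    q_*: unit quaternions encoding camera orientation.
    beta: balances the scale difference between metres and quaternion units.
    """
    # Normalise the prediction so the orientation error is well-defined.
    q_pred = q_pred / np.linalg.norm(q_pred)
    pos_err = np.linalg.norm(x_true - x_pred)  # Euclidean position error
    ori_err = np.linalg.norm(q_true - q_pred)  # quaternion distance
    return pos_err + beta * ori_err
```

During training, a network rendering-trained on synthetic BIM images would minimise this quantity over batches; at test time the position term alone gives a metric localisation error.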
Furthermore, the use of sequences of synthetic images is explored to exploit the spatio-temporal information in the images and improve the performance of BIM-PoseNet using recurrent neural networks. The results of the qualitative and quantitative experiments with the proposed approaches on photo-realistic synthetic and real datasets indicate that the proposed research addresses the two major limitations of the existing visual indoor localisation approaches. In addition, the proposed research demonstrates the potential of visual indoor localisation as a single technology for achieving infrastructure-independent localisation.
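A common realisation of Bayesian-CNN uncertainty modelling of the kind described above is Monte-Carlo dropout: dropout layers are kept active at test time and the spread of repeated stochastic forward passes serves as the uncertainty estimate. The abstract does not specify the mechanism used in the thesis, so the sketch below is illustrative only; `mc_dropout_pose` and the toy `noisy_forward` stand-in are hypothetical names, not code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_pose(forward, image, n_samples=50):
    """Estimate a camera position and its uncertainty via Monte-Carlo dropout.

    `forward` is a stochastic network forward pass (dropout left on at
    test time) that returns a 3-vector position estimate for `image`.
    """
    samples = np.stack([forward(image) for _ in range(n_samples)])
    mean = samples.mean(axis=0)            # point estimate of the position
    cov = np.cov(samples, rowvar=False)    # sample spread = uncertainty
    return mean, cov

# Toy stand-in for a Bayesian CNN: the true pose plus dropout-like noise.
def noisy_forward(image):
    return np.array([2.0, 1.0, 0.5]) + rng.normal(0.0, 0.05, size=3)
```

A large covariance trace would flag an untrustworthy initial-location estimate, which is exactly the confidence signal needed in the absence of ground truth.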
Keywords: Computer vision; Indoor localisation; Deep learning; Uncertainty modelling; Camera pose estimation; Synthetic images; Transfer learning; Building information modelling; BIM; Recurrent neural networks; LSTM; 3D model-based visual tracking; Photogrammetry