Simultaneous Localisation and Mapping (SLAM)
The ability to map an unknown environment is essential for field robots, drones, and autonomous vehicles that must navigate independently. Simultaneous Localisation and Mapping (SLAM for short) is a well-studied problem in robotics with a two-fold aim:
• building a representation of the environment (aka mapping)
• finding where the robot is with respect to the map (aka localisation).
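Formally, this two-fold aim is usually expressed as a single joint estimation problem. The formulation below is the standard full-SLAM posterior from the robotics literature (not specific to any work featured on this page): the robot trajectory and the map are estimated together from sensor observations and control inputs.

```latex
% Standard full-SLAM posterior: the trajectory x_{0:t} (localisation) and
% the map m (mapping) are estimated jointly from observations z_{1:t} and
% controls u_{1:t}.
p(x_{0:t}, m \mid z_{1:t}, u_{1:t})
  \propto p(x_0) \prod_{k=1}^{t} p(x_k \mid x_{k-1}, u_k)\, p(z_k \mid x_k, m)
```

Here p(x_k | x_{k-1}, u_k) is the motion model and p(z_k | x_k, m) is the observation model; the coupling between trajectory and map in the second factor is what makes SLAM harder than mapping or localisation alone.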
When is SLAM needed?
For mobile robotics, it is important to know where the agent is at any moment in time. Normally, GPS can provide a rough location for a robot. However, in GPS-denied environments such as indoors, underground, or underwater, the mobile agent has to rely solely on its on-board sensors to construct a representation of the environment, which then allows it to locate itself. This is the scenario in which SLAM is needed. Even where GPS can provide coarse localisation, SLAM can be used to produce a fine-grained estimate of the robot's location.
SLAM and deep learning
Different variants of the SLAM problem arise from different combinations of sensors, such as monocular, stereo, and RGB-D cameras, laser scanners, and Inertial Measurement Units. When cameras are the primary sensor, the problem is termed Visual SLAM and inherits many of the problems that come with cameras, such as errors caused by illumination changes.
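As a concrete illustration of the camera-only setting, the sketch below shows one front-end step shared by most Visual SLAM pipelines: detecting and matching features between consecutive frames. It is a minimal example, assuming OpenCV is available and that frame0.png and frame1.png are hypothetical consecutive frames from a monocular camera; hand-crafted binary features such as ORB are fast, but degrade under the illumination changes mentioned above.

```python
# Minimal Visual SLAM front-end step: feature detection and matching.
# Assumes OpenCV (cv2) is installed; the filenames are placeholders.
import cv2

img0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# ORB: fast binary features, but sensitive to illumination change.
orb = cv2.ORB_create(nfeatures=1000)
kp0, des0 = orb.detectAndCompute(img0, None)
kp1, des1 = orb.detectAndCompute(img1, None)

# Brute-force Hamming matching with Lowe's ratio test to reject
# ambiguous correspondences before pose estimation.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des0, des1, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences for two-view pose estimation")
```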
AIML researchers are applying deep learning techniques to address many of the perceptual shortcomings of Visual SLAM, including single-view depth prediction, better features for matching images across large baselines, two-view pose and depth estimation, and object-based SLAM via object detection in images.
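Single-view depth prediction, for example, is typically cast as dense regression with a convolutional encoder-decoder. The toy network below is a hedged sketch in PyTorch, not any particular AIML architecture; it maps an RGB image to a normalised disparity map, whereas real systems use far deeper backbones and multi-scale outputs.

```python
# Toy single-view depth network: encoder-decoder that regresses a dense
# disparity map from a single RGB image. Illustrative only.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),  # disparity normalised to (0, 1)
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

disparity = TinyDepthNet()(torch.randn(1, 3, 128, 416))  # KITTI-like size
print(disparity.shape)  # torch.Size([1, 1, 128, 416])
```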
Real-time Monocular Sparse SLAM with semantically meaningful landmarks (video prepared by Mehdi Hosseinzadeh).
Featured papers
Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age (2016)
Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian Reid, John J. Leonard
This paper presents a survey of the current state of SLAM and discusses open problems and future research directions.
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue (2016)
Ravi Garg, Vijay Kumar BG, Gustavo Carneiro, Ian Reid
We present a single-view depth estimation system that can be trained end-to-end from scratch, in a fully unsupervised fashion, using data captured with a stereo rig, removing the need for vast amounts of annotated training data. Our network is trained on less than half of the KITTI dataset and gives performance comparable to state-of-the-art supervised methods for single-view depth estimation.
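The central idea can be sketched in a few lines: a network predicts disparity for the left image of a rectified stereo pair, the right image is warped along the horizontal epipolar line to synthesise the left view, and the photometric error supervises the network, so no ground-truth depth is needed. The PyTorch sketch below is an illustrative simplification (the paper's smoothness prior and linearisation details are omitted), and the same reconstruction principle also underlies the 2018 framework featured further down.

```python
# Sketch of unsupervised depth training with a stereo photometric loss.
# left/right are a rectified pair; disparity would come from a network.
import torch
import torch.nn.functional as F

def stereo_photometric_loss(left, right, disparity):
    """left/right: (B,3,H,W); disparity: (B,1,H,W) in pixels, left frame."""
    b, _, h, w = left.shape
    # Base sampling grid in normalised [-1, 1] coordinates.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).expand(b, h, w, 2).clone()
    # A left-image pixel at x appears in the right image at x - d,
    # so sample the right image shifted left by the disparity.
    grid[..., 0] = grid[..., 0] - 2.0 * disparity.squeeze(1) / w
    left_reconstructed = F.grid_sample(right, grid, align_corners=True)
    return (left - left_reconstructed).abs().mean()  # L1 photometric error

# Smoke test with random tensors standing in for images and predictions.
loss = stereo_photometric_loss(torch.rand(2, 3, 64, 128),
                               torch.rand(2, 3, 64, 128),
                               5.0 * torch.rand(2, 1, 64, 128))
print(loss.item())
```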
Real-Time Monocular Object-Model Aware Sparse SLAM (2019)
Mehdi Hosseinzadeh, Kejie Li, Yasir Latif, Ian Reid
We introduce a monocular SLAM system that incorporates plane and object models, allowing more accurate camera tracking and a richer map representation without a large computational cost.
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction (2018)
Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid
We present an unsupervised learning framework for single-view depth estimation and monocular visual odometry, using stereo data for training.
Scalable Place Recognition Under Appearance Change for Autonomous Driving (2019)
Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Thanh-Toan Do, Ian Reid
A major challenge in place recognition for autonomous driving is robustness to appearance changes caused by short-term (e.g. weather, lighting) and long-term (e.g. seasons, vegetation growth) environmental variations. We propose a novel method for scalable place recognition that is lightweight in both training and testing, with data continuously accumulated so that all of the appearance variations needed for long-term place recognition are retained. Our results show significant potential towards achieving long-term autonomy.
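Setting the specific method aside, the retrieval step at the heart of most place-recognition systems is easy to sketch: reduce every image to a global descriptor and match a query against the database by similarity. Everything in the snippet below (the random stand-in "places" and the naive flatten-and-normalise descriptor) is illustrative; a real system would use a learned embedding.

```python
# Place recognition as nearest-neighbour retrieval of global descriptors.
import numpy as np

def describe(image):
    # Placeholder descriptor: flatten and L2-normalise. A real system
    # would use a learned global embedding (e.g. a CNN feature).
    v = image.astype(np.float32).ravel()
    return v / (np.linalg.norm(v) + 1e-8)

rng = np.random.default_rng(0)
map_images = [rng.random((32, 32)) for _ in range(100)]      # stand-in map
query_image = map_images[42] + 0.05 * rng.random((32, 32))   # noisy revisit

database = np.stack([describe(img) for img in map_images])   # (N, D)
scores = database @ describe(query_image)                    # cosine similarity
best = int(np.argmax(scores))
print(f"query matched to place {best} (score {scores[best]:.3f})")
```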
Projects
Visual sensing for localisation and mapping in mining
Current mine surveying involves scanning from a number of fixed points using laser range-finding equipment. The aim of this project is to develop computer vision algorithms that improve the speed and accuracy of this digital mapping of mines, allowing accurate mapping in GPS-denied locations and in locations where other sensors cannot be deployed.
Ian Reid, Tat-Jun Chin, Maptek
ARC Grant ID: LP140100946
Lifelong Computer Vision Systems
The aim of the project is to develop robust computer vision systems that can operate over a wide area and over long periods. This is a major challenge because the geometry and appearance of an environment can change over time, and long-term operation requires robustness to this change. The outcome will be a system that can capture an understanding of a wide area in real time, through building a geometric map endowed with semantic descriptions, and which uses machine learning to continuously improve performance. The significance will lie in turning an inexpensive camera into a high-level sensor of the world, ushering in cognitive robotics and autonomous systems.
Ian Reid
ARC Grant ID: FL130100102
Recognising and reconstructing objects in real time from a moving camera
The aim of this project is to visually understand an environment as seen from a moving camera in real time. This entails the recovery of 3D shape and the recognition of individual objects in the environment, while also recognising overall scene types (indoor or outdoor, home or office). This is a significant advance over existing systems, which focus on sparse 3D shape estimation, and produces a model of the environment which is akin to that maintained by a human observer. Such a model has applications beyond the typical domain of robotics, including driver assistance, automated map annotation, environment capture and true scene understanding, which is the original and ongoing goal of computer vision.
Ian Reid, Anthony Dick
ARC Grant ID: DP130104413