Prediction and anticipation of future states of movable objects
Future prediction is a key cognitive task for human beings, and one of the pillars of Autonomous Driving Technology.
IMRA is convinced that forecasting the multiple possible futures of already visible objects (“prediction”) and of yet-to-appear objects (“anticipation”) could be a game-changing technology, enabling much smarter path planning for autonomous cars.
IMRA has been both developing in-house algorithms and establishing collaborations with external academic partners to address this target.
At IMRA, we have been focusing on Generative Adversarial Networks (GANs) to generate high-resolution semantic maps of the perceived environment, one for each of the multiple potential futures. What sets IMRA’s final GAN technology apart is that it will be conditioned both on the past and present perceptions of the environment and on the desired future time horizon. This will allow the future semantic maps to be generated directly at the desired time horizon, without relying on auto-regressive approaches, which tend to drift over time.
Pipeline of IMRA’s time-conditioned GAN
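As a rough illustration of what conditioning on both the observations and the time horizon can look like at the input level, the sketch below stacks the observed semantic maps with one extra constant plane that encodes the requested horizon. This is a minimal sketch under stated assumptions, not IMRA’s actual pipeline: the names `build_generator_input`, `horizon_channel` and `max_horizon`, the list-of-lists map format, and the choice of encoding the horizon as a normalized constant channel are all hypothetical.

```python
from typing import List

Grid = List[List[float]]  # one H x W semantic map (class ids or scores)


def horizon_channel(height: int, width: int, horizon: float, max_horizon: float) -> Grid:
    """Constant H x W plane holding the normalized time horizon (assumed encoding)."""
    if not 0 < horizon <= max_horizon:
        raise ValueError("horizon must lie in (0, max_horizon]")
    value = horizon / max_horizon
    return [[value] * width for _ in range(height)]


def build_generator_input(past_maps: List[Grid], horizon: float,
                          max_horizon: float = 3.0) -> List[Grid]:
    """Stack the observed maps and the horizon plane along the channel axis."""
    if not past_maps:
        raise ValueError("need at least one observed map")
    height, width = len(past_maps[0]), len(past_maps[0][0])
    for grid in past_maps:
        if len(grid) != height or any(len(row) != width for row in grid):
            raise ValueError("all maps must share the same shape")
    return [*past_maps, horizon_channel(height, width, horizon, max_horizon)]


# Two observed 2 x 3 maps (past and present), queried 1.5 s ahead.
past = [[[0, 1, 1], [0, 0, 2]], [[0, 0, 1], [0, 2, 2]]]
x = build_generator_input(past, horizon=1.5, max_horizon=3.0)
```

With an input of this kind, a different horizon only changes the last plane, so each future is requested in a single pass instead of being reached by feeding predictions back into the model step by step.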
In parallel, IMRA’s academic collaborators at the Universities of Freiburg and Florence have been working on novel approaches to multiple-future trajectory prediction, each with its own focus.
At the University of Freiburg, Prof. Thomas Brox’s team focuses on estimating the multiple futures in a Bayesian framework, i.e. as multimodal probability distributions. They designed a new network architecture that samples the multiple futures and fits a multimodal distribution to them. This work was presented at the CVPR 2019 conference in Long Beach, USA.
Article and related material: here (CVPR 2019 at CVF)
Video presentation: here
Source code: here
CVPR 2019 Video Presentation
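The sample-then-fit idea described for the CVPR 2019 work can be illustrated with a toy example: given a set of sampled future positions, group them into modes and summarize each mode by a weight, a mean and a variance. The sketch below is an assumption-laden stand-in, not the team’s method: it replaces the learned fitting network with a plain k-means-style hard assignment, uses isotropic Gaussians, and the names `fit_modes`, `assign` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Mode = Tuple[float, Point, float]  # (weight, mean, isotropic variance)


def sq_dist(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def assign(points: List[Point], centers: List[Point]) -> List[List[Point]]:
    """Group each point with its nearest center (hard assignment)."""
    groups: List[List[Point]] = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def mean(points: List[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_modes(hypotheses: List[Point], k: int, iters: int = 20) -> List[Mode]:
    """Fit up to k isotropic Gaussian modes to sampled future positions."""
    # Farthest-point initialisation keeps this toy example deterministic.
    centers = [hypotheses[0]]
    while len(centers) < k:
        centers.append(max(hypotheses, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = assign(hypotheses, centers)
        # Move each mode to the mean of its hypotheses; keep empty modes in place.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    modes: List[Mode] = []
    for g, c in zip(assign(hypotheses, centers), centers):
        if g:
            variance = sum(sq_dist(p, c) for p in g) / (2 * len(g))
            modes.append((len(g) / len(hypotheses), c, variance))
    return modes


# Hypothetical samples: three "go straight" futures and two "turn left" futures.
samples = [(10.0, 0.0), (10.2, 0.1), (9.8, -0.1), (6.0, 5.0), (6.2, 5.2)]
modes = fit_modes(samples, k=2)
```

Here the two modes keep the "go straight" and "turn left" outcomes separate, with weights proportional to the number of samples in each, which is the property a single averaged prediction would lose.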
This project was continued in collaboration with Prof. Thomas Brox’s team at the University of Freiburg, this time adapting multimodal future prediction to egocentric views.
This new work focuses on estimating the multiple futures in a unified framework for future localization and future emergence prediction, i.e. estimating, as multimodal probability distributions, both the future location of objects already in the scene and the future emergence of objects not yet seen. The team designed a new network architecture that samples the multiple futures and fits a multimodal distribution to them. They also constrained the future localization and emergence with a reachability prior acquired from the scene. This work was presented at the CVPR 2020 virtual conference.
CVPR 2020 Video Presentation
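To give an intuition for how a reachability prior can constrain a predicted location distribution, the sketch below reweights a coarse probability grid by a prior grid and renormalizes the result. This is only an illustrative simplification under stated assumptions: the multiplicative reweighting, the grid representation and the name `apply_reachability_prior` are hypothetical, and the way the prior is actually used in the CVPR 2020 work may differ.

```python
from typing import List

Grid = List[List[float]]


def apply_reachability_prior(prediction: Grid, prior: Grid, eps: float = 1e-12) -> Grid:
    """Reweight a location distribution by a prior and renormalize it."""
    weighted = [
        [p * r for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, prior)
    ]
    total = sum(sum(row) for row in weighted)
    if total < eps:
        raise ValueError("prior excludes every predicted location")
    return [[v / total for v in row] for row in weighted]


# Uniform prediction over a 2 x 2 grid; the right column is assumed unreachable.
prediction = [[0.25, 0.25], [0.25, 0.25]]
prior = [[1.0, 0.0], [1.0, 0.0]]
posterior = apply_reachability_prior(prediction, prior)
```

In this toy case, all probability mass moves to the left column, so locations the scene rules out no longer receive any weight.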
At the University of Florence, Prof. Alberto Del Bimbo’s team is working on a new supervised machine learning approach that supports behavior analysis and the prediction of actions and scene evolution. To feed their supervised learning procedure with ground truth data, they designed an innovative way to extract and accumulate the trajectories of moving objects from egocentric views. This work, based on their Iterative Plane Registration technology, was presented at the ICIAP 2019 conference in Trento, Italy.
Article: here (ICIAP 2019 at Springer Link)
Iterative Plane Registration pipeline