MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
Abstract
Real-time large-scale office reconstruction using an uncalibrated RGB camera (playback at 8x speed).
Method
[Figure: System Diagram]
The fundamental building blocks are MASt3R, which outputs pointmaps in a common coordinate frame given two images, and our efficient pointmap matching. Both are used in the frontend for camera tracking and pointmap fusion, as well as in the backend for loop closure and large-scale global optimisation.
Generic Camera Model: Pointmap to Rays
For each frame, MASt3R-SLAM defines a generic central camera model by normalising a pointmap into rays. This enables SLAM with time-varying camera models such as the highly dynamic zooming shown above.
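As a rough illustration of the normalisation step, the sketch below turns a pointmap into unit rays with NumPy. It assumes the pointmap is an (H, W, 3) array expressed in the frame's own camera coordinates with the camera centre at the origin. The function name `pointmap_to_rays` and the `eps` guard are hypothetical, not taken from the MASt3R-SLAM code.

```python
import numpy as np

def pointmap_to_rays(pointmap, eps=1e-8):
    """Normalise an (H, W, 3) pointmap into unit-length rays.

    Assumption: points are in the camera's own frame, so each ray
    starts at the origin (a central camera model).
    """
    norms = np.linalg.norm(pointmap, axis=-1, keepdims=True)
    # eps guards against division by zero for degenerate points.
    return pointmap / np.maximum(norms, eps)

# Toy 1x2 pointmap: two 3D points seen from the camera centre.
pointmap = np.array([[[0.0, 0.0, 2.0], [3.0, 0.0, 4.0]]])
rays = pointmap_to_rays(pointmap)
```

Because only the direction of each point is kept, no intrinsics are needed: every pixel carries its own ray, so the camera model can differ from frame to frame.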
Efficient Pointmap Matching
Matching in 3D or feature space is too slow for real-time SLAM. Given the pointmap from DUSt3R or MASt3R in a common coordinate frame, MASt3R-SLAM performs massively parallel matching by minimising the angular error between the ray from the camera centre to a 3D point and the ray queried by the current pixel.
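The angular-error criterion can be sketched as follows. This is a brute-force stand-in, not the solver used by MASt3R-SLAM: it compares each 3D point against every pixel ray and keeps the pixel with the smallest angle, which is only practical for toy sizes. The names `unit` and `match_by_angular_error`, and the synthetic pinhole-like ray grid, are assumptions made for illustration.

```python
import numpy as np

def unit(v, eps=1e-8):
    """Scale vectors along the last axis to unit length."""
    return v / np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)

def match_by_angular_error(ref_rays, points):
    """Find, for each 3D point, the pixel whose ray is angularly closest.

    ref_rays: (H, W, 3) unit rays of the reference frame.
    points:   (N, 3) points already expressed in the reference camera frame.
    Returns (N, 2) integer (row, col) matches and (N,) angular errors in radians.
    """
    H, W, _ = ref_rays.shape
    flat = ref_rays.reshape(-1, 3)
    # Cosine of the angle between each point's ray and every pixel ray.
    cos = unit(points) @ flat.T                      # (N, H*W)
    best = np.argmax(cos, axis=1)                    # largest cosine = smallest angle
    err = np.arccos(np.clip(cos[np.arange(len(points)), best], -1.0, 1.0))
    rows, cols = np.unravel_index(best, (H, W))
    return np.stack([rows, cols], axis=1), err

# Synthetic 4x5 ray image (hypothetical pinhole-like layout, for the demo only).
H, W = 4, 5
v, u = np.mgrid[0:H, 0:W]
ref_rays = unit(np.stack([u - 2.0, v - 1.5, np.full((H, W), 4.0)], axis=-1))

# Points placed at different depths along two known pixel rays.
points = np.array([2.5 * ref_rays[2, 3], 7.0 * ref_rays[0, 1]])
matches, err = match_by_angular_error(ref_rays, points)
```

The points recover the pixels they were generated from regardless of depth, since the error depends only on ray direction. Each point is handled independently, which is what makes this kind of matching easy to parallelise.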
Large-Scale Backend Optimisation
Backend optimisation ensures global consistency of poses and dense geometry. Since gradient descent converges slowly, MASt3R-SLAM leverages Gauss-Newton optimisation to achieve efficient large-scale updates.
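To make the Gauss-Newton step concrete, here is a minimal sketch on a toy problem: aligning two 2D point sets with a rotation and translation. It is a generic textbook Gauss-Newton loop, not the MASt3R-SLAM backend, which optimises poses and dense geometry jointly. The function names, the parameterisation `(theta, tx, ty)`, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def residuals_and_jacobian(params, src, dst):
    """Residuals r = R(theta) @ p + t - q, stacked as [x0, y0, x1, y1, ...]."""
    theta, tx, ty = params
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    dR = np.array([[-s, -c], [c, -s]])               # derivative of R w.r.t. theta
    r = (src @ R.T + np.array([tx, ty]) - dst).reshape(-1)
    J = np.zeros((2 * len(src), 3))
    J[:, 0] = (src @ dR.T).reshape(-1)               # d r / d theta
    J[0::2, 1] = 1.0                                 # d r_x / d tx
    J[1::2, 2] = 1.0                                 # d r_y / d ty
    return r, J

def gauss_newton(src, dst, params=None, iters=10, tol=1e-12):
    """Iterate the normal equations (J^T J) step = -J^T r until the step is tiny."""
    params = np.zeros(3) if params is None else np.asarray(params, dtype=float)
    for _ in range(iters):
        r, J = residuals_and_jacobian(params, src, dst)
        step = np.linalg.solve(J.T @ J, -J.T @ r)
        params = params + step
        if np.linalg.norm(step) < tol:
            break
    return params

# Synthetic, noise-free data generated from a known transform.
rng = np.random.default_rng(0)
src = rng.normal(size=(5, 2))
true = np.array([0.3, 1.0, -2.0])                    # theta, tx, ty
R_true = np.array([[np.cos(true[0]), -np.sin(true[0])],
                   [np.sin(true[0]),  np.cos(true[0])]])
dst = src @ R_true.T + true[1:]

est = gauss_newton(src, dst)
```

Each iteration solves a linear system built from the Jacobian rather than taking a small step along the gradient, so on well-conditioned problems like this one it reaches the solution in a handful of iterations.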
BibTeX
@article{murai2024_mast3rslam,
  title={{MASt3R-SLAM}: Real-Time Dense {SLAM} with {3D} Reconstruction Priors},
  author={Murai, Riku and Dexheimer, Eric and Davison, Andrew J.},
  journal={arXiv preprint},
  year={2024},
}