ViPlanner: Visual Semantic Imperative Learning for Local Navigation

Abstract

Real-time path planning in outdoor environments still challenges modern robotic systems due to differences in terrain traversability, diverse obstacles, and the necessity for fast decision-making. Established approaches have primarily focused on geometric navigation solutions, which work well for structured geometric obstacles but have limitations regarding the semantic interpretation of different terrain types and their affordances. Moreover, these methods fail to identify traversable geometric occurrences, such as stairs.

To overcome these issues, we introduce ViPlanner, a learned local path planning approach that generates local plans based on geometric and semantic information. The system is trained using the Imperative Learning paradigm, for which the network weights are optimized end-to-end based on the planning task objective. This optimization uses a differentiable formulation of a semantic costmap, which enables the planner to distinguish between the traversability of different terrains and accurately identify obstacles. The semantic information is represented in 30 classes using an RGB colorspace that can effectively encode the multiple levels of traversability. We show that the planner can adapt to diverse real-world environments without requiring any real-world training. In fact, the planner is trained purely in simulation, enabling highly scalable training data generation. Experimental results demonstrate resistance to noise, zero-shot sim-to-real transfer, and a decrease of 38.02% in terms of traversability cost compared to purely geometric-based approaches.
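To illustrate what a differentiable semantic costmap lookup can look like, the following is a minimal sketch and not the ViPlanner implementation. It assumes a small 2D grid of semantic labels, a hypothetical class-to-cost table (`CLASS_COST`, with made-up class names and values), and bilinear interpolation so that the cost at a continuous waypoint has a well-defined gradient with respect to the waypoint position. In an end-to-end training setup, an automatic differentiation framework would typically provide this gradient; here it is written out by hand to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple

# Hypothetical class-to-cost table. Names and values are illustrative
# assumptions, not the 30 classes or costs used by ViPlanner.
CLASS_COST = {"sidewalk": 0.0, "grass": 0.3, "road": 0.6, "obstacle": 1.0}


def build_costmap(labels: Sequence[Sequence[str]]) -> List[List[float]]:
    """Convert a grid of semantic labels into a grid of traversability costs."""
    return [[CLASS_COST[name] for name in row] for row in labels]


def sample_cost(
    costmap: Sequence[Sequence[float]], x: float, y: float
) -> Tuple[float, Tuple[float, float]]:
    """Bilinearly interpolate the cost at continuous cell coordinates (x, y).

    Returns the interpolated cost and its gradient (d cost/dx, d cost/dy).
    The grid must be at least 2x2. Coordinates are clamped to the grid.
    """
    rows, cols = len(costmap), len(costmap[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    # Lower-left corner of the 2x2 neighborhood, kept inside the grid.
    x0 = min(int(x), cols - 2)
    y0 = min(int(y), rows - 2)
    tx, ty = x - x0, y - y0

    c00, c10 = costmap[y0][x0], costmap[y0][x0 + 1]
    c01, c11 = costmap[y0 + 1][x0], costmap[y0 + 1][x0 + 1]

    value = (
        (1 - tx) * (1 - ty) * c00
        + tx * (1 - ty) * c10
        + (1 - tx) * ty * c01
        + tx * ty * c11
    )
    # Analytic partial derivatives of the bilinear interpolant.
    d_dx = (1 - ty) * (c10 - c00) + ty * (c11 - c01)
    d_dy = (1 - tx) * (c01 - c00) + tx * (c11 - c10)
    return value, (d_dx, d_dy)


def path_cost(
    costmap: Sequence[Sequence[float]], waypoints: Sequence[Tuple[float, float]]
) -> float:
    """Mean interpolated cost over the waypoints of a path."""
    return sum(sample_cost(costmap, x, y)[0] for x, y in waypoints) / len(waypoints)


if __name__ == "__main__":
    grid = build_costmap([["sidewalk", "obstacle"], ["sidewalk", "obstacle"]])
    cost, grad = sample_cost(grid, 0.5, 0.5)
    print(cost, grad)
```

In this toy grid, the gradient at a waypoint between a sidewalk cell and an obstacle cell points toward the obstacle, so a gradient descent step on the waypoint would move it toward the lower-cost sidewalk. This is the kind of signal a differentiable costmap can pass back to a planner during training.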

Video

Experiments

Simulation Experiments

We train and evaluate our planner in simulation using three different datasets: Matterport 3D, Carla, and NVIDIA Warehouse. We show that it takes both semantic and geometric information into account to plan safe paths in these environments.

Matterport

Carla

Warehouse

Real-World Experiments

We conducted several real-world experiments to show the generalization capabilities of our planner. The planner is trained purely in simulation and is shown here in a crosswalk scenario and a stairs scenario.

Crosswalk

Stairs

BibTeX

If you use ViPlanner in your research, please cite our paper:

@inproceedings{roth2024viplanner,
  title={{ViPlanner}: Visual Semantic Imperative Learning for Local Navigation},
  author={Roth, Pascal and Nubert, Julian and Yang, Fan and Mittal, Mayank and Hutter, Marco},
  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={5243--5249},
  year={2024},
  organization={IEEE}
}