Human-aware navigation data and environments
A public resource for vision-and-language navigation (VLN) in dynamic scenes with humans, activities, social constraints, and grounded instructions.
Interactive demos
Simulator clips, motion cases, and trajectory visualizations show how humans shape the navigation task before the page moves into dataset details.
Demo gallery
Short clips show the variety of human movements, interaction cues, and navigation cases that HA-VLN makes available to researchers.
Scene diversity
Dynamic people and activity references make navigation decisions depend on the evolving context around the agent.
A person moves upward through a stair area, changing the navigable space around the agent.
Human motion inside a room provides activity and route context for grounded instructions.
A person walks through the hallway while using a phone, creating a moving social cue.
Fast motion followed by a stop makes timing and clearance important for navigation.
A backyard motion sequence shifts from crouching to walking while the surrounding scene remains visible.
The person bends down while moving indoors, varying height, posture, and visibility.
Navigation visualization
These examples pair instructions with agent trajectories in scenes where human motion changes what a reasonable route looks like.
Overview
Many VLN setups focus on static environments. HA-VLN shifts attention to scenes where people move, interact, occlude views, and shape what a safe navigation decision means.
Dynamic Humans
Human activities, motion, partial observability, and local crowd behavior create navigation decisions that static VLN scenes usually miss.
Social Grounding
Agents must interpret language that refers to people, activities, rooms, objects, and social context rather than only final goal locations.
Data + Environment
HA-VLN releases human-populated navigation data, environment support, baseline references, and evaluation metrics for reproducible study.
Safety Signals
Strict success, collision rate, trajectory collision rate, and navigation error help separate reaching goals from reaching them safely.
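To make the distinction between reaching a goal and reaching it safely concrete, the sketch below aggregates the four signals from a list of finished episodes. It is illustrative only: the `Episode` record, its field names, the 3.0 m success radius, and the exact formulas (strict success as "within the radius and no human collision", collision rate as the share of episodes with at least one collision, trajectory collision rate as mean collisions per episode) are assumptions for this example, and the benchmark's official definitions may differ.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Episode:
    """Hypothetical per-episode log; not the HA-VLN data schema."""
    final_distance_m: float  # distance from the stop position to the goal
    human_collisions: int    # collisions with humans along the trajectory


def summarize(episodes: Sequence[Episode],
              success_radius_m: float = 3.0) -> Dict[str, float]:
    """Aggregate safety-aware metrics under the assumptions stated above."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    # An episode "reaches" the goal if it stops within the assumed radius.
    reached = [e.final_distance_m <= success_radius_m for e in episodes]
    # An episode "collided" if it logged at least one human collision.
    collided = [e.human_collisions > 0 for e in episodes]
    return {
        # Mean final distance to the goal, in meters.
        "navigation_error": sum(e.final_distance_m for e in episodes) / n,
        # Plain success ignores collisions entirely.
        "success": sum(reached) / n,
        # Strict success (assumed): reached the goal AND never hit a person.
        "strict_success": sum(r and not c for r, c in zip(reached, collided)) / n,
        # Collision rate (assumed): share of episodes with any collision.
        "collision_rate": sum(collided) / n,
        # Trajectory collision rate (assumed): mean collisions per episode.
        "trajectory_collision_rate": sum(e.human_collisions for e in episodes) / n,
    }


if __name__ == "__main__":
    demo = [Episode(1.0, 0), Episode(2.0, 2), Episode(5.0, 0), Episode(4.0, 1)]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, an agent that stops near the goal after brushing past a person still counts toward plain success but not toward strict success, which is the gap the safety signals are meant to expose.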
Data and environments
HA-VLN brings together navigation instructions, dynamic human motions, activity cues, and Habitat-based environments so agents can be tested beyond static shortest-path behavior.
Environment support
The simulator support is a practical companion to the dataset: it places dynamic humans into photorealistic environments and exposes the state needed for human-aware navigation research.
HA-VLN is best understood as data and environment infrastructure for studying human-aware navigation. The Habitat-based simulator and baseline code are supporting resources, not the main novelty claim.
Team and citation
@misc{dong2026havln20openbenchmark,
title={HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions},
author={Yifei Dong and Fengyi Wu and Qi He and Lingdong Kong and Heng Li and Minghan Li and Zebang Cheng and Yuxuan Zhou and Jingdong Sun and Qi Dai and Alexander G Hauptmann and Zhi-Qi Cheng},
year={2026},
eprint={2503.14229},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2503.14229},
}