Human-aware navigation data and environments

HA-VLN

A public resource for vision-and-language navigation in dynamic scenes with humans, activities, social constraints, and grounded instructions.


arXiv:2503.14229 University of Washington + collaborators

We present several annotated instances of human subjects from the proposed HAPS 2.0 dataset (overall and single), showcasing a variety of well-aligned motions and interactions.

Interactive demos

Explore human-aware navigation examples up front.

Simulator clips, motion cases, and trajectory visualizations show how humans shape the navigation task before the page moves into dataset details.

Demo gallery

Watch the human dynamics that shape the navigation task.

Short clips show the variety of human movements, interaction cues, and navigation cases that HA-VLN makes available to researchers.

Scene diversity

Scene diversity across human-aware navigation cases

Dynamic people and activity references make navigation decisions depend on the evolving context around the agent.

Stair climbing

A person moves upward through a stair area, changing the navigable space around the agent.

Indoor walking

Human motion inside a room provides activity and route context for grounded instructions.

Phone pacing

A person walks through the hallway while using a phone, creating a moving social cue.

Running and stopping

Fast motion followed by a stop makes timing and clearance important for navigation.

Backyard crouching

A backyard motion sequence shifts from crouching to walking while the surrounding scene remains visible.

Indoor crouching

The person bends down while moving indoors, varying height, posture, and visibility.

Navigation visualization

Socially aware behavior is visible in the trajectory.

These examples pair instructions with agent trajectories in scenes where human motion changes what a reasonable route looks like.

Navigation Instruction: Start by moving forward in the lounge area, where an individual is engaged in a phone conversation while pacing back and forth. Navigate carefully to avoid crossing their path. As you proceed, you will pass by a television mounted on the wall. Continue your movement, observing people relaxing and watching the TV, some seated comfortably on sofas. Further along, notice a group of friends raising their glasses in a toast, enjoying cocktails together. Maintain a steady course, ensuring you do not disrupt their gathering. Finally, reach the end of your path where a potted plant is situated next to a door. Stop at this location, positioning yourself near the plant and door without obstructing access.

Navigation Instruction: Exit the room and make a left turn. Proceed down the hallway where an individual is ironing clothes, carefully smoothing out wrinkles on garments. Continue walking and make another left turn. Enter the next room, which is a bedroom. Inside, someone is comfortably seated in bed, engrossed in reading a book. Move past the bed, ensuring not to disturb the reader. Turn left again to enter the bathroom. Once inside, position yourself near the sink and wait there, observing the surroundings without interfering with any activities.

Overview

Human-aware VLN needs data where people matter.

Many VLN setups focus on static environments. HA-VLN shifts attention to scenes where people move, interact, occlude views, and shape what a safe navigation decision means.

Dynamic Humans

Scenes with moving people, not just static obstacles

Human activities, motion, partial observability, and local crowd behavior create navigation decisions that static VLN scenes usually miss.

Social Grounding

Instructions tied to human activities and spatial cues

Agents must interpret language that refers to people, activities, rooms, objects, and social context rather than only final goal locations.

Data + Environment

A practical public resource for human-aware VLN

HA-VLN releases human-populated navigation data, environment support, baseline references, and evaluation metrics for reproducible study.

Safety Signals

Metrics that make social navigation visible

Strict success, collision rate, trajectory collision rate, and navigation error help separate reaching goals from reaching them safely.
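As a rough illustration of how these four signals differ, the sketch below computes them from per-episode logs. Everything here is an assumption made for illustration and not taken from the HA-VLN code: the `Episode` record and its field names, the `summarize` helper, the 3.0 m success radius, the use of straight-line instead of geodesic distance, and the exact definitions (collision rate as the share of episodes with any human collision, trajectory collision rate as collided steps over all steps, strict success as reaching the goal with zero collisions). Consult the paper and docs for the official definitions.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Episode:
    """Hypothetical per-episode log; field names are illustrative only."""
    final_position: Point   # where the agent stopped
    goal_position: Point    # target location for the instruction
    steps: int              # number of actions taken
    collision_steps: int    # steps on which the agent collided with a human


def summarize(episodes: List[Episode], success_radius: float = 3.0) -> Dict[str, float]:
    """Aggregate assumed versions of the four metrics over a set of episodes."""
    if not episodes:
        raise ValueError("need at least one episode")

    n = len(episodes)
    # Straight-line distance is a simplification; simulators often use geodesic distance.
    errors = [dist(e.final_position, e.goal_position) for e in episodes]
    total_steps = sum(e.steps for e in episodes)
    total_collisions = sum(e.collision_steps for e in episodes)

    return {
        # Mean distance between the stopping point and the goal.
        "navigation_error": sum(errors) / n,
        # Share of episodes with at least one human collision.
        "collision_rate": sum(e.collision_steps > 0 for e in episodes) / n,
        # Share of all steps that involved a human collision.
        "trajectory_collision_rate": total_collisions / total_steps if total_steps else 0.0,
        # Reached the goal region AND never collided with a human.
        "strict_success": sum(
            err <= success_radius and e.collision_steps == 0
            for err, e in zip(errors, episodes)
        ) / n,
    }


if __name__ == "__main__":
    demo = [
        Episode((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), steps=10, collision_steps=0),
        Episode((0.0, 0.0, 0.0), (5.0, 0.0, 0.0), steps=10, collision_steps=2),
        # Stops exactly at the goal but bumps into a person on the way.
        Episode((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), steps=20, collision_steps=1),
    ]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.3f}")
```

The third demo episode stops at the goal yet does not count toward strict success under these assumed definitions, which is the distinction between reaching a goal and reaching it safely.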

Data and environments

A human-populated resource for socially grounded navigation.

HA-VLN brings together navigation instructions, dynamic human motions, activity cues, and Habitat-based environments so agents can be tested beyond static shortest-path behavior.

Dataset instructions in docs

16,844 socially grounded instructions

172 human activity categories

486 detailed 3D motion models

58,320 human motion frames

90 annotated scans

910 single-human motion instances
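As a quick consistency check on two of these figures, the snippet below divides the motion-frame count by the motion-model count. The variable names are illustrative, and reading the result as a fixed per-model frame count is an inference from the arithmetic, not something this page states.

```python
# Figures copied from the statistics above.
motion_models = 486
motion_frames = 58_320

# Integer division with remainder shows whether the frames split evenly.
frames_per_model, remainder = divmod(motion_frames, motion_models)

print(frames_per_model, remainder)  # prints "120 0"
```

The division is exact, which is consistent with (but does not prove) each motion model contributing 120 frames.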

HA-VLN navigation scenario with instruction cues, dynamic humans, and RGB-depth observations. A navigation instruction can refer to human activities, spatial cues, and decision points. The agent must progress while respecting nearby people.

Environment support

Habitat-based scenes with annotated human activity.

The simulator support is a practical companion to the dataset: it places dynamic humans into photorealistic environments and exposes the state needed for human-aware navigation research.

Annotation and rendering pipeline for dynamic human activities in HA-VLN environments.

Overview of annotated HA-VLN scenarios with human-populated navigation scenes.

What to take away

HA-VLN is best understood as data and environment infrastructure for studying human-aware navigation. The Habitat-based simulator and baseline code are supporting resources, not the main novelty claim.

Resources

Everything you need to inspect, reproduce, and build on HA-VLN.

Get Started

Installation, data setup, integration notes, APIs, and metrics.

Read Paper

The arXiv paper describing HA-VLN and its public resources.

Repo

Source code, simulator support, baseline material, and scripts.

Dataset

Download sources, access notes, and expected data layout.

Team and citation

Built by researchers across embodied AI and navigation.

Yifei Dong, Fengyi Wu, Qi He, Lingdong Kong, Bohan Xiong, Yetong Sha, Heng Li, Minghan Li, Zebang Cheng, Yuxuan Zhou, Jingdong Sun, Qi Dai, Alexander G. Hauptmann, Zhi-Qi Cheng

@misc{dong2026havln20openbenchmark,
      title={HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions}, 
      author={Yifei Dong and Fengyi Wu and Qi He and Lingdong Kong and Heng Li and Minghan Li and Zebang Cheng and Yuxuan Zhou and Jingdong Sun and Qi Dai and Alexander G Hauptmann and Zhi-Qi Cheng},
      year={2026},
      eprint={2503.14229},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2503.14229}, 
}