Using Datasets#

We provide datasets in the LeRobot format. There are broadly three types of datasets: pretraining (human) datasets, pretraining (MimicGen) datasets, and target (human) datasets (see the datasets overview for details).

Downloading datasets#

Dataset storage location

By default, all datasets are stored under datasets/ in the root RoboCasa directory. You can change the location for datasets by setting DATASET_BASE_PATH in robocasa/macros_private.py.
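For example, a minimal robocasa/macros_private.py override might look like this (the path below is just a placeholder for your own storage location):

# robocasa/macros_private.py
DATASET_BASE_PATH = "/mnt/storage/robocasa_datasets"  # placeholder; point this at your storage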

Here are a few examples to download datasets:

# downloads all datasets
python -m robocasa.scripts.download_datasets --all

# only download pretraining human data
python -m robocasa.scripts.download_datasets --split pretrain --source human

# only download pretraining MimicGen data
python -m robocasa.scripts.download_datasets --split pretrain --source mimicgen

# only download target human data
python -m robocasa.scripts.download_datasets --split target --source human

# download all datasets for specific task(s)
python -m robocasa.scripts.download_datasets --tasks PickPlaceCounterToCabinet ArrangeBreadBasket

You can specify --overwrite to overwrite existing datasets.
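For example, to re-download the target human data and replace any existing local copy:

python -m robocasa.scripts.download_datasets --split target --source human --overwrite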

Dataset structure#

RoboCasa datasets follow the LeRobot format. Here is an overview of important elements of each dataset:

lerobot/
├── meta/                               # Metadata files describing the dataset
│   ├── info.json                       # Dataset info (robot type, episodes, frames, fps, features)
│   ├── tasks.jsonl                     # Language instructions with task indices
│   ├── episodes.jsonl                  # Per-episode metadata (index, instruction, length)
│   ├── episodes_stats.jsonl            # Per-episode statistics for actions/proprioception
│   ├── stats.json                      # Aggregated statistics across all episodes
│   ├── modality.json                   # Contents of the observation and action vectors
│   └── embodiment.json                 # Embodiment information
│
├── data/                               # Low-dimensional trajectory data (parquet files)
│   └── chunk-<chunk_id>/
│       └── episode_<episode_id>.parquet   # Proprioception, actions, dones, timestamps
│
├── videos/                             # MP4 video files for each camera view
│   └── chunk-<chunk_id>/
│       ├── observation.images.robot0_agentview_left/
│       │   └── episode_<episode_id>.mp4   # Left third-person camera
│       ├── observation.images.robot0_agentview_right/
│       │   └── episode_<episode_id>.mp4   # Right third-person camera
│       └── observation.images.robot0_eye_in_hand/
│           └── episode_<episode_id>.mp4   # Eye-in-hand camera
│
└── extras/                             # MuJoCo/RoboCasa-specific metadata (non-standard)
    ├── dataset_meta.json               # Environment args and controller configs
    └── episode_<episode_id>/           # Per-episode extras
        ├── ep_meta.json                # Episode metadata (layout, style, fixtures, objects)
        ├── model.xml.gz                # Compressed MJCF MuJoCo model XML
        └── states.npz                  # Raw MuJoCo states for replay (not for training)
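Because the metadata files are plain JSON/JSONL, you can inspect them directly as a quick sanity check. Here is a minimal sketch; the dataset root is a placeholder, and the exact keys in info.json may differ slightly across LeRobot versions:

import json
from pathlib import Path

ds_root = Path("datasets/my_dataset")  # placeholder: root of a downloaded dataset

# info.json is a single JSON object with dataset-level info
info = json.loads((ds_root / "meta" / "info.json").read_text())
print(info.get("total_episodes"), info.get("fps"))

# episodes.jsonl holds one JSON object per line, one per episode
with open(ds_root / "meta" / "episodes.jsonl") as f:
    first_episode = json.loads(f.readline())
print(first_episode)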

Retrieving dataset metadata#

We track metadata for each dataset (paths, task horizon, etc.) in the dataset registry. You can use the get_ds_meta() function to retrieve the metadata for a specific task:

from robocasa.utils.dataset_registry import get_ds_meta

ds_meta = get_ds_meta(
    task="PickPlaceCounterToCabinet",
    split="target", # or try "pretrain"
    source="human", # defaults to "human", try "mimicgen" for synthetic data
    demo_fraction=1.0, # the fraction of available demos to use (default is 1.0)
)
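The returned metadata supports dict-style access. The examples in the following sections rely on at least these fields:

print(ds_meta["task"])     # task name, e.g. "PickPlaceCounterToCabinet"
print(ds_meta["split"])    # "target" or "pretrain"
print(ds_meta["horizon"])  # task horizon (maximum episode length)
print(ds_meta["path"])     # local path to the lerobot-format dataset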

Creating environments from dataset metadata#

You can initialize a gym environment given the dataset metadata and run random rollouts:

import gymnasium as gym
import robocasa
from robocasa.utils.env_utils import run_random_rollouts

# gather relevant information from the ds_meta retrieved in the previous section
task_name = ds_meta["task"]
split = ds_meta["split"]
horizon = ds_meta["horizon"]

env = gym.make(
    f"robocasa/{task_name}",
    split=split,
    seed=0,  # seed the environment as needed; set seed=None to run unseeded
)

# run rollouts with random actions and save video
run_random_rollouts(
    env, num_rollouts=3, num_steps=horizon, video_path=f"/tmp/{task_name}_{split}_rollouts.mp4"
)
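run_random_rollouts is a convenience helper; if you prefer to drive the environment yourself, the standard gymnasium loop applies. A minimal sketch:

obs, info = env.reset()
for _ in range(horizon):
    action = env.action_space.sample()  # random action; substitute your policy here
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()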

Creating datasets for training#

Here is an example script for accessing individual dataset elements:

from lerobot.datasets.lerobot_dataset import LeRobotDataset
import random

# get the dataset path from the ds_meta retrieved in the previous section
dataset_path = ds_meta["path"]

ds = LeRobotDataset(repo_id="robocasa365", root=dataset_path)
ep_idx = 5
start = int(ds.episode_data_index["from"][ep_idx])
end = int(ds.episode_data_index["to"][ep_idx])
timestep_idx = random.randint(0, end - start - 1)  # frames for this episode span [start, end)

sample = ds[start + timestep_idx]                                   # a random sample from the episode at index 5
right_img = sample["observation.images.robot0_agentview_right"]     # the right third-person camera image
action = sample["action"]                                           # the action taken at this timestep
instruction = sample["task"]                                        # the language instruction for the episode
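LeRobotDataset is indexable like a standard PyTorch dataset, so for training you can wrap it in a DataLoader. A minimal sketch, assuming the default collate function handles the sample dict (the batch size and worker count are illustrative):

import torch

loader = torch.utils.data.DataLoader(ds, batch_size=32, shuffle=True, num_workers=4)
batch = next(iter(loader))
print(batch["action"].shape)  # (batch_size, action_dim)
print(len(batch["task"]))     # instructions collate into a list of strings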

Training beyond a single dataset#

The code above returns metadata for a single dataset. You can retrieve metadata for a collection of datasets using the get_ds_soup() function, which returns a list of dataset metadata entries:

from robocasa.utils.dataset_registry import get_ds_soup

ds_soup = get_ds_soup(
    task_soup="atomic_seen", # the list of tasks
    split="target", # or try "pretrain"
    source="human", # defaults to "human", try "mimicgen" for synthetic data
    demo_fraction=1.0, # the fraction of available demos to use (default is 1.0)
)
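Each entry in the returned list has the same structure as the single-dataset metadata above, so you can iterate over it directly:

for ds_meta in ds_soup:
    print(ds_meta["task"], ds_meta["path"])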

Prominent dataset soups are registered in the dataset soup registry.

To construct a combined dataset from multiple datasets with custom weights, you can reuse the dataloader from the GR00T-N1.5 codebase:

import copy
import os
from dataclasses import dataclass
import numpy as np
from robocasa.utils.dataset_registry import DATASET_SOUP_REGISTRY
from robocasa.utils.groot_utils.groot_dataset import LeRobotMixtureDataset, LeRobotSingleDataset, ModalityConfig
from robocasa.utils.groot_utils.schema import EmbodimentTag


embodiment_tag = EmbodimentTag("new_embodiment")

# Define configs needed for dataloader to fetch correct data
modality_configs = {
    "video": ModalityConfig(
        delta_indices=[0],
        modality_keys=[
            "video.robot0_agentview_left",
            "video.robot0_agentview_right",
            "video.robot0_eye_in_hand",
        ],
    ),
    "state": ModalityConfig(
        delta_indices=[0],
        modality_keys=[
            "state.end_effector_position_relative",
            "state.end_effector_rotation_relative",
            "state.gripper_qpos",
            "state.base_position",
            "state.base_rotation",
        ],
    ),
    "action": ModalityConfig(
        delta_indices=list(range(16)),
        modality_keys=[
            "action.end_effector_position",
            "action.end_effector_rotation",
            "action.gripper_close",
            "action.base_motion",
            "action.control_mode",
        ],
    ),
    "language": ModalityConfig(
        delta_indices=[0],
        modality_keys=[
            "annotation.human.task_description",
        ],
    ),
}


dataset_soup = "target_atomic_seen" # specify which dataset soup to use
ds_soup_list = copy.deepcopy(DATASET_SOUP_REGISTRY[dataset_soup])
single_datasets = []
for ds_meta in ds_soup_list:
    ds_path = ds_meta["path"]
    ds_filter_key = ds_meta["filter_key"]
    assert os.path.exists(ds_path), f"Dataset path {ds_path} does not exist"
    dataset = LeRobotSingleDataset(
        dataset_path=ds_path,
        modality_configs=modality_configs,
        embodiment_tag=embodiment_tag,
        filter_key=ds_filter_key,
    )
    single_datasets.append(dataset)

ds_weights = np.ones(len(single_datasets))  # uniform weights; adjust to reweight individual datasets
print("dataset weights:", ds_weights)

train_dataset = LeRobotMixtureDataset(
    data_mixture=[
        (dataset, ds_w)
        for dataset, ds_w in zip(single_datasets, ds_weights)
    ],
    mode="train",
)

for item in train_dataset:
    print(item)
    break
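To bias sampling toward particular datasets, replace the uniform weights above. For example (assuming LeRobotMixtureDataset treats the weights as relative sampling ratios, which you should verify against the GR00T-N1.5 dataloader):

ds_weights = np.ones(len(single_datasets))
ds_weights[0] = 2.0  # assumption: draws from the first dataset roughly twice as often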

Inspecting and visualizing datasets#

To get dataset statistics (filter keys, objects, task language, scenes):

python robocasa/scripts/get_dataset_info.py --dataset <ds-path>

You can view dataset videos directly by browsing the videos/ folder under each lerobot dataset directory. To visualize a dataset and save a video:

python robocasa/scripts/playback_dataset.py --n 10 --dataset <ds-path>

This saves a video of 10 random demonstrations in the same directory as the dataset. To play back the full dataset, remove the --n flag.