Overview of Datasets
RoboCasa offers over 2,200 hours of demonstration data, comprising human teleoperation data and synthetic data. Broadly, the data is split into pretraining datasets and target datasets. The pretraining datasets feature 300 diverse tasks across 2,500 pretraining kitchens, while the target datasets feature 50 target tasks across a distinct set of 10 held-out target kitchens.
| Setting | Num Tasks | Num Scenes | Demos per Task | Dataset Size (hrs) |
|---|---|---|---|---|
| Pretraining (Human) | 300 | 2,500 | 100 | 482 |
| Pretraining (MimicGen) | 60 | 2,500 | 10,000 | 1,615 |
| Target (Human) | 50 | 10 | 500 | 193 |
We provide a detailed overview of the pretraining and target datasets below.
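As a quick cross-check of the table above, the per-setting figures can be tallied in a few lines of Python. This is a minimal sketch: the `DatasetSetting` class and the `SETTINGS` list are illustrative names and are not part of the RoboCasa API. The numbers are copied from the table, and the MimicGen demonstration count is approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSetting:
    """One row of the overview table (illustrative, not a RoboCasa class)."""
    name: str
    num_tasks: int
    num_scenes: int
    demos_per_task: int
    hours: int


# Values copied from the overview table.
SETTINGS = [
    DatasetSetting("Pretraining (Human)", 300, 2500, 100, 482),
    DatasetSetting("Pretraining (MimicGen)", 60, 2500, 10_000, 1615),
    DatasetSetting("Target (Human)", 50, 10, 500, 193),
]

# Total demonstrations per setting = number of tasks x demos per task.
demos = {s.name: s.num_tasks * s.demos_per_task for s in SETTINGS}

# Aggregate hours for the pretraining settings and for all settings.
pretraining_hours = sum(s.hours for s in SETTINGS if s.name.startswith("Pretraining"))
total_hours = sum(s.hours for s in SETTINGS)

for s in SETTINGS:
    print(f"{s.name}: {demos[s.name]:,} demos, {s.hours:,} hrs")
print(f"Pretraining total: {pretraining_hours:,} hrs")
print(f"Overall total: {total_hours:,} hrs")
```

The totals come to 2,097 pretraining hours and 2,290 hours overall, consistent with the rounded figures of ~2,000 and over 2,200 hours quoted in the text.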
Pretraining Datasets
RoboCasa offers ~2,000 hours of pretraining demonstration data. The pretraining datasets feature 300 diverse tasks across 2,500 pretraining kitchens. We provide both human and synthetic datasets:
Human Datasets
482 hours of data collected via teleoperation. The data spans 300 tasks (65 atomic tasks and 235 composite tasks), with 100 demonstrations per task. Go to the Atomic Tasks and Composite Tasks pages to see the list of supported tasks.
Synthetic Datasets
1,615 hours of data generated via MimicGen. The data spans 60 atomic tasks, with ~10,000 demonstrations per task. Go to the Atomic Tasks page to see the list of supported tasks.
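The aggregate figures in the two subsections above also imply a rough average demonstration length for each pretraining dataset. The sketch below derives it from the stated hours and demonstration counts. The helper `mean_demo_seconds` is a hypothetical name introduced here for illustration, and the synthetic result is only an estimate because the per-task count is given as roughly 10k.

```python
def mean_demo_seconds(hours: float, num_tasks: int, demos_per_task: int) -> float:
    """Average demonstration length in seconds, derived from aggregate figures."""
    total_demos = num_tasks * demos_per_task
    return hours * 3600 / total_demos


# Human pretraining data: 482 hours, 300 tasks, 100 demos per task.
human = mean_demo_seconds(hours=482, num_tasks=300, demos_per_task=100)

# Synthetic (MimicGen) data: 1,615 hours, 60 tasks, ~10k demos per task.
synthetic = mean_demo_seconds(hours=1615, num_tasks=60, demos_per_task=10_000)

print(f"Human: {human:.1f} s per demo")
print(f"MimicGen: {synthetic:.1f} s per demo")
```

Note that the human datasets include composite tasks while the synthetic datasets cover atomic tasks only, so the two averages describe different task mixes.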
Target Datasets
In addition to pretraining data, RoboCasa offers over 193 hours of high-quality human demonstration data for target tasks, collected via teleoperation. The target datasets feature 50 diverse tasks across 10 target kitchen scenes. Note that these target scenes are distinct from the pretraining scenes represented in the pretraining datasets. For each task, we provide 500 human demonstrations.
We split these datasets into three groups:
- Atomic-Seen (18 tasks): atomic tasks that are all also represented in the pretraining datasets.
- Composite-Seen (16 tasks): composite tasks that are all also represented in the pretraining datasets.
- Composite-Unseen (16 tasks): composite tasks that appear only in the target datasets and not in the pretraining datasets.
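The three groups above can be written down as a small lookup table, which makes it easy to confirm that they add up to the 50 target tasks. This is a sketch only: the `TARGET_GROUPS` dictionary and its keys are illustrative and do not correspond to any RoboCasa data structure.

```python
# Group sizes copied from the list above; names and structure are illustrative.
TARGET_GROUPS = {
    "Atomic-Seen": {"num_tasks": 18, "task_type": "atomic", "in_pretraining": True},
    "Composite-Seen": {"num_tasks": 16, "task_type": "composite", "in_pretraining": True},
    "Composite-Unseen": {"num_tasks": 16, "task_type": "composite", "in_pretraining": False},
}

# Each target task comes with 500 human demonstrations.
DEMOS_PER_TARGET_TASK = 500

num_target_tasks = sum(g["num_tasks"] for g in TARGET_GROUPS.values())
num_seen = sum(g["num_tasks"] for g in TARGET_GROUPS.values() if g["in_pretraining"])
num_unseen = num_target_tasks - num_seen
num_target_demos = num_target_tasks * DEMOS_PER_TARGET_TASK

print(f"Target tasks: {num_target_tasks} ({num_seen} seen, {num_unseen} unseen)")
print(f"Target demonstrations: {num_target_demos:,}")
```

Of the 50 target tasks, 34 also appear in the pretraining datasets and 16 do not, for a total of 25,000 target demonstrations.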
Atomic-Seen Tasks
| Task | Description | Horizon | Video |
|---|---|---|---|
Composite-Seen Tasks
| Task | Description |
|---|---|
Composite-Unseen Tasks
| Task | Description |
|---|---|