Policy Learning Algorithms#
We provide official support for benchmarking the following policy learning algorithms: Diffusion Policy, Openpi, and GR00T.
Diffusion Policy#
We maintain a fork of the official Diffusion Policy code base, hosted at https://github.com/robocasa-benchmark/diffusion_policy.
Recommended system specs#
For training we recommend a GPU with at least 24 GB of memory, but 48 GB or more is preferred. For inference we recommend a GPU with at least 8 GB of memory.
Installation#
git clone https://github.com/robocasa-benchmark/diffusion_policy
cd diffusion_policy
pip install -e .
Key files#
Training: train.py
Evaluation: eval_robocasa.py
Experiment workflow#
# train model
python train.py \
--config-name=train_diffusion_transformer_bs192 \
task=robocasa/<dataset-soup>
# evaluate model
python eval_robocasa.py \
--checkpoint <checkpoint-path> \
--task_set <task-set> \
--split <split>
# report evaluation results
python diffusion_policy/scripts/get_eval_stats.py \
--dir <outputs-dir>
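The reporting step aggregates per-episode outcomes into per-task success rates. A minimal sketch of that aggregation, assuming a hypothetical output layout in which each task subdirectory of `<outputs-dir>` contains a `results.json` listing `{"success": bool}` records per episode (the real `get_eval_stats.py` format may differ):

```python
import json
from pathlib import Path

def aggregate_eval_stats(outputs_dir):
    """Aggregate per-episode success flags into per-task success rates.

    Assumes a hypothetical layout: <outputs_dir>/<task>/results.json,
    where results.json holds a list of {"success": bool} episode records.
    """
    stats = {}
    for results_file in sorted(Path(outputs_dir).glob("*/results.json")):
        episodes = json.loads(results_file.read_text())
        successes = sum(1 for ep in episodes if ep["success"])
        # guard against empty episode lists
        stats[results_file.parent.name] = successes / max(len(episodes), 1)
    return stats
```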
Openpi#
We maintain a fork of the official Openpi code base, hosted at https://github.com/robocasa-benchmark/openpi. Our fork supports training for pi0.
Recommended system specs#
For training we recommend a GPU with at least 100 GB of memory (B100, H200, etc.). For inference we recommend a GPU with at least 8 GB of memory.
Installation#
git clone https://github.com/robocasa-benchmark/openpi
cd openpi
pip install -e .
pip install -e packages/openpi-client/
Key files#
Training: scripts/train.py
Evaluation: scripts/serve_policy.py and examples/robocasa/main.py
Setting up configs: src/openpi/training/config.py
Experiment workflow#
# train model
XLA_PYTHON_CLIENT_MEM_FRACTION=1.0 python scripts/train.py \
<dataset-soup> \
--exp-name=<exp-name>
# evaluate model
# part a: start inference server
python scripts/serve_policy.py \
--port=8000 policy:checkpoint \
--policy.config=<dataset-soup> \
--policy.dir=<checkpoint-path>
# part b: run evals on server
python examples/robocasa/main.py \
--args.port 8000 \
--args.task_set <task-set> \
--args.split <split> \
--args.log_dir <checkpoint-path>
# report evaluation results
python examples/robocasa/get_eval_stats.py \
--dir <checkpoint-path>
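Evaluation is split into two processes: `serve_policy.py` loads the checkpoint and serves inference requests, while `main.py` steps the simulator and queries the server each control cycle. A purely illustrative in-process sketch of that request/response loop (the real implementation communicates over a websocket, and the action dimension and chunk length below are placeholder assumptions):

```python
class DummyPolicyServer:
    """Stands in for scripts/serve_policy.py: receives an observation,
    returns a chunk of actions. Purely illustrative -- the real server
    loads a pi0 checkpoint and serves it over a websocket."""

    ACTION_DIM = 7   # hypothetical action dimension
    CHUNK_LEN = 10   # hypothetical action-chunk length

    def infer(self, observation):
        # A real policy would run the model on the observation;
        # here we just return a zero action chunk of the right shape.
        return {"actions": [[0.0] * self.ACTION_DIM
                            for _ in range(self.CHUNK_LEN)]}

def run_episode(server, num_steps=3):
    """Stands in for the client loop in examples/robocasa/main.py:
    each control cycle, send the current observation to the server
    and execute (here: record) the first action of the returned chunk."""
    executed = []
    for _ in range(num_steps):
        observation = {"image": None, "state": [0.0] * 7}  # placeholder
        result = server.infer(observation)
        executed.append(result["actions"][0])
    return executed
```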
GR00T#
We maintain a fork of the official GR00T code base, hosted at https://github.com/robocasa-benchmark/Isaac-GR00T. Our fork supports training for GR00T N1.5.
Recommended system specs#
For training we recommend a GPU with at least 100 GB of memory (B100, H200, etc.). For inference we recommend a GPU with at least 8 GB of memory.
Installation#
git clone https://github.com/robocasa-benchmark/Isaac-GR00T
cd Isaac-GR00T
pip install -e ".[base]"
pip install --no-build-isolation flash-attn==2.7.1.post4
Key files#
Training: scripts/gr00t_finetune.py
Evaluation: scripts/run_eval.py
Experiment workflow#
# train model
python scripts/gr00t_finetune.py \
--output-dir <experiment-path> \
--dataset_soup <dataset-soup> \
--max_steps <num-training-steps>
# evaluate model
python scripts/run_eval.py \
--model_path <checkpoint-path> \
--task_set <task-set> \
--split <split>
# report evaluation results
python gr00t/eval/get_eval_stats.py \
--dir <checkpoint-path>
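When reporting results across the benchmark, per-task success rates are commonly summarized with a mean and standard error over tasks. A small sketch of that summary step, assuming you have already collected a `{task: success_rate}` mapping (e.g. from the `get_eval_stats` output; the exact output format is an assumption):

```python
import math

def summarize(task_success_rates):
    """Summarize per-task success rates into an aggregate mean and
    standard error across tasks, a common way to report benchmark
    numbers. Input format is assumed: {task_name: success_rate}."""
    rates = list(task_success_rates.values())
    n = len(rates)
    mean = sum(rates) / n
    # sample variance across tasks (0.0 when only one task)
    var = sum((r - mean) ** 2 for r in rates) / (n - 1) if n > 1 else 0.0
    return {"mean": mean, "stderr": math.sqrt(var / n) if n > 1 else 0.0}
```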