robosuite Datasets

The repository is fully compatible with datasets collected using robosuite. See the robosuite documentation for more information on collecting your own human demonstrations.

Converting robosuite hdf5 datasets

The raw demo.hdf5 file generated by the robosuite script can easily be modified in-place to be compatible with robomimic:

$ python conversion/ --dataset /path/to/demo.hdf5

Post-Processed Dataset Structure

In its current state, the post-processed demo.hdf5 file is missing observations (e.g., proprioception, images, …), rewards, and dones, all of which are necessary for training policies.

However, keeping these observation-free datasets is useful because it allows flexibility in extracting different kinds of observations and rewards.

Dataset Structure

  • data (group)

    • total (attribute) - number of state-action samples in the dataset

    • env_args (attribute) - a json string that contains metadata on the environment and relevant arguments used for collecting data

    • demo_0 (group) - group for the first demonstration (every demonstration has a group)

      • num_samples (attribute) - the number of state-action samples in this trajectory

      • model_file (attribute) - the xml string corresponding to the MJCF MuJoCo model

      • states (dataset) - flattened raw MuJoCo states, ordered by time

      • actions (dataset) - environment actions, ordered by time

    • demo_1 (group) - group for the second demonstration
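
As a rough illustration of the layout listed above, the following sketch walks the data group and reports per-demonstration sizes. The helper names `summarize_raw_dataset` and `summarize_file` are hypothetical and not part of robomimic; the code assumes only the attribute and dataset names shown in the list, and that the file is opened with h5py.

```python
def summarize_raw_dataset(data_group):
    """Summarize an h5py-like "data" group laid out as described above."""
    summary = {
        "total": int(data_group.attrs["total"]),
        "demos": {},
    }
    # Sort demo_0, demo_1, ... numerically rather than lexicographically.
    demo_names = sorted(
        (name for name in data_group.keys() if name.startswith("demo_")),
        key=lambda name: int(name.split("_")[1]),
    )
    for name in demo_names:
        demo = data_group[name]
        summary["demos"][name] = {
            "num_samples": int(demo.attrs["num_samples"]),
            "states_shape": tuple(demo["states"].shape),
            "actions_shape": tuple(demo["actions"].shape),
        }
    return summary


def summarize_file(path):
    """Open an hdf5 file read-only and summarize its "data" group."""
    # h5py is imported lazily so the helper above has no hard dependency on it.
    import h5py

    with h5py.File(path, "r") as f:
        return summarize_raw_dataset(f["data"])
```

Calling `summarize_file("/path/to/demo.hdf5")` on a converted file should return a dictionary keyed by demonstration name. Note that no observation, reward, or done entries are read here, since the raw file does not contain them yet.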

Next, we will extract observations from this raw dataset.

Extracting Observations from MuJoCo states

Warning! Train-Validation Data Splits

For robosuite datasets, if you use your own train-val splits, generate them before extracting observations. This ensures that every post-processed hdf5 file generated from demo.hdf5 inherits the same filter keys.
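
To make the idea of a train-val split concrete, here is a minimal sketch that partitions demonstration names into two named subsets. It assumes a filter key is simply a named list of demonstration groups; the function `split_demo_keys` and its `val_ratio` and `seed` parameters are hypothetical and are not robomimic's own splitting utility.

```python
import random


def split_demo_keys(demo_keys, val_ratio=0.1, seed=0):
    """Partition demonstration names into train and validation subsets."""
    # Sort first so the result depends only on the seed, not on input order.
    keys = sorted(demo_keys)
    rng = random.Random(seed)
    rng.shuffle(keys)

    num_val = int(round(val_ratio * len(keys)))
    return {
        "valid": sorted(keys[:num_val]),
        "train": sorted(keys[num_val:]),
    }


demo_keys = [f"demo_{i}" for i in range(10)]
splits = split_demo_keys(demo_keys, val_ratio=0.2, seed=0)
print(len(splits["train"]), len(splits["valid"]))
```

Because the split is fixed once on the raw file, every observation variant extracted afterwards can refer to the same subsets of demonstrations.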

Generating observations from a dataset is straightforward and can be done with a single command from robomimic/scripts:

# For low dimensional observations only, with done on task success
$ python --dataset /path/to/demo.hdf5 --output_name low_dim.hdf5 --done_mode 2

# For including image observations
$ python --dataset /path/to/demo.hdf5 --output_name image.hdf5 --done_mode 2 --camera_names agentview robot0_eye_in_hand --camera_height 84 --camera_width 84

# For including depth observations too
$ python --dataset /path/to/demo.hdf5 --output_name depth.hdf5 --done_mode 2 --camera_names agentview robot0_eye_in_hand --camera_height 84 --camera_width 84 --depth

# Using dense rewards
$ python --dataset /path/to/demo.hdf5 --output_name image_dense.hdf5 --done_mode 2 --dense --camera_names agentview robot0_eye_in_hand --camera_height 84 --camera_width 84

# (space saving option) extract 84x84 image observations with compression and without
# extracting next obs (not needed for pure imitation learning algos)
$ python --dataset /path/to/demo.hdf5 --output_name image.hdf5 \
    --done_mode 2 --camera_names agentview robot0_eye_in_hand --camera_height 84 --camera_width 84 \
    --compress --exclude-next-obs

# Only writing done at the end of the trajectory
$ python --dataset /path/to/demo.hdf5 --output_name image_done_1.hdf5 --done_mode 1 --camera_names agentview robot0_eye_in_hand --camera_height 84 --camera_width 84

# For seeing descriptions of all the command-line args available
$ python --help
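
When several dataset variants are generated from the same demo.hdf5, it can help to assemble the flags programmatically. The sketch below builds only the argument list, using the flags that appear in the commands above; the helper `build_extraction_args` is hypothetical, and the script path is deliberately left for the caller to supply.

```python
import shlex


def build_extraction_args(dataset, output_name, done_mode=2, camera_names=(),
                          camera_height=84, camera_width=84, depth=False,
                          dense=False, compress=False, exclude_next_obs=False):
    """Assemble the command-line flags shown in the examples above."""
    args = ["--dataset", dataset,
            "--output_name", output_name,
            "--done_mode", str(done_mode)]
    if dense:
        args.append("--dense")
    if camera_names:
        # Camera size flags are only added when cameras are requested.
        args += ["--camera_names", *camera_names]
        args += ["--camera_height", str(camera_height),
                 "--camera_width", str(camera_width)]
    if depth:
        args.append("--depth")
    if compress:
        args.append("--compress")
    if exclude_next_obs:
        args.append("--exclude-next-obs")
    return args


args = build_extraction_args(
    "/path/to/demo.hdf5", "image.hdf5",
    camera_names=["agentview", "robot0_eye_in_hand"],
    compress=True, exclude_next_obs=True,
)
# Print the flags as a single shell-quoted string for inspection.
print(shlex.join(args))
```

The returned list can be appended to `["python", script_path]` and passed to `subprocess.run`, where `script_path` is whichever extraction script you are invoking.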

Saving storage space

Image datasets can be quite large, but we offer two flags that help reduce storage. First, the --compress flag applies lossless compression to the extracted observations, resulting in datasets that are up to 5x smaller (in our testing). However, training will be marginally slower because of decompression costs when loading batches. Second, the --exclude-next-obs flag omits the next_obs keys from each trajectory, since they are not needed for imitation learning algorithms like BC and BC-RNN.

In our testing, enabling both flags reduced the Square (PH) Image dataset size from 2.5 GB to 307 MB at the cost of increasing BC-RNN training time from 7 hours to 8.5 hours.
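
The trade-off quoted above can be worked out directly from the reported numbers. This small calculation assumes 1 GB = 1000 MB; the function names are illustrative only.

```python
def size_reduction_factor(before_mb, after_mb):
    """How many times smaller the dataset became."""
    return before_mb / after_mb


def relative_increase(before, after):
    """Fractional increase of a quantity, e.g. 0.25 for +25%."""
    return (after - before) / before


# Reported figures for the Square (PH) Image dataset.
factor = size_reduction_factor(2.5 * 1000, 307)   # 2.5 GB -> 307 MB
slowdown = relative_increase(7.0, 8.5)            # 7 h -> 8.5 h of training

print(f"{factor:.1f}x smaller, {slowdown:.0%} longer training")
# prints "8.1x smaller, 21% longer training"
```

In other words, enabling both flags gave roughly an 8x reduction in storage for about a fifth more training time in that test.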

