Language-conditioned Policy Learning
This tutorial will guide you through setting up language-conditioned policy learning in robomimic.
Note: Understand how to launch training runs and view results first!
Before trying to train a language-conditioned policy, it might be useful to read the following tutorials:
1. Creating a Dataset Config
To create a dataset config with language conditioning, include the lang key in the dataset's entry in the config. Its value is the language annotation applied to every demo in that dataset.
Example:
{
    ...
    "train": {
        "data": [
            {
                "path": "path/to/dataset.hdf5",
                "lang": "language instruction for your task"
            },
            ...
        ],
        ...
    },
    ...
}
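When several datasets each need their own instruction, the data list can be generated rather than written by hand. The following is a minimal sketch, not part of robomimic: make_dataset_entries is a hypothetical helper, and the paths and instructions are placeholders. It only relies on the "path" and "lang" keys shown in the example above.

```python
import json


def make_dataset_entries(tasks):
    """Build the list stored under config["train"]["data"].

    `tasks` maps an HDF5 path to the instruction string shared by
    every demo in that dataset. (Hypothetical helper, not a robomimic API.)
    """
    return [{"path": path, "lang": lang} for path, lang in tasks.items()]


# Placeholder paths and instructions, for illustration only.
tasks = {
    "path/to/pick_dataset.hdf5": "pick up the red cube",
    "path/to/place_dataset.hdf5": "place the cube in the bin",
}

config_fragment = {"train": {"data": make_dataset_entries(tasks)}}
print(json.dumps(config_fragment, indent=4))
```

The printed fragment can then be merged into the rest of the training config.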
2. Conditioning Policies on Language Embeddings
We support CLIP embeddings for encoding language. The pre-defined key for language embeddings is lang_emb (specified in robomimic/utils/lang_utils.py). You can condition your policy on lang_emb in two ways:
As feature input to action head
FiLM over vision encoder
Feature input to action head
This concatenates the language embedding with the other low-dim observations that are input to the policy.
Example:
{
    ...
    "observation": {
        "modalities": {
            "obs": {
                "low_dim": [
                    "robot0_eef_pos",
                    "robot0_eef_quat",
                    "lang_emb"
                ],
                ...
            },
        },
        ...
    },
    ...
}
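Conceptually, listing lang_emb under low_dim means the embedding is flattened and appended to the other low-dimensional vectors. The sketch below illustrates that idea with plain Python lists. It is not robomimic's observation encoder: concat_low_dim is a hypothetical helper, and the 4-dimensional lang_emb is a toy stand-in for a real CLIP embedding, whose size is not stated here.

```python
def concat_low_dim(obs, keys):
    """Concatenate the listed observation vectors, in the order given."""
    flat = []
    for key in keys:
        flat.extend(float(v) for v in obs[key])
    return flat


# Toy observation; the values and the embedding size are placeholders.
obs = {
    "robot0_eef_pos": [0.1, 0.2, 0.3],
    "robot0_eef_quat": [0.0, 0.0, 0.0, 1.0],
    "lang_emb": [0.5, -0.5, 0.25, 0.0],
}
keys = ["robot0_eef_pos", "robot0_eef_quat", "lang_emb"]

policy_input = concat_low_dim(obs, keys)
print(len(policy_input))  # 3 + 4 + 4 = 11
```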
FiLM over vision encoder
This conditions the ResNet18 visual encoder with lang_emb using FiLM (see paper).
Example:
{
    ...
    "observation": {
        "rgb": {
            "core_class": "VisualCoreLanguageConditioned",
            "core_kwargs": {
                "feature_dimension": 64,
                "flatten": true,
                "backbone_class": "ResNet18ConvFiLM",
                "backbone_kwargs": {
                    "pretrained": false,
                    "input_coord_conv": false
                },
                "pool_class": null,
                "pool_kwargs": {}
            },
            "obs_randomizer_class": null,
            "obs_randomizer_kwargs": {}
        },
        ...
    }
}
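For readers unfamiliar with FiLM, the core operation is a per-channel scale and shift of a feature map, where the scale (gamma) and shift (beta) are predicted from the conditioning vector. The following is a minimal pure-Python sketch of that operation on toy data. It is not the ResNet18ConvFiLM implementation: the linear and film functions, the weights, and the shapes are illustrative assumptions, and where FiLM is applied inside the backbone is not covered here.

```python
def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film(feature_map, lang_emb, w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift each channel using values predicted from lang_emb.

    feature_map is indexed [channel][row][col]; gamma and beta each
    contain one value per channel.
    """
    gamma = linear(w_gamma, b_gamma, lang_emb)
    beta = linear(w_beta, b_beta, lang_emb)
    return [
        [[g * v + b for v in row] for row in channel]
        for channel, g, b in zip(feature_map, gamma, beta)
    ]


# Toy inputs: 2 channels, a 1x2 spatial grid, and a 2-dim embedding.
feature_map = [[[1.0, 2.0]], [[3.0, 4.0]]]
lang_emb = [1.0, 0.0]
w_gamma, b_gamma = [[2.0, 0.0], [0.0, 1.0]], [0.0, 1.0]  # gamma = [2.0, 1.0]
w_beta, b_beta = [[0.0, 0.0], [1.0, 0.0]], [0.5, 0.0]    # beta  = [0.5, 1.0]

out = film(feature_map, lang_emb, w_gamma, b_gamma, w_beta, b_beta)
print(out)  # [[[2.5, 4.5]], [[4.0, 5.0]]]
```

Because gamma and beta depend on the instruction, the same image can produce different visual features for different tasks.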