# Language-conditioned Policy Learning
This tutorial shows how to set up language-conditioned policy learning in robomimic.
Note: Make sure you understand how to launch training runs and view results first!
Before training a language-conditioned policy, we recommend reading the following tutorials:
- [how to launch training runs](./configs.html)
- [how to view training results](./viewing_results.html)
- [how to launch multiple training runs efficiently](./hyperparam_scan.html)
## 1. Creating a Dataset Config
To create a dataset config with language conditioning, add a `lang` key to the dataset's entry in the config. Its value is the language annotation applied to all demos in that dataset.
Example:
```json
{
    ...
    "train": {
        "data": [
            {
                "path": "path/to/dataset.hdf5",
                "lang": "language instruction for your task"
            },
            ...
        ],
        ...
    },
    ...
}
```
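If you assemble configs programmatically, the same structure can be built as a Python dictionary and serialized to JSON. The following is a minimal sketch using only the standard library; the helper name `make_dataset_entry` is hypothetical and not part of robomimic, and the path and instruction are the placeholders from the example above.

```python
import json


def make_dataset_entry(path, lang):
    """Build one entry of config["train"]["data"] (hypothetical helper, not a robomimic API)."""
    if not lang.strip():
        raise ValueError("the language instruction must be a non-empty string")
    return {"path": path, "lang": lang}


# Placeholder values; replace them with your own dataset path and instruction.
config_fragment = {
    "train": {
        "data": [
            make_dataset_entry(
                "path/to/dataset.hdf5",
                "language instruction for your task",
            ),
        ]
    }
}

# Print the fragment so it can be merged into a full training config.
print(json.dumps(config_fragment, indent=4))
```

Validating the instruction at construction time is a design choice of this sketch: it catches a dataset that was left without an annotation before a training run is launched.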
## 2. Conditioning Policies on Language Embeddings
We support CLIP embeddings for encoding language. The pre-defined key for language embeddings is `lang_emb` (specified in `robomimic/utils/lang_utils.py`). You can condition your policy on `lang_emb` in two ways:
1. As feature input to action head
2. [FiLM](https://arxiv.org/pdf/1709.07871) over vision encoder
### Feature input to action head
This option concatenates the language embedding with the other low-dim observations that are input to the policy.
Example:
```json
{
    ...
    "observation": {
        "modalities": {
            "obs": {
                "low_dim": [
                    "robot0_eef_pos",
                    "robot0_eef_quat",
                    "lang_emb"
                ],
                ...
            }
        },
        ...
    },
    ...
}
```
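Conceptually, listing `lang_emb` as a low-dim key means the policy receives a single flat vector that includes the embedding. The sketch below illustrates that concatenation with NumPy. It is not robomimic's internal code: the embedding size `LANG_EMB_DIM = 768` is an assumption that depends on the CLIP model in use, the observation values are dummies, and the key ordering is only illustrative.

```python
import numpy as np

LANG_EMB_DIM = 768  # assumed size; the real value depends on the CLIP model

# Dummy observations for a single timestep.
obs = {
    "robot0_eef_pos": np.zeros(3, dtype=np.float32),
    "robot0_eef_quat": np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32),
    "lang_emb": np.ones(LANG_EMB_DIM, dtype=np.float32),
}

# Same keys as in the "low_dim" list of the config above.
low_dim_keys = ["robot0_eef_pos", "robot0_eef_quat", "lang_emb"]

# Concatenate all low-dim features into one flat input vector.
policy_input = np.concatenate([obs[k] for k in low_dim_keys], axis=-1)
print(policy_input.shape)
```

With these assumed sizes the input vector has 3 + 4 + `LANG_EMB_DIM` entries, so the language embedding dominates the low-dim input.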
### FiLM over vision encoder
This option conditions the ResNet18 visual encoder on `lang_emb` using FiLM (see [paper](https://arxiv.org/pdf/1709.07871)).
Example:
```json
{
    ...
    "observation": {
        "encoder": {
            "rgb": {
                "core_class": "VisualCoreLanguageConditioned",
                "core_kwargs": {
                    "feature_dimension": 64,
                    "flatten": true,
                    "backbone_class": "ResNet18ConvFiLM",
                    "backbone_kwargs": {
                        "pretrained": false,
                        "input_coord_conv": false
                    },
                    "pool_class": null,
                    "pool_kwargs": {}
                },
                "obs_randomizer_class": null,
                "obs_randomizer_kwargs": {}
            },
            ...
        }
    }
}
```
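For intuition, FiLM predicts a per-channel scale (gamma) and shift (beta) from the conditioning vector and applies them to a convolutional feature map. The following NumPy sketch shows that operation only; it is not the `ResNet18ConvFiLM` implementation. The dimensions, the single linear projection, and the names `proj_weight`, `proj_bias`, and `film` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

LANG_EMB_DIM = 768  # assumed embedding size
NUM_CHANNELS = 64   # assumed channel count of one conv feature map

# Hypothetical linear layer mapping the embedding to [gamma, beta].
proj_weight = rng.normal(scale=0.01, size=(LANG_EMB_DIM, 2 * NUM_CHANNELS))
proj_bias = np.zeros(2 * NUM_CHANNELS)


def film(features, lang_emb):
    """Apply FiLM to features of shape (C, H, W) given lang_emb of shape (D,)."""
    gamma_beta = lang_emb @ proj_weight + proj_bias
    gamma, beta = np.split(gamma_beta, 2)  # each has shape (C,)
    # Broadcast the per-channel scale and shift over the spatial dimensions.
    return gamma[:, None, None] * features + beta[:, None, None]


features = rng.normal(size=(NUM_CHANNELS, 8, 8))
lang_emb = rng.normal(size=LANG_EMB_DIM)
out = film(features, lang_emb)
print(out.shape)
```

Because the modulation is per channel, the output keeps the shape of the input feature map, which is why FiLM can be inserted between existing convolutional blocks without changing the rest of the encoder.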