robomimic.models package#

Submodules#

robomimic.models.base_nets module#

Contains torch Modules that correspond to basic network building blocks, like MLP, RNN, and CNN backbones.

class robomimic.models.base_nets.Conv1dBase(input_channel=1, activation='relu', **conv_kwargs)#

Bases: robomimic.models.base_nets.Module

Base class for stacked Conv1d layers.

Parameters

input_channel (int) – Number of channels for inputs to this network
activation (None or str) – Per-layer activation to use. Defaults to “relu”. Valid options are currently “relu”, or None for no activation
conv_kwargs (dict) –
Specific nn.Conv1d args to use, in list form, where the ith element corresponds to the argument to be passed to the ith Conv1d layer. See https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html for specific possible arguments.

e.g.: common values to use:
out_channels (list of int): Output channel size for each sequential Conv1d layer
kernel_size (list of int): Kernel sizes for each sequential Conv1d layer
stride (list of int): Stride sizes for each sequential Conv1d layer
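The list form of conv_kwargs can be illustrated with a small shape calculation. The sketch below is plain Python and does not call robomimic; the helper names and the example values are hypothetical, and the per-layer length formula is the one given in the torch.nn.Conv1d documentation.

```python
def conv1d_out_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a single Conv1d layer (formula from the torch.nn.Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def stacked_conv1d_output_shape(input_shape, conv_kwargs):
    """Shape after a stack of Conv1d layers.

    input_shape is [channels, length] without the batch dimension.
    conv_kwargs maps each argument name to a list with one entry per layer.
    """
    channels, length = input_shape
    num_layers = len(conv_kwargs["out_channels"])
    for i in range(num_layers):
        # Pick the ith element of every list to get the kwargs of layer i.
        layer_kwargs = {name: values[i] for name, values in conv_kwargs.items()}
        channels = layer_kwargs.pop("out_channels")
        length = conv1d_out_length(length, **layer_kwargs)
    return [channels, length]


# Illustrative values only: three layers, described element by element.
conv_kwargs = {
    "out_channels": [32, 64, 64],
    "kernel_size": [8, 4, 2],
    "stride": [4, 2, 1],
}
shape = stacked_conv1d_output_shape([1, 120], conv_kwargs)
print(shape)
```

Each list must have the same length, since the ith entries together describe the ith layer.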

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.ConvBase#

Bases: robomimic.models.base_nets.Module

Base class for ConvNets.

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.CoordConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', coord_encoding='position')#

Bases: torch.nn.modules.conv.Conv2d, robomimic.models.base_nets.Module

2D Coordinate Convolution

Source: An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution https://arxiv.org/abs/1807.03247 (e.g. adds 2 channels per input feature map corresponding to (x, y) location on map)
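As a rough illustration of the coordinate channels, the following plain-Python sketch appends an x map and a y map to a list of feature maps. The normalization of coordinates to [-1, 1] and the helper names are assumptions made for this example; the robomimic class may encode positions differently.

```python
def coord_axis(n):
    """n evenly spaced coordinates in [-1, 1] (assumed normalization)."""
    if n == 1:
        return [0.0]
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]


def coord_channels(height, width):
    """Return (x_map, y_map), each a height x width nested list."""
    xs, ys = coord_axis(width), coord_axis(height)
    x_map = [list(xs) for _ in ys]        # x varies along each row
    y_map = [[y for _ in xs] for y in ys]  # y is constant along each row
    return x_map, y_map


def add_coord_channels(feature_maps):
    """feature_maps: list of C maps, each H x W. Returns C + 2 maps."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    x_map, y_map = coord_channels(height, width)
    return feature_maps + [x_map, y_map]


image = [[[0.0] * 3 for _ in range(2)]]  # one channel, H=2, W=3
augmented = add_coord_channels(image)
print(len(augmented))
```

A regular convolution applied to the augmented input can then condition on where in the image each pixel lies.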

bias: Optional[torch.Tensor]#

dilation: Tuple[int, ...]#

forward(input)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

groups: int#

kernel_size: Tuple[int, ...]#

out_channels: int#

output_padding: Tuple[int, ...]#

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

padding: Tuple[int, ...]#

padding_mode: str#

stride: Tuple[int, ...]#

transposed: bool#

weight: torch.Tensor#

class robomimic.models.base_nets.CropRandomizer(input_shape, crop_height, crop_width, num_crops=1, pos_enc=False)#

Bases: robomimic.models.base_nets.Randomizer

Randomly sample crops at input, and then average across crop features at output.

forward_in(inputs)#: Samples N random crops for each input in the batch, and then reshapes inputs to [B * N, …].

forward_out(inputs)#: Splits the outputs from shape [B * N, …] -> [B, N, …] and then average across N to result in shape [B, …] to make sure the network output is consistent with what would have happened if there were no randomization.
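The reshaping described for forward_in and forward_out can be sketched with nested lists instead of tensors. This is a simplified stand-in, not the robomimic implementation: images are single-channel, the helper names are hypothetical, and the "encoder" is a placeholder that sums each crop.

```python
import random


def sample_crops(image, crop_h, crop_w, num_crops, rng):
    """image: H x W nested list. Returns num_crops random crops."""
    height, width = len(image), len(image[0])
    crops = []
    for _ in range(num_crops):
        top = rng.randint(0, height - crop_h)   # inclusive bounds
        left = rng.randint(0, width - crop_w)
        crops.append([row[left:left + crop_w] for row in image[top:top + crop_h]])
    return crops


def forward_in(batch, crop_h, crop_w, num_crops, rng):
    """[B, H, W] -> [B * N, crop_h, crop_w]."""
    out = []
    for image in batch:
        out.extend(sample_crops(image, crop_h, crop_w, num_crops, rng))
    return out


def forward_out(features, num_crops):
    """[B * N, D] -> [B, N, D] -> mean over N -> [B, D]."""
    batch_size = len(features) // num_crops
    pooled = []
    for b in range(batch_size):
        group = features[b * num_crops:(b + 1) * num_crops]
        pooled.append([sum(column) / num_crops for column in zip(*group)])
    return pooled


rng = random.Random(0)
batch = [[[float(r * 4 + c) for c in range(4)] for r in range(4)] for _ in range(2)]
crops = forward_in(batch, 3, 3, num_crops=5, rng=rng)
features = [[sum(map(sum, crop))] for crop in crops]  # placeholder encoder, D = 1
pooled = forward_out(features, num_crops=5)
print(len(crops), len(pooled))
```

Because the crops of one input stay adjacent in the [B * N, …] layout, averaging consecutive groups of N recovers one feature vector per original input.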

output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.EncoderCore(input_shape)#

Bases: robomimic.models.base_nets.Module

Abstract class used to categorize all cores used to encode observations

training: bool#

class robomimic.models.base_nets.FeatureAggregator(dim=1, agg_type='avg')#

Bases: robomimic.models.base_nets.Module

Helpful class for aggregating features across a dimension. This is useful in practice when training models that break an input image up into several patches, since features can be extracted per-patch using the same encoder and then aggregated using this module.

clear_weight()#

forward(x)#: Forward pooling pass.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

set_weight(w)#

training: bool#

class robomimic.models.base_nets.MLP(input_dim, output_dim, layer_dims=(), layer_func=<class 'torch.nn.modules.linear.Linear'>, layer_func_kwargs=None, activation=<class 'torch.nn.modules.activation.ReLU'>, dropouts=None, normalization=False, output_activation=None)#

Bases: robomimic.models.base_nets.Module

Base class for simple Multi-Layer Perceptrons.

forward(inputs)#: Forward pass.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.Module#

Bases: torch.nn.modules.module.Module

Base class for networks. The only difference from torch.nn.Module is that it requires implementing @output_shape.

abstract output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])
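The output_shape contract can be sketched without torch. The classes below are hypothetical stand-ins (they are not the robomimic classes) that show how an abstract output_shape lets a container compute its final shape by chaining the shapes of its parts.

```python
from abc import ABC, abstractmethod
from functools import reduce
from operator import mul


class ShapeModule(ABC):
    """Hypothetical stand-in for a base class that requires output_shape."""

    @abstractmethod
    def output_shape(self, input_shape=None):
        """Return the output shape (list of int), excluding the batch dimension."""


class FlattenShape(ShapeModule):
    def output_shape(self, input_shape=None):
        return [reduce(mul, input_shape, 1)]


class LinearShape(ShapeModule):
    """Output size is fixed, so input_shape is not needed."""

    def __init__(self, out_dim):
        self.out_dim = out_dim

    def output_shape(self, input_shape=None):
        return [self.out_dim]


class ChainShape(ShapeModule):
    """Feeds each module's output shape into the next module."""

    def __init__(self, *modules):
        self.modules = modules

    def output_shape(self, input_shape=None):
        shape = input_shape
        for module in self.modules:
            shape = module.output_shape(shape)
        return shape


net = ChainShape(FlattenShape(), LinearShape(64))
print(net.output_shape([3, 84, 84]))
```

Subclasses that forget to implement output_shape cannot be instantiated, which is the point of marking the method abstract.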

training: bool#

class robomimic.models.base_nets.Parameter(init_tensor)#

Bases: robomimic.models.base_nets.Module

A class that is a thin wrapper around a torch.nn.Parameter to make for easy saving and optimization.

forward(inputs=None)#: Forward call just returns the parameter tensor.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.RNN_Base(input_dim, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, per_step_net=None)#

Bases: robomimic.models.base_nets.Module

A wrapper class for a multi-step RNN and a per-step network.

forward(inputs, rnn_init_state=None, return_state=False)#

Forward a sequence of inputs through the RNN and the per-step network.

Parameters

inputs (torch.Tensor) – tensor input of shape [B, T, D], where D is the RNN input size
rnn_init_state – rnn hidden state, initialize to zero state if set to None
return_state (bool) – whether to return hidden state

Returns

outputs of the per_step_net

rnn_state: return rnn state at the end if return_state is set to True

Return type

outputs

forward_step(inputs, rnn_state)#

Forward a single step input through the RNN and per-step network, and return the new hidden state.

Parameters

inputs (torch.Tensor) – tensor input of shape [B, D], where D is the RNN input size
rnn_state – rnn hidden state, initialize to zero state if set to None

Returns

outputs of the per_step_net

rnn_state: return the new rnn state

Return type

outputs

get_rnn_init_state(batch_size, device)#

Get a default RNN state (zeros)

Parameters

batch_size (int) – batch size dimension
device – device the hidden state should be sent to.

Returns

hidden state tensor or tuple of hidden state tensors, depending on the RNN type

Return type

hidden_state (torch.Tensor or tuple)

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

property rnn_type#

training: bool#

class robomimic.models.base_nets.Randomizer#

Bases: robomimic.models.base_nets.Module

Base class for randomizer networks. Each randomizer should implement the @output_shape_in, @output_shape_out, @forward_in, and @forward_out methods. The randomizer’s @forward_in method is invoked on raw inputs, and @forward_out is invoked on processed inputs (usually processed by a @VisualCore instance). Note that the self.training property can be used to change the randomizer’s behavior at train vs. test time.

abstract forward_in(inputs)#: Randomize raw inputs.

abstract forward_out(inputs)#: Processing for network outputs.

output_shape(input_shape=None)#: This function is unused. See @output_shape_in and @output_shape_out.

abstract output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

abstract output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.ResNet18Conv(input_channel=3, pretrained=False, input_coord_conv=False)#

Bases: robomimic.models.base_nets.ConvBase

A ResNet18 block that can be used to process input images.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.ScanCore(input_shape, conv_kwargs, conv_activation='relu', pool_class=None, pool_kwargs=None, flatten=True, feature_dimension=None)#

Bases: robomimic.models.base_nets.EncoderCore, robomimic.models.base_nets.ConvBase

A network block that combines a Conv1D backbone network with optional pooling and linear layers.

forward(inputs)#: Forward pass through visual core.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.Sequential(*args)#

Bases: torch.nn.modules.container.Sequential, robomimic.models.base_nets.Module

Compose multiple Modules together (defined above).

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.ShallowConv(input_channel=3, output_channel=32)#

Bases: robomimic.models.base_nets.ConvBase

A shallow convolutional encoder from https://rll.berkeley.edu/dsae/dsae.pdf

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.SpatialMeanPool(input_shape)#

Bases: robomimic.models.base_nets.Module

Module that averages inputs across all spatial dimensions (dimension 2 and after), leaving only the batch and channel dimensions.

forward(inputs)#: Forward pass - average across all dimensions except batch and channel.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.SpatialSoftmax(input_shape, num_kp=None, temperature=1.0, learnable_temperature=False, output_variance=False, noise_std=0.0)#

Bases: robomimic.models.base_nets.ConvBase

Spatial Softmax Layer.

Based on Deep Spatial Autoencoders for Visuomotor Learning by Finn et al. https://rll.berkeley.edu/dsae/dsae.pdf

forward(feature)#

Forward pass through spatial softmax layer. For each keypoint, a 2D spatial probability distribution is created using a softmax, where the support is the pixel locations. This distribution is used to compute the expected value of the pixel location, which becomes a keypoint of dimension 2. K such keypoints are created.

Returns

mean keypoints of shape [B, K, 2], and possibly keypoint variance of shape [B, K, 2, 2] corresponding to the covariance under the 2D spatial softmax distribution

Return type

out (torch.Tensor or tuple)
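The expected-location computation can be sketched for a single feature map in plain Python. Pixel coordinates normalized to [-1, 1] and the function name are assumptions for this example; the map must be at least 2 x 2.

```python
import math


def spatial_softmax_keypoint(feature_map, temperature=1.0):
    """Return the expected (x, y) location under a softmax over all pixels.

    feature_map: H x W nested list of activations for one keypoint channel.
    """
    height, width = len(feature_map), len(feature_map[0])
    xs = [-1.0 + 2.0 * j / (width - 1) for j in range(width)]
    ys = [-1.0 + 2.0 * i / (height - 1) for i in range(height)]

    # Softmax over the flattened map (subtract the max for numerical stability).
    flat = [v / temperature for row in feature_map for v in row]
    top = max(flat)
    exps = [math.exp(v - top) for v in flat]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Expected pixel location under that distribution.
    expected_x = sum(p * xs[k % width] for k, p in enumerate(probs))
    expected_y = sum(p * ys[k // width] for k, p in enumerate(probs))
    return expected_x, expected_y


uniform = [[0.0] * 3 for _ in range(3)]
peaked = [[0.0, 0.0, 50.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
center = spatial_softmax_keypoint(uniform)
corner = spatial_softmax_keypoint(peaked)
print(center, corner)
```

A flat map yields the image center, while a sharply peaked map yields a keypoint close to the peak. Repeating this for K channels gives K keypoints per input.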

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.Squeeze(dim)#

Bases: robomimic.models.base_nets.Module

Trivial class that squeezes the input. Useful for including in a nn.Sequential network

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.Unsqueeze(dim)#

Bases: robomimic.models.base_nets.Module

Trivial class that unsqueezes the input. Useful for including in a nn.Sequential network

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

class robomimic.models.base_nets.VisualCore(input_shape, backbone_class, backbone_kwargs, pool_class=None, pool_kwargs=None, flatten=True, feature_dimension=None)#

Bases: robomimic.models.base_nets.EncoderCore, robomimic.models.base_nets.ConvBase

A network block that combines a visual backbone network with optional pooling and linear layers.

forward(inputs)#: Forward pass through visual core.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

robomimic.models.base_nets.rnn_args_from_config(rnn_config)#: Takes a Config object corresponding to RNN settings (for example config.algo.rnn in BCConfig) and extracts rnn kwargs for instantiating rnn networks.

robomimic.models.distributions module#

Contains distribution models used as parts of other networks. These classes usually inherit or emulate torch distributions.

class robomimic.models.distributions.DiscreteValueDistribution(values, probs=None, logits=None)#

Bases: object

Extension to torch categorical probability distribution in order to keep track of the support (categorical values, or in this case, value atoms). This is used for distributional value networks.

property logits#

mean()#: Categorical distribution mean, taking the value support into account.

property probs#

sample(sample_shape=torch.Size([]))#: Sample from the distribution. Make sure to return value atoms, not categorical class indices.

property values#

variance()#: Categorical distribution variance, taking the value support into account.
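What "taking the value support into account" means can be shown with plain Python lists. This is a sketch of the underlying arithmetic, with hypothetical function names, not the class's actual code.

```python
import random


def categorical_mean(values, probs):
    """Mean over value atoms: sum_i p_i * v_i."""
    return sum(p * v for v, p in zip(values, probs))


def categorical_variance(values, probs):
    """Variance over value atoms: sum_i p_i * (v_i - mean)^2."""
    mean = categorical_mean(values, probs)
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))


def sample_value(values, probs, rng):
    """Draw a value atom (not a class index) according to probs."""
    return rng.choices(values, weights=probs, k=1)[0]


values = [0.0, 5.0, 10.0]   # value atoms (the support)
probs = [0.25, 0.5, 0.25]   # categorical probabilities
mean = categorical_mean(values, probs)
variance = categorical_variance(values, probs)
draw = sample_value(values, probs, random.Random(0))
print(mean, variance)
```

A plain categorical distribution would report statistics over the indices 0, 1, 2; here they are computed over the atoms 0.0, 5.0, 10.0.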

class robomimic.models.distributions.TanhWrappedDistribution(base_dist, scale=1.0, epsilon=1e-06)#

Bases: torch.distributions.distribution.Distribution

Class that wraps another valid torch distribution, such that sampled values from the base distribution are passed through a tanh layer. The corresponding (log) probabilities are also modified accordingly. Tanh Normal distribution - adapted from rlkit and CQL codebase (https://github.com/aviralkumar2907/CQL/blob/d67dbe9cf5d2b96e3b462b6146f249b3d6569796/d4rl/rlkit/torch/distributions.py#L6).

log_prob(value, pre_tanh_value=None)#

Parameters

value (torch.Tensor) – some tensor to compute log probabilities for
pre_tanh_value – If specified, will not calculate atanh manually from @value. More numerically stable
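The log-probability correction for a tanh-squashed sample follows from the change-of-variables rule: for y = tanh(x), log p(y) = log p_base(x) - log(1 - y^2). The scalar sketch below assumes a Normal base distribution and scale = 1; the function names and the clipping used before atanh are choices made for this example.

```python
import math


def normal_log_prob(x, mean=0.0, std=1.0):
    """Log density of a univariate Normal."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def tanh_normal_log_prob(value, mean=0.0, std=1.0, pre_tanh_value=None, epsilon=1e-6):
    """Log density of value = tanh(x), with x ~ Normal(mean, std)."""
    if pre_tanh_value is None:
        # Recover x from value; clip so atanh stays finite near +/-1.
        clipped = min(max(value, -1.0 + epsilon), 1.0 - epsilon)
        pre_tanh_value = math.atanh(clipped)
    # Jacobian term of tanh; epsilon guards against log(0).
    correction = math.log(1.0 - value * value + epsilon)
    return normal_log_prob(pre_tanh_value, mean, std) - correction


x = 0.7
y = math.tanh(x)
with_pre_tanh = tanh_normal_log_prob(y, pre_tanh_value=x)
without_pre_tanh = tanh_normal_log_prob(y)
print(with_pre_tanh, without_pre_tanh)
```

Passing the pre-tanh value avoids inverting tanh, which loses precision as the value approaches -1 or 1.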

property mean#: Returns the mean of the distribution.

rsample(sample_shape=torch.Size([]), return_pretanh_value=False)#: Sampling in the reparameterization case - for differentiable samples.

sample(sample_shape=torch.Size([]), return_pretanh_value=False)#: Gradients will not, and should not, pass through this operation. See https://github.com/pytorch/pytorch/issues/4620 for discussion.

property stddev#: Returns the standard deviation of the distribution.

robomimic.models.obs_nets module#

Contains torch Modules that help deal with inputs consisting of multiple modalities. This is extremely common when networks must deal with one or more observation dictionaries, where each input dictionary can have observation keys of a certain modality and shape.

As an example, an observation could consist of a flat “robot0_eef_pos” observation key, and a 3-channel RGB “agentview_image” observation key.

class robomimic.models.obs_nets.MIMO_MLP(input_obs_group_shapes, output_shapes, layer_dims, layer_func=<class 'torch.nn.modules.linear.Linear'>, activation=<class 'torch.nn.modules.activation.ReLU'>, encoder_kwargs=None)#

Bases: robomimic.models.base_nets.Module

Extension to MLP to accept multiple observation dictionaries as input and to output dictionaries of tensors. Inputs are specified as a dictionary of observation dictionaries, with each key corresponding to an observation group.

This module utilizes @ObservationGroupEncoder to process the multiple input dictionaries and @ObservationDecoder to generate tensor dictionaries. The default behavior for encoding the inputs is to process visual inputs with a learned CNN and concatenate the flat encodings with the other flat inputs. The default behavior for generating outputs is to use a linear layer branch to produce each modality separately (including visual outputs).

forward(**inputs)#

Process each set of inputs in its own observation group.

Parameters

inputs (dict) – a dictionary of dictionaries with one dictionary per observation group. Each observation group’s dictionary should map modality to torch.Tensor batches. Should be consistent with @self.input_obs_group_shapes.

Returns

dictionary of output torch.Tensors that corresponds to @self.output_shapes

Return type

outputs (dict)

output_shape(input_shape=None)#: Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

training: bool#

class robomimic.models.obs_nets.ObservationDecoder(decode_shapes, input_feat_dim)#

Bases: robomimic.models.base_nets.Module

Module that can generate observation outputs by modality. Inputs are assumed to be flat (usually outputs from some hidden layer). Each observation output is generated with a linear layer from these flat inputs. Subclass this module in order to implement more complex schemes for generating each modality.

forward(feats)#: Predict each modality from input features, and reshape to each modality’s shape.

output_shape(input_shape=None)#: Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

training: bool#

class robomimic.models.obs_nets.ObservationEncoder(feature_activation=<class 'torch.nn.modules.activation.ReLU'>)#

Bases: robomimic.models.base_nets.Module

Module that processes inputs by observation key and then concatenates the processed observation keys together. Each key is processed with an encoder head network. Call @register_obs_key to register observation keys with the encoder and then finally call @make to create the encoder networks.

forward(obs_dict)#

Processes modalities according to the ordering in @self.obs_shapes. Each modality is processed with a randomizer (if present), an encoder network (if present), and again with the randomizer (if present), then flattened and concatenated with the other processed modalities.

Parameters: obs_dict (OrderedDict) – dictionary that maps modalities to torch.Tensor batches that agree with @self.obs_shapes. All modalities in @self.obs_shapes must be present, but additional modalities can also be present.
Returns: flat features of shape [B, D]
Return type: feats (torch.Tensor)
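The per-modality processing order can be sketched for a single unbatched observation using nested lists. The function and the dictionary layout for nets and randomizers are hypothetical; only the ordering of steps (randomizer in, net, randomizer out, flatten, concatenate) is taken from the description of forward.

```python
from collections import OrderedDict


def flatten(x):
    """Flatten arbitrarily nested lists or tuples into one flat list."""
    if isinstance(x, (list, tuple)):
        return [v for item in x for v in flatten(item)]
    return [x]


def encode(obs_dict, obs_shapes, nets, randomizers):
    """Process each registered modality in order and concatenate the results."""
    feats = []
    for key in obs_shapes:                 # ordering follows obs_shapes
        x = obs_dict[key]                  # extra keys in obs_dict are ignored
        randomizer = randomizers.get(key)
        net = nets.get(key)
        if randomizer is not None:
            x = randomizer["forward_in"](x)
        if net is not None:
            x = net(x)
        if randomizer is not None:
            x = randomizer["forward_out"](x)
        feats.extend(flatten(x))
    return feats


obs_shapes = OrderedDict([("robot0_eef_pos", (3,)), ("agentview_image", (1, 2, 2))])
nets = {"agentview_image": lambda image: [sum(flatten(image))]}  # placeholder encoder
obs = {
    "robot0_eef_pos": [0.1, 0.2, 0.3],
    "agentview_image": [[[1.0, 2.0], [3.0, 4.0]]],
    "unused_key": [9.0],
}
feats = encode(obs, obs_shapes, nets, randomizers={})
print(feats)
```

Keys without a net are flattened and concatenated directly, which matches the behavior described for net_class=None in register_obs_key.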

make()#: Creates the encoder networks and locks the encoder so that more modalities cannot be added.

output_shape(input_shape=None)#: Compute the output shape of the encoder.

register_obs_key(name, shape, net_class=None, net_kwargs=None, net=None, randomizer=None, share_net_from=None)#

Parameters

name (str) – modality name
shape (int tuple) – shape of modality
net_class (str) – name of class in base_nets.py that should be used to process this observation key before concatenation. Pass None to flatten and concatenate the observation key directly.
net_kwargs (dict) – arguments to pass to @net_class
net (Module instance) – if provided, use this Module to process the observation key instead of creating a different net
randomizer (Randomizer instance) – if provided, use this Module to augment observation keys coming in to the encoder, and possibly augment the processed output as well
share_net_from (str) – if provided, use the same instance of @net_class as another observation key. This observation key must already exist in this encoder. Warning: Note that this does not share the observation key randomizer

training: bool#

class robomimic.models.obs_nets.ObservationGroupEncoder(observation_group_shapes, feature_activation=<class 'torch.nn.modules.activation.ReLU'>, encoder_kwargs=None)#

Bases: robomimic.models.base_nets.Module

This class allows networks to encode multiple observation dictionaries into a single flat, concatenated vector representation. It does this by assigning each observation dictionary (observation group) an @ObservationEncoder object.

The class takes a dictionary of dictionaries, @observation_group_shapes. Each key corresponds to an observation group (e.g. ‘obs’, ‘subgoal’, ‘goal’) and each OrderedDict should be a map between modalities and expected input shapes (e.g. { ‘image’ : (3, 120, 160) }).

forward(**inputs)#

Process each set of inputs in its own observation group.

Parameters: inputs (dict) – dictionary that maps observation groups to observation dictionaries of torch.Tensor batches that agree with @self.observation_group_shapes. All observation groups in @self.observation_group_shapes must be present, but additional observation groups can also be present. Note that these are specified as kwargs for ease of use with networks that name each observation stream in their forward calls.
Returns: flat outputs of shape [B, D]
Return type: outputs (torch.Tensor)

output_shape()#: Compute the output shape of this encoder.

training: bool#

class robomimic.models.obs_nets.RNN_MIMO_MLP(input_obs_group_shapes, output_shapes, mlp_layer_dims, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, mlp_activation=<class 'torch.nn.modules.activation.ReLU'>, mlp_layer_func=<class 'torch.nn.modules.linear.Linear'>, per_step=True, encoder_kwargs=None)#

Bases: robomimic.models.base_nets.Module

A wrapper class for a multi-step RNN and a per-step MLP and a decoder.

Structure: [encoder -> rnn -> mlp -> decoder]

All temporal inputs are processed by a shared @ObservationGroupEncoder, followed by an RNN, and then a per-step multi-output MLP.

forward(rnn_init_state=None, return_state=False, **inputs)#

Parameters

inputs (dict) – a dictionary of dictionaries with one dictionary per observation group. Each observation group’s dictionary should map modality to torch.Tensor batches. Should be consistent with @self.input_obs_group_shapes. First two leading dimensions should be batch and time [B, T, …] for each tensor.
rnn_init_state – rnn hidden state, initialize to zero state if set to None
return_state (bool) – whether to return hidden state

Returns

dictionary of output torch.Tensors that corresponds to @self.output_shapes. Leading dimensions will be batch and time [B, T, …] for each tensor.

rnn_state (torch.Tensor or tuple): return the new rnn state (if @return_state)

Return type

outputs (dict)

forward_step(rnn_state, **inputs)#

Unroll network over a single timestep.

Parameters

inputs (dict) – expects same modalities as @self.input_shapes, with additional batch dimension (but NOT time), since this is a single time step.
rnn_state (torch.Tensor) – rnn hidden state

Returns

dictionary of output torch.Tensors that corresponds to @self.output_shapes. Does not contain time dimension.

rnn_state: return the new rnn state

Return type

outputs (dict)

get_rnn_init_state(batch_size, device)#

Get a default RNN state (zeros)

Parameters

batch_size (int) – batch size dimension
device – device the hidden state should be sent to.

Returns

hidden state tensor or tuple of hidden state tensors, depending on the RNN type

Return type

hidden_state (torch.Tensor or tuple)

output_shape(input_shape)#

Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

Parameters: input_shape (dict) – dictionary of dictionaries, where each top-level key corresponds to an observation group, and the low-level dictionaries specify the shape for each modality in an observation dictionary

training: bool#

robomimic.models.obs_nets.obs_encoder_factory(obs_shapes, feature_activation=<class 'torch.nn.modules.activation.ReLU'>, encoder_kwargs=None)#

Utility function to create an @ObservationEncoder from kwargs specified in config.

Parameters

obs_shapes (OrderedDict) – a dictionary that maps observation key to expected shapes for observations.
feature_activation – non-linearity to apply after each obs net - defaults to ReLU. Pass None to apply no activation.
encoder_kwargs (dict or None) –
If None, results in default encoder_kwargs being applied. Otherwise, should be a nested dictionary containing relevant per-modality information for encoder networks. Should be of form:

obs_modality1: dict
    feature_dimension: int
    core_class: str
    core_kwargs: dict
    obs_randomizer_class: str
    obs_randomizer_kwargs: dict
obs_modality2: dict
    …
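A concrete dictionary in this form might look like the sketch below. The modality names ("rgb", "low_dim"), all numeric values, and the contents of core_kwargs and obs_randomizer_kwargs are illustrative assumptions; the class names are taken from the signatures documented in base_nets, and passing them as strings follows the "str" types in the form above. The small checker is a hypothetical helper, not part of robomimic.

```python
EXPECTED_KEYS = {
    "feature_dimension",
    "core_class",
    "core_kwargs",
    "obs_randomizer_class",
    "obs_randomizer_kwargs",
}

encoder_kwargs = {
    # Hypothetical modality name for image observations.
    "rgb": {
        "feature_dimension": 64,
        "core_class": "VisualCore",
        "core_kwargs": {
            "backbone_class": "ResNet18Conv",
            "backbone_kwargs": {"pretrained": False, "input_coord_conv": False},
            "pool_class": "SpatialSoftmax",
            "pool_kwargs": {"num_kp": 32},
        },
        "obs_randomizer_class": "CropRandomizer",
        "obs_randomizer_kwargs": {"crop_height": 76, "crop_width": 76, "num_crops": 1},
    },
    # Hypothetical modality name for flat observations: no encoder, no randomizer.
    "low_dim": {
        "feature_dimension": None,
        "core_class": None,
        "core_kwargs": {},
        "obs_randomizer_class": None,
        "obs_randomizer_kwargs": {},
    },
}


def has_expected_form(kwargs):
    """Check that every modality entry carries exactly the expected keys."""
    return all(set(entry) == EXPECTED_KEYS for entry in kwargs.values())


print(has_expected_form(encoder_kwargs))
```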

robomimic.models.policy_nets module#

Contains torch Modules for policy networks. These networks take an observation dictionary as input (and possibly additional conditioning, such as subgoal or goal dictionaries) and produce action predictions, samples, or distributions as outputs. Note that actions are assumed to lie in [-1, 1], and most networks will have a final tanh activation to help ensure this range.

class robomimic.models.policy_nets.ActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.obs_nets.MIMO_MLP

A basic policy network that predicts actions from observations. Can optionally be goal conditioned on future observations.

forward(obs_dict, goal_dict=None)#

Process each set of inputs in its own observation group.

Parameters

Returns

dictionary of output torch.Tensors that corresponds to @self.output_shapes

Return type

outputs (dict)

output_shape(input_shape=None)#: Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

training: bool#

class robomimic.models.policy_nets.GMMActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, num_modes=5, min_std=0.01, std_activation='softplus', low_noise_eval=True, use_tanh=False, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.policy_nets.ActorNetwork

Variant of actor network that learns a multimodal Gaussian mixture distribution over actions.

forward(obs_dict, goal_dict=None)#

Samples actions from the policy distribution.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations

Returns

batch of actions from policy distribution

Return type

action (torch.Tensor)

forward_train(obs_dict, goal_dict=None)#

Return full GMM distribution, which is useful for computing quantities necessary at train-time, like log-likelihood, KL divergence, etc.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations

Returns

GMM distribution

Return type

dist (Distribution)
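To make the train-time use of the returned distribution concrete, here is a minimal, dependency-free sketch of the log-likelihood of one action under a diagonal Gaussian mixture. It is written in plain Python rather than torch, and the function names and argument layout are assumptions for illustration, not the library's API.

```python
import math


def diag_gaussian_log_prob(x, mean, std):
    """Log-density of x under a diagonal Gaussian with the given mean and std."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2.0 * math.pi)
        for xi, mi, si in zip(x, mean, std)
    )


def gmm_log_prob(x, weights, means, stds):
    """Log-density of x under a mixture of diagonal Gaussians.

    weights: mixture weights that sum to 1 (one per mode)
    means, stds: one list per mode, each with one entry per action dimension
    """
    per_mode = [
        math.log(w) + diag_gaussian_log_prob(x, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    # log-sum-exp over modes, shifted by the max for numerical stability
    top = max(per_mode)
    return top + math.log(sum(math.exp(v - top) for v in per_mode))
```

A behavioral-cloning style loss would then be the negative of this quantity averaged over a batch of demonstrated actions.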

training: bool#

class robomimic.models.policy_nets.GaussianActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, fixed_std=False, std_activation='softplus', init_last_fc_weight=None, init_std=0.3, mean_limits=(-9.0, 9.0), std_limits=(0.007, 7.5), low_noise_eval=True, use_tanh=False, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.policy_nets.ActorNetwork

Variant of actor network that learns a diagonal unimodal Gaussian distribution over actions.

forward(obs_dict, goal_dict=None)#

Samples actions from the policy distribution.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations

Returns

batch of actions from policy distribution

Return type

action (torch.Tensor)

forward_train(obs_dict, goal_dict=None)#

Return full Gaussian distribution, which is useful for computing quantities necessary at train-time, like log-likelihood, KL divergence, etc.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations

Returns

Gaussian distribution

Return type

dist (Distribution)

training: bool#

class robomimic.models.policy_nets.PerturbationActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, perturbation_scale=0.05, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.policy_nets.ActorNetwork

An action perturbation network - primarily used in BCQ. It takes states and actions and returns action perturbations.

forward(obs_dict, acts, goal_dict=None)#: Forward pass through perturbation actor.
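The following sketch shows one common way a bounded perturbation is combined with an input action in BCQ-style methods. It assumes the raw network output is squashed with tanh, scaled by perturbation_scale, added to the action, and clipped to [-1, 1]; whether each of these steps happens inside this class or in the calling algorithm is not stated here, so treat the details as assumptions.

```python
import math


def apply_perturbation(action, raw_output, perturbation_scale=0.05):
    """Add a bounded perturbation to each action dimension (assumed scheme)."""
    perturbed = []
    for a, r in zip(action, raw_output):
        # tanh keeps the perturbation within [-perturbation_scale, perturbation_scale]
        delta = perturbation_scale * math.tanh(r)
        # clip so the result stays in the assumed action range [-1, 1]
        perturbed.append(max(-1.0, min(1.0, a + delta)))
    return perturbed
```

With the default scale of 0.05, no action dimension can move by more than 0.05 from the action proposed by the generative model.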

training: bool#

class robomimic.models.policy_nets.RNNActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.obs_nets.RNN_MIMO_MLP

An RNN policy network that predicts actions from observations.

forward(obs_dict, goal_dict=None, rnn_init_state=None, return_state=False)#

Forward a sequence of inputs through the RNN and the per-step network.

Parameters

obs_dict (dict) – batch of observations - each tensor in the dictionary should have leading dimensions batch and time [B, T, …]
goal_dict (dict) – if not None, batch of goal observations
rnn_init_state – rnn hidden state, initialize to zero state if set to None
return_state (bool) – whether to return hidden state

Returns

predicted action sequence
rnn_state: RNN state at the end of the sequence, returned only if return_state is set to True

Return type

actions (torch.Tensor)

forward_step(obs_dict, goal_dict=None, rnn_state=None)#

Unroll RNN over single timestep to get actions.

Parameters

obs_dict (dict) – batch of observations. Should not contain time dimension.
goal_dict (dict) – if not None, batch of goal observations
rnn_state – rnn hidden state, initialize to zero state if set to None

Returns

batch of actions - does not contain time dimension
state: updated rnn state

Return type

actions (torch.Tensor)
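The single-step interface is typically used in a rollout loop that threads the recurrent state from one call to the next. The sketch below uses a hypothetical toy policy (ToyRecurrentPolicy, operating on scalars instead of observation dictionaries) purely to show that calling pattern; it is not the library's implementation.

```python
class ToyRecurrentPolicy:
    """Hypothetical stand-in that mimics the forward_step calling convention."""

    def forward_step(self, obs, rnn_state=None):
        # Treat None as the zero state, matching the parameter description above.
        if rnn_state is None:
            rnn_state = 0.0
        new_state = 0.5 * rnn_state + obs        # toy recurrence
        action = max(-1.0, min(1.0, new_state))  # toy bounded action
        return action, new_state


def rollout(policy, observations):
    """Step the policy through a sequence, carrying the state between calls."""
    rnn_state = None  # start each episode from the zero state
    actions = []
    for obs in observations:
        action, rnn_state = policy.forward_step(obs, rnn_state=rnn_state)
        actions.append(action)
    return actions, rnn_state


actions, final_state = rollout(ToyRecurrentPolicy(), [0.2, 0.2, 0.2])
```

Resetting rnn_state to None at the start of each episode avoids leaking state across episodes.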

output_shape(input_shape)#

Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

Parameters: input_shape (dict) – dictionary of dictionaries, where each top-level key corresponds to an observation group, and the low-level dictionaries specify the shape for each modality in an observation dictionary

training: bool#

class robomimic.models.policy_nets.RNNGMMActorNetwork(obs_shapes, ac_dim, mlp_layer_dims, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, num_modes=5, min_std=0.01, std_activation='softplus', low_noise_eval=True, use_tanh=False, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.policy_nets.RNNActorNetwork

An RNN GMM policy network that predicts sequences of action distributions from observation sequences.

forward(obs_dict, goal_dict=None, rnn_init_state=None, return_state=False)#

Samples actions from the policy distribution.

Parameters

obs_dict (dict) – batch of observations
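goal_dict (dict) – if not None, batch of goal observations
rnn_init_state – rnn hidden state, initialize to zero state if set to None
return_state (bool) – whether to return hidden state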

Returns

batch of actions from policy distribution

Return type

action (torch.Tensor)

forward_step(obs_dict, goal_dict=None, rnn_state=None)#

Unroll RNN over single timestep to get sampled actions.

Parameters

obs_dict (dict) – batch of observations. Should not contain time dimension.
goal_dict (dict) – if not None, batch of goal observations
rnn_state – rnn hidden state, initialize to zero state if set to None

Returns

batch of actions - does not contain time dimension
state: updated rnn state

Return type

acts (torch.Tensor)

forward_train(obs_dict, goal_dict=None, rnn_init_state=None, return_state=False)#

Return full GMM distribution, which is useful for computing quantities necessary at train-time, like log-likelihood, KL divergence, etc.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations
rnn_init_state – rnn hidden state, initialize to zero state if set to None
return_state (bool) – whether to return hidden state

Returns

sequence of GMM distributions over the timesteps
rnn_state: RNN state at the end of the sequence, returned only if return_state is set to True

Return type

dists (Distribution)

forward_train_step(obs_dict, goal_dict=None, rnn_state=None)#

Unroll RNN over single timestep to get action GMM distribution, which is useful for computing quantities necessary at train-time, like log-likelihood, KL divergence, etc.

Parameters

obs_dict (dict) – batch of observations. Should not contain time dimension.
goal_dict (dict) – if not None, batch of goal observations
rnn_state – rnn hidden state, initialize to zero state if set to None

Returns

GMM action distributions
state: updated rnn state

Return type

ad (Distribution)

training: bool#

class robomimic.models.policy_nets.VAEActor(obs_shapes, ac_dim, encoder_layer_dims, decoder_layer_dims, latent_dim, device, decoder_is_conditioned=True, decoder_reconstruction_sum_across_elements=False, latent_clip=None, prior_learn=False, prior_is_conditioned=False, prior_layer_dims=(), prior_use_gmm=False, prior_gmm_num_modes=10, prior_gmm_learn_weights=False, prior_use_categorical=False, prior_categorical_dim=10, prior_categorical_gumbel_softmax_hard=False, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.base_nets.Module

A VAE that models a distribution of actions conditioned on observations. The VAE prior and decoder are used at test-time as the policy.

decode(obs_dict=None, goal_dict=None, z=None, n=None)#

Thin wrapper around @VaeNets.VAE implementation.

Parameters

obs_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. Only needs to be provided if @decoder_is_conditioned or @z is None (since the prior will require it to generate z).
goal_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities.
z (torch.Tensor) – if provided, these latents are used to generate reconstructions from the VAE, and the prior is not sampled.
n (int) – this argument is used to specify the number of samples to generate from the prior. Only required if @z is None - i.e. sampling takes place

Returns

dictionary of reconstructed inputs (this will be a dictionary with a single “action” key)

Return type

recons (dict)

encode(actions, obs_dict, goal_dict=None)#

Parameters

actions (torch.Tensor) – a batch of actions
obs_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the observation modalities used for conditioning in either the decoder or the prior (or both).
goal_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities.

Returns

dictionary with the following keys:

mean (torch.Tensor): posterior encoder means

logvar (torch.Tensor): posterior encoder logvars

Return type

posterior params (dict)

forward(obs_dict, goal_dict=None, z=None)#

Samples actions from the policy distribution.

Parameters

obs_dict (dict) – batch of observations
goal_dict (dict) – if not None, batch of goal observations
z (torch.Tensor) – if not None, use the provided batch of latents instead of sampling from the prior

Returns

batch of actions from policy distribution

Return type

action (torch.Tensor)

forward_train(actions, obs_dict, goal_dict=None, freeze_encoder=False)#

A full pass through the VAE network used during training to construct KL and reconstruction losses. See @VAE class for more info.

Parameters

actions (torch.Tensor) – a batch of actions
obs_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the observation modalities used for conditioning in either the decoder or the prior (or both).
goal_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities.

Returns

a dictionary that contains the following outputs.

encoder_params (dict): parameters for the posterior distribution from the encoder forward pass

encoder_z (torch.Tensor): latents sampled from the encoder posterior

decoder_outputs (dict): action reconstructions from the decoder

kl_loss (torch.Tensor): KL loss over the batch of data

reconstruction_loss (torch.Tensor): reconstruction loss over the batch of data

Return type

vae_outputs (dict)

get_gumbel_temperature()#: Return current Gumbel-Softmax temperature. Should only be used if @prior_use_categorical is True.

output_shape(input_shape=None)#: This implementation is required by the Module superclass, but is unused since we never chain this module to other ones.

sample_prior(obs_dict=None, goal_dict=None, n=None)#

Thin wrapper around @VaeNets.VAE implementation.

Parameters

n (int) – this argument is used to specify the number of samples to generate from the prior.
obs_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. Only needs to be provided if @prior_is_conditioned.
goal_dict (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities.

Returns

latents sampled from the prior

Return type

z (torch.Tensor)

set_gumbel_temperature(temperature)#: Used by external algorithms to schedule Gumbel-Softmax temperature, which is used during reparametrization at train-time. Should only be used if @prior_use_categorical is True.
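One possible schedule an external algorithm might use is an exponential anneal from a high to a low temperature. The function below is a sketch under that assumption; its name, default values, and decay shape are hypothetical and not taken from the library.

```python
def gumbel_temperature(step, start=1.0, end=0.1, decay_steps=1000):
    """Exponentially anneal from start to end over decay_steps, then hold at end."""
    # Clamp the step so the schedule is flat before 0 and after decay_steps.
    frac = min(max(step, 0), decay_steps) / decay_steps
    return start * (end / start) ** frac
```

A training loop could pass gumbel_temperature(step) to set_gumbel_temperature once per update and read it back with get_gumbel_temperature for logging.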

training: bool#

robomimic.models.vae_nets module#

Contains an implementation of Variational Autoencoder (VAE) and other variants, including other priors, and RNN-VAEs.

class robomimic.models.vae_nets.CategoricalPrior(latent_dim, categorical_dim, device, learnable=False, obs_shapes=None, mlp_layer_dims=(), goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.vae_nets.Prior

A class that holds functionality for learning categorical priors for use in VAEs.

forward(batch_size, obs_dict=None, goal_dict=None)#

Computes prior logits (unnormalized log-probs).

Parameters

batch_size (int) – batch size - this is needed for parameters that are not obs-dependent, to make sure the leading dimension is correct for downstream sampling and loss computation purposes
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

dictionary containing prior parameters

Return type

prior_params (dict)

kl_loss(posterior_params, z=None, obs_dict=None, goal_dict=None)#

Computes KL divergence loss between the Categorical distribution given by the unnormalized logits @logits and the prior distribution.

Parameters

posterior_params (dict) – dictionary with key “logits” corresponding to torch.Tensor batch of unnormalized logits of shape [B, D * C] that corresponds to the posterior categorical distribution
z (torch.Tensor) – samples from encoder - unused for this prior
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

KL divergence loss

Return type

kl_loss (torch.Tensor)
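For intuition, the sketch below computes the KL divergence for a single sample whose flat logits of length D * C are treated as D independent categorical variables with C classes each, against a uniform prior. The contiguous grouping of the flat vector and the uniform prior are assumptions for the example; a learned or observation-conditioned prior would replace the log(C) term.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_kl_to_uniform(flat_logits, latent_dim, categorical_dim):
    """Sum over latent dimensions of KL(softmax(logits) || Uniform(C))."""
    assert len(flat_logits) == latent_dim * categorical_dim
    kl = 0.0
    for d in range(latent_dim):
        # Assumed layout: each latent dimension owns C contiguous logits.
        group = flat_logits[d * categorical_dim:(d + 1) * categorical_dim]
        probs = softmax(group)
        # KL(p || u) = sum_c p_c * (log p_c - log(1 / C))
        kl += sum(p * (math.log(p) + math.log(categorical_dim)) for p in probs if p > 0.0)
    return kl
```

The value is zero when the posterior is uniform and approaches D * log(C) as each group becomes one-hot.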

sample(n, obs_dict=None, goal_dict=None)#

Returns a batch of samples from the prior distribution.

Parameters

n (int) – this argument is used to specify the number of samples to generate from the prior.
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent. Leading dimension should be consistent with @n, the number of samples to generate.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

batch of sampled latent vectors.

Return type

z (torch.Tensor)

training: bool#

class robomimic.models.vae_nets.GaussianPrior(latent_dim, device, latent_clip=None, learnable=False, use_gmm=False, gmm_num_modes=10, gmm_learn_weights=False, obs_shapes=None, mlp_layer_dims=(), goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.vae_nets.Prior

A class that holds functionality for learning both unimodal Gaussian priors and multimodal Gaussian Mixture Model priors for use in VAEs.

forward(batch_size, obs_dict=None, goal_dict=None)#

Computes means, logvars, and GMM weights (if using GMM and learning weights).

Parameters

batch_size (int) – batch size - this is needed for parameters that are not obs-dependent, to make sure the leading dimension is correct for downstream sampling and loss computation purposes
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

dictionary containing prior parameters

Return type

prior_params (dict)

kl_loss(posterior_params, z=None, obs_dict=None, goal_dict=None)#

Computes sample-based KL divergence loss between the Gaussian distribution given by @mu, @logvar and the prior distribution.

Parameters

posterior_params (dict) – dictionary with keys “mu” and “logvar” corresponding to torch.Tensor batch of means and log-variances of posterior Gaussian distribution.
z (torch.Tensor) – samples from the Gaussian distribution parametrized by @mu and @logvar. Only needed if @self.use_gmm is True.
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

KL divergence loss

Return type

kl_loss (torch.Tensor)
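The two cases mentioned above can be sketched for a single sample as follows: a closed-form KL against a standard normal prior, and a one-sample estimate log q(z) - log p(z) of the kind needed when the prior has no closed-form KL (for example a GMM), which is why @z is required in that case. This is a plain-Python illustration with assumed function names, not the library's code.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one sample."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def diag_gaussian_log_prob(z, mu, logvar):
    """Log-density of z under a diagonal Gaussian given by mu and logvar."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi) + lv + (zi - m) ** 2 / math.exp(lv))
        for zi, m, lv in zip(z, mu, logvar)
    )


def sample_based_kl(z, mu, logvar, prior_log_prob):
    """One-sample estimate log q(z) - log p(z), with z drawn from the posterior."""
    return diag_gaussian_log_prob(z, mu, logvar) - prior_log_prob(z)
```

Here prior_log_prob is any callable returning the prior log-density of z, such as a mixture log-density for a GMM prior.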

sample(n, obs_dict=None, goal_dict=None)#

Returns a batch of samples from the prior distribution.

Parameters

n (int) – this argument is used to specify the number of samples to generate from the prior.
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent. Leading dimension should be consistent with @n, the number of samples to generate.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

batch of sampled latent vectors.

Return type

z (torch.Tensor)

training: bool#

class robomimic.models.vae_nets.Prior(param_shapes, param_obs_dependent, obs_shapes=None, mlp_layer_dims=(), goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.base_nets.Module

Base class for VAE priors. It is essentially a @MIMO_MLP network (it instantiates one), but it also supports additional methods such as KL loss computation and sampling, and it may learn prior parameters as observation-independent torch Parameters instead of observation-dependent mappings.

forward(batch_size, obs_dict=None, goal_dict=None)#

Computes prior parameters.

Parameters

batch_size (int) – batch size - this is needed for parameters that are not obs-dependent, to make sure the leading dimension is correct for downstream sampling and loss computation purposes
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

dictionary containing prior parameters

Return type

prior_params (dict)

kl_loss(posterior_params, z=None, obs_dict=None, goal_dict=None)#

Computes sample-based KL divergence loss between the Gaussian distribution given by @mu, @logvar and the prior distribution.

Parameters

posterior_params (dict) – dictionary with keys “mu” and “logvar” corresponding to torch.Tensor batch of means and log-variances of posterior Gaussian distribution.
z (torch.Tensor) – samples from the Gaussian distribution parametrized by @mu and @logvar. May not be needed depending on the prior.
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

KL divergence loss

Return type

kl_loss (torch.Tensor)

output_shape(input_shape=None)#: Returns output shape for this module, which is a dictionary instead of a list since outputs are dictionaries.

sample(n, obs_dict=None, goal_dict=None)#

Returns a batch of samples from the prior distribution.

Parameters

n (int) – this argument is used to specify the number of samples to generate from the prior.
obs_dict (dict) – inputs according to @obs_shapes. Only needs to be provided if any prior parameters are obs-dependent. Leading dimension should be consistent with @n, the number of samples to generate.
goal_dict (dict) – inputs according to @goal_shapes (only if using goal observations)

Returns

batch of sampled latent vectors.

Return type

z (torch.Tensor)

training: bool#

class robomimic.models.vae_nets.VAE(input_shapes, output_shapes, encoder_layer_dims, decoder_layer_dims, latent_dim, device, condition_shapes=None, decoder_is_conditioned=True, decoder_reconstruction_sum_across_elements=False, latent_clip=None, output_squash=(), output_scales=None, output_ranges=None, prior_learn=False, prior_is_conditioned=False, prior_layer_dims=(), prior_use_gmm=False, prior_gmm_num_modes=10, prior_gmm_learn_weights=False, prior_use_categorical=False, prior_categorical_dim=10, prior_categorical_gumbel_softmax_hard=False, goal_shapes=None, encoder_kwargs=None)#

Bases: torch.nn.modules.module.Module

A Variational Autoencoder (VAE), as described in https://arxiv.org/abs/1312.6114.

Models a distribution p(X) or a conditional distribution p(X | Y), where each variable can consist of multiple modalities. The target variable X, whose distribution is modeled, is specified through the @input_shapes argument, which is a map between modalities (strings) and expected shapes. In this way, a variable that consists of multiple kinds of data (e.g. image and flat-dimensional) can be modeled as well. A separate @output_shapes argument is used to specify the expected reconstructions - this allows for asymmetric reconstruction (for example, reconstructing low-resolution images).

This implementation supports learning conditional distributions as well (cVAE). The conditioning variable Y is specified through the @condition_shapes argument, which is also a map between modalities (strings) and expected shapes. In this way, variables with multiple kinds of data (e.g. image and flat-dimensional) can jointly be conditioned on. By default, the decoder takes the conditioning variable Y as input. To force the decoder to reconstruct from just the latent, set @decoder_is_conditioned to False (in this case, the prior must be conditioned).

The implementation also supports learning expressive priors instead of using the usual N(0, 1) prior. There are three kinds of priors supported - Gaussian, Gaussian Mixture Model (GMM), and Categorical. For each prior, the parameters can be learned as independent parameters, or be learned as functions of the conditioning variable Y (by setting @prior_is_conditioned).

decode(conditions=None, goals=None, z=None, n=None)#

Pass latents through decoder. Latents should be passed in to this function at train-time for backpropagation, but they can be left out at test-time. In this case, latents will be sampled using the VAE prior.

Parameters

conditions (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the modalities used for conditioning in either the decoder or the prior (or both). Only for cVAEs.
goals (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities. Only for cVAEs.
z (torch.Tensor) – if provided, these latents are used to generate reconstructions from the VAE, and the prior is not sampled.
n (int) – this argument is used to specify the number of samples to generate from the prior. Only required if @z is None - i.e. sampling takes place

Returns

dictionary of reconstructed inputs

Return type

recons (dict)

encode(inputs, conditions=None, goals=None)#

Parameters

inputs (dict) – a dictionary that maps input modalities to torch.Tensor batches. These should correspond to the encoder-only modalities (i.e. @self.encoder_only_shapes).
conditions (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the modalities used for conditioning in either the decoder or the prior (or both). Only for cVAEs.
goals (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities. Only for cVAEs.

Returns

dictionary with posterior parameters

Return type

posterior params (dict)

forward(inputs, outputs, conditions=None, goals=None, freeze_encoder=False)#

A full pass through the VAE network to construct KL and reconstruction losses.

Parameters

inputs (dict) – a dictionary that maps input modalities to torch.Tensor batches. These should correspond to the encoder-only modalities (i.e. @self.encoder_only_shapes).
outputs (dict) – a dictionary that maps output modalities to torch.Tensor batches. These should correspond to the modalities used for reconstruction (i.e. @self.output_shapes).
conditions (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the modalities used for conditioning in either the decoder or the prior (or both). Only for cVAEs.
goals (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities. Only for cVAEs.
freeze_encoder (bool) – if True, don’t backprop into encoder by detaching encoder outputs. Useful for doing staged VAE training.

Returns

a dictionary that contains the following outputs.

encoder_params (dict): parameters for the posterior distribution from the encoder forward pass

encoder_z (torch.Tensor): latents sampled from the encoder posterior

decoder_outputs (dict): reconstructions from the decoder

kl_loss (torch.Tensor): KL loss over the batch of data

reconstruction_loss (torch.Tensor): reconstruction loss over the batch of data

Return type

vae_outputs (dict)

get_gumbel_temperature()#: Return current Gumbel-Softmax temperature. Should only be used if @self.prior_use_categorical is True.

kl_loss(posterior_params, encoder_z=None, conditions=None, goals=None)#

Computes KL divergence loss given the results of the VAE encoder forward pass and the conditioning and goal modalities (if the prior is input-dependent).

Parameters

posterior_params (dict) – dictionary with keys “mu” and “logvar” corresponding to torch.Tensor batch of means and log-variances of posterior Gaussian distribution. This is the output of @self.encode.
encoder_z (torch.Tensor) – samples from the Gaussian distribution parametrized by @mu and @logvar. Only required if using a GMM prior.
conditions (dict) – inputs according to @self.condition_shapes. Only needs to be provided if any prior parameters are input-dependent.
goals (dict) – inputs according to @self.goal_shapes (only if using goal observations)

Returns

VAE KL divergence loss

Return type

kl_loss (torch.Tensor)

reconstruction_loss(reconstructions, targets)#

Reconstruction loss. Note that we compute the average per-dimension error in each modality and then average across all the modalities.

The beta term for weighting between reconstruction and kl losses will need to be tuned in practice for each situation (see https://twitter.com/memotv/status/973323454350090240 for more discussion).

Parameters

reconstructions (dict) – reconstructed inputs, consistent with @self.output_shapes
targets (dict) – reconstruction targets, consistent with @self.output_shapes

Returns

VAE reconstruction loss

Return type

reconstruction_loss (torch.Tensor)
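The averaging order described above, and the role of the beta weight, can be sketched for a single unbatched example as follows. Squared error is assumed as the per-dimension error, each modality is represented as a flat list of floats, and total_vae_loss is a hypothetical helper; none of these choices is confirmed by this page.

```python
def reconstruction_loss(reconstructions, targets):
    """Mean squared error per modality, then averaged across modalities."""
    per_modality = []
    for key, target in targets.items():
        recon = reconstructions[key]
        sq_err = [(r - t) ** 2 for r, t in zip(recon, target)]
        # average over the dimensions of this modality
        per_modality.append(sum(sq_err) / len(sq_err))
    # average over modalities so large modalities do not dominate
    return sum(per_modality) / len(per_modality)


def total_vae_loss(recon_loss, kl_loss, beta=1.0):
    """Weighted sum of reconstruction and KL terms (beta is tuned per task)."""
    return recon_loss + beta * kl_loss
```

Averaging within each modality first means a high-dimensional image and a low-dimensional vector contribute equally to the final loss.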

reparameterize(posterior_params)#

Parameters: posterior_params (dict) – dictionary from encoder forward pass that parametrizes the encoder distribution
Returns: sampled latents that are also differentiable
Return type: z (torch.Tensor)
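For a Gaussian posterior, the standard reparameterization trick draws z = mu + exp(0.5 * logvar) * eps with eps sampled from N(0, I), so z remains a differentiable function of mu and logvar. The plain-Python sketch below illustrates the sampling step only, for one sample; the function signature is an assumption, and the categorical case (which this page says uses Gumbel-Softmax) is not covered.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) per latent dimension."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # sigma = exp(0.5 * logvar), since logvar is the log of the variance
    return [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, logvar, eps)]


# Seeded generator so the draw is reproducible.
z = reparameterize([0.0, 1.0], [0.0, 0.0], rng=random.Random(0))
```

In an autodiff framework the noise eps is treated as a constant, so gradients flow through mu and logvar only.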

sample_prior(n, conditions=None, goals=None)#

Samples from the prior using the prior parameters.

Parameters

n (int) – this argument is used to specify the number of samples to generate from the prior.
conditions (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to the modalities used for conditioning in either the decoder or the prior (or both). Only for cVAEs.
goals (dict) – a dictionary that maps modalities to torch.Tensor batches. These should correspond to goal modalities. Only for cVAEs.

Returns

sampled latents from the prior

Return type

z (torch.Tensor)

set_gumbel_temperature(temperature)#: Used by external algorithms to schedule Gumbel-Softmax temperature, which is used during reparametrization at train-time. Should only be used if @self.prior_use_categorical is True.

training: bool#

robomimic.models.vae_nets.vae_args_from_config(vae_config)#: Generate a set of VAE args that are read from the VAE-specific part of a config (for example see config.algo.vae in BCConfig).

robomimic.models.value_nets module#

Contains torch Modules for value networks. These networks take an observation dictionary as input (and possibly additional conditioning, such as subgoal or goal dictionaries) and produce value or action-value estimates or distributions.

class robomimic.models.value_nets.ActionValueNetwork(obs_shapes, ac_dim, mlp_layer_dims, value_bounds=None, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.value_nets.ValueNetwork

A basic Q (action-value) network that predicts values from observations and actions. Can optionally be goal conditioned on future observations.

forward(obs_dict, acts, goal_dict=None)#: Modify forward from super class to include actions in inputs.

training: bool#

class robomimic.models.value_nets.DistributionalActionValueNetwork(obs_shapes, ac_dim, mlp_layer_dims, value_bounds, num_atoms, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.value_nets.ActionValueNetwork

Distributional Q (action-value) network that outputs a categorical distribution over a discrete grid of value atoms. See https://arxiv.org/pdf/1707.06887.pdf for more details.

forward(obs_dict, acts, goal_dict=None)#

Return mean of critic categorical distribution. Useful for obtaining point estimates of critic values.

Parameters

obs_dict (dict) – batch of observations
acts (torch.Tensor) – batch of actions
goal_dict (dict) – if not None, batch of goal observations

Returns

expectation of value distribution

Return type

mean_value (torch.Tensor)
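The point estimate returned here is the expectation of a categorical distribution over value atoms. The sketch below assumes the atoms are evenly spaced between the two entries of value_bounds and that the network emits one logit per atom; both are assumptions in the spirit of the referenced paper, and the helper names are hypothetical.

```python
import math


def value_atoms(value_bounds, num_atoms):
    """Evenly spaced support between the lower and upper value bounds."""
    low, high = value_bounds
    step = (high - low) / (num_atoms - 1)
    return [low + i * step for i in range(num_atoms)]


def expected_value(logits, atoms):
    """Mean of the categorical distribution softmax(logits) over the atoms."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return sum((e / total) * a for e, a in zip(exps, atoms))


atoms = value_atoms((0.0, 1.0), 5)
mean_value = expected_value([0.0] * 5, atoms)
```

With uniform logits the estimate sits at the midpoint of the value bounds.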

forward_train(obs_dict, acts, goal_dict=None)#

Return full critic categorical distribution.

Parameters

obs_dict (dict) – batch of observations
acts (torch.Tensor) – batch of actions
goal_dict (dict) – if not None, batch of goal observations

Returns

value_distribution (DiscreteValueDistribution instance)

training: bool#

class robomimic.models.value_nets.ValueNetwork(obs_shapes, mlp_layer_dims, value_bounds=None, goal_shapes=None, encoder_kwargs=None)#

Bases: robomimic.models.obs_nets.MIMO_MLP

A basic value network that predicts values from observations. Can optionally be goal conditioned on future observations.

forward(obs_dict, goal_dict=None)#: Forward through value network, and then optionally use tanh scaling.
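One plausible form of the tanh scaling is to squash the raw output with tanh and then rescale it linearly into value_bounds. The sketch below assumes value_bounds is a (low, high) pair and that no scaling is applied when it is None; the exact formula used by the library is not given on this page.

```python
import math


def scale_value(raw_value, value_bounds=None):
    """Map an unbounded scalar into [low, high] via tanh (assumed scheme)."""
    if value_bounds is None:
        return raw_value  # no bounds given: leave the output unchanged
    low, high = value_bounds
    scale = (high - low) / 2.0
    offset = (high + low) / 2.0
    # tanh lies in [-1, 1], so the result lies in [low, high]
    return offset + scale * math.tanh(raw_value)
```

Bounding the output this way keeps value estimates inside a known return range, which can help stabilize bootstrapped targets.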

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters: input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
Returns: list of integers corresponding to output shape
Return type: out_shape ([int])

training: bool#

robomimic 0.2.1 documentation

Module contents#