robomimic.models package#

Submodules#

robomimic.models.base_nets module#

Contains torch Modules that correspond to basic network building blocks, like MLP, RNN, and CNN backbones.

class robomimic.models.base_nets.Conv1dBase(input_channel=1, activation='relu', out_channels=(32, 64, 64), kernel_size=(8, 4, 2), stride=(4, 2, 1), **conv_kwargs)#

Bases: robomimic.models.base_nets.Module

Base class for stacked Conv1d layers.

Parameters:
  • input_channel (int) – Number of channels for inputs to this network

  • activation (None or str) – Per-layer activation to use. Defaults to “relu”. Valid options are currently {relu, None}, where None applies no activation

  • out_channels (list of int) – Output channel size for each sequential Conv1d layer

  • kernel_size (list of int) – Kernel sizes for each sequential Conv1d layer

  • stride (list of int) – Stride sizes for each sequential Conv1d layer

  • conv_kwargs (dict) – additional nn.Conv1d arguments to use, in list form, where the ith element corresponds to the argument to be passed to the ith Conv1d layer. See https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html for specific possible arguments.
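A minimal usage sketch (the channel counts, kernel sizes, and input length below are illustrative, not required values):

    import torch
    from robomimic.models.base_nets import Conv1dBase

    net = Conv1dBase(
        input_channel=1,
        activation="relu",
        out_channels=(32, 64, 64),
        kernel_size=(8, 4, 2),
        stride=(4, 2, 1),
    )
    x = torch.randn(8, 1, 50)                     # (batch, channels, length)
    y = net(x)                                    # forward through the stacked Conv1d layers
    print(net.output_shape(input_shape=[1, 50]))  # per-sample shape, e.g. [64, 3] for these settings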

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.ConvBase#

Bases: robomimic.models.base_nets.Module

Base class for ConvNets.

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.CoordConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', coord_encoding='position')#

Bases: torch.nn.modules.conv.Conv2d, robomimic.models.base_nets.Module

2D Coordinate Convolution

Source: An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution, https://arxiv.org/abs/1807.03247 (i.e. adds 2 channels per input feature map corresponding to the (x, y) location on the map)

bias: Optional[torch.Tensor]#
dilation: Tuple[int, ...]#
forward(input)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

groups: int#
kernel_size: Tuple[int, ...]#
out_channels: int#
output_padding: Tuple[int, ...]#
output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

padding: Tuple[int, ...]#
padding_mode: str#
stride: Tuple[int, ...]#
transposed: bool#
weight: torch.Tensor#
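A brief sketch of using the layer as a drop-in replacement for nn.Conv2d (sizes are illustrative); the two coordinate channels are appended internally, so in_channels refers to the raw input:

    import torch
    from robomimic.models.base_nets import CoordConv2d

    conv = CoordConv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
    out = conv(torch.randn(2, 3, 32, 32))  # -> (2, 16, 32, 32)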
class robomimic.models.base_nets.FeatureAggregator(dim=1, agg_type='avg')#

Bases: robomimic.models.base_nets.Module

Helpful class for aggregating features across a dimension. This is useful in practice when training models that break an input image up into several patches, since features can be extracted per-patch using the same encoder and then aggregated using this module.

clear_weight()#
forward(x)#

Forward pooling pass.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

set_weight(w)#
training: bool#
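A short sketch of aggregating per-patch features (the patch and feature sizes are illustrative):

    import torch
    from robomimic.models.base_nets import FeatureAggregator

    agg = FeatureAggregator(dim=1, agg_type="avg")
    x = torch.randn(8, 5, 64)  # (batch, num_patches, feature_dim)
    out = agg(x)               # mean over the patch dimension -> (8, 64)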
class robomimic.models.base_nets.FiLMLayer(lang_emb_dim, channels)#

Bases: robomimic.models.base_nets.ConvBase

Uses FiLM (Feature-wise Linear Modulation) to language-condition a conv net

forward(x, lang_emb)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
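A minimal sketch, assuming a 768-dim language embedding (e.g. from a pretrained sentence encoder) and a 64-channel feature map; all sizes are illustrative:

    import torch
    from robomimic.models.base_nets import FiLMLayer

    film = FiLMLayer(lang_emb_dim=768, channels=64)
    x = torch.randn(4, 64, 7, 7)   # conv feature map
    lang = torch.randn(4, 768)     # language embedding
    out = film(x, lang)            # per-channel scale and shift -> (4, 64, 7, 7)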
class robomimic.models.base_nets.MLP(input_dim, output_dim, layer_dims=(), layer_func=<class 'torch.nn.modules.linear.Linear'>, layer_func_kwargs=None, activation=<class 'torch.nn.modules.activation.ReLU'>, dropouts=None, normalization=False, output_activation=None)#

Bases: robomimic.models.base_nets.Module

Base class for simple Multi-Layer Perceptrons.

forward(inputs)#

Forward pass.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
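A minimal usage sketch (layer sizes are illustrative):

    import torch
    from robomimic.models.base_nets import MLP

    mlp = MLP(
        input_dim=10,
        output_dim=4,
        layer_dims=(64, 64),
        activation=torch.nn.ReLU,
        output_activation=None,  # final layer stays linear
    )
    out = mlp(torch.randn(16, 10))  # -> (16, 4)
    print(mlp.output_shape())       # [4]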
class robomimic.models.base_nets.MVPConv(input_channel=3, mvp_model_class='vitb-mae-egosoup', freeze=True)#

Bases: robomimic.models.base_nets.ConvBase

Base class for ConvNets pretrained with MVP (https://arxiv.org/abs/2203.06173)

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.Module#

Bases: torch.nn.modules.module.Module

Base class for networks. The only difference from torch.nn.Module is that it requires implementing @output_shape.

abstract output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
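A sketch of the subclassing contract; the Scale module below is hypothetical and exists only to illustrate that @output_shape is the single addition over torch.nn.Module:

    import torch
    from robomimic.models.base_nets import Module

    class Scale(Module):
        """Hypothetical module: elementwise scaling leaves the shape unchanged."""
        def __init__(self, factor=2.0):
            super().__init__()
            self.factor = factor

        def forward(self, inputs):
            return inputs * self.factor

        def output_shape(self, input_shape=None):
            return list(input_shape)  # shape passes through untouched

    net = Scale()
    assert net.output_shape([3, 224, 224]) == [3, 224, 224]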
class robomimic.models.base_nets.Parameter(init_tensor)#

Bases: robomimic.models.base_nets.Module

A class that is a thin wrapper around a torch.nn.Parameter to make for easy saving and optimization.

forward(inputs=None)#

Forward call just returns the parameter tensor.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.R3MConv(input_channel=3, r3m_model_class='resnet18', freeze=True)#

Bases: robomimic.models.base_nets.ConvBase

Base class for ConvNets pretrained with R3M (https://arxiv.org/abs/2203.12601)

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.RNN_Base(input_dim, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, per_step_net=None)#

Bases: robomimic.models.base_nets.Module

A wrapper class for a multi-step RNN and a per-step network.

forward(inputs, rnn_init_state=None, return_state=False)#

Forward a sequence of inputs through the RNN and the per-step network.

Parameters:
  • inputs (torch.Tensor) – tensor input of shape [B, T, D], where D is the RNN input size

  • rnn_init_state – rnn hidden state, initialize to zero state if set to None

  • return_state (bool) – whether to return hidden state

Returns:

outputs of the per_step_net

rnn_state: return rnn state at the end if return_state is set to True

Return type:

outputs

forward_step(inputs, rnn_state)#

Forward a single step input through the RNN and per-step network, and return the new hidden state.

Parameters:
  • inputs (torch.Tensor) – tensor input of shape [B, D], where D is the RNN input size

  • rnn_state – rnn hidden state, initialize to zero state if set to None

Returns:

outputs of the per_step_net

rnn_state: return the new rnn state

Return type:

outputs

get_rnn_init_state(batch_size, device)#

Get a default RNN state (zeros).

Parameters:
  • batch_size (int) – batch size dimension

  • device – device the hidden state should be sent to.

Returns:

returns hidden state tensor or tuple of hidden state tensors

depending on the RNN type

Return type:

hidden_state (torch.Tensor or tuple)

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

property rnn_type#
training: bool#
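A usage sketch chaining an LSTM with an MLP per-step head (all dimensions are illustrative):

    import torch
    from robomimic.models.base_nets import MLP, RNN_Base

    rnn = RNN_Base(
        input_dim=10,
        rnn_hidden_dim=32,
        rnn_num_layers=2,
        rnn_type="LSTM",
        per_step_net=MLP(input_dim=32, output_dim=4, layer_dims=(32,)),
    )
    x = torch.randn(8, 20, 10)  # [B, T, D]
    init_state = rnn.get_rnn_init_state(batch_size=8, device=x.device)
    outputs, rnn_state = rnn(x, rnn_init_state=init_state, return_state=True)
    # outputs: per-step net applied at every timestep -> (8, 20, 4)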
class robomimic.models.base_nets.ResNet18Conv(input_channel=3, pretrained=False, input_coord_conv=False)#

Bases: robomimic.models.base_nets.ConvBase

A ResNet18 block that can be used to process input images.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
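A minimal sketch (the 224x224 input size is illustrative; the backbone downsamples by a factor of 32):

    import torch
    from robomimic.models.base_nets import ResNet18Conv

    net = ResNet18Conv(input_channel=3, pretrained=False)
    feats = net(torch.randn(4, 3, 224, 224))  # -> (4, 512, 7, 7)
    print(net.output_shape([3, 224, 224]))    # [512, 7, 7]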
class robomimic.models.base_nets.ResNet18ConvFiLM(input_channel=3, pretrained=False, input_coord_conv=False, lang_emb_dim=768)#

Bases: robomimic.models.base_nets.ConvBase

A ResNet18 block that can be used to process input images and uses FiLM for language conditioning.

forward(inputs, lang_emb)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.ResNet50Conv(input_channel=3, pretrained=False, input_coord_conv=False)#

Bases: robomimic.models.base_nets.ConvBase

A ResNet50 block that can be used to process input images.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.Sequential(*args, has_output_shape=True)#

Bases: torch.nn.modules.container.Sequential, robomimic.models.base_nets.Module

Compose multiple Modules together (defined above).

freeze()#
output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

train(mode)#

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

training: bool#
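A short sketch of composition; because every member implements @output_shape, the composed shape can be computed without a forward pass (sizes are illustrative):

    import torch
    from robomimic.models.base_nets import MLP, Sequential, Unsqueeze

    net = Sequential(
        MLP(input_dim=10, output_dim=6, layer_dims=(32,)),
        Unsqueeze(dim=-1),
    )
    print(net.output_shape([10]))   # [6, 1]
    out = net(torch.randn(8, 10))   # -> (8, 6, 1)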
class robomimic.models.base_nets.ShallowConv(input_channel=3, output_channel=32)#

Bases: robomimic.models.base_nets.ConvBase

A shallow convolutional encoder from https://rll.berkeley.edu/dsae/dsae.pdf

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.SpatialMeanPool(input_shape)#

Bases: robomimic.models.base_nets.Module

Module that averages inputs across all spatial dimensions (dimension 2 and after), leaving only the batch and channel dimensions.

forward(inputs)#

Forward pass - average across all dimensions except batch and channel.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.SpatialSoftmax(input_shape, num_kp=32, temperature=1.0, learnable_temperature=False, output_variance=False, noise_std=0.0)#

Bases: robomimic.models.base_nets.ConvBase

Spatial Softmax Layer.

Based on Deep Spatial Autoencoders for Visuomotor Learning by Finn et al. https://rll.berkeley.edu/dsae/dsae.pdf

forward(feature)#

Forward pass through spatial softmax layer. For each keypoint, a 2D spatial probability distribution is created using a softmax, where the support is the pixel locations. This distribution is used to compute the expected value of the pixel location, which becomes a keypoint of dimension 2. K such keypoints are created.

Returns:

mean keypoints of shape [B, K, 2], and possibly

keypoint variance of shape [B, K, 2, 2] corresponding to the covariance under the 2D spatial softmax distribution

Return type:

out (torch.Tensor or tuple)

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
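A minimal sketch extracting keypoints from a conv feature map (channel and spatial sizes are illustrative):

    import torch
    from robomimic.models.base_nets import SpatialSoftmax

    ss = SpatialSoftmax(input_shape=[64, 7, 7], num_kp=32)
    kp = ss(torch.randn(4, 64, 7, 7))   # expected 2D keypoints -> (4, 32, 2)
    print(ss.output_shape([64, 7, 7]))  # [32, 2]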
class robomimic.models.base_nets.Squeeze(dim)#

Bases: robomimic.models.base_nets.Module

Trivial class that squeezes the input. Useful for including in an nn.Sequential network

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.base_nets.Unsqueeze(dim)#

Bases: robomimic.models.base_nets.Module

Trivial class that unsqueezes the input. Useful for including in an nn.Sequential network

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
robomimic.models.base_nets.rnn_args_from_config(rnn_config)#

Takes a Config object corresponding to RNN settings (for example config.algo.rnn in BCConfig) and extracts rnn kwargs for instantiating rnn networks.

robomimic.models.base_nets.transformer_args_from_config(transformer_config)#

Takes a Config object corresponding to Transformer settings (for example config.algo.transformer in BCConfig) and extracts transformer kwargs for instantiating transformer networks.

robomimic.models.diffusion_policy_nets module#

This file contains nets used for Diffusion Policy.

class robomimic.models.diffusion_policy_nets.ConditionalResidualBlock1D(in_channels, out_channels, cond_dim, kernel_size=3, n_groups=8)#

Bases: torch.nn.modules.module.Module

forward(x, cond)#

Parameters:
  • x – [ batch_size x in_channels x horizon ]

  • cond – [ batch_size x cond_dim ]

Returns:

out – [ batch_size x out_channels x horizon ]

training: bool#
class robomimic.models.diffusion_policy_nets.ConditionalUnet1D(input_dim, global_cond_dim, diffusion_step_embed_dim=256, down_dims=[256, 512, 1024], kernel_size=5, n_groups=8)#

Bases: torch.nn.modules.module.Module

forward(sample: torch.Tensor, timestep: Union[torch.Tensor, float, int], global_cond=None)#

Parameters:
  • sample – (B, T, input_dim)

  • timestep – (B,) tensor or int, diffusion step

  • global_cond – (B, global_cond_dim)

Returns:

output of shape (B, T, input_dim)

training: bool#
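A denoising-pass sketch (the action dimension, horizon, and conditioning size are illustrative; the horizon should be divisible by the U-Net's downsampling factor):

    import torch
    from robomimic.models.diffusion_policy_nets import ConditionalUnet1D

    net = ConditionalUnet1D(input_dim=7, global_cond_dim=128)
    sample = torch.randn(4, 16, 7)          # (B, T, input_dim) noisy action sequence
    timestep = torch.randint(0, 100, (4,))  # (B,) diffusion steps
    cond = torch.randn(4, 128)              # (B, global_cond_dim)
    noise_pred = net(sample, timestep, global_cond=cond)  # -> (4, 16, 7)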
class robomimic.models.diffusion_policy_nets.Conv1dBlock(inp_channels, out_channels, kernel_size, n_groups=8)#

Bases: torch.nn.modules.module.Module

Conv1d –> GroupNorm –> Mish

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool#
class robomimic.models.diffusion_policy_nets.Downsample1d(dim)#

Bases: torch.nn.modules.module.Module

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool#
class robomimic.models.diffusion_policy_nets.SinusoidalPosEmb(dim)#

Bases: torch.nn.modules.module.Module

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool#
class robomimic.models.diffusion_policy_nets.Upsample1d(dim)#

Bases: torch.nn.modules.module.Module

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool#

robomimic.models.distributions module#

Contains distribution models used as parts of other networks. These classes usually inherit or emulate torch distributions.

class robomimic.models.distributions.DiscreteValueDistribution(values, probs=None, logits=None)#

Bases: object

Extension to torch categorical probability distribution in order to keep track of the support (categorical values, or in this case, value atoms). This is used for distributional value networks.

property logits#
mean()#

Categorical distribution mean, taking the value support into account.

property probs#
sample(sample_shape=torch.Size([]))#

Sample from the distribution. Returns value atoms, not categorical class indices.

property values#
variance()#

Categorical distribution variance, taking the value support into account.

class robomimic.models.distributions.TanhWrappedDistribution(base_dist, scale=1.0, epsilon=1e-06)#

Bases: torch.distributions.distribution.Distribution

Class that wraps another valid torch distribution, such that sampled values from the base distribution are passed through a tanh layer. The corresponding (log) probabilities are also modified accordingly. Tanh Normal distribution - adapted from rlkit and CQL codebase (https://github.com/aviralkumar2907/CQL/blob/d67dbe9cf5d2b96e3b462b6146f249b3d6569796/d4rl/rlkit/torch/distributions.py#L6).

log_prob(value, pre_tanh_value=None)#
Parameters:
  • value (torch.Tensor) – some tensor to compute log probabilities for

  • pre_tanh_value – If specified, will not calculate atanh manually from @value. More numerically stable

property mean#

Returns the mean of the distribution.

rsample(sample_shape=torch.Size([]), return_pretanh_value=False)#

Sampling in the reparameterization case - for differentiable samples.

sample(sample_shape=torch.Size([]), return_pretanh_value=False)#

Gradients will not (and should not) pass through this operation. See https://github.com/pytorch/pytorch/issues/4620 for discussion.

property stddev#

Returns the standard deviation of the distribution.
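A minimal sketch wrapping a Normal, as used for bounded (tanh-squashed) action outputs; the 3-dim base distribution is illustrative:

    import torch
    import torch.distributions as D
    from robomimic.models.distributions import TanhWrappedDistribution

    base = D.Normal(loc=torch.zeros(3), scale=torch.ones(3))
    dist = TanhWrappedDistribution(base_dist=base, scale=1.0)
    a = dist.sample()      # samples lie in (-1, 1)
    lp = dist.log_prob(a)  # log-probability with the tanh change-of-variables correction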

robomimic.models.obs_core module#

Contains torch Modules for core observation processing blocks such as encoders (e.g. EncoderCore, VisualCore, ScanCore, …) and randomizers (e.g. Randomizer, CropRandomizer).

class robomimic.models.obs_core.ColorRandomizer(input_shape, brightness=0.3, contrast=0.3, saturation=0.3, hue=0.3, num_samples=1)#

Bases: robomimic.models.obs_core.Randomizer

Randomly sample color jitter at input, and then average across color jitters at output.

get_batch_transform(N)#

Generates a batch transform, where each set of sample(s) along the batch (first) dimension will have the same @N unique ColorJitter transforms applied.

Parameters:

N (int) – Number of ColorJitter transforms to apply per set of sample(s) along the batch (first) dimension

Returns:

Aggregated transform which will automatically apply a different ColorJitter transform to each sub-set of samples along the batch dimension, assumed to be the FIRST dimension in the inputted tensor. Note: this function will MULTIPLY the first dimension by N.

Return type:

Lambda

get_transform()#

Get a randomized transform to be applied on image.

Implementation taken directly from:

https://github.com/pytorch/vision/blob/2f40a483d73018ae6e1488a484c5927f2b309969/torchvision/transforms/transforms.py#L1053-L1085

Returns:

Transform which randomly adjusts brightness, contrast and saturation in a random order.

Return type:

Transform

output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.obs_core.CropRandomizer(input_shape, crop_height=76, crop_width=76, num_crops=1, pos_enc=False)#

Bases: robomimic.models.obs_core.Randomizer

Randomly sample crops at input, and then average across crop features at output.

output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
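A usage sketch of the in/out pattern shared by all randomizers (image and feature sizes are illustrative):

    import torch
    from robomimic.models.obs_core import CropRandomizer

    rand = CropRandomizer(input_shape=[3, 84, 84], crop_height=76, crop_width=76, num_crops=1)
    img = torch.randn(8, 3, 84, 84)
    cropped = rand.forward_in(img)  # -> (8 * num_crops, 3, 76, 76)
    feats = torch.randn(8 * 1, 64)  # stand-in for encoder output on the crops
    out = rand.forward_out(feats)   # averaged across crops -> (8, 64)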
class robomimic.models.obs_core.EncoderCore(input_shape)#

Bases: robomimic.models.base_nets.Module

Abstract class used to categorize all cores used to encode observations

training: bool#
class robomimic.models.obs_core.GaussianNoiseRandomizer(input_shape, noise_mean=0.0, noise_std=0.3, limits=None, num_samples=1)#

Bases: robomimic.models.obs_core.Randomizer

Randomly sample gaussian noise at input, and then average across noises at output.

output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.obs_core.Randomizer#

Bases: robomimic.models.base_nets.Module

Base class for randomizer networks. Each randomizer should implement the @output_shape_in, @output_shape_out, @forward_in, and @forward_out methods. The randomizer’s @forward_in method is invoked on raw inputs, and @forward_out is invoked on processed inputs (usually processed by a @VisualCore instance). Note that the self.training property can be used to change the randomizer’s behavior at train vs. test time.

forward_in(inputs)#

Randomize raw inputs if training.

forward_out(inputs)#

Processing for network outputs.

output_shape(input_shape=None)#

This function is unused. See @output_shape_in and @output_shape_out.

abstract output_shape_in(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

abstract output_shape_out(input_shape=None)#

Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.obs_core.ScanCore(input_shape, conv_kwargs=None, conv_activation='relu', pool_class=None, pool_kwargs=None, flatten=True, feature_dimension=None)#

Bases: robomimic.models.obs_core.EncoderCore, robomimic.models.base_nets.ConvBase

A network block that combines a Conv1D backbone network with optional pooling and linear layers.

forward(inputs)#

Forward pass through visual core.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.obs_core.VisualCore(input_shape, backbone_class='ResNet18Conv', pool_class='SpatialSoftmax', backbone_kwargs=None, pool_kwargs=None, flatten=True, feature_dimension=64)#

Bases: robomimic.models.obs_core.EncoderCore, robomimic.models.base_nets.ConvBase

A network block that combines a visual backbone network with optional pooling and linear layers.

forward(inputs)#

Forward pass through visual core.

output_shape(input_shape)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
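A minimal encoder sketch (image size and feature dimension are illustrative):

    import torch
    from robomimic.models.obs_core import VisualCore

    enc = VisualCore(
        input_shape=[3, 84, 84],
        backbone_class="ResNet18Conv",
        pool_class="SpatialSoftmax",
        feature_dimension=64,
    )
    feat = enc(torch.randn(16, 3, 84, 84))  # -> (16, 64)
    print(enc.output_shape([3, 84, 84]))    # [64]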
class robomimic.models.obs_core.VisualCoreLanguageConditioned(input_shape, backbone_class='ResNet18ConvFiLM', pool_class='SpatialSoftmax', backbone_kwargs=None, pool_kwargs=None, flatten=True, feature_dimension=64)#

Bases: robomimic.models.obs_core.VisualCore

Variant of VisualCore that expects language embedding during forward pass.

forward(inputs, lang_emb=None)#

Update forward pass to pass language embedding through ResNet18ConvFiLM.

training: bool#

robomimic.models.obs_nets module#

robomimic.models.policy_nets module#

robomimic.models.transformers module#

Implementation of transformers, mostly based on Andrej Karpathy's minGPT model. See https://github.com/karpathy/minGPT/blob/master/mingpt/model.py for more details.

class robomimic.models.transformers.CausalSelfAttention(embed_dim, num_heads, context_length, attn_dropout=0.1, output_dropout=0.1)#

Bases: robomimic.models.base_nets.Module

forward(x)#

Forward pass through Self-Attention block. Input should be shape (B, T, D) where B is batch size, T is seq length (@self.context_length), and D is input dimension (@self.embed_dim).

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
class robomimic.models.transformers.GEGLU#

Bases: torch.nn.modules.module.Module

References

Shazeer et al., “GLU Variants Improve Transformer,” 2020. https://arxiv.org/abs/2002.05202

Implementation: https://github.com/pfnet-research/deep-table/blob/237c8be8a405349ce6ab78075234c60d9bfe60b7/deep_table/nn/layers/activation.py

forward(x)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

geglu(x)#
training: bool#
class robomimic.models.transformers.GPT_Backbone(embed_dim, context_length, attn_dropout=0.1, block_output_dropout=0.1, num_layers=6, num_heads=8, activation='gelu')#

Bases: robomimic.models.base_nets.Module

The GPT model, with a context size of @context_length.

forward(inputs)#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#
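A small sketch of running the trunk over already-embedded inputs (the sequence length must match @context_length; all sizes are illustrative):

    import torch
    from robomimic.models.transformers import GPT_Backbone

    gpt = GPT_Backbone(embed_dim=64, context_length=10, num_layers=2, num_heads=4)
    x = torch.randn(8, 10, 64)  # (B, T, embed_dim) with T == context_length
    out = gpt(x)                # -> (8, 10, 64)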
class robomimic.models.transformers.PositionalEncoding(embed_dim)#

Bases: torch.nn.modules.module.Module

Taken from https://pytorch.org/tutorials/beginner/transformer_tutorial.html.

forward(x)#

Input timestep of shape BxT

training: bool#
class robomimic.models.transformers.SelfAttentionBlock(embed_dim, num_heads, context_length, attn_dropout=0.1, output_dropout=0.1, activation=GELU())#

Bases: robomimic.models.base_nets.Module

A single Transformer block that can be chained together repeatedly. It consists of a @CausalSelfAttention module and a small MLP, along with layer normalization and residual connections on each input.

forward(x)#

Forward pass - chain self-attention + MLP blocks, with residual connections and layer norms.

output_shape(input_shape=None)#

Function to compute output shape from inputs to this module.

Parameters:

input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.

Returns:

list of integers corresponding to output shape

Return type:

out_shape ([int])

training: bool#

robomimic.models.vae_nets module#

robomimic.models.value_nets module#

Module contents#