robomimic.models package#
Submodules#
robomimic.models.base_nets module#
Contains torch Modules that correspond to basic network building blocks, like MLP, RNN, and CNN backbones.
- class robomimic.models.base_nets.Conv1dBase(input_channel=1, activation='relu', out_channels=(32, 64, 64), kernel_size=(8, 4, 2), stride=(4, 2, 1), **conv_kwargs)#
Bases:
robomimic.models.base_nets.Module
Base class for stacked Conv1d layers.
- Parameters:
input_channel (int) – Number of channels for inputs to this network
activation (None or str) – Per-layer activation to use. Defaults to “relu”. Valid options are currently {relu, None}, where None applies no activation
out_channels (list of int) – Output channel size for each sequential Conv1d layer
kernel_size (list of int) – Kernel sizes for each sequential Conv1d layer
stride (list of int) – Stride sizes for each sequential Conv1d layer
conv_kwargs (dict) – additional nn.Conv1d args to use, in list form, where the ith element corresponds to the argument to be passed to the ith Conv1d layer. See https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html for specific possible arguments.
- forward(inputs)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
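A minimal usage sketch (assuming a standard robomimic + PyTorch install); the batch size and sequence length below are illustrative, not prescribed by the class:

```python
import torch
from robomimic.models.base_nets import Conv1dBase

net = Conv1dBase(
    input_channel=1,
    activation="relu",
    out_channels=(32, 64, 64),
    kernel_size=(8, 4, 2),
    stride=(4, 2, 1),
)
x = torch.randn(8, 1, 128)         # (batch, channels, time)
print(net.output_shape([1, 128]))  # predicted output shape, batch dim excluded
print(net(x).shape)                # actual forward pass, e.g. torch.Size([8, 64, 13])
```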
- class robomimic.models.base_nets.ConvBase#
Bases:
robomimic.models.base_nets.Module
Base class for ConvNets.
- forward(inputs)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.CoordConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', coord_encoding='position')#
Bases:
torch.nn.modules.conv.Conv2d, robomimic.models.base_nets.Module
2D Coordinate Convolution
Source: An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution, https://arxiv.org/abs/1807.03247 (i.e. adds 2 channels per input feature map corresponding to the (x, y) location on the map)
- bias: Optional[torch.Tensor]#
- dilation: Tuple[int, ...]#
- forward(input)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- groups: int#
- kernel_size: Tuple[int, ...]#
- out_channels: int#
- output_padding: Tuple[int, ...]#
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- padding: Tuple[int, ...]#
- padding_mode: str#
- stride: Tuple[int, ...]#
- transposed: bool#
- weight: torch.Tensor#
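A short sketch of the layer as a drop-in Conv2d replacement. Per the class description above, the two (x, y) coordinate channels are appended internally, so in_channels below refers to the raw input channels (an assumption worth verifying against the source):

```python
import torch
from robomimic.models.base_nets import CoordConv2d

# Drop-in replacement for nn.Conv2d; the coordinate channels are assumed to be
# concatenated to the input internally before the convolution is applied.
conv = CoordConv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
img = torch.randn(4, 3, 32, 32)
print(conv(img).shape)  # torch.Size([4, 16, 32, 32])
```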
- class robomimic.models.base_nets.FeatureAggregator(dim=1, agg_type='avg')#
Bases:
robomimic.models.base_nets.Module
Helpful class for aggregating features across a dimension. This is useful in practice when training models that break an input image up into several patches, since features can be extracted per-patch using the same encoder and then aggregated using this module.
- clear_weight()#
- forward(x)#
Forward pooling pass.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- set_weight(w)#
- training: bool#
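A sketch of the per-patch aggregation use case described above (tensor sizes are illustrative):

```python
import torch
from robomimic.models.base_nets import FeatureAggregator

agg = FeatureAggregator(dim=1, agg_type="avg")
patch_feats = torch.randn(8, 5, 64)  # (batch, num_patches, feat_dim) from a shared encoder
pooled = agg(patch_feats)            # average across the patch dimension
print(pooled.shape)                  # torch.Size([8, 64])
```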
- class robomimic.models.base_nets.FiLMLayer(lang_emb_dim, channels)#
Bases:
robomimic.models.base_nets.ConvBase
Uses Feature-wise Linear Modulation (FiLM) to language-condition a conv net
- forward(x, lang_emb)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
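A sketch of FiLM conditioning on a conv feature map; the 768-dim embedding matches the default of ResNet18ConvFiLM below, and the feature-map size is illustrative:

```python
import torch
from robomimic.models.base_nets import FiLMLayer

film = FiLMLayer(lang_emb_dim=768, channels=64)
fmap = torch.randn(4, 64, 8, 8)  # conv feature map (B, C, H, W)
lang = torch.randn(4, 768)       # language embedding (B, lang_emb_dim)
out = film(fmap, lang)           # per-channel scale/shift predicted from lang
print(out.shape)                 # torch.Size([4, 64, 8, 8]) - shape is preserved
```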
- class robomimic.models.base_nets.MLP(input_dim, output_dim, layer_dims=(), layer_func=<class 'torch.nn.modules.linear.Linear'>, layer_func_kwargs=None, activation=<class 'torch.nn.modules.activation.ReLU'>, dropouts=None, normalization=False, output_activation=None)#
Bases:
robomimic.models.base_nets.Module
Base class for simple Multi-Layer Perceptrons.
- forward(inputs)#
Forward pass.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
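A minimal sketch of constructing and calling an MLP; note that output_shape does not depend on the input shape:

```python
import torch
from robomimic.models.base_nets import MLP

mlp = MLP(input_dim=10, output_dim=4, layer_dims=(64, 64))
x = torch.randn(32, 10)
print(mlp(x).shape)        # torch.Size([32, 4])
print(mlp.output_shape())  # [4] - fixed by output_dim
```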
- class robomimic.models.base_nets.MVPConv(input_channel=3, mvp_model_class='vitb-mae-egosoup', freeze=True)#
Bases:
robomimic.models.base_nets.ConvBase
Base class for ConvNets pretrained with MVP (https://arxiv.org/abs/2203.06173)
- forward(inputs)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.Module#
Bases:
torch.nn.modules.module.Module
Base class for networks. The only difference from torch.nn.Module is that it requires implementing @output_shape.
- abstract output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
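A sketch of the contract this base class adds. The Scale module below is hypothetical; it only illustrates that a subclass must implement both forward and output_shape:

```python
import torch
from robomimic.models.base_nets import Module

class Scale(Module):
    """Hypothetical Module subclass: elementwise scaling."""
    def __init__(self, alpha=2.0):
        super().__init__()
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * x

    def output_shape(self, input_shape=None):
        # elementwise ops leave the shape unchanged
        return list(input_shape)

print(Scale()(torch.ones(2, 3)))  # tensor of 2s, shape (2, 3)
```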
- class robomimic.models.base_nets.Parameter(init_tensor)#
Bases:
robomimic.models.base_nets.Module
A class that is a thin wrapper around a torch.nn.Parameter to make for easy saving and optimization.
- forward(inputs=None)#
Forward call just returns the parameter tensor.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.R3MConv(input_channel=3, r3m_model_class='resnet18', freeze=True)#
Bases:
robomimic.models.base_nets.ConvBase
Base class for ConvNets pretrained with R3M (https://arxiv.org/abs/2203.12601)
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.RNN_Base(input_dim, rnn_hidden_dim, rnn_num_layers, rnn_type='LSTM', rnn_kwargs=None, per_step_net=None)#
Bases:
robomimic.models.base_nets.Module
A wrapper class for a multi-step RNN and a per-step network.
- forward(inputs, rnn_init_state=None, return_state=False)#
Forward a sequence of inputs through the RNN and the per-step network.
- Parameters:
inputs (torch.Tensor) – tensor input of shape [B, T, D], where D is the RNN input size
rnn_init_state – rnn hidden state, initialized to zero state if set to None
return_state (bool) – whether to return hidden state
- Returns:
outputs of the per_step_net
rnn_state: return rnn state at the end if return_state is set to True
- Return type:
outputs
- forward_step(inputs, rnn_state)#
Forward a single step input through the RNN and per-step network, and return the new hidden state.
- Parameters:
inputs (torch.Tensor) – tensor input of shape [B, D], where D is the RNN input size
rnn_state – rnn hidden state, initialized to zero state if set to None
- Returns:
outputs of the per_step_net
rnn_state: return the new rnn state
- Return type:
outputs
- get_rnn_init_state(batch_size, device)#
Get a default RNN state (zeros).
- Parameters:
batch_size (int) – batch size dimension
device – device the hidden state should be sent to
- Returns:
hidden state tensor or tuple of hidden state tensors, depending on the RNN type
- Return type:
hidden_state (torch.Tensor or tuple)
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- property rnn_type#
- training: bool#
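A usage sketch; with per_step_net=None the per-step outputs are the raw RNN outputs of size rnn_hidden_dim (tensor sizes are illustrative):

```python
import torch
from robomimic.models.base_nets import RNN_Base

rnn = RNN_Base(input_dim=10, rnn_hidden_dim=32, rnn_num_layers=2, rnn_type="LSTM")
seq = torch.randn(4, 20, 10)  # [B, T, D]
init = rnn.get_rnn_init_state(batch_size=4, device=seq.device)
outputs, state = rnn(seq, rnn_init_state=init, return_state=True)
print(outputs.shape)          # torch.Size([4, 20, 32]) with no per_step_net
```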
- class robomimic.models.base_nets.ResNet18Conv(input_channel=3, pretrained=False, input_coord_conv=False)#
Bases:
robomimic.models.base_nets.ConvBase
A ResNet18 block that can be used to process input images.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
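A sketch of running the backbone on images; for a 224x224 input the ResNet18 trunk downsamples by a factor of 32, giving a 512x7x7 feature map:

```python
import torch
from robomimic.models.base_nets import ResNet18Conv

backbone = ResNet18Conv(input_channel=3, pretrained=False)
imgs = torch.randn(2, 3, 224, 224)
print(backbone.output_shape([3, 224, 224]))  # [512, 7, 7]
print(backbone(imgs).shape)                  # torch.Size([2, 512, 7, 7])
```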
- class robomimic.models.base_nets.ResNet18ConvFiLM(input_channel=3, pretrained=False, input_coord_conv=False, lang_emb_dim=768)#
Bases:
robomimic.models.base_nets.ConvBase
A ResNet18 block that can be used to process input images and uses FiLM for language conditioning.
- forward(inputs, lang_emb)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.ResNet50Conv(input_channel=3, pretrained=False, input_coord_conv=False)#
Bases:
robomimic.models.base_nets.ConvBase
A ResNet50 block that can be used to process input images.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.Sequential(*args, has_output_shape=True)#
Bases:
torch.nn.modules.container.Sequential, robomimic.models.base_nets.Module
Compose multiple Modules together (defined above).
- freeze()#
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- train(mode)#
Sets the module in training mode.
This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters:
mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns:
self
- Return type:
Module
- training: bool#
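A sketch of composing robomimic Modules; unlike torch.nn.Sequential, output_shape can be propagated through the chain:

```python
import torch
from robomimic.models.base_nets import MLP, Sequential

net = Sequential(
    MLP(input_dim=10, output_dim=32, layer_dims=(64,)),
    MLP(input_dim=32, output_dim=4, layer_dims=()),
)
print(net.output_shape([10]))         # [4], chained through both sub-modules
print(net(torch.randn(8, 10)).shape)  # torch.Size([8, 4])
```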
- class robomimic.models.base_nets.ShallowConv(input_channel=3, output_channel=32)#
Bases:
robomimic.models.base_nets.ConvBase
A shallow convolutional encoder from https://rll.berkeley.edu/dsae/dsae.pdf
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.SpatialMeanPool(input_shape)#
Bases:
robomimic.models.base_nets.Module
Module that averages inputs across all spatial dimensions (dimension 2 and after), leaving only the batch and channel dimensions.
- forward(inputs)#
Forward pass - average across all dimensions except batch and channel.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.SpatialSoftmax(input_shape, num_kp=32, temperature=1.0, learnable_temperature=False, output_variance=False, noise_std=0.0)#
Bases:
robomimic.models.base_nets.ConvBase
Spatial Softmax Layer.
Based on Deep Spatial Autoencoders for Visuomotor Learning by Finn et al. https://rll.berkeley.edu/dsae/dsae.pdf
- forward(feature)#
Forward pass through spatial softmax layer. For each keypoint, a 2D spatial probability distribution is created using a softmax, where the support is the pixel locations. This distribution is used to compute the expected value of the pixel location, which becomes a keypoint of dimension 2. K such keypoints are created.
- Returns:
mean keypoints of shape [B, K, 2], and possibly keypoint variance of shape [B, K, 2, 2] corresponding to the covariance under the 2D spatial softmax distribution
- Return type:
out (torch.Tensor or tuple)
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
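A sketch of extracting keypoints from a conv feature map (the 64x7x7 map is illustrative, e.g. the tail of a small conv trunk):

```python
import torch
from robomimic.models.base_nets import SpatialSoftmax

pool = SpatialSoftmax(input_shape=[64, 7, 7], num_kp=32)
feat = torch.randn(4, 64, 7, 7)
kp = pool(feat)                       # expected (x, y) pixel location per keypoint
print(kp.shape)                       # torch.Size([4, 32, 2])
print(pool.output_shape([64, 7, 7]))  # [32, 2]
```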
- class robomimic.models.base_nets.Squeeze(dim)#
Bases:
robomimic.models.base_nets.Module
Trivial class that squeezes the input. Useful for including in an nn.Sequential network
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.base_nets.Unsqueeze(dim)#
Bases:
robomimic.models.base_nets.Module
Trivial class that unsqueezes the input. Useful for including in an nn.Sequential network
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- robomimic.models.base_nets.rnn_args_from_config(rnn_config)#
Takes a Config object corresponding to RNN settings (for example config.algo.rnn in BCConfig) and extracts rnn kwargs for instantiating rnn networks.
- robomimic.models.base_nets.transformer_args_from_config(transformer_config)#
Takes a Config object corresponding to Transformer settings (for example config.algo.transformer in BCConfig) and extracts transformer kwargs for instantiating transformer networks.
robomimic.models.diffusion_policy_nets module#
This file contains nets used for Diffusion Policy.
- class robomimic.models.diffusion_policy_nets.ConditionalResidualBlock1D(in_channels, out_channels, cond_dim, kernel_size=3, n_groups=8)#
Bases:
torch.nn.modules.module.Module
- forward(x, cond)#
- Parameters:
x – [ batch_size x in_channels x horizon ]
cond – [ batch_size x cond_dim ]
- Returns:
out – [ batch_size x out_channels x horizon ]
- training: bool#
- class robomimic.models.diffusion_policy_nets.ConditionalUnet1D(input_dim, global_cond_dim, diffusion_step_embed_dim=256, down_dims=[256, 512, 1024], kernel_size=5, n_groups=8)#
Bases:
torch.nn.modules.module.Module
- forward(sample: torch.Tensor, timestep: Union[torch.Tensor, float, int], global_cond=None)#
- Parameters:
sample – (B, T, input_dim)
timestep – (B,) or int, diffusion step
global_cond – (B, global_cond_dim)
- Returns:
output – (B, T, input_dim)
- training: bool#
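A denoising-step sketch; the action dimension, horizon, and conditioning size below are illustrative (the horizon should be divisible by the U-Net's downsampling factor):

```python
import torch
from robomimic.models.diffusion_policy_nets import ConditionalUnet1D

unet = ConditionalUnet1D(input_dim=7, global_cond_dim=128)
noisy_actions = torch.randn(4, 16, 7)    # (B, T, input_dim)
timesteps = torch.randint(0, 100, (4,))  # one diffusion step per batch element
cond = torch.randn(4, 128)               # e.g. flattened observation features
pred = unet(noisy_actions, timesteps, global_cond=cond)
print(pred.shape)                        # torch.Size([4, 16, 7])
```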
- class robomimic.models.diffusion_policy_nets.Conv1dBlock(inp_channels, out_channels, kernel_size, n_groups=8)#
Bases:
torch.nn.modules.module.Module
Conv1d –> GroupNorm –> Mish
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class robomimic.models.diffusion_policy_nets.Downsample1d(dim)#
Bases:
torch.nn.modules.module.Module
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class robomimic.models.diffusion_policy_nets.SinusoidalPosEmb(dim)#
Bases:
torch.nn.modules.module.Module
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class robomimic.models.diffusion_policy_nets.Upsample1d(dim)#
Bases:
torch.nn.modules.module.Module
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
robomimic.models.distributions module#
Contains distribution models used as parts of other networks. These classes usually inherit or emulate torch distributions.
- class robomimic.models.distributions.DiscreteValueDistribution(values, probs=None, logits=None)#
Bases:
object
Extension to torch categorical probability distribution in order to keep track of the support (categorical values, or in this case, value atoms). This is used for distributional value networks.
- property logits#
- mean()#
Categorical distribution mean, taking the value support into account.
- property probs#
- sample(sample_shape=torch.Size([]))#
Sample from the distribution. Returns value atoms, not categorical class indices.
- property values#
- variance()#
Categorical distribution variance, taking the value support into account.
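A sketch of a distributional value head's output distribution over a fixed set of value atoms (the atoms and probabilities below are illustrative):

```python
import torch
from robomimic.models.distributions import DiscreteValueDistribution

atoms = torch.linspace(-1.0, 1.0, steps=5)  # value support (atoms)
probs = torch.full((5,), 0.2)               # uniform categorical probabilities
dist = DiscreteValueDistribution(values=atoms, probs=probs)
print(dist.mean())    # expectation over atoms (0 for this symmetric, uniform case)
print(dist.sample())  # returns a value atom, not a class index
```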
- class robomimic.models.distributions.TanhWrappedDistribution(base_dist, scale=1.0, epsilon=1e-06)#
Bases:
torch.distributions.distribution.Distribution
Class that wraps another valid torch distribution, such that sampled values from the base distribution are passed through a tanh layer. The corresponding (log) probabilities are also modified accordingly. Tanh Normal distribution - adapted from rlkit and CQL codebase (https://github.com/aviralkumar2907/CQL/blob/d67dbe9cf5d2b96e3b462b6146f249b3d6569796/d4rl/rlkit/torch/distributions.py#L6).
- log_prob(value, pre_tanh_value=None)#
- Parameters:
value (torch.Tensor) – some tensor to compute log probabilities for
pre_tanh_value – If specified, will not calculate atanh manually from @value. More numerically stable
- property mean#
Returns the mean of the distribution.
- rsample(sample_shape=torch.Size([]), return_pretanh_value=False)#
Sampling in the reparameterization case - for differentiable samples.
- sample(sample_shape=torch.Size([]), return_pretanh_value=False)#
Gradients will not and should not pass through this operation. See https://github.com/pytorch/pytorch/issues/4620 for discussion.
- property stddev#
Returns the standard deviation of the distribution.
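A sketch of wrapping a diagonal Gaussian so samples land in (-1, 1), as commonly used for bounded action spaces:

```python
import torch
import torch.distributions as D
from robomimic.models.distributions import TanhWrappedDistribution

base = D.Normal(loc=torch.zeros(3), scale=torch.ones(3))
dist = TanhWrappedDistribution(base_dist=base, scale=1.0)
a = dist.sample()      # squashed into (-1, 1) by tanh
lp = dist.log_prob(a)  # log-density corrected for the tanh change of variables
print(a, lp)
```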
robomimic.models.obs_core module#
Contains torch Modules for core observation processing blocks such as encoders (e.g. EncoderCore, VisualCore, ScanCore, …) and randomizers (e.g. Randomizer, CropRandomizer).
- class robomimic.models.obs_core.ColorRandomizer(input_shape, brightness=0.3, contrast=0.3, saturation=0.3, hue=0.3, num_samples=1)#
Bases:
robomimic.models.obs_core.Randomizer
Randomly sample color jitter at input, and then average across color jitters at output.
- get_batch_transform(N)#
Generates a batch transform, where each set of sample(s) along the batch (first) dimension will have the same @N unique ColorJitter transforms applied.
- Parameters:
N (int) – Number of ColorJitter transforms to apply per set of sample(s) along the batch (first) dimension
- Returns:
Aggregated transform which will automatically apply a different ColorJitter transform to each sub-set of samples along the batch dimension, assumed to be the FIRST dimension in the input tensor. Note: this function will MULTIPLY the first dimension by N.
- Return type:
Lambda
- get_transform()#
Get a randomized transform to be applied on image.
Implementation taken directly from:
- Returns:
Transform which randomly adjusts brightness, contrast and saturation in a random order.
- Return type:
Transform
- output_shape_in(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- output_shape_out(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.obs_core.CropRandomizer(input_shape, crop_height=76, crop_width=76, num_crops=1, pos_enc=False)#
Bases:
robomimic.models.obs_core.Randomizer
Randomly sample crops at input, and then average across crop features at output.
- output_shape_in(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- output_shape_out(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
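A train-time sketch; forward_in produces random crops (center crops at eval time), and forward_out averages encoded features back across the num_crops copies:

```python
import torch
from robomimic.models.obs_core import CropRandomizer

rand = CropRandomizer(input_shape=[3, 84, 84], crop_height=76, crop_width=76, num_crops=1)
obs = torch.randn(8, 3, 84, 84)
cropped = rand.forward_in(obs)        # random crops while in training mode
print(cropped.shape)                  # torch.Size([8, 3, 76, 76]) for num_crops=1
feats = torch.randn(8, 64)            # stand-in for encoder outputs on the crops
print(rand.forward_out(feats).shape)  # torch.Size([8, 64])
```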
- class robomimic.models.obs_core.EncoderCore(input_shape)#
Bases:
robomimic.models.base_nets.Module
Abstract class used to categorize all cores used to encode observations
- training: bool#
- class robomimic.models.obs_core.GaussianNoiseRandomizer(input_shape, noise_mean=0.0, noise_std=0.3, limits=None, num_samples=1)#
Bases:
robomimic.models.obs_core.Randomizer
Randomly sample gaussian noise at input, and then average across noises at output.
- output_shape_in(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- output_shape_out(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.obs_core.Randomizer#
Bases:
robomimic.models.base_nets.Module
Base class for randomizer networks. Each randomizer should implement the @output_shape_in, @output_shape_out, @forward_in, and @forward_out methods. The randomizer’s @forward_in method is invoked on raw inputs, and @forward_out is invoked on processed inputs (usually processed by a @VisualCore instance). Note that the self.training property can be used to change the randomizer’s behavior at train vs. test time.
- forward_in(inputs)#
Randomize raw inputs if training.
- forward_out(inputs)#
Processing for network outputs.
- output_shape(input_shape=None)#
This function is unused. See @output_shape_in and @output_shape_out.
- abstract output_shape_in(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_in operation, where raw inputs (usually observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- abstract output_shape_out(input_shape=None)#
Function to compute output shape from inputs to this module. Corresponds to the @forward_out operation, where processed inputs (usually encoded observation modalities) are passed in.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.obs_core.ScanCore(input_shape, conv_kwargs=None, conv_activation='relu', pool_class=None, pool_kwargs=None, flatten=True, feature_dimension=None)#
Bases:
robomimic.models.obs_core.EncoderCore, robomimic.models.base_nets.ConvBase
A network block that combines a Conv1D backbone network with optional pooling and linear layers.
- forward(inputs)#
Forward pass through visual core.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.obs_core.VisualCore(input_shape, backbone_class='ResNet18Conv', pool_class='SpatialSoftmax', backbone_kwargs=None, pool_kwargs=None, flatten=True, feature_dimension=64)#
Bases:
robomimic.models.obs_core.EncoderCore, robomimic.models.base_nets.ConvBase
A network block that combines a visual backbone network with optional pooling and linear layers.
- forward(inputs)#
Forward pass through visual core.
- output_shape(input_shape)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
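An end-to-end sketch of the default pipeline (ResNet18Conv backbone, SpatialSoftmax pooling, flatten, then a linear projection to feature_dimension):

```python
import torch
from robomimic.models.obs_core import VisualCore

core = VisualCore(
    input_shape=[3, 84, 84],
    backbone_class="ResNet18Conv",
    pool_class="SpatialSoftmax",
    feature_dimension=64,
)
imgs = torch.randn(4, 3, 84, 84)
print(core(imgs).shape)                # torch.Size([4, 64])
print(core.output_shape([3, 84, 84]))  # [64]
```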
- class robomimic.models.obs_core.VisualCoreLanguageConditioned(input_shape, backbone_class='ResNet18ConvFiLM', pool_class='SpatialSoftmax', backbone_kwargs=None, pool_kwargs=None, flatten=True, feature_dimension=64)#
Bases:
robomimic.models.obs_core.VisualCore
Variant of VisualCore that expects a language embedding during the forward pass.
- forward(inputs, lang_emb=None)#
Update forward pass to pass language embedding through ResNet18ConvFiLM.
- training: bool#
robomimic.models.obs_nets module#
robomimic.models.policy_nets module#
robomimic.models.transformers module#
Implementation of transformers, mostly based on Andrej Karpathy’s minGPT model. See https://github.com/karpathy/minGPT/blob/master/mingpt/model.py for more details.
- class robomimic.models.transformers.CausalSelfAttention(embed_dim, num_heads, context_length, attn_dropout=0.1, output_dropout=0.1)#
Bases:
robomimic.models.base_nets.Module
- forward(x)#
Forward pass through Self-Attention block. Input should be shape (B, T, D) where B is batch size, T is seq length (@self.context_length), and D is input dimension (@self.embed_dim).
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
- class robomimic.models.transformers.GEGLU#
Bases:
torch.nn.modules.module.Module
References
Shazeer et al., “GLU Variants Improve Transformer,” 2020. https://arxiv.org/abs/2002.05202
Implementation: https://github.com/pfnet-research/deep-table/blob/237c8be8a405349ce6ab78075234c60d9bfe60b7/deep_table/nn/layers/activation.py
- forward(x)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- geglu(x)#
- training: bool#
- class robomimic.models.transformers.GPT_Backbone(embed_dim, context_length, attn_dropout=0.1, block_output_dropout=0.1, num_layers=6, num_heads=8, activation='gelu')#
Bases:
robomimic.models.base_nets.Module
The GPT model, with a context size of context_length.
- forward(inputs)#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#
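A sketch of running the backbone on already-embedded token sequences; the backbone maps (B, T, embed_dim) to (B, T, embed_dim), with T equal to the context length:

```python
import torch
from robomimic.models.transformers import GPT_Backbone

gpt = GPT_Backbone(embed_dim=256, context_length=10, num_layers=6, num_heads=8)
tokens = torch.randn(4, 10, 256)  # (B, T, embed_dim), T == context_length
print(gpt(tokens).shape)          # torch.Size([4, 10, 256])
```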
- class robomimic.models.transformers.PositionalEncoding(embed_dim)#
Bases:
torch.nn.modules.module.Module
Taken from https://pytorch.org/tutorials/beginner/transformer_tutorial.html.
- forward(x)#
Input timestep of shape BxT
- training: bool#
- class robomimic.models.transformers.SelfAttentionBlock(embed_dim, num_heads, context_length, attn_dropout=0.1, output_dropout=0.1, activation=GELU())#
Bases:
robomimic.models.base_nets.Module
A single Transformer Block, that can be chained together repeatedly. It consists of a @CausalSelfAttention module and a small MLP, along with layer normalization and residual connections on each input.
- forward(x)#
Forward pass - chain self-attention + MLP blocks, with residual connections and layer norms.
- output_shape(input_shape=None)#
Function to compute output shape from inputs to this module.
- Parameters:
input_shape (iterable of int) – shape of input. Does not include batch dimension. Some modules may not need this argument, if their output does not depend on the size of the input, or if they assume fixed size input.
- Returns:
list of integers corresponding to output shape
- Return type:
out_shape ([int])
- training: bool#