Models

MLP Model

class rsl_rl.models.mlp_model.MLPModel[source]

MLP-based neural model.

This model uses a simple multi-layer perceptron (MLP) to process 1D observation groups. Observations can be normalized before being passed to the MLP. The output of the model can be either deterministic or stochastic; in the stochastic case, a distribution module is used to sample the outputs.

is_recurrent: bool = False

Whether the model contains a recurrent module.

__init__(obs, obs_groups, obs_set, output_dim, hidden_dims=(256, 256, 256), activation='elu', obs_normalization=False, distribution_cfg=None)[source]

Initialize the MLP-based model.

Parameters:
  • obs (tensordict.TensorDict) – Observation dictionary.

  • obs_groups (dict[str, list[str]]) – Dictionary mapping observation sets to lists of observation groups.

  • obs_set (str) – Observation set to use for this model (e.g., “actor” or “critic”).

  • output_dim (int) – Dimension of the output.

  • hidden_dims (tuple[int, ...] | list[int]) – Hidden dimensions of the MLP.

  • activation (str) – Activation function of the MLP.

  • obs_normalization (bool) – Whether to normalize the observations before feeding them to the MLP.

  • distribution_cfg (dict | None) – Configuration dictionary for the output distribution. If provided, the model outputs stochastic values sampled from the distribution.

Return type:

None
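
To illustrate how obs_groups, obs_set, hidden_dims, and output_dim relate to each other, the following plain-Python sketch works out the linear-layer shapes such an MLP would have. The group names (policy, privileged), their sizes, and the helper mlp_layer_shapes are hypothetical and not part of the library; the actual construction inside MLPModel may differ.

```python
# Hypothetical observation layout: group name -> feature size per environment.
obs_dims = {"policy": 48, "privileged": 12}

# Same structure as the `obs_groups` argument: observation set -> list of groups.
obs_groups = {"actor": ["policy"], "critic": ["policy", "privileged"]}


def mlp_layer_shapes(obs_set, output_dim, hidden_dims=(256, 256, 256)):
    """Return (in_features, out_features) for each linear layer of the MLP."""
    # The input width is the summed size of all groups in the chosen set.
    input_dim = sum(obs_dims[group] for group in obs_groups[obs_set])
    widths = [input_dim, *hidden_dims, output_dim]
    return list(zip(widths[:-1], widths[1:]))


actor_shapes = mlp_layer_shapes("actor", output_dim=12)
critic_shapes = mlp_layer_shapes("critic", output_dim=1)
print(actor_shapes)
print(critic_shapes)
```

In this sketch the critic sees both groups, so its first layer is wider than the actor's, while the hidden layers are identical.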

forward(obs, masks=None, hidden_state=None, stochastic_output=False)[source]

Forward pass of the MLP model.

Note:

The stochastic_output flag only has an effect if the model has a distribution (i.e., distribution_cfg was provided). It defaults to False, so even stochastic models return deterministic outputs unless the flag is set.

Parameters:
  • obs (TensorDict)

  • masks (torch.Tensor | None)

  • hidden_state (HiddenState)

  • stochastic_output (bool)

Return type:

torch.Tensor
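
The flag logic described in the note can be sketched in plain Python. This is an illustration only: it assumes a Gaussian output distribution and that the deterministic output equals the distribution mean, neither of which is stated above. The function forward_sketch is hypothetical.

```python
import random


def forward_sketch(mean, std=None, stochastic_output=False, rng=random):
    """Sample only if a distribution exists AND stochastic_output is set."""
    has_distribution = std is not None  # stands in for "distribution_cfg was provided"
    if has_distribution and stochastic_output:
        return [rng.gauss(m, s) for m, s in zip(mean, std)]
    # Deterministic path: return the mean unchanged (assumption of this sketch).
    return list(mean)


mean, std = [0.5, -1.0], [0.1, 0.2]

# Stochastic model, default flag: deterministic output.
deterministic = forward_sketch(mean, std)

# No distribution: the flag has no effect.
no_distribution = forward_sketch(mean, None, stochastic_output=True)

# Stochastic model with the flag set: sampled output.
sampled = forward_sketch(mean, std, stochastic_output=True, rng=random.Random(0))
```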

get_latent(obs, masks=None, hidden_state=None)[source]

Build the model latent by concatenating and normalizing selected observation groups.

Parameters:
  • obs (TensorDict)

  • masks (torch.Tensor | None)

  • hidden_state (HiddenState)

Return type:

torch.Tensor
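
A minimal sketch of the concatenate-then-normalize step, written for a single environment with plain Python lists instead of tensors. The group names, the eps term, and the helper get_latent_sketch are assumptions made for illustration.

```python
def get_latent_sketch(obs, groups, mean=None, std=None, eps=1e-8):
    """Concatenate the selected groups, then optionally normalize element-wise."""
    # Concatenation follows the order in which the groups are listed.
    latent = [x for group in groups for x in obs[group]]
    if mean is not None and std is not None:
        latent = [(x - m) / (s + eps) for x, m, s in zip(latent, mean, std)]
    return latent


obs = {"policy": [1.0, 3.0], "privileged": [10.0]}
groups = ["policy", "privileged"]

raw = get_latent_sketch(obs, groups)
normalized = get_latent_sketch(
    obs, groups, mean=[1.0, 1.0, 0.0], std=[1.0, 2.0, 10.0]
)
```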

reset(dones=None, hidden_state=None)[source]

Reset the internal state for recurrent models (no-op).

Parameters:
  • dones (torch.Tensor | None)

  • hidden_state (HiddenState)

Return type:

None

get_hidden_state()[source]

Return the recurrent hidden state (None for MLP).

Return type:

torch.Tensor | tuple[torch.Tensor, torch.Tensor] | None

detach_hidden_state(dones=None)[source]

Detach the recurrent hidden state for truncated backpropagation (no-op).

Parameters:

dones (torch.Tensor | None)

Return type:

None

property output_mean: torch.Tensor

Return the mean of the current output distribution.

property output_std: torch.Tensor

Return the standard deviation of the current output distribution.

property output_entropy: torch.Tensor

Return the entropy of the current output distribution.

property output_distribution_params: tuple[torch.Tensor, ...]

Return raw parameters of the current output distribution.
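
As a worked example of what these properties could report, the sketch below computes the entropy of a diagonal Gaussian from its standard deviations. The Gaussian form is an assumption; the actual distribution depends on distribution_cfg, and gaussian_entropy is a hypothetical helper.

```python
import math


def gaussian_entropy(std):
    """Entropy of a diagonal Gaussian: sum over dims of 0.5 * log(2 * pi * e * std^2)."""
    return sum(0.5 + 0.5 * math.log(2 * math.pi) + math.log(s) for s in std)


unit_entropy = gaussian_entropy([1.0])        # one dimension, std = 1
wide_entropy = gaussian_entropy([2.0])        # larger std, larger entropy
joint_entropy = gaussian_entropy([1.0, 1.0])  # independent dims add up
```

Note that the entropy depends only on the standard deviation, not on the mean.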

get_output_log_prob(outputs)[source]

Compute log-probabilities of outputs under the current distribution.

Parameters:

outputs (torch.Tensor)

Return type:

torch.Tensor
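
For a diagonal Gaussian, the log-probability of an output vector is the sum of the per-dimension log-densities. The sketch below assumes that distribution and uses a hypothetical helper, gaussian_log_prob; it is not the library's implementation.

```python
import math


def gaussian_log_prob(outputs, mean, std):
    """Sum of per-dimension Gaussian log-densities (independent dimensions)."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for x, m, s in zip(outputs, mean, std)
    )


# Two-dimensional unit Gaussian centered at the origin.
log_prob_at_mean = gaussian_log_prob([0.0, 0.0], mean=[0.0, 0.0], std=[1.0, 1.0])
log_prob_off_mean = gaussian_log_prob([1.0, 0.0], mean=[0.0, 0.0], std=[1.0, 1.0])
```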

get_kl_divergence(old_params, new_params)[source]

Compute KL divergence between two parameterizations of the distribution.

Parameters:
  • old_params (tuple[torch.Tensor, ...])

  • new_params (tuple[torch.Tensor, ...])

Return type:

torch.Tensor
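
The closed-form KL divergence between two diagonal Gaussians can be sketched as below. It assumes each parameter tuple is (mean, std) and that the divergence is taken as KL(old || new); both the layout and the direction are assumptions, and gaussian_kl is a hypothetical helper.

```python
import math


def gaussian_kl(old_params, new_params):
    """KL(old || new) for diagonal Gaussians given as (mean, std) pairs."""
    (mu_old, std_old), (mu_new, std_new) = old_params, new_params
    return sum(
        math.log(s_new / s_old)
        + (s_old**2 + (m_old - m_new) ** 2) / (2 * s_new**2)
        - 0.5
        for m_old, s_old, m_new, s_new in zip(mu_old, std_old, mu_new, std_new)
    )


same = gaussian_kl(([0.0], [1.0]), ([0.0], [1.0]))     # identical distributions
shifted = gaussian_kl(([0.0], [1.0]), ([1.0], [1.0]))  # mean moved by one std
forward = gaussian_kl(([0.0], [1.0]), ([0.0], [2.0]))
reverse = gaussian_kl(([0.0], [2.0]), ([0.0], [1.0]))
```

The last two values differ because KL divergence is not symmetric, so the argument order matters.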

as_jit()[source]

Return a version of the model compatible with Torch JIT export.

Return type:

torch.nn.Module

as_onnx(verbose)[source]

Return a version of the model compatible with ONNX export.

Parameters:

verbose (bool)

Return type:

torch.nn.Module

update_normalization(obs)[source]

Update observation-normalization statistics from a batch of observations.

Parameters:

obs (tensordict.TensorDict)

Return type:

None
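
One common way to maintain normalization statistics is to merge each batch's mean and variance into running totals. The sketch below shows that technique for a single scalar feature in plain Python. It is an assumption about how such an update can work, not a description of the library's normalizer, and RunningStats is a hypothetical class.

```python
class RunningStats:
    """Running mean and population variance, updated one batch at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_m2 = sum((x - batch_mean) ** 2 for x in batch)
        delta = batch_mean - self.mean
        total = self.count + n
        # Merge the batch statistics into the running statistics.
        self.mean += delta * n / total
        self.m2 += batch_m2 + delta**2 * self.count * n / total
        self.count = total

    @property
    def std(self):
        # Only valid after at least one update.
        return (self.m2 / self.count) ** 0.5

    def normalize(self, x, eps=1e-8):
        return (x - self.mean) / (self.std + eps)


stats = RunningStats()
stats.update([1.0, 2.0, 3.0])
stats.update([4.0, 5.0])
```

After both updates, the statistics match those computed over all five values at once.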

RNN Model

class rsl_rl.models.rnn_model.RNNModel[source]

RNN-based neural model.

This model uses a recurrent neural network (RNN) to process 1D observation groups before passing the resulting latent to an MLP. Available RNN types are “lstm” and “gru”. Observations can be normalized before being passed to the RNN. The output of the model can be either deterministic or stochastic; in the stochastic case, a distribution module is used to sample the outputs.

is_recurrent: bool = True

Whether the model contains a recurrent module.

__init__(obs, obs_groups, obs_set, output_dim, hidden_dims=(256, 256, 256), activation='elu', obs_normalization=False, distribution_cfg=None, rnn_type='lstm', rnn_hidden_dim=256, rnn_num_layers=1)[source]

Initialize the RNN-based model.

Parameters:
  • obs (tensordict.TensorDict) – Observation dictionary.

  • obs_groups (dict[str, list[str]]) – Dictionary mapping observation sets to lists of observation groups.

  • obs_set (str) – Observation set to use for this model (e.g., “actor” or “critic”).

  • output_dim (int) – Dimension of the output.

  • hidden_dims (tuple[int, ...] | list[int]) – Hidden dimensions of the MLP.

  • activation (str) – Activation function of the MLP.

  • obs_normalization (bool) – Whether to normalize the observations before feeding them to the RNN.

  • distribution_cfg (dict | None) – Configuration dictionary for the output distribution.

  • rnn_type (str) – Type of RNN to use (“lstm” or “gru”).

  • rnn_hidden_dim (int) – Dimension of the RNN hidden state.

  • rnn_num_layers (int) – Number of RNN layers.

Return type:

None

get_latent(obs, masks=None, hidden_state=None)[source]

Build the model latent by passing normalized observation groups through the RNN.

Parameters:
  • obs (TensorDict)

  • masks (torch.Tensor | None)

  • hidden_state (HiddenState)

Return type:

torch.Tensor

reset(dones=None, hidden_state=None)[source]

Reset the recurrent hidden state of the RNN.

Parameters:
  • dones (torch.Tensor | None)

  • hidden_state (HiddenState)

Return type:

None
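
A minimal sketch of a per-environment reset, using nested lists in place of tensors. It assumes that resetting means zeroing the hidden state, and that dones=None resets every environment; both are assumptions, and reset_hidden is a hypothetical helper.

```python
def reset_hidden(hidden, dones=None):
    """Zero the hidden state of finished environments (all of them if dones is None)."""
    if dones is None:
        return [[0.0] * len(h) for h in hidden]
    return [[0.0] * len(h) if done else list(h) for h, done in zip(hidden, dones)]


# One hidden vector per environment (three environments, hidden size two).
hidden = [[0.3, -0.2], [0.7, 0.1], [-0.5, 0.9]]

# Only the second environment finished its episode.
partial = reset_hidden(hidden, dones=[False, True, False])
full = reset_hidden(hidden)
```

Resetting only finished environments keeps memory from a previous episode from leaking into the next one, while the other environments continue uninterrupted.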

get_hidden_state()[source]

Return the recurrent hidden state of the RNN.

Return type:

torch.Tensor | tuple[torch.Tensor, torch.Tensor] | None
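
The union return type reflects the two RNN types: a PyTorch LSTM carries a pair of tensors (hidden state and cell state), while a GRU carries a single tensor. The sketch below computes the expected shapes for a unidirectional RNN; hidden_state_shapes is a hypothetical helper, and num_envs is an illustrative batch size.

```python
def hidden_state_shapes(rnn_type, num_envs, rnn_hidden_dim=256, rnn_num_layers=1):
    """Shapes of the recurrent state, following torch.nn.LSTM / torch.nn.GRU."""
    shape = (rnn_num_layers, num_envs, rnn_hidden_dim)
    if rnn_type == "lstm":
        return (shape, shape)  # (hidden state, cell state)
    if rnn_type == "gru":
        return shape  # hidden state only
    raise ValueError(f"unknown rnn_type: {rnn_type!r}")


lstm_shapes = hidden_state_shapes("lstm", num_envs=4096)
gru_shape = hidden_state_shapes("gru", num_envs=4096, rnn_num_layers=2)
```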

detach_hidden_state(dones=None)[source]

Detach the recurrent hidden state for truncated backpropagation.

Parameters:

dones (torch.Tensor | None)

Return type:

None

as_jit()[source]

Return a version of the model compatible with Torch JIT export.

Return type:

torch.nn.Module

as_onnx(verbose=False)[source]

Return a version of the model compatible with ONNX export.

Parameters:

verbose (bool)

Return type:

torch.nn.Module

CNN Model

class rsl_rl.models.cnn_model.CNNModel[source]

CNN-based neural model.

This model uses one or more convolutional neural network (CNN) encoders to process one or more 2D observation groups before passing the resulting latent to an MLP. Any 1D observation groups are directly concatenated with the CNN latent and passed to the MLP. 1D observations can be normalized before being passed to the MLP. The output of the model can be either deterministic or stochastic; in the stochastic case, a distribution module is used to sample the outputs.

__init__(obs, obs_groups, obs_set, output_dim, hidden_dims=(256, 256, 256), activation='elu', obs_normalization=False, distribution_cfg=None, cnn_cfg=None, cnns=None)[source]

Initialize the CNN-based model.

Parameters:
  • obs (TensorDict) – Observation dictionary.

  • obs_groups (dict[str, list[str]]) – Dictionary mapping observation sets to lists of observation groups.

  • obs_set (str) – Observation set to use for this model (e.g., “actor” or “critic”).

  • output_dim (int) – Dimension of the output.

  • hidden_dims (tuple[int, ...] | list[int]) – Hidden dimensions of the MLP.

  • activation (str) – Activation function of the CNN and MLP.

  • obs_normalization (bool) – Whether to normalize the observations before feeding them to the MLP.

  • distribution_cfg (dict | None) – Configuration dictionary for the output distribution.

  • cnn_cfg (dict[str, dict] | dict[str, Any] | None) – Configuration of the CNN encoder(s).

  • cnns (nn.ModuleDict | dict[str, nn.Module] | None) – CNN modules to use, e.g., for sharing CNNs between actor and critic. If None, new CNNs are created.

Return type:

None

get_latent(obs, masks=None, hidden_state=None)[source]

Build the model latent by combining normalized 1D and CNN-encoded 2D observation groups.

Parameters:
  • obs (TensorDict)

  • masks (torch.Tensor | None)

  • hidden_state (HiddenState)

Return type:

torch.Tensor
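
To give a feel for the size of the combined latent, the sketch below applies the standard convolution output-size formula to a hypothetical encoder and adds the width of the 1D observations. The layer settings, image size, and 1D width are invented for illustration, and the sketch assumes the CNN output is flattened before concatenation, which may not match the library's encoder.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution (same formula as torch.nn.Conv2d)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Hypothetical encoder: (out_channels, kernel, stride) per layer, 64x64 input.
layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
size = 64
sizes = []
for _, kernel, stride in layers:
    size = conv_out_size(size, kernel, stride)
    sizes.append(size)

cnn_latent_dim = layers[-1][0] * size * size  # flattened channels x height x width
obs_1d_dim = 48                               # hypothetical total width of 1D groups
mlp_input_dim = cnn_latent_dim + obs_1d_dim
```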

as_jit()[source]

Return a version of the model compatible with Torch JIT export.

Return type:

torch.nn.Module

as_onnx(verbose=False)[source]

Return a version of the model compatible with ONNX export.

Parameters:

verbose (bool)

Return type:

torch.nn.Module