Utils

Logger

Utils

rsl_rl.utils.utils.get_param(param, idx)[source]

Get a parameter for the given index.

Parameters:
  • param (Any) – Parameter or list/tuple of parameters.

  • idx (int) – Index to get the parameter for.

Return type:

Any
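
The parameter description suggests that a list or tuple is indexed, while any other value is returned unchanged. The following is a minimal sketch of that behavior, not the library's code; get_param_sketch, hidden_dims and dropout are hypothetical names used only for illustration.

```python
from typing import Any


def get_param_sketch(param: Any, idx: int) -> Any:
    """Return param[idx] for a list/tuple, otherwise return param unchanged.

    Assumption: this mirrors the documented intent of get_param; the real
    implementation may differ in details.
    """
    if isinstance(param, (list, tuple)):
        return param[idx]
    return param


# A per-layer setting given as a list, and a shared setting given as a scalar.
hidden_dims = [256, 128, 64]
dropout = 0.1

layer_1_dim = get_param_sketch(hidden_dims, 1)      # picks the second entry
layer_1_dropout = get_param_sketch(dropout, 1)      # scalar is shared by all indices
```

This pattern lets a configuration accept either one value for all layers or one value per layer without the caller branching on the type.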

rsl_rl.utils.utils.resolve_nn_activation(act_name)[source]

Resolve the activation function from the name.

Valid activation function names are: "elu", "selu", "relu", "crelu", "lrelu", "tanh", "sigmoid", "softplus", "gelu", "swish", "mish", "identity".

Parameters:

act_name (str) – Name of the activation function.

Returns:

The activation function.

Raises:

ValueError – If the activation function is not found.

Return type:

torch.nn.Module

rsl_rl.utils.utils.resolve_optimizer(optimizer_name)[source]

Resolve the optimizer from the name.

Valid optimizer names are: "adam", "adamw", "sgd", "rmsprop".

Parameters:

optimizer_name (str) – Name of the optimizer.

Returns:

The optimizer.

Raises:

ValueError – If the optimizer is not found.

Return type:

torch.optim.Optimizer

rsl_rl.utils.utils.resolve_callable(callable_or_name)[source]

Resolve a callable from a string or a type, or return the callable unchanged if one is passed directly.

This function supports resolving callables from a direct callable input or from a string in one of these formats:

  • Direct callable: pass a type or function directly (for example, MyClass or my_func).

  • Qualified name with colon: "module.path:Attr.Nested" (explicit, recommended).

  • Qualified name with dot: "module.path.ClassName" (implicit).

  • Simple name: for example "PPO" or "ActorCritic" (searched within rsl_rl).

Parameters:

callable_or_name (type | Callable | str) – A callable (type/function) or string name.

Returns:

The resolved callable.

Raises:
  • TypeError – If input is neither a callable nor a string.

  • ImportError – If the module cannot be imported.

  • AttributeError – If the attribute cannot be found in the module.

  • ValueError – If a simple name cannot be found in rsl_rl packages.

Return type:

Callable
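
The two qualified string formats above can be illustrated with importlib from the standard library. This is a sketch under stated assumptions, not the rsl_rl implementation: resolve_callable_sketch is a hypothetical name, the simple-name lookup inside rsl_rl is deliberately left out, and the dotted form is assumed to treat only the last component as the attribute.

```python
import importlib
from typing import Any, Callable


def resolve_callable_sketch(callable_or_name: Any) -> Callable:
    """Resolve a callable from a callable or a qualified string (illustrative only)."""
    if callable(callable_or_name):
        # Direct callable: return it unchanged.
        return callable_or_name
    if not isinstance(callable_or_name, str):
        raise TypeError(f"Expected a callable or a string, got {type(callable_or_name)}.")

    if ":" in callable_or_name:
        # Explicit form: everything before the colon is the module path.
        module_path, attr_path = callable_or_name.split(":", 1)
    elif "." in callable_or_name:
        # Implicit form: assume the last dotted component is the attribute.
        module_path, attr_path = callable_or_name.rsplit(".", 1)
    else:
        # The package-wide search for simple names is not reproduced here.
        raise ValueError(f"Simple names are not handled in this sketch: {callable_or_name!r}")

    obj = importlib.import_module(module_path)  # raises ImportError on failure
    for name in attr_path.split("."):
        obj = getattr(obj, name)  # raises AttributeError on failure
    return obj


# Standard-library targets are used so the example runs without rsl_rl installed.
ordered_dict_cls = resolve_callable_sketch("collections:OrderedDict")
from_keys = resolve_callable_sketch("collections:OrderedDict.fromkeys")
join = resolve_callable_sketch("os.path.join")
```

The colon form is unambiguous because it states where the module path ends, which is why nested attributes such as "Attr.Nested" only work reliably with it.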

rsl_rl.utils.utils.resolve_obs_groups(obs, obs_groups, default_sets)[source]

Validate the observation configuration and resolve missing observation sets.

The input is an observation dictionary obs containing observation groups and a configuration dictionary obs_groups where the keys are the observation sets and the values are lists of observation groups.

The configuration dictionary could for example look like:

{
    "actor": ["group_1", "group_2"],
    "critic": ["group_1", "group_3"],
}

This means that the "actor" observation set contains the observation groups "group_1" and "group_2", and the "critic" observation set contains the observation groups "group_1" and "group_3". The function checks that every group listed in the "actor" and "critic" observation sets is present in the observation dictionary from the environment.

Additionally, if one of the default_sets, e.g. "critic", is not present in the configuration dictionary, this function will:

  1. Check whether a group with the same name exists in the observations and, if so, assign that group to the observation set.

  2. If step 1 fails, assign the "policy" observation group to the missing observation set.

  3. If step 2 fails, raise an error.

Parameters:
  • obs (tensordict.TensorDict) – Observations from the environment in the form of a dictionary.

  • obs_groups (dict[str, list[str]]) – Dictionary mapping observation sets to lists of observation groups.

  • default_sets (list[str]) – Default observation set names used by the algorithm. Any set missing from obs_groups is resolved using the fallback rules described above.

Returns:

The resolved observation groups.

Raises:
  • ValueError – If any observation set is an empty list.

  • ValueError – If any observation set contains an observation group that is not present in the observations.

  • ValueError – If a default observation set cannot be resolved according to the rules above.

Return type:

dict[str, list[str]]
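
The validation and fallback rules described for this function can be sketched with plain dictionaries. This is an illustration of the documented rules only, not the library's code: resolve_obs_groups_sketch is a hypothetical name, a plain dict stands in for the TensorDict, and the "teacher" set is invented for the example.

```python
def resolve_obs_groups_sketch(obs, obs_groups, default_sets):
    """Validate configured sets and fill in missing default sets (illustrative only)."""
    resolved = {name: list(groups) for name, groups in obs_groups.items()}

    # Validate the sets that were configured explicitly.
    for set_name, groups in resolved.items():
        if len(groups) == 0:
            raise ValueError(f"Observation set '{set_name}' is empty.")
        for group in groups:
            if group not in obs:
                raise ValueError(f"Group '{group}' in set '{set_name}' is not in the observations.")

    # Resolve default sets that are missing from the configuration.
    for set_name in default_sets:
        if set_name in resolved:
            continue
        if set_name in obs:
            # Rule 1: a group with the same name exists.
            resolved[set_name] = [set_name]
        elif "policy" in obs:
            # Rule 2: fall back to the "policy" group.
            resolved[set_name] = ["policy"]
        else:
            # Rule 3: nothing left to fall back to.
            raise ValueError(f"Cannot resolve default observation set '{set_name}'.")
    return resolved


# Plain lists stand in for observation tensors.
obs = {"policy": [0.0, 1.0], "critic": [2.0], "group_3": [3.0]}
resolved = resolve_obs_groups_sketch(
    obs,
    {"actor": ["policy", "group_3"]},
    ["actor", "critic", "teacher"],
)
```

Here "critic" is resolved by rule 1 because a group of that name exists, while "teacher" falls back to the "policy" group under rule 2.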

rsl_rl.utils.utils.check_nan(obs, rewards, dones)[source]

Raise ValueError if any environment output contains NaN.

Parameters:
  • obs (tensordict.TensorDict)

  • rewards (torch.Tensor)

  • dones (torch.Tensor)

Return type:

None

rsl_rl.utils.utils.split_and_pad_trajectories(tensor, dones)[source]

Split trajectories at done indices.

Split trajectories, concatenate them and pad with zeros up to the length of the longest trajectory. Return masks corresponding to valid parts of the trajectories.

Example (transposed for readability):

    Input:  [[a1, a2, a3, a4 | a5, a6],
             [b1, b2 | b3, b4, b5 | b6]]

    Output: [[a1, a2, a3, a4],  |  [[True, True, True, True],
             [a5, a6, 0, 0],    |   [True, True, False, False],
             [b1, b2, 0, 0],    |   [True, True, False, False],
             [b3, b4, b5, 0],   |   [True, True, True, False],
             [b6, 0, 0, 0]]     |   [True, False, False, False]]

Assumes that the input has the following order of dimensions: [time, number of envs, additional dimensions]

Parameters:
  • tensor (torch.Tensor | TensorDict)

  • dones (torch.Tensor)

Return type:

tuple[torch.Tensor | TensorDict, torch.Tensor]
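
The split-and-pad behavior can be reproduced on nested Python lists to make the example in the description concrete. This is a sketch, not the tensor implementation: split_and_pad_sketch is a hypothetical name, inputs are indexed as [time][env], and the result is returned with one row per trajectory (the transposed layout used in the example) rather than in the library's tensor layout. It also assumes that the final time step always closes a trajectory.

```python
def split_and_pad_sketch(values, dones, pad=0):
    """Split per-env sequences at done flags and pad them to equal length.

    values and dones are nested lists indexed as [time][env].
    Returns (padded, masks), each with one row per trajectory.
    """
    num_steps = len(values)
    num_envs = len(values[0])
    trajectories = []
    for env in range(num_envs):
        current = []
        for t in range(num_steps):
            current.append(values[t][env])
            # A trajectory ends at a done flag or at the last time step.
            if dones[t][env] or t == num_steps - 1:
                trajectories.append(current)
                current = []

    max_len = max(len(traj) for traj in trajectories)
    padded = [traj + [pad] * (max_len - len(traj)) for traj in trajectories]
    masks = [[i < len(traj) for i in range(max_len)] for traj in trajectories]
    return padded, masks


# Same data as the example: env "a" is done after a4, env "b" after b2 and b5.
values = [["a1", "b1"], ["a2", "b2"], ["a3", "b3"],
          ["a4", "b4"], ["a5", "b5"], ["a6", "b6"]]
dones = [[False, False], [False, True], [False, False],
         [True, False], [False, True], [False, False]]

padded, masks = split_and_pad_sketch(values, dones)
for row, mask in zip(padded, masks):
    print(row, mask)
```

The masks mark which entries are real data, so downstream code (for example a recurrent network's loss) can ignore the zero padding.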

rsl_rl.utils.utils.unpad_trajectories(trajectories, masks)[source]

Do the inverse operation of split_and_pad_trajectories().

Parameters:
  • trajectories (torch.Tensor | TensorDict)

  • masks (torch.Tensor)

Return type:

torch.Tensor | TensorDict

Wandb Utils

class rsl_rl.utils.wandb_utils.WandbSummaryWriter[source]

Summary writer for W&B.

__init__(log_dir, flush_secs, cfg)[source]

Initialize a W&B run for logging.

Parameters:
  • log_dir (str)

  • flush_secs (int)

  • cfg (dict)

Return type:

None

store_config(env_cfg, train_cfg)[source]

Upload environment and training configuration to W&B.

Parameters:
  • env_cfg (dict | object)

  • train_cfg (dict)

Return type:

None

add_scalar(tag, scalar_value, global_step=None, walltime=None, new_style=False)[source]

Log a scalar to both TensorBoard and W&B.

Parameters:
  • tag (str)

  • scalar_value (float)

  • global_step (int | None)

  • walltime (float | None)

  • new_style (bool)

Return type:

None

stop()[source]

Finish the active W&B run.

Return type:

None

save_model(model_path, it)[source]

Upload a model checkpoint artifact to W&B.

Parameters:
  • model_path (str)

  • it (int)

Return type:

None

save_file(path)[source]

Upload an arbitrary file artifact to W&B.

Parameters:

path (str)

Return type:

None

save_video(video, it)[source]

Upload a video artifact to W&B, at most once per filename.

Parameters:
  • video (Path)

  • it (int)

Return type:

None

Neptune Utils

class rsl_rl.utils.neptune_utils.NeptuneSummaryWriter[source]

Summary writer for Neptune.

__init__(log_dir, flush_secs, cfg)[source]

Initialize a Neptune run for logging.

Parameters:
  • log_dir (str)

  • flush_secs (int)

  • cfg (dict)

Return type:

None

store_config(env_cfg, train_cfg)[source]

Upload environment and training configuration to Neptune.

Parameters:
  • env_cfg (dict | object)

  • train_cfg (dict)

Return type:

None

add_scalar(tag, scalar_value, global_step=None, walltime=None, new_style=False)[source]

Log a scalar to both TensorBoard and Neptune.

Parameters:
  • tag (str)

  • scalar_value (float)

  • global_step (int | None)

  • walltime (float | None)

  • new_style (bool)

Return type:

None

stop()[source]

Finish the active Neptune run.

Return type:

None

save_model(model_path, it)[source]

Upload a model checkpoint artifact to Neptune.

Parameters:
  • model_path (str)

  • it (int)

Return type:

None

save_file(path)[source]

Upload an arbitrary file artifact to Neptune.

Parameters:

path (str)

Return type:

None