Runners

On-Policy Runner

class rsl_rl.runners.on_policy_runner.OnPolicyRunner

On-policy runner for reinforcement learning algorithms.

__init__(env, train_cfg, log_dir=None, device='cpu')

Construct the runner, algorithm, and logging stack.

Parameters:
  • env (VecEnv)

  • train_cfg (dict)

  • log_dir (str | None)

  • device (str)

Return type:

None

alg: PPO

The actor-critic algorithm.

learn(num_learning_iterations, init_at_random_ep_len=False)

Run the learning loop for the specified number of iterations.

Parameters:
  • num_learning_iterations (int)

  • init_at_random_ep_len (bool)

Return type:

None
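
The constructor and learn() are usually called back to back from a training script. The following is a minimal sketch of that flow, not the library's own entry point. The helper name train and its iterations argument are hypothetical; env and train_cfg are passed through unchanged because their contents are not specified in this reference.

```python
def train(env, train_cfg, log_dir=None, device="cpu", iterations=1000,
          init_at_random_ep_len=False):
    """Build an OnPolicyRunner and run the learning loop (hypothetical helper)."""
    # Imported lazily so this helper can be defined without rsl_rl installed.
    from rsl_rl.runners.on_policy_runner import OnPolicyRunner

    # Arguments mirror the documented signature:
    # __init__(env, train_cfg, log_dir=None, device='cpu')
    runner = OnPolicyRunner(env, train_cfg, log_dir=log_dir, device=device)

    # learn(num_learning_iterations, init_at_random_ep_len=False)
    runner.learn(
        num_learning_iterations=iterations,
        init_at_random_ep_len=init_at_random_ep_len,
    )
    return runner
```

Returning the runner lets the caller save, export, or query the inference policy afterwards.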

save(path, infos=None)

Save the models and training state to a given path and upload them if external logging is used.

Parameters:
  • path (str)

  • infos (dict | None)

Return type:

None
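
A checkpoint can also be written explicitly, for example after training finishes. This sketch assumes that path is a file path; the helper save_checkpoint and the model_<iteration>.pt naming scheme are illustrative choices, not something this reference prescribes.

```python
import os


def save_checkpoint(runner, log_dir, iteration, infos=None):
    """Save a checkpoint under log_dir and return its path (hypothetical helper)."""
    # Make sure the target directory exists before asking the runner to write.
    os.makedirs(log_dir, exist_ok=True)

    # ASSUMPTION: the file name pattern is illustrative only.
    path = os.path.join(log_dir, f"model_{iteration}.pt")

    # save(path, infos=None): infos is an optional dict stored with the checkpoint.
    runner.save(path, infos=infos)
    return path
```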

load(path, load_cfg=None, strict=True, map_location=None)

Load the models and training state from a given path.

Parameters:
  • path (str) – Path to load the model from.

  • load_cfg (dict | None) – Optional dictionary that defines what models and states to load. If None, all models and states are loaded.

  • strict (bool) – Whether state_dict loading should be strict.

  • map_location (str | None) – Device mapping for loading the model.

Return type:

dict
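
One way to use load() is to resume training from an earlier checkpoint. The sketch below relies on the documented defaults (load_cfg=None loads all models and states, strict=True) and only sets map_location. The helper name resume is hypothetical, and the contents of the returned dict are not described here, so it is handed back to the caller untouched.

```python
def resume(runner, checkpoint_path, device="cpu", iterations=100):
    """Load a checkpoint, then continue training (hypothetical helper)."""
    # map_location remaps stored tensors, e.g. a GPU checkpoint onto the CPU.
    infos = runner.load(checkpoint_path, map_location=device)

    # Continue with the learning loop from the restored state.
    runner.learn(num_learning_iterations=iterations)

    # The dict returned by load() is passed through without interpretation.
    return infos
```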

get_inference_policy(device=None)

Return the policy on the requested device for inference.

Parameters:
  • device (str | None)

Return type:

MLPModel
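
The returned policy is typically used to drive the environment after training. The following rollout loop is a sketch under several assumptions that this reference does not confirm: that the returned model can be called on observations to produce actions, that the environment exposes get_observations(), and that step(actions) returns a 4-tuple of observations, rewards, dones, and extras. The helper name rollout is hypothetical.

```python
def rollout(runner, env, num_steps, device=None):
    """Step the environment with the inference policy (hypothetical helper)."""
    policy = runner.get_inference_policy(device=device)

    # ASSUMPTION: the environment provides the initial observations this way.
    obs = env.get_observations()

    for _ in range(num_steps):
        # ASSUMPTION: the policy maps observations directly to actions.
        actions = policy(obs)
        # ASSUMPTION: step() returns (observations, rewards, dones, extras).
        obs, _rewards, _dones, _extras = env.step(actions)

    return obs
```

With a PyTorch model, such a loop would normally run inside torch.inference_mode() so that no gradients are tracked.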

export_policy_to_jit(path, filename='policy.pt')

Export the policy model to a TorchScript (JIT) file.

Parameters:
  • path (str)

  • filename (str)

Return type:

None

export_policy_to_onnx(path, filename='policy.onnx', verbose=False)

Export the policy model to an ONNX file.

Parameters:
  • path (str)

  • filename (str)

  • verbose (bool)

Return type:

None
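
Both export methods share the same path and filename arguments, so they can be called together for deployment. This sketch assumes that path names the output directory and that each file ends up at path/filename; neither detail is stated in this reference. The helper name export_policy is hypothetical.

```python
import os


def export_policy(runner, export_dir, verbose=False):
    """Export the policy to JIT and ONNX in one directory (hypothetical helper)."""
    # Creating the directory up front is harmless if the runner also does it.
    os.makedirs(export_dir, exist_ok=True)

    # File names below repeat the documented defaults for clarity.
    runner.export_policy_to_jit(export_dir, filename="policy.pt")
    runner.export_policy_to_onnx(export_dir, filename="policy.onnx", verbose=verbose)

    # ASSUMPTION: the files are written to export_dir/filename.
    return (
        os.path.join(export_dir, "policy.pt"),
        os.path.join(export_dir, "policy.onnx"),
    )
```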

add_git_repo_to_log(repo_file_path)

Register a repository path whose git status should be logged.

Parameters:
  • repo_file_path (str)

Return type:

None
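
A training script may want to record the state of several repositories, such as the one containing the script itself. The sketch below registers each path in turn. The helper name register_repos is hypothetical, and it is an assumption that registration should happen before learn() is called and that a path to a file inside the repository is accepted.

```python
def register_repos(runner, repo_file_paths):
    """Register each path with the runner's git logging (hypothetical helper)."""
    for repo_file_path in repo_file_paths:
        # One call per repository whose git status should appear in the logs.
        runner.add_git_repo_to_log(repo_file_path)
```

A typical call from a training script might be register_repos(runner, [__file__]).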

Distillation Runner

class rsl_rl.runners.distillation_runner.DistillationRunner

Distillation runner for student-teacher algorithms.

alg: Distillation

The distillation algorithm.

learn(num_learning_iterations, init_at_random_ep_len=False)

Run the learning loop after validating that the teacher model is loaded.

Parameters:
  • num_learning_iterations (int)

  • init_at_random_ep_len (bool)

Return type:

None
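
Because learn() validates that a teacher model is loaded, a distillation script has to load one first. This sketch assumes that DistillationRunner offers the same load() method as OnPolicyRunner and that loading a teacher checkpoint through it satisfies the check; this reference confirms neither. The helper name distill is hypothetical.

```python
def distill(runner, teacher_checkpoint, iterations, device="cpu"):
    """Load a teacher checkpoint, then run distillation (hypothetical helper)."""
    # ASSUMPTION: load() is available here and restores the teacher model.
    runner.load(teacher_checkpoint, map_location=device)

    # learn() is documented to validate the teacher before the loop starts.
    runner.learn(num_learning_iterations=iterations)
    return runner
```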