Runners

On-Policy Runner

class rsl_rl.runners.on_policy_runner.OnPolicyRunner

On-policy runner for reinforcement learning algorithms.

__init__(env, train_cfg, log_dir=None, device='cpu')

Construct the runner, algorithm, and logging stack.

Parameters:

env (VecEnv)
train_cfg (dict)
log_dir (str | None)
device (str)

Return type:

None

alg: PPO

The actor-critic algorithm.

learn(num_learning_iterations, init_at_random_ep_len=False)

Run the learning loop for the specified number of iterations.

Parameters:

num_learning_iterations (int)
init_at_random_ep_len (bool)

Return type:

None
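
A minimal usage sketch of the constructor and learn, based only on the signatures listed above. The helper name train and its default iteration count are hypothetical. The runner class is passed in as an argument, so the snippet makes no assumption about how env and train_cfg are built.

```python
from typing import Any, Optional


def train(runner_cls: Any, env: Any, train_cfg: dict,
          log_dir: Optional[str] = None, device: str = "cpu",
          num_learning_iterations: int = 100) -> Any:
    """Build a runner and run its learning loop (hypothetical helper)."""
    # Argument layout follows the documented __init__ signature.
    runner = runner_cls(env, train_cfg, log_dir=log_dir, device=device)
    # init_at_random_ep_len is passed with its documented default (False).
    runner.learn(num_learning_iterations, init_at_random_ep_len=False)
    return runner
```

With rsl_rl installed, this would presumably be called as train(OnPolicyRunner, env, train_cfg), where env is a VecEnv and train_cfg is a configuration dict.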

save(path, infos=None)

Save the models and training state to a given path and upload them if external logging is used.

Parameters:

path (str)
infos (dict | None)

Return type:

None
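
As an illustration of calling save, the sketch below builds a checkpoint path and passes it on. The helper save_checkpoint and the "model_<iteration>.pt" naming scheme are assumptions made for this example and are not prescribed by the API above.

```python
import os
from typing import Any, Optional


def save_checkpoint(runner: Any, log_dir: str, iteration: int,
                    infos: Optional[dict] = None) -> str:
    """Save the runner state under log_dir and return the path used."""
    # The file naming scheme is an assumption of this sketch.
    path = os.path.join(log_dir, f"model_{iteration}.pt")
    # infos is forwarded unchanged; None matches the documented default.
    runner.save(path, infos=infos)
    return path
```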

load(path, load_cfg=None, strict=True, map_location=None)

Load the models and training state from a given path.

Parameters:

path (str) – Path to load the model from.
load_cfg (dict | None) – Optional dictionary that defines what models and states to load. If None, all models and states are loaded.
strict (bool) – Whether state_dict loading should be strict.
map_location (str | None) – Device mapping for loading the model.

Return type:

dict

get_inference_policy(device=None)

Return the policy on the requested device for inference.

Parameters:

device (str | None)

Return type:

MLPModel
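
A common pattern is to restore a checkpoint and then fetch the policy for evaluation. The sketch below combines load and get_inference_policy using only their documented parameters. The helper name load_inference_policy is hypothetical, and the runner is assumed to be already constructed.

```python
from typing import Any, Optional


def load_inference_policy(runner: Any, checkpoint_path: str,
                          device: Optional[str] = None) -> Any:
    """Restore a checkpoint and return the policy for inference."""
    # load_cfg is omitted, so all models and states are loaded;
    # map_location is the documented device mapping for loading.
    runner.load(checkpoint_path, map_location=device)
    # Request the policy on the same device that was used for loading.
    return runner.get_inference_policy(device=device)
```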

export_policy_to_jit(path, filename='policy.pt')

Export the model to a Torch JIT file.

Parameters:

path (str)
filename (str)

Return type:

None

export_policy_to_onnx(path, filename='policy.onnx', verbose=False)

Export the model into an ONNX file.

Parameters:

path (str)
filename (str)
verbose (bool)

Return type:

None
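
The two export methods share the same path and filename arguments, so exporting both formats into one directory can be sketched as follows. The helper export_policy is hypothetical. The returned paths assume that each file is written as filename inside path, which the reference above does not state explicitly.

```python
import os
from typing import Any, Tuple


def export_policy(runner: Any, export_dir: str) -> Tuple[str, str]:
    """Export the policy as Torch JIT and ONNX; return the expected file paths."""
    # File names are the documented defaults, passed explicitly for clarity.
    runner.export_policy_to_jit(export_dir, filename="policy.pt")
    runner.export_policy_to_onnx(export_dir, filename="policy.onnx", verbose=False)
    # Assumption: each exporter writes <path>/<filename>.
    return (os.path.join(export_dir, "policy.pt"),
            os.path.join(export_dir, "policy.onnx"))
```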

add_git_repo_to_log(repo_file_path)

Parameters:

repo_file_path (str)

Return type:

None

Distillation Runner

class rsl_rl.runners.distillation_runner.DistillationRunner

Distillation runner for student-teacher algorithms.

alg: Distillation

The distillation algorithm.

learn(num_learning_iterations, init_at_random_ep_len=False)

Run the learning loop after validating that the teacher model is loaded.

Parameters:

num_learning_iterations (int)
init_at_random_ep_len (bool)

Return type:

None
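
Because learn validates that the teacher model is loaded, a checkpoint is typically restored before training starts. The sketch below shows that ordering. It assumes that the distillation runner exposes the same load method as OnPolicyRunner and that loading the given checkpoint provides the teacher model; neither is stated in this reference. The helper name run_distillation is hypothetical.

```python
from typing import Any


def run_distillation(runner: Any, teacher_checkpoint: str,
                     num_learning_iterations: int) -> None:
    """Load the teacher checkpoint, then start distillation training."""
    # Assumption: load() is available here and restores the teacher model.
    runner.load(teacher_checkpoint)
    # learn() is called only after loading, so its teacher check can pass.
    runner.learn(num_learning_iterations, init_at_random_ep_len=False)
```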