.. index:: pair: page; MPC-Net
.. _doxid-ocs2_doc_mpcnet:

MPC-Net
=======

MPC-Net is an imitation learning approach that uses solutions from MPC to guide the policy search. The main idea is to imitate MPC by minimizing the control Hamiltonian while representing the corresponding control inputs by a parametrized policy. MPC-Net can be used to clone a model predictive controller into a neural network policy, which can be evaluated much faster than MPC. Therefore, MPC-Net is a useful proxy for MPC in computationally demanding applications that do not require the exact MPC solution.

The multi-threaded data generation and policy evaluation run asynchronously with the policy training. The data generation and policy evaluation are implemented in C++ and run on CPU, while the policy training is implemented in Python and runs on GPU. The control Hamiltonian is represented by a linear-quadratic approximation. Therefore, the training can run on GPU without callbacks to OCS2 C++ code running on CPU to evaluate the Hamiltonian, and one can exploit batch processing on GPU.

Robots
~~~~~~

MPC-Net has been implemented for the following :ref:`Robotic Examples <doxid-ocs2_doc_robotic_examples>`:

============= ================ ================= ======== =============
Robot         Recom. CPU Cores Recom. GPU Memory RaiSim   Training Time
============= ================ ================= ======== =============
ballbot       4                2 GB              No       0m 20s
legged_robot  12               8 GB              Yes / No 7m 40s
============= ================ ================= ======== =============

Setup
~~~~~

Make sure to follow the :ref:`Installation <doxid-ocs2_doc_installation>` page. Follow all the instructions for the required dependencies. Among the optional dependencies, make sure to follow the instructions for ONNX Runtime and the virtual environment, and optionally set up RaiSim.

To build all MPC-Net packages, build the meta package:

.. code-block:: bash

    cd <catkin_workspace>
    catkin build ocs2_mpcnet

    # Example:
    cd ~/catkin_ws
    catkin build ocs2_mpcnet

To build a robot-specific package, replace :code:`<robot>` with the robot name:

.. code-block:: bash

    cd <catkin_workspace>
    catkin build ocs2_<robot>_mpcnet

    # Example:
    cd ~/catkin_ws
    catkin build ocs2_ballbot_mpcnet

Training
~~~~~~~~

To train an MPC-Net policy, run:

.. code-block:: bash

    cd <ocs2_repository>/ocs2_mpcnet/ocs2_<robot>_mpcnet/python/ocs2_<robot>_mpcnet
    source <catkin_workspace>/devel/setup.bash
    source <virtual_environments_folder>/mpcnet/bin/activate
    python3 train.py

    # Example:
    cd ~/git/ocs2/ocs2_mpcnet/ocs2_ballbot_mpcnet/python/ocs2_ballbot_mpcnet
    source ~/catkin_ws/devel/setup.bash
    source ~/venvs/mpcnet/bin/activate
    python3 train.py

To monitor the training progress with TensorBoard, run:

.. code-block:: bash

    cd <ocs2_repository>/ocs2_mpcnet/ocs2_<robot>_mpcnet/python/ocs2_<robot>_mpcnet
    source <virtual_environments_folder>/mpcnet/bin/activate
    tensorboard --logdir=runs

    # Example:
    cd ~/git/ocs2/ocs2_mpcnet/ocs2_ballbot_mpcnet/python/ocs2_ballbot_mpcnet
    source ~/venvs/mpcnet/bin/activate
    tensorboard --logdir=runs

If you use RaiSim, you can visualize the data generation and policy evaluation rollouts with RaiSim Unity, where pre-built executables are provided in RaiSim's :code:`raisimUnity` folder. For example, on Linux run:

.. code-block:: bash

    <raisim_repository>/raisimUnity/linux/raisimUnity.x86_64

    # Example:
    ~/git/raisimLib/raisimUnity/linux/raisimUnity.x86_64
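The quantity minimized during training is the control Hamiltonian, which the C++ side exports as a linear-quadratic approximation around the MPC solution. The loss therefore reduces to a batched quadratic form that can be evaluated entirely on GPU. The following PyTorch sketch illustrates such a loss and one training step with optional gradient clipping; the tensor names, batch layout, and :code:`policy` module are hypothetical illustrations, not the actual ocs2_mpcnet_core API:

.. code-block:: python

    import torch

    def hamiltonian_loss(dx, du, f, dfdx, dfdu, dfdxx, dfdux, dfduu):
        """Mean of the batched linear-quadratic Hamiltonian approximation.

        Hypothetical batched tensors (B = batch size, nx/nu = state/input
        dimension): dx (B, nx) and du (B, nu) are deviations from the MPC
        solution, f (B,) is the constant term, dfdx (B, nx) and dfdu (B, nu)
        are the gradients, and dfdxx (B, nx, nx), dfdux (B, nu, nx),
        dfduu (B, nu, nu) are the Hessian blocks.
        """
        h = (0.5 * torch.einsum('bi,bij,bj->b', dx, dfdxx, dx)
             + torch.einsum('bi,bij,bj->b', du, dfdux, dx)
             + 0.5 * torch.einsum('bi,bij,bj->b', du, dfduu, du)
             + torch.einsum('bi,bi->b', dfdx, dx)
             + torch.einsum('bi,bi->b', dfdu, du)
             + f)
        return h.mean()

    def training_step(policy, optimizer, batch, clip_value=None):
        """One hypothetical gradient step on a batch of MPC data."""
        u_policy = policy(batch['observation'])  # (B, nu) predicted inputs
        du = u_policy - batch['u_mpc']           # deviation from the MPC inputs
        loss = hamiltonian_loss(batch['dx'], du, batch['f'], batch['dfdx'],
                                batch['dfdu'], batch['dfdxx'], batch['dfdux'],
                                batch['dfduu'])
        optimizer.zero_grad()
        loss.backward()
        if clip_value is not None:
            # Gradient clipping, cf. the "Known Issues" section below.
            torch.nn.utils.clip_grad_norm_(policy.parameters(), clip_value)
        optimizer.step()
        return loss.item()

Because the approximation is quadratic, no callback into the OCS2 C++ code is needed inside the training loop, which is what allows the whole batch to stay on GPU.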
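The training exports the policy as an ONNX file (e.g. :code:`final_policy.onnx` in the :code:`runs` folder). Before deploying it with ROS as described in the next section, you can sanity-check an exported policy from Python with ONNX Runtime. This is a minimal sketch that assumes the policy has a single fixed-size observation input; the actual input and output layout is robot-specific, so names and shapes are queried from the model rather than hard-coded:

.. code-block:: python

    import numpy as np
    import onnxruntime as ort

    # Path to an exported policy; adjust to your run (hypothetical example).
    policy_path = 'runs/<run_name>/final_policy.onnx'

    session = ort.InferenceSession(policy_path)
    input_meta = session.get_inputs()[0]
    print('input:', input_meta.name, input_meta.shape)

    # Assumes the last input dimension is the fixed observation size.
    obs_dim = input_meta.shape[-1]
    observation = np.zeros((1, obs_dim), dtype=np.float32)

    # Evaluate the policy once and inspect the resulting output shapes.
    outputs = session.run(None, {input_meta.name: observation})
    print('outputs:', [o.shape for o in outputs])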
Deployment
~~~~~~~~~~

To deploy the default policy stored in the robot-specific package's :code:`policy` folder, run:

.. code-block:: bash

    cd <catkin_workspace>
    source devel/setup.bash
    roslaunch ocs2_<robot>_mpcnet <robot>_mpcnet.launch

    # Example:
    cd ~/catkin_ws
    source devel/setup.bash
    roslaunch ocs2_ballbot_mpcnet ballbot_mpcnet.launch

To deploy a new policy stored in the robot-specific package's :code:`python/ocs2_<robot>_mpcnet/runs` folder, replace :code:`<path_to_policy>` with the absolute file path to the final policy and run:

.. code-block:: bash

    cd <catkin_workspace>
    source devel/setup.bash
    roslaunch ocs2_<robot>_mpcnet <robot>_mpcnet.launch policyFile:=<path_to_policy>

    # Example:
    cd ~/catkin_ws
    source devel/setup.bash
    roslaunch ocs2_ballbot_mpcnet ballbot_mpcnet.launch policyFile:='/home/user/git/ocs2/ocs2_mpcnet/ocs2_ballbot_mpcnet/python/ocs2_ballbot_mpcnet/runs/2022-04-01_12-00-00_ballbot_description/final_policy.onnx'

How to Set Up a New Robot
~~~~~~~~~~~~~~~~~~~~~~~~~

Setting up MPC-Net for a new robot is relatively easy, as the **ocs2_mpcnet_core** package takes care of the data generation as well as the policy evaluation rollouts and implements the important learning components, such as the memory, the policy, and the loss function.

This section assumes that you already have the packages for the robot-specific MPC implementation:

1. **ocs2_<robot>**: Provides the library with the robot-specific MPC implementation.
2. **ocs2_<robot>_ros**: Wraps the MPC implementation with ROS to define ROS nodes.
3. **ocs2_<robot>_raisim**: (Optional) Provides the interface between the robot-specific MPC implementation and RaiSim.

For the actual **ocs2_<robot>_mpcnet** package, follow the structure of the existing robot-specific MPC-Net packages. The most important classes/files that have to be implemented are:

* **<Robot>MpcnetDefinition**: Defines how OCS2 state variables are transformed to policy observations, and how policy actions are transformed to OCS2 control inputs.
* **<Robot>MpcnetInterface**: Provides the interface between C++ and Python, allowing data and policies to be exchanged.
* **<robot>.yaml**: Stores the configuration parameters.
* **<robot>_mpcnet.py**: Adds the robot-specific methods for the MPC-Net training, e.g. implements the tasks that the robot should execute.
* **train.py**: Starts the main training script.

Known Issues
~~~~~~~~~~~~

Stiff inequality constraints can lead to very large Hamiltonians and gradients of the Hamiltonian near the log barrier. This can obstruct the learning process, and the policy might not learn anything useful. In that case, enable gradient clipping in the robot's MPC-Net YAML configuration file and tune the gradient clipping value (the training-step sketch in the Training section shows where clipping enters the update).

References
~~~~~~~~~~

This part of the toolbox has been developed based on the following publications:

.. bibliography::
    :list: enumerated

    carius2020mpcnet
    reske2021imitation