This repository provides the implementation of GAIfO from the paper "On the Guaranteed Almost Equivalence Between Imitation Learning From Observation and Demonstration". Our code is mainly based on OpenAI Baselines.
Our code follows the framework of OpenAI Baselines, so you first need to install OpenAI Baselines by following the instructions here.
NOTE: All of our experiments use the MuJoCo (Multi-Joint dynamics with Contact) physics simulator, so MuJoCo must be installed. Instructions on setting up MuJoCo can be found here.
After successfully installing OpenAI Baselines and MuJoCo, copy the folder ./gaifo into baselines/baselines/.
Download the expert data here. Then copy the expert data file ./expert_data/Hopper-v2.npz into the folder baselines/data.
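Before training, it can help to confirm that both copies landed where the later commands expect them. The following is a minimal sketch, assuming the cloned OpenAI Baselines repository lives in a directory named baselines. The helper name missing_paths is hypothetical and is not part of this repository.

```python
from pathlib import Path


def missing_paths(baselines_root):
    """Return the expected paths that are absent under the Baselines checkout.

    `baselines_root` is the top-level directory of the cloned OpenAI Baselines
    repository (an assumption about the local layout).
    """
    root = Path(baselines_root)
    expected = [
        root / "baselines" / "gaifo",     # copied from ./gaifo
        root / "data" / "Hopper-v2.npz",  # copied from ./expert_data/Hopper-v2.npz
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_paths("baselines")
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("Layout looks complete.")
```

An empty result means both the gaifo package and the Hopper-v2 expert data are in place.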
The environment variable OPENAI_LOGDIR specifies the directory where log files are saved. The progress.csv file, which tracks the training process, is stored in this directory.
export OPENAI_LOGDIR=path_to_save_log_file
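Once training has written some rows, progress.csv can be inspected with the Python standard library. The following is a minimal sketch. The function name read_progress is hypothetical, and the code makes no assumption about which columns the logger writes; it simply returns whatever the header row declares.

```python
import csv
import os
from pathlib import Path


def read_progress(logdir=None):
    """Load progress.csv from the log directory as a list of row dicts."""
    # Fall back to the OPENAI_LOGDIR environment variable when no path is given.
    logdir = logdir or os.environ["OPENAI_LOGDIR"]
    with open(Path(logdir) / "progress.csv", newline="") as f:
        # Each row maps column name -> value; all values are read as strings.
        return list(csv.DictReader(f))
```

For example, rows = read_progress() followed by print(list(rows[0].keys())) lists the logged column names, and rows[-1] holds the most recent entry.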
Run with single rank:
python -m baselines.gaifo.run_mujoco
Run with multiple ranks:
mpirun -np 8 python -m baselines.gaifo.run_mujoco
A complete example of training GAIfO for Hopper-v2 is given as follows:
mpirun -np 8 python -m baselines.gaifo.run_mujoco --env_id=Hopper-v2 --expert_path data/Hopper-v2.npz --traj_limitation 20 --seed=0 --g_step=3 --d_step=1 --num_timesteps=5000000
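To repeat the Hopper-v2 run over several seeds, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in the example. The function name gaifo_command and the seed values are hypothetical and chosen for illustration.

```python
def gaifo_command(seed, ranks=8):
    """Build the Hopper-v2 training command for one seed as an argument list."""
    return [
        "mpirun", "-np", str(ranks),
        "python", "-m", "baselines.gaifo.run_mujoco",
        "--env_id=Hopper-v2",
        "--expert_path", "data/Hopper-v2.npz",
        "--traj_limitation", "20",
        f"--seed={seed}",
        "--g_step=3",
        "--d_step=1",
        "--num_timesteps=5000000",
    ]


if __name__ == "__main__":
    # Print one command per seed; the seed values are illustrative only.
    for seed in (0, 1, 2):
        print(" ".join(gaifo_command(seed)))
```

Each list can be passed to subprocess.run(cmd, check=True) from the Baselines root directory, so that the relative path data/Hopper-v2.npz resolves.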
See help (-h) for more options.
If you find this repository or our paper useful, please cite it in your publications.
@ARTICLE{9509344,
author={Cheng, Zhihao and Liu, Liu and Liu, Aishan and Sun, Hao and Fang, Meng and Tao, Dacheng},
journal={IEEE Transactions on Neural Networks and Learning Systems},
title={On the Guaranteed Almost Equivalence Between Imitation Learning From Observation and Demonstration},
year={2021},
volume={},
number={},
pages={1-13},
doi={10.1109/TNNLS.2021.3099621}
}
Thanks to the following open-source project:
- @openai/baselines
The MIT License