This is a PyTorch implementation of the paper
Foerster, Jakob N., et al. "Counterfactual multi-agent policy gradients." Thirty-Second AAAI Conference on Artificial Intelligence. 2018.
in the multi-agent environment "findgoals" (https://github.com/Bigpig4396/Multi-Agent-Reinforcement-Learning-Environment). The description of the environment is in 'FindGoals.pdf'.
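
For readers new to the paper, the core idea of COMA is a counterfactual baseline: each agent's advantage compares the critic's value for the action actually taken against the policy-weighted average over that agent's alternative actions, with the other agents' actions held fixed. The snippet below is a minimal plain-Python sketch of that formula only. It is not taken from 'COMA2.py', and the function name `counterfactual_advantage` and its arguments are hypothetical names chosen for illustration.

```python
def counterfactual_advantage(q_values, policy, action):
    """COMA-style advantage for one agent (illustrative sketch, not the repo's code).

    q_values[k]: critic estimate of Q when this agent takes action k and
                 all other agents' actions stay fixed.
    policy[k]:   probability that this agent's policy assigns to action k.
    action:      index of the action this agent actually took.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Counterfactual baseline: expected Q under the agent's own policy.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    # Advantage: how much better the chosen action is than that baseline.
    return q_values[action] - baseline


# Example with three actions; the numbers are made up for illustration.
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]
print(counterfactual_advantage(q, pi, action=2))
```

In a PyTorch implementation the same computation would typically be done on batched tensors, and the resulting advantage would weight the log-probability of the chosen action in the policy-gradient loss.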
You need to install opencv-python and PyTorch to run the code. Running 'COMA2.py' produces a training curve like the following: