Reinforcement Learning Continuous Control

Repo Table of Contents

Project Overview

For Project 2 of Udacity's Deep Reinforcement Learning Nanodegree, we were tasked with teaching an agent to maintain a target position using the "Reacher" environment configured by Udacity on Unity's ML-Agents platform.

For further information on the environment, see the accompanying project Report or Udacity's project GitHub repo.

In this project, we explored several algorithms for solving this continuous state space environment, including Deep Deterministic Policy Gradient (DDPG), Distributed Distributional Deep Deterministic Policy Gradient (D4PG), Proximal Policy Optimization (PPO), and Twin Delayed Deep Deterministic Policy Gradient (TD3). DDPG is the base implementation; work-in-progress implementations of the remaining algorithms are also available in this repo.

The algorithms are further explained in the accompanying Report.
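As a rough illustration of one ingredient commonly associated with DDPG, the sketch below shows a "soft" target-network update, where target weights slowly track the learned weights. It is a minimal sketch on plain Python lists and is not taken from this repo: the function name `soft_update`, the parameter name `tau`, and its default value are assumptions chosen for illustration.

```python
def soft_update(local_params, target_params, tau=0.001):
    """Blend local weights into target weights.

    Computes target <- tau * local + (1 - tau) * target, element by element.
    The names and the default tau are illustrative assumptions, not this
    repo's actual settings.
    """
    return [tau * l + (1.0 - tau) * t
            for l, t in zip(local_params, target_params)]


if __name__ == "__main__":
    # With tau=0.5 the new target sits halfway between the two weight sets.
    print(soft_update([1.0, 3.0], [0.0, 1.0], tau=0.5))
```

A small `tau` keeps the target network changing slowly, which is the usual motivation for this style of update; see the Report for how the agents in this repo are actually implemented.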

Environment Setup

To set up the Python (conda) environment, run the following from the root directory:

conda env update --file=environment_drlnd.yml

This requires installation of OpenAI Gym and Unity's ML-Agents.

In the root directory, run python setup.py to set up the directories and download the specified environments. Have the full path to your root repo folder ready when you run it, and end the input with a "/".
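Because the root path is expected to end with a "/", it can help to normalize the value before entering it. The following is a minimal sketch of such a check; `normalize_root` is a hypothetical helper for illustration and is not part of setup.py.

```python
import os


def normalize_root(path: str) -> str:
    """Expand '~', trim whitespace, and end the path with exactly one '/'."""
    expanded = os.path.expanduser(path.strip())
    # rstrip removes any existing trailing slashes so only one is appended.
    return expanded.rstrip("/") + "/"


if __name__ == "__main__":
    # Hypothetical location of the cloned repo.
    print(normalize_root("/home/user/rl_continuous_control"))
```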

To review the environment implementation in more detail, visit the project repo here.

The Model

The key files in this repo include:

Scripts

main.py Run this script to train the agent(s) in the environment(s) selected in the script's agent and environment dictionaries.

util.py Contains functions to train in Unity and OpenAI environments, and to chart results.

agents folder Contains the agent classes that implement the specified policies. See the accompanying Report for additional details on agent implementations.
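To give a feel for how environment and agent dictionaries can drive a training run, here is a minimal sketch. The dictionary names, keys, and the "Pendulum-v0" entry are assumptions made for illustration; the actual structure in main.py may differ.

```python
# Hypothetical layout: the real dictionaries in main.py may use other keys.
ENVIRONMENTS = {
    "Reacher": {"platform": "unity", "enabled": True},
    "Pendulum-v0": {"platform": "gym", "enabled": False},
}

# Agent (policy) names mapped to an on/off switch.
AGENTS = {"DDPG": True, "D4PG": False, "PPO": False, "TD3": False}


def selected_runs(environments, agents):
    """Return every (environment, agent) pair that is switched on."""
    return [
        (env_name, agent_name)
        for env_name, cfg in environments.items()
        if cfg["enabled"]
        for agent_name, enabled in agents.items()
        if enabled
    ]


if __name__ == "__main__":
    for env_name, agent_name in selected_runs(ENVIRONMENTS, AGENTS):
        print(f"Would train {agent_name} in {env_name}")
```

Keeping the selection in dictionaries means a run can be changed by flipping a flag rather than editing the training loop.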

To train the agent, first open main.py in your favorite text editor (e.g. nano main.py or vi main.py). Make sure the path to the root repo folder is correct and that the proper environments and agents (policies) are selected. Then, in the command line, run:

source activate drlnd # to activate python (conda) environment
python main.py # to train the environment and agent (policy)

Notebooks

rl2_results.ipynb

Charts the results from the model results file.
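When charting per-episode scores, a moving average is a common way to smooth the curve. The sketch below computes one in plain Python; the function name, the window size, and the assumption that scores arrive as a simple list are illustrative and do not describe the notebook's actual code or the results file format.

```python
def moving_average(scores, window=100):
    """Mean of the most recent `window` scores at each episode.

    Early episodes average over however many scores exist so far.
    """
    averages = []
    running = 0.0
    for i, score in enumerate(scores):
        running += score
        if i >= window:
            # Drop the score that just fell out of the window.
            running -= scores[i - window]
        averages.append(running / min(i + 1, window))
    return averages


if __name__ == "__main__":
    # Made-up scores, only to show the call.
    print(moving_average([1, 2, 3, 4], window=2))
```

The returned list has one value per episode, so it can be passed directly to a plotting library alongside the raw scores.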

Results

Contains the "checkpoint" model weights of each implementation.

Resources

The algorithms used in this project were inspired by a variety of sources and authors, including implementations from the following GitHub handles:
