Abubakar Aliyu BADAWI
University of Toulon
This project explores advanced techniques in deep reinforcement learning, including Imitation Learning, Deep Q-Networks (DQN), and Proximal Policy Optimization (PPO). The objective is to implement these techniques, test different architectures, tweak hyperparameters, and evaluate their performance on various tasks and environments.
Exploring machine learning models that mimic expert behavior to perform tasks in a simulated driving environment using CNN and MLP networks.
- Architectures: CNN and MLP policy networks.
- Hyperparameter Tuning: Effects of batch sizes and training epochs.
- DAgger: Enhancing model training using Dataset Aggregation.
- Performance Evaluation: Comparing models with and without DAgger.
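The DAgger loop can be sketched without any RL framework. The toy 1-D task, expert, and one-parameter linear policy below are illustrative stand-ins only, not the project's driving environment or its CNN/MLP policies:

```python
import numpy as np

def expert_action(s):
    # Hypothetical expert: always steer back toward the origin.
    return -np.sign(s)

def rollout(w, rng, steps=20):
    # Collect the states the *learner's* policy actually visits.
    s, states = float(rng.normal()), []
    for _ in range(steps):
        states.append(s)
        a = float(np.sign(w * s + 1e-9))      # learner acts, possibly badly
        s = s + 0.5 * a + 0.1 * float(rng.normal())
    return np.array(states)

def fit(states, actions):
    # Least-squares fit of a one-parameter linear policy a ~ w * s.
    return float(states @ actions / (states @ states + 1e-9))

def dagger(iterations=5, seed=0):
    rng = np.random.default_rng(seed)
    data_s, data_a = np.array([]), np.array([])
    w = 1.0                                   # deliberately bad initial policy
    for _ in range(iterations):
        s = rollout(w, rng)                   # 1. run the current learner
        a = expert_action(s)                  # 2. expert labels the visited states
        data_s = np.concatenate([data_s, s])  # 3. aggregate into one dataset
        data_a = np.concatenate([data_a, a])
        w = fit(data_s, data_a)               # 4. retrain on everything so far
    return w
```

Because the expert labels exactly the states the learner visits, `dagger()` should return a negative weight, i.e. the learner recovers the expert's corrective behavior; plain behavioral cloning would only ever see the expert's own state distribution.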
Utilizing DQN to solve control tasks in two settings: the MiniGrid and Pong environments.
- Architectures: Testing MLP and CNN in MiniGrid.
- Hyperparameter Tuning: Modifying epochs and learning rates to observe changes in performance.
- Pong Environment: Challenges posed by GPU limitations and the resulting adjustments to the number of training episodes.
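As a rough, dependency-light illustration of the core DQN ingredients (experience replay, a periodically synced target network, and epsilon-greedy exploration), here is a sketch on a toy corridor MDP; a tabular Q function stands in for the project's MLP/CNN, and all names and numbers are illustrative:

```python
import random
import numpy as np

N_STATES = 6        # toy corridor MDP; reward only at the right end
ACTIONS = (-1, +1)  # move left / move right

def step(s, a_idx):
    s2 = max(0, min(N_STATES - 1, s + ACTIONS[a_idx]))
    done = (s2 == N_STATES - 1)
    return s2, (1.0 if done else 0.0), done

def train(episodes=300, gamma=0.9, lr=0.5, eps=0.2, batch=16, sync=25):
    random.seed(0)
    q = np.zeros((N_STATES, 2))  # online Q function (tabular stand-in for a network)
    target = q.copy()            # periodically synced target network
    buffer, t = [], 0
    for _ in range(episodes):
        s, done = random.randrange(N_STATES - 1), False  # random starts, toy-only
        while not done:
            if random.random() < eps or q[s, 0] == q[s, 1]:
                a = random.randrange(2)        # explore, or break ties randomly
            else:
                a = int(q[s].argmax())         # exploit
            s2, r, done = step(s, a)
            buffer.append((s, a, r, s2, done))  # experience replay buffer
            for bs, ba, br, bs2, bd in random.sample(buffer, min(batch, len(buffer))):
                y = br + (0.0 if bd else gamma * target[bs2].max())  # TD target
                q[bs, ba] += lr * (y - q[bs, ba])
            t += 1
            if t % sync == 0:
                target = q.copy()              # sync the target network
            s = s2
    return q
```

After training, the greedy policy `q[s].argmax()` should choose "right" from every non-terminal state; swapping the table for a small network and the corridor for MiniGrid or Pong gives the full algorithm.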
Focusing on the PPO algorithm to refine the policy gradient approach, aiming to improve training stability and efficiency in a simulated BipedalWalker-v3 environment.
- Hyperparameter Tuning: Varying the number of episodes and weight decay parameters.
- Performance Metrics: Observing the impact of changes on rewards, episode lengths, and entropy.
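The clipped surrogate objective at the heart of PPO can be demonstrated on a toy two-action problem; the bandit task, the hyperparameters, and the omission of an entropy bonus are simplifications for the sketch, not the BipedalWalker-v3 setup used in the project:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ppo_update(theta, actions, advantages, old_probs, eps=0.2, lr=0.1, epochs=10):
    # Gradient ascent on the clipped surrogate
    #   L = E[min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)],  ratio = pi / pi_old.
    for _ in range(epochs):
        probs = softmax(theta)
        grad = np.zeros_like(theta)
        for a, adv, old_p in zip(actions, advantages, old_probs):
            ratio = probs[a] / old_p
            clipped = ratio < 1 - eps if adv < 0 else ratio > 1 + eps
            if clipped:
                continue                 # zero gradient where the ratio is clipped
            glogp = -probs               # grad of log pi(a) for a softmax policy...
            glogp[a] += 1.0              # ...is onehot(a) - probs
            grad += adv * ratio * glogp  # grad of ratio * A via the chain rule
        theta = theta + lr * grad / len(actions)
    return theta

rng = np.random.default_rng(0)
theta = np.zeros(2)                            # logits of a two-action policy
for _ in range(20):                            # outer PPO iterations
    probs = softmax(theta)
    actions = rng.choice(2, size=64, p=probs)  # rollouts under the old policy
    rewards = (actions == 0).astype(float)     # toy task: action 0 always pays 1
    advantages = rewards - rewards.mean()      # simple mean-reward baseline
    theta = ppo_update(theta, actions, advantages, probs[actions])
```

The clipping keeps each iteration's policy close to the one that collected the data, which is exactly the training-stability property the project's experiments probe by varying episode counts and weight decay.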
This repository contains the implementation and results of various deep reinforcement learning algorithms, including Imitation Learning, Deep Q-Networks (DQN), and Proximal Policy Optimization (PPO). The main focus of these experiments was to explore the effects of different architectures and hyperparameter settings on the performance of models in simulated environments.
- `report.pdf`: A comprehensive report detailing the methodology, experiments, and findings.
- `code/`: Directory containing the source code used for all experiments.
- `Images/`: Contains all the plots generated during the experiments.
The plots are stored in the `Images/` folder and are referenced in the report; they can also be viewed directly on GitHub.
- Ensure you have Python 3.x installed.
- Install the dependencies listed in `requirements.txt`.
- Run the scripts in the `code/` directory to reproduce the experiments.
Feel free to fork this repository and submit pull requests to contribute to this project. You can also open an issue if you find any bugs or have suggestions for additional experiments.
This project is open-sourced under the MIT license. See the LICENSE file for more details.
The project highlights the significance of architecture selection and hyperparameter tuning in deep reinforcement learning. Each technique and modification provided valuable lessons on the models' behavior and performance in complex environments.
Special thanks to Prof. J. Arjona-Medina and the University of Toulon for guidance and resources throughout this research.