Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

Changed

Fixed

v0.2.0

Added

Introduce cherry.nn.Policy, cherry.nn.ActionValue, and cherry.nn.StateValue.
Algorithm class utilities for: A2C, PPO, TRPO, DDPG, TD3, SAC, and DrQ/DrQv2.
DMC examples for SAC, DrQ, and DrQv2.
N-steps returns sampling in ExperienceReplay.

Changed

Discontinue most of cherry.wrappers.

Fixed

Fixes return value of StateNormalizer and RewardNormalizer wrappers.
Requirements to generate docs.

v0.1.4

Fixed

Support for torch 1.5 and new _parse_to behavior in ExperienceReplay. (thanks @ManifoldFR)

v0.1.3

Added

A CHANGELOG.md file.

Changed

Travis testing with different versions of Python (3.6, 3.7), torch (1.1, 1.2, 1.3, 1.4), and torchvision (0.3, 0.4, 0.5).

Fixed

fix bug in torch_wrapper when use GPU by callling Tensor.cpu().detach().numpy() to convert CUDA tensor to numpy.(@walkacross)
Bugfix when using td.discount with replays coming from vectorized environments (@galatolofederico)
env.action_size and env.state_size when the number of vectorized environments is 1. (thanks @galatolofederico)
Actor-critic integration test being to finicky.
cherry.onehot support for numpy's float and integer types. (thanks @ngoby)