[P] Collection of Clean and Minimal Policy Gradient Algorithm Implementations (REINFORCE, NPG, TRPO, PPO on Unity3D & MuJoCo Environments)

This repository contains PyTorch (v0.4.0) implementations of typical policy gradient (PG) algorithms.
URL: https://github.com/reinforcement-learning-kr/pg_travel

  • Vanilla Policy Gradient
  • Truncated Natural Policy Gradient
  • Trust Region Policy Optimization
  • Proximal Policy Optimization
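
For readers new to these methods, here is a rough, dependency-free sketch of two of the objectives named in the list: the vanilla policy gradient (REINFORCE) surrogate and the PPO clipped surrogate. It is not taken from the repository. The function names, the `gamma` and `clip_eps` defaults, and the use of plain Python lists instead of PyTorch tensors are all assumptions made for illustration.

```python
import math

# Illustrative only: hypothetical helper names, not the pg_travel API.

def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def reinforce_loss(log_probs, returns):
    """Vanilla PG surrogate: minimizing it ascends mean(log pi(a|s) * G_t)."""
    return -sum(lp * g for lp, g in zip(log_probs, returns)) / len(returns)

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate, negated so that it can be minimized."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)

if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
    print(reinforce_loss([-1.0, -2.0], [2.0, 1.0]))
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

In a real PyTorch version the log-probabilities would be tensors produced by the policy network, so the loss could be backpropagated. NPG and TRPO additionally need a Fisher-vector product and a conjugate gradient solve, which this sketch leaves out.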

https://i.redd.it/ayne4k5lwem11.gif

submitted by /u/hr_yang