PyTorch implementations of reinforcement learning algorithms: policy gradient methods (vanilla policy gradient, actor-critic, PPO) and generative adversarial imitation learning (GAIL).
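
For orientation, here is a minimal sketch of the quantity the vanilla policy gradient method is built on: the discounted reward-to-go and the surrogate loss derived from it. It is not taken from this repository. The function names `discounted_returns` and `reinforce_loss` are hypothetical, and plain Python floats stand in for what would be tensors in a PyTorch implementation.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Reward-to-go G_t = r_t + gamma * G_{t+1}, computed from the last step backwards.

    Hypothetical helper for illustration; not part of this repository.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs: List[float], returns: List[float]) -> float:
    """Surrogate loss -mean(log pi(a_t | s_t) * G_t).

    Minimizing this with respect to the policy parameters follows the
    vanilla policy gradient. Floats are used here as an assumption; a
    PyTorch version would use tensors so that autograd can differentiate it.
    """
    if len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must have the same length")
    if not log_probs:
        raise ValueError("episode must contain at least one step")
    total = sum(lp * g for lp, g in zip(log_probs, returns))
    return -total / len(log_probs)


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    print(discounted_returns(episode_rewards, gamma=0.5))
```

Actor-critic and PPO typically replace the raw return `G_t` with an advantage estimate, but the backward accumulation shown here is a common starting point for all three.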