Divergence-augmented policy optimization

Publication
Advances in Neural Information Processing Systems