Further readings:
- Actually proving the policy gradient for discounted rewards - article
- On variance of policy gradient and optimal baselines: article, another article
Based on Practical_RL week07