Fix Numerical Errors; Improve PER

@kengz released this 27 Apr 02:47
· 9 commits to master since this release

Improvements/Bug Fixes

Misc

PR: #131

  • fix np.exp overflow error in SoftmaxPolicy and BoltzmannPolicy by casting to float64 instead of float32
  • improve overall np.isfinite asserts
  • remove index after reset in *analysis.csv
  • remove unused specs
  • reorganize and expand test specs
  • guard continuous action value range in continuous policies
  • fix analytics param variable sourcing
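
To illustrate the overflow fix above, here is a minimal sketch, not the project's actual policy code. The function name softmax_probs, the tau parameter, and the sample Q values are assumptions made for this example. It shows np.exp overflowing to inf in float32 for a moderately large input, while the same computation in float64 stays finite and passes an np.isfinite assert.

```python
import numpy as np


def softmax_probs(q_values, tau=1.0):
    """Hypothetical helper: softmax/Boltzmann action probabilities in float64."""
    # Cast to float64 before exponentiating; float32 overflows near exp(88.7)
    q = np.asarray(q_values, dtype=np.float64) / tau
    exp_q = np.exp(q)
    # Fail loudly instead of silently propagating inf/nan into the policy
    assert np.isfinite(exp_q).all(), 'np.exp overflowed'
    return exp_q / exp_q.sum()


q = [100.0, 99.0, 98.0]  # illustrative Q values, not from the release

# The same exponent in float32 overflows to inf
with np.errstate(over='ignore'):
    overflow32 = np.exp(np.float32(q[0]))

probs = softmax_probs(q)
print(overflow32, probs)
```

Note that float64 only raises the ceiling (np.exp overflows above roughly 709), which is presumably why the release pairs the cast with broader np.isfinite asserts.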

DDPG

PR: #131

  • add EpsilonGreedyNoisePolicy

PER

PR: #131

  • add memory.update(errors) throughout all agents
  • add shape assert for Q values and errors throughout
  • default max_mem_len to max_timestep * max_epis / 3 if not specified
  • add the missing abs for the initial reward
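
The PER changes above can be pictured with the following rough sketch. It is not the library's memory class: the class name PrioritizedMemorySketch, the EPS constant, the method names add and sample, and the exact priority formula are all assumptions for illustration. Only update(errors), the shape assert, the max_mem_len default, and the abs on the initial reward come from the notes.

```python
import numpy as np

EPS = 1e-6  # assumed small constant so no priority is exactly zero


class PrioritizedMemorySketch:
    """Hypothetical, simplified prioritized replay bookkeeping."""

    def __init__(self, max_timestep, max_epis, max_mem_len=None):
        # Default memory length when the spec does not provide one
        if max_mem_len is None:
            max_mem_len = int(max_timestep * max_epis / 3)
        self.max_mem_len = max_mem_len
        self.priorities = []
        self.last_idxs = np.array([], dtype=int)
        self.rng = np.random.default_rng(0)

    def add(self, reward):
        # abs keeps the initial priority positive even for negative rewards
        self.priorities.append(abs(reward) + EPS)
        if len(self.priorities) > self.max_mem_len:
            self.priorities.pop(0)

    def sample(self, batch_size):
        # Sample indices in proportion to priority and remember them
        p = np.array(self.priorities, dtype=np.float64)
        self.last_idxs = self.rng.choice(
            len(p), size=batch_size, replace=False, p=p / p.sum())
        return self.last_idxs

    def update(self, errors):
        # Errors must line up one-to-one with the last sampled batch
        errors = np.asarray(errors, dtype=np.float64)
        assert errors.shape == self.last_idxs.shape
        for idx, err in zip(self.last_idxs, errors):
            self.priorities[idx] = abs(err) + EPS


mem = PrioritizedMemorySketch(max_timestep=300, max_epis=10)
for r in (-2.0, 1.0, 0.5):
    mem.add(r)
idxs = mem.sample(2)
mem.update([0.5, -3.0])  # e.g. TD errors computed by the agent
```

In this sketch the agent would call update(errors) right after each training step, so the errors always refer to the most recently sampled batch.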