
add ClipValue and ClipNorm #1133

Merged
merged 13 commits into from May 15, 2020

Conversation

AStupidBear
Copy link
Contributor

No description provided.
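Since the PR itself carries no description, here is a minimal sketch of what the two optimisers might look like, assuming Flux's in-place `apply!(opt, x, Δ)` optimiser interface; the details are illustrative, not a copy of the merged code:

```julia
using LinearAlgebra  # norm, rmul!

# ClipValue: clamp each gradient component to [-thresh, thresh].
mutable struct ClipValue{T}
    thresh::T
end
apply!(o::ClipValue, x, Δ) = clamp!(Δ, -o.thresh, o.thresh)

# ClipNorm: rescale the whole gradient in place if its 2-norm exceeds thresh.
mutable struct ClipNorm{T}
    thresh::T
end
function apply!(o::ClipNorm, x, Δ)
    n = norm(Δ, 2)
    n > o.thresh && rmul!(Δ, o.thresh / n)
    return Δ
end
```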

@bhvieira
Copy link
Contributor

Keras has these; it would be good to have them in Flux as well.

@AStupidBear
Copy link
Contributor Author

@bhvieira The docstrings are simplified now.

@DhairyaLGandhi
Copy link
Member

Please add them to the tests as well

@mcabbott
Copy link
Member

mcabbott commented Apr 21, 2020

Should both of these be called clipping, given that one clamps (only components outside range) and the other scales (all components)? Also they act on the gradient not the value. Maybe ClampGrad and, I don't know, LimitGradNorm, BoundGrad?

Also: is the norm of the whole array always what you want? If this is some batch of observations, then it may make more sense to scale each of them by its own norm. In which case this would need to take dims somehow.
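(To make the distinction concrete, a small illustration with made-up numbers of how clamping and norm-based scaling treat the same gradient differently:

```julia
using LinearAlgebra

Δ = [3.0, 0.1]

# Clamping touches only the components outside the range:
clamp.(Δ, -1.0, 1.0)    # [1.0, 0.1] — the small entry is untouched

# Norm-based scaling shrinks every component by the same factor:
Δ .* (1.0 / norm(Δ))    # ≈ [0.999, 0.033]
```
)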

@HenriDeh
Copy link
Contributor

HenriDeh commented Apr 21, 2020

norm uses scalar getindex, which throws an error when working with CuArrays, e.g.:

norm(cu(rand(10,10)), 2)

The issue comes from CuArrays. However, norm(cu(rand(10,10))) is implemented.
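(A GPU-friendly workaround, offered here as an assumption rather than something from this thread, is to compute the 2-norm as a reduction, which CuArrays handles without scalar indexing:

```julia
# sum(abs2, x) lowers to a single reduction kernel on a CuArray,
# whereas the generic norm(x, 2) method may fall back to scalar getindex.
gpu_norm2(x) = sqrt(sum(abs2, x))  # hypothetical helper name
```
)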

@AStupidBear
Copy link
Contributor Author

> Should both of these be called clipping, given that one clamps (only components outside range) and the other scales (all components)? Also they act on the gradient not the value. Maybe ClampGrad and, I don't know, LimitGradNorm, BoundGrad?
>
> Also: is the norm of the whole array always what you want? If this is some batch of observations, then it may make more sense to scale each of them by its own norm. In which case this would need to take dims somehow.


I think it's better to just follow Keras's convention. Gradients are not batched, so...

@AStupidBear
Copy link
Contributor Author

> norm uses scalar getindex, which throws an error when working with CuArrays, e.g.:
>
> norm(cu(rand(10,10)), 2)
>
> The issue comes from CuArrays. However, norm(cu(rand(10,10))) is implemented.

@HenriDeh Thanks for pointing that out.

@mcabbott
Copy link
Member

> Gradients are not batched, so...

Oh sorry, this only applies to parameter updates, not to gradients in general. Which is obvious in the code, apply!, but not in the present docstrings.
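(For reference, a hedged sketch of how a clipping rule would sit in front of another optimiser so that it acts on parameter updates; the `Optimiser` composition pattern exists in Flux, while `ClipValue` is the name this PR proposes and the hyperparameters are made up:

```julia
using Flux
using Flux.Optimise: Optimiser, ClipValue, ADAM

# Clamp each update component to [-1e-3, 1e-3] before ADAM consumes it.
opt = Optimiser(ClipValue(1e-3), ADAM(1e-4))
```
)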

@AStupidBear
Copy link
Contributor Author

Since this PR is just a simple feature add-on, how about merging this?

@DhairyaLGandhi
Copy link
Member

bors try

bors bot added a commit that referenced this pull request Apr 25, 2020
@bors
Copy link
Contributor

bors bot commented Apr 25, 2020

try

Build succeeded:

@CarloLucibello
Copy link
Member

We need usage examples in the docstrings and we should add the two to the docs/

@CarloLucibello
Copy link
Member

this seems good to go, thanks

bors r+

@bors
Copy link
Contributor

bors bot commented May 15, 2020

Build succeeded:

@bors bors bot merged commit b6a5dd7 into FluxML:master May 15, 2020
@darsnack darsnack mentioned this pull request Feb 12, 2021