
net.cpp now allows zero-sized batches #2053

Closed

Conversation

@mtamburrano (Contributor)

This is a new PR based on the old #1484, which was no longer mergeable.

Old description:

Implemented a way to allow zero-sized batches, as discussed in #1448 with @sguada and @longjon (point 5).
Before forwarding or backwarding anything, the net now checks whether any blob has num == 0; if so, forward and backward are skipped, and all subsequent layers are reshaped so that forward and backward are skipped for them as well.
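For context, here is a minimal sketch of the mechanism described above, using the ForwardIsAllowed name discussed later in this thread; the member names and the exact reshape policy are illustrative, not the PR's actual code:

```cpp
// Illustrative sketch only -- not the PR's actual implementation.
// Before running a layer, check its bottom blobs for an empty batch;
// if one is found, shrink the tops to num == 0 so that every
// subsequent layer sees empty inputs and is skipped as well.
template <typename Dtype>
bool Net<Dtype>::ForwardIsAllowed(int layer_id) {
  const vector<Blob<Dtype>*>& bottoms = bottom_vecs_[layer_id];
  for (int i = 0; i < bottoms.size(); ++i) {
    if (bottoms[i]->num() == 0) {  // zero-sized batch on axis 0
      const vector<Blob<Dtype>*>& tops = top_vecs_[layer_id];
      for (int j = 0; j < tops.size(); ++j) {
        vector<int> shape = tops[j]->shape();
        shape[0] = 0;  // propagate the empty batch downstream
        tops[j]->Reshape(shape);
      }
      return false;  // skip Forward (and later Backward) here
    }
  }
  return true;
}
```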

@bhack (Contributor) commented Mar 13, 2015

@jeffdonahue I see that somebody is tagging PRs as "ready for review". Can you tag this one as well?

@cdoersch (Contributor)

I also have a use case for this PR, so I vote for making Caffe aware of this.

However, force-reshaping all the top blobs doesn't quite seem right to me; it seems like this would be better handled when Reshape is called on each layer, right before the forward pass. That way, each layer can decide for itself whether it wants all of its top blobs to have zero batch size. Forward/backward would then be skipped only if all bottom and top blobs have zero batch size.

Another potential improvement to this PR would be to make the solvers aware of backwardIsAllowed() so that they don't update the associated parameters. Right now, it seems like momentum and weight decay would still be applied.
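A minimal sketch of that per-layer check, under the assumption of a hypothetical helper (LayerIsSkippable is not an actual Caffe symbol): after each layer's own Reshape has decided the top shapes, forward/backward is skipped only when every bottom and top blob is empty.

```cpp
// Hypothetical helper -- not actual Caffe code. A layer is skippable
// only when all of its bottom and top blobs carry zero entries.
template <typename Dtype>
bool LayerIsSkippable(const vector<Blob<Dtype>*>& bottom,
                      const vector<Blob<Dtype>*>& top) {
  for (int i = 0; i < bottom.size(); ++i) {
    if (bottom[i]->count() > 0) { return false; }
  }
  for (int i = 0; i < top.size(); ++i) {
    if (top[i]->count() > 0) { return false; }
  }
  return true;  // every bottom and top blob is empty
}
```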

@mtamburrano (Contributor, Author)

Rebased on master.

@bhack (Contributor) commented Jun 3, 2015

@jeffdonahue This is the last one of the triplet for filter_layer. Can you take a look at this one?

@bhack (Contributor) commented Jan 15, 2016

Ping

@jeffdonahue (Contributor)

Agreed with @cdoersch -- this is a bit too aggressive in its assumptions about what each layer might want to do in the event of size-0 batches (for example, the output shouldn't necessarily have the same shape as the input, as assumed here, and often doesn't), and should probably operate at the level of individual layers rather than the net. Furthermore, the net itself doesn't and shouldn't have any global notion of batch size, and the assumption that "batch size" is the 0th axis of each blob in the net isn't valid (at least, not anymore). On a more mundane note, code in net.cpp should certainly not use the legacy dimension calls (num/channels/height/width), and probably should never assume any particular shape layout.

@seanbell

A more general solution would be to skip blobs with 0 entries, since any blob with 0 along its first axis has 0 total entries. One open question is how, if at all, to resize the blobs that come after the 0-sized blob.
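In code, that test is simply Blob::count() == 0: count() is the product of all axis extents, so it covers a zero-length first axis without assuming any particular shape layout. A hedged illustration, with a hypothetical helper name:

```cpp
// A blob shaped (0, 3, 224, 224) has count() == 0 * 3 * 224 * 224 == 0,
// so checking count() subsumes the num() == 0 check without assuming
// that axis 0 is the batch axis.
template <typename Dtype>
bool BlobIsEmpty(const Blob<Dtype>& blob) {  // hypothetical helper
  return blob.count() == 0;
}
```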

Another thought: ForwardIsAllowed() sounds like a getter. However, calling this function changes state -- it destructively resizes the output blobs. It might be worth renaming this function, or splitting it into two functions, to make the consequences of calling it more obvious.
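One possible split along those lines, sketched with hypothetical names (neither function exists in Caffe): a const query with no side effects, plus an explicitly named mutator that performs the destructive reshape.

```cpp
// Hypothetical API split -- illustrative only, not actual Caffe code.
// The query is const and has no side effects...
template <typename Dtype>
bool Net<Dtype>::ForwardIsAllowed(int layer_id) const {
  const vector<Blob<Dtype>*>& bottoms = bottom_vecs_[layer_id];
  for (int i = 0; i < bottoms.size(); ++i) {
    if (bottoms[i]->count() == 0) { return false; }
  }
  return true;
}

// ...while the destructive resize is a separate, clearly named call.
template <typename Dtype>
void Net<Dtype>::ZeroOutTopBlobs(int layer_id) {
  const vector<Blob<Dtype>*>& tops = top_vecs_[layer_id];
  for (int i = 0; i < tops.size(); ++i) {
    vector<int> shape = tops[i]->shape();
    if (!shape.empty()) { shape[0] = 0; }
    tops[i]->Reshape(shape);
  }
}
```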

@hyojinie commented Feb 28, 2017

Could anyone tell me how empty top/bottom blobs are handled right now? Is forward/backward prevented for 0-size batches? All I know is that when I feed empty bottoms (produced by a FilterLayer) into a loss layer (SoftmaxWithLoss), Caffe crashes with a cuDNN bad param error.

@cypof (Member) commented Apr 14, 2017

@hyojinie It's unlikely this will be resumed.

@cypof closed this Apr 14, 2017