Fix scheduler race #11150
Good catch - LGTM.
GCE e2e build/test passed for commit 0728c08.
Thanks @bprashanth, great catch. I missed this, as nothing else was locked in this file :/
Does this need to be cherry-picked to release-1.0? @zmerlynn FYI
Waiting for OK tag before merge
It's a (very) probable cause of the max-pods e2e test failure. Without this fix, the scheduler can overallocate Nodes (it happens in the test). Locking of the modeller is distributed, which makes changes around LockedAction slightly tricky though...
Shippable looks like a unit test flake (https://app.shippable.com/builds/55a3cd28a4a0be0e007c1f22); test-go passes locally, so I kicked it. Risk: low-medium, since it just adds a lock around a non-blocking operation. Overcommitted nodes are notoriously hard to debug and the kubelet won't reject these pods either, so I think we should consider either this or #11161. @davidopp to make the call about cherry-pick
LGTM. Thanks for finding and fixing! I'll add this to my list of things to cherry-pick.
Good find!
…50-upstream-release-1.0 Automated cherry pick of #11150 upstream release 1.0
@davidopp Did this get merged to 1.0?
The scheduler counts pods by combining the pods in the scheduled and assumed pod stores. The listing needs to be synchronized with the informer.
/ref #10720
@davidopp @gmarek @lavalamp
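For readers following along, here is a minimal Go sketch of the kind of race described above. It is not the actual scheduler code: the `modeler` type, its two maps, and the `lockedAction`, `assume`, `observe`, and `countPods` helpers are hypothetical simplifications. The assumption is that the informer moves a pod from the assumed store to the scheduled store, and that a listing which reads the two stores without holding the same lock could miss a pod mid-move and undercount.

```go
package main

import (
	"fmt"
	"sync"
)

// modeler is a hypothetical, simplified stand-in for the scheduler's
// bookkeeping. It is not the real Kubernetes type.
type modeler struct {
	mu        sync.Mutex
	scheduled map[string]struct{} // pods the informer has observed as bound
	assumed   map[string]struct{} // pods bound by the scheduler, not yet observed
}

func newModeler() *modeler {
	return &modeler{
		scheduled: map[string]struct{}{},
		assumed:   map[string]struct{}{},
	}
}

// lockedAction runs f while holding the lock shared by readers and writers.
func (m *modeler) lockedAction(f func()) {
	m.mu.Lock()
	defer m.mu.Unlock()
	f()
}

// assume records a pod the scheduler has just placed on a node.
func (m *modeler) assume(name string) {
	m.lockedAction(func() { m.assumed[name] = struct{}{} })
}

// observe simulates the informer: the pod moves from assumed to scheduled.
// Both steps happen under the lock, so no reader sees the pod in neither store.
func (m *modeler) observe(name string) {
	m.lockedAction(func() {
		delete(m.assumed, name)
		m.scheduled[name] = struct{}{}
	})
}

// countPods combines both stores under the same lock the informer path uses.
func (m *modeler) countPods() int {
	n := 0
	m.lockedAction(func() {
		seen := make(map[string]struct{}, len(m.scheduled)+len(m.assumed))
		for name := range m.scheduled {
			seen[name] = struct{}{}
		}
		for name := range m.assumed {
			seen[name] = struct{}{}
		}
		n = len(seen)
	})
	return n
}

func main() {
	const n = 100
	m := newModeler()
	for i := 0; i < n; i++ {
		m.assume(fmt.Sprintf("pod-%d", i))
	}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // simulated informer, running concurrently with the listing
		defer wg.Done()
		for i := 0; i < n; i++ {
			m.observe(fmt.Sprintf("pod-%d", i))
		}
	}()

	consistent := true
	for i := 0; i < 1000; i++ {
		if m.countPods() != n {
			consistent = false
		}
	}
	wg.Wait()
	fmt.Println("every count saw all pods:", consistent)
}
```

If `countPods` read the two maps without taking the lock, a pod could be deleted from one store before the listing reached it and added to the other after the listing had passed it, which is the sort of undercount that would let the scheduler overallocate a node.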