previous node taint is wiped when kubelet restarts #119645
Comments
/sig node
/cc
/triage accepted
@klueska @dashpole @liggitt @derekwaynecarr @pacoxu PTAL. Thanks!
I have encountered the same bug. Is there a plan to fix it?
I hope the community can fix this bug as soon as possible. |
I have also run into this before. Is there a plan to fix it?
I have also hit this problem, and I wonder whether there is any way to recover from it.
I feel this is a bug that should be fixed, although the impact doesn't seem particularly significant. During kubelet startup, a check should be performed when setting the condition, rather than clearing the taints first.
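A minimal sketch of that idea, assuming the check can be expressed as "keep the previously reported condition until the eviction manager has synchronized at least once". This is hypothetical illustration code, not the kubelet's actual types or functions (`evictionState`, `diskPressureCondition`, and so on are made up here):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// evictionState is a stand-in for the eviction manager's in-memory view.
// Field and type names are hypothetical, for illustration only.
type evictionState struct {
	synced       atomic.Bool // set after the first synchronize pass
	diskPressure atomic.Bool // result of the latest synchronize pass
}

func (e *evictionState) synchronize(underPressure bool) {
	e.diskPressure.Store(underPressure)
	e.synced.Store(true)
}

// diskPressureCondition decides what the node status updater should report.
// Until the eviction manager has synced once, it keeps whatever condition
// (and therefore taint) the node already carried before the restart.
func diskPressureCondition(e *evictionState, previouslyReported bool) bool {
	if !e.synced.Load() {
		return previouslyReported // don't wipe the existing taint yet
	}
	return e.diskPressure.Load()
}

func main() {
	e := &evictionState{}

	// Right after a restart: the eviction manager has not synchronized yet,
	// but the node still has the disk-pressure taint from before.
	fmt.Println("before first sync:", diskPressureCondition(e, true)) // true: keep taint

	// The first synchronize pass observes the real signal.
	e.synchronize(true)
	fmt.Println("after first sync:", diskPressureCondition(e, true)) // true: still under pressure
}
```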
Yes, I think this is a TOCTOU bug. Maybe I will rethink how to fix it; the previous PR cannot fix this issue.
@Chaunceyctx can you reproduce it with v1.32? I cannot reproduce it with v1.32 locally. I tried v1.27 and I can reproduce it. I am not sure why. Can you give it a try? |
ok, I will try to reproduce it with v1.32 |
What happened?
I have a k8s cluster (v1.27.2) containing one node. I set

evictionHard:
  nodefs.available: 90%

and wrote a large amount of data to the kubelet root dir (used 8 GB / total 10 GB) to trigger eviction. The node.kubernetes.io/disk-pressure taint was added to this node. But when the kubelet restarted, the previous disk-pressure taint was weirdly wiped, and a pending pod was scheduled onto the node as if there were no pressure. Then I checked the kubelet logs and saw the following order of events after the restart:

1. update node status: NodeHasNoDiskPressure
2. eviction manager starts to synchronize

Q: Why does the kubelet report NodeHasNoDiskPressure?
A: The eviction manager has not yet executed its synchronize method, so no disk-pressure signal has been observed when the node status is first set.
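A simplified sketch of that ordering (hypothetical code, not the kubelet's real implementation; `pressureState` and `nodeStatusUpdate` are made-up names): the first node-status update reads the eviction manager's still-empty in-memory state, so it reports NodeHasNoDiskPressure even though the disk is still full.

```go
package main

import "fmt"

// pressureState stands in for the eviction manager's in-memory record of
// observed eviction signals; it is empty right after a kubelet restart.
// All names here are hypothetical and for illustration only.
type pressureState struct {
	diskPressure bool
}

// nodeStatusUpdate mimics the status sync that runs shortly after startup:
// it only consults the in-memory state, not the node's existing taints.
func nodeStatusUpdate(s *pressureState) string {
	if s.diskPressure {
		return "NodeHasDiskPressure"
	}
	return "NodeHasNoDiskPressure" // reported even though the disk is still full
}

func main() {
	s := &pressureState{} // fresh, empty state right after the restart

	// 1. The node status is updated before the eviction manager has run,
	//    so the disk-pressure condition goes false and the taint is removed.
	fmt.Println("status update:", nodeStatusUpdate(s))

	// 2. Only now does the eviction manager synchronize and observe pressure,
	//    and the condition/taint comes back.
	s.diskPressure = true
	fmt.Println("after synchronize:", nodeStatusUpdate(s))
}
```

In the window between step 1 and step 2 the taint is gone, which is when the pending pod gets scheduled onto the node.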
What did you expect to happen?
The previous node taint is not wiped when the kubelet restarts.
How can we reproduce it (as minimally and precisely as possible)?
Restart the kubelet repeatedly after the disk-pressure eviction is triggered, and observe node.spec.taints (one way to watch it is sketched below).
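The following client-go sketch polls node.spec.taints while you restart the kubelet; it assumes client-go is available, and the kubeconfig path and node name are placeholders. Running `kubectl get node <node-name> -o jsonpath='{.spec.taints}'` in a loop works just as well.

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholder kubeconfig path: adjust for your cluster.
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	nodeName := "my-node" // placeholder: the node whose taints we are watching

	// Poll node.spec.taints every second; stop with Ctrl-C.
	for {
		node, err := clientset.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
		if err != nil {
			fmt.Println("get node:", err)
		} else {
			fmt.Printf("%s taints: %v\n", time.Now().Format(time.RFC3339), node.Spec.Taints)
		}
		time.Sleep(time.Second)
	}
}
```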
Anything else we need to know?
No response
Kubernetes version
1.27.2
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)