-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removing labels when node NotReady #933
Removing labels when node NotReady #933
Conversation
Hi @TomerNewman. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for kubernetes-sigs-kmm ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
/ok-to-test |
modifiedNode := node.DeepCopy() | ||
patchFrom := client.MergeFrom(modifiedNode) | ||
for label := range modifiedNode.GetLabels() { | ||
if ok, _, _ := utils.IsKernelModuleReadyNodeLabel(label); ok || | ||
utils.IsDeprecatedKernelModuleReadyNodeLabel(label) { | ||
delete(node.ObjectMeta.Labels, label) | ||
} | ||
} | ||
if err := r.client.Patch(ctx, &node, patchFrom); err != nil { | ||
return ctrl.Result{}, fmt.Errorf("could not patch node %s: %v", node.Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we use r.nodeAPI.UpdateLabels
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we use
r.nodeAPI.UpdateLabels
instead?
Good idea
e668a21
to
bb7d3dd
Compare
if !r.nodeAPI.IsNodeSchedulable(&node) { | ||
var labelsToRemove []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tomer, wdyt about moving this whole if section to a nodeAPI? This code does not require any data from nmc, it just looks at the node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
bb7d3dd
to
3098b4b
Compare
internal/node/node.go
Outdated
NodeBecomeReadyAfter(node *v1.Node, checkTime metav1.Time) bool | ||
RemoveAllKmodLabels(ctx context.Context, node *v1.Node) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemoveAllKmodLabels(ctx context.Context, node *v1.Node) error | |
RemoveKmodReadyLabels(ctx context.Context, node *v1.Node) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would suggest renaming it "RemoveNodeReadyLabels", since it is not related to a specific kmod, but to any kmod on the node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would suggest renaming it "RemoveNodeReadyLabels", since it is not related to a specific kmod, but to any kmod on the node
RemoveNodeReadyLabels
is not to general? I think for example a node can have a label for his Ready status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, then RemoveNodeKmodReadyLabels. @ybettan wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go with RemoveNodeReadyLabels
.
There is no "ready" label on the node by default:
$ kubectl get node/minikube -o yaml | yq '.metadata.labels'
beta.kubernetes.io/arch: amd64
beta.kubernetes.io/os: linux
kubernetes.io/arch: amd64
kubernetes.io/hostname: minikube
kubernetes.io/os: linux
minikube.k8s.io/commit: 5883c09216182566a63dff4c326a6fc9ed2982ff
minikube.k8s.io/name: minikube
minikube.k8s.io/primary: "true"
minikube.k8s.io/updated_at: 2024_11_25T11_06_54_0700
minikube.k8s.io/version: v1.33.1
node-role.kubernetes.io/control-plane: ""
node.kubernetes.io/exclude-from-external-load-balancers: ""
The "ready" on the node is a condition
and not a label:
$ kubectl get node/minikube -o yaml | yq '.status.conditions[] | select (.type == "Ready")'
lastHeartbeatTime: "2024-11-25T09:07:08Z"
lastTransitionTime: "2024-11-25T09:07:08Z"
message: kubelet is posting ready status
reason: KubeletReady
status: "True"
type: Ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I changed it to RemoveNodeReadyLabels
then.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #933 +/- ##
==========================================
- Coverage 79.09% 72.65% -6.45%
==========================================
Files 51 65 +14
Lines 5109 5756 +647
==========================================
+ Hits 4041 4182 +141
- Misses 882 1395 +513
+ Partials 186 179 -7 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
3098b4b
to
bb35bc3
Compare
Noticed, added another unit-test to catch the failing scenario. |
Until now nodes kept their kmod labels when became not ready. Today we are deleting them when they are not ready due to potential race condition.
bb35bc3
to
a2a4f0b
Compare
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: TomerNewman, ybettan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
Until now, nodes retained their kmod labels even when they became NotReady for any reason (e.g., a reboot).
Today, we are removing these labels when a node is NotReady to address a potential race condition.
Besides updating the unit tests, I also tested this manually with simple-kmod, and it works as expected (label was deleted when Node became NotReady).
Note
I am also checking whether to remove the label using
IsDeprecatedKernelModuleReadyNodeLabel
. If this is not relevant, please let me know, and I will update it accordingly./cc @yevgeny-shnaidman @ybettan