Reconcile ClusterCIDRs with Finalizers #37
Conversation
@sarveshr7 @ameukam could you ptal?
@@ -1090,12 +1091,10 @@ func (r *multiCIDRRangeAllocator) reconcileCreate(ctx context.Context, clusterCI
	defer r.lock.Unlock()

	logger := klog.FromContext(ctx)
	if needToAddFinalizer(clusterCIDR, clusterCIDRFinalizer) {
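For context, a minimal sketch of what a `needToAddFinalizer` helper typically checks; the actual helper in this repo may differ in detail:

```go
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// needToAddFinalizer reports whether the given finalizer still has to be
// added to obj. A minimal sketch, not this repo's actual implementation.
func needToAddFinalizer(obj metav1.Object, finalizer string) bool {
	// Objects that are being deleted should never gain new finalizers.
	if obj.GetDeletionTimestamp() != nil {
		return false
	}
	for _, f := range obj.GetFinalizers() {
		if f == finalizer {
			return false // already present
		}
	}
	return true
}
```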
The problem here is that we were making the assumption that the finalizer was the indicator for create or update.
Independently, before Create or Update we need to check the new object we are building against the old object (this check is missing), and then decide whether we want to create or update. Regarding PATCH vs PUT: since these objects are managed entirely by this controller, we can use PUT to move the object to the state we just built.
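A rough sketch of the level-based flow described here, assuming the in-tree networking/v1alpha1 types and typed client (this repo's generated client is analogous); the function name and structure are illustrative, not the controller's actual code:

```go
package main

import (
	"context"
	"reflect"

	networkingv1alpha1 "k8s.io/api/networking/v1alpha1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	networkingclient "k8s.io/client-go/kubernetes/typed/networking/v1alpha1"
)

// createOrUpdate compares the freshly built desired object against what is
// stored before deciding between Create and Update (PUT). PUT is acceptable
// here because this controller owns these objects entirely.
func createOrUpdate(ctx context.Context, client networkingclient.ClusterCIDRInterface, desired *networkingv1alpha1.ClusterCIDR) error {
	current, err := client.Get(ctx, desired.Name, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		_, err = client.Create(ctx, desired, metav1.CreateOptions{})
		return err
	}
	if err != nil {
		return err
	}
	if reflect.DeepEqual(current.Spec, desired.Spec) &&
		reflect.DeepEqual(current.Finalizers, desired.Finalizers) {
		return nil // already at the desired level, nothing to do
	}
	desired.ResourceVersion = current.ResourceVersion // required for a PUT
	_, err = client.Update(ctx, desired, metav1.UpdateOptions{})
	return err
}
```

Carrying the stored ResourceVersion into the PUT lets the API server reject the write with a conflict if something else modified the object in between, which the controller's retry loop can then pick up.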
My understanding is that since ClusterCIDR has all fields marked as immutable, the only update we perform is setting the finalizer.
We already check whether the finalizer exists. Previously, in this case we skipped both the reconciliation and adding the ClusterCIDR to the cidrMap, which caused the bug.
With the current fix the finalizer is always added (if needed). We create a ClusterCIDR only when it does not have a ResourceVersion, which seems to me a better indicator of whether the resource has already been processed and stored in etcd.
I understand that this is not the best approach; an alternative would be to add a tuple to the queue (namespaced name and operation), so that we would know the exact operation and could process the CIDR accordingly. I can provide a PoC with the latter approach.
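For illustration, a minimal sketch of the ResourceVersion heuristic described above, assuming the networking/v1alpha1 types from k8s.io/api; the helper name is hypothetical:

```go
package main

import (
	networkingv1alpha1 "k8s.io/api/networking/v1alpha1"
)

// needsCreate sketches the heuristic: an object that has never been
// persisted to etcd has an empty ResourceVersion, which is a more reliable
// "not yet created" signal than a finalizer, since users may set finalizers
// in their manifests themselves.
func needsCreate(clusterCIDR *networkingv1alpha1.ClusterCIDR) bool {
	return clusterCIDR.ResourceVersion == ""
}
```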
Apologies, I only skimmed through the code, so my comment may be inaccurate.
I think this is OK; I just wanted to say that reconciliation seems to depend on more things than the finalizer, hence we need to process the entire object.

> I understand that this is not the best approach; an alternative would be to add a tuple to the queue (namespaced name and operation), so that we would know the exact operation and could process the CIDR accordingly. I can provide a PoC with the latter approach.

That would make this an event-based instead of a level-based controller; we don't want to do that.
What I mean is that when you need to Update, you have both the old and the new object, so you can compare them and decide whether to update or not.
@aojea ptal.
I added a reflect.DeepEqual(old, new) check to the update func. With this change, reconciliation only happens when there was an actual change in the ClusterCIDR, which should never happen because all ClusterCIDR fields are immutable. The only valid case I see is a metadata change (for example, adding finalizers), which should not affect networking.
Other misc changes: log improvements and ignoring duplicates.
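A sketch of what that update-handler change might look like; the handler wiring, names, and API types (in-tree networking/v1alpha1) are illustrative, not this PR's exact code:

```go
package main

import (
	"reflect"

	networkingv1alpha1 "k8s.io/api/networking/v1alpha1"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// newClusterCIDRUpdateHandler skips enqueueing when nothing actually
// changed (e.g. on periodic informer resyncs, where old and new are equal).
func newClusterCIDRUpdateHandler(queue workqueue.RateLimitingInterface) cache.ResourceEventHandlerFuncs {
	return cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldCC := oldObj.(*networkingv1alpha1.ClusterCIDR)
			newCC := newObj.(*networkingv1alpha1.ClusterCIDR)
			if reflect.DeepEqual(oldCC, newCC) {
				return // no actual change, skip reconciliation
			}
			key, err := cache.MetaNamespaceKeyFunc(newCC)
			if err != nil {
				return
			}
			queue.Add(key)
		},
	}
}
```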
Changed the test to print test cases separately:
--- PASS: TestSyncClusterCIDRCreate (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv4_ClusterCIDR_with_no_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv4_ClusterCIDR_with_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv4_ClusterCIDR_with_overlapping_CIDRs (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv6_ClusterCIDR_with_no_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv6_ClusterCIDR_with_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_IPv6_ClusterCIDR_with_overlapping_CIDRs (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_Dualstack_ClusterCIDR_with_no_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_DualStack_ClusterCIDR_with_NodeSelector (0.00s)
--- PASS: TestSyncClusterCIDRCreate/valid_Dualstack_ClusterCIDR_with_overlapping_CIDRs (0.00s)
--- PASS: TestSyncClusterCIDRCreate/invalid_ClusterCIDR_with_both_IPv4_and_IPv6_CIDRs_nil (0.00s)
--- PASS: TestSyncClusterCIDRCreate/invalid_IPv4_ClusterCIDR (0.00s)
--- PASS: TestSyncClusterCIDRCreate/invalid_IPv6_ClusterCIDR (0.00s)
--- PASS: TestSyncClusterCIDRCreate/invalid_dualstack_ClusterCIDR (0.00s)
SGTM, just use apiequality.Semantic.DeepEqual:
https://github.com/search?q=repo%3Akubernetes%2Fkubernetes%20apiequality.Semantic.DeepEqual&type=code
It differentiates between nil and empty; see for example https://github.com/kubernetes/kubernetes/blob/8f8c94a04d00e59d286fe4387197bc62c6a4f374/staging/src/k8s.io/apimachinery/pkg/api/equality/semantic.go#L27-L28
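For anyone unfamiliar with it, a tiny example of the semantic comparison (assuming k8s.io/apimachinery is on the module path):

```go
package main

import (
	"fmt"
	"reflect"

	apiequality "k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Semantic equality compares resource quantities numerically,
	// ignoring formatting differences; plain reflect.DeepEqual does not.
	q1 := resource.MustParse("1000m")
	q2 := resource.MustParse("1")
	fmt.Println(apiequality.Semantic.DeepEqual(q1, q2)) // true
	fmt.Println(reflect.DeepEqual(q1, q2))              // false

	// Like reflect.DeepEqual, it still differentiates nil from empty.
	var nilSlice []string
	fmt.Println(apiequality.Semantic.DeepEqual(nilSlice, []string{})) // false
}
```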
nice, didn't know about it. Thanks!
@mneverov please rebase, and sorry for the late response; it totally slipped. Feel free to ping me on Slack if you need reviews. /lgtm
/lgtm Thanks
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, mneverov

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Approvers can indicate their approval by writing /approve in a comment.
Currently, when a finalizer is present, ClusterCIDR reconciliation (create/update) is skipped. Since users can create ClusterCIDRs with finalizers already set in the ClusterCIDR definition, this leads to the problem described in #17.
This patch makes the following changes:

- the finalizer is always added when missing, instead of being treated as the create/update indicator;
- a ClusterCIDR is created only when it has no ResourceVersion, i.e. when it has not yet been stored in etcd;
- the update handler skips reconciliation when the old and new objects are deeply equal (all ClusterCIDR fields are immutable, so only metadata can change);
- misc: log improvements and ignoring duplicate ClusterCIDRs in the cidrMap.

Performance concerns:
Moving the finalizer logic from
createClusterCIDR
requires changes to the syncClusterCIDR
signature, since if it does not return an error the cidrQueue forgets the event. I'll do it in a separate PR to keep this one small.

Fixes #17
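For context, a sketch of the queue semantics referenced here, using the standard client-go worker pattern: a nil return from syncClusterCIDR causes the key to be forgotten, so any step that must be retried (such as adding the finalizer) has to surface its error. Function and variable names are illustrative:

```go
package main

import (
	"context"

	"k8s.io/client-go/util/workqueue"
)

// processNextWorkItem pulls one key off the queue and decides, based on the
// sync function's error, whether to retry it or drop its retry history.
func processNextWorkItem(ctx context.Context, queue workqueue.RateLimitingInterface,
	syncClusterCIDR func(ctx context.Context, key string) error) bool {

	key, quit := queue.Get()
	if quit {
		return false
	}
	defer queue.Done(key)

	if err := syncClusterCIDR(ctx, key.(string)); err != nil {
		queue.AddRateLimited(key) // retry the key later with backoff
		return true
	}
	queue.Forget(key) // success: drop retry history for this key
	return true
}
```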