
rb created by ResourceDetector is overwritten by DependenciesDistributor #5929

Open
lianzhanbiao opened this issue Dec 9, 2024 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@lianzhanbiao

What happened:
When using Karmada, I discovered that in certain scenarios the ResourceBinding created by the ResourceDetector has its replicas modified when it is updated by the DependenciesDistributor. The specific process is as follows:

  1. First, create a PropagationPolicy with propagateDeps: true.
  2. Create a Deployment with replicas=0 and add a persistentVolumeClaim volume.
  3. Update the annotations of the Deployment.
  4. Update replicas=5.
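For reference, the reproduction setup might look like the following manifest sketch (all names such as `demo` and `demo-pvc` are placeholders I chose for illustration, not taken from the original report):

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: demo
spec:
  propagateDeps: true          # step 1: propagate dependencies (e.g. the PVC)
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo
  placement:
    clusterAffinity:
      clusterNames:
        - member1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 0                  # step 2: start with replicas=0
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:   # the dependency that propagateDeps picks up
            claimName: demo-pvc
```

Steps 3 and 4 then amount to patching the Deployment's annotations and scaling `spec.replicas` to 5.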

Upon checking the logs, I found that the spec.replicas of the ResourceBinding changed in the sequence 0->5->0. Ultimately, this value was inconsistent with the Deployment.
What you expected to happen:
I reviewed the relevant code and found that when the DependenciesDistributor updates the ResourceBinding, it does not set the replicas parameter for the ResourceBinding:

// dependencies_distributor.go->buildAttachedBinding
Spec: workv1alpha2.ResourceBindingSpec{
    Resource: workv1alpha2.ObjectReference{
        APIVersion:      object.GetAPIVersion(),
        Kind:            object.GetKind(),
        Namespace:       object.GetNamespace(),
        Name:            object.GetName(),
        ResourceVersion: object.GetResourceVersion(),
    },
    RequiredBy: result,
},
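If an updater rebuilds the binding spec from scratch like the snippet above, without carrying `Replicas` over, then writing the rebuilt spec back wholesale would reset the field to its zero value. A minimal, self-contained Go sketch of that failure mode and one possible fix (the struct types below are local stand-ins that mirror `workv1alpha2.ResourceBindingSpec`, and `mergePreservingReplicas` is a hypothetical illustration, not Karmada's actual code):

```go
package main

import "fmt"

// Local stand-ins for the Karmada API types; not the real workv1alpha2 package.
type ObjectReference struct {
	APIVersion, Kind, Namespace, Name string
}

type ResourceBindingSpec struct {
	Resource   ObjectReference
	Replicas   int32
	RequiredBy []string
}

// naiveUpdate replaces the stored spec with one rebuilt from scratch.
// Because the rebuilt spec never sets Replicas, the field resets to 0.
func naiveUpdate(stored *ResourceBindingSpec, rebuilt ResourceBindingSpec) {
	*stored = rebuilt
}

// mergePreservingReplicas is a hypothetical fix: take the rebuilt spec,
// but carry the previously stored Replicas value forward.
func mergePreservingReplicas(stored *ResourceBindingSpec, rebuilt ResourceBindingSpec) {
	rebuilt.Replicas = stored.Replicas
	*stored = rebuilt
}

func main() {
	stored := ResourceBindingSpec{
		Resource: ObjectReference{APIVersion: "apps/v1", Kind: "Deployment", Name: "demo"},
		Replicas: 5,
	}
	rebuilt := ResourceBindingSpec{
		Resource:   stored.Resource,
		RequiredBy: []string{"attached-binding"},
	}

	naive := stored
	naiveUpdate(&naive, rebuilt)
	fmt.Println("after naive update:", naive.Replicas) // 0 — replicas lost

	merged := stored
	mergePreservingReplicas(&merged, rebuilt)
	fmt.Println("after merging update:", merged.Replicas) // 5 — replicas kept
}
```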

I would like to confirm whether this could be the cause of the phenomenon described above.
How to reproduce it (as minimally and precisely as possible):

Environment:

  • Karmada version: 1.9
@lianzhanbiao lianzhanbiao added the kind/bug Categorizes issue or PR as related to a bug. label Dec 9, 2024
@XiShanYongYe-Chang
Member

Hi @lianzhanbiao, thank you for your feedback. Did you encounter this error by chance, or did it happen every time you tested it?

@lianzhanbiao
Author

Hi @lianzhanbiao, thank you for your feedback. Did you encounter this error by chance, or did it happen every time you tested it?

I encountered this error by chance.
I added some logs in ResourceDetector.BuildResourceBinding and DependenciesDistributor.SetupWithManager:
[screenshots: output of the added log statements]
In ResourceDetector.BuildResourceBinding, the replicas changed in the sequence 0->0->5.
In DependenciesDistributor.SetupWithManager, the replicas changed in the sequence 0->5->0.

I reviewed the code again, and it seems that the SetupWithManager function in DependenciesDistributor shouldn't affect the Deployment's ResourceBinding; it should only affect dependencies such as the Deployment's volume. I am confused by the phenomenon described above...

@XiShanYongYe-Chang
Member

Let me try it and have a look.

@XiShanYongYe-Chang
Member

Hi @lianzhanbiao, sorry for the late reply. You added the logs and the problem recurred, so it seems this problem occurs frequently. What is the interval between steps 3 and 4? Also, could you share your logs? Which specific version are you using?

@lianzhanbiao
Author

lianzhanbiao commented Dec 26, 2024

Hi @lianzhanbiao, sorry for the late reply. You added the logs and the problem recurred, so it seems this problem occurs frequently. What is the interval between steps 3 and 4? Also, could you share your logs? Which specific version are you using?

I1205 16:17:40.993931   77524 detector.go:837] set replicas for workload(f8-btif-3vqrriuz-0-preempt-online-126168), replicas=0
I1205 16:17:41.003928   77524 detector.go:837] set replicas for workload(f8-btif-3vqrriuz-0-preempt-online-126168), replicas=0
I1205 16:17:41.029818   77524 detector.go:837] set replicas for workload(f8-btif-3vqrriuz-0-preempt-online-126168), replicas=0
I1205 16:17:41.033789   77524 detector.go:837] set replicas for workload(f8-btif-3vqrriuz-0-preempt-online-126168), replicas=5

Some logs about my problem are above.
My Karmada version is 1.9.

@lianzhanbiao
Author

lianzhanbiao commented Dec 26, 2024

BTW, I worked around my problem by setting ConcurrentResourceTemplateSyncs to 1 and PropagateDeps to false...

I suspect that there might be a bug in Karmada when creating ResourceBindings concurrently for workloads. Perhaps, in addition to setting the resync-period, Karmada could support other mechanisms for detecting and repairing inconsistencies like the one above (e.g., implementing certain checks in the status controller).
This suspicion arose because I recently observed that, under high-frequency use of Karmada, there are cases where the spec and status of ResourceBindings remain inconsistent, for example:
[screenshots: ResourceBindings whose spec and status disagree]
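The lost-update interleaving suspected above can be sketched deterministically. The sketch below uses a hypothetical in-memory last-write-wins store with no optimistic concurrency control; the real apiserver rejects writes with a stale resourceVersion, so this only illustrates why two reconcilers that read, rebuild, and write back the same binding can end with the scale-up lost (0 -> 5 -> 0):

```go
package main

import "fmt"

// binding is a hypothetical stand-in for a ResourceBinding's spec.
type binding struct {
	replicas int32
}

// store is a last-write-wins in-memory store, deliberately lacking the
// apiserver's resourceVersion-based conflict detection.
type store struct {
	b binding
}

func (s *store) get() binding  { return s.b }
func (s *store) put(b binding) { s.b = b }

func main() {
	s := &store{b: binding{replicas: 0}}

	// Reconciler A (think: DependenciesDistributor) reads a snapshot early.
	snapshotA := s.get() // replicas = 0

	// Reconciler B (think: ResourceDetector) applies the scale-up.
	snapshotB := s.get()
	snapshotB.replicas = 5
	s.put(snapshotB) // replicas: 0 -> 5

	// Reconciler A now writes back its stale snapshot: 5 -> 0.
	s.put(snapshotA)

	fmt.Println("final replicas:", s.get().replicas) // 0 — the scale-up is lost
}
```

Setting ConcurrentResourceTemplateSyncs to 1 serializes the writers, which is consistent with the workaround making the symptom disappear.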

@XiShanYongYe-Chang
Member

Let's keep this issue open and see whether others report similar problems.

This suspicion arises because I recently observed that during high-frequency use of Karmada, there are cases where the spec and status of RBs remain inconsistent, such as:

The spec and status are not always consistent. The status changes based on the state of the member clusters, and it takes some time for it to catch up with the content of the spec. When a member cluster is faulty, the status can remain inconsistent for a long time.
