DRA: Update resourceslice controller filtering logic #128000

Merged
merged 1 commit into from
Oct 14, 2024

klueska
Contributor

@klueska klueska commented Oct 11, 2024

The logic has been updated to ensure that a controller started for non-node-local resources filters out all resourceslices created for node-local resources. Without this change, a single driver with both node-local and non-node-local resources would end up in a constant battle of creating and deleting node-local resource slices in the controller it set up for its non-node-local resources. This change fixes that.
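
To illustrate the filtering described above, here is a minimal, self-contained sketch. It is not the code from this PR: the `Slice` type, the `ownedSlices` helper, and the driver name `example.com/driver` are hypothetical stand-ins, and the sketch assumes that a slice for non-node-local resources is recognizable by an empty node name.

```go
package main

import "fmt"

// Slice is a hypothetical, simplified stand-in for a ResourceSlice.
// Assumption: NodeName is empty for non-node-local resources.
type Slice struct {
	Name     string
	Driver   string
	NodeName string
}

// ownedSlices returns the slices that a controller should manage.
// An empty nodeName means the controller was started for non-node-local
// resources, so only slices with an empty NodeName match and every
// node-local slice of the same driver is filtered out.
func ownedSlices(all []Slice, driver, nodeName string) []Slice {
	var owned []Slice
	for _, s := range all {
		if s.Driver == driver && s.NodeName == nodeName {
			owned = append(owned, s)
		}
	}
	return owned
}

func main() {
	all := []Slice{
		{Name: "gpu-node-a", Driver: "example.com/driver", NodeName: "node-a"},
		{Name: "gpu-node-b", Driver: "example.com/driver", NodeName: "node-b"},
		{Name: "fabric", Driver: "example.com/driver", NodeName: ""},
		{Name: "other", Driver: "other.example.com", NodeName: ""},
	}
	// The controller for non-node-local resources only sees "fabric".
	for _, s := range ownedSlices(all, "example.com/driver", "") {
		fmt.Println(s.Name)
	}
}
```

If the filter matched on the driver name alone, the controller for non-node-local resources would also see `gpu-node-a` and `gpu-node-b`, treat them as slices it did not publish, and delete them, while the per-node controllers would keep recreating them.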

/kind bug
/wg device-management

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 11, 2024
@k8s-ci-robot k8s-ci-robot requested review from bart0sh and pohly October 11, 2024 10:29
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/node Categorizes an issue or PR as relevant to SIG Node. approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 11, 2024
@klueska
Contributor Author

klueska commented Oct 11, 2024

/cc @pohly

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 11, 2024
@klueska
Contributor Author

klueska commented Oct 11, 2024

/hold
Since I have approval status

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 11, 2024
@pohly
Contributor

pohly commented Oct 11, 2024

/wg device-management

@k8s-ci-robot k8s-ci-robot added the wg/device-management Categorizes an issue or PR as relevant to WG Device Management. label Oct 11, 2024
@pohly
Contributor

pohly commented Oct 11, 2024

"gofmt" failed.

@klueska klueska force-pushed the fix-resourceslice-filter branch from d321add to 1470734 Compare October 12, 2024 20:22
The logic has been updated to ensure that a controller started for
non-node-local resources filters out all resourceslices created for
node-local resources. Without this change, a single driver with both
node-local and non-node-local resources would end up in a constant
battle of creating and deleting node-local resource slices in the
controller it set up for its non-node-local resources. This change fixes
that.

Signed-off-by: Kevin Klues <[email protected]>
@klueska klueska force-pushed the fix-resourceslice-filter branch from 1470734 to cfd6037 Compare October 12, 2024 20:30
@bart0sh
Contributor

bart0sh commented Oct 13, 2024

/king bug
/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 13, 2024
@klueska
Contributor Author

klueska commented Oct 13, 2024

@pohly do you know anything about the errors in the DRA tests:

+ kubetest2 noop --test=node -- --repo-root=. --gcp-zone=us-west1-b --parallelism=1 '--label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf DynamicResourceAllocation && !Flaky && !Slow"' '--test-args=--feature-gates="DynamicResourceAllocation=true" --service-feature-gates="DynamicResourceAllocation=true" --runtime-config=api/alpha=true,api/beta=true --container-runtime-endpoint=unix:///var/run/crio/crio.sock --container-runtime-process-name=/usr/local/bin/crio --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/crio.service --kubelet-cgroups=/system.slice/kubelet.service" --extra-log="{\"name\": \"crio.log\", \"journalctl\": [\"-u\", \"crio\"]}"' --image-config-file=/home/prow/go/src/k8s.io/test-infra/jobs/e2e_node/crio/latest/image-config-cgroupv1-serial.yaml
Error: unknown flag: --label-filter

@pohly
Contributor

pohly commented Oct 14, 2024

It looks like --label-filter is being passed to kubetest2? That's wrong, it must be set as a test flag that kubetest2 then passes on to Ginkgo.

@pohly
Contributor

pohly commented Oct 14, 2024

See kubernetes/test-infra#33550

@pohly
Contributor

pohly commented Oct 14, 2024

@bart0sh
Contributor

bart0sh commented Oct 14, 2024

@klueska I believe it's safe to ignore the -kubetest2 failures for now, since moving the jobs to kubetest2 is ongoing work. Here is the tracking issue: kubernetes/test-infra#32567

@klueska
Contributor Author

klueska commented Oct 14, 2024

I will wait until that fix for --label-filter gets merged. I want to make sure that the DRA tests don't break with the change I introduced. I can't imagine they would, but this change isn't urgent anyway.

@pohly
Contributor

pohly commented Oct 14, 2024

We are still testing with the non-kubetest2 jobs (pull-kubernetes-node-e2e-crio-cgrpv1-dra, pull-kubernetes-node-e2e-crio-cgrpv2-dra), which are green.

Contributor

@pohly pohly left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 8ecc9498f1818dbe38735587268501303f432829

@pohly
Contributor

pohly commented Oct 14, 2024

/hold cancel

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: klueska, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2024
@pacoxu
Member

pacoxu commented Oct 14, 2024

/kind bug
/skip

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Oct 14, 2024
@k8s-ci-robot
Contributor

k8s-ci-robot commented Oct 14, 2024

@klueska: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2 | cfd6037 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2 |
| pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2 | cfd6037 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@pacoxu
Member

pacoxu commented Oct 14, 2024

/test pull-kubernetes-unit
flake

@k8s-ci-robot k8s-ci-robot merged commit faf89fe into kubernetes:master Oct 14, 2024
21 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.32 milestone Oct 14, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.
5 participants