
k8s-dns e2e test suite failing with exit status 1 at HEAD #646

Closed
DamianSawicki opened this issue Oct 6, 2024 · 8 comments
@DamianSawicki
Collaborator

pull-kubernetes-dns-test fails at HEAD (verified with the no-op PR #645) as shown below:

...
2024/10/06 16:17:58 test | 2024/10/06 16:17:53 sidecar started
2024/10/06 16:17:58 test | 2024/10/06 16:17:53 running `dig`
2024/10/06 16:17:58 test | 2024/10/06 16:17:53 Waiting for hits to be reported to be greater than 100
2024/10/06 16:17:58 test | 
2024/10/06 16:17:58 All tests passed
2024/10/06 16:17:58 docker [rmi -f k8s-dns-sidecar-e2e-test]
Running Suite: k8s-dns e2e test suite
=====================================
Random Seed: 1728231478
Will run 5 of 5 specs
2024/10/06 16:18:20 exit status 1
Ginkgo ran 1 suite in 21.764852525s
Test Suite Failed

This most probably blocks the vulnerability-fix PR #638, which has been open since July and for which the tests fail identically.

For the last merged PR #635, the test pull-kubernetes-dns-test passed, so the tests or the test infra must have changed in the meantime. For #638, the test failed identically on July 23rd, July 29th, and September 14th, so the issue seems to predate the August 2024 Prow migration.

@DamianSawicki
Collaborator Author

I think the failing test is defined in test/e2e/e2e_test.go in this repo. That file has not been modified since #635, so this looks more like an infra issue.

When I tried to run the test locally, I got the message "2024/10/06 21:08:39 e2e test requires `sudo` to be active. Run `sudo -v` before running the e2e test.", so perhaps it is a matter of permissions?
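
(A minimal local-repro sketch; the `go test` invocation below is an assumption, since the repo may instead wrap the suite in a Makefile target, so check the Makefile for the exact entry point.)

sudo -v                 # refresh sudo credentials first, as the error message requests
go test ./test/e2e/...  # assumed invocation; adjust if the repo uses a Makefile target or ginkgo directly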

Also, in the artifacts of the failed run, in the file podinfo.json, I found the following:

				{
					"name": "test",
					"state": {
						"terminated": {
							"exitCode": 1,
							"reason": "Error",
							"message": " test | \n2024/10/06 16:17:58 All tests passed\n2024/10/06 16:17:58 docker [rmi -f k8s-dns-sidecar-e2e-test]\nRunning Suite: k8s-dns e2e test suite\n=====================================\nRandom Seed: \u001b[1m1728231478\u001b[0m\nWill run \u001b[1m5\u001b[0m of \u001b[1m5\u001b[0m specs\n\n2024/10/06 16:18:20 exit status 1\n\nGinkgo ran 1 suite in 21.764852525s\nTest Suite Failed\n\n\u001b[38;5;228mGinkgo 2.0 is coming soon!\u001b[0m\n\u001b[38;5;228m==========================\u001b[0m\n\u001b[1m\u001b[38;5;10mGinkgo 2.0\u001b[0m is under active development and will introduce several new features, improvements, and a small handful of breaking changes.\nA release candidate for 2.0 is now available and 2.0 should GA in Fall 2021.  \u001b[1mPlease give the RC a try and send us feedback!\u001b[0m\n  - To learn more, view the migration guide at \u001b[38;5;14m\u001b[4mhttps://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md\u001b[0m\n  - For instructions on using the Release Candidate visit \u001b[38;5;14m\u001b[4mhttps://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#using-the-beta\u001b[0m\n  - To comment, chime in at \u001b[38;5;14m\u001b[4mhttps://github.com/onsi/ginkgo/issues/711\u001b[0m\n\nTo \u001b[1m\u001b[38;5;204msilence this notice\u001b[0m, set the environment variable: \u001b[1mACK_GINKGO_RC=true\u001b[0m\nAlternatively you can: \u001b[1mtouch $HOME/.ack-ginkgo-rc\u001b[0m\n+ EXIT_VALUE=1\n+ set +o xtrace\nCleaning up after docker in docker.\n================================================================================\nWaiting 30 seconds for pods stopped with terminationGracePeriod:30\nCleaning up after docker\nWaiting for docker to stop for 30 seconds\nStopping Docker: dockerProgram process in pidfile '/var/run/docker-ssd.pid', 1 process(es), refused to die.\n================================================================================\nDone cleaning up after docker in docker.\n{\"component\":\"entrypoint\",\"error\":\"wrapped process failed: exit status 1\",\"file\":\"sigs.k8s.io/prow/pkg/entrypoint/run.go:84\",\"func\":\"sigs.k8s.io/prow/pkg/entrypoint.Options.internalRun\",\"level\":\"error\",\"msg\":\"Error executing test process\",\"severity\":\"error\",\"time\":\"2024-10-06T16:19:10Z\"}\n",
							"startedAt": "2024-10-06T15:55:53Z",
							"finishedAt": "2024-10-06T16:19:10Z",
							"containerID": "containerd://302c6068cdfb4c64dd8aafb8b56a4f61083e252a3c594e89249c2a568e443000"
						}
					},
					"lastState": {},
					"ready": false,
					"restartCount": 0,
					"image": "gcr.io/k8s-staging-test-infra/kubekins-e2e:v20240923-c8645c1a17-master",
					"imageID": "gcr.io/k8s-staging-test-infra/kubekins-e2e@sha256:c5cf57a29e78a568ecf90a3b5b4df6b2afd5245c97edda91759e3e07f2330ba7",
					"containerID": "containerd://302c6068cdfb4c64dd8aafb8b56a4f61083e252a3c594e89249c2a568e443000",
					"started": false
				}

This mentions the kubekins-e2e image, which seems to be deprecated.

@DamianSawicki
Collaborator Author

Hey @BenTheElder, I found you among the owners of kubekins-e2e mentioned above. Would you be able to look at the comments above and possibly share some advice?

@BenTheElder
Member

I don't work in this repo, but kubekins-e2e is an image we currently use to run some CI in the Kubernetes project. It has a grab bag of tools, like docker. Any other usage is best-effort.

podinfo.json describes the pod in which we executed the PR tests. For more, see https://docs.prow.k8s.io/docs/jobs/ and https://github.com/kubernetes/test-infra (config/).

@BenTheElder
Member

Unless this project opted into it, the pod most likely ran as root, but it's hard to know without tracing the job specifics. For example, you may have scheduled the test into the cluster under test (which is NOT the cluster we use to run CI; that cluster just executes the CI workloads, which then create disposable test clusters).

> seems to predate the August 2024 Prow migration.

That migration was for the control plane. Migrating the workloads was done prior to this, and it varies by workload.

You can find this job's definition in the test-infra repo and see the git history there.
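
(A rough sketch of those steps; the exact file path under config/jobs/ is an assumption, so use whatever grep reports.)

git clone https://github.com/kubernetes/test-infra
cd test-infra
# Locate the presubmit definition; the exact file is unknown up front.
grep -rln "pull-kubernetes-dns-test" config/jobs/
# Then review the history of the file(s) grep finds, e.g.:
git log -p -- config/jobs/<file-from-grep>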

We're currently approaching KEP Freeze, and I will be out for a few days after that, so time is tight this week 😅

@DamianSawicki
Collaborator Author

Ben, thank you very much for your responses!

@VikashLNU @zhangguanzhang You can have a look at the comments above to try to unblock the PR #638 you're interested in.

@zhangguanzhang

> Ben, thank you very much for your responses!
>
> @VikashLNU @zhangguanzhang You can have a look at the comments above to try to unblock the PR #638 you're interested in.

I don't see how to resolve the issue, but once someone fixes the CI build problem, I can rebase my code onto the master branch and push it.

@dereknola
Contributor

We should be good to close this issue now; #651 addressed it.

@DamianSawicki
Collaborator Author

Yeah, thank you very much again, @dereknola!
