-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release v0.5.0 requirements #438
Comments
I think we're good to proceed with the release, I'll do it tomorrow. |
Can we include the status update PR? |
I'll try, it's almost done but after rebasing to include the TTL changes, there seems to be some bug / race condition causing the TTL feature to not work. The TTL integration test passes when I step through in debug mode, but not when run at full speed (it fails here, expected job deletion timestamps are not set). These kind of issues sometimes take a while to determine the root cause (and I have other deadlines coming up this month I need to mindful of) and I don't want to delay the v0.5.0 too much if we can avoid it, since we are trying to coordinate the release with Kueue's next release on April 15th, so that our changes in #487 can be utilized in their Multi-Kueue work. I'll see if I can figure it out tomorrow. |
@danielvegamyhre can you describe the issue with TTL? I can work on fixing it. I never managed to get a failure in the integration test neither locally nor on CI, I do have some experience with deflaking tests, I will try to run to replicate a failure. |
@dejanzele that would be super helpful! You can checkout the feature branch in #494 locally and run the integration tests. This line which checks that the expected number of jobs have been deleted fails. Adding print statements, it shows the jobs being listed do not have the deletion timestamp set when the test is run at full speed. When I add breakpoints in the JobSet controller and step through the logic, the test passes. To be clear, I don't think your PR implementing TTL support had a bug. The issue seems to be the interaction of my new changes (updating JobSet status only once per reconcile call, at the very end) with the existing TTL logic, causing jobs to not be deleted. |
@danielvegamyhre this should fix it https://github.com/kubernetes-sigs/jobset/pull/494/files#r1560773603, tests pass locally on my side |
v0.5.0 was released in #515 |
Targeting release around April 10th, 2024.
Must haves
managedBy
field #440Nice to haves
The text was updated successfully, but these errors were encountered: