feat: capture errors that shouldn't cause the whole plan to be discarded during network resolution #444
base: transition-to-runkit
InternalResolutionFailure is a helper struct that collects enough information about a resolution failure so that it can be turned into a ResolutionError in another scope. The ResolutionError describes the instance task and the toplevel task, so the system can clean them up and propagate the error at the right scope. Finally, PartialNetworkResolution is meant to be used as an exception that aggregates multiple resolution errors/failures so they can be raised at once.
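A minimal sketch of how these three pieces could fit together. The field names, the `to_resolution_errors` helper and the sample task names are assumptions made for illustration; the actual classes in this PR may be shaped differently.

```ruby
# Hypothetical shapes, for illustration only.

# Collected in the inner scope, where the toplevel task is not yet known.
InternalResolutionFailure = Struct.new(:instance_task, :original_exception)

# Built in the outer scope, once the toplevel task can be looked up.
class ResolutionError < StandardError
  attr_reader :instance_task, :toplevel_task, :original_exception

  def initialize(instance_task, toplevel_task, original_exception)
    @instance_task = instance_task
    @toplevel_task = toplevel_task
    @original_exception = original_exception
    super("resolution of #{instance_task} (toplevel: #{toplevel_task}) " \
          "failed: #{original_exception.message}")
  end
end

# Aggregates several resolution errors so they can be raised at once.
class PartialNetworkResolution < StandardError
  attr_reader :errors

  def initialize(errors)
    @errors = errors
    super("#{errors.size} resolution error(s) during network generation")
  end
end

# Hypothetical helper: turn collected failures into ResolutionErrors using a
# mapping from instance task to toplevel task.
def to_resolution_errors(failures, toplevel_for)
  failures.map do |failure|
    ResolutionError.new(
      failure.instance_task,
      toplevel_for.fetch(failure.instance_task),
      failure.original_exception
    )
  end
end

failures = [
  InternalResolutionFailure.new("camera_task", ArgumentError.new("no deployment found"))
]
errors = to_resolution_errors(failures, { "camera_task" => "camera_toplevel" })

begin
  raise PartialNetworkResolution.new(errors) unless errors.empty?
rescue PartialNetworkResolution => e
  puts e.message
  e.errors.each { |error| puts "  #{error.message}" }
end
```

Keeping the struct separate from the error lets the inner scope record a failure without needing to know which toplevel task it belongs to.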
One thing is not clear to me: when deploying the composition below, suppose there is an error with the second deployment and on_error: commit is set. Will the deployer still deploy the first and second child?
class Parent < Compositions
  add Model1, as: "first"
  add Model2, as: "second"
  add Model3, as: "third"
  ...
end
def validate_generated_network(toplevel_tasks: @toplevel_tasks,
                               merge_solver: @merge_solver)
Will some plugin pass its own toplevel_tasks or merge_solver?
Nope, any plugin should implement this version of the method itself (that's where the super comes in). I will remove these arguments; they aren't needed.
If a child of a parent fails the deployment validation, the child and all of its parents fail as well, and therefore the other children also fail.
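The propagation rule described above can be sketched on a plain hash-based graph. The `CHILDREN` table, the extra `Other`/`fourth` entries and the helper names are hypothetical, and the traversal only illustrates the rule, not the validation code in this PR.

```ruby
require "set"

# Hypothetical task graph: parent name => names of its children.
CHILDREN = {
  "Parent" => %w[first second third],
  "Other" => %w[fourth]
}.freeze

# All transitive parents of +task+.
def ancestors_of(task, children, seen = Set.new)
  children.each do |parent, kids|
    next unless kids.include?(task)

    ancestors_of(parent, children, seen) if seen.add?(parent)
  end
  seen
end

# All transitive children of +task+.
def descendants_of(task, children, seen = Set.new)
  children.fetch(task, []).each do |kid|
    descendants_of(kid, children, seen) if seen.add?(kid)
  end
  seen
end

# A failing task fails its parents, and each failed parent in turn fails
# all of its own children.
def tasks_failed_by(task, children)
  failed = Set[task]
  ancestors_of(task, children).each do |ancestor|
    failed << ancestor
    failed.merge(descendants_of(ancestor, children))
  end
  failed
end

p tasks_failed_by("second", CHILDREN).sort
```

In this sketch a failure in "second" takes down "Parent", "first" and "third", while the unrelated "Other" and "fourth" are left alone.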
These errors shouldn't block the whole transaction from being applied. Instead, they are captured so we know which tasks are badly defined and can deal with them later on (i.e. propagate errors for each individual task).
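As an illustration of the capture-instead-of-raise idea, here is a small sketch. The `capture_resolution_failures` helper and the lambda-based "resolvers" are hypothetical stand-ins for the real resolution steps.

```ruby
# Hypothetical helper: run every resolver, keep going when one of them
# raises, and report the failures per task instead of aborting everything.
def capture_resolution_failures(resolvers)
  resolved = {}
  failures = {}
  resolvers.each do |task_name, resolver|
    begin
      resolved[task_name] = resolver.call
    rescue StandardError => e
      # Remember which task is badly defined; it is dealt with later on.
      failures[task_name] = e
    end
  end
  [resolved, failures]
end

resolvers = {
  "good_task" => -> { :deployed },
  "bad_task" => -> { raise ArgumentError, "no deployment found" }
}

resolved, failures = capture_resolution_failures(resolvers)
puts "resolved: #{resolved.keys.join(', ')}"
failures.each { |name, error| puts "failed: #{name} (#{error.message})" }
```

The well-defined task is still resolved even though another task in the same batch raised.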
This is the first step in processing the captured resolution failures. They are turned into ResolutionErrors, which hold the instance requirement task and the toplevel task each failure relates to, along with details about the exception that caused it. When cleanup_after_resolution_errors is on, instance requirement tasks that hit a resolution error during network generation are not deployed, and the related toplevel task is removed from the plan. This keeps the work_plan clean so the transaction can continue. We also use the on_error behavior in the #resolve method to signal when not to clean up: cleanup must never happen when the behavior is set to commit the transaction on error.
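A rough sketch of how the cleanup decision could be tied to the on_error behavior. The symbol values (`:commit`, `:discard`), the hash-based error records and both helper names are assumptions; only the rule "never clean up when committing on error" comes from the description above.

```ruby
# Hypothetical predicate: cleanup is skipped when the transaction is
# committed on error.
def cleanup_after_resolution_errors?(on_error)
  on_error != :commit
end

# Hypothetical helper: return the tasks that stay in the plan. Each error is
# a hash naming the instance requirement task and its toplevel task.
def tasks_to_keep(plan_tasks, resolution_errors, on_error:)
  return plan_tasks unless cleanup_after_resolution_errors?(on_error)

  discarded = resolution_errors.flat_map do |error|
    [error[:instance_task], error[:toplevel_task]]
  end
  plan_tasks - discarded
end

plan_tasks = %w[camera camera_toplevel arm arm_toplevel]
errors = [{ instance_task: "camera", toplevel_task: "camera_toplevel" }]

p tasks_to_keep(plan_tasks, errors, on_error: :discard)
p tasks_to_keep(plan_tasks, errors, on_error: :commit)
```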
This allows resolution errors to be handled during network generation.
This emits the failed event on the planning task of a plan resolution that failed. The error emitted is a ResolutionError.
This is more in line with the new resolution errors, where these "exceptions" are now replicated for each failing task individually instead of being grouped into one exception.
They now emit a PartialNetworkResolution when any resolution error is detected. This is propagated especially to keep the behavior of the profile assertions.
We introduced a ResolutionError to mark errors that occur during the
new plan network resolution and that shouldn't be raised, since raising
them causes the whole transaction to fail. Instead, we capture them and
fail the deployment of the specific tasks that caused them.