feat: capture errors that shouldnt cause the whole plan to be discarded during network resolution #444

jhonasiv · 2024-12-06T13:34:36Z

We introduced a ResolutionError to mark errors that occur during the
network resolution of the new plan and should not be raised, since raising
them causes the whole transaction to fail. Instead, we capture them and
fail the deployment of the specific tasks that caused them.

InternalResolutionFailure is a helper struct that collects enough information about a resolution failure so that it can be turned into a ResolutionError in another scope. The ResolutionError describes the instance task and the toplevel task, so the system can clean them up and propagate the error at the right scope. Finally, PartialNetworkResolution is meant to be used as an exception that aggregates multiple resolution errors/failures so they can be raised at once.
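To make the relationship between the three types concrete, here is a minimal sketch. Only the class names come from this PR; the fields, the constructor signatures and the `to_resolution_error` helper are assumptions made for illustration, not the actual Syskit definitions.

```ruby
# Hypothetical sketch: field names and signatures are assumptions.

# Collects what is known at the point where the resolution fails.
InternalResolutionFailure = Struct.new(:task, :original_exception) do
  # Converted later, in a scope where the instance requirement task
  # and the toplevel task are known.
  def to_resolution_error(instance_task:, toplevel_task:)
    ResolutionError.new(
      instance_task: instance_task,
      toplevel_task: toplevel_task,
      original_exception: original_exception
    )
  end
end

# Describes which instance task and toplevel task a failure relates to.
class ResolutionError < StandardError
  attr_reader :instance_task, :toplevel_task, :original_exception

  def initialize(instance_task:, toplevel_task:, original_exception:)
    @instance_task = instance_task
    @toplevel_task = toplevel_task
    @original_exception = original_exception
    super("resolution of #{instance_task} failed: #{original_exception.message}")
  end
end

# Aggregates several resolution errors so they can be raised at once.
class PartialNetworkResolution < StandardError
  attr_reader :errors

  def initialize(errors)
    @errors = errors
    super("#{errors.size} resolution error(s):\n" + errors.map(&:message).join("\n"))
  end
end

# Usage: capture a failure, convert it, then raise everything together.
failure = InternalResolutionFailure.new("second", RuntimeError.new("no deployment found"))
error = failure.to_resolution_error(instance_task: "second_ir", toplevel_task: "parent")
begin
  raise PartialNetworkResolution.new([error])
rescue PartialNetworkResolution => e
  e.errors.each { |err| puts "#{err.toplevel_task}: #{err.message}" }
end
```

Splitting the failure record from the error lets the code that detects the problem stay unaware of which toplevel task it belongs to.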

wvmcastro

One thing is not clear to me: suppose that, when deploying the composition below, there is an error with the deployment of second and on_error: commit is set. Will the deployer still deploy the first and second children?

class Parent < Compositions
  add Model1, as: "first"
  add Model2, as: "second"
  add Model3, as: "third"
  ...
end

lib/syskit/cli/doc/gen.rb

lib/syskit/network_generation/system_network_deployer.rb

wvmcastro · 2025-02-03T18:35:19Z

lib/syskit/network_generation/system_network_generator.rb

+            def validate_generated_network(toplevel_tasks: @toplevel_tasks,
+                merge_solver: @merge_solver)


Will some plugin pass its own toplevel_tasks or merge_solver?

Nope, any plugin should implement its own version of this method (that's where the super comes in). I will remove these arguments; they aren't needed.
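As an illustration of the "plugin implements its own version and calls super" pattern, here is a minimal sketch. Only the method name `validate_generated_network` comes from the diff; the stand-in generator class, the `ExamplePlugin` module, the task hashes and the returned lists of messages are all hypothetical, and using `prepend` is an assumption about how a plugin might hook in.

```ruby
# Hypothetical stand-in for the generator; not the actual Syskit class.
class SystemNetworkGenerator
  def initialize(toplevel_tasks)
    @toplevel_tasks = toplevel_tasks
  end

  # Base validation. No arguments: it reads the generator's own state.
  # Returning a list of messages is an assumption made for this sketch.
  def validate_generated_network
    @toplevel_tasks
      .reject { |task| task[:deployed] }
      .map { |task| "#{task[:name]} is not deployed" }
  end
end

# Hypothetical plugin: it adds its own checks on top of the base ones.
module ExamplePlugin
  def validate_generated_network
    errors = super # run the base validation first
    unresolved = @toplevel_tasks.reject { |task| task[:frames_ok] }
    errors + unresolved.map { |task| "#{task[:name]} has unresolved frames" }
  end
end

# prepend puts the plugin's method ahead of the class' own in the lookup
# chain, so super inside the plugin reaches the base implementation.
SystemNetworkGenerator.prepend(ExamplePlugin)

generator = SystemNetworkGenerator.new(
  [
    { name: "a", deployed: true, frames_ok: false },
    { name: "b", deployed: false, frames_ok: true }
  ]
)
puts generator.validate_generated_network
```

Since the plugin shares the generator's instance variables, it has no reason to receive toplevel_tasks or merge_solver as arguments.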

jhonasiv · 2025-02-03T19:05:42Z

One thing is not clear to me: suppose that, when deploying the composition below, there is an error with the deployment of second and on_error: commit is set. Will the deployer still deploy the first and second children?
class Parent < Compositions
  add Model1, as: "first"
  add Model2, as: "second"
  add Model3, as: "third"
  ...
end

If a child of a parent fails the deployment validation, the child itself and all of its parents fail as well, and therefore the other children fail too. PS: on_error: commit is only used by the profile_assertions tests, so under normal circumstances the deployer does what I just described. With on_error: commit, however, the whole transaction is committed, which leads to all present errors being propagated at the end of the network generation process.
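The propagation rule described above can be sketched on the Parent composition from the question. This is a simplified model with hypothetical names (`CHILDREN`, `tasks_failing_with`): the tree is a plain hash, and tasks shared with other compositions are ignored.

```ruby
require "set"

# Hypothetical composition tree: parent name => names of its children.
CHILDREN = {
  "Parent" => %w[first second third]
}.freeze

# Returns every task that fails when failed_task fails validation: the task
# itself, all of its ancestors, and everything those ancestors contain.
def tasks_failing_with(failed_task, children = CHILDREN)
  failing = Set[failed_task]

  # Walk up: any parent that contains a failing task fails too.
  loop do
    parents = children.select { |_, kids| kids.any? { |kid| failing.include?(kid) } }.keys
    break if parents.all? { |parent| failing.include?(parent) }

    failing.merge(parents)
  end

  # Walk down: the children of a failed parent are not deployed either.
  queue = failing.to_a
  until queue.empty?
    task = queue.shift
    children.fetch(task, []).each do |kid|
      queue << kid if failing.add?(kid) # add? returns nil if already present
    end
  end

  failing
end

p tasks_failing_with("second").to_a.sort
# prints ["Parent", "first", "second", "third"]
```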

These errors shouldn't block the whole transaction from being applied. Instead, they are captured so we know which tasks are badly defined and can deal with them later on (i.e. propagate errors for each individual task).

This is the first step in handling the captured resolution failures. They are processed into ResolutionErrors, which contain information about the instance requirement task and the toplevel task each failure relates to, along with details about the exception that caused it. When cleanup_after_resolution_errors is on, the instance requirement tasks that hit a resolution error during network generation are not deployed, and the related toplevel task is removed from the plan. This ensures that the work_plan is clean and that the transaction can continue. We also use the on_error behavior in the #resolve method to signal when not to clean up: cleanup should never happen when the behavior is set to commit the transaction on error.
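A rough sketch of that cleanup decision, under stated assumptions: `resolve_sketch`, the `SketchResolutionError` struct and the hash-based plan are hypothetical stand-ins, and the only on_error value taken from this PR is commit.

```ruby
# Hypothetical record; the real ResolutionError carries more detail.
SketchResolutionError = Struct.new(:instance_task, :toplevel_task, :reason)

# requirements: hash of instance requirement task => toplevel task.
# Returns [instance tasks to deploy, toplevel tasks kept in the plan].
def resolve_sketch(requirements, errors, on_error: nil)
  # Never clean up when the transaction is to be committed on error.
  cleanup_after_resolution_errors = (on_error != :commit)
  unless cleanup_after_resolution_errors
    return [requirements.keys, requirements.values]
  end

  failed_instances = errors.map(&:instance_task)
  failed_toplevels = errors.map(&:toplevel_task)

  # Failed instance requirement tasks are not deployed, and their toplevel
  # tasks are removed so the rest of the plan can still be applied.
  [requirements.keys - failed_instances, requirements.values - failed_toplevels]
end

requirements = { "ir_a" => "top_a", "ir_b" => "top_b" }
errors = [SketchResolutionError.new("ir_b", "top_b", "missing deployment")]

p resolve_sketch(requirements, errors)
p resolve_sketch(requirements, errors, on_error: :commit)
```

With cleanup, only ir_a and top_a survive; with on_error: :commit everything is kept, and the errors are expected to surface later.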

jhonasiv requested review from doudou and wvmcastro December 6, 2024 13:34

jhonasiv self-assigned this Dec 6, 2024

jhonasiv force-pushed the capture-errors branch 3 times, most recently from d0b83eb to 63070d9 Compare December 6, 2024 17:17

jhonasiv force-pushed the capture-errors branch 2 times, most recently from 72722c1 to d070e92 Compare December 23, 2024 21:12

jhonasiv force-pushed the capture-errors branch from 0121c5a to a55ca32 Compare January 29, 2025 19:14

jhonasiv changed the title ~~[WIP] feat: capture errors that shouldnt cause the whole plan to be discarded during network resolution~~ feat: capture errors that shouldnt cause the whole plan to be discarded during network resolution Jan 29, 2025

jhonasiv added 2 commits January 31, 2025 11:52

fix: add exceptions to header file

aea2b43

jhonasiv force-pushed the capture-errors branch from a55ca32 to b66c087 Compare January 31, 2025 14:52

wvmcastro reviewed Feb 3, 2025

jhonasiv force-pushed the capture-errors branch from b66c087 to 571f5dd Compare February 4, 2025 13:10

jhonasiv added 7 commits February 4, 2025 10:41

feat: report the result of apply with instances and resolution errors

6453dc8

This allows resolution errors to be handled during network generation.

feat: propagate resolution errors

0bae6a5

This emits the failed event on the planning task of a plan resolution that failed. The error emitted is a ResolutionError.

fix: rework missing deployment and configuration section to single task

9e60aed

This is more in line with the new resolution errors, where these "exceptions" are now replicated for each failing task individually, instead of being grouped into one exception.

fix: adapt network manipulation to resolution errors scheme

d187f56

They now emit a PartialNetworkResolution when any resolution error is detected. This is propagated especially to keep the behavior of the profile assertions.

fix: raise resolution errors when it was expect an exception

0eafc30

jhonasiv force-pushed the capture-errors branch from 571f5dd to 0eafc30 Compare February 4, 2025 13:41

jhonasiv mentioned this pull request Feb 4, 2025

feat: capture errors during network generation instead of raising them rock-gazebo/drivers-transformer#13

Open

1 task
