Multiple ExecutionCompletedEvent found #176

Open
bhugot opened this issue Jul 8, 2023 · 10 comments
Labels
Needs: Attention 👋 Issue needs attention from maintainers P1 Priority 1

Comments

@bhugot
Contributor

bhugot commented Jul 8, 2023

Hello,

We hit this issue from time to time. It leaves some durable orchestrations stuck in the Running state even though they have completed.

System.InvalidOperationException: Multiple ExecutionCompletedEvent found, potential corruption in state storage
   at DurableTask.Core.OrchestrationRuntimeState.SetMarkerEvents(HistoryEvent historyEvent) in /_/src/DurableTask.Core/OrchestrationRuntimeState.cs:line 265
   at DurableTask.Core.TaskOrchestrationDispatcher.ProcessWorkflowCompletedTaskDecision(OrchestrationCompleteOrchestratorAction completeOrchestratorAction, OrchestrationRuntimeState runtimeState, Boolean includeDetails, Boolean& continuedAsNew) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 822
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemAsync(TaskOrchestrationWorkItem workItem) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 410
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemAsync(TaskOrchestrationWorkItem workItem)
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemSessionAsync(TaskOrchestrationWorkItem workItem) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 194
   at DurableTask.Core.WorkItemDispatcher`1.ProcessWorkItemAsync(WorkItemDispatcherContext context, Object workItemObj) in /_/src/DurableTask.Core/WorkItemDispatcher.cs:line 373

Backing off for 1 seconds until 5 successful operations

Any idea on what could be the cause of this and how to fix it?

@cgillum
Member

cgillum commented Jul 8, 2023

This means that multiple "completion" events have been found in your orchestration history state. The code that raises this error is here. You can see the full set of events here. The "completion" events are ExecutionCompleted and ExecutionTerminated.

To understand why you're seeing this, we'd need to look at the union of your History and NewEvents tables, filtered by your orchestration instance ID, to see whether/why there are multiple completion events for your orchestration that are triggering this error.
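As a rough illustration of the check described above, the following Python sketch scans the combined rows of the two tables for one instance and fails when more than one completion event is present. It is a simplified model, not the DurableTask.Core implementation: the `HistoryRow` type, its fields, and both function names are hypothetical, and only the event type names and table names come from the comment above.

```python
from dataclasses import dataclass

# Event type names as listed in the comment above.
COMPLETION_EVENT_TYPES = {"ExecutionCompleted", "ExecutionTerminated"}


@dataclass(frozen=True)
class HistoryRow:
    """Hypothetical, simplified row; the real tables have more columns."""
    source: str       # "History" or "NewEvents" (table the row came from)
    instance_id: str
    event_type: str


def find_completion_events(rows, instance_id):
    """Return every completion event for one instance, in input order."""
    return [
        row for row in rows
        if row.instance_id == instance_id
        and row.event_type in COMPLETION_EVENT_TYPES
    ]


def check_single_completion(rows, instance_id):
    """Raise if the combined rows hold more than one completion event."""
    completions = find_completion_events(rows, instance_id)
    if len(completions) > 1:
        sources = ", ".join(f"{c.source}:{c.event_type}" for c in completions)
        raise RuntimeError(
            f"Multiple completion events for {instance_id}: {sources}"
        )
    return completions
```

Passing the union of both tables' rows to `check_single_completion` shows which table each duplicate completion event came from, which is the information requested above.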

@bhugot
Contributor Author

bhugot commented Jul 9, 2023

Here is some data:

image

@bhugot
Contributor Author

bhugot commented Jul 9, 2023

I also compared it to an equivalent, successfully completed orchestration. Both have exactly the same history. It looks like the event is registered but not deleted.

@bhugot
Contributor Author

bhugot commented Aug 21, 2023

@cgillum any news on this?

@cgillum
Member

cgillum commented Aug 21, 2023

Unfortunately, no updates yet. I was on a long vacation and apparently our GitHub automation which is supposed to help us track issues stopped working in the past few months. I'll make sure that this shows up in our triage so that we can get someone assigned to take a closer look.

@bachuv bachuv added P1 Priority 1 and removed Needs: Triage 🔍 labels Aug 31, 2023
@davidmrdavid
Member

@bhugot: I'm working on a fix for this issue here - Azure/durabletask#949

@bhugot
Contributor Author

bhugot commented Sep 6, 2023

For information, I think the cause is that a function is, for some reason, running twice (no retry is configured). Of the two cases I encountered, in one the function was not idempotent and the second run failed. In the other the function was idempotent, so both runs succeeded.
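To make the idempotent and non-idempotent cases concrete, here is a minimal Python sketch of a function that tolerates being run twice by remembering results under an idempotency key. It is an illustration only: the `IdempotentActivity` class, the key scheme, and the in-memory store are assumptions, not part of Durable Task, and it does not address the dispatcher error reported in this issue.

```python
class IdempotentActivity:
    """Wraps an action so a repeated run with the same key has no second effect.

    Hypothetical helper for illustration. Results are kept in memory here;
    a real system would need durable storage shared by all workers.
    """

    def __init__(self, action):
        self._action = action
        self._results = {}

    def run(self, idempotency_key, payload):
        # A duplicate run returns the stored result instead of redoing the work.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._action(payload)
        self._results[idempotency_key] = result
        return result


# Example side effect: record a charge in a ledger (stand-in for real work).
ledger = []


def charge(amount):
    ledger.append(amount)
    return len(ledger)


activity = IdempotentActivity(charge)
first = activity.run("instance-1/task-0", 25)
second = activity.run("instance-1/task-0", 25)  # simulated duplicate execution
```

With the wrapper, the duplicate run returns the first run's result and the ledger holds a single entry. Calling `charge` directly twice would record the charge twice, which corresponds to the non-idempotent case described above.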

@bhugot
Contributor Author

bhugot commented Sep 7, 2023

I got a new error related to the same problem:

TaskOrchestrationDispatcher-b8699d665c904134b8e019ba498238dc-0: Unhandled exception with work item '413826bf-e213-4cca-a67a-77c495889fbe': System.InvalidCastException: Unable to cast object of type 'DurableTask.Core.History.TaskCompletedEvent' to type 'DurableTask.Core.History.TaskScheduledEvent'.
   at DurableTask.Core.TaskOrchestrationDispatcher.ReconcileMessagesWithState(TaskOrchestrationWorkItem workItem) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 837
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemAsync(TaskOrchestrationWorkItem workItem) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 328
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemAsync(TaskOrchestrationWorkItem workItem)
   at DurableTask.Core.TaskOrchestrationDispatcher.OnProcessWorkItemSessionAsync(TaskOrchestrationWorkItem workItem) in /_/src/DurableTask.Core/TaskOrchestrationDispatcher.cs:line 194
   at DurableTask.Core.WorkItemDispatcher`1.ProcessWorkItemAsync(WorkItemDispatcherContext context, Object workItemObj) in /_/src/DurableTask.Core/WorkItemDispatcher.cs:line 373

Backing off for 1 seconds until 5 successful operations

@bhugot
Contributor Author

bhugot commented Sep 7, 2023

And this time I had two sequential tasks that were both in TaskCompleted status.

@lilyjma lilyjma added the Needs: Attention 👋 Issue needs attention from maintainers label Sep 13, 2023
@bhugot
Contributor Author

bhugot commented Jul 6, 2024

It seems that the latest version fixed this problem.

5 participants