-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass an error message to the failure node #6181
base: master
Are you sure you want to change the base?
Pass an error message to the failure node #6181
Conversation
Signed-off-by: Alex Wu <[email protected]>
Code Review Agent Run #f10373Actionable Suggestions - 5
Review Details
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6181 +/- ##
==========================================
+ Coverage 37.02% 37.09% +0.07%
==========================================
Files 1317 1318 +1
Lines 132534 132797 +263
==========================================
+ Hits 49069 49265 +196
- Misses 79219 79270 +51
- Partials 4246 4262 +16
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Changelist by BitoThis pull request implements the following key changes.
|
assert.FailNow(t, "Not yet implemented for types %v", reflect.TypeOf(lt1.GetType())) | ||
} | ||
|
||
assert.Equal(t, lt1.GetStructure().GetTag(), lt2.GetStructure().GetTag()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a nil check for GetStructure()
before accessing GetTag()
to avoid potential panic if structure is nil
Code suggestion
Check the AI-generated fix before applying
assert.Equal(t, lt1.GetStructure().GetTag(), lt2.GetStructure().GetTag()) | |
structure1 := lt1.GetStructure() | |
structure2 := lt2.GetStructure() | |
if structure1 != nil && structure2 != nil { | |
assert.Equal(t, structure1.GetTag(), structure2.GetTag()) | |
} |
Code Review Run #f10373
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Message: "node failure", | ||
} | ||
expectedLiterals := make(map[string]*core.Literal, 1) | ||
errorLiteral, _ := coreutils.MakeLiteral(&core.Error{Message: execErr.Message, FailedNodeId: nID,}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling for MakeLiteral()
call since it returns an error that is currently being ignored with _
Code suggestion
Check the AI-generated fix before applying
errorLiteral, _ := coreutils.MakeLiteral(&core.Error{Message: execErr.Message, FailedNodeId: nID,}) | |
errorLiteral, err := coreutils.MakeLiteral(&core.Error{Message: execErr.Message, FailedNodeId: nID,}) | |
if err != nil { | |
t.Fatalf("Failed to create error literal: %v", err) | |
return | |
} |
Code Review Run #f10373
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
nl := NewTestNodeLookup( | ||
map[string]v1alpha1.ExecutableNode{v1alpha1.StartNodeID: n, failureNodeID: n}, | ||
map[string]v1alpha1.ExecutableNodeStatus{v1alpha1.StartNodeID: ns, failureNodeID: ns}, | ||
) | ||
|
||
assert.NotNil(t, nl) | ||
|
||
failureNodeLookup := NewFailureNodeLookup(nl, n, ns) | ||
failureNodeLookup := NewFailureNodeLookup(nl, n, ns, originalErr).(FailureNodeLookup) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a test case to verify behavior when originalErr
is nil
. The type assertion to FailureNodeLookup
assumes the cast will always succeed.
Code suggestion
Check the AI-generated fix before applying
failureNodeLookup := NewFailureNodeLookup(nl, n, ns, originalErr).(FailureNodeLookup) | |
failureNodeLookup, ok := NewFailureNodeLookup(nl, n, ns, originalErr).(FailureNodeLookup) | |
assert.True(t, ok) |
Code Review Run #f10373
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
originalErr, _ := failureNodeLookup.GetOriginalError() | ||
if originalErr != nil { | ||
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling for GetOriginalError()
call. The error return value is currently being ignored which could mask potential issues.
Code suggestion
Check the AI-generated fix before applying
originalErr, _ := failureNodeLookup.GetOriginalError() | |
if originalErr != nil { | |
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) | |
originalErr, err := failureNodeLookup.GetOriginalError() | |
if err != nil { | |
return handler.PhaseInfoFailure(core.ExecutionError_SYSTEM, "FailureNodeError", err.Error(), nil), nil | |
} else if originalErr != nil { | |
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) |
Code Review Run #f10373
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Signed-off-by: Alex Wu <[email protected]>
Signed-off-by: Alex Wu <[email protected]>
Signed-off-by: Alex Wu <[email protected]>
109201f
to
ab9dfd2
Compare
Signed-off-by: Alex Wu <[email protected]>
Code Review Agent Run #4749b8Actionable Suggestions - 4
Review Details
|
if originalErr != nil { | ||
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider checking if nodeInputs
is nil before attempting to resolve failure node input. The current code could potentially cause a nil pointer dereference if nodeInputs
is nil.
Code suggestion
Check the AI-generated fix before applying
if originalErr != nil { | |
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) | |
} | |
} | |
if originalErr != nil && nodeInputs != nil { | |
ResolveOnFailureNodeInput(ctx, nodeInputs, node.GetID(), originalErr) | |
} | |
} |
Code Review Run #4749b8
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
execErr := &core.ExecutionError{ | ||
Message: "node failure", | ||
} | ||
nodeLoopUp := NewFailureNodeLookup(nl, en, ns, execErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding assertions to verify that the execErr
is correctly set in the FailureNodeLookup
struct. Currently, the test doesn't validate the OriginalError
field.
Code suggestion
Check the AI-generated fix before applying
@@ -35,4 +35,5 @@
typed := nodeLoopUp.(FailureNodeLookup)
assert.Equal(t, nl, typed.NodeLookup)
assert.Equal(t, en, typed.FailureNode)
assert.Equal(t, ns, typed.FailureNodeStatus)
+ assert.Equal(t, execErr, typed.OriginalError)
Code Review Run #4749b8
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
noneLiteral, _ := coreutils.MakeLiteral(nil) | ||
inputLiterals := make(map[string]*core.Literal, 1) | ||
inputLiterals["err"] = &core.Literal{ | ||
Value: &core.Literal_Scalar{ | ||
Scalar: &core.Scalar{ | ||
Value: &core.Scalar_Union{ | ||
Union: &core.Union{ | ||
Value: noneLiteral, | ||
Type: &core.LiteralType{ | ||
Type: &core.LiteralType_Simple{ | ||
Simple: core.SimpleType_NONE, | ||
}, | ||
Structure: &core.TypeStructure{ | ||
Tag: "none", | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
} | ||
inputLiteralMap := &core.LiteralMap{ | ||
Literals: inputLiterals, | ||
} | ||
nID := "fn" | ||
execErr := &core.ExecutionError{ | ||
Message: "node failure", | ||
} | ||
expectedLiterals := make(map[string]*core.Literal, 1) | ||
errorLiteral, _ := coreutils.MakeLiteral(&core.Error{Message: execErr.GetMessage(), FailedNodeId: nID}) | ||
expectedLiterals["err"] = &core.Literal{ | ||
Value: &core.Literal_Scalar{ | ||
Scalar: &core.Scalar{ | ||
Value: &core.Scalar_Union{ | ||
Union: &core.Union{ | ||
Value: errorLiteral, | ||
Type: &core.LiteralType{ | ||
Type: &core.LiteralType_Simple{ | ||
Simple: core.SimpleType_ERROR, | ||
}, | ||
Structure: &core.TypeStructure{ | ||
Tag: "FlyteError", | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
}, | ||
} | ||
expectedLiteralMap := &core.LiteralMap{ | ||
Literals: expectedLiterals, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider extracting the literal creation logic into a helper function to improve readability and maintainability. The current implementation has deeply nested literal construction that could be simplified.
Code suggestion
Check the AI-generated fix before applying
noneLiteral, _ := coreutils.MakeLiteral(nil) | |
inputLiterals := make(map[string]*core.Literal, 1) | |
inputLiterals["err"] = &core.Literal{ | |
Value: &core.Literal_Scalar{ | |
Scalar: &core.Scalar{ | |
Value: &core.Scalar_Union{ | |
Union: &core.Union{ | |
Value: noneLiteral, | |
Type: &core.LiteralType{ | |
Type: &core.LiteralType_Simple{ | |
Simple: core.SimpleType_NONE, | |
}, | |
Structure: &core.TypeStructure{ | |
Tag: "none", | |
}, | |
}, | |
}, | |
}, | |
}, | |
}, | |
} | |
inputLiteralMap := &core.LiteralMap{ | |
Literals: inputLiterals, | |
} | |
nID := "fn" | |
execErr := &core.ExecutionError{ | |
Message: "node failure", | |
} | |
expectedLiterals := make(map[string]*core.Literal, 1) | |
errorLiteral, _ := coreutils.MakeLiteral(&core.Error{Message: execErr.GetMessage(), FailedNodeId: nID}) | |
expectedLiterals["err"] = &core.Literal{ | |
Value: &core.Literal_Scalar{ | |
Scalar: &core.Scalar{ | |
Value: &core.Scalar_Union{ | |
Union: &core.Union{ | |
Value: errorLiteral, | |
Type: &core.LiteralType{ | |
Type: &core.LiteralType_Simple{ | |
Simple: core.SimpleType_ERROR, | |
}, | |
Structure: &core.TypeStructure{ | |
Tag: "FlyteError", | |
}, | |
}, | |
}, | |
}, | |
}, | |
}, | |
} | |
expectedLiteralMap := &core.LiteralMap{ | |
Literals: expectedLiterals, | |
} | |
inputLiterals := createTestInputLiterals() | |
inputLiteralMap := &core.LiteralMap{ | |
Literals: inputLiterals, | |
} | |
nID := "fn" | |
execErr := &core.ExecutionError{ | |
Message: "node failure", | |
} | |
expectedLiterals := createExpectedLiterals(execErr, nID) | |
expectedLiteralMap := &core.LiteralMap{ | |
Literals: expectedLiterals, | |
} |
Code Review Run #4749b8
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
literals := nodeInputs.GetLiterals() | ||
if literal, exists := literals["err"]; exists { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling for the case when execErr
is nil. Currently, the function assumes execErr
is non-nil when accessing GetMessage()
.
Code suggestion
Check the AI-generated fix before applying
literals := nodeInputs.GetLiterals() | |
if literal, exists := literals["err"]; exists { | |
literals := nodeInputs.GetLiterals() | |
if literal, exists := literals["err"]; exists { | |
if execErr == nil { | |
return | |
} |
Code Review Run #4749b8
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Signed-off-by: Alex Wu <[email protected]>
Code Review Agent Run #f0fe6aActionable Suggestions - 1
Review Details
|
structure1 := lt1.GetStructure() | ||
structure2 := lt2.GetStructure() | ||
if structure1 != nil && structure2 != nil { | ||
assert.Equal(t, structure1.GetTag(), structure2.GetTag()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a test case to verify behavior when one structure is nil and the other is not. Currently, the code only checks when both are nil or both are non-nil.
Code suggestion
Check the AI-generated fix before applying
structure1 := lt1.GetStructure() | |
structure2 := lt2.GetStructure() | |
if structure1 != nil && structure2 != nil { | |
assert.Equal(t, structure1.GetTag(), structure2.GetTag()) | |
} | |
structure1 := lt1.GetStructure() | |
structure2 := lt2.GetStructure() | |
assert.Equal(t, structure1 == nil, structure2 == nil, "Both structures should be either nil or non-nil") | |
if structure1 != nil && structure2 != nil { | |
assert.Equal(t, structure1.GetTag(), structure2.GetTag()) | |
} |
Code Review Run #f0fe6a
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Signed-off-by: Alex Wu <[email protected]>
Signed-off-by: Alex Wu <[email protected]>
Code Review Agent Run #46f37bActionable Suggestions - 3
Review Details
|
execErr := &core.ExecutionError{ | ||
Message: "node failure", | ||
} | ||
nodeLoopUp := NewFailureNodeLookup(nl, en, ns, execErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add an assert below
assert.Equal(t, execErr, typed.OriginalError)
structure1 := lt1.GetStructure() | ||
structure2 := lt2.GetStructure() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if one of the structure is nil, we should also raise an error
Signed-off-by: Alex Wu <[email protected]>
Hi @pingsutw , I've added the test cases you mentioned in this PR. Could you please review it and let me know if you have any additional comments, thanks! |
Code Review Agent Run #66823eActionable Suggestions - 0Review Details
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you! Could you also update this doc when you get a chance?
Tracking issue
Closes #4574
Why are the changes needed?
In flytekit, we add err promise to the failure node input interface. This err message explains why the workflow failed. However, propeller doesn't pass an error message as an input to the failure node. Therefore, current error message in the failure node is always None.
What changes were proposed in this pull request?
FailureNodeLookup
, we added a new attributeOriginalError
to store runtime execution error.NodeExecutor
,preExecute
func will call a new funcResolveOnFailureNodeInput
if current execution node is an on failure node, and the new func will inject execution error iferr
input existed in the workflow.How was this patch tested?
We designed a workflow that triggers the
clean_up
on failure node, which includes anerr
input with a default value of None. During workflow execution in the backend, we expected the error message to be passed to the on failure node.The error message was passed into
err
input as we expectedCheck all the applicable boxes
Summary by Bito
Implementation of error message propagation to failure nodes in Flyte workflows, enhancing error handling capabilities. Changes include adding OriginalError field to FailureNodeLookup, modifications to node executor, and splitting error resolution logic. The implementation includes new assertions in failure node lookup tests and improvements to literal type comparison functionality in assert utilities. These updates enable better debugging capabilities and workflow failure information access.Unit tests added: True
Estimated effort to review (1-5, lower is better): 2