-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix binary metrics so they properly return NaN when appropriate #113
Changes from all commits
9c43dff
453aea7
b0118e1
7c811ed
5853125
7cfce07
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you know the rationale behind lighthouse returning There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, and I agree this is a dangerous change. However, the most common usages of Line 222 in 2a4494e
Line 302 in 2a4494e
So I think there is a bug somewhere, either TradeoffMetricsV1 needs to support missing or area_under_curve needs to not return missing. I believe this change will be less disruptive. It does contravene the docstring for |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,11 +15,15 @@ end | |
area_under_curve(x, y) | ||
|
||
Calculates the area under the curve specified by the `x` vector and `y` vector | ||
using the trapezoidal rule. If inputs are empty, return `missing`. | ||
using the trapezoidal rule. If inputs are empty, return `NaN`. Excludes NaN entries. | ||
""" | ||
function area_under_curve(x, y) | ||
length(x) == length(y) || throw(ArgumentError("Length of inputs must match.")) | ||
length(x) == 0 && return missing | ||
isempty(x) && return NaN | ||
non_nan = (!).(isnan.(x) .| isnan.(y)) | ||
x = x[non_nan] | ||
y = y[non_nan] | ||
isempty(x) && return NaN | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we need to check if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nvm, I was confusing myself initially but I think the single check covers it! If |
||
auc = zero(middle(one(eltype(x)), one(eltype(y)))) | ||
perms = sortperm(x) | ||
sorted_x = view(x, perms) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you remember there a rationale behind these quantities being 1/0 in the edge case? I agree these should return NaN but just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I traced this code back to https://github.com/beacon-biosignals/OldLighthouse.jl/pull/36 (private repo), so I believe it was to support ROC curves better, where NaNs could contaminate the whole AUC even if it's just some 0.0 threshold or something. But IMO that is much better handled in the AUC computation itself rather than here.