correct np.inf values & dealing w/ nan #9
…an and notebook to discuss nan sfed
ugh one annoying complication that affects this issue as well is that currently the ... So now, if we have blank (NA) values representing rasters w/ no zonal stats, how do we represent RP values >10 in a way that would preserve them as numeric? We could either a.) not include the no-zonal-stat admins, list them in an additional table tab (option 1 above), and then use blanks for those, or b.) if there is no infinity class, just resign ourselves to using a string/character value, but make it more explicit (">10") and add that to the readme, leaving it to the analyst to parse. It is fairly straightforward to create a conditional numeric column next to it or set it as Inf in Python or R.
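For option b, a minimal sketch of how an analyst might parse the string column back to numeric. The function name `parse_rp` and the sample labels are hypothetical, and it assumes blanks mean "no zonal stats" and any label starting with ">" means "above the largest class":

```python
import math


def parse_rp(label):
    """Convert an RP label to a float.

    Assumptions (not from the repo): blank/None -> nan (no zonal stats),
    a label starting with '>' (e.g. '>10') -> inf, anything else -> float.
    """
    if label is None or label.strip() == "":
        return math.nan
    text = label.strip()
    if text.startswith(">"):
        return math.inf
    return float(text)


# Hypothetical column of RP labels as they might appear in the table
labels = ["1", "2.5", ">10", ""]
values = [parse_rp(x) for x in labels]
```

Keeping the original string column and adding the parsed values next to it preserves the explicit ">10" label for readers while still allowing numeric comparisons.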
ok chatted w/ Hannah and Tristan on slack, in summary:
I implemented what was discussed in 559b469. @t-downing - i think you can go ahead and review the code. @isatotun - good to be aware of this reclassification step and admin filling.
The classification (in `reclassify_rp()`) looks good!
But, in reviewing it, I think I uncovered an issue with the empirical return period calculation... Which is of course ironic because I think it was adapted from a function that I wrote in the first place. The issue is that `empirical_rp()` still returns `RANK` values (and therefore `RP` values) when the `value` values are tied. In the extreme, if all the `value` values are 0 (i.e. there has never been flooding in the admin), the `RANK` is set to 13.5 for all rows (and `RP` is set to 2.0). Then, when you interpolate with a value of 0 for the current period, it will return 2.0 as well. So you have a return period of 2 years for a value of 0. This happens in plenty of admins (and less extreme versions happen in many more, where there are several years where the maximum flood extent was 0.0).
Is this the behaviour we want? I feel like we should always return the minimum RP value (i.e. 1) when the value is 0.
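A minimal sketch that reproduces the numbers above, under stated assumptions: the actual `empirical_rp()` is not shown in this thread, so this assumes 26 yearly values, average ranks for ties, and RP = (n + 1) / rank, which is consistent with a rank of 13.5 and an RP of 2.0. The function names are hypothetical, and the last function shows one possible version of the proposed behaviour (RP of 1 when the value is 0):

```python
def average_ranks(values):
    """Descending ranks (1 = largest); tied values share the mean of their positions."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1   # 1-based position of the first tied item
        count = order.count(v)       # number of items tied with v
        ranks.append(first + (count - 1) / 2)
    return ranks


def empirical_rp_sketch(values):
    """Assumed formula: RP = (n + 1) / rank."""
    n = len(values)
    return [(n + 1) / r for r in average_ranks(values)]


def rp_with_zero_floor(values):
    """Proposed behaviour: force the minimum RP (1) wherever the value is 0."""
    rps = empirical_rp_sketch(values)
    return [1.0 if v == 0 else rp for v, rp in zip(values, rps)]


# Admin with no flooding in any of the (assumed) 26 years
no_flood = [0.0] * 26
ranks = average_ranks(no_flood)        # every rank is 13.5
rps = empirical_rp_sketch(no_flood)    # every RP is 2.0
floored = rp_with_zero_floor(no_flood) # every RP is 1.0
```

The floor only touches rows where the value is exactly 0, so admins with real flood years keep their tied ranks for non-zero values.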
nice catch - i pulled this into #10 and will comment from there. This func to |
Infinity values
`np.inf` in the empirical rp interpolation function `exploration/add_return_periods.py`
nan values
@t-downing @hannahker `exploration/07_informing_user_of_NA.py`, but i think it would be good to discuss options for presenting this information. @hannahker wrote a nice succinct disclaimer that we can use/modify with any of the options here.
Options:
I like option 2 as it avoids both the need to update a table in the readme and placing an outsized emphasis on this relatively small issue. I think it will also make the ultimate disclaimer less wordy.
@isatotun - i think option 2 is also pretty straightforward to implement in the notebook, and i have added one implementation of it to the bottom of `exploration/07_informing_user_of_NA.py`.
Also, the latest version of the sample readme is on the Jira ticket.