-
-
Notifications
You must be signed in to change notification settings - Fork 40
implement average and median support for bkg_spectrum #253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
specreduce/background.py
Outdated
bkg_statistic : str, optional | ||
Statistical method used to collapse the background image. | ||
Supported values are: | ||
- `'median'` : Uses the median (`np.nanmedian`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ref linking needs full name of the package. Please fix accordingly to get rid of warnings on ReadTheDocs. Thanks!
- `'median'` : Uses the median (`np.nanmedian`). | |
- `'median'` : Uses the median (`numpy.nanmedian`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder if passing in the function directly is more straightforward than using string for translation. It certainly would be more flexible and not confined to numpy. User can pass in their custom stats func... but is that too dangerous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about the implementation in Jdaviz (passing the traitlet value for Statistic) but can see the applicability to be able to pass the function directly too. Could modify the jdaviz logic to make this possible so I'm open to either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the API here should be dictated by what Jdaviz does. The API we pick should be useful to someone who is using just specreduce natively. Jdaviz can hook into it downstream no matter what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made the other changes locally, I'll leave the string in for now but am open to modifying the logic here to make it more flexible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see what specreduce maintainers say. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like the idea of passing in a function for full generality, but also recognize that in this context, sum, average, and median are the most applicable options in the the vast majority of cases. it's also highly likely that someone would naively pass in, say, np.median
and then wonder why their data becomes all np.nan
. also, should we use "mean" instead of "average"?
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #253 +/- ##
==========================================
+ Coverage 85.84% 85.93% +0.08%
==========================================
Files 13 13
Lines 1159 1166 +7
==========================================
+ Hits 995 1002 +7
Misses 164 164 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
specreduce/background.py
Outdated
statistic_function = np.nansum | ||
|
||
if bkg_statistic: | ||
bkg_statistic = bkg_statistic.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to support mixed case here. I say we only support inputs in all lowercase. Python is naturally case-sensitive.
bkg_statistic = bkg_statistic.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used lower
here for the Jdaviz traitlet which has the first letter capitalized in the string, that could be done in Jdaviz instead, so I'm good with removing this and moving the logic elsewhere (or like you said passing the function directly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cshanahan1 I had that originally but removed it with this is mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might also want to add a test for invalid input and ensure ValueError is raised. Thanks!
Will do, thank you for the thorough review, it was very helpful! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thanks! I added a couple of minor suggestions.
@@ -351,12 +358,21 @@ def bkg_spectrum(self, image=None): | |||
""" | |||
bkg_image = self.bkg_image(image) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a bkg_statistic = bkg_statistic.lower() to avoid potential frustration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this. i am approving this, but want to note that "average" and "median" return a different scale than "sum". i.e., they're normalized to the area of the background region and thus easily scaled to a source region for subtraction. "sum" is not scaled and the number of background pixels used to calculate it would be required to use it for background subtraction. tracking number of pixels excluded by np.nansum
is tricky and possibly complicated. i'm open to keeping "sum" as an option, but i'm unclear on the use case.
Thank you for reviewing! This function solely used sum prior to this PR, which was my reasoning in keeping it in and leaving it as the default parameter. I had considered removing sum altogether, however, there is test coverage for sum that results in several test failures and I didn't want to break existing functionality. I'm open to making any additional changes necessary in this PR or a follow-up effort, whether that is setting the default parameter for the statistical function to be either mean/median or remove the option for sum. |
yes, i think it's a larger issue for discussion so it's fine to leave it for now and do a follow-up effort later. i would want to revisit why "sum" was used in the first place before nixing it. |
This pull request is created to add functionality to the bkg_spectrum function. Currently, this function utilizes np.nansum for all computations of the background spectrum. This pull request defaults this function with np.nansum, while also adding support for 'Average' and 'Median' by utilizing np.nanmean and np.nanmedian.