
ENH: turn query radii for the different archives into astropy quantities #350

Open
jkrick opened this issue Oct 11, 2024 · 4 comments

@jkrick
Contributor

jkrick commented Oct 11, 2024

From Brigitta in a comment on PR #344
it would ultimately be nice if these radii could be any astropy quantity rather than a float, whose unit (arcsec, arcmin, or degrees) the user otherwise has to keep track of, depending on which dataset they are querying.

@jkrick added the "enhancement" (New feature or request) and "use case: light curves" labels on Oct 11, 2024
@troyraen
Contributor

it would ultimately be nice if these radiuses could be any astropy quantity

Generally, I agree, but in this case (as I recall) the scale_up.md notebook needs these to be floats (or ints) because the functions are called from the command line. So if they get changed to astropy quantities we will need to implement a workaround in scale_up.md.

OTOH, we'll probably want to revisit the parallel cell in this notebook because this PR removes Pan-STARRS from the multiprocessing call, so it now has to wait for the others to finish before running separately. It would be nice to get them all actually running in parallel again. Potentially, we could come up with a solution that works at scale, and then we wouldn't need scale_up.md anymore.

The two biggest reasons we wrote scale_up.md[*] in the first place were: (1) the notebook would not keep running long enough for the parallel cell to complete at scale (servers were being shut down automatically), and (2) the ZTF function was taking the longest by far, so we implemented multiprocessing directly in that function, which could not be used in conjunction with the multiprocessing in the notebook's parallel cell. The first might be solved by infrastructure updates and the second by getting ZTF into HATS format. If so, maybe we can switch multiprocessing -> Dask and run all the calls in parallel with Dask directly from this notebook, at scale.

[*] scale_up.md describes the underlying bash code but doesn't actually execute it at scale -- the user is instructed to do that directly from the command line.

@bsipocz
Member

bsipocz commented Oct 11, 2024

scale_up.md notebook needs these to be floats

Do they need to be floats in the API signature, or can that conversion be an internal thing?

@troyraen
Contributor

can that conversion be an internal thing?

That would be the workaround I mentioned. We'd have to implement it.

@bsipocz
Member

bsipocz commented Oct 11, 2024

Hmm, we could have it both ways; that's actually what we do all around in pyvo, too (though there we can do it because the standards are very clear about what the unit is when a plain float is provided). We can just assume that one needs to be really careful when running things from the command line, vs. the recommended user API of working with quantities without paying any attention to the units expected by the various services.
