-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parsing of units from strings #298
Conversation
Generalize the handling of `@u_str` and the unit extension module context to allow the same machinery to be used for a new `parseunit` function which does the same thing at runtime. Fixes PainterQubits#272 as a side effect of generalizing `@u_str`.
src/user.jl
Outdated
parseunit(unitmodule::Module, str) = parseunit([unitmodule], str) | ||
function parseunit(unitmods, str) | ||
ex = Meta.parse(str) | ||
eval(lookup_units(unitmods, ex)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I believe this use of eval
is a rare circumstance where it's probably warranted: we want to evaluate the expression containing the small number of allowed_funcs
where we've already sanitized the expression ex
and fairly carefully resolved all symbols ourselves. In addition, we want to use Unitful.eval
in particular, taking the allowed_funcs
bindings from the scope of Unitful
.
Codecov Report
@@ Coverage Diff @@
## master #298 +/- ##
=========================================
+ Coverage 78.23% 78.4% +0.17%
=========================================
Files 15 15
Lines 1075 1079 +4
=========================================
+ Hits 841 846 +5
+ Misses 234 233 -1
Continue to review full report at Codecov.
|
test/runtests.jl
Outdated
@test parseunit("m") == m | ||
@test parseunit("m,s") == (m,s) | ||
@test parseunit("1.0") == 1.0 | ||
@test parseunit("m/s") == m/s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess a test about ms
should exist as well, to see whether it is interpreted as miliseconds?
How would you type the units of meters times seconds? In general in the tests presented here I don't see any test regarding parsing a multiplication of units (which is a reasonable way to make units as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: I guess by using multiplication sign *
, but still this could/should be one test case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we could add that. As you can see, I just converted the existing u_str
tests (which use the same infrastructure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The likelihood that someone would make a change causing "ms"
to parse as meters times seconds is basically zero. You'd really have to go out of your way to make that happen. I don't mind if we add it though.
src/user.jl
Outdated
parseunit([unit_module(s),] string) | ||
|
||
Parse a string as a unit type. The format for `string` must be a valid Julia | ||
expression, and any identifiers will be looked up in the context `unit_module` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose some doc improvement here: after valid Julia expression
add "(i.e. multiplication between units is specified with *
, division between units with /
and space characters have no meaning (if between operators)". After this parenthesis add a full stop, to make the sentence easier to read, and continue with "Any identifiers...".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "The string will be parsed with standard Julia syntax" ? What you've described is not special to Unitful.
I was actually a bit more confused by the first sentence. Perhaps that could read "Parse a string for Unitful quantities, units, or dimensions."
As a user or a developer of another Julia package that wants to add support for Unitful (namely NCDatasets.jl), how would I extend this function to handle cases more used in my field, e.g. using spaces to separate units that are multiplied? |
So you are using NetCDF, yes? Do you know exactly what the rules are for parsing valid NetCDF unit strings? My impression is that they're defined by the implementation of the udunits package, which supposedly implements the formal grammar available at https://www.unidata.ucar.edu/software/udunits/udunits-current/doc/udunits/udunits2lib.html#Grammar. So the robust and complete way to do it would be to write a proper parser which follows that exact grammar. But this seems potentially NetCDF specific so it's not clear that it belongs in For some cases you might get away with simple string substitutions to turn your NetCDF unit strings into julia syntax, which could then be passed to |
I've commented this in the issue page, but here is an idea: make
This would allow users and package developers to give a pre-processing function in a natural way, without Unitful.jl ever having to worry about what whackadoodle functions the users would use? And for some pre-processing functions, if you find them helpful enough, you could add them in the docs of Unitful.jl? |
But I'm not sure how this is better than simply writing |
I guess the first way explicitly shows how you would extend the parsing functionality, while the second has to be explicitly mentioned in the docs? But sure, we can move forward with your way. |
orthogonal to the current discussion: can the first argument of |
Thanks so much for this long-desired functionality!
The new implementation is really nice. It is a little funny that It seems like
I suppose since Finally, would you imagine implementing three-arg versions, |
Oh right, somehow I missed this! I guess an alternative spelling could be
I'm not sure about the reasoning though I think it's unclear what the concrete return type should be for cases like
Yes I think I may agree with this.
I think the output type should always come first (that's the |
Just some brief feedback:
Let me know how much you want to tackle here. I really appreciate what you've done already. Perhaps we can change |
I had a bit of a look at what it would take to do "something sensible" with For example, what do we do about
Right, though it might make sense to support more general "unit contexts" in the future other than fixing on modules? For example it could be nice to support @Datseris's idea of a vector as a collection of allowed units.
We could have
Thanks, I appreciate it! Sometimes a simple looking thing turns out to be a deeper rabbit hole than you might expect. For now I'd be fairly happy to simply rename |
Also add a test for PainterQubits#727
Ok, I've renamed This might be good to merge now. I did rearrange a bit and added a |
Beautiful, thank you! |
HECK YEAH, thank you @c42f! |
uparse just saved my day! |
It'll also be useful to have this kind of type comparison: |
@pranshumalik14 depending on what you want to do, you may be able to rely on Unitful to make a conversion for you assuming that the result of dimension(uparse("1cm")) == dimension(1u"m") and the following return approximately 0.01:
|
Is this described in the document? If not then it would be nice to mention it as it is very useful for other packages that depends on Unitful.jl! I found |
No, |
Generalize the handling of
@u_str
and the unit extension modulecontext to allow the same machinery to be used for a new
parseunit
function which does the same thing at runtime.
Fixes #214 in a basic kind of way, and fixes #272 as a side effect of generalizing
@u_str
.I'm fairly happy with the implementation here, but I wonder whether there's a better naming for
parseunit
. For example, we could possibly do something like overloadBase.parse(::Unit, str)
instead if that's appropriate. Arguably this could extend more naturally toBase.parse(::Quantity, str)
). Thoughts?