Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Units interface package #595

Open
tfiers opened this issue Nov 29, 2022 · 22 comments
Open

Units interface package #595

tfiers opened this issue Nov 29, 2022 · 22 comments

Comments

@tfiers
Copy link

tfiers commented Nov 29, 2022

using Unitful currently takes 1 to 3 seconds1.

This is fine if you're planning on actually using physical units in your Julia session.2
But it adds to the load time of downstream packages that offer optional Unitful support.

My concrete motivation for writing this issue is that I want to add Unitful support to Distributions.jl:

Doing that would require adding Unitful as a dependency to Distributions. But most Distributions.jl users will not use the new unitful distributions; and they would still get the added start-up latency.

That is why I propose to create a very lightweight package that defines just the interface/API for working with units in Julia. Something like UnitfulCore, or UnitsInterface.jl.

Downstream packages offering physical unit support would then load just that interface package.
(If the user wants to actually use the unit support, they have to explicitly be using Unitful, DownstreamPkg).

This pattern has been applied before, e.g. in StaticArrays.jl. From its readme:

Most of the primary array types exported by StaticArrays.jl are defined in the small interface package StaticArraysCore.jl. This includes e.g., the definitions of the abstract type StaticArray and the concrete types SArray, MArray, and SizedArray (as well as their dimension-specific aliases). This enables downstream packages to implement new methods for these types without depending on (and hence loading) the entirety of StaticArrays.jl, and thereby to avoid incurring the full load-time of StaticArrays.jl (which is on the order of 0.6 s for StaticArrays.jl v1.4 on Julia v1.7).


So my question to the maintainers of Unitful.jl is: would you be open to adding such an interface package as a dependency of Unitful, and moving some core types and functions to that package (or having them subtype from the package) ?



Footnotes

  1. @time using Unitful gives 1.0 seconds on replit, and between 1.6 and 2.6 seconds on my laptop (Julia 1.8).

  2. Though a @precompile_all_calls block might be a good idea to add to src/ in any case.

@sostock
Copy link
Collaborator

sostock commented Dec 14, 2022

There have been several requests for such an interface package, so I think we should consider it. Unfortunately, I didn’t have much time in the past to really take part in the previous discussion:

Based on that, I think we would need a hierarchy of several packages to satisfy most users’ needs. Maybe we can base the discussion on the following proposal and refine it:

  1. UnitfulCore: Defines the AbstractQuantity, Dimensions and Units types and some functions that are required to facilitate working with quantities. Note that the actual implementation of Quantity with all of its arithmetic and promotion rules is not included, and neither are FreeUnits etc.
  2. UnitfulBase: Depends on UnitfulCore and defines the Quantity type and implements the actual conversion and arithmetic. The u"…" macro should probably be defined here, as well as the @unit, @dimension macros etc.
  3. UnitfulDimensions: Depends only on UnitfulCore, not on UnitfulBase. Defines the dimensions that are currently defined in Unitful (e.g. 𝐋) and “dispatch types” (Length and LengthUnits). Since the package does not depend on UnitfulBase, these will have to be defined by hand, i.e., without using @dimension. Or should we include @dimension in UnitfulCore instead?
  4. Unitful: Depends on UnitfulBase and UnitfulDimensions. Defines the units and constants that are currently included in src/pkgdefaults.jl as well as the methods for handling Period types from the Dates stdlib.

@giordano @Gregstrq @aplavin @mileslucas @tfiers What do you think about such a package hierarchy?

Some other thoughts:

  • Even if we don’t do this, we should look into reducing the loading time of this package.
  • Some packages might not need to depend on Unitful to support quantities. Maybe we could add a page to the documentation with some hints on how to do that? There is an older Julia PR (add Test.GenericDimensionful number type JuliaLang/julia#39852) which I think should be revived. It adds functionality to the Test stdlib that allows users to test whether their functions work with dimensionful quantities without assuming any interface for them.
  • We could think about renaming
    UnitfulCore       → UnitfulBase
    UnitfulBase       → Unitful
    UnitfulDimensions → UnitfulSIDimensions
    Unitful           → UnitfulSI
    
    This would be breaking (because Unitful wouldn’t contain the default SI dimensions/units/constants anymore), but we wouldn’t have both UnitfulCore and UnitfulBase anymore, which might be less confusing. It also would emphasize that the SI system is not necessarily the default. If we do this, we could also think about removing the constants from UnitfulSI altogether and refer to the PhysicalConstants package for that.

@aplavin
Copy link
Contributor

aplavin commented Dec 14, 2022

Given the "extension packages" feature of 1.9, splitting packages for the sake of loading time is not needed anymore. Distributions and others can just add Unitful as a weak dependency, enabling units support when it's loaded.

As for splitting for more modularity, I fully support it!

@cstjean
Copy link
Contributor

cstjean commented Dec 15, 2022

I'm going to throw in my unsolicited opinion, as someone who's done a *Base package, that managing multiple packages is a big hassle. CI, github permissions, PRs that must now be spread across multiple repos, package compatibility, user confusion ("which package do I need?"), documentation... I haven't reviewed all the issues linked above, but IMVHO, there must be some big benefits to justify the costs. Quadruply so to justify a 4-way split.

@giordano
Copy link
Collaborator

using Unitful currently takes 1 to 3 seconds.

% julia --project=/tmp --startup-file=no -q
julia> @time @eval using Unitful
  0.444807 seconds (924.10 k allocations: 64.710 MiB, 5.12% gc time)

I have a 7-year old laptop.

@tfiers
Copy link
Author

tfiers commented Dec 15, 2022

Interesting.
I get

julia> @time using Unitful
  0.781682 seconds (943.66 k allocations: 63.208 MiB)

if I run it on a very modern desktop (different than the modern laptop used for the earlier measurements. But also Windows).


edit
Aha, it's indeed shorter in an empty env (which is weird, right?!)

Same powerful desktop:

pkg> activate --temp
  Activating new project at `C:\Users\lpxtf3\AppData\Local\Temp\jl_0KhY3g`

(jl_0KhY3g) pkg> add Unitful
[…]

julia> @time using Unitful
  0.513769 seconds (939.96 k allocations: 63.047 MiB)

In my global env (which isn't super bloated I'd say: https://github.com/tfiers/dotfiles/blob/174862c/.julia/environments/v1.8/Project.toml),
I get ~0.8 seconds on the desktop

$ julia --startup-file=no -e "@time using Unitful"
  0.840768 seconds (938.31 k allocations: 65.985 MiB, 2.28% gc time)

@giordano
Copy link
Collaborator

giordano commented Dec 15, 2022

Aha, it's indeed shorter in an empty env (which is weird, right?!)

Are you always in a fresh session? If not, maybe there is something invalidating Unitful code? Also, do you use Revise? I see you use Windows, that may account for some slowness (I/O is pretty terrible on Windows).

@tfiers
Copy link
Author

tfiers commented Dec 15, 2022

See my last edit (sorry for post-editing my comment):

$ julia --startup-file=no -e "@time using Unitful"
  0.840768 seconds (938.31 k allocations: 65.985 MiB, 2.28% gc time)

(so no Revise or invalidations)


Edit:
And on my modern laptop¹, that same command gives

  2.119220 seconds (937.82 k allocations: 65.975 MiB, 1.40% gc time)

¹: Intel i7 10th gen (10510U), 16 GB Ram, ThinkPad X1 Yoga Gen 5 (2020 release), Win 10 Pro


(and just for completeness, in a --temp env, the laptop gives:

$ julia --startup-file=no -e "using Pkg; Pkg.activate(temp=true); Pkg.add(\"Unitful\"); @time using Unitful"
  1.381432 seconds (933.05 k allocations: 64.572 MiB, 3.93% gc time)

)

@Gregstrq
Copy link

I would agree with @cstjean that making four way split is a bit too much. Also, I can't imagine someone using UnitfulCore without the arithmetics and convenience constructors.

I stand by my initial proposal to only split out pkgdefaults.
In this regards, the dimensions are a part of the unit system, so there is no actual reason to split them out from pkgdefaults.

@sostock
Copy link
Collaborator

sostock commented Dec 18, 2022

I can't imagine someone using UnitfulCore without the arithmetics and convenience constructors.

In my opinion, UnitfulCore is all that should be needed to implement something like JuliaStats/Distributions.jl#1413 (comment), which is the reason why this issue was opened.

@tfiers
Copy link
Author

tfiers commented Dec 18, 2022

I'd like to add that I plan to experiment with an alternative units package (one with runtime units instead of units in the typesystem -- the goal is not necessarily performance, but rather ease of use and smaller type-signatures in compound datatypes).

So it would be fantastic to have a kind of UnitsInterface package that can be implemented by different packages.

Ofc it's great if everyone's effort is concentrated on the same package instead of duplicated; but independent experimentation is worthwile too imho

@KristofferC
Copy link
Contributor

"Interface" packages and "Core" packages should have no purpose now that "package extensions" are available (or at least will be available in 1.9).

@tfiers
Copy link
Author

tfiers commented Jan 22, 2023

For import times, indeed.
They might still be useful for allowing different implementations, no?

@timholy
Copy link
Contributor

timholy commented Jan 23, 2023

managing multiple packages is a big hassle

Agreed. IMO "subdir" packages, in which you still use a single repo and can thus coordinate changes across multiple packages, reduce most but not all of this hassle. You still have to think through what you want to release, set up your CI scripts appropriately, etc.

@aplavin
Copy link
Contributor

aplavin commented Nov 29, 2023

Another motivation for splitting the "interface" part is that it would enable support for "static units" (Unitful) and "dynamic units" (DynamicQuantities, @MilesCranmer) with the exact same code. Would be great to share all unit definitions, common functions like ustrip(), unit(), ... – basically everything that makes sense for both!

@MilesCranmer
Copy link

MilesCranmer commented Nov 30, 2023

^I would personally be 100% on board. (I don’t know if it’s possible, as the types are so different, including a variety of abstract types, but hopefully it is a possibility.)

@aplavin
Copy link
Contributor

aplavin commented Apr 18, 2024

I thought more about the unification idea, and noticed that quite a lot of useful stuff is possible even without any buy-in from Unitful/DQ! Although, it would be cleaner with the buy-in.

See my attempt at the common interface at https://github.com/JuliaAPlavin/UnitsBase.jl. Tests show some examples of what's possible: https://github.com/JuliaAPlavin/UnitsBase.jl/blob/master/test/runtests.jl.
It mainly does two things:

  • Support arithmetic on unitful + dq types by promoting to dq (dynamic) ones
  • Export functions ustrip and unit that can be used with both unitful and dq types

The main usecase is that the package/function author can write

using Unitful  # can use DQ instead, but makes most sense to use type-stable Unitful when possible
using UnitsBase: ustrip, unit

g(x) = ustrip(u"m", x)
f(x) = ustrip(u"g/mm", x / u"km")
h(x) = "unit is $(unit(x))"

and these functions work with either of the unit packages!
They remain stable and performant no matter what the type is.

So, the package code can be written using type-stable Unitful types, and used both with Unitful (most performant while units are type-stable) and DQ (most performant when units are dynamic) from user code.

Let me know what you think of this approach.

@MilesCranmer
Copy link

Nice work!!

And yes there is buy-in from me on the DQ side. Can we move it to a shared Julia org to lower the bus factor?

@aplavin
Copy link
Contributor

aplavin commented Apr 18, 2024

Totally, both the name and location of the repo are up to changes :)

IMO would be best to have a lightweight UnitsBase with just two empty function definitions for now: https://github.com/JuliaAPlavin/UnitsBase.jl/blob/master/src/UnitsBase.jl. And no other content (some potentially to be added later).

Then, both Unitful and DQ would depend on it and define methods for ustrip & unit functions instead of their own functions.
Mixed Unitful + DQ arithmetic and conversion (https://github.com/JuliaAPlavin/UnitsBase.jl/blob/master/ext/UnitfulDynamicQuantitiesExt.jl#L14-L19) makes most sense to put into DQ. First – because this arithmetic should promote to DQ, and second – because DQ already defines some Unitful conversions.

@aplavin
Copy link
Contributor

aplavin commented Apr 18, 2024

If Unitful devs also support this course of action, we can create that lightweight UnitsBase in a common Julia org. I guess JuliaMath fits well...

@MilesCranmer
Copy link

JuliaMath sounds good (not sure who to ask for this?). Or maybe JuliaPhysics since that's where Measurements.jl and PhysicalConstants.jl live.

Mixed Unitful + DQ arithmetic and conversion makes most sense to put into DQ.

Sounds good to me. We can put it in the existing Unitful extension within DQ.

My initial suggestions for more content would be AbstractQuantity{T} <: Number (and perhaps with a AbstractRealQuantity{T} <: Real).

@giordano
Copy link
Collaborator

I sent you an invite to JuliaPhysics. Years ago I also proposed to move this repo to the same org (#228), but it eventually was moved here and I think we're stuck with it, Andrew is the only admin of the repo and he doesn't seem to be very active on GitHub anymore.

@MilesCranmer
Copy link

MilesCranmer commented Apr 19, 2024

Awesome, thanks. Btw I'd be happy to rotate DynamicQuantities over there too, so we could have all the units and measurements packages under one umbrella org.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants