Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a reason rand(TDist(nu)) does not obey the type of nu? #1884

Open
Red-Portal opened this issue Aug 13, 2024 · 10 comments · May be fixed by #1905
Open

Is there a reason rand(TDist(nu)) does not obey the type of nu? #1884

Red-Portal opened this issue Aug 13, 2024 · 10 comments · May be fixed by #1905

Comments

@Red-Portal
Copy link
Contributor

Hi, it seems that

nu = 3f0
rand(TDist(nu))

returns a Float64 instead of a Float32. Is this intended behavior? I find it unusual because the rand implementation tries to be type-stable half way there by using one(T) but ends up calling randn(rng) without specifying the type.

@devmotion
Copy link
Member

Indeed, this seems to be a bug. randn(rng) should probably be changed to randn(rng, typeof(z)).

@Red-Portal
Copy link
Contributor Author

Okay let me pick this up.

@Red-Portal
Copy link
Contributor Author

Hi @devmotion I opened a PR, could you take a look when you have time?

@quildtide
Copy link
Contributor

quildtide commented Aug 15, 2024

Has the situation from #1071, #1666, etc. changed?

I mean, personally, I'm not at all against the idea that eltype(d) should equal partype(d) when {eltype(d), partype(d)} <: AbstractFloat.

Although there should probably be a better interface that can generalize across more distributions.

@devmotion
Copy link
Member

I'm not at all against the idea that eltype(d) should equal partype(d)

That's absolutely not the idea here. In the case of TDist, we get a sample from ChiSq (of some type, regardless of whether it follows any conventions or not). And IMO regardless of what type it is, it seems wrong to modify/promote its type by a randn call (basically a "Float64 literal").

@Red-Portal
Copy link
Contributor Author

Red-Portal commented Aug 27, 2024

Hi @devmotion , unfortunately, it seems that calls to rand(q, d) still result in the wrong type. I guess this is because eltype(q) returns Float64. Should I make a PR that defines eltype(q) = float(partype(q))?

@quildtide
Copy link
Contributor

One option within the current interface, if you need more than one random value, is to call the rand! method on a vector of the type you want, e.g. rand!(Exponential(), zeros(Float32, 20)).

I think there'll still be a double conversion going on (Float32 -> Float64-> Float32) if you call this on the current public version, but I think a side-effect of merging #1879 will cause things to stay Float32 at all times. Not 100% sure on this though; hadn't given this much thought before.

@Red-Portal
Copy link
Contributor Author

Hi @devmotion , could you comment on how to proceed?

@devmotion
Copy link
Member

devmotion commented Aug 29, 2024

Hmmmm that's annoying, that leads directly to all the eltype inconsistencies 😄 Initially - and for basically all distributions apart from Normal - eltype and partype were two completely different things: eltype was supposed to be the type of samples (Float64 for continuous and Int for discrete distributions) whereas partype was the common/promoted type of the parameters.

At some point someone wanted rand(Normal(0f0, 1f0)) to return samples of type Float32, so the definition of eltype for Normal was changed to partype. But this made it completely inconsistent with how it was supposed to be defined, how it's defined for other distributions - and generally it's not even the type of the samples (think of e.g. Normal{Int})!

Sure, we could just propagate this inconsistency a bit more but I'm a bit reluctant. Not because I think that it should return Float64 but because I think this eltype concept is utterly broken. I think we should just stop defining eltype - an "element type" of a distribution is not clearly defined: Is it supposed to be the element type of the struct? The return type of rand? The return type of quantile? The element type of support?

Maybe at least for univariate distributions we could start removing it from the rand pipeline by changing

out = Array{eltype(s)}(undef, dims)
return @inbounds rand!(rng, sampler(s), out)
to

    return map(_ -> rand(rng, sampler(s)), CartesianIndices(dims))

(probably with an additional function barrier, similar to rand!)
In the default fallback for rand! we call scalar rand in a loop anyway. And defining array rand for those univariate distributions for which it is beneficial (see #1879) doesn't seem too bad.

@quildtide
Copy link
Contributor

but because I think this eltype concept is utterly broken. I think we should just stop defining eltype - an "element type" of a distribution is not clearly defined: Is it supposed to be the element type of the struct? The return type of rand? The return type of quantile? The element type of support?

In a perfect world, I believe eltype of a distribution should be the type of the value returned by rand, quantile, minimum, maximum, median, mode, etc.; and that mean and std should return a result of type float(eltype).

It is my view that a Distribution in Distributions.jl is an infinite-length unordered container which defines rand as a way to allow users to obtain values. There's only two discrepancies between this view and the status quo that I'm aware of; size and length in Distributions.jl return the size/length of a single sample, while I believe that those roles should go to another function while these functions should always return Inf on a Distribution.

@devmotion devmotion linked a pull request Sep 25, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants