Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rand of AbstractArray #8649

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions base/abstractarray.jl
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ ndims{T,n}(::Type{AbstractArray{T,n}}) = n
ndims{T<:AbstractArray}(::Type{T}) = ndims(super(T))
length(t::AbstractArray) = prod(size(t))::Int
endof(a::AbstractArray) = length(a)
cendof(a::AbstractArray) = unsigned(length(a) - 1) # precondition: !isempty(a)
first(a::AbstractArray) = a[1]
first(a) = next(a,start(a))[1]
last(a) = a[end]
Expand Down Expand Up @@ -374,6 +375,8 @@ imag{T<:Real}(x::AbstractArray{T}) = zero(x)

getindex(t::AbstractArray, i::Real) = error("indexing not defined for ", typeof(t))

cget(t::AbstractArray, i::Real) = @inbounds return t[i+1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll not get this use of @inbounds past Jeff. #8227 is still open about being able to propagate @inbounds declarations trough functions in a meaningful way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but then I'm not sure what to do: rename this method to cget_unsafe instead? (in my mind, the "c" in "cget" was already a warning, and this is meant for internal use only). Or remove @inbounds and hope that there is no/little overhead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is such a simple definition, I'm not sure what benefit there is to even having it if its strictly for internal use. Similarly, cendof just seems like a pointless private definition mixed in with a bunch of things that are public.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions cendof and cget are needed only for their specializations for ranges (where the definition is slightly less trivial). In order for this PR's extended rand functionality to work with ranges like rand(typemin(Int):typemax(Int)), there needs to be a way to index into every position of such a range (which is not particularly specific to rand). Implementing cget/cendof was the simplest solution I found, but I humbly request for help to improve the design of these functions or to find alternatives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stefan has a good point. In fact, cget is only used in two places, so it looks like it would be just as easy to specialize those cases for Ranges rather than introduce cget and cendof. At the very least I would move all the code for cget to random.jl, since I certainly hope we won't see increasing use/need of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to start in the Random module in random.jl and move to Base in abstractarray.jl if we need it in more places. range.jl is also a possible place, as the problem is really that we want UnitRange to support a length of typemax(Uint) + 1, and other uses of UnitRange might also need to support full ranges.

How about the names length_0 and getindex_0 instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think length_0 fits, as whatever the first indice is, the length remains the same, unlike the last indince; I could rename to endof_0 and getindex_0 (I first thougth of length_minus_1 or even length_1 , taking "_" as meaning "minus", but "end of" describes better what is needed).

I can certainly move the code for cget/cendof in random.jl. I didn't do that at first as rand has nothing to do with these implementations (rand needs it only not to break previously working functionality), and these could be used elsewhere also (e.g. sum(1:typemax(Uint)) works but not sum(0:typemax(Uint)) which should give the same answer, but I admit it's not a very convincing example). However I'd rather not have rand have a specialization for ranges.


# linear indexing with a single multi-dimensional index
function getindex(A::AbstractArray, I::AbstractArray)
x = similar(A, size(I))
Expand Down Expand Up @@ -440,6 +443,8 @@ setindex!(t::AbstractArray, x, i::Real) =
error("setindex! not defined for ",typeof(t))
setindex!(t::AbstractArray, x) = throw(MethodError(setindex!, (t, x)))

cset!(t::AbstractArray, x, i::Real) = @inbounds return t[i+1] = x

## Indexing: handle more indices than dimensions if "extra" indices are 1

# Don't require vector/matrix subclasses to implement more than 1/2 indices,
Expand Down
3 changes: 0 additions & 3 deletions base/int.jl
Original file line number Diff line number Diff line change
@@ -1,8 +1,5 @@
## integer arithmetic ##

const IntTypes = (Int8, Uint8, Int16, Uint16, Int32, Uint32,
Int64, Uint64, Int128, Uint128)

+(x::Int, y::Int) = box(Int,add_int(unbox(Int,x),unbox(Int,y)))
<(x::Int, y::Int) = slt_int(unbox(Int,x),unbox(Int,y))

Expand Down
113 changes: 50 additions & 63 deletions base/random.jl
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
module Random

using Base.dSFMT
using Base: dSFMT, cget, cendof

export srand,
rand, rand!,
Expand Down Expand Up @@ -146,91 +146,78 @@ rand{T<:Number}(::Type{T}) = error("no random number generator for type $T; try
rand{T<:Number}(::Type{T}, dims::Int...) = rand(T, dims)


## Generate random integer within a range
## Generate random integer within a range 0:k-1

# remainder function according to Knuth, where rem_knuth(a, 0) = a
rem_knuth(a::Uint, b::Uint) = a % (b + (b == 0)) + a * (b == 0)
rem_knuth{T<:Unsigned}(a::T, b::T) = b != 0 ? a % b : a
# maxmultiple: precondition: k>0
# maximum multiple of k <= 2^numbits(k), decremented by one
maxmultiple(k::Uint32) = itrunc(Uint32, div(0x0000000100000000, k)*k - 1)
maxmultiple(k::Uint64) = itrunc(Uint64, div(0x00000000000000010000000000000000, k)*k - 1)
# maximum multiple of k <= typemax(Uint128), decremented by one
maxmultiple(k::Uint128) = div(typemax(Uint128), k)*k - 1
# 32 or 64 bits version depending on size
maxmultiplemix(k::Uint64) = k >> 32 == 0 ? uint64(maxmultiple(itrunc(Uint32, k))) : maxmultiple(k)

# maximum multiple of k <= 2^bits(T) decremented by one,
# that is 0xFFFFFFFF if k = typemax(T) - typemin(T) with intentional underflow
maxmultiple(k::Uint32) = itrunc(Uint32, div(0x0000000100000000,k + (k == 0))*k - 1)
maxmultiple(k::Uint64) = itrunc(Uint64, div(0x00000000000000010000000000000000, k + (k == 0))*k - 1)
# maximum multiple of k within 1:typemax(Uint128)
maxmultiple(k::Uint128) = div(typemax(Uint128), k + (k == 0))*k - 1
# maximum multiple of k within 1:2^32 or 1:2^64, depending on size
maxmultiplemix(k::Uint64) = itrunc(Uint64, div((k >> 32 != 0)*0x0000000000000000FFFFFFFF00000000 + 0x0000000100000000, k + (k == 0))*k - 1)

immutable RandIntGen{T<:Integer, U<:Unsigned}
a::T # first element of the range
immutable RandIntGen{U<:Union(Uint32, Uint64, Uint128)}
k::U # range length or zero for full range
u::U # rejection threshold
end
# generators with 32, 128 bits entropy
RandIntGen{T, U<:Union(Uint32, Uint128)}(a::T, k::U) = RandIntGen{T, U}(a, k, maxmultiple(k))
# mixed 32/64 bits entropy generator
RandIntGen{T}(a::T, k::Uint64) = RandIntGen{T,Uint64}(a, k, maxmultiplemix(k))


# generator for ranges
RandIntGen{T<:Unsigned}(r::UnitRange{T}) = isempty(r) ? error("range must be non-empty") : RandIntGen(first(r), last(r) - first(r) + one(T))
function RandIntGen(k::U)
u = k == zero(U) ? typemax(U) :
U === Uint64 ? maxmultiplemix(k) :
maxmultiple(k)
new(k, u)
end
end

# generator API
# randintgen(k) returns a helper object for generating random integers in the range 0:k
randintgen{U<:Unsigned}(k::U) = RandIntGen{U}(one(U)+k)
# specialized versions
for (T, U) in [(Uint8, Uint32), (Uint16, Uint32),
(Int8, Uint32), (Int16, Uint32), (Int32, Uint32), (Int64, Uint64), (Int128, Uint128),
(Bool, Uint32), (Char, Uint32)]
randintgen(k::Uint8) = randintgen(convert(Uint32, k))
randintgen(k::Uint16) = randintgen(convert(Uint32, k))

@eval RandIntGen(r::UnitRange{$T}) = isempty(r) ? error("range must be non-empty") : RandIntGen(first(r), convert($U, unsigned(last(r) - first(r)) + one($U))) # overflow ok
@inline function rand_lessthan{U}(u::U)
while true
x = rand(U)
x <= u && return x
end
end

# this function uses 32 bit entropy for small ranges of length <= typemax(Uint32) + 1
# RandIntGen is responsible for providing the right value of k
function rand{T<:Union(Uint64, Int64)}(g::RandIntGen{T,Uint64})
local x::Uint64
if (g.k - 1) >> 32 == 0
x = rand(Uint32)
while x > g.u
x = rand(Uint32)
end
else
x = rand(Uint64)
while x > g.u
x = rand(Uint64)
end
end
return reinterpret(T, reinterpret(Uint64, g.a) + rem_knuth(x, g.k))
function rand(g::RandIntGen{Uint64})
g.k == zero(Uint64) && return rand(Uint64)
x = (g.k - 1) >> 32 == 0 ?
uint64(rand_lessthan(itrunc(Uint32, g.u))) :
rand_lessthan(g.u)
return x % g.k
end

function rand{T<:Integer, U<:Unsigned}(g::RandIntGen{T,U})
x = rand(U)
while x > g.u
x = rand(U)
end
itrunc(T, unsigned(g.a) + rem_knuth(x, g.k))
end
rand{U}(g::RandIntGen{U}) = g.k == zero(U) ? rand(U) : rand_lessthan(g.u) % g.k

rand{T<:Union(Signed,Unsigned,Bool,Char)}(r::UnitRange{T}) = rand(RandIntGen(r))
rand{T}(r::Range{T}) = r[rand(1:(length(r)))]

function rand!(g::RandIntGen, A::AbstractArray)
for i = 1 : length(A)
@inbounds A[i] = rand(g)
end
return A
end
## Randomly draw a sample from an AbstractArray r
# (e.g. r is a range 0:2:8 or a vector [2, 3, 5, 7])

check_nonempty(r::AbstractArray) = (isempty(r) && error(ArgumentError("cannot randomly pick an element from an empty collection")); r)

rand(r::AbstractArray) = cget(r, rand(randintgen(cendof(check_nonempty(r)))))

# Arrays of random integers

rand!{T<:Union(Signed,Unsigned,Bool,Char)}(r::UnitRange{T}, A::AbstractArray) = rand!(RandIntGen(r), A)
rand!(r::Range, A::AbstractArray) = rand!(r, A, ())

function rand!(r::Range, A::AbstractArray)
g = RandIntGen(1:(length(r)))
# TODO: this more general version is "disabled" until #8246 is resolved
function rand!(r::AbstractArray, A::AbstractArray, ::())
g = randintgen(cendof(check_nonempty(r)))
for i = 1 : length(A)
@inbounds A[i] = r[rand(g)]
@inbounds A[i] = cget(r, rand(g))
end
return A
end

rand{T}(r::Range{T}, dims::Dims) = rand!(r, Array(T, dims))
rand(r::Range, dims::Int...) = rand(r, dims)
rand{T}(r::AbstractArray{T}, dims::Dims) = rand!(r, Array(T, dims), ())
rand(r::AbstractArray, dims::Int...) = rand(r, dims)


## random Bools
Expand Down
17 changes: 17 additions & 0 deletions base/range.jl
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,21 @@ let smallint = (Int === Int64 ?
length{T <: smallint}(r::UnitRange{T}) = int(r.stop) - int(r.start) + 1
end

cendof(r::StepRange) = unsigned(endof(r) - 1)

const IntTypes = (Int8, Uint8, Int16, Uint16, Int32, Uint32,
Int64, Uint64, Int128, Uint128)

function cendof{T<:Union(IntTypes...),S<:Union(IntTypes...)}(r::StepRange{T,S})
if r.step > 0
div(unsigned(r.stop - r.start), r.step)
else
div(unsigned(r.start - r.stop), -r.step)
end
end

cendof{T<:Union(IntTypes...)}(r::UnitRange{T}) = unsigned(r.stop - r.start)

first{T}(r::OrdinalRange{T}) = oftype(T, r.start)
first(r::FloatRange) = r.start/r.divisor

Expand Down Expand Up @@ -263,6 +278,8 @@ function getindex{T}(r::FloatRange{T}, i::Integer)
oftype(T, (r.start + (i-1)*r.step)/r.divisor)
end

cget{T<:Union(IntTypes...)}(r::Range{T}, i::Integer) = itrunc(T, unsigned(first(r)) + unsigned(i)*unsigned(step(r)))

function check_indexingrange(s, r)
sl = length(s)
rl = length(r)
Expand Down
10 changes: 7 additions & 3 deletions doc/stdlib/base.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3947,7 +3947,7 @@ Random number generation in Julia uses the `Mersenne Twister library <http://www

.. function:: rand!([rng], A)

Populate the array A with random number generated from the specified RNG.
Populate the array A with random numbers generated from the specified RNG.

.. function:: rand(rng::AbstractRNG, [dims...])

Expand All @@ -3961,9 +3961,13 @@ Random number generation in Julia uses the `Mersenne Twister library <http://www

Generate a random number or array of random numbes of the given type.

.. function:: rand(r, [dims...])
.. function:: rand(coll, [dims...])

Pick a random element or array of random elements from range ``r`` (for example, ``1:n`` or ``0:2:10``).
Pick a random element or array of random elements from the indexable collection ``coll`` (for example, ``1:n`` or ``['x','y','z']``).

.. function:: rand!(r, A)

Populate the array A with random values drawn uniformly from the range ``r``.

.. function:: randbool([dims...])

Expand Down
12 changes: 7 additions & 5 deletions test/random.jl
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,11 @@ rand!(MersenneTwister(0), A)
@test A == [0.8236475079774124 0.16456579813368521;
0.9103565379264364 0.17732884646626457]

@test rand(0:3:1000) in 0:3:1000
coll = Any[2, uint128(128), big(619), "string", 'c']
@test rand(coll) in coll
@test issubset(rand(coll, 2, 3), coll)

# randn
@test randn(MersenneTwister(42)) == -0.5560268761463861
A = zeros(2, 2)
Expand Down Expand Up @@ -54,9 +59,7 @@ if sizeof(Int32) < sizeof(Int)
r = rand(int32(-1):typemax(Int32))
@test typeof(r) == Int32
@test -1 <= r <= typemax(Int32)
@test all([div(0x00010000000000000000,k)*k - 1 == Base.Random.RandIntGen(uint64(1:k)).u for k in 13 .+ int64(2).^(32:62)])
@test all([div(0x00010000000000000000,k)*k - 1 == Base.Random.RandIntGen(int64(1:k)).u for k in 13 .+ int64(2).^(32:61)])

@test all([div(0x00010000000000000000,k)*k - 1 == Base.Random.randintgen(uint64(k)-1).u for k in 13 .+ int64(2).^(32:62)])
end

randn(100000)
Expand Down Expand Up @@ -160,8 +163,7 @@ r = uint64(rand(uint32(97:122)))
srand(seed)
@test r == rand(uint64(97:122))

@test all([div(0x000100000000,k)*k - 1 == Base.Random.RandIntGen(uint64(1:k)).u for k in 13 .+ int64(2).^(1:30)])
@test all([div(0x000100000000,k)*k - 1 == Base.Random.RandIntGen(int64(1:k)).u for k in 13 .+ int64(2).^(1:30)])
@test all([div(0x000100000000,k)*k - 1 == Base.Random.randintgen(uint64(k)-1).u for k in 13 .+ int64(2).^(1:30)])

import Base.Random: uuid4, UUID

Expand Down
12 changes: 12 additions & 0 deletions test/ranges.jl
Original file line number Diff line number Diff line change
Expand Up @@ -372,3 +372,15 @@ let smallint = (Int === Int64 ?
@test length(typemin(T):typemax(T)) == 2^(8*sizeof(T))
end
end

# cendof, cget
for r in (typemin(Int):typemax(Int), typemin(Int):1:typemax(Int))
@test Base.cendof(r) == typemax(Uint)
@test Base.cget(r, 0) == typemin(Int)
@test Base.cget(r, Base.cendof(r)) == typemax(Int)
end
let r = typemax(Int):-1:typemin(Int)
@test Base.cendof(r) == typemax(Uint)
@test Base.cget(r, 0) == typemax(Int)
@test Base.cget(r, Base.cendof(r)) == typemin(Int)
end