Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate linearindices in favor of LinearIndices #26775

Merged
merged 5 commits into from
May 8, 2018
Merged

Conversation

nalimilan
Copy link
Member

LinearIndices is strictly more powerful than linearindices: these two functions return arrays holding the same elements, but the former also preserves the shape and indices of the original array. Also improve docstrings.

Part of #25901.

Not all uses of linearindices in Base can be replaced with LinearIndices, because of the order in which array support is defined. For these cases, I used eachindex(IndexLinear(), A), which needs to continue to work anyway and return a range of linear indices. This is why I marked this as RFC: while it simplifies the public API, one still needs to choose between LinearIndices and eachindex. In particular, I'm not sure whether linearindices/eachindex can be more efficient (since it returns simple ranges) than LinearIndices, at least in terms of compilation time.

@nalimilan nalimilan added the arrays [a, r, r, a, y, s] label Apr 10, 2018
@nalimilan nalimilan requested a review from mbauman April 10, 2018 21:00
@mbauman mbauman added the deprecation This change introduces or involves a deprecation label Apr 10, 2018
@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Apr 12, 2018
@StefanKarpinski
Copy link
Member

Triage accepts but the failures seem related.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Apr 12, 2018
@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Apr 12, 2018
@mbauman
Copy link
Member

mbauman commented Apr 12, 2018

I did a few spot checks on performance here — iteration over LinearIndices is exactly the same as iterating over eachindex for a Array.

@nalimilan nalimilan force-pushed the nl/linearindices2 branch 2 times, most recently from 7f274a0 to 5c6da0c Compare April 13, 2018 06:35
@nalimilan
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JuliaLang JuliaLang deleted a comment from nanosoldier Apr 13, 2018
@nalimilan
Copy link
Member Author

I've fixed the failures. One of them was interesting: on 32-bit, linearindices used to return a range with an element type matching length, but LinearIndices always uses Int. A test for Dates ranges overflowed because such ranges use Int64 length. I used eachindex instead, as was internal (in lastindex), but that's still interesting to note.

Regarding the Nanosoldier run,the find* regressions are expected because of the deprecation (JuliaCI/BaseBenchmarks.jl#196). Other regressions for cumsum!, rand!/RangeGenerator and add/mul/sub are intriguing: they can't be due to chance, but I'm not sure why these would particularly suffer from this PR.

@ararslan
Copy link
Member

Nanosoldier has been updated and retuned with Milan's changes.

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@Keno
Copy link
Member

Keno commented May 2, 2018

What's the status of this? It's 1.0 tagged.

@nalimilan
Copy link
Member Author

Sorry, I haven't investigated the cumsum and rand!/RandomGenerator benchmarks regressions, which appear to be real. I'll try to have a look, but we can always merge this and fix the regressions later if needed.

@mbauman
Copy link
Member

mbauman commented May 2, 2018

I'm looking into it now.

@mbauman mbauman force-pushed the nl/linearindices2 branch from 94255ce to 653d2d4 Compare May 2, 2018 21:48
@mbauman
Copy link
Member

mbauman commented May 2, 2018

That seems to have done the trick for most of the regressions. Looks like there is still a bonus allocation in rand! due to constructing a LinearIndices and passing it to a non-inlined function, but it has no effect on the performance.

May as well have nanosoldier double-check my work:

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@nalimilan
Copy link
Member Author

It appears that all CI builds time out (even after restarting one). Really weird.

@Keno
Copy link
Member

Keno commented May 4, 2018

Looks like building the Test stdlib takes 10x longer on this branch than usual. Does that add an invalidation that touches a lot of code?

@nalimilan
Copy link
Member Author

Good catch! That's intriguing. Test is the only affected module, and yet it doesn't use LinearIndices except in the deprecated test_approx_eq. And the problem remains after replacing these with eachindex.

Can you detail what you suspect about code invalidation?

BTW, I've checked that the problem is not due to Matt's last commit, and yet it didn't happen before I rebased. So maybe some interaction with a recent change on master.

@nalimilan nalimilan force-pushed the nl/linearindices2 branch from d68752b to c6b6100 Compare May 6, 2018 15:26
@nalimilan
Copy link
Member Author

Unfortunately, your fix has reintroduced the failure on 32-bit I mentioned above. I've pushed another commit which seems to also fix the problem, by moving the eachindex definition before firstindex/lastindex (don't ask me why it works...).

_all_match_first(f::F, inds) where F<:Function = true

# keys with an IndexStyle
keys(s::IndexStyle, A::AbstractArray, B::AbstractArray...) = eachindex(s, A, B...)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method isn't documented and doesn't seem to be used. Probably better remove it? It's not clear whether it should call eachindex rather than LinearIndices/CartesianIndices.

@nalimilan nalimilan changed the title RFC: Deprecate linearindices in favor of LinearIndices Deprecate linearindices in favor of LinearIndices May 6, 2018
@nalimilan
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@nalimilan
Copy link
Member Author

Most regressions seem to be spurious (as they don't appear in both Nanosoldier runs), except for accumulate/cumsum. It turns out its due to doing LinearIndices(x)[2:end], which currently allocates a new Array (!). I've pushed a commit which should fix this by returning a range instead (and removes the apparently useless length method I spotted above).

…earIndices, ::AbstractRange)

Plus two small fixes.
@nalimilan nalimilan force-pushed the nl/linearindices2 branch from 3a3683c to 1ad025b Compare May 7, 2018 11:22
@nalimilan
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

base/indices.jl Outdated
@@ -357,11 +357,16 @@ LinearIndices(A::Union{AbstractArray,SimpleVector}) = LinearIndices(axes(A))
IndexStyle(::Type{<:LinearIndices}) = IndexLinear()
axes(iter::LinearIndices) = iter.indices
size(iter::LinearIndices) = map(unsafe_length, iter.indices)
length(iter::LinearIndices) = prod(size(iter))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something is iterable and has a known length, it makes sense to define length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case it's also an AbstractArray, in which case exactly this definition is defined as a fallback. I defined it because (like you) I was thinking about it as an iterable — but Milan is also correct that this is an array and doesn't technically need it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah of course; I see.

@mbauman
Copy link
Member

mbauman commented May 7, 2018

Ah, thanks for cleaning that up!

I'm still not entirely satisfied about the use of eachindex(IndexLinear(), _) in lieu of LinearIndices. It seems like the main reason we use it is in cases where we specifically require a vector or range. What about defining:

# Reshaping multidimensional LinearIndices to a vector can simply return OneTos
vec(iter::LinearIndices{1}) = iter.indices[1]
vec(iter::LinearIndices) = OneTo(length(iter))
reshape(iter::LinearIndices, ::Tuple{Colon}) = vec(iter)
reshape(iter::LinearIndices{1}, ::Tuple{Colon}) = vec(iter)

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@nalimilan
Copy link
Member Author

Looks like all real performance regressions are fixed now!

Regarding the use of eachindex, I'm not sure using vec(::LinearIndices) to get a range would really match the definition of vec. AFAICT vec returns an AbstractVector, and nothing in its definition implies that a LinearIndices{1} input should be transformed into a range, since it's already an AbstractVector. Maybe we just need to define checkindex for LinearIndices: that would cover most cases.

For firstindex and lastindex, the problem is different anyway, and it's due to some range types using Int64 rather than Int with length. Not sure what to do about that, but we could decide it's not supported (since problems can happen elsewhere anyway due to the use of Int in most Base functions).

Copy link
Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good as it is. Barring any other comments, let's merge in the next 24 hours.

@mbauman mbauman merged commit 4c665b7 into master May 8, 2018
@mbauman mbauman deleted the nl/linearindices2 branch May 8, 2018 22:43
@fredrikekre
Copy link
Member

This broke master since I merged #26993 and this PR didn't run CI after that merge, failure:

725.287218  !! Duplicate docs found for 'Base.LinearIndices'. [src/base/arrays.md]

(Sorry for the breakage, but I guess this proves that it is nice to have strict mode enabled :) )

@fredrikekre
Copy link
Member

fredrikekre commented May 9, 2018

A couple of depwarns that was caught by doctests

julia> prod(1:20)
┌ Warning: `LinearIndices(inds::Vararg{AbstractUnitRange{Int}, N}) where N` is deprecated, use `LinearIndices(inds)` instead.
│   caller = _mapreduce(::typeof(identity), ::typeof(Base.mul_prod), ::IndexLinear, ::UnitRange{Int64}) at reduce.jl:334
└ @ Base reduce.jl:334
2432902008176640000

julia> copyto!([1.0, 2.0], 1:2)
┌ Warning: `LinearIndices(inds::Vararg{AbstractUnitRange{Int}, N}) where N` is deprecated, use `LinearIndices(inds)` instead.
│   caller = copyto!(::IndexLinear, ::Array{Float64,1}, ::IndexLinear, ::UnitRange{Int64}) at abstractarray.jl:718
└ @ Base abstractarray.jl:718
2-element Array{Float64,1}:
 1.0
 2.0

We really need #27000 -- the doctests are as important as the regular tests, and as shown here, covers some different paths.

@fredrikekre
Copy link
Member

Seems like the case with N == 1 here

julia/base/deprecated.jl

Lines 1582 to 1586 in 4c665b7

# Remove ambiguous CartesianIndices and LinearIndices constructors that are ambiguous between an axis and an array (#26448)
@eval IteratorsMD @deprecate CartesianIndices(inds::Vararg{AbstractUnitRange{Int},N}) where {N} CartesianIndices(inds)
@eval IteratorsMD @deprecate CartesianIndices(inds::Vararg{AbstractUnitRange{<:Integer},N}) where {N} CartesianIndices(inds)
@eval IteratorsMD @deprecate LinearIndices(inds::Vararg{AbstractUnitRange{Int},N}) where {N} LinearIndices(inds)
@eval IteratorsMD @deprecate LinearIndices(inds::Vararg{AbstractUnitRange{<:Integer},N}) where {N} LinearIndices(inds)

should not be deprecated?

@mbauman
Copy link
Member

mbauman commented May 9, 2018

I'm not reproducing that. There are the N=1 methods just below the lines you posted.

@fredrikekre
Copy link
Member

There are the N=1 methods just below the lines you posted.

Yea, cause I included them in #27037 😄

@mbauman
Copy link
Member

mbauman commented May 9, 2018

That's awesome. Thanks so much @fredrikekre! I need to drink more coffee and change my issue-reading order!

@nalimilan
Copy link
Member Author

I've filed #27090 about the lastindex issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants