Skip to content

Add internal top_set_bit function #47523

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Jan 8, 2023
Merged

Add internal top_set_bit function #47523

merged 22 commits into from
Jan 8, 2023

Conversation

LilithHafner
Copy link
Member

At least 17 out of the 36 uses of leading_zeros in base are an attempt to compute how many bits are used. This PR defines that function explicitly and documents it for increased robustness and ease of use.

@laborg
Copy link
Contributor

laborg commented Nov 10, 2022

I had need of such a function on occasions. With this being documented I assume its stable and can't be removed in future versions, right?

@LilithHafner
Copy link
Member Author

LilithHafner commented Nov 10, 2022

Yes. The documentation at https://docs.julialang.org is the authoritative source on what is guaranteed to be stable for all of 1.x (as described here). This PR would add Base.used_bits to that documentation.

EDIT: No. Sorry, it will not be documented; see the discussion below. You can use ndigits(x; base=2, pad=0), though.

@oscardssmith
Copy link
Member

I think we should probably triage this to bike-shop the name.

@LilithHafner LilithHafner added the triage This should be discussed on a triage call label Nov 10, 2022
@LilithHafner
Copy link
Member Author

Triage is 1:15am my time, so I will not be joining you, but that seems like a good idea.

@ararslan
Copy link
Member

Seems like this could be exported given that leading_*, trailing_*, and count_* (for * in zeros, ones) are exported.

@JeffBezanson
Copy link
Member

Ideally this would be called ndigits; we should try to get that to constprop for constant base 2.

@JeffBezanson
Copy link
Member

If needed we can also call it ndigits2 internally.

@giordano
Copy link
Contributor

ndigits has already fast-track for all bases which are powers of 2 (not just 2): #31722

@MasonProtter
Copy link
Contributor

The concern about ndigits(x, base=2) was that const-prop failed to inline one of the calls when this should really just compile down to a couple of assembly instructions, but it does still end up being quite fast even without the inlining:

julia> code_llvm(Tuple{Int}) do x
           ndigits(x, base=2)
       end
;  @ REPL[7]:2 within `#5`
define i64 @"julia_#5_386"(i64 signext %0) #0 {
top:
; ┌ @ intfuncs.jl:658 within `ndigits##kw`
; │┌ @ intfuncs.jl:658 within `#ndigits#426`
    %1 = call i64 @j_ndigits0z_388(i64 signext %0, i64 signext 2) #0
; ││┌ @ promotion.jl:488 within `max`
; │││┌ @ essentials.jl:489 within `ifelse`
      %2 = icmp sgt i64 %1, 1
      %3 = select i1 %2, i64 %1, i64 1
; └└└└
  ret i64 %3
}

julia> @benchmark ndigits(x, base=2) setup=(x=rand(1:10000))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  4.740 ns  8.200 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     5.390 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.137 ns ± 0.394 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅                            █                             
  █▂▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂█▄▂▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▂▁▂▂▂▂▁▁▂▂ ▂
  4.74 ns        Histogram: frequency by time       6.01 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

I wonder if anyone is missing the 2ns that could be gained here:

julia> used_bits(x::Base.BitInteger) = 8sizeof(x) - leading_zeros(x)
used_bits (generic function with 2 methods)

julia> code_llvm(Tuple{Int}) do x
           used_bits(x)
       end
;  @ REPL[13]:2 within `#9`
define i64 @"julia_#9_954"(i64 signext %0) #0 {
top:
; ┌ @ REPL[12]:1 within `used_bits`
; │┌ @ int.jl:421 within `leading_zeros`
    %1 = call i64 @llvm.ctlz.i64(i64 %0, i1 false)
; │└
; │┌ @ int.jl:86 within `-`
    %2 = sub nuw nsw i64 64, %1
; └└
  ret i64 %2
}

julia> @benchmark used_bits(x) setup=(x=rand(1:10000))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  1.940 ns  4.940 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     1.950 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.951 ns ± 0.062 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                                          █  
  ▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁█ ▂
  1.94 ns        Histogram: frequency by time       1.95 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

@oscardssmith
Copy link
Member

These 2ns matter a lot. Uses for this include things like fmod which in only a few ns anyway so this would be a roughly 100% regression.

@giordano
Copy link
Contributor

Would it be easier if for the time being the proposed used_bits function (maybe renamed to ndigits2 as suggested above) wasn't part of the public API and kept only for internal usage when ultimate performance is needed?

@KristofferC
Copy link
Member

Would it be easier if for the time being the proposed used_bits function (maybe renamed to ndigits2 as suggested above) wasn't part of the public API and kept only for internal usage when ultimate performance is needed?

Yes please. Always do things incrementally if possible imo.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Nov 10, 2022
@LilithHafner LilithHafner changed the title Add unexported used_bits function Add internal ndigits2 function Nov 11, 2022
@LilithHafner
Copy link
Member Author

Thanks! This all seems quite sensible. I was unaware of the existing ndigits function when I named this used_bits. I've implemented the changes.

I named the function ndigits0z2 rather than ndigits2 because ndigits0z2(0) == 0 which is consistent with ndigits0z and not ndigits. This name was inspired by the comment found here

@LilithHafner LilithHafner changed the title Add internal ndigits2 function Add internal ndigits0z2 function Nov 11, 2022
@oscardssmith
Copy link
Member

I'm slightly worried that I will never remember that name.

@LilithHafner
Copy link
Member Author

Is inconsistency with ndigits when x=0 something that triage discussed and decided was okay, or something that didn't come up? It seems misleading to call this ndigits2 because of that inconsistency.

@oscardssmith
Copy link
Member

arg. Triage missed that. technically this corresponds to ndigits(0,base=2,pad=0).

@LilithHafner LilithHafner added the triage This should be discussed on a triage call label Nov 13, 2022
@LilithHafner
Copy link
Member Author

top_set_bit

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Jan 5, 2023
@LilithHafner LilithHafner changed the title Add internal ndigits0z2 function Add internal top_set_bit function Jan 6, 2023
@LilithHafner LilithHafner added the merge me PR is reviewed. Merge when all tests are passing label Jan 6, 2023
Copy link
Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and can be merged when green as far as I'm concerned. We might want to have a slightly broader name discussion before considering exporting, but I do think this would be a pretty hand function to export.

julia> top_set_bit(-1)
64
"""
top_set_bit(x::BitInteger) = 8sizeof(x) - leading_zeros(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
top_set_bit(x::BitInteger) = 8sizeof(x) - leading_zeros(x)
top_set_bit(x::BitInteger) = 8*sizeof(x) - leading_zeros(x)

@giordano
Copy link
Contributor

giordano commented Jan 8, 2023

Needs a rebase.

@oscardssmith oscardssmith merged commit 708d1bd into master Jan 8, 2023
@oscardssmith oscardssmith deleted the add-used-bits branch January 8, 2023 22:16
@LilithHafner LilithHafner removed the merge me PR is reviewed. Merge when all tests are passing label Jan 9, 2023
odow pushed a commit to jump-dev/MutableArithmetics.jl that referenced this pull request Jan 31, 2023
Issue: MutableArithmetics for SparseArrays is broken on nightly Julia
because SparseArrays.ilog2 doesn't exist anymore.

This adds the function _top_set_bit and uses it instead of ilog2.

See
JuliaSparse/SparseArrays.jl#324
and
JuliaLang/julia#47523
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.