Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide implementation for fast integral log2 #5277

Open
0xdaryl opened this issue Jun 3, 2020 · 4 comments
Open

Provide implementation for fast integral log2 #5277

0xdaryl opened this issue Jun 3, 2020 · 4 comments

Comments

@0xdaryl
Copy link
Contributor

0xdaryl commented Jun 3, 2020

Integral log2 (essentially a floor base-2 logarithm function on integral types) is useful in some contexts during compilation. Provide a fast implementation that lives in infra/Bit.hpp. I suggest an implementation based on the Hacker's Delight definition of integral log2, which essentially defines log2(0) = -1 (which isn't true in the mathematical sense) and makes the computation more efficient.

For example, the implementation for 32-bit log2 is simply:

/**
 * @return floor log2(x) if x>0
 *         -1 if x==0
 *         undefined if x<0
 */
static inline int32_t log2(int32_t x)
   {
   return 31 - leadingZeroes(x);
   }

This results in:

log2(16) = 4
log2(23) = 4
log2(0) = -1
log2(-145) = undefined

leadingZeroes has been optimized in the compiler depending on the architecture.

Provide a version for 64-bit integrals as well. Create test cases to exercise these functions.

@0xdaryl
Copy link
Contributor Author

0xdaryl commented Jun 3, 2020

A bigger issue is to contribute such utility functions to a more common OMR component (e.g., in include_core) and use them everywhere, but that is not the focus of this particular issue.

@0xdaryl 0xdaryl changed the title Provide implementations for fast integral log2 Provide implementation for fast integral log2 Jun 3, 2020
@ghost
Copy link

ghost commented Jun 3, 2020

Hi @0xdaryl !
Seems like something interesting. I would like to take a stab at this.
I was wondering if I could add this implementation: https://stackoverflow.com/a/11398748/4042839

Let me know if you have comments. If you are okay with this, I would love to take this up ! I am just starting off contributing to open source and this should be a good beginners exercise.

@ghost
Copy link

ghost commented Jun 21, 2020

@0xdaryl in case you missed my earlier comment :)

@0xdaryl
Copy link
Contributor Author

0xdaryl commented Jun 22, 2020

Thanks for the poke. My apologies, I did miss your update earlier.

First off, thanks for your offer to help. New contributors are always welcome!

One of the big advantages of implementing N-bit log2(X) as N-1-leadingZeros(X) is that every architecture OMR supports [1] provides some reasonably fast implementation of a "count leading zeros" instruction in hardware. Since the OMR compiler is not only about producing fast code but fast compile-time as well, we're looking to shave off as many cycles as we can during the compilation. Hence, I would prefer exploring a solution that is capable of leveraging the hardware instructions rather than implement a generic implementation in software that may be slower.

Some of our build compilers provide intrinsics for the CLZ family of instructions. For those that don't I believe inline assembly should suffice.

In thinking the design of this through a bit more, I realize to be truly efficient we should really be replacing the leadingZeros function in the compiler with a version that is backed by a hardware instruction (via a compiler intrinsic or inline assembly). I think only Power takes advantage of this, but there's no reason it shouldn't be extended to other architectures as well.

Perhaps this work can be split into two parts: the first is implementing a log2 intrinsic that leverages a leadingZeros intrinsic, and the second is to provide a hardware-backed implementation of leadingZeros (the second could likely be done as part of a different issue if that's not something you want to work on).

[1] x86 (lzcnt), Power (cnttzd), Z (flogr), AArch64 (clz), RISC-V (clz) or variants thereof

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant