Implement Exponentiation via Associative Iteration #75

Open
wants to merge 28 commits into
base: master
17 changes: 10 additions & 7 deletions .gitignore
@@ -1,7 +1,10 @@
# Vscode does not like to build outside of the source tree
# (multiple glitches)

.vscode
test/.vscode
build
.cache

.vscode
test/.vscode
build
.cache
.idea
**cmake-build**
Comment on lines +6 to +7
Owner

Does your IDE really require building inline with the sources, or within the source tree?
I am loath to pollute the repository with support for the anti-pattern of building within the sources; however, Visual Studio Code forced me to yield. VSCode is a poor tool that glitches in very frustrating ways when asked to build out of the source tree; that is the only reason for the test/.vscode, build, and .cache entries.
So, unless your tool glitches, adopt the practice of not building in-tree and remove this.

Collaborator Author

Why is this such a bad thing? I think most tools and LSPs expect to build in the source tree. I think CLion does this by default, and I think clangd looks for ./build in the root of the project by default.

Owner

Just two things I dislike:
Most tools, like the command-line grep, find, etc., are extremely hard to configure to understand .gitignore, so you slow down your workflow by having to treat the non-source directories as explicit special cases with every non-git tool you use.

Let's say you want to build with GCC and also with Clang.
Who gets to build in-tree? Both?
So, for every flavor of build you may want, you need to provide an entry point within the tree and add it to the .gitignore.
How is that helpful compared with segregating builds from sources?

These are two instances of the more general problem of disorganization. A freshly cloned repository ought to contain only files generated by people, or by tools so sophisticated they may as well be people. Further processing of "sources" ought not to dump the results into a place where it is hard to tell them apart from files made by people.

It baffles me that colleagues expect that whatever configuration is good for holding source files is going to be just as good for built files. For example, what if I configure the source file system with very small blocks, expecting source files to be among the smallest in the distribution of file sizes, but keep the build file system with large blocks, for the same reason in reverse?

See? There is a multiplicity of day-to-day advantages to keeping things organized. Very fundamental stuff for software engineers.

Collaborator Author

Sorry, it took a while to get back to this.

I understand both of these points. My overriding point, regardless of my opinion, is that this is how the project is going to be used by people, so we might as well make it convenient for them to contribute. Many IDEs and tools like the build folder to be in the source tree. I'm using the clangd LSP hooked into nvim, which I believe requires compile_commands.json to be in ./build. JetBrains also builds in the source tree by default, into cmake-build-[BUILD_TYPE].

Secondly, you can use tools like ripgrep in place of grep; ripgrep respects .gitignore, which is awesome.

Though I hear your points and agree, this addition doesn't prevent more optimal workflows, and it lets anyone contribute more easily.

Owner

Let's continue this conversation online; at this point, it is an item that should not go in this PR but in another PR, if we decide to support building in-tree.

Let's have PRs with a "single responsibility".


# Vscode does not like to build outside of the source tree
# (multiple glitches)
Empty file added CMakeLists.txt
Empty file.
1 change: 1 addition & 0 deletions inc/zoo/meta/BitmaskMaker.h
@@ -42,6 +42,7 @@ struct BitmaskMaker {

static_assert(0xF0F0 == BitmaskMaker<uint16_t, 0xF0, 8>::value);
static_assert(0xEDFEDFED == BitmaskMaker<uint32_t, 0xFED, 12>::value);
static_assert(0b0001'0001 == BitmaskMaker<unsigned char, 1, 4>::value);

}} // zoo::meta
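
The static assertions above check that `BitmaskMaker` repeats a bit pattern across the whole width of an integer type. As a rough illustration of that behavior only, here is a minimal sketch using a hypothetical free function, `replicatePattern`, which is not part of the library and assumes the pattern already fits in `width` bits:

```cpp
#include <cstdint>

// Hypothetical helper (not the library's BitmaskMaker): copy `pattern`
// into every `width`-bit slot of T, starting from the least significant slot.
// Assumes `pattern` fits in `width` bits; bits shifted past the top of T
// are dropped by the final cast.
template<typename T>
constexpr T replicatePattern(T pattern, int width) {
    constexpr int totalBits = static_cast<int>(sizeof(T) * 8);
    unsigned long long wide = 0;
    for (int shift = 0; shift < totalBits; shift += width) {
        // Widen before shifting so narrow types do not overflow int.
        wide |= static_cast<unsigned long long>(pattern) << shift;
    }
    return static_cast<T>(wide);
}

// The same three cases asserted in the diff above.
static_assert(0xF0F0 == replicatePattern<std::uint16_t>(0xF0, 8));
static_assert(0xEDFEDFED == replicatePattern<std::uint32_t>(0xFED, 12));
static_assert(0b0001'0001 == replicatePattern<unsigned char>(1, 4));
```

The second case shows that a width that does not divide the type evenly still works: the topmost copy is simply truncated.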

16 changes: 16 additions & 0 deletions inc/zoo/swar/SWAR.h
@@ -5,6 +5,7 @@
#include "zoo/meta/log.h"

#include <type_traits>
#include <initializer_list>

#ifdef _MSC_VER
#include <iso646.h>
@@ -90,6 +91,21 @@ struct SWAR {

constexpr T value() const noexcept { return m_v; }

template<std::size_t N>
constexpr static T baseFromLaneLiterals(const T(&args)[N]) {
static_assert(N == Lanes, "Wrong number of lanes");
T result = 0;
for (auto arg: args) {
result = (result << NBits) | arg;
}
return result;
}

template<std::size_t N>
constexpr static SWAR fromLaneLiterals(const T(&args)[N]) {
return SWAR{baseFromLaneLiterals(args)};
}

#define SWAR_UNARY_OPERATORS_X_LIST \
X(SWAR, ~)
//constexpr SWAR operator~() const noexcept { return SWAR{~m_v}; }
69 changes: 61 additions & 8 deletions inc/zoo/swar/associative_iteration.h
@@ -260,7 +260,7 @@ template<int NB, typename B>
constexpr auto makeLaneMaskFromMSB(SWAR<NB, B> input) {
using S = SWAR<NB, B>;
auto msb = input & S{S::MostSignificantBit};
auto msbCopiedToLSB = S{msb.value() >> (NB - 1)};
auto msbCopiedToLSB = S{static_cast<B>(msb.value() >> (NB - 1))};
return impl::makeLaneMaskFromMSB_and_LSB(msb, msbCopiedToLSB);
}
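
The PR does not say why the `static_cast<B>` was added in the hunk above. One plausible reason, stated here as an assumption, is integer promotion: when the base type is narrower than `int`, a shift yields an `int`, and brace-initializing from that `int` is a narrowing conversion that compilers reject or warn about. A minimal sketch with a hypothetical wrapper type, `Wrapped8`, standing in for a SWAR over an 8-bit base:

```cpp
#include <cstdint>
#include <type_traits>

// Shifting a std::uint8_t promotes it to int first, so the result is an int.
static_assert(std::is_same_v<decltype(std::uint8_t{0x80} >> 7), int>);

// Hypothetical stand-in for a SWAR-like wrapper; not the library's type.
struct Wrapped8 {
    std::uint8_t v;
    constexpr explicit Wrapped8(std::uint8_t value) : v(value) {}
};

constexpr Wrapped8 msbToLsb(Wrapped8 in) {
    // Wrapped8{in.v >> 7} would brace-initialize from a non-constant int,
    // which is a narrowing conversion. The explicit cast states the intent.
    return Wrapped8{static_cast<std::uint8_t>(in.v >> 7)};
}

static_assert(msbToLsb(Wrapped8{0x80}).v == 1);
static_assert(msbToLsb(Wrapped8{0x7F}).v == 0);
```

With 32-bit or 64-bit base types no promotion occurs, which would explain why the original code compiled for the lane configurations tested before this PR.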

@@ -392,8 +392,13 @@ template<
typename CountHalver
>
constexpr auto associativeOperatorIterated_regressive(
Base base, Base neutral, IterationCount count, IterationCount forSquaring,
Operator op, unsigned log2Count, CountHalver ch
const Base base,
const Base neutral,
IterationCount count,
const IterationCount forSquaring,
const Operator op,
unsigned log2Count,
const CountHalver ch
) {
auto result = neutral;
if(!log2Count) { return result; }
@@ -419,17 +424,54 @@ constexpr auto multiplication_OverflowUnsafe_SpecificBitCount(

auto halver = [](auto counts) {
auto msbCleared = counts & ~S{S::MostSignificantBit};
return S{msbCleared.value() << 1};
return S{static_cast<T>(msbCleared.value() << 1)};
};

multiplier = S{multiplier.value() << (NB - ActualBits)};
multiplier = S{static_cast<T>(multiplier.value() << (NB - ActualBits))};
return associativeOperatorIterated_regressive(
multiplicand, S{0}, multiplier, S{S::MostSignificantBit}, operation,
ActualBits, halver
multiplicand,
S{0},
multiplier,
S{S::MostSignificantBit},
operation,
ActualBits,
halver
);
}

/// \note Not removed yet because it is an example of "progressive" associative exponentiation
template<int ActualBits, int NB, typename T>
constexpr auto exponentiation_OverflowUnsafe_SpecificBitCount(
SWAR<NB, T> x,
SWAR<NB, T> exponent
) {
using S = SWAR<NB, T>;

auto operation = [](auto left, auto right, auto counts) {
const auto mask = makeLaneMaskFromMSB(counts);
const auto product =
multiplication_OverflowUnsafe_SpecificBitCount<ActualBits>(left, right);
return (product & mask) | (left & ~mask);
};

// halver should work same as multiplication... i think...
auto halver = [](auto counts) {
auto msbCleared = counts & ~S{S::MostSignificantBit};
return S{static_cast<T>(msbCleared.value() << 1)};
};

exponent = S{static_cast<T>(exponent.value() << (NB - ActualBits))};
return associativeOperatorIterated_regressive(
x,
S{meta::BitmaskMaker<T, 1, NB>().value}, // neutral is lane wise..
exponent,
S{S::MostSignificantBit},
operation,
ActualBits,
halver
);
}
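
The `operation` and `halver` lambdas above rely on two bit tricks: expanding each lane's most significant bit into a full-lane mask, then using that mask to choose per lane between two values without branching. The sketch below reproduces both on a plain `std::uint32_t` with four 8-bit lanes. The function names are hypothetical, and the mask construction is one common way to do it, not necessarily how `makeLaneMaskFromMSB` is implemented:

```cpp
#include <cstdint>

// Most significant bit of each of the four 8-bit lanes.
constexpr std::uint32_t kMsbs = 0x80808080u;

// Turn each lane's MSB into an all-ones (0xFF) or all-zeros (0x00) lane.
constexpr std::uint32_t laneMaskFromMSB(std::uint32_t v) {
    const std::uint32_t msb = v & kMsbs;
    const std::uint32_t lsb = msb >> 7;   // MSB copied to the lane's LSB
    return (msb - lsb) | msb;             // 0x80 - 0x01 = 0x7F, then | 0x80
}

// Per lane: take `ifSet` where the mask is ones, `otherwise` elsewhere.
constexpr std::uint32_t laneSelect(
    std::uint32_t ifSet, std::uint32_t otherwise, std::uint32_t mask
) {
    return (ifSet & mask) | (otherwise & ~mask);
}

// Drop the bit just consumed and move the next bit of each lane into the MSB.
// Clearing the MSB first keeps the shift from spilling into the next lane.
constexpr std::uint32_t advanceCounts(std::uint32_t counts) {
    return (counts & ~kMsbs) << 1;
}

static_assert(laneMaskFromMSB(0x80000080u) == 0xFF0000FFu);
static_assert(laneSelect(0x11223344u, 0xAABBCCDDu, 0xFF0000FFu) == 0x11BBCC44u);
static_assert(advanceCounts(0xC0400180u) == 0x80800200u);
```

In the exponentiation above, the selected value is the product when the current exponent bit is set and the unchanged `left` operand otherwise.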

// \note Not removed yet because it is an example of "progressive" associative exponentiation
template<int ActualBits, int NB, typename T>
constexpr auto multiplication_OverflowUnsafe_SpecificBitCount_deprecated(
SWAR<NB, T> multiplicand,
@@ -462,6 +504,17 @@ constexpr auto multiplication_OverflowUnsafe(
);
}

template<int NB, typename T>
constexpr auto exponentiation_OverflowUnsafe(
SWAR<NB, T> base,
SWAR<NB, T> exponent
) {
return
exponentiation_OverflowUnsafe_SpecificBitCount<NB>(
base, exponent
);
}

template<int NB, typename T>
struct SWAR_Pair{
SWAR<NB, T> even, odd;
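
For readers unfamiliar with the technique, the following is a scalar sketch of exponentiation by squaring that consumes the exponent from its most significant bit downward. It assumes, based on the MSB mask and the left-shifting halver in the diff above, that the "regressive" iteration works in this order; the body of `associativeOperatorIterated_regressive` is not shown in this diff, so this is an illustration of the general technique rather than a transcription. `powMsbFirst` is a hypothetical name.

```cpp
#include <cstdint>

// Scalar, single-lane sketch: square the running result once per exponent
// bit, and multiply by the base whenever the current bit is set.
// Like the SWAR version in the PR, it does not guard against overflow.
constexpr std::uint64_t powMsbFirst(
    std::uint64_t base, std::uint64_t exponent, unsigned bits
) {
    std::uint64_t result = 1;  // neutral element of multiplication
    for (unsigned i = bits; i-- > 0;) {
        result *= result;
        if ((exponent >> i) & 1u) { result *= base; }
    }
    return result;
}

// The per-lane values used by the 8-bit test case in this PR.
static_assert(powMsbFirst(2, 7, 8) == 128);
static_assert(powMsbFirst(3, 4, 8) == 81);
static_assert(powMsbFirst(5, 2, 8) == 25);
static_assert(powMsbFirst(6, 3, 8) == 216);
```

The SWAR version performs the same steps on all lanes at once, which is why the neutral element passed in the diff is a 1 in every lane rather than a single 1.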
30 changes: 26 additions & 4 deletions test/swar/BasicOperations.cpp
@@ -7,7 +7,6 @@
#include <iostream>
#include <type_traits>


using namespace zoo;
using namespace zoo::swar;

@@ -64,8 +63,31 @@ static_assert(
multiplication_OverflowUnsafe_SpecificBitCount<3>(Micand, Mplier).value()
);

Collaborator Author

GOOD NEWS. Looks like my initial, first-pass implementation was correct after all!
It seems part of the issue was how I was trying to compare the binary literals.

All the values I tried give consistently correct answers when evaluated using the hex idiom you already have.

static_assert(0b00000010000000110000010100000110 == 0x02'03'05'06);
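
To make the comparison above easier to check by eye, here is a minimal standalone sketch of packing per-lane literals into one integer, in the spirit of the `baseFromLaneLiterals` function added in this PR. `packLanes` is a hypothetical free function written for this illustration; it assumes at least two lanes and that the first literal lands in the most significant lane.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical free-function counterpart of baseFromLaneLiterals:
// shift the accumulated value up by one lane, then OR in the next literal.
// Requires at least two lanes so the shift is narrower than T.
template<int LaneBits, typename T, std::size_t N>
constexpr T packLanes(const T (&lanes)[N]) {
    static_assert(N * LaneBits == sizeof(T) * 8, "Wrong number of lanes");
    static_assert(N >= 2, "A single lane would shift by the full width");
    T result = 0;
    for (auto lane : lanes) {
        result = static_cast<T>((result << LaneBits) | lane);
    }
    return result;
}

// Digit separators make binary and hex literals line up lane by lane.
static_assert(0b00000010'00000011'00000101'00000110 == 0x02'03'05'06);
static_assert(packLanes<8, std::uint32_t>({2, 3, 5, 6}) == 0x02'03'05'06);
static_assert(packLanes<8, std::uint32_t>({128, 81, 25, 216}) == 0x80'51'19'D8);
```

Writing the binary literal with a separator every eight bits, as in the first assertion, avoids the miscounting that an unbroken 32-digit literal invites.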

TEST_CASE("Exponentiation with 8-bit lane width (overflow unsafe)") {
using S = SWAR<8, u32>;
constexpr auto base = S::fromLaneLiterals({2, 3, 5, 6});
constexpr auto exponent = S::fromLaneLiterals({7, 4, 2, 3});
constexpr auto expected = S::fromLaneLiterals({128, 81, 25, 216});
constexpr auto actual = exponentiation_OverflowUnsafe(base, exponent);
static_assert(expected.value() == actual.value());
CHECK(expected.value() == actual.value());
}

TEST_CASE("Exponentiation with 16-bit lane width (overflow unsafe)") {
using S = SWAR<16, u64>; // Change to 16-bit lane width
constexpr auto base = S::fromLaneLiterals({10, 2, 7, 3});
constexpr auto exponent = S::fromLaneLiterals({3, 5, 1, 4});
constexpr auto expected = S::fromLaneLiterals({1000, 32, 7, 81});
constexpr auto actual = exponentiation_OverflowUnsafe(base, exponent);
static_assert(expected.value() == actual.value());
CHECK(expected.value() == actual.value());
}

};


#define HE(nbits, t, v0, v1) \
static_assert(horizontalEquality<nbits, t>(\
SWAR<nbits, t>(v0),\
@@ -425,7 +447,7 @@ TEST_CASE(
"BooleanSWAR MSBtoLaneMask",
"[swar]"
) {
// BooleanSWAR as a mask:
// BooleanSWAR as a mask:
auto bswar =BooleanSWAR<4, u32>(0x0808'0000);
auto mask = S4_32(0x0F0F'0000);
CHECK(bswar.MSBtoLaneMask().value() == mask.value());
@@ -452,6 +474,6 @@ TEST_CASE(
CHECK(SWAR<4, u16>(0x0400).value() == saturatingUnsignedAddition(SWAR<4, u16>(0x0100), SWAR<4, u16>(0x0300)).value());
CHECK(SWAR<4, u16>(0x0B00).value() == saturatingUnsignedAddition(SWAR<4, u16>(0x0800), SWAR<4, u16>(0x0300)).value());
CHECK(SWAR<4, u16>(0x0F00).value() == saturatingUnsignedAddition(SWAR<4, u16>(0x0800), SWAR<4, u16>(0x0700)).value());
CHECK(SWAR<4, u16>(0x0F00).value() == saturatingUnsignedAddition(SWAR<4, u16>(0x0800), SWAR<4, u16>(0x0800)).value());
CHECK(S4_32(0x0F0C'F000).value() == saturatingUnsignedAddition(S4_32(0x0804'F000), S4_32(0x0808'F000)).value());
CHECK(SWAR<4, u16>(0x0F00).value() == saturatingUnsignedAddition(SWAR<4, u16>(0x0800), SWAR<4, u16>(0x0800)).value());
CHECK(S4_32(0x0F0C'F000).value() == saturatingUnsignedAddition(S4_32(0x0804'F000), S4_32(0x0808'F000)).value());
}