Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further sequester Group/Tag code #568

Merged
merged 2 commits into from
Nov 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/raw/bitmask.rs → src/control/bitmask.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use super::imp::{
use super::group::{
BitMaskWord, NonZeroBitMaskWord, BITMASK_ITER_MASK, BITMASK_MASK, BITMASK_STRIDE,
};

Expand Down Expand Up @@ -102,7 +102,7 @@ impl IntoIterator for BitMask {

/// Iterator over the contents of a `BitMask`, returning the indices of set
/// bits.
#[derive(Copy, Clone)]
#[derive(Clone)]
pub(crate) struct BitMaskIter(pub(crate) BitMask);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterators in general shouldn't implement Copy, for better or worse.


impl Iterator for BitMaskIter {
Expand Down
9 changes: 3 additions & 6 deletions src/raw/generic.rs → src/control/group/generic.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
use super::bitmask::BitMask;
use super::Tag;
use super::super::{BitMask, Tag};
use core::{mem, ptr};

// Use the native word size as the group size. Using a 64-bit group size on
Expand Down Expand Up @@ -81,8 +80,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn load_aligned(ptr: *const Tag) -> Self {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
Group(ptr::read(ptr.cast()))
}

Expand All @@ -91,8 +89,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn store_aligned(self, ptr: *mut Tag) {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
ptr::write(ptr.cast(), self.0);
}

Expand Down
35 changes: 35 additions & 0 deletions src/control/group/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
cfg_if! {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just copied directly from the raw module, without any additional changes.

// Use the SSE2 implementation if possible: it allows us to scan 16 buckets
// at once instead of 8. We don't bother with AVX since it would require
// runtime dispatch and wouldn't gain us much anyways: the probability of
// finding a match drops off drastically after the first few buckets.
//
// I attempted an implementation on ARM using NEON instructions, but it
// turns out that most NEON instructions have multi-cycle latency, which in
// the end outweighs any gains over the generic implementation.
if #[cfg(all(
target_feature = "sse2",
any(target_arch = "x86", target_arch = "x86_64"),
not(miri),
))] {
mod sse2;
use sse2 as imp;
} else if #[cfg(all(
target_arch = "aarch64",
target_feature = "neon",
// NEON intrinsics are currently broken on big-endian targets.
// See https://github.com/rust-lang/stdarch/issues/1484.
target_endian = "little",
not(miri),
))] {
mod neon;
use neon as imp;
} else {
mod generic;
use generic as imp;
}
}
pub(crate) use self::imp::Group;
pub(super) use self::imp::{
BitMaskWord, NonZeroBitMaskWord, BITMASK_ITER_MASK, BITMASK_MASK, BITMASK_STRIDE,
};
9 changes: 3 additions & 6 deletions src/raw/neon.rs → src/control/group/neon.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
use super::bitmask::BitMask;
use super::Tag;
use super::super::{BitMask, Tag};
use core::arch::aarch64 as neon;
use core::mem;
use core::num::NonZeroU64;
Expand Down Expand Up @@ -52,8 +51,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn load_aligned(ptr: *const Tag) -> Self {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
Group(neon::vld1_u8(ptr.cast()))
}

Expand All @@ -62,8 +60,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn store_aligned(self, ptr: *mut Tag) {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
neon::vst1_u8(ptr.cast(), self.0);
}

Expand Down
9 changes: 3 additions & 6 deletions src/raw/sse2.rs → src/control/group/sse2.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
use super::bitmask::BitMask;
use super::Tag;
use super::super::{BitMask, Tag};
use core::mem;
use core::num::NonZeroU16;

Expand Down Expand Up @@ -58,8 +57,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn load_aligned(ptr: *const Tag) -> Self {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
Group(x86::_mm_load_si128(ptr.cast()))
}

Expand All @@ -68,8 +66,7 @@ impl Group {
#[inline]
#[allow(clippy::cast_ptr_alignment)]
pub(crate) unsafe fn store_aligned(self, ptr: *mut Tag) {
// FIXME: use align_offset once it stabilizes
debug_assert_eq!(ptr as usize & (mem::align_of::<Self>() - 1), 0);
debug_assert_eq!(ptr.align_offset(mem::align_of::<Self>()), 0);
x86::_mm_store_si128(ptr.cast(), self.0);
}

Expand Down
10 changes: 10 additions & 0 deletions src/control/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
mod bitmask;
mod group;
mod tag;

use self::bitmask::BitMask;
pub(crate) use self::{
bitmask::BitMaskIter,
group::Group,
tag::{Tag, TagSliceExt},
};
83 changes: 83 additions & 0 deletions src/control/tag.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
use core::{fmt, mem};

/// Single tag in a control group.
#[derive(Copy, Clone, PartialEq, Eq)]
#[repr(transparent)]
pub(crate) struct Tag(pub(super) u8);
impl Tag {
/// Control tag value for an empty bucket.
pub(crate) const EMPTY: Tag = Tag(0b1111_1111);

/// Control tag value for a deleted bucket.
pub(crate) const DELETED: Tag = Tag(0b1000_0000);

/// Checks whether a control tag represents a full bucket (top bit is clear).
#[inline]
pub(crate) const fn is_full(self) -> bool {
self.0 & 0x80 == 0
}

/// Checks whether a control tag represents a special value (top bit is set).
#[inline]
pub(crate) const fn is_special(self) -> bool {
self.0 & 0x80 != 0
}

/// Checks whether a special control value is EMPTY (just check 1 bit).
#[inline]
pub(crate) const fn special_is_empty(self) -> bool {
debug_assert!(self.is_special());
self.0 & 0x01 != 0
}

/// Creates a control tag representing a full bucket with the given hash.
#[inline]
#[allow(clippy::cast_possible_truncation)]
pub(crate) const fn full(hash: u64) -> Tag {
// Constant for function that grabs the top 7 bits of the hash.
const MIN_HASH_LEN: usize = if mem::size_of::<usize>() < mem::size_of::<u64>() {
mem::size_of::<usize>()
} else {
mem::size_of::<u64>()
};

// Grab the top 7 bits of the hash. While the hash is normally a full 64-bit
// value, some hash functions (such as FxHash) produce a usize result
// instead, which means that the top 32 bits are 0 on 32-bit platforms.
// So we use MIN_HASH_LEN constant to handle this.
let top7 = hash >> (MIN_HASH_LEN * 8 - 7);
Tag((top7 & 0x7f) as u8) // truncation
}
}
impl fmt::Debug for Tag {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
if self.is_special() {
if self.special_is_empty() {
f.pad("EMPTY")
} else {
f.pad("DELETED")
}
} else {
f.debug_tuple("full").field(&(self.0 & 0x7F)).finish()
}
}
}
Comment on lines +52 to +64
Copy link
Contributor Author

@clarfonthey clarfonthey Oct 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to replace the derived debug to help me debug the code I'm going to add in a future PR, so it's a bit easier to read. I figure that if we care about the additional compile time added by this one debug impl so much we can cfg(debug_assertions) gate it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a problem for compilation time since it's not a generic function.


/// Extension trait for slices of tags.
pub(crate) trait TagSliceExt {
/// Fills the control with the given tag.
fn fill_tag(&mut self, tag: Tag);

/// Clears out the control.
#[inline]
fn fill_empty(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs #[inline].

self.fill_tag(Tag::EMPTY)
}
}
impl TagSliceExt for [Tag] {
#[inline]
fn fill_tag(&mut self, tag: Tag) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs #[inline].

// SAFETY: We have access to the entire slice, so, we can write to the entire slice.
unsafe { self.as_mut_ptr().write_bytes(tag.0, self.len()) }
}
}
2 changes: 2 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,9 @@ doc_comment::doctest!("../README.md");
#[macro_use]
mod macros;

mod control;
mod raw;
mod util;

mod external_trait_impls;
mod map;
Expand Down
Loading
Loading