Doc erase and prepare_rehash_in_place #411


Merged
merged 1 commit into rust-lang:master on Mar 29, 2023

Conversation

JustForFun88
Contributor

@JustForFun88 JustForFun88 commented Mar 2, 2023

Also updated outdated comments and fixed broken links. For example, the old comment inside the `erase` function was very confusing (or I didn't understand it).

The `self.buckets() < Group::WIDTH` check inside the `prepare_rehash_in_place` function is marked `unlikely`, since it is not possible to have tombstones in tables smaller than the group width. This is due to two conditions:

  1. Inside the `erase` function, `index_before = index.wrapping_sub(Group::WIDTH) & self.bucket_mask` equals `index` for all tables with at most `Group::WIDTH` buckets (proved by simple iteration, see the test below).
  2. When `self.buckets() < Group::WIDTH`, there is always at least one empty slot in the trailing control bytes, due to the byte replication performed by the `set_ctrl` function.

Based on the above, when `self.buckets() < Group::WIDTH`, these two lines in the `erase` function

```rust
let empty_before = Group::load(self.ctrl(index_before)).match_empty();
let empty_after = Group::load(self.ctrl(index)).match_empty();
```

load the same group, which has at least one empty slot in the trailing control bytes even if the map was completely full and there were no empty slots at all (`self.items == self.buckets()`). That is, `empty_before.leading_zeros() + empty_after.trailing_zeros() < Group::WIDTH` for any table where `self.buckets() < Group::WIDTH`.
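Condition 2 can be illustrated with a small simulation. The sketch below mimics the control-byte layout and the mirroring that `set_ctrl` performs; the `EMPTY` value, the `GROUP_WIDTH` constant, and the mirrored-index formula are my reconstruction of hashbrown's internals, not a verbatim copy, so treat them as assumptions. It shows that even when every bucket of a table with `buckets < Group::WIDTH` is full, any group loaded at an in-range index still sees at least one `EMPTY` byte:

```rust
// Sketch (reconstruction of hashbrown's control-byte layout, not verbatim):
// byte replication for tables smaller than the group width.
const EMPTY: u8 = 0b1111_1111; // hashbrown's EMPTY control byte
const GROUP_WIDTH: usize = 16; // e.g. the SSE2 group width

/// Mimics `set_ctrl`: writes the byte at `index` and at its mirrored
/// position inside the trailing group.
fn set_ctrl(ctrl: &mut [u8], bucket_mask: usize, index: usize, value: u8) {
    let index2 = (index.wrapping_sub(GROUP_WIDTH) & bucket_mask) + GROUP_WIDTH;
    ctrl[index] = value;
    ctrl[index2] = value;
}

fn main() {
    const FULL: u8 = 0x01; // any byte with the top bit clear counts as FULL
    for buckets in [1usize, 2, 4, 8] {
        // `buckets` real control bytes plus GROUP_WIDTH trailing bytes.
        let mut ctrl = vec![EMPTY; buckets + GROUP_WIDTH];
        // Fill every bucket, as if `self.items == self.buckets()`.
        for index in 0..buckets {
            set_ctrl(&mut ctrl, buckets - 1, index, FULL);
        }
        // A group loaded at any in-range index still covers the bytes at
        // positions `buckets..GROUP_WIDTH`, which were never written.
        for index in 0..buckets {
            let group = &ctrl[index..index + GROUP_WIDTH];
            let empties = group.iter().filter(|&&b| b == EMPTY).count();
            assert_eq!(empties, GROUP_WIDTH - buckets);
        }
    }
}
```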

P.S. Having written all of the above, I now sit and wonder whether I should have used `debug_assert!` instead of `unlikely` 😄.

```rust
fn main() {
    let buckets_array: [usize; 3] = [1, 2, 4];
    let group_widths: [usize; 3] = [4, 8, 16];
    for group_width in group_widths {
        for buckets in buckets_array {
            let bucket_mask = buckets - 1;
            for index in 0..buckets {
                let index_before = index.wrapping_sub(group_width) & bucket_mask;
                assert_eq!(index, index_before);
            }
        }
    }

    let buckets_array: [usize; 4] = [1, 2, 4, 8];
    let group_widths: [usize; 2] = [8, 16];
    for group_width in group_widths {
        for buckets in buckets_array {
            let bucket_mask = buckets - 1;
            for index in 0..buckets {
                let index_before = index.wrapping_sub(group_width) & bucket_mask;
                assert_eq!(index, index_before);
            }
        }
    }

    let buckets_array: [usize; 5] = [1, 2, 4, 8, 16];
    let group_width: usize = 16;
    for buckets in buckets_array {
        let bucket_mask = buckets - 1;
        for index in 0..buckets {
            let index_before = index.wrapping_sub(group_width) & bucket_mask;
            assert_eq!(index, index_before);
        }
    }
}
```
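The exhaustive check above can also be argued algebraically: every valid table size is a power of two, so whenever `buckets <= group_width`, `buckets` divides `group_width`, and `group_width & bucket_mask == 0`. The wrapping subtraction is therefore invisible under the mask. A minimal sketch of that reasoning (the loop bounds are mine, chosen to cover the same cases as the test above):

```rust
fn main() {
    for group_width in [4usize, 8, 16] {
        let mut buckets = 1usize;
        while buckets <= group_width {
            let bucket_mask = buckets - 1;
            // `group_width` is a multiple of `buckets`...
            assert_eq!(group_width % buckets, 0);
            // ...so masking it away yields zero...
            assert_eq!(group_width & bucket_mask, 0);
            // ...hence subtracting it (with wrapping) does not change
            // the value of `index & bucket_mask`, which is `index` itself.
            for index in 0..buckets {
                assert_eq!(index.wrapping_sub(group_width) & bucket_mask, index);
            }
            buckets *= 2;
        }
    }
}
```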

@Amanieu
Member

Amanieu commented Mar 24, 2023

@bors r+

@bors
Contributor

bors commented Mar 24, 2023

📌 Commit 50c4d35 has been approved by Amanieu

It is now in the queue for this repository.

@bors
Contributor

bors commented Mar 24, 2023

⌛ Testing commit 50c4d35 with merge e7d606e...

bors added a commit that referenced this pull request Mar 24, 2023
Doc `erase` and `prepare_rehash_in_place`

@bors
Contributor

bors commented Mar 24, 2023

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing e7d606e to master...

@bors
Contributor

bors commented Mar 24, 2023

👀 Test was successful, but fast-forwarding failed: 422 Update is not a fast forward

@Amanieu
Member

Amanieu commented Mar 29, 2023

@bors r+

@bors
Contributor

bors commented Mar 29, 2023

💡 This pull request was already approved, no need to approve it again.

  • This pull request previously failed. You should add more commits to fix the bug, or use retry to trigger a build again.

@bors
Contributor

bors commented Mar 29, 2023

📌 Commit 50c4d35 has been approved by Amanieu

It is now in the queue for this repository.

@bors
Contributor

bors commented Mar 29, 2023

⌛ Testing commit 50c4d35 with merge 1a6b6e9...

bors added a commit that referenced this pull request Mar 29, 2023
Doc `erase` and `prepare_rehash_in_place`

@bors
Contributor

bors commented Mar 29, 2023

💔 Test failed - checks-actions

@Amanieu
Member

Amanieu commented Mar 29, 2023

@bors retry

@bors
Contributor

bors commented Mar 29, 2023

⌛ Testing commit 50c4d35 with merge dcb1de1...

@bors
Contributor

bors commented Mar 29, 2023

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing dcb1de1 to master...

@bors bors merged commit dcb1de1 into rust-lang:master Mar 29, 2023
@JustForFun88 JustForFun88 deleted the erase_and_rehash branch April 16, 2023 18:03
3 participants