RootedRc is not much faster than Arc

Well that's unfortunate, since the whole point is to be faster than Arc.

Prime suspects are:

* the hash set used to track which roots are locked by the current thread. If so, we could mitigate with a faster hash function (stdlib's default, SipHash, trades speed for HashDoS resistance), or by only supporting one root to be unlocked at a time, so that the tracking state can be an `Option` instead of a `HashSet`.
* thread-local access itself. I vaguely remember seeing somewhere that stdlib's thread locals may be slower than native ones. If this is the issue, we should be able to find or make an alternative that e.g. just uses libc's native thread locals (assuming there's not some fundamental safety issue preventing that).
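
To make the first option concrete, here is a minimal sketch of the two tracking strategies side by side. It is not the crate's actual code: `RootId`, `LOCKED_SET`, `LOCKED_ONE`, and all the helper functions are hypothetical names made up for illustration, and it assumes a root can be identified by a plain integer.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashSet;

// Hypothetical root identifier; the real crate may represent roots differently.
type RootId = u64;

thread_local! {
    // Variant A: any number of roots may be locked by this thread.
    // Every check hashes the id with the default hasher (SipHash).
    static LOCKED_SET: RefCell<HashSet<RootId>> = RefCell::new(HashSet::new());

    // Variant B: at most one root may be locked by this thread.
    // Every check is a plain comparison, with no hashing or allocation.
    static LOCKED_ONE: Cell<Option<RootId>> = Cell::new(None);
}

// --- Variant A: HashSet-based tracking ---

/// Marks `id` as locked; returns false if it was already locked.
fn set_lock(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow_mut().insert(id))
}

/// Unmarks `id`; returns false if it was not locked.
fn set_unlock(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow_mut().remove(&id))
}

fn set_is_locked(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow().contains(&id))
}

// --- Variant B: Option-based tracking (single root at a time) ---

/// Marks `id` as locked; returns false if any root is already locked.
fn one_lock(id: RootId) -> bool {
    LOCKED_ONE.with(|c| {
        if c.get().is_some() {
            false
        } else {
            c.set(Some(id));
            true
        }
    })
}

/// Unmarks `id`; returns false if `id` is not the currently locked root.
fn one_unlock(id: RootId) -> bool {
    LOCKED_ONE.with(|c| {
        if c.get() == Some(id) {
            c.set(None);
            true
        } else {
            false
        }
    })
}

fn one_is_locked(id: RootId) -> bool {
    LOCKED_ONE.with(|c| c.get() == Some(id))
}
```

Note that variant B still goes through a thread local, so it would only help if hashing is the bottleneck rather than the thread-local access itself. It also changes the API contract: a thread could no longer hold two roots at once.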

Probable next step is to run `perf` to verify where the time is actually being spent.

Here are the current numbers; `RootedRc::clone` currently comes out roughly 3.6x slower than `Arc::clone`:
```
$ cargo bench
...
RootedRc::clone 1000    time:   [28.890 us 29.105 us 29.332 us]                                  
                        change: [-1.7329% -1.1690% -0.5940%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

Arc::clone 1000         time:   [8.0026 us 8.0184 us 8.0347 us]                             
                        change: [-0.6287% -0.2630% +0.1302%] (p = 0.20 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
```

EDIT: deleted invalid `Rc` benchmark (operations under test were optimized away)
