RootedRc is not much faster than Arc #2

@sporksmith

Well that's unfortunate, since the whole point is to be faster than Arc.

Prime suspects are:

  • the hash set used to track which roots are locked by the current thread. If so, we could mitigate this with a faster hash function (stdlib's default, SipHash, favors HashDoS resistance over speed), or by supporting only one locked root per thread at a time, so that the tracker can be an Option instead of a HashSet.
  • thread-local access itself. I vaguely remember seeing somewhere that stdlib's thread locals may be slower than native ones. If this is the issue, we should be able to find or make an alternative that, e.g., just uses libc's native thread locals (assuming there's no fundamental safety issue preventing that).
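
To make the first suspect concrete, here is a minimal sketch of the two tracking strategies side by side. It is not the crate's actual code: `RootId`, `LOCKED_SET`, `LOCKED_ONE`, and all function names are hypothetical, and it assumes a root can be identified by a plain integer.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashSet;

// Hypothetical root identifier; the real crate may use a different type.
type RootId = u64;

thread_local! {
    // Set-based tracking: any number of roots may be locked per thread,
    // but every check hashes the id with the default hasher.
    static LOCKED_SET: RefCell<HashSet<RootId>> = RefCell::new(HashSet::new());
    // Option-based tracking: at most one locked root per thread, no hashing.
    static LOCKED_ONE: Cell<Option<RootId>> = Cell::new(None);
}

// Returns true if `id` was newly recorded as locked on this thread.
fn set_lock(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow_mut().insert(id))
}

fn set_is_locked(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow().contains(&id))
}

// Returns true if `id` was locked and is now released.
fn set_unlock(id: RootId) -> bool {
    LOCKED_SET.with(|s| s.borrow_mut().remove(&id))
}

// Returns false if this thread already has a root recorded as locked.
fn one_lock(id: RootId) -> bool {
    LOCKED_ONE.with(|c| {
        if c.get().is_some() {
            return false;
        }
        c.set(Some(id));
        true
    })
}

// A single integer comparison replaces the hash lookup.
fn one_is_locked(id: RootId) -> bool {
    LOCKED_ONE.with(|c| c.get() == Some(id))
}

// Returns true if `id` was the locked root and is now released.
fn one_unlock(id: RootId) -> bool {
    LOCKED_ONE.with(|c| {
        if c.get() == Some(id) {
            c.set(None);
            true
        } else {
            false
        }
    })
}
```

Both variants still go through a thread-local access on every check, so this only isolates the hashing cost. If the second suspect dominates, neither variant would help much.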

Probable next step is to run perf to verify where the time is actually being spent.

Here are the current numbers:

$ cargo bench
...
RootedRc::clone 1000    time:   [28.890 us 29.105 us 29.332 us]                                  
                        change: [-1.7329% -1.1690% -0.5940%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

Arc::clone 1000         time:   [8.0026 us 8.0184 us 8.0347 us]                             
                        change: [-0.6287% -0.2630% +0.1302%] (p = 0.20 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

EDIT: deleted invalid Rc benchmark (operations under test were optimized away)
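
For reference, one common way to keep a clone-and-drop loop from being optimized away is `std::hint::black_box`. The sketch below is an assumption about how an Rc baseline could be written, not the deleted benchmark itself; `clone_rc_n` is a hypothetical helper, and `black_box` is documented as best-effort, so the generated code is still worth inspecting.

```rust
use std::hint::black_box;
use std::rc::Rc;
use std::time::{Duration, Instant};

// Clones and drops `rc` `n` times and returns the elapsed wall-clock time.
// Each input and each clone is passed through `black_box` to discourage the
// compiler from eliding the reference-count updates.
fn clone_rc_n(rc: &Rc<u64>, n: usize) -> Duration {
    let start = Instant::now();
    for _ in 0..n {
        let cloned = black_box(Rc::clone(black_box(rc)));
        drop(cloned);
    }
    start.elapsed()
}
```

Calling `clone_rc_n(&Rc::new(0), 1000)` would mirror the 1000-clone shape of the benchmarks above, though a hand-rolled timer like this is much noisier than the statistics reported by `cargo bench`.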
