Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

Closed
wants to merge 1 commit into from

Conversation

russgray
Copy link

@russgray russgray commented Feb 4, 2015

There are two overloads of ConcurrentDictionary.GetOrAdd. One accepts a
factory method to generate the value if the key is not found, and calls
TryGetValue for the existence check which is a lock-free operation.

The other overload, however, does not call TryGetValue - it simply calls
straight through to TryAddInternal, which will obtain a bucket lock before it
does the existence check
. This violates the documented claim of
ConcurrentDictionary of lock-free reads. This commit simply adds the same
TryGetValue check, making the two overloads consistent.

acquiring lock.

There are two overloads of ConcurrentDictionary.GetOrAdd. One accepts a
factory method to generate the value if the key is not found, and calls
TryGetValue for the existence check which is a lock-free operation.

The other overload, however, does not call TryGetValue - it simply calls
straight through to TryAddInternal, which will obtain a bucket lock *before it
does the existence check*. This violates the documented claim of
ConcurrentDictionary of lock-free reads. This commit simply adds the same
TryGetValue check, making the two overloads consistent.
@stephentoub
Copy link
Member

Thanks, @russgray. The fix LGTM. Stylewise, I'd prefer to see this simplified to avoid multiple return statements for the same thing, e.g. instead of:

TValue resultingValue;
if (TryGetValue(key, out resultingValue))
{
    return resultingValue;
}
TryAddInternal(key, value, false, true, out resultingValue);
return resultingValue;

doing:

TValue resultingValue;
if (!TryGetValue(key, out resultingValue))
{
    TryAddInternal(key, value, false, true, out resultingValue);
}
return resultingValue;

I realize you were just copying the style that was already there for the other overload. Want to change both?

@stephentoub
Copy link
Member

Actually, scratch that. This should be on the CoreFX repository, not in CoreCLR (I didn't notice which repository we were talking about). Can you close this PR and reopen an appropriate one over on CoreFX for the ConcurrentDictionary there rather than the ConcurrentDictionary in mscorlib?

(Note that the ConcurrentDictionary in CoreFX has evolved from the one here, so you'll see some slightly different code involving TryGetValueInternal in order to avoid duplicate hashcode generation.)

@russgray
Copy link
Author

russgray commented Feb 4, 2015

No problem, I'll duplicate the change with the style tweak on the other repo

@russgray russgray closed this Feb 4, 2015
@stephentoub
Copy link
Member

Thanks.

kyulee1 added a commit to kyulee1/coreclr that referenced this pull request Mar 1, 2016
For 4 byte integer multiplication, JIT emits a bad-code which is valid
only for 8 byte (i8) multiplication.
For the fix, I use ```smull```(signed)/```umull```(unsigned) instructions
that contain 8 byte results from 4 byte by 4 byte multiplication.
So only one multiplication is needed instead of two for this case.
By simply shifting the results, we could get the upper results that is
used for sign check to detect overflow.
Similar transform is made for the unsigned case. But for this case, we
might trash the destination register by the extra register,
I added register check to optionally emit additional multiplication. I
guess we might revisit this to allocate two register like the signed one
to ensure only one multiplication for 4 bye case, but I think we should
follow this up later if needed.

Before
smulh   w10, w8, w9  --> Incorrect use: smulh is for obtaining the upper
bits [127:64] of x8 * x9
mul     w8, w8, w9
cmp     x10, x8, ASR dotnet#63

After
smull   x8, x8, x9   --> x8 = w8 * w9
lsr     x10, x8, dotnet#32 --> shift the upper bit of x8 to get sign bit
cmp     w10, w8, ASR #31 --> check sign bit
kyulee1 added a commit to kyulee1/coreclr that referenced this pull request Mar 2, 2016
For 4 byte integer multiplication, JIT emits a bad-code which is valid
only for 8 byte (i8) multiplication.
For the fix, I use ```smull```(signed)/```umull```(unsigned) instructions
that contain 8 byte results from 4 byte by 4 byte multiplication.
So only one multiplication is needed instead of two for this case.
By simply shifting the results, we could get the upper results that is
used to detect overflow.
Similar transform is made for the unsigned case.
Lower is also changed to reserve a register for overflow check.

Before
smulh   w10, w8, w9  --> Incorrect use: smulh is for obtaining the upper
bits [127:64] of x8 * x9
mul     w8, w8, w9
cmp     x10, x8, ASR dotnet#63

After
smull   x8, x8, x9   --> x8 = w8 * w9
lsr     x10, x8, dotnet#32 --> shift the upper bit of x8 to get sign bit
cmp     w10, w8, ASR #31 --> check sign bit
kyulee1 added a commit to kyulee1/coreclr that referenced this pull request Mar 2, 2016
For 4 byte integer multiplication, JIT emits a bad-code which is valid
only for 8 byte (i8) multiplication.
For the fix, I use ```smull```(signed)/```umull```(unsigned) instructions
that contain 8 byte results from 4 byte by 4 byte multiplication.
So only one multiplication is needed instead of two for this case.
By simply shifting the results, we could get the upper results that is
used to detect overflow.
Similar transform is made for the unsigned case.
Lower is also changed to reserve a register for overflow check.

Before
smulh   w10, w8, w9  --> Incorrect use: smulh is for obtaining the upper
bits [127:64] of x8 * x9
mul     w8, w8, w9
cmp     x10, x8, ASR dotnet#63

After
smull   x8, x8, x9   --> x8 = w8 * w9
lsr     x10, x8, dotnet#32 --> shift the upper bit of x8 to get sign bit
cmp     w10, w8, ASR #31 --> check sign bit
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants