Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

russgray · 2015-02-04T15:24:39Z

There are two overloads of ConcurrentDictionary.GetOrAdd. One accepts a
factory method to generate the value if the key is not found, and calls
TryGetValue for the existence check which is a lock-free operation.

The other overload, however, does not call TryGetValue - it simply calls
straight through to TryAddInternal, which will obtain a bucket lock before it
does the existence check. This violates the documented claim of
ConcurrentDictionary of lock-free reads. This commit simply adds the same
TryGetValue check, making the two overloads consistent.

acquiring lock. There are two overloads of ConcurrentDictionary.GetOrAdd. One accepts a factory method to generate the value if the key is not found, and calls TryGetValue for the existence check which is a lock-free operation. The other overload, however, does not call TryGetValue - it simply calls straight through to TryAddInternal, which will obtain a bucket lock *before it does the existence check*. This violates the documented claim of ConcurrentDictionary of lock-free reads. This commit simply adds the same TryGetValue check, making the two overloads consistent.

stephentoub · 2015-02-04T15:31:02Z

Thanks, @russgray. The fix LGTM. Stylewise, I'd prefer to see this simplified to avoid multiple return statements for the same thing, e.g. instead of:

TValue resultingValue;
if (TryGetValue(key, out resultingValue))
{
    return resultingValue;
}
TryAddInternal(key, value, false, true, out resultingValue);
return resultingValue;

doing:

TValue resultingValue;
if (!TryGetValue(key, out resultingValue))
{
    TryAddInternal(key, value, false, true, out resultingValue);
}
return resultingValue;

I realize you were just copying the style that was already there for the other overload. Want to change both?

stephentoub · 2015-02-04T15:33:51Z

Actually, scratch that. This should be on the CoreFX repository, not in CoreCLR (I didn't notice which repository we were talking about). Can you close this PR and reopen an appropriate one over on CoreFX for the ConcurrentDictionary there rather than the ConcurrentDictionary in mscorlib?

(Note that the ConcurrentDictionary in CoreFX has evolved from the one here, so you'll see some slightly different code involving TryGetValueInternal in order to avoid duplicate hashcode generation.)

russgray · 2015-02-04T15:36:51Z

No problem, I'll duplicate the change with the style tweak on the other repo

stephentoub · 2015-02-04T15:38:01Z

Thanks.

For 4 byte integer multiplication, JIT emits a bad-code which is valid only for 8 byte (i8) multiplication. For the fix, I use ```smull```(signed)/```umull```(unsigned) instructions that contain 8 byte results from 4 byte by 4 byte multiplication. So only one multiplication is needed instead of two for this case. By simply shifting the results, we could get the upper results that is used for sign check to detect overflow. Similar transform is made for the unsigned case. But for this case, we might trash the destination register by the extra register, I added register check to optionally emit additional multiplication. I guess we might revisit this to allocate two register like the signed one to ensure only one multiplication for 4 bye case, but I think we should follow this up later if needed. Before smulh w10, w8, w9 --> Incorrect use: smulh is for obtaining the upper bits [127:64] of x8 * x9 mul w8, w8, w9 cmp x10, x8, ASR dotnet#63 After smull x8, x8, x9 --> x8 = w8 * w9 lsr x10, x8, dotnet#32 --> shift the upper bit of x8 to get sign bit cmp w10, w8, ASR #31 --> check sign bit

For 4 byte integer multiplication, JIT emits a bad-code which is valid only for 8 byte (i8) multiplication. For the fix, I use ```smull```(signed)/```umull```(unsigned) instructions that contain 8 byte results from 4 byte by 4 byte multiplication. So only one multiplication is needed instead of two for this case. By simply shifting the results, we could get the upper results that is used to detect overflow. Similar transform is made for the unsigned case. Lower is also changed to reserve a register for overflow check. Before smulh w10, w8, w9 --> Incorrect use: smulh is for obtaining the upper bits [127:64] of x8 * x9 mul w8, w8, w9 cmp x10, x8, ASR dotnet#63 After smull x8, x8, x9 --> x8 = w8 * w9 lsr x10, x8, dotnet#32 --> shift the upper bit of x8 to get sign bit cmp w10, w8, ASR #31 --> check sign bit

russgray closed this Feb 4, 2015

Alexx999 mentioned this pull request Jan 31, 2020

State policy on mscorlib dotnet/runtime#14085

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

Uh oh!

russgray commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

russgray commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

Uh oh!

Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

Call TryGetValue in overload of ConcurrentDictionary.GetOrAdd to avoid acquiring lock #63

Uh oh!

Conversation

russgray commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

russgray commented Feb 4, 2015

Uh oh!

stephentoub commented Feb 4, 2015

Uh oh!

Uh oh!