[GEN][ZH] Implement interlocked compat helper functions#887
[GEN][ZH] Implement interlocked compat helper functions#887Mauller wants to merge 1 commit intoTheSuperHackers:mainfrom
Conversation
There was a problem hiding this comment.
Can you try what chatgpt has suggested?
short AtomicIncrement16(volatile short* ptr) {
__asm {
mov ecx, ptr ; Load address of the value
mov ax, 1 ; Value to add
lock xadd [ecx], ax ; Atomically add and exchange
inc ax ; ax = original + 1
movzx eax, ax ; Zero extend to return a clean value
}
}There was a problem hiding this comment.
tried a few different variations but can't get these to build in VS22 to test. But i don't think modern VS supports the type of ASM anyway
There was a problem hiding this comment.
Does it build in VC6?
it appears to from the pipeline passing.
There was a problem hiding this comment.
You could test it by launching 2 threads, have them race with counters, and then check if interlocked increment causes no wrong values, while non-interlocked increment causes wrong values. Edit: that is perhaps easier said than done. I am not sure what the verify condition is.
There was a problem hiding this comment.
Chat GPT
To test that your atomic 16-bit increment works correctly, you need to simulate a concurrent multithreaded environment where multiple threads increment the same shared counter. If the operation is truly atomic:
- The final counter value should equal the total number of increments.
- No increments should be lost due to race conditions.
Here's a complete example using Win32 threads (compatible with VC6) to test atomic increment:
✅ Test Program for AtomicIncrement16
#include <windows.h>
#include <stdio.h>
volatile short counter = 0; // Shared counter
const int NUM_THREADS = 10;
const int INCREMENTS_PER_THREAD = 10000;
DWORD WINAPI ThreadProc(LPVOID lpParam) {
for (int i = 0; i < INCREMENTS_PER_THREAD; ++i) {
__asm {
mov ecx, offset counter
mov ax, 1
lock xadd [ecx], ax
}
}
return 0;
}
int main() {
HANDLE threads[NUM_THREADS];
for (int i = 0; i < NUM_THREADS; ++i) {
threads[i] = CreateThread(NULL, 0, ThreadProc, NULL, 0, NULL);
}
WaitForMultipleObjects(NUM_THREADS, threads, TRUE, INFINITE);
for (int i = 0; i < NUM_THREADS; ++i) {
CloseHandle(threads[i]);
}
printf("Expected: %d\n", NUM_THREADS * INCREMENTS_PER_THREAD);
printf("Actual: %d\n", counter);
return 0;
}🧪 What This Does:
- Launches
NUM_THREADSthreads. - Each thread performs
INCREMENTS_PER_THREADatomic increments. - Total expected:
NUM_THREADS * INCREMENTS_PER_THREAD. - Final value of
counteris printed.
✅ Expected Output (if atomicity is correct):
Expected: 100000
Actual: 100000
If the increment is not atomic, you will see a lower actual value, because increments would be lost due to race conditions.
🛠 Notes:
volatileprevents compiler optimization but does not ensure atomicity — your inlinelock xadddoes.- VC6 and older CPUs (Pentium/486/386) can run this just fine if the hardware supports
LOCK XADD(386+). - For older 8086-like processors, you'd need to simulate atomicity with interrupt disabling or similar — not practical for multitasking systems.
Let me know if you want to test this using lock inc instead, or if you're targeting SMP/multicore emulators or real hardware!
There was a problem hiding this comment.
Did you increase the number of 'INCREMENTS_PER_THREAD' later on?
'counter' is a short, so it's likely to have a 16 bit width. So the counter would overflow (and since it's a signed type, that'd be undefined behavior, but that's beside the point).
I don't understand how your output can be 100'000.
There was a problem hiding this comment.
Will have a look tomorrow when i have some more time, will need to properly setup VC6 locally too.
f236da0 to
6f96528
Compare
|
Updated based on suggestions. |
6f96528 to
62edff6
Compare
|
Is this change still relevant or will it be superseeded? |
From the conversations the other day it's likely to be superseded by some kind of atomic_compat type header. I have not gotten to it yet while investigating these replay crashes etc. |
This PR implements operating system, cross compatible, interlocked increment and decrement functions.
This is a follow up PR from unifying ASCII string for future use when replacing critical sections.
NOTE: For VC6 there are asm based functions for 16 and 64bit instead of relying on casting to the 32bit variant.