Store initializing singletons in a separate thread local small map #4019
Conversation
runtime/src/main/cpp/Memory.cpp (outdated)
UpdateHeapRef(localLocation, object);
auto insertIt = memoryState->initializingSingletons.insert({location, object});
RuntimeCheck(insertIt.second, "object cannot be assigned twice into initializingSingletons");
addHeapRef(object);
Is this addHeapRef(object) actually required?
I'm not sure, but what's protecting this reference (and all that it references) from being garbage collected while (potentially long) initialization is taking place?
The allocation above actually creates a stack reference. It is alive at least until the return from the function.
runtime/src/main/cpp/Memory.cpp (outdated)
@@ -451,6 +451,10 @@ struct MemoryState {
  uint64_t allocSinceLastGcThreshold;
#endif // USE_GC

  // This map is expected to be small; std::map consumes less memory than
  // std::unordered_map and is just as efficient.
  KStdOrderedMap<ObjHeader**, ObjHeader*> initializingSingletons;
Does using a stack-like collection here have any effect on performance?
It does, actually: it outperforms std::map on the benchmarks from #4018. Compared to the current implementation, it's 5% faster on a single initialization and 10% slower on a nested initialization (with depth 500). I think it's a win overall, since deeply nested initialization should be very rare.
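For illustration only, here is a rough Kotlin sketch of what a "stack-like" collection means in this context; the real container lives in the C++ runtime, and the class name below is made up. Nested singleton initializations strictly nest, so entries are pushed and popped in LIFO order, and with only a handful of live entries a linear scan beats a tree-based map; only very deep nesting (the depth-500 benchmark) makes it lose.

class InitializingSingletonsSketch<K, V> {
    private val entries = ArrayList<Pair<K, V>>()

    fun push(key: K, value: V) {
        // Mirrors the RuntimeCheck above: the same location must not be inserted twice.
        check(entries.none { it.first == key }) { "already being initialized" }
        entries.add(key to value)
    }

    // Linear scan from the top: the innermost (most recent) initialization is found first.
    fun find(key: K): V? = entries.lastOrNull { it.first == key }?.second

    fun pop(key: K) {
        val last = entries.removeAt(entries.lastIndex)
        check(last.first == key) { "initializations must nest like a stack" }
    }
}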
import kotlin.native.concurrent.*

val aWorkerId = 2
Not sure this hard-coded worker id is always correct.
import kotlin.random.Random

object A {
    val a1 = Random.nextInt(100)
Using Random makes the test non-deterministic.
You mean that if a1 and a2 happened to be the same number, the test would fail to catch a potential bug? Fair enough. Is there another way to dynamically initialize a1 while ensuring the compiler won't ever optimize it away?
I see. Well, IIRC, the compiler doesn't optimize away the dynamic initialization even if the initializer is a literal. This can likely be verified with more assertions (checking that the values are zero in the initial state).
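If determinism is the concern, a minimal sketch of that shape could look like the following. The object and values are hypothetical stand-ins for the test's A; per the comment above, the initializer still runs dynamically even though it is a literal, and distinct literals rule out the "a1 accidentally equals a2" false pass. The "zero in the initial state" check would additionally have to be observed from another worker racing with the initialization, which this sketch does not model.

object A {
    // Literal initializers: deterministic, but still dynamic initialization on first access.
    val a1 = 17
    val a2 = 42
}

fun main() {
    // Distinct literals, so an accidental a1 == a2 cannot mask a bug.
    check(A.a1 == 17 && A.a2 == 42)
    check(A.a1 != A.a2)
}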
// Only proceed with initialization when all b workers have started executing.
while (aWorkerUnlocker.value < bWorkersCount) {}
// And now wait a bit, to increase the probability of races.
Worker.current.park(1000 * 1000L)
100 ms would be enough :)
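For reference, a minimal sketch of that gate with the shorter pause, wrapped in a hypothetical function; aWorkerUnlocker and bWorkersCount are the names from the snippet above, assumed here to be an AtomicInt and an Int. Worker.park takes microseconds, so 100 ms is 100 * 1000.

import kotlin.native.concurrent.*

val bWorkersCount = 3
val aWorkerUnlocker = AtomicInt(0) // each b worker increments this once it has started

fun awaitBWorkersAndPause() {
    // Only proceed with initialization once all b workers have started executing.
    while (aWorkerUnlocker.value < bWorkersCount) {}
    // Then wait a bit to increase the probability of races: 100 ms instead of 1 s.
    Worker.current.park(100 * 1000L)
}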
Force-pushed from 747fd09 to b419101.