-
Notifications
You must be signed in to change notification settings - Fork 72
Surround the GIL with a ReentrantLock on the Julia side. #637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
mkitti
wants to merge
7
commits into
JuliaPy:main
Choose a base branch
from
mkitti:mkitti-jl-gil-lock
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
943473f
Surround the GIL with a ReentrantLock on the Julia side.
mkitti 3b6caaa
Change explanation to Jameson Nash (vtjnash)'s reasoning
mkitti beea5f2
Add an exclusive parameter to GIL.lock and GIL.@lock
mkitti 2f795dc
Do not lock on __init__
mkitti 53867da
Add initial draft of GlobalInterpreterLock
mkitti 58dba62
First GlobalInterpreterLock that works
mkitti 3b97f9c
Lock condvar to wait, restore sticky after notify
mkitti File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,3 +7,7 @@ dist/ | |
.CondaPkg/ | ||
/jltest.* | ||
uv.lock | ||
|
||
# pixi environments | ||
.pixi | ||
*.egg-info |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
""" | ||
TaskState | ||
|
||
When a `Task` acquires the GIL, save the GIL state and the stickiness of the | ||
`Task` since we will force the `Task` to be sticky. We need to restore the GIL | ||
state on release of the GIL via `C.PyGILState_Release`. | ||
""" | ||
struct TaskState | ||
task::Task | ||
sticky::Bool # original stickiness of the task | ||
state::C.PyGILState_STATE | ||
end | ||
|
||
""" | ||
TaskStack | ||
|
||
For each thread the `TaskStack` maintains a first-in-last-out list of tasks | ||
as well as the GIL state and their stickiness upon entering the stack. This | ||
forces tasks to unlock the GIL in the reverse order of which they locked it. | ||
""" | ||
struct TaskStack | ||
stack::Vector{TaskState} | ||
count::IdDict{Task,Int} | ||
condvar::Threads.Condition | ||
function TaskStack() | ||
return new(TaskState[], IdDict{Task,Int}(), Threads.Condition()) | ||
end | ||
end | ||
function Base.last(task_stack::TaskStack)::Task | ||
return last(task_stack.stack).task | ||
end | ||
function Base.push!(task_stack::TaskStack, task::Task) | ||
original_sticky = task.sticky | ||
# The task should not migrate threads while acquiring or holding the GIL | ||
task.sticky = true | ||
gil_state = C.PyGILState_Ensure() | ||
|
||
# Save the stickiness and state for when we release | ||
state = TaskState(task, original_sticky, gil_state) | ||
push!(task_stack.stack, state) | ||
|
||
# Increment the count for this task | ||
count = get(task_stack.count, task, 0) | ||
task_stack.count[task] = count + 1 | ||
|
||
return task_stack | ||
end | ||
function Base.pop!(task_stack::TaskStack)::Task | ||
state = pop!(task_stack.stack) | ||
task = state.task | ||
sticky = state.sticky | ||
gil_state = state.state | ||
|
||
# Decrement the count for this task | ||
count = task_stack.count[task] - 1 | ||
if count == 0 | ||
# If 0, remove it from the key set | ||
pop!(task_stack.count, task) | ||
else | ||
task_stack.count[task] = count | ||
end | ||
|
||
C.PyGILState_Release(gil_state) | ||
|
||
Base.lock(task_stack.condvar) do | ||
notify(task_stack.condvar) | ||
end | ||
|
||
# Restore sticky state after releasing the GIL | ||
task.sticky = sticky | ||
|
||
return task | ||
end | ||
Base.in(task::Task, task_stack::TaskStack) = haskey(task_stack.count) | ||
Base.isempty(task_stack::TaskStack) = isempty(task_stack.stack) | ||
|
||
if !isdefined(Base, :OncePerThread) | ||
|
||
const PerThreadLock = Base.ThreadSynchronizer() | ||
|
||
# OncePerThread is implemented in full in Julia 1.12 | ||
# This implementation is meant for compatibility with Julia 1.10 and 1.11. | ||
# Using Julia 1.12 is recommended. | ||
mutable struct OncePerThread{T,F} <: Function | ||
@atomic xs::Dict{Int, T} # values | ||
@atomic ss::Dict{Int, UInt8} # states: 0=initial, 1=hasrun, 2=error, 3==concurrent | ||
const initializer::F | ||
function OncePerThread{T,F}(initializer::F) where {T,F} | ||
nt = Threads.maxthreadid() | ||
return new{T,F}(Dict{Int,T}(), Dict{Int,UInt8}(), initializer) | ||
end | ||
end | ||
OncePerThread{T}(initializer::Type{U}) where {T, U} = OncePerThread{T,Type{U}}(initializer) | ||
(once::OncePerThread{T,F})() where {T,F} = once[Threads.threadid()] | ||
function Base.getindex(once::OncePerThread, tid::Integer) | ||
tid = Int(tid) | ||
ss = @atomic :acquire once.ss | ||
xs = @atomic :monotonic once.xs | ||
|
||
if haskey(ss, tid) && ss[tid] == 1 | ||
return xs[tid] | ||
end | ||
|
||
Base.lock(PerThreadLock) | ||
try | ||
state = get(ss, tid, 0) | ||
if state == 0 | ||
xs[tid] = once.initializer() | ||
ss[tid] = 1 | ||
end | ||
finally | ||
Base.unlock(PerThreadLock) | ||
end | ||
return xs[tid] | ||
end | ||
end | ||
|
||
""" | ||
GlobalInterpreterLock | ||
|
||
Provides a thread aware reentrant lock around Python's interpreter lock that | ||
ensures that `Task`s acquiring the lock stay on the same thread. | ||
""" | ||
struct GlobalInterpreterLock <: Base.AbstractLock | ||
lock_owners::OncePerThread{TaskStack} | ||
function GlobalInterpreterLock() | ||
return new(OncePerThread{TaskStack}(TaskStack)) | ||
end | ||
end | ||
function Base.lock(gil::GlobalInterpreterLock) | ||
push!(gil.lock_owners(), current_task()) | ||
return nothing | ||
end | ||
function Base.unlock(gil::GlobalInterpreterLock) | ||
lock_owner::TaskStack = gil.lock_owners() | ||
last_owner::Task = if isempty(lock_owner) | ||
current_task() | ||
else | ||
last(lock_owner) | ||
end | ||
while last_owner != current_task() | ||
if istaskdone(last_owner) && !isempty(lock_owner) | ||
# Last owner is done and unable to unlock the GIL | ||
pop!(lock_owner) | ||
error("Unlock from the wrong task. The Task that owned the GIL is done and did not unlock the GIL: $(last_owner)") | ||
else | ||
# This task does not own the GIL. Wait to unlock the GIL until | ||
# another task successfully unlocks the GIL. | ||
Base.lock(lock_owner.condvar) do | ||
wait(lock_owner.condvar) | ||
end | ||
end | ||
last_owner = if isempty(lock_owner) | ||
current_task() | ||
else | ||
last(lock_owner) | ||
end | ||
end | ||
if isempty(lock_owner) | ||
error("Unlock from wrong task: $(current_task). No tasks on this thread own the lock.") | ||
else | ||
task = pop!(lock_owner) | ||
end | ||
@assert task == current_task() | ||
return nothing | ||
end | ||
function Base.islocked(gil::GlobalInterpreterLock) | ||
return any(!isempty(gil.lock_owners[thread_index]) for thread_index in 1:Threads.maxthreadid()) | ||
end | ||
function haslock(gil::GlobalInterpreterLock, task::Task) | ||
lock_owner::TaskStack = gil.lock_owners() | ||
if isempty(lock_owner) | ||
return false | ||
end | ||
return last(lock_owner)::Task == task | ||
end | ||
|
||
const _GIL = GlobalInterpreterLock() |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locking on
__init__
is probably a bad idea even though we are technically holding the GIL since this will disable finalizers.Rather we should consider invoking
lock
while executing other calls.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disabling this apparently allowed the python tests to succeed.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After sleeping on this, I propose you make the following changes to make this fix work:
in(lockowners().set, current_task)
for safety (forcing the user to make explicit GIL usage on each Task with a call to lock on each Task)?current_task().sticky = true
to prevent it migrating to a thread that is not holding the GIL.__init__
is dangerous since that doesn't run on the main Task in some user applications even though it always does so during your CI tests. You might just want to add some special exemptions for when current_task() == root_task(0)?Does that all sound logical and reasonable? My proposal already went through a few drafts as I tried to make it seem straightforward enough to explain and use, so it might not be the only way to deal with some of these constraints, and I'm probably forgetting at least some important details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OncePerThread
is to be introduced in Julia 1.12 which has yet to be released at the time of this writing. Is there a lesser version of this that could be implemented prior to Julia 1.12, preferably to Julia 1.10 LTS?