Tags: bjackman/linux

Tags

asi/next-2025-10-15

SQUASHME: mm: asi: keep ASI domains initialized until the mm_struct is destroyed

BJ: In this branch, this is a bugfix. Otherwise we're taking a mutex
from asi_destroy() when we are atomic.

Currently, ASI domains are initialized individually when needed, and
destroyed when no longer being used. Multiple initializations are
allowed, and an init_count is used to keep track of when the domain
should be destroyed.

This design allows for an ASI domain of a specific class to be
initialized, destroyed, and re-initialized again by the same process.
However, the common case is that once an ASI domain is initialized, it
will remain in use until the end of the process lifetime (or shortly
before then).

Remove this complexity by keeping initialized ASI domains alive until
the containing mm_struct is being destroyed (after mm->mm_count drops to
zero). __asi_destroy() is mostly emptied, since we no longer need to
increment the TLB gen to make sure the TLB is flushed if the ASI domain
is re-initialized. asi_destroy() is replaced with
asi_destroy_mm_state(), which destroys all ASI domains in an mm_struct.
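The lifetime rule described above can be sketched as a small user-space model. This is not the kernel implementation: struct mm_model, struct asi_model, ASI_MAX_NUM and the *_model() helpers are hypothetical names used only for illustration. The sketch assumes one slot per ASI class in the mm, an init that is idempotent (no init_count), and a single teardown that walks every slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ASI_MAX_NUM 4	/* hypothetical number of ASI classes per mm */

struct mm_model;

struct asi_model {
	struct mm_model *mm;	/* owning address space */
	bool initialized;
};

struct mm_model {
	struct asi_model asi[ASI_MAX_NUM];	/* stands in for mm->asi[*] */
};

/*
 * Initialize the domain for class_id, or return the already-live one.
 * No counter is needed: the domain stays alive until the mm goes away.
 */
static struct asi_model *asi_init_model(struct mm_model *mm, int class_id)
{
	struct asi_model *asi;

	if (class_id < 0 || class_id >= ASI_MAX_NUM)
		return NULL;

	asi = &mm->asi[class_id];
	if (!asi->initialized) {
		asi->mm = mm;
		asi->initialized = true;
	}
	return asi;
}

/*
 * Tear down every initialized domain in the mm. Meant to run exactly
 * once, after the last user of the mm is gone. Returns how many domains
 * were destroyed so callers (and tests) can observe the effect.
 */
static int asi_destroy_mm_state_model(struct mm_model *mm)
{
	int destroyed = 0;

	for (int i = 0; i < ASI_MAX_NUM; i++) {
		if (!mm->asi[i].initialized)
			continue;
		mm->asi[i].initialized = false;
		mm->asi[i].mm = NULL;
		destroyed++;
	}
	return destroyed;
}
```

In this model, repeated initialization of the same class hands back the same domain, and nothing is released until the single teardown call, which mirrors the simplification the commit is after.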

asi_destroy_mm_state() can be called from unsleepable contexts and
cannot take mm->asi_init_lock, so make sure asi_init() can only be
called from a process where current->mm == asi->mm. This guarantees that
we cannot race with asi_destroy_mm_state(), which is only executed after
all users of the mm_struct are gone.

Ideally, asi_destroy_mm_state() is cheap enough that it doesn't impact
the process exit path.

Keeping the initialized ASI domains around has two effects:

(a) In a following change these domains will be dynamically allocated. By
    delaying their destruction until the mm_struct is destroyed, we miss a
    chance to free their memory earlier. However, the size of struct asi
    is trivial, and the window between an ASI domain going out of use
    and the destruction of mm_struct is expected to be small.

(b) Keeping mm->asi[*] initialized when the ASI domain is no longer used
    means that we will unnecessarily flush the TLB in that ASI domain in
    asi_tlb_flush_one_user() -> asi_invpcid_nonsensitive_one(). However,
    this is probably fine because it is a single address flush, and the
    window between an ASI domain going out of use and the destruction of
    mm_struct is expected to be small.

The goal goes beyond code simplification. Incoming changes will
support context switching and exiting to userspace without exiting ASI
in some cases, which means that arbitrary kernel code can be run in an
ASI domain. This requires a protection mechanism to make sure that ASI
domains are not destroyed while they are being used or referenced.

Tying the lifetime of ASI domains to the mm_struct sidesteps this
problem. As long as a process is running in an ASI domain, it naturally
holds a ref to the containing mm_struct (through task->mm or
task->active_mm). This means that the ASI domain cannot be destroyed. To
enforce this, add a warning in asi_enter() if asi->mm is not the same as
current->mm.
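A minimal user-space sketch of the ownership check in asi_enter() follows. The struct layouts, the warn_count counter (a stand-in for a kernel WARN-style check) and the choice to refuse entry after warning are all assumptions made for illustration; the commit message only states that a warning is added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_model { int id; };

struct asi_model {
	struct mm_model *mm;		/* mm that owns this domain */
};

struct task_model {
	struct mm_model *mm;		/* stands in for current->mm */
	struct asi_model *curr_asi;	/* domain the task is running in */
};

static int warn_count;	/* stand-in for a kernel warning being raised */

/*
 * Enter an ASI domain only if it belongs to the caller's own mm. A task
 * that holds a reference to its mm also keeps that mm's domains alive,
 * so a matching mm means the domain cannot be destroyed underneath us.
 * Refusing entry on mismatch is an assumption of this sketch.
 */
static bool asi_enter_model(struct task_model *task, struct asi_model *asi)
{
	if (asi->mm != task->mm) {
		warn_count++;
		return false;
	}
	task->curr_asi = asi;
	return true;
}
```

A task whose mm matches the domain's owner enters normally; any other task trips the warning and stays outside the domain.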

This applies to ASI domains retrieved with asi_get_current() as long as
preemption is disabled. If preemption is enabled, task->active_mm may
change (e.g. for kthreads), so the ASI domain may be destroyed if the
containing mm_struct is destroyed. Add a comment to document this and a
warning to enforce it. Stop using asi_get_current() in
asi_in_nonsensitive() to avoid the warning if preemption is enabled.
asi_in_nonsensitive() is inherently racy and does not access the ASI
domain.
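The preemption rule can be modelled in the same hedged way. In the sketch below, struct cpu_model, its preempt_count field and the warnings counter are hypothetical; the model also assumes that asi_in_nonsensitive() only needs to test whether a current domain pointer is set, which is why it can skip the checked accessor.

```c
#include <assert.h>
#include <stddef.h>

struct asi_model { int class_id; };

struct cpu_model {
	int preempt_count;		/* > 0 means preemption is disabled */
	struct asi_model *curr_asi;	/* domain currently entered, if any */
	int warnings;			/* stand-in for kernel warnings */
};

/*
 * Checked accessor: the returned pointer is only safe to dereference
 * while preemption is disabled, so warn when it is not.
 */
static struct asi_model *asi_get_current_model(struct cpu_model *cpu)
{
	if (cpu->preempt_count == 0)
		cpu->warnings++;
	return cpu->curr_asi;
}

/*
 * Racy by design: only tests the pointer and never dereferences it, so
 * it reads the field directly and does not trigger the warning.
 */
static int asi_in_nonsensitive_model(const struct cpu_model *cpu)
{
	return cpu->curr_asi != NULL;
}
```

The point of the split is that only callers who intend to dereference the domain pay for, and are policed by, the preemption check.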

It also applies naturally to ASI domains retrieved through mm->asi[*],
assuming the retriever is naturally holding a ref to the mm_struct
before dereferencing it.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>

asi/next-2025-10-10

SQUASHME: mm: asi: keep ASI domains initialized until the mm_struct is destroyed

BJ: In this branch, this is a bugfix. Otherwise we're taking a mutex
from asi_destroy() when we are atomic.

Currently, ASI domains are initialized individually when needed, and
destroyed when no longer being used. Multiple initializations are
allowed, and an init_count is used to keep track of when the domain
should be destroyed.

This design allows for an ASI domain of a specific class to be
initialized, destroyed, and re-initialized again by the same process.
However, the common case is that once an ASI domain is initialized, it
will remain in use until the end of the process lifetime (or shortly
before then).

Remove this complexity by keeping initialized ASI domains alive until
the containing mm_struct is being destroyed (after mm->mm_count drops to
zero). __asi_destroy() is mostly emptied, since we no longer need to
increment the TLB gen to make sure the TLB is flushed if the ASI domain
is re-initialized. asi_destroy() is replaced with
asi_destroy_mm_state(), which destroys all ASI domains in an mm_struct.

asi_destroy_mm_state() can be called from unsleepable contexts and
cannot take mm->asi_init_lock, so make sure asi_init() can only be
called from a process where current->mm == asi->mm. This guarantees that
we cannot race with asi_destroy_mm_state(), which is only executed after
all users of the mm_struct are gone.

Ideally, asi_destroy_mm_state() is cheap enough that it doesn't impact
the process exit path.

Keeping the initialized ASI domains around has two effects:

(a) In a following change these domains will be dynamically allocated. By
    delaying their destruction until the mm_struct is destroyed, we miss a
    chance to free their memory earlier. However, the size of struct asi
    is trivial, and the window between an ASI domain going out of use
    and the destruction of mm_struct is expected to be small.

(b) Keeping mm->asi[*] initialized when the ASI domain is no longer used
    means that we will unnecessarily flush the TLB in that ASI domain in
    asi_tlb_flush_one_user() -> asi_invpcid_nonsensitive_one(). However,
    this is probably fine because it is a single address flush, and the
    window between an ASI domain going out of use and the destruction of
    mm_struct is expected to be small.

The goal goes beyond code simplification. Incoming changes will
support context switching and exiting to userspace without exiting ASI
in some cases, which means that arbitrary kernel code can be run in an
ASI domain. This requires a protection mechanism to make sure that ASI
domains are not destroyed while they are being used or referenced.

Tying the lifetime of ASI domains to the mm_struct sidesteps this
problem. As long as a process is running in an ASI domain, it naturally
holds a ref to the containing mm_struct (through task->mm or
task->active_mm). This means that the ASI domain cannot be destroyed. To
enforce this, add a warning in asi_enter() if asi->mm is not the same as
current->mm.

This applies to ASI domains retrieved with asi_get_current() as long as
preemption is disabled. If preemption is enabled, task->active_mm may
change (e.g. for kthreads), so the ASI domain may be destroyed if the
containing mm_struct is destroyed. Add a comment to document this and a
warning to enforce it. Stop using asi_get_current() in
asi_in_nonsensitive() to avoid the warning if preemption is enabled.
asi_in_nonsensitive() is inherently racy and does not access the ASI
domain.

It also applies naturally to ASI domains retrieved through mm->asi[*],
assuming the retriever is naturally holding a ref to the mm_struct
before dereferencing it.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>

asi/next-2025-10-09

SQUASHME: mm: asi: keep ASI domains initialized until the mm_struct is destroyed

BJ: In this branch, this is a bugfix. Otherwise we're taking a mutex
from asi_destroy() when we are atomic.

Currently, ASI domains are initialized individually when needed, and
destroyed when no longer being used. Multiple initializations are
allowed, and an init_count is used to keep track of when the domain
should be destroyed.

This design allows for an ASI domain of a specific class to be
initialized, destroyed, and re-initialized again by the same process.
However, the common case is that once an ASI domain is initialized, it
will remain in use until the end of the process lifetime (or shortly
before then).

Remove this complexity by keeping initialized ASI domains alive until
the containing mm_struct is being destroyed (after mm->mm_count drops to
zero). __asi_destroy() is mostly emptied, since we no longer need to
increment the TLB gen to make sure the TLB is flushed if the ASI domain
is re-initialized. asi_destroy() is replaced with
asi_destroy_mm_state(), which destroys all ASI domains in an mm_struct.

asi_destroy_mm_state() can be called from unsleepable contexts and
cannot take mm->asi_init_lock, so make sure asi_init() can only be
called from a process where current->mm == asi->mm. This guarantees that
we cannot race with asi_destroy_mm_state(), which is only executed after
all users of the mm_struct are gone.

Ideally, asi_destroy_mm_state() is cheap enough that it doesn't impact
the process exit path.

Keeping the initialized ASI domains around has two effects:

(a) In a following change these domains will be dynamically allocated. By
    delaying their destruction until the mm_struct is destroyed, we miss a
    chance to free their memory earlier. However, the size of struct asi
    is trivial, and the window between an ASI domain going out of use
    and the destruction of mm_struct is expected to be small.

(b) Keeping mm->asi[*] initialized when the ASI domain is no longer used
    means that we will unnecessarily flush the TLB in that ASI domain in
    asi_tlb_flush_one_user() -> asi_invpcid_restricted_one(). However,
    this is probably fine because it is a single address flush, and the
    window between an ASI domain going out of use and the destruction of
    mm_struct is expected to be small.

The goal goes beyond code simplification. Incoming changes will
support context switching and exiting to userspace without exiting ASI
in some cases, which means that arbitrary kernel code can be run in an
ASI domain. This requires a protection mechanism to make sure that ASI
domains are not destroyed while they are being used or referenced.

Tying the lifetime of ASI domains to the mm_struct sidesteps this
problem. As long as a process is running in an ASI domain, it naturally
holds a ref to the containing mm_struct (through task->mm or
task->active_mm). This means that the ASI domain cannot be destroyed. To
enforce this, add a warning in asi_enter() if asi->mm is not the same as
current->mm.

This applies to ASI domains retrieved with asi_get_current() as long as
preemption is disabled. If preemption is enabled, task->active_mm may
change (e.g. for kthreads), so the ASI domain may be destroyed if the
containing mm_struct is destroyed. Add a comment to document this and a
warning to enforce it. Stop using asi_get_current() in
asi_is_restricted() to avoid the warning if preemption is enabled.
asi_is_restricted() is inherently racy and does not access the ASI
domain.

It also applies naturally to ASI domains retrieved through mm->asi[*],
assuming the retriever is naturally holding a ref to the mm_struct
before dereferencing it.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>