Skip to content

Commit cc067c2

Browse files
committed
refactor(semantic): implement Send and Sync for ScopingCell (#13042)
Previous PRs removed unsound `Sync` impl for `Allocator` (#13033) and `Send` impl for `Vec` (#13041). Previously `ScopingCell` was `Send` and `Sync` (and therefore `Scoping` was too). The recent PRs mean that `ScopingCell` lost those traits too. This PR implements both traits again for `ScopingCell` by restricting its API slightly, so now it is `Send` and `Sync` again, but this time in a manner which maintains soundness. The comments in the code outline the logic of why I believe it to be sound. Rolldown relies on `ScopingCell` being `Send` and `Sync`. After this PR, this stack does not require any changes in Rolldown. I've checked that `cargo ck` in Rolldown passes when using the version of Oxc from this branch.
1 parent f490d27 commit cc067c2

File tree

1 file changed

+141
-9
lines changed

1 file changed

+141
-9
lines changed

crates/oxc_semantic/src/scoping.rs

Lines changed: 141 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
use std::{collections::hash_map::Entry, fmt, mem};
22

33
use rustc_hash::{FxHashMap, FxHashSet};
4+
use self_cell::self_cell;
45

56
use oxc_allocator::{Allocator, CloneIn, FromIn, HashMap as ArenaHashMap, Vec as ArenaVec};
67
use oxc_index::{Idx, IndexVec};
@@ -99,13 +100,141 @@ impl Default for Scoping {
99100
}
100101
}
101102

102-
self_cell::self_cell!(
103-
pub struct ScopingCell {
104-
owner: Allocator,
105-
#[covariant]
106-
dependent: ScopingInner,
103+
/// [`ScopingCell`] contains parts of [`Scoping`] which are 2-dimensional structures
104+
/// e.g. `Vec<Vec<T>>`, `HashMap<Vec<T>>`, `Vec<HashMap<T>>`.
105+
///
106+
/// These structures are very expensive to construct and drop, due to the large number of allocations /
107+
/// deallocations involved. Therefore, we store them in an arena allocator to:
108+
/// 1. Avoid costly heap allocations.
109+
/// 2. Be able to drop all the data cheaply in one go.
110+
///
111+
/// We use [`self_cell!`] to be able to store the `Allocator` alongside `Vec`s and `HashMap`s which store
112+
/// their data in that `Allocator` (self-referential struct).
113+
///
114+
/// Conceptually, these structures own their data, and so `ScopingCell` (and therefore also `Scoping`)
115+
/// should be `Send` and `Sync`, exactly as it would be if `ScopingCell` contained standard heap-allocated
116+
/// `Vec`s and `HashMap`s.
117+
///
118+
/// However, the use of an arena allocator complicates matters, because `Allocator` is not `Sync`,
119+
/// and `oxc_allocator::Vec` is not `Send`.
120+
///
121+
/// ### `Sync`
122+
///
123+
/// For it to be safe for `&ScopingCell` to be sent across threads, we must make it impossible to obtain
124+
/// multiple `&Allocator` references from them on different threads, because those references could be
125+
/// used to allocate into the same arena simultaneously. `Allocator` is not thread-safe, and this would
126+
/// likely be undefined behavior.
127+
///
128+
/// We prevent this by wrapping the struct created by `self_cell!` in a further wrapper.
129+
/// That outer wrapper prevents access to `with_dependent` and `borrow_owner` methods of `ScopingCellInner`,
130+
/// which allow obtaining `&Allocator` from a `&ScopingCell`.
131+
///
132+
/// The only method which *does* allow access to `&Allocator` is `with_dependent_mut`.
133+
/// It takes `&mut self`, which guarantees exclusive access to `ScopingCell`. Therefore, no other code
134+
/// (on any thread) can simultaneously have access to the `Allocator` during a call to `with_dependent_mut`.
135+
///
136+
/// `allocator_used_bytes` obtains an `&Allocator` reference internally, without taking `&mut self`.
137+
/// But it doesn't mutate the `Allocator` in any way, and it doesn't expose the `&Allocator` to user.
138+
/// By taking `&self`, it guarantees that `with_dependent_mut` cannot be called at the same time.
139+
///
140+
/// ### `Send`
141+
///
142+
/// `Allocator` is `Send`. `oxc_allocator::Vec` is not, but that restriction is purely to prevent a `Vec`
143+
/// being moved to different thread from the `Allocator`, which would allow multiple threads making
144+
/// allocations in that arena simultaneously.
145+
///
146+
/// Here, the `Allocator` and the `Vec`s are contained in the same struct, and moving them to another
147+
/// thread *together* does not cause a problem.
148+
///
149+
/// This is all enclosed in a module, to prevent access to `ScopingCellInner` directly.
150+
mod scoping_cell {
151+
use super::*;
152+
153+
// Inner self-referential struct containing `Allocator` and `ScopingInner`,
154+
// where `ScopingInner` contains `Vec`s and `HashMap`s which store their data in the `Allocator`.
155+
self_cell!(
156+
pub struct ScopingCellInner {
157+
owner: Allocator,
158+
#[covariant]
159+
dependent: ScopingInner,
160+
}
161+
);
162+
163+
/// Wrapper around [`ScopingCellInner`], which only provides methods that give access to an
164+
/// `&Allocator` reference if provided with `&mut ScopingCell`. See comments above.
165+
#[repr(transparent)]
166+
pub struct ScopingCell(ScopingCellInner);
167+
168+
#[expect(clippy::inline_always)] // All methods just delegate
169+
impl ScopingCell {
170+
/// Construct a new [`ScopingCell`] with an [`Allocator`] and `dependent_builder` function.
171+
#[inline(always)]
172+
pub fn new(
173+
allocator: Allocator,
174+
dependent_builder: impl for<'_q> FnOnce(&'_q Allocator) -> ScopingInner<'_q>,
175+
) -> Self {
176+
Self(ScopingCellInner::new(allocator, dependent_builder))
177+
}
178+
179+
/// Borrow [`ScopingInner`].
180+
#[inline(always)]
181+
pub fn borrow_dependent(&self) -> &ScopingInner<'_> {
182+
self.0.borrow_dependent()
183+
}
184+
185+
/// Call given closure `func` with an unique reference to [`ScopingInner`].
186+
#[inline(always)]
187+
pub fn with_dependent_mut<'outer_fn, Ret>(
188+
&'outer_fn mut self,
189+
func: impl for<'_q> FnOnce(&'_q Allocator, &'outer_fn mut ScopingInner<'_q>) -> Ret,
190+
) -> Ret {
191+
self.0.with_dependent_mut(func)
192+
}
193+
194+
/// Calculate the total size of data used in the [`Allocator`], in bytes.
195+
///
196+
/// See [`Allocator::used_bytes`] for more info.
197+
#[expect(clippy::unnecessary_safety_comment)]
198+
#[inline(always)]
199+
pub fn allocator_used_bytes(&self) -> usize {
200+
// SAFETY:
201+
// `with_dependent_mut` is the only method which gives access to the `Allocator`, and it
202+
// takes `&mut self`. This method takes `&self`, which means it can't be called at the same
203+
// time as `with_dependent_mut` (or within `with_dependent_mut`'s callback closure).
204+
//
205+
// Therefore, the only other references to `&Allocator` which can be held at this point
206+
// are in other calls to this method on other threads.
207+
// `used_bytes` does not perform allocations, or mutate the `Allocator` in any way.
208+
// So it's fine if 2 threads are calling this method simultaneously, because they're
209+
// both performing read-only actions.
210+
//
211+
// Another thread could simultaneously hold a reference to `&ScopingInner` via `borrow_dependent`,
212+
// but the `Vec`s and `HashMap`s in `ScopingInner` don't allow making allocations in the arena
213+
// without a `&mut` reference (e.g. `Vec::push` takes `&mut self`). Such mutable references
214+
// cannot be obtained from an immutable `&ScopingInner` reference.
215+
// So there's no way for simultaneous usage of `borrow_dependent` on another thread to break
216+
// the guarantee that no mutation of the `Allocator` can occur during this method.
217+
self.0.borrow_owner().used_bytes()
218+
}
219+
220+
/// Consume [`ScopingCell`] and return the [`Allocator`] it contains.
221+
#[expect(dead_code)]
222+
#[inline(always)]
223+
pub fn into_owner(self) -> Allocator {
224+
self.0.into_owner()
225+
}
107226
}
108-
);
227+
228+
/// SAFETY: `ScopingCell` can be `Send` because both the `Allocator` and `Vec`s / `HashMap`s
229+
/// storing their data in that `Allocator` are moved to another thread together.
230+
unsafe impl Send for ScopingCell {}
231+
232+
/// SAFETY: `ScopingCell` can be `Sync` if `ScopingInner` is `Sync`, because `ScopingCell` provides
233+
/// no methods which give access to an `&Allocator` reference, except when taking `&mut self`,
234+
/// which guarantees exclusive access. See further explanation above.
235+
unsafe impl<'cell> Sync for ScopingCell where ScopingInner<'cell>: Sync {}
236+
}
237+
use scoping_cell::ScopingCell;
109238

110239
pub struct ScopingInner<'cell> {
111240
/* Symbol Table Fields */
@@ -839,6 +968,9 @@ impl Scoping {
839968
/// Clone all semantic data. Used in `Rolldown`.
840969
#[must_use]
841970
pub fn clone_in_with_semantic_ids_with_another_arena(&self) -> Self {
971+
let used_bytes = self.cell.allocator_used_bytes();
972+
let cell = self.cell.borrow_dependent();
973+
842974
Self {
843975
symbol_spans: self.symbol_spans.clone(),
844976
symbol_flags: self.symbol_flags.clone(),
@@ -850,8 +982,8 @@ impl Scoping {
850982
scope_build_child_ids: self.scope_build_child_ids,
851983
scope_node_ids: self.scope_node_ids.clone(),
852984
scope_flags: self.scope_flags.clone(),
853-
cell: self.cell.with_dependent(|allocator, cell| {
854-
let allocator = Allocator::with_capacity(allocator.used_bytes());
985+
cell: {
986+
let allocator = Allocator::with_capacity(used_bytes);
855987
ScopingCell::new(allocator, |allocator| ScopingInner {
856988
symbol_names: cell.symbol_names.clone_in_with_semantic_ids(allocator),
857989
resolved_references: cell
@@ -892,7 +1024,7 @@ impl Scoping {
8921024
copy
8931025
},
8941026
})
895-
}),
1027+
},
8961028
}
8971029
}
8981030
}

0 commit comments

Comments
 (0)