Skip to content

Conversation

@AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Aug 11, 2025

Summary

Fixes astral-sh/ty#956. Use boxed slices rather than Vecs where possible for types interned in the salsa database, to save on memory.

Test Plan

Existing tests

@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Aug 11, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Aug 11, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@github-actions
Copy link
Contributor

github-actions bot commented Aug 11, 2025

mypy_primer results

No ecosystem changes detected ✅

Memory usage changes were detected when running on open source projects
sphinx (https://github.com/sphinx-doc/sphinx)
-     struct fields = ~17MB
+     struct fields = ~16MB

@AlexWaygood AlexWaygood force-pushed the alex/smaller-tuples branch 2 times, most recently from 5d86e0f to ab0f599 Compare August 11, 2025 21:59
@AlexWaygood
Copy link
Member Author

Not sure if this is worth it — CI reports no significant memory-usage reductions in the memory.txt projects, and codspeed reports 1% regressions on several benchmarks.

Tomorrow I'll try getting some memory-usage reports for projects that we know have some complicated tuple types in them, such as colour-science and static-frame. Opening up for review now anyway, to see what folks think of the approach!

@AlexWaygood AlexWaygood marked this pull request as ready for review August 11, 2025 22:16
@MichaReiser
Copy link
Member

You probably want to wait for #19790 (or rebase on top) to get a full picture. The memory report today only accounts for queries (heap and stack size) and the stack size of tracked struct, but it doesn't account for the heap size of tracked structs. That means, the only memory reduction you'd see is from reducing the stack memory from 48 + stack_size::<T> to 32 + stack_size::<T>. That on its own might still be significant but only if we create a large number of tuple specs (and intern them)

@MichaReiser
Copy link
Member

MichaReiser commented Aug 12, 2025

This is the full memory report comparison from running on a large project. You can see how the tuple type's stack size has become much smaller. It's just not significant in overall terms because there aren't many enough (or put differently, some other queries use up so much space that the tuple type improvement is negligible). I suggest re-running the analysis with #19790

https://www.diffchecker.com/ZkHpFHCN/

Edit: The reason why we have different counts for some ingredients is because ty still panics when analysing some files and that leads to non-deterministic results.

@MichaReiser
Copy link
Member

I just ran main with the salsa struct memory usage information and the total memory usage for tuples is only:

`TupleType`                                        metadata=6.65MB   fields=13.88MB  count=118827

So i don't think this will lead to a very significant improvement

@AlexWaygood
Copy link
Member Author

AlexWaygood commented Aug 12, 2025

I tried running memory reports on colour-science and static-frame, which were codebases I thought might be interesting because we know operations on tuple types are quite hot when typechecking those codebases. The top-level memory usage for both those codebases is the same on main and with this PR. The memory taken up by TupleType does go down, but it's a small fraction of the overall memory usage

Project TupleType memory usage (main) TupleType memory usage (this PR) Total memory usage (main and this PR)
static-frame metadata: 0.98MB
fields: 1.91MB
metadata: 0.98MB
fields: 1.58MB
348MB
colour metadata: 0.12MB
fields: 0.44MB
metadata: 0.12MB
fields: 0.39MB
298MB

(All measurements done by executing TY_MEMORY_REPORT=full cargo run --release --manifest-path ../ruff/Cargo.toml -p ty check --quiet)

That does seem like a pretty decent reduction in memory usage on tuples in static-frame, even if it's a small slice of the overall pie. Using boxed slices is also consistent with what we do elsewhere for our Salsa-interned structs, and is more expressive of the fact that types are immutable once they're stored in the Salsa cache. I think the main question is whether this is worth the increase in code complexity -- I could go either way on that.

Performance is in the noise -- some benchmarks are showing tiny speedups, other ones tiny slowdowns.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the immutability of TupleSpec is a nice side effect that I value more than the tiny memory improvement :)

@MichaReiser
Copy link
Member

Eh, sphinx has a memory reduction 😆

@AlexWaygood AlexWaygood merged commit 498a048 into main Aug 12, 2025
38 checks passed
@AlexWaygood AlexWaygood deleted the alex/smaller-tuples branch August 12, 2025 11:51
dcreager added a commit that referenced this pull request Aug 12, 2025
* main:
  Don't cache files with diagnostics (#19869)
  [ty] support recursive type aliases (#19805)
  [ty] Remove unsafe `salsa::Update` implementations in `tuple.rs` (#19880)
  [ty] Function argument inlay hints (#19269)
  [ty] Remove Salsa interning for `TypedDictType` (#19879)
  Fix `lint.future-annotations` link (#19876)
  [ty] Reduce memory usage of `TupleSpec` and `TupleType` (#19872)
  [ty] Track heap usage of salsa structs (#19790)
  Update salsa to pull in tracked struct changes (#19843)
  [ty] simplify CycleDetector::visit signature (#19873)
  [ty] use interior mutability in type visitors (#19871)
  [ty] Fix tool name is None when no ty path is given in ty_benchmark (#19870)
  [ty] Remove `Type::Tuple` (#19669)
  [ty] Short circuit `ReachabilityConstraints::analyze_single` for dynamic types (#19867)
dcreager added a commit that referenced this pull request Aug 13, 2025
…aints

* dcreager/inferrable: (65 commits)
  this was right after all
  mark typevars inferrable as we go
  fix tests
  fix inference of constrained typevars
  [ty] Add precise inference for indexing, slicing and unpacking `NamedTuple` instances (#19560)
  Add rule code to GitLab description (#19896)
  [ty] render docstrings in hover (#19882)
  [ty] simplify return type of place_from_declarations (#19884)
  [ty] Various minor cleanups to tuple internals (#19891)
  [ty] Improve `sys.version_info` special casing (#19894)
  Don't cache files with diagnostics (#19869)
  [ty] support recursive type aliases (#19805)
  [ty] Remove unsafe `salsa::Update` implementations in `tuple.rs` (#19880)
  [ty] Function argument inlay hints (#19269)
  [ty] Remove Salsa interning for `TypedDictType` (#19879)
  Fix `lint.future-annotations` link (#19876)
  [ty] Reduce memory usage of `TupleSpec` and `TupleType` (#19872)
  [ty] Track heap usage of salsa structs (#19790)
  Update salsa to pull in tracked struct changes (#19843)
  [ty] simplify CycleDetector::visit signature (#19873)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consider storing a Box<[T]> or shrink the tuple's inner Vec's before wrapping them as Type

3 participants