Skip to content

Make TypeId const comparable #142789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Jun 20, 2025

This should unblock stabilizing const TypeId::of and allow us to progress into any possible future we want to take TypeId to.

To achieve that TypeId now contains 16 / size_of<usize>() pointers which each are actually just size_of<usize>() bytes of the stable hash. At compile-time these pointers cannot be dereferenced or otherwise inspected (at present doing so might ICE the compiler). Preventing inspection of this data allows us to refactor TypeId to any other scheme in the future without breaking anyone who was tempted to transmute TypeId to obtain the hash at compile-time.

cc @eddyb for their previous work on #95845 (which we still can do in the future if we want to get rid of the hash as the final thing that declares two TypeIds as equal).

r? @RalfJung

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 20, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 20, 2025

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rust-log-analyzer

This comment has been minimized.

@oli-obk oli-obk force-pushed the constable-type-id branch from 3cddd21 to 1fd7b66 Compare June 21, 2025 10:20
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member

It will be a while until I have the capacity to review a PR of this scale.

Meanwhile, could you say a bit more about the architecture of the change? It seems you want for the "new kind of allocation" approach, but it's not clear from the PR description how exactly that shows up in TypeId.

Also, I am definitely not comfortable landing this by myself, I can only review the const-eval parts. Changing the representation of TypeId has ramifications well beyond that that I do not feel qualified to evaluate -- I think an MCP would be justified.

@rustbot
Copy link
Collaborator

rustbot commented Jun 21, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 21, 2025

Well, I got private feedback yesterday that instead of encoding a 16 byte value as an 8 byte pointer to the 16 byte value and an 8 byte hash, I should just do the thing where we split up type id internally into pointer sized chunks and codegen will make a hash out of it again.

TLDR: no changes to runtime type id anymore in the latest revision of this PR. Only compile-time type id is now a bit funny

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 21, 2025

It will be a while until I have the capacity to review a PR of this scale.

I'm splitting unrelated parts out, so the high level feedback is already useful and I'll look for libs and codegen ppl to review the appropriate parts

@rust-log-analyzer

This comment has been minimized.

jdonszelmann added a commit to jdonszelmann/rust that referenced this pull request Jun 23, 2025
Make `PartialEq` a `const_trait`

r? `@fee1-dead` or `@compiler-errors`

something generally useful but also required for rust-lang#142789
jdonszelmann added a commit to jdonszelmann/rust that referenced this pull request Jun 23, 2025
Make `PartialEq` a `const_trait`

r? ``@fee1-dead`` or ``@compiler-errors``

something generally useful but also required for rust-lang#142789
rust-timer added a commit that referenced this pull request Jun 23, 2025
Rollup merge of #142822 - oli-obk:const-partial-eq, r=fee1-dead

Make `PartialEq` a `const_trait`

r? ``@fee1-dead`` or ``@compiler-errors``

something generally useful but also required for #142789
@bors
Copy link
Collaborator

bors commented Jun 23, 2025

☔ The latest upstream changes (presumably #142906) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the constable-type-id branch 2 times, most recently from b8a7a10 to 1c47a64 Compare June 24, 2025 09:25
@rust-log-analyzer

This comment has been minimized.

@oli-obk oli-obk force-pushed the constable-type-id branch from 1c47a64 to bcb4aa2 Compare June 24, 2025 13:02
@rust-log-analyzer

This comment has been minimized.

@oli-obk oli-obk force-pushed the constable-type-id branch from bcb4aa2 to 693b930 Compare June 24, 2025 14:05
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member

RalfJung commented Jun 26, 2025

Well, I got private feedback yesterday that instead of encoding a 16 byte value as an 8 byte pointer to the 16 byte value and an 8 byte hash,

Well, we could make the hash larger then. ;)

I should just do the thing where we split up type id internally into pointer sized chunks and codegen will make a hash out of it again.

TLDR: no changes to runtime type id anymore in the latest revision of this PR. Only compile-time type id is now a bit funny

Uh, okay, that's... interesting.^^

What is the plan for opsem and Miri here? In the spec, are the hash bytes visible or still opaque "fake pointers"?

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 26, 2025

What is the plan for opsem and Miri here? In the spec, are the hash bytes visible or still opaque "fake pointers"?

Well, I got private feedback yesterday that instead of encoding a 16 byte value as an 8 byte pointer to the 16 byte value and an 8 byte hash,

Well, we could make the hash larger then. ;)

I should just do the thing where we split up type id internally into pointer sized chunks and codegen will make a hash out of it again.
TLDR: no changes to runtime type id anymore in the latest revision of this PR. Only compile-time type id is now a bit funny

Uh, okay, that's... interesting.^^

What is the plan for opsem and Miri here? In the spec, are the hash bytes visible or still opaque "fake pointers"?

I would definitely say they were always not visible and if miri treated them opaquely that would be better in general. The only implementation detail of the MIR interpreter is that you can only copy around pointer sized pieces of the hash. Unless we move to a forever guaranteed layout of TypeId I would say the contents are always opaque and observing them is not meaningful except through the methods on TypeId and the traits it implements

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 26, 2025

Well, we could make the hash larger then. ;)

we're still forward compatible to doing so with this change

@RalfJung
Copy link
Member

RalfJung commented Jun 26, 2025

The only implementation detail of the MIR interpreter is that you can only copy around pointer sized pieces of the hash

Miri supports copying around arbitrary parts of a pointer...

... but that relies on the bytes of the pointer having their "real" value, and the provenance being just extra metadata. With this proposal, I don't know which role the provenance would play, but it doesn't seem like it'd play any role. So the entire scheme doesn't really do anything in Miri. It's really just a const-eval hack where we use the fact that pointers are opaque to also make something that's definitely not a pointer opaque... hm, my first reaction to this is that I don't like this very much. :/

Taking a step back, the space of solutions seems to be:

  1. Put an actual pointer into TypeId, with or without an accompanying short hash for fast inequality checks. Make that pointer (but not the accompanying hash) point to "non-memory" similar to vtable pointers.
  2. Leave the runtime representation unchanged, and have "fake pointers" in const-eval. That's what this PR does.
  3. Do nothing, and just live with the fact that some code might stop compiling if we change the TypeId representation again.

Honestly I am not sure if I prefer 2 over 3. It's not like 3 is a complete disaster; we've had code stop compiling due to this before (due to transmute size checks). Const code doing transmutes could also be broken by internal changes to the representation of any library type. 1 has some appeal to it because of how principled it is in solving 3, but 2 doesn't have that same appeal.

@oli-obk oli-obk force-pushed the constable-type-id branch from 693b930 to 9009013 Compare June 26, 2025 14:38
@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 26, 2025

The one thing 1 & 2 make nicer is handling of type id in the compiler, as with 3 we will have to maintain a Map<u128, Ty<'tcx>> somewhere globally, which is fishy in many ways, especially around incremental and parallel-rustc if people keep those hashes around as literals in their code and just hope that it works if they transmute that to a TypeId.

@RalfJung
Copy link
Member

I don't understand, why do we need such a map?

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check-2 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
 Documenting rustc_lint_defs v0.0.0 (/checkout/compiler/rustc_lint_defs)
error: unresolved link to `core::mem::type_info::TypeId`
   --> compiler/rustc_middle/src/mir/interpret/mod.rs:510:39
    |
510 |     /// Generates an [AllocId] for a [core::mem::type_info::TypeId]. Will get deduplicated.
    |                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ no item named `type_info` in module `mem`
    |
    = note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]`

 Documenting rustc_next_trait_solver v0.0.0 (/checkout/compiler/rustc_next_trait_solver)
error: could not document `rustc_middle`
warning: build failed, waiting for other jobs to finish...
[RUSTC-TIMING] rustc_middle test:false 16.155
Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:00:41
  local time: Thu Jun 26 15:23:08 UTC 2025
  network time: Thu, 26 Jun 2025 15:23:08 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants