refactor: Implement component schemas #2193

Merged
mmagician merged 23 commits into next from igamigo-component-schema
Dec 19, 2025

Collaborator

@igamigo igamigo commented Dec 17, 2025

Closes #2062.

This is mostly a rewrite of the account component templating structures. The main change moves away from templates and lets users describe component schemas: a schema specifies the types of values each storage slot is allowed to hold, with an optional default value per slot that can be overridden at instantiation.

Most of the concepts and structures used to serialize and deserialize the schema in this PR are similar to what we had before. However, because this also changes the nomenclature of many structures, variables, etc., even similar code will appear new in the diff. Additionally, I moved some code into separate submodules for readability.

For reviewing this I first recommend going through:

Notable changes

  • Value slots can be typed in two ways: with word schema types (think word, auth::ecdsa_k256_keccak::pub_key as we had before), or as a product/tuple type of four typed felts.
    • For this, word is still the generic default type of a value slot and felt is the generic default element type.
    • Felt types (such as fungible_faucets::metadata::token_symbol or u8) can now be used to type a word as well (where the word is stored as [0, 0, 0, <felt_type_value>]).
    • void type was introduced for felts to express elements that are not part of the product type. For example, on a metadata storage slot there may be a padding element: [u32, token_symbol, u8, void]. void elements must omit name and default-value and always resolve to 0.
  • Map slots now have typed keys and typed values. Each of those works in the same way as the words described above.
  • Multi-slot values are not supported anymore.
  • Single felts can be addressed with slot_name.field_name. For instance, if the slot is miden::standards::fungible_faucets::metadata, you can reference the decimals element via miden::standards::fungible_faucets::metadata.decimals.
  • In the TOML representation, storage entries are declared under [[storage.slots]], and the slot kind is inferred from the shape of the required type field:
    • type = "..." or type = [ ... ] describes a value slot.
    • type = { key = ..., value = ... } (or equivalently type.key = ... and type.value = ...) describes a map slot. I believe this is an overall positive change, because it avoids overloading type = "map" to mean a storage map while also using type = "..." to mean a value slot with a specific data type. It also makes (de)serialization simpler because structs map more cleanly to their TOML representations. But since this was not discussed, it was done in a separate commit so that it is easy to undo if we don't like it.
  • Also, in the init storage data TOML, keys that include :: must be written in quotes (e.g. "demo::token_metadata.max_supply" = "1000000"). This is a TOML grammar limitation. Additionally, init values are currently required to be TOML strings (including numeric values), and are parsed/validated against the schema at instantiation time.
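
To make the slot-kind inference rule above concrete, here is a minimal sketch in plain Rust. `TypeField`, `SlotKind`, and `infer_slot_kind` are hypothetical names invented for this illustration, not the types introduced by the PR. The sketch also assumes that map keys and values must themselves be word-shaped, which follows from the note that they work the same way as value-slot words.

```rust
/// Hypothetical model of the `type` field of a `[[storage.slots]]` entry.
#[derive(Debug, Clone, PartialEq)]
pub enum TypeField {
    /// `type = "..."`: a single type identifier for the whole word.
    Identifier(String),
    /// `type = [ ... ]`: one type identifier per felt.
    Felts(Vec<String>),
    /// `type = { key = ..., value = ... }`: typed map keys and values.
    KeyValue { key: Box<TypeField>, value: Box<TypeField> },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SlotKind {
    Value,
    Map,
}

/// Infers the slot kind purely from the shape of the `type` field.
pub fn infer_slot_kind(ty: &TypeField) -> Result<SlotKind, String> {
    match ty {
        TypeField::Identifier(_) => Ok(SlotKind::Value),
        TypeField::Felts(felts) if felts.len() == 4 => Ok(SlotKind::Value),
        TypeField::Felts(felts) => {
            Err(format!("expected exactly 4 felt types, got {}", felts.len()))
        }
        TypeField::KeyValue { key, value } => {
            // Assumption: keys and values are words, so neither side may be a map.
            for side in [key, value] {
                if infer_slot_kind(side)? == SlotKind::Map {
                    return Err("map keys and values must be word-shaped".to_string());
                }
            }
            Ok(SlotKind::Map)
        }
    }
}
```

With this model, `type = "word"` and a four-element `type = [ ... ]` both resolve to a value slot, while any `key`/`value` table resolves to a map slot.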

Rundown of related structs

  • AccountComponentMetadata: similar to previous approach, top-level “schema metadata” (name/description/version/supported account types) plus the storage schema
  • AccountStorageSchema: a set of named slots for the component; builds concrete storage slots from InitStorageData
    • StorageSlotSchema: per-slot schema, either a single-word value slot or a map slot
    • ValueSlotSchema: describes a single word; delegates typing to WordSchema
      • WordSchema::Singular: the whole word is one typed value supplied at init time (optionally with an overridable default)
      • WordSchema::Composed: fixed 4-felt layout where each felt has its own FeltSchema (typed field, default, or void padding)
        • FeltSchema: describes one felt inside a composed word; non-void felts must be named (so they can be provided/overridden); void is always zero padding
  • MapSlotSchema: describes a map slot, can have static default_values and optional key/value schemas; init-provided entries are optional (omitting them gets you an empty map unless defaults exist).
  • StorageValueName: similar to before; the key space for init values; typically derived from a slot name, with optional .field suffixes for composed-word typed fields
  • InitStorageData with WordValue: also similar to before; raw init-time inputs before parsing/validation against the schema and type registry
  • SchemaTypeRegistry: Similar to what we had before with TemplateRegistry
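
As a rough illustration of how a composed word could be resolved from init data, the sketch below models the behaviour described in the rundown: void felts resolve to 0, named felts are looked up under `slot_name.field_name`, and a schema default is used when no init value is supplied. All names here are simplified stand-ins. In particular, `Felt` is a plain `u64`, every typed felt is parsed as an integer, and no type registry is consulted, so this is not the PR's implementation.

```rust
use std::collections::BTreeMap;

/// Stand-in for a field element; the real type is not a plain u64.
pub type Felt = u64;
pub type Word = [Felt; 4];

/// Simplified felt descriptor: a named field with an optional default, or padding.
#[derive(Debug, Clone)]
pub enum FeltSchema {
    Typed { name: String, default: Option<String> },
    Void,
}

/// Builds one word from string init values keyed by `slot_name.field_name`.
pub fn build_composed_word(
    slot_name: &str,
    felts: &[FeltSchema; 4],
    init: &BTreeMap<String, String>,
) -> Result<Word, String> {
    let mut word: Word = [0; 4];
    for (i, felt) in felts.iter().enumerate() {
        word[i] = match felt {
            // Void elements always resolve to zero.
            FeltSchema::Void => 0,
            FeltSchema::Typed { name, default } => {
                let key = format!("{slot_name}.{name}");
                // An init-provided value overrides the schema default.
                let raw = init
                    .get(&key)
                    .or(default.as_ref())
                    .ok_or_else(|| format!("missing init value for `{key}`"))?;
                raw.parse::<Felt>()
                    .map_err(|e| format!("invalid value for `{key}`: {e}"))?
            }
        };
    }
    Ok(word)
}
```

Supplying only `"demo::token_metadata.max_supply" = "1000000"` for a layout with a defaulted `decimals` field would yield a word containing the supplied value, the default, and zeros for the void elements.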

Follow-ups/open questions:

  • I have a rework of InitStorageData that tackles #1860 ("Improve InitStorageData structure for better usability") by giving it a constructor that works with native types instead of strings. It is pretty much ready, and I will open a separate PR for it.

  • StorageValueName should also be reworked. With this PR, when using InitStorageData you can either reference a slot directly via its StorageSlotName's string identifier, or an element of the storage slot by suffixing it with a .element if the storage slot is of a tuple type. So, we can type StorageValueName to follow these directives instead of it being mostly based on Strings.

  • Tackling #2104 ("Account storage schema validation"):

    • We need to decide what the storage schema commitment actually commits to. This should likely be AccountStorageSchema, but this also currently includes default values. Three different approaches here:
      • Decide we can let the commitment also commit to the schema's default values
      • Ignore default values altogether
      • Remove default values from AccountStorageSchema and put them at the level of AccountComponentMetadata. When parsing the metadata TOML we can just grab all defaults and put them in a separate mapping, instead of coupling them with the schema itself.
    • We need to add a way of displaying well-known types (i.e., the ones in the type registry), but this should be easy (adding some sort of display bound in the type registry).
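
To illustrate the trade-off between the commitment approaches listed above, here is a small sketch that computes a digest over a schema either with or without its default values. `SlotSchema` and `schema_digest` are hypothetical, and `DefaultHasher` from the standard library is only a stand-in: it is not a cryptographic hash and is not what a real commitment would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, flattened view of one slot in a storage schema.
#[derive(Debug, Clone)]
pub struct SlotSchema {
    pub name: String,
    pub type_id: String,
    pub default: Option<String>,
}

/// Digest over the schema; defaults are covered only when `include_defaults` is set.
/// NOTE: `DefaultHasher` is a non-cryptographic placeholder for the real hash function.
pub fn schema_digest(slots: &[SlotSchema], include_defaults: bool) -> u64 {
    let mut hasher = DefaultHasher::new();
    for slot in slots {
        slot.name.hash(&mut hasher);
        slot.type_id.hash(&mut hasher);
        if include_defaults {
            slot.default.hash(&mut hasher);
        }
    }
    hasher.finish()
}
```

If defaults are ignored (or stored outside the schema), two components that differ only in their default values produce the same digest; if defaults are included, changing a default changes the commitment.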

@igamigo igamigo marked this pull request as ready for review December 17, 2025 21:56
Contributor

@bobbinth bobbinth left a comment

Looks great! Thank you! Not a full review from me, but I left some comments inline.

Contributor

@PhilippGackstatter PhilippGackstatter left a comment

Not a full review as I did not get through everything. The overall structure of what I've reviewed makes sense to me, though.

Comment on lines 103 to 105
} else {
    return Err(InitStorageDataError::ArraysNotSupported);
}
Contributor

Nit: We run into this when a user specifies an array of size 5, for instance, right? It would be nice to provide more context here to help with finding the error. I guess an alternative to including the actual size here is to mention the allowed layout in the error message.
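
One possible shape for the more descriptive error suggested here, sketched with hypothetical names (`InitValueError`, `check_array_len`). It assumes the allowed layout is an array of exactly four elements, which is inferred from the comment rather than confirmed by the code under review.

```rust
use std::fmt;

/// Assumed allowed layout: an array init value has exactly four elements.
pub const EXPECTED_LEN: usize = 4;

/// Hypothetical error type; the PR's actual `InitStorageDataError` may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum InitValueError {
    InvalidArrayLength { key: String, actual: usize },
}

impl fmt::Display for InitValueError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Report both the offending size and the allowed layout.
            InitValueError::InvalidArrayLength { key, actual } => write!(
                f,
                "init value `{}` is an array of {} elements; expected exactly {} elements",
                key, actual, EXPECTED_LEN
            ),
        }
    }
}

impl std::error::Error for InitValueError {}

/// Rejects arrays whose length does not match the assumed layout.
pub fn check_array_len(key: &str, elements: &[String]) -> Result<(), InitValueError> {
    if elements.len() == EXPECTED_LEN {
        Ok(())
    } else {
        Err(InitValueError::InvalidArrayLength {
            key: key.to_string(),
            actual: elements.len(),
        })
    }
}
```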

Comment on lines +20 to +32
pub static SCHEMA_TYPE_REGISTRY: LazyLock<SchemaTypeRegistry> = LazyLock::new(|| {
    let mut registry = SchemaTypeRegistry::new();
    registry.register_felt_type::<Void>();
    registry.register_felt_type::<u8>();
    registry.register_felt_type::<u16>();
    registry.register_felt_type::<u32>();
    registry.register_felt_type::<Felt>();
    registry.register_felt_type::<TokenSymbol>();
    registry.register_word_type::<Word>();
    registry.register_word_type::<rpo_falcon512::PublicKey>();
    registry.register_word_type::<ecdsa_k256_keccak::PublicKey>();
    registry
});
Contributor

This works for now, but after #2191, I think we should move TokenSymbol to miden-standards.

Similarly, registering rpo_falcon512::PublicKey and ecdsa_k256_keccak::PublicKey here also feels like it should be done at the standards level, because, afaict, these types are useful only in the context of their respective auth components, which are standards.

Just mentioning this in case we don't have an issue for making the type registry user-extensible, but nothing to do for this PR I think.

Collaborator Author

@igamigo igamigo Dec 18, 2025

Yeah, I don't think we have a specific issue for making the registry user-extensible (was briefly mentioned in #2062 (comment)), but I agree we should keep it in mind

Contributor

Yes, we should create a separate issue for making the registry extensible.
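
For the follow-up issue, a user-extensible registry could look roughly like the sketch below, where felt types are registered at runtime under a string identifier. This is only one possible design: `ExtensibleRegistry`, its methods, and the `demo::percent` type used in the usage note are hypothetical, and `Felt` is a plain `u64` stand-in.

```rust
use std::collections::HashMap;

/// Stand-in for a field element.
pub type Felt = u64;

/// A parser turns the string form of an init value into a felt.
type FeltParser = Box<dyn Fn(&str) -> Result<Felt, String> + Send + Sync>;

/// Hypothetical registry that lets callers add their own felt types.
#[derive(Default)]
pub struct ExtensibleRegistry {
    felt_types: HashMap<String, FeltParser>,
}

impl ExtensibleRegistry {
    /// Starts with a couple of built-in integer types.
    pub fn with_builtins() -> Self {
        let mut registry = Self::default();
        registry.register_felt_type("u8", |s| {
            s.parse::<u8>().map(Felt::from).map_err(|e| e.to_string())
        });
        registry.register_felt_type("u16", |s| {
            s.parse::<u16>().map(Felt::from).map_err(|e| e.to_string())
        });
        registry
    }

    /// Registers (or replaces) the parser for a type identifier.
    pub fn register_felt_type<F>(&mut self, id: &str, parser: F)
    where
        F: Fn(&str) -> Result<Felt, String> + Send + Sync + 'static,
    {
        self.felt_types.insert(id.to_string(), Box::new(parser));
    }

    /// Parses `raw` using the parser registered for `id`.
    pub fn parse_felt(&self, id: &str, raw: &str) -> Result<Felt, String> {
        let parser = self
            .felt_types
            .get(id)
            .ok_or_else(|| format!("unknown felt type `{id}`"))?;
        parser(raw)
    }
}
```

A standards-level crate could then call `register_felt_type("demo::percent", ...)` for its own types instead of having them hardcoded in the protocol crate's static registry.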

}

/// Trait for converting a string into a single `Word`.
pub trait WordType: alloc::fmt::Debug + Send + Sync {
Contributor

Nit: Why does this trait require Debug but FeltType does not? Should they both require it?

Collaborator Author

I've removed this. I tried different approaches to enable displaying types (for re-serialization, but also for enabling displaying values in places like the explorer, etc.) and I think this was a leftover from one of those attempts

Felt,
}

fn word_type_kind(schema_type: &SchemaTypeIdentifier) -> WordTypeKind {
Contributor

Nit: Could this function be a method on SchemaTypeIdentifier instead?

Collaborator Author

@igamigo igamigo Dec 18, 2025

I'm thinking this is more type registry-related. If the registry is somehow user-extensible, it will become the source of truth for identifying types and what they map to in terms of converting to native types. I'll note this in the list of followups

}
}

fn validate_word_value(
Contributor

Nit: Similarly, could this and validate_felt_value be methods on SchemaTypeIdentifier? Just to avoid free functions that have less context.

Collaborator

@mmagician mmagician left a comment

I think the overall structure makes a lot of sense. Also, thank you for the extensive tests; they showcase the new functionality really well.

Comment on lines 380 to 385
type = [
    { type = "u32", name = "max_supply", description = "Maximum supply (base units)" },
    { type = "token_symbol", name = "symbol", default-value = "TST" },
    { type = "u8", name = "decimals", description = "Token decimals" },
    { type = "void" }
]
Collaborator

What happens if the type array contains fewer than four entries?

Collaborator Author

It will error as it expects exactly 4. As you suggested in the other comment, we could make void the default type and have it fill the remaining elements.
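
The alternative mentioned here, filling missing entries with void, could be sketched as follows. `FeltEntry` and `pad_with_void` are hypothetical names, and the sketch assumes the padding goes at the end of the array; the PR as written rejects anything other than exactly four entries.

```rust
/// Hypothetical felt descriptor: a type identifier or void padding.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FeltEntry {
    Typed(String),
    Void,
}

/// Pads a `type = [ ... ]` array to four entries with trailing `void` elements.
pub fn pad_with_void(entries: Vec<FeltEntry>) -> Result<[FeltEntry; 4], String> {
    if entries.len() > 4 {
        return Err(format!("expected at most 4 felt types, got {}", entries.len()));
    }
    // Start from an all-void word and overwrite the leading positions.
    let mut word = [FeltEntry::Void, FeltEntry::Void, FeltEntry::Void, FeltEntry::Void];
    for (slot, entry) in word.iter_mut().zip(entries) {
        *slot = entry;
    }
    Ok(word)
}
```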

Comment on lines 137 to 138
/// Returns the init-time values required to instantiate this schema.
///
Collaborator

Suggested change
- /// Returns the init-time values required to instantiate this schema.
- ///
+ /// Returns the init-time values' requirements for this schema.
+ ///

/// # Guarantees
///
/// - The metadata's storage schema does not contain duplicate slot names.
/// - The schema cannot contain protocol-reserved slot names.
Collaborator

not for this PR, but currently protocol-reserved slot names are hardcoded to a single check:

if component_slot.name() == Self::faucet_sysdata_slot() {
    return Err(AccountError::StorageSlotNameMustNotBeFaucetSysdata);
}

One idea would be to encode this in some helper enum ReservedSlotNames (or similar), which we could then reference from doc strings like this one here. On the other hand, this sounds like a bit of an overkill for the single reserved slot name that we have. But I admit, it's a little hard to find what protocol reserved slots are in the codebase.

cc @PhilippGackstatter

Contributor

Generally agree this would be nice, but I think we want to get rid of the faucet sysdata slot and make issuance tracking the responsibility of the faucet implementation, in which case we'd no longer have any protocol-reserved slots at all, which would be even better.

Collaborator

Agreed - though for now we could probably go with the RESERVED_SLOT_NAMES as already implemented in #2207

It will be equally easy to change if/once we remove the faucet's reserved slot.
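
A sketch of what a centralized reserved-name check might look like, combined with the duplicate-name guarantee from the doc comment above. The constant's contents and `validate_slot_names` are assumptions for illustration: the slot name string is a placeholder, and the actual `RESERVED_SLOT_NAMES` from #2207 may be shaped differently.

```rust
use std::collections::BTreeSet;

/// Placeholder list; the real reserved slot name is not reproduced here.
pub const RESERVED_SLOT_NAMES: &[&str] = &["example::protocol::faucet_sysdata"];

/// Checks the two guarantees: no protocol-reserved names and no duplicates.
pub fn validate_slot_names<'a>(
    names: impl IntoIterator<Item = &'a str>,
) -> Result<(), String> {
    let mut seen = BTreeSet::new();
    for name in names {
        if RESERVED_SLOT_NAMES.contains(&name) {
            return Err(format!("slot name `{name}` is reserved by the protocol"));
        }
        // `insert` returns false when the name was already present.
        if !seen.insert(name) {
            return Err(format!("duplicate slot name `{name}`"));
        }
    }
    Ok(())
}
```

Keeping the list in one constant makes the reserved names easy to find and reference from doc comments, and it collapses to an empty slice if the faucet's reserved slot is removed later.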

Comment on lines 84 to 85
- **Singular**: defined through the `type` field, indicating the expected `SchemaTypeIdentifier` for the entire word. The value is supplied at instantiation time via `InitStorageData`.
- **Composed**: provided through `type = [ ... ]`, which contains exactly four `FeltSchema` descriptors. Each element is either a named typed field (optionally with `default-value`) or a `void` element for reserved/padding zeros.
Collaborator

which category does this fall into?

[[storage.value]]
name = "demo::protocol_version"
description = "A whole-word init-supplied value typed as a felt (stored as [0,0,0,<value>])."
type = "u8"

I think this is equivalent to an inferred-composite schema: [u8, void, void, void], so it might be somewhat confusing.

Collaborator Author

igamigo commented Dec 18, 2025

Thanks for all the reviews! I think I got most of the comments. I explicitly left some for followups (some described in the PR description, some in the comments of your suggestions).

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! Not a super thorough review from me, but I left some comments inline (and there are a few un-addressed comments from previous reviews). Once these are addressed, we should be good to merge.

I think there are two broad categories of issues that we can push into the follow-ups:

  • Refactoring of how InitStorageData works (this should be relatively straight-forward).
  • Making the type registry extensible (this will require some discussion).

Collaborator

@mmagician mmagician left a comment

I still prefer explicit [[storage.{map/value}]] over [[storage.slots]] for clarity when defining the TOML, but I don't want to get bogged down on this. LGTM with one small formatting nit

@mmagician mmagician merged commit 83c5e65 into next Dec 19, 2025
19 checks passed
@mmagician mmagician deleted the igamigo-component-schema branch December 19, 2025 15:24
@bobbinth
Contributor

@igamigo - now that this PR is merged, could you guys help propagate it to miden-node and miden-client? Once we have corresponding PRs there, I could start merging the chain of PRs starting at #2158.

Comment on lines +91 to 101
```toml
[[storage.slots]]
name = "demo::faucet_id"
description = "Account ID of the registered faucet"
type = [
    { type = "felt", name = "prefix", description = "Faucet ID prefix" },
    { type = "felt", name = "suffix", description = "Faucet ID suffix" },
    { type = "void" },
    { type = "void" },
]
```
Contributor

I think this belongs somewhere in the "Storage entries" section.

Collaborator Author

igamigo commented Dec 19, 2025

> @igamigo - now that this PR is merged, could you guys help propagate it to miden-node and miden-client? Once we have corresponding PRs there, I could start merging the chain of PRs starting at #2158.

Yes, here is the node PR; and I also have the client PR almost ready to publish.

Collaborator Author

igamigo commented Dec 19, 2025

Client PR: 0xMiden/miden-client#1626

mmagician added a commit that referenced this pull request Dec 30, 2025
* feat: renames, doc improvements, submodule export

* Update crates/miden-protocol/src/account/component/storage/type_registry.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Marti <marti@miden.team>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
afa7789 pushed a commit to afa7789/miden-base that referenced this pull request Jan 15, 2026
* refactor: replace templates with component schemas

* chore: overrideable -> overridable

* chore: more docs

* feat: avoid overloading type word and define slot type with the dotted key

* chore: CHANGELOG

* chore: update docs

* chore: update docs

* chore: display

* chore: spellcheck

* reviews: address most of the first review's smaller comments

* reviews: infer type based o type structure

* reviews: re-enable tests

* reviews: docs, typeregstry renames, doc comment rewrites

* chore: lints

* reviews: give context to errors, simplify validations, revert felt parsing to use registry

* reviews: simplify further

* reviews: initvaluerequirements -> schemarequirements; now collected into map

* reviews: more doc suggestions applied; validate schema

* reviews: singular->simple, scalar->atomic, docs reviews, nits

* reviews: initstoragedata duplicate detection

* feat: scope storagevaluename

* chore: docs fix

* chore: fix indentaion

---------

Co-authored-by: Marti <marti@miden.team>
afa7789 pushed a commit to afa7789/miden-base that referenced this pull request Jan 15, 2026
* feat: renames, doc improvements, submodule export

* Update crates/miden-protocol/src/account/component/storage/type_registry.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Marti <marti@miden.team>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Extend component metadata to represent schemas

4 participants