Commit 7fa8620

Add interchange data model description + JSON schema
1 parent e43a21a commit 7fa8620

2 files changed

+338
-0
lines changed

spec/data-model.md

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
# DRAFT MessageFormat 2.0 Data Model

To work with messages defined in syntaxes other than that of MessageFormat 2,
an equivalent data model representation is also defined.
Implementations MAY provide interfaces which allow
for MessageFormat 2 syntax to be parsed into this representation,
for messages presented in this representation to be formatted,
or for other operations to be performed on or with messages in this representation.

Implementations are not required to use this data model for their internal representation of messages.

To ensure compatibility across all platforms,
this interchange data model is defined in terms of JSON-compatible values.
While this document uses TypeScript syntax to define them,
the canonical and authoritative source is the `message.json` JSON Schema definition.

## Messages

A `SelectMessage` corresponds to a syntax message that includes _selectors_.
A message without _selectors_ and with a single _pattern_ is represented by a `PatternMessage`.

```ts
type Message = PatternMessage | SelectMessage

interface PatternMessage {
  type: 'message'
  declarations: Declaration[]
  pattern: Pattern
}

interface SelectMessage {
  type: 'select'
  declarations: Declaration[]
  selectors: Expression[]
  variants: Variant[]
}
```

Each message _declaration_ is represented by a `Declaration`,
which connects the left-hand side `target` _variable_
with its right-hand side `value`.

```ts
interface Declaration {
  target: VariableRef
  value: Expression
}
```
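
As an illustration of these shapes, the following is a minimal sketch of a `PatternMessage` value and of a helper that branches on the `type` discriminant. The local type copies are trimmed to what the example needs, `TextPart` stands in for this document's `Text` (renamed here only to avoid clashing with the DOM global of the same name), and `patternCount` is a hypothetical helper, not something this specification defines.

```typescript
// Trimmed local copies of the data model shapes; `message.json` stays authoritative.
interface VariableRef { type: 'variable'; name: string }
interface Literal { type: 'literal'; quoted: boolean; value: string }
interface Expression { type: 'expression'; body: Literal | VariableRef }
interface TextPart { type: 'text'; value: string } // `Text` in this document
interface Pattern { body: Array<TextPart | Expression> }
interface Declaration { target: VariableRef; value: Expression }
interface Variant { keys: Array<Literal | { type: '*'; value?: string }>; value: Pattern }

interface PatternMessage { type: 'message'; declarations: Declaration[]; pattern: Pattern }
interface SelectMessage {
  type: 'select'
  declarations: Declaration[]
  selectors: Expression[]
  variants: Variant[]
}
type Message = PatternMessage | SelectMessage

// A message with no declarations and one pattern: some text, then the variable `name`.
const greeting: PatternMessage = {
  type: 'message',
  declarations: [],
  pattern: {
    body: [
      { type: 'text', value: 'Hello, ' },
      { type: 'expression', body: { type: 'variable', name: 'name' } }
    ]
  }
}

// Hypothetical helper: counts the patterns a message carries.
function patternCount(msg: Message): number {
  return msg.type === 'select' ? msg.variants.length : 1
}
```

Because every value is a plain object, string, boolean, or array, a message such as `greeting` can be passed through `JSON.stringify` and `JSON.parse` without loss.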

In a `SelectMessage`, the `keys` and `value` of each _variant_ are kept as an ordered list.
For the `CatchallKey`, a string `value` may be provided to retain an identifier.
This is always `'*'` in MessageFormat 2 syntax, but may vary in other formats.

```ts
interface Variant {
  keys: Array<Literal | CatchallKey>
  value: Pattern
}

interface CatchallKey {
  type: '*'
  value?: string
}
```
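
The optional `value` of a `CatchallKey` can be sketched with a small helper that returns the identifier a serializer might write for a key. `VariantKey` and `keyLabel` are names assumed for this example, and falling back to `'*'` when `value` is absent is one reasonable choice, not a requirement stated above.

```typescript
interface Literal { type: 'literal'; quoted: boolean; value: string }
interface CatchallKey { type: '*'; value?: string }
type VariantKey = Literal | CatchallKey

// Hypothetical helper: the identifier to emit for a variant key.
function keyLabel(key: VariantKey): string {
  if (key.type === '*') {
    // Keep a retained identifier from another format, else use the MessageFormat 2 one.
    return key.value ?? '*'
  }
  return key.value
}

const one = keyLabel({ type: 'literal', quoted: false, value: 'one' })
const retained = keyLabel({ type: '*', value: 'other' })
const plain = keyLabel({ type: '*' })
```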

## Patterns

Each `Pattern` represents a single linear _pattern_ without selectors,
with a `body` made up of `Text` and `Expression` shapes.
`Text` represents literal _text_,
while `Expression` wraps each of the potential _expression_ shapes.
The `value` of `Text` is the "cooked" value (i.e. escape sequences are processed).

Implementations MUST NOT rely on the set of `Expression` `body` values being exhaustive,
as future versions of this specification MAY define additional expressions.
If encountering a `body` with an unrecognised value,
an implementation SHOULD treat it as it would a `Reserved` value.

```ts
interface Pattern {
  body: Array<Text | Expression>
}

interface Text {
  type: 'text'
  value: string
}

interface Expression {
  type: 'expression'
  body: Literal | VariableRef | FunctionRef | Reserved
}
```
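
One way to honour the non-exhaustiveness rule is to route every `body` through a switch whose default branch is the `Reserved` handling. The sketch below assumes that an "unrecognised value" shows up as an unknown `type` string; `Handling` and `classifyBody` are hypothetical names.

```typescript
// The branch a formatter would take for an expression body.
type Handling = 'literal' | 'variable' | 'function' | 'reserved'

// Hypothetical classifier: accepts any object with a `type`, known or not.
function classifyBody(body: { type: string }): Handling {
  switch (body.type) {
    case 'literal':
      return 'literal'
    case 'variable':
      return 'variable'
    case 'function':
      return 'function'
    default:
      // Covers 'reserved' itself and any body type added by a later version.
      return 'reserved'
  }
}

const known = classifyBody({ type: 'variable' })
const future = classifyBody({ type: 'some-future-shape' })
```

Typing the parameter as `{ type: string }` rather than as the closed union is deliberate: it keeps the default branch reachable for input produced by a newer implementation.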

## Expressions

The `Literal` and `VariableRef` correspond to the _literal_ and _variable_ syntax rules.
When they are used as the `body` of an `Expression`,
they represent _expression_ values with no _annotation_.

An _unquoted_ value is represented by a `Literal` with `quoted: false`,
while a _quoted_ value would have `quoted: true`.
The `value` of `Literal` is the "cooked" value (i.e. escape sequences are processed).

In a `VariableRef`, the `name` does not include the initial `$` of the _variable_.

```ts
interface Literal {
  type: 'literal'
  quoted: boolean
  value: string
}

interface VariableRef {
  type: 'variable'
  name: string
}
```
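
The rule about the initial `$` can be sketched as a conversion from source text to a `VariableRef`. `toVariableRef` is a hypothetical helper, and it only checks for the leading `$`; it does not validate the rest of the name against the syntax.

```typescript
interface VariableRef { type: 'variable'; name: string }

// Hypothetical helper: builds a VariableRef from source text such as `$count`.
function toVariableRef(source: string): VariableRef {
  if (source.charAt(0) !== '$' || source.length < 2) {
    throw new Error('Not a variable reference: ' + source)
  }
  // The data model stores the name without the leading `$`.
  return { type: 'variable', name: source.slice(1) }
}

const count = toVariableRef('$count')
```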

A `FunctionRef` represents an _expression_ with a _function_ _annotation_.
In a `FunctionRef`,
the `kind` corresponds to the starting sigil of a _function_:
`'open'` for `+`, `'close'` for `-`, and `'value'` for `:`.
The `name` does not include this starting sigil.

If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
it is included as the `operand`.
Each _option_ is represented by an `Option`.

```ts
interface FunctionRef {
  type: 'function'
  kind: 'open' | 'close' | 'value'
  name: string
  operand?: Literal | VariableRef
  options?: Option[]
}

interface Option {
  name: string
  value: Literal | VariableRef
}
```
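
The sigil-to-`kind` mapping described above can be sketched as a lookup table. `KIND_BY_SIGIL` and `toFunctionRef` are hypothetical names, the local `FunctionRef` omits `operand` and `options` for brevity, and the helper does not validate the function name itself.

```typescript
type FunctionKind = 'open' | 'close' | 'value'
interface FunctionRef { type: 'function'; kind: FunctionKind; name: string }

// Starting sigils of a function annotation and the `kind` each one maps to.
const KIND_BY_SIGIL: { [sigil: string]: FunctionKind | undefined } = {
  '+': 'open',
  '-': 'close',
  ':': 'value'
}

// Hypothetical helper: builds a FunctionRef from annotation text such as `:number`.
function toFunctionRef(source: string): FunctionRef {
  const kind = KIND_BY_SIGIL[source.charAt(0)]
  if (kind === undefined || source.length < 2) {
    throw new Error('Not a function annotation: ' + source)
  }
  // As with variables, the stored name excludes the starting sigil.
  return { type: 'function', kind, name: source.slice(1) }
}

const fn = toFunctionRef(':number')
```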

A `Reserved` represents an _expression_ with a _reserved_ _annotation_.
The `sigil` corresponds to the starting sigil of the _reserved_.
The `source` is the "raw" value (i.e. escape sequences are not processed)
and includes the starting `sigil`.

Implementations MUST NOT rely on the set of `sigil` values remaining constant,
as future versions of this specification MAY assign other meanings to such sigils.

If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
it is included as the `operand`.

```ts
interface Reserved {
  type: 'reserved'
  sigil: '!' | '@' | '#' | '%' | '^' | '&' | '*' | '<' | '>' | '?' | '~'
  source: string
  operand?: Literal | VariableRef
}
```
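
The relationship between `sigil` and `source` can be sketched as follows: the `source` is kept verbatim, and the `sigil` is its first character. `RESERVED_SIGILS` and `toReserved` are hypothetical names, and the list mirrors only the sigils defined in this version, which per the rule above is not guaranteed to stay fixed.

```typescript
// Sigils reserved in this version of the data model; later versions may differ.
const RESERVED_SIGILS = ['!', '@', '#', '%', '^', '&', '*', '<', '>', '?', '~'] as const
type ReservedSigil = typeof RESERVED_SIGILS[number]

interface Reserved { type: 'reserved'; sigil: ReservedSigil; source: string }

// Hypothetical helper: wraps raw annotation text without processing escapes.
function toReserved(source: string): Reserved {
  for (const sigil of RESERVED_SIGILS) {
    if (source.charAt(0) === sigil) {
      // `source` is stored as-is, so it still starts with the sigil.
      return { type: 'reserved', sigil, source }
    }
  }
  throw new Error('Not a reserved annotation: ' + source)
}

const reserved = toReserved('@foo bar')
```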

## Extensions

Implementations MAY extend this data model with additional fields,
such as the start and end positions of parsed shapes.
When encountering an unfamiliar field, an implementation MUST ignore it.

In general,
implementations MUST NOT extend the sets of values for any defined field or type
when representing a valid message.
However, when using this data model to represent an invalid message,
an implementation MAY do so.
This is intended to allow for the representation of "junk" or invalid content within messages.
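
A minimal sketch of ignoring unfamiliar fields: a reader that copies only the fields this data model defines for a text part. `readText` is a hypothetical helper, `TextPart` stands in for this document's `Text`, and the `start` and `end` fields are assumed extension fields of the kind mentioned above.

```typescript
interface TextPart { type: 'text'; value: string } // `Text` in this document

// Hypothetical reader: accepts extra fields but carries none of them forward.
function readText(input: { type: 'text'; value: string; [extra: string]: unknown }): TextPart {
  return { type: input.type, value: input.value }
}

// `start` and `end` are assumed extension fields holding source positions.
const parsed = readText({ type: 'text', value: 'Hello', start: 0, end: 5 })
```

Here `parsed` holds only `type` and `value`. An implementation could equally keep the extra fields around and simply never read them; dropping them is just the simplest way to show that they play no role.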

spec/message.json

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
{
  "$schema": "http://json-schema.org/draft-07/schema",
  "$id": "https://github.com/unicode-org/message-format-wg/blob/main/spec/message.json",

  "oneOf": [{ "$ref": "#/$defs/message" }, { "$ref": "#/$defs/select" }],

  "$defs": {
    "literal": {
      "type": "object",
      "properties": {
        "type": { "const": "literal" },
        "quoted": { "type": "boolean" },
        "value": { "type": "string" }
      },
      "required": ["type", "quoted", "value"]
    },
    "variable": {
      "type": "object",
      "properties": {
        "type": { "const": "variable" },
        "name": { "type": "string" }
      },
      "required": ["type", "name"]
    },
    "value": {
      "oneOf": [{ "$ref": "#/$defs/literal" }, { "$ref": "#/$defs/variable" }]
    },

    "function": {
      "type": "object",
      "properties": {
        "type": { "const": "function" },
        "kind": { "enum": ["open", "close", "value"] },
        "name": { "type": "string" },
        "operand": { "$ref": "#/$defs/value" },
        "options": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": { "type": "string" },
              "value": { "$ref": "#/$defs/value" }
            },
            "required": ["name", "value"]
          }
        }
      },
      "required": ["type", "kind", "name"]
    },
    "reserved": {
      "type": "object",
      "properties": {
        "type": { "const": "reserved" },
        "sigil": {
          "enum": ["!", "@", "#", "%", "^", "&", "*", "<", ">", "?", "~"]
        },
        "source": { "type": "string" },
        "operand": { "$ref": "#/$defs/value" }
      },
      "required": ["type", "sigil", "source"]
    },

    "text": {
      "type": "object",
      "properties": {
        "type": { "const": "text" },
        "value": { "type": "string" }
      },
      "required": ["type", "value"]
    },
    "expression": {
      "type": "object",
      "properties": {
        "type": { "const": "expression" },
        "body": {
          "oneOf": [
            { "$ref": "#/$defs/literal" },
            { "$ref": "#/$defs/variable" },
            { "$ref": "#/$defs/function" },
            { "$ref": "#/$defs/reserved" }
          ]
        }
      },
      "required": ["type", "body"]
    },
    "pattern": {
      "type": "object",
      "properties": {
        "body": {
          "type": "array",
          "items": {
            "oneOf": [
              { "$ref": "#/$defs/text" },
              { "$ref": "#/$defs/expression" }
            ]
          }
        }
      },
      "required": ["body"]
    },

    "declarations": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "target": { "$ref": "#/$defs/variable" },
          "value": { "$ref": "#/$defs/expression" }
        },
        "required": ["target", "value"]
      }
    },
    "variant-key": {
      "oneOf": [
        { "$ref": "#/$defs/literal" },
        {
          "type": "object",
          "properties": {
            "type": { "const": "*" },
            "value": { "type": "string" }
          },
          "required": ["type"]
        }
      ]
    },
    "message": {
      "type": "object",
      "properties": {
        "type": { "const": "message" },
        "declarations": { "$ref": "#/$defs/declarations" },
        "pattern": { "$ref": "#/$defs/pattern" }
      },
      "required": ["type", "declarations", "pattern"]
    },
    "select": {
      "type": "object",
      "properties": {
        "type": { "const": "select" },
        "declarations": { "$ref": "#/$defs/declarations" },
        "selectors": {
          "type": "array",
          "items": { "$ref": "#/$defs/expression" }
        },
        "variants": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "keys": {
                "type": "array",
                "items": { "$ref": "#/$defs/variant-key" }
              },
              "value": { "$ref": "#/$defs/pattern" }
            },
            "required": ["keys", "value"]
          }
        }
      },
      "required": ["type", "declarations", "selectors", "variants"]
    }
  }
}
