Merged
54 changes: 22 additions & 32 deletions openspec/changes/coercion-public-api/design.md
@@ -4,49 +4,39 @@

The secondary `coerce` function will no longer attempt to reach into the `BaseRoot` instances or use `.transform()`. Instead, it will leverage ArkType's public `.pipe()` functionality to create a data transformation layer.

### 1. Introspection via `schema.in.toJsonSchema()`
### 1. Introspection via `schema.in.toJsonSchema()` with Fallback

We use `schema.in` to get a representation of the schema's input *without morphs* and then call `.toJsonSchema()` to get a standard JSON Schema representation for traversal. This ensures compatibility with schemas that use `.pipe()` or other morphs, which would otherwise cause `toJsonSchema()` to throw.
We use `schema.in` to get a representation of the schema's input *without morphs*. To keep introspection from throwing on types that cannot be represented in JSON Schema (such as those with custom predicates or narrows), we call `.toJsonSchema()` with a base-preserving fallback:

**Key identification rules (mapped from current `isNumeric`/`isBoolean`):**
- **Numeric**: `domain: "number"`, or `kind: "unit"` with a number value, or an `intersection` with a numeric basis.
- **Boolean**: `domain: "boolean"`, or `kind: "unit"` with a boolean value.
- **Unions**: Recursively check `branches`.
```ts
const json = schema.in.toJsonSchema({
	fallback: (ctx) => ctx.base
})
```

### 2. Path Mapping
This strategy ensures:
- **Resilience**: The introspection never throws due to unjsonifiable refinements (e.g., `string.url`).
- **Granularity**: We coerce whatever we can identify. If a property is a `number` with a custom narrowing predicate, we still identify it as a `number` via its base and apply coercion; only the predicate is skipped during path discovery.
- **Standards Compliance**: We exclusively use public ArkType APIs.

We will build a `CoercionMap` which is a record of paths (dot-notated or array) indicating where coercion should be applied.
**Key identification rules (mapped from JSON Schema paths):**
- **Numeric**: `type: "number"`, or `type: "integer"`, or `const`/`enum` with numeric values.
- **Boolean**: `type: "boolean"`, or `const`/`enum` with boolean values.
- **Objects/Arrays**: Recursively traversed via `properties` and `items`.
- **Unions**: Handled via `anyOf`, `oneOf`, or `allOf`.
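The identification rules above can be sketched as a recursive walk over the JSON Schema output. This is a minimal illustration, not the actual `findCoercionPaths` implementation: `JsonSchemaNode`, `CoercionTargetSketch`, and `findCoercionPathsSketch` are hypothetical names, the node type is a reduced stand-in for ArkType's `JsonSchema` type, and the `"*"` path segment for array items is an assumption.

```ts
// Hypothetical, reduced JSON Schema shape (ArkType's real JsonSchema type is richer).
type JsonSchemaNode = {
  type?: string | string[];
  const?: unknown;
  enum?: unknown[];
  properties?: Record<string, JsonSchemaNode>;
  items?: JsonSchemaNode | JsonSchemaNode[];
  anyOf?: JsonSchemaNode[];
  oneOf?: JsonSchemaNode[];
  allOf?: JsonSchemaNode[];
};

// Hypothetical target record: where to coerce, and to what.
type CoercionTargetSketch = { path: string[]; kind: "number" | "boolean" };

function findCoercionPathsSketch(
  node: JsonSchemaNode,
  path: string[] = [],
): CoercionTargetSketch[] {
  const out: CoercionTargetSketch[] = [];
  const t = node.type;
  const types: string[] = t === undefined ? [] : Array.isArray(t) ? t : [t];
  const literals: unknown[] = [
    ...(node.enum ?? []),
    ...("const" in node ? [node.const] : []),
  ];

  // Numeric: type "number"/"integer", or const/enum with numeric values.
  if (
    types.includes("number") ||
    types.includes("integer") ||
    literals.some((v) => typeof v === "number")
  ) {
    out.push({ path, kind: "number" });
  }
  // Boolean: type "boolean", or const/enum with boolean values.
  if (types.includes("boolean") || literals.some((v) => typeof v === "boolean")) {
    out.push({ path, kind: "boolean" });
  }

  // Objects: recurse into each property, extending the path.
  for (const [key, child] of Object.entries(node.properties ?? {})) {
    out.push(...findCoercionPathsSketch(child, [...path, key]));
  }
  // Arrays: "*" as an "any index" segment is an assumption of this sketch.
  if (node.items !== undefined && !Array.isArray(node.items)) {
    out.push(...findCoercionPathsSketch(node.items, [...path, "*"]));
  }
  // Unions and intersections: branches describe the same path.
  const branches = [...(node.anyOf ?? []), ...(node.oneOf ?? []), ...(node.allOf ?? [])];
  for (const branch of branches) {
    out.push(...findCoercionPathsSketch(branch, path));
  }
  return out;
}
```

The sketch does not deduplicate targets, so a union with several numeric branches would report the same path more than once; a real implementation would likely collapse those.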

Example:
```ts
const schema = type({ PORT: "number", DEBUG: "boolean?" })
// Map: { "PORT": ["number"], "DEBUG": ["boolean"] }
```
### 2. Path Mapping
... (rest of section) ...

### 3. Execution Flow

The `coerce` function returns:
```ts
type("unknown")
	.pipe(data => {
		// 1. Iterate CoercionMap
		// 2. data[path] = maybeParsedNumber(data[path]) constant-time-ish update
		// 3. return coercedData
	})
	.pipe(schema)
```
... (rest of section) ...

## Trade-offs and Considerations

### Why `toJsonSchema()` over `in.json`?
### Why `toJsonSchema()` with Fallback?
1. **Standardization**: `toJsonSchema()` returns a Draft 2020-12 compliant structure, making the introspection logic decoupled from ArkType's internal `JsonStructure`.
2. **Type Safety**: The `JsonSchema` type provided by ArkType is exhaustive and strictly typed, whereas `in.json` returns a loose object.

### Why `.in.toJsonSchema()`?
ArkType's `toJsonSchema()` implementation throws a `ToJsonSchemaError` if the schema contains morphs. By accessing `.in` first, we resolve the input side of the root node (which is always morph-free) and generate a schema representing what the environment variables must look like before transformations.

### Performance
Introspection is performed once per `coerce()` call. Since `createEnv` usually runs once at startup, this is negligible. The resulting morph is a simple iteration over known paths.
2. **Robustness**: Types like `string.url` or custom `.narrow()` calls would normally cause `toJsonSchema()` to fail globally. The `fallback: (ctx) => ctx.base` mechanism allows the generator to "skip" individual unjsonifiable constraints while preserving the rest of the schema structure.
3. **API Stability**: This approach avoids any reliance on internal properties like `.internal` or `.json` structure, using only the documented `toJsonSchema` options.

### Handling Unions
The strategy preserves "loose" coercion for mixed-type unions (e.g. `number | string`). If it *could* be a number, we try to parse it. If parsing fails, we leave it alone, and the subsequent `.pipe(schema)` handles the validation.
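A minimal sketch of what "loose" coercion could look like for a single value, assuming environment values arrive as strings. The helper names `maybeParsedNumberSketch` and `maybeParsedBooleanSketch` are hypothetical and may not match the helpers in `coerce.ts`; the point is only that a failed parse returns the original value untouched so the schema can still accept or reject it.

```ts
// Try to read a string as a number; on failure, hand back the input unchanged.
function maybeParsedNumberSketch(value: unknown): unknown {
  if (typeof value !== "string") return value;
  const trimmed = value.trim();
  // Number("") is 0, which would silently turn an empty variable into 0.
  if (trimmed === "") return value;
  const parsed = Number(trimmed);
  return Number.isNaN(parsed) ? value : parsed;
}

// Only the exact strings "true" and "false" are treated as booleans here.
function maybeParsedBooleanSketch(value: unknown): unknown {
  if (value === "true") return true;
  if (value === "false") return false;
  return value;
}
```

For a `number | string` union, `"3000"` would become `3000` while `"localhost"` would stay a string, and the final `.pipe(schema)` decides whether the result is valid. Note that `Number()` also accepts forms such as `"1e3"` and `"0x10"`; whether that is desirable is a design choice this sketch does not settle.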
2 changes: 1 addition & 1 deletion openspec/changes/coercion-public-api/proposal.md
@@ -9,7 +9,7 @@ The original coercion implementation relied on undocumented ArkType internal API
Switch to a **Schema-Directed Coercion** approach.

Instead of inspecting proprietary ArkType structures (`schema.in.json`) or mutating internals, we will:
1. Introspect the schema's input requirements using the **standard** `schema.in.toJsonSchema()` API. This provides a strictly typed, version-controlled JSON Schema (Draft 2020-12) of the schema's input side, ensuring compatibility even when the schema contains morphs.
1. Introspect the schema's input requirements using the **standard** `schema.in.toJsonSchema({ fallback: (ctx) => ctx.base })` API. This provides a strictly typed, version-controlled JSON Schema (Draft 2020-12) of the schema's input side. The fallback mechanism ensures resilience against unjsonifiable types (e.g., those with custom predicates like `string.url`) by preserving the base structural information.
2. Identify paths that expect `number` or `boolean` types by traversing standard JSON Schema fields (`type`, `anyOf`, `const`, `enum`).
3. Pre-process the input data (environment variables) to coerce values at those paths *before* passing the data to ArkType for final validation.
4. Wrap the original schema in a pipeline: `type("unknown").pipe(applyCoercion).pipe(schema)`.
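Steps 3 and 4 can be sketched without ArkType by treating the final validator as a plain function. Everything here is illustrative: `TargetSketch`, `coerceAt`, `applyCoercionSketch`, and `withCoercion` are hypothetical names, only flat (single-segment) paths are handled, and `validate` stands in for the original schema at the end of the pipeline.

```ts
type TargetSketch = { path: string[]; kind: "number" | "boolean" };

// Coerce one string value according to the expected kind; leave anything else alone.
const coerceAt = (value: unknown, kind: TargetSketch["kind"]): unknown => {
  if (typeof value !== "string") return value;
  if (kind === "boolean") {
    return value === "true" ? true : value === "false" ? false : value;
  }
  const parsed = Number(value);
  return value.trim() === "" || Number.isNaN(parsed) ? value : parsed;
};

// Step 3: pre-process the raw data at the known paths, without mutating the input.
function applyCoercionSketch(data: unknown, targets: TargetSketch[]): unknown {
  if (typeof data !== "object" || data === null) return data;
  const copy: Record<string, unknown> = { ...(data as Record<string, unknown>) };
  for (const { path, kind } of targets) {
    const [key] = path;
    // Nested paths are out of scope for this sketch.
    if (key === undefined || path.length !== 1) continue;
    if (key in copy) copy[key] = coerceAt(copy[key], kind);
  }
  return copy;
}

// Step 4: coercion first, validation second (mirrors unknown -> applyCoercion -> schema).
const withCoercion =
  <T>(targets: TargetSketch[], validate: (input: unknown) => T) =>
  (raw: unknown): T =>
    validate(applyCoercionSketch(raw, targets));
```

Because coercion never rejects anything itself, all error reporting stays with the validator, which sees values that are either already coerced or still in their original string form.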
8 changes: 7 additions & 1 deletion packages/arkenv/src/utils/coerce.ts
@@ -213,7 +213,13 @@ const applyCoercion = (data: unknown, targets: CoercionTarget[]) => {
* before validation.
*/
export function coerce<t, $ = {}>(schema: BaseType<t, $>): BaseType<t, $> {
	const json = schema.in.toJsonSchema();
	// Use a fallback to handle unjsonifiable parts of the schema (like predicates)
	// by preserving the base schema. This ensures that even if part of the schema
	// cannot be fully represented in JSON Schema, we can still perform coercion
	// for the parts that can.
	const json = schema.in.toJsonSchema({
		fallback: (ctx) => ctx.base,
	});
	const targets = findCoercionPaths(json);

	if (targets.length === 0) {