Coercion through preprocessing PoC #465
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| # Design: Coercion | ||
|
|
||
| ## Architecture | ||
| The coercion logic will be implemented as a preprocessing step within the `createEnv` function. | ||
|
|
||
| ### Flow | ||
| 1. **Input**: `createEnv` receives a schema definition (`def`) and an environment object (`env`). | ||
| 2. **Inspection**: We inspect `def` to identify keys that expect primitive types (number, boolean) but will receive strings from `env`. | ||
| * This inspection primarily targets schema definitions provided as plain objects with string values (e.g., `{ PORT: "number" }`). | ||
| * Complex ArkType definitions (already compiled types) may be skipped or require advanced introspection (out of scope for initial implementation). | ||
| 3. **Coercion**: | ||
| * For each identified key, we check the corresponding value in `env`. | ||
| * If the target type is `number` (or subtypes like `number.port`, `number.epoch`), we attempt to convert the string to a number using `Number()` or `parseFloat()`. | ||
| * If the target type is `boolean`, we convert "true" to `true` and "false" to `false`. | ||
| 4. **Validation**: The modified `env` object (with coerced values) is passed to the ArkType schema for validation. | ||
|
|
||
| ## Decisions | ||
|
|
||
| ### Decision: Use Preprocessing for Coercion | ||
| We decided to implement coercion as a preprocessing step that runs *before* ArkType validation, rather than using ArkType's native "morphs" or scope-level overrides. | ||
|
|
||
| **Rationale:** | ||
| 1. **Scope Limitations**: As confirmed by the ArkType creator, there is no mechanism to apply a morph to an entire scope (e.g., "all numbers"). We would have to manually override `number` and every subtype (`number.port`, `number.epoch`, etc.), which is brittle and unscalable. | ||
|
||
| 2. **Separation of Concerns**: Coercion (parsing a string into a primitive) is distinct from Validation (checking if that primitive meets criteria). Keeping coercion separate allows `arkenv` to handle the "environment variable boundary" explicitly, ensuring that `number` in the schema always validates a real JavaScript number. | ||
| 3. **Complexity**: Implementing type-level mapping for global coercion would introduce significant complexity to the types, whereas a runtime preprocessor is straightforward and easier to maintain. | ||
|
|
||
| **Alternatives Considered:** | ||
| * **ArkType Morphs**: We considered using `type("string").pipe(...)` or overriding keywords in the scope. This was rejected because it requires manual per-type configuration or complex scope manipulation that doesn't propagate to sub-keywords. | ||
| * **Manual Parsing**: Continuing with the current state where users manually pipe string types. This was rejected as it degrades developer experience. | ||
|
|
||
| ## Risks / Trade-offs | ||
| * **String Definitions**: This approach relies on inspecting the schema definition. It works best when users provide string definitions (e.g., `{ PORT: "number" }`). If a user provides a pre-compiled `type("number")`, we cannot easily inspect it to apply coercion, meaning those values might remain strings and fail validation. We will document this limitation. | ||
|
|
||
| ## Implementation Details | ||
|
|
||
| ### `coerce` Utility | ||
| We will create a utility function `coerce(def: Record<string, unknown>, env: Record<string, string | undefined>)` that returns a new environment object. | ||
|
|
||
| ```typescript | ||
| function coerce(def: Record<string, unknown>, env: Record<string, string | undefined>) { | ||
| const coerced = { ...env }; | ||
| for (const key in def) { | ||
| const typeDef = def[key]; | ||
| if (typeof typeDef === "string") { | ||
| if (typeDef.startsWith("number")) { | ||
| // Coerce to number | ||
| } else if (typeDef === "boolean") { | ||
| // Coerce to boolean | ||
| } | ||
| } | ||
| } | ||
| return coerced; | ||
| } | ||
| ``` | ||
|
|
||
| ### Integration | ||
| In `createEnv`: | ||
|
|
||
| ```typescript | ||
| export function createEnv(def, env = process.env) { | ||
| // ... | ||
| const coercedEnv = isPlainObject(def) ? coerce(def, env) : env; | ||
| const validatedEnv = schema(coercedEnv); | ||
| // ... | ||
| } | ||
| ``` | ||
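
The `coerce` skeleton and the `createEnv` integration above leave the conversion steps as comments and do not define `isPlainObject`. The following is a minimal sketch of how those pieces might be filled in, not the code merged in this PR. The `RawEnv` and `CoercedEnv` aliases and the body of `isPlainObject` are assumptions made for illustration, and the keyword check is slightly stricter than the `startsWith("number")` shown in the design.

```typescript
type RawEnv = Record<string, string | undefined>;
type CoercedEnv = Record<string, string | number | boolean | undefined>;

// Hypothetical guard: true for plain object literals only, so anything else
// (for example a compiled type) skips the coercion step.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== "object" || value === null) return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

function coerce(def: Record<string, unknown>, env: RawEnv): CoercedEnv {
  // Copy first so keys that are not in the definition pass through untouched.
  const coerced: CoercedEnv = { ...env };
  for (const key of Object.keys(def)) {
    const typeDef = def[key];
    const value = env[key];
    // Only string definitions and string values are candidates for coercion.
    if (typeof typeDef !== "string" || typeof value !== "string") continue;

    if (typeDef === "number" || typeDef.startsWith("number.")) {
      const parsed = Number(value);
      // Blank or non-numeric input stays a string so validation reports it.
      if (value.trim() !== "" && !Number.isNaN(parsed)) coerced[key] = parsed;
    } else if (typeDef === "boolean") {
      if (value === "true") coerced[key] = true;
      else if (value === "false") coerced[key] = false;
    }
  }
  return coerced;
}
```

Leaving a failed conversion as the original string matches the pass-through requirement in the spec: the validator, rather than the preprocessor, is what reports the bad value.
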
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Coercion | ||
|
|
||
| ## Problem | ||
| Environment variables are always strings at runtime, but users want to treat them as typed primitives without manual conversion. | ||
|
|
||
| **Current state:** | ||
| ```typescript | ||
| // Manual conversion required | ||
| const env = arkenv({ | ||
| PORT: type("string").pipe(str => Number.parseInt(str, 10)), | ||
| DEBUG: type("string").pipe(str => str === "true") | ||
| }); | ||
| ``` | ||
|
|
||
| **Desired state:** | ||
| ```typescript | ||
| // Coercion | ||
| const env = arkenv({ | ||
| PORT: "number", // "3000" → 3000 | ||
| DEBUG: "boolean", // "true" → true | ||
| TIMESTAMP: "number.epoch" // "1640995200000" → 1640995200000 | ||
|
||
| }); | ||
| ``` | ||
|
|
||
| ## Solution | ||
| Implement an automatic coercion layer in `arkenv` that runs before ArkType validation. This layer will inspect the provided schema definition and, where possible, convert string environment variables into their target primitive types (number, boolean) so that ArkType can validate them as such. | ||
|
|
||
| This approach allows `arkenv` to support "native" feeling environment variables while leveraging ArkType's powerful validation for the final values. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # Spec: Coercion | ||
|
|
||
| ## ADDED Requirements | ||
|
|
||
| ### Requirement: Coerce numeric strings to numbers | ||
| The system MUST coerce environment variable strings to numbers when the schema definition specifies `number` or a `number.*` subtype. | ||
|
|
||
| #### Scenario: Basic number coercion | ||
| Given a schema `{ PORT: "number" }` | ||
| And an environment `{ PORT: "3000" }` | ||
| When `arkenv` parses the environment | ||
| Then the result should contain `PORT` as the number `3000` | ||
|
|
||
| #### Scenario: Number subtype coercion | ||
| Given a schema `{ TIMESTAMP: "number.epoch" }` | ||
| And an environment `{ TIMESTAMP: "1640995200000" }` | ||
| When `arkenv` parses the environment | ||
| Then the result should contain `TIMESTAMP` as the number `1640995200000` | ||
|
||
|
|
||
| ### Requirement: Coerce boolean strings to booleans | ||
| The system MUST coerce environment variable strings "true" and "false" to boolean values when the schema definition specifies `boolean`. | ||
|
|
||
| #### Scenario: Boolean true coercion | ||
| Given a schema `{ DEBUG: "boolean" }` | ||
| And an environment `{ DEBUG: "true" }` | ||
| When `arkenv` parses the environment | ||
| Then the result should contain `DEBUG` as the boolean `true` | ||
|
|
||
| #### Scenario: Boolean false coercion | ||
| Given a schema `{ DEBUG: "boolean" }` | ||
| And an environment `{ DEBUG: "false" }` | ||
| When `arkenv` parses the environment | ||
| Then the result should contain `DEBUG` as the boolean `false` | ||
|
|
||
| ### Requirement: Pass through non-coercible values | ||
| The system MUST pass through values unchanged if they do not match a coercible type definition or if coercion fails (letting ArkType handle the validation error). | ||
|
|
||
| #### Scenario: String pass-through | ||
| Given a schema `{ API_KEY: "string" }` | ||
| And an environment `{ API_KEY: "12345" }` | ||
| When `arkenv` parses the environment | ||
| Then the result should contain `API_KEY` as the string `"12345"` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # Tasks | ||
|
|
||
| - [x] Implement `coerce` utility function in `src/utils.ts` or `src/coerce.ts` | ||
| - [x] Integrate `coerce` into `createEnv` in `src/create-env.ts` | ||
| - [x] Add unit tests for `coerce` logic | ||
| - [x] Add integration tests in `tests/coercion.test.ts` verifying `number`, `boolean`, and sub-keywords | ||
| - [x] Update documentation to explain coercion behavior and limitations |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -59,6 +59,21 @@ ArkEnvError: Errors found while validating environment variables | |
| PORT must be an integer between 0 and 65535 (was "hello") | ||
| ``` | ||
|
|
||
| ## Coercion | ||
|
|
||
| Environment variables are always strings, but ArkEnv automatically coerces them to their target types when possible: | ||
|
|
||
| - `number` and subtypes (`number.port`, `number.epoch`) are parsed as numbers. | ||
| - `boolean` strings ("true", "false") are parsed as booleans. | ||
|
|
||
| ```ts | ||
| const env = arkenv({ | ||
| PORT: "number", // "3000" → 3000 | ||
| DEBUG: "boolean", // "true" → true | ||
| TIMESTAMP: "number.epoch" // "1640995200000" → 1640995200000 | ||
|
||
| }); | ||
| ``` | ||
|
|
||
|
|
||
| ## Features | ||
|
|
||
| - Zero external dependencies | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| import { describe, expect, it } from "vitest"; | ||
| import { coerce } from "./coerce"; | ||
|
|
||
| describe("coerce", () => { | ||
| it("should coerce number strings", () => { | ||
| const def = { PORT: "number" }; | ||
| const env = { PORT: "3000" }; | ||
| const result = coerce(def, env); | ||
| expect(result.PORT).toBe(3000); | ||
| }); | ||
|
|
||
|
|
||
| it("should coerce number subtypes", () => { | ||
| const def = { TIMESTAMP: "number.epoch" }; | ||
| const env = { TIMESTAMP: "1640995200000" }; | ||
| const result = coerce(def, env); | ||
| expect(result.TIMESTAMP).toBe(1640995200000); | ||
|
||
| }); | ||
|
|
||
| it("should coerce boolean 'true'", () => { | ||
| const def = { DEBUG: "boolean" }; | ||
| const env = { DEBUG: "true" }; | ||
| const result = coerce(def, env); | ||
| expect(result.DEBUG).toBe(true); | ||
| }); | ||
|
|
||
| it("should coerce boolean 'false'", () => { | ||
| const def = { DEBUG: "boolean" }; | ||
| const env = { DEBUG: "false" }; | ||
| const result = coerce(def, env); | ||
| expect(result.DEBUG).toBe(false); | ||
| }); | ||
|
|
||
|
|
||
| it("should pass through non-coercible values", () => { | ||
| const def = { API_KEY: "string" }; | ||
| const env = { API_KEY: "12345" }; | ||
| const result = coerce(def, env); | ||
| expect(result.API_KEY).toBe("12345"); | ||
| }); | ||
|
|
||
| it("should pass through values that fail number coercion", () => { | ||
| const def = { PORT: "number" }; | ||
| const env = { PORT: "not-a-number" }; | ||
| const result = coerce(def, env); | ||
| expect(result.PORT).toBe("not-a-number"); | ||
| }); | ||
|
|
||
| it("should pass through values that fail boolean coercion", () => { | ||
| const def = { DEBUG: "boolean" }; | ||
| const env = { DEBUG: "yes" }; | ||
| const result = coerce(def, env); | ||
| expect(result.DEBUG).toBe("yes"); | ||
| }); | ||
|
|
||
| it("should handle undefined values", () => { | ||
| const def = { PORT: "number" }; | ||
| const env = { PORT: undefined }; | ||
| const result = coerce(def, env); | ||
| expect(result.PORT).toBeUndefined(); | ||
| }); | ||
|
|
||
| it("should ignore keys not in definition", () => { | ||
| const def = { PORT: "number" }; | ||
| const env = { PORT: "3000", EXTRA: "foo" }; | ||
| const result = coerce(def, env); | ||
| expect(result.PORT).toBe(3000); | ||
| expect(result.EXTRA).toBe("foo"); | ||
| }); | ||
| }); | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,31 @@ | ||||||||||||
| export function coerce( | ||||||||||||
| def: Record<string, unknown>, | ||||||||||||
| env: Record<string, string | undefined>, | ||||||||||||
| ): Record<string, unknown> { | ||||||||||||
| const coerced: Record<string, unknown> = { ...env }; | ||||||||||||
|
|
||||||||||||
| ): Record<string, unknown> { | |
| const coerced: Record<string, unknown> = { ...env }; | |
| ): Record<string, string | number | boolean | undefined> { | |
| const coerced: Record<string, string | number | boolean | undefined> = { ...env }; |
Copilot
AI
Nov 29, 2025
[nitpick] After checking for undefined, the code should verify that value is actually a string before calling string methods. While RuntimeEnvironment type suggests values are string | undefined, defensive programming would add: if (typeof value !== "string") continue; after line 12 to prevent runtime errors if non-string values are passed.
| } | |
| } | |
| if (typeof value !== "string") { | |
| continue; | |
| } |
Copilot
AI
Nov 29, 2025
Using `Number(value)` for coercion can produce unexpected results for edge cases:
- Empty strings: `Number("")` returns `0`, which may not be the intended behavior
- Whitespace: `Number(" ")` returns `0`
- Scientific notation: `Number("1e3")` returns `1000` (might be acceptable)

Consider using `Number.parseFloat()` or adding explicit validation to reject empty/whitespace-only strings to avoid silent conversion of invalid inputs to `0`.
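
One possible way to act on this comment is sketched below. It keeps `Number()` but rejects blank input explicitly. The helper name `parseStrictNumber` is hypothetical and does not appear in the PR.

```typescript
// Hypothetical helper: returns a number for clean numeric strings and
// undefined for anything that should be left for validation to reject.
function parseStrictNumber(value: string): number | undefined {
  const trimmed = value.trim();
  // Number("") and Number(" ") both evaluate to 0, so blank input is rejected first.
  if (trimmed === "") return undefined;
  const parsed = Number(trimmed);
  return Number.isNaN(parsed) ? undefined : parsed;
}
```

Note that `Number.parseFloat()` on its own has a different trade-off: it returns `NaN` for an empty string, but it also accepts trailing garbage (`Number.parseFloat("12abc")` is `12`), whereas `Number("12abc")` is `NaN`. Which behavior is preferable for environment variables is a design decision for the maintainers.
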
Copilot
AI
Nov 29, 2025
The coercion logic only handles simple string type definitions. It doesn't account for:
- Union types: `"number | string"` will try to coerce even though string is acceptable
- Optional types: `"number?"` will coerce, but the `?` suffix might not be handled
- Types with defaults: `"number = 3000"` will coerce, but the `= 3000` suffix needs testing

Consider adding logic to parse these modifiers or adding tests to verify current behavior with these patterns.
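
As a rough illustration of the modifier handling this comment asks for, the sketch below extracts the coercible base keyword from a string definition. The helper `coercibleKeyword` and the choice to skip unions entirely are assumptions made for this example, not behavior confirmed by the PR.

```typescript
type Coercible = "number" | "boolean";

// Hypothetical helper: finds the coercible base keyword of a string
// definition, or returns undefined when the value should be left alone.
function coercibleKeyword(def: string): Coercible | undefined {
  // Unions are skipped: with "number | string" the raw string is already acceptable.
  if (def.includes("|")) return undefined;
  // Drop a default ("number = 3000"), then a trailing optional marker ("number?").
  const [head = ""] = def.split("=");
  const base = head.trim().replace(/\?$/, "").trim();
  if (base === "boolean") return "boolean";
  if (base === "number" || base.startsWith("number.")) return "number";
  return undefined;
}
```

Anything the helper does not recognize is left uncoerced, which keeps the failure mode visible: the value reaches validation as a string and is reported there.
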
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| import { describe, expect, it } from "vitest"; | ||
| import { createEnv } from "./create-env"; | ||
| import { type } from "./index"; | ||
|
|
||
| describe("coercion integration", () => { | ||
| it("should coerce and validate numbers", () => { | ||
| const env = createEnv({ PORT: "number" }, { PORT: "3000" }); | ||
| expect(env.PORT).toBe(3000); | ||
| expect(typeof env.PORT).toBe("number"); | ||
| }); | ||
|
|
||
| it("should coerce and validate booleans", () => { | ||
| const env = createEnv( | ||
| { DEBUG: "boolean", VERBOSE: "boolean" }, | ||
| { DEBUG: "true", VERBOSE: "false" }, | ||
| ); | ||
| expect(env.DEBUG).toBe(true); | ||
| expect(env.VERBOSE).toBe(false); | ||
| }); | ||
|
|
||
| it("should coerce and validate number subtypes (port)", () => { | ||
| const env = createEnv({ PORT: "number.port" }, { PORT: "8080" }); | ||
| expect(env.PORT).toBe(8080); | ||
| }); | ||
|
||
|
|
||
| it("should fail validation if coercion fails (not a number)", () => { | ||
| expect(() => createEnv({ PORT: "number" }, { PORT: "abc" })).toThrow(); | ||
| }); | ||
|
|
||
| it("should fail validation if value is valid number but invalid subtype", () => { | ||
| expect(() => | ||
| createEnv( | ||
| { PORT: "number.port" }, | ||
| { PORT: "99999" }, // Too large for port | ||
| ), | ||
| ).toThrow(); | ||
| }); | ||
|
|
||
| it("should work with mixed coerced and non-coerced values", () => { | ||
| const env = createEnv( | ||
| { | ||
| PORT: "number", | ||
| HOST: "string", | ||
| DEBUG: "boolean", | ||
| }, | ||
| { | ||
| PORT: "3000", | ||
| HOST: "localhost", | ||
| DEBUG: "true", | ||
| }, | ||
| ); | ||
| expect(env.PORT).toBe(3000); | ||
| expect(env.HOST).toBe("localhost"); | ||
| expect(env.DEBUG).toBe(true); | ||
| }); | ||
|
|
||
|
|
||
| it("should NOT coerce if using compiled types", () => { | ||
| // This documents the limitation | ||
| const schema = type({ PORT: "number" }); | ||
| expect(() => createEnv(schema, { PORT: "3000" })).toThrow(); // "3000" is a string, schema expects number, no coercion happens | ||
| }); | ||
| }); | ||
|
Use the built-in `"number"` type instead of custom `Number()` conversion.

Replace the manual `.pipe((value) => Number(value))` with ArkType's built-in `"number"` type, which handles string-to-number coercion safely with proper NaN validation.

The built-in `"number"` type already coerces valid numeric strings (e.g., `"3000"` → `3000`) and rejects invalid ones (e.g., `"abc"`) during validation. This follows the guideline to leverage ArkType's built-in types where possible and keeps the schema readable and consistent with other fields like `PORT: "number.port"`.