HML

A minimal, artifact-driven machine learning core where every run is explicit, reproducible, and inspectable.


HML is a minimal, artifact-driven machine learning core built from first principles.

There is:

  • no framework
  • no runtime magic
  • no mutable global state
  • no opaque training loop

Instead, HML treats structure, execution, and history as explicit, inspectable artifacts on disk.


Core Philosophy

HML separates concerns rigorously:

  • Structure
    What the model is (neurons, layers, networks)

  • Semantics
    How structure is interpreted (forward, backward, training)

  • Execution
    When semantics are run, producing immutable artifacts

Nothing implicit. Nothing hidden.
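
For concreteness, here is a compact Haskell sketch of the split (the roadmap's mention of Read/Show suggests a Haskell codebase); every name in it is illustrative, not HML's actual API:

  -- Structure: what the model is. Plain data, no behaviour attached.
  data Neuron     = Neuron [Double] Double   -- input weights and a bias
  newtype Layer   = Layer   [Neuron]
  newtype Network = Network [Layer]

  -- Semantics: how structure is interpreted. Pure functions over data.
  stepNeuron :: Neuron -> [Double] -> Double
  stepNeuron (Neuron ws b) xs = sum (zipWith (*) ws xs) + b

  -- Execution: when semantics are run. IO that commits the result to
  -- disk as a new artifact instead of mutating anything in place.
  commit :: Show a => FilePath -> a -> IO ()
  commit path = writeFile path . show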


What Exists Today

1. A Complete XOR Network

Implemented from scratch using:

  • monoidal composition
  • structural induction
  • explicit semantic passes

The network supports:

  • forward evaluation
  • backward gradients
  • training via gradient descent

No ML libraries. No framework abstractions.
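
To make "explicit semantic passes" concrete, here is a minimal Haskell sketch of one forward evaluation and one gradient-descent step for a single sigmoid neuron. It is illustrative only (XOR itself needs a hidden layer, which the real network composes), and none of these names come from HML:

  sigmoid :: Double -> Double
  sigmoid z = 1 / (1 + exp (negate z))

  -- Forward pass: weights (bias folded in as the last element)
  -- applied to an input vector.
  forwardN :: [Double] -> [Double] -> Double
  forwardN w x = sigmoid (sum (zipWith (*) w (x ++ [1])))

  -- Backward pass + update: squared-error gradient, written as a pure
  -- function from old weights to new weights. Nothing is mutated.
  stepN :: Double -> ([Double], Double) -> [Double] -> [Double]
  stepN lr (x, target) w = zipWith (\wi xi -> wi - lr * g * xi) w xs
    where xs = x ++ [1]
          y  = forwardN w x
          g  = (y - target) * y * (1 - y)   -- dError/dPreactivation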


2. Artifact-Based Execution Model

Every run produces immutable artifacts instead of mutating state.

Inputs

Stored under:

/inputs

  • inputs-0-0-0.txt
  • inputs-0-0-1.txt
  • inputs-meta.json

Each input artifact:

  • is versioned
  • is hashed (SHA256)
  • has a human-readable note
  • is append-only

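A sketch of what one entry in inputs-meta.json might record, assuming the cryptonite package for SHA256; the field names here are hypothetical, not HML's actual schema:

  import Crypto.Hash (SHA256 (..), hashWith)
  import qualified Data.ByteString as BS

  -- Hypothetical metadata record for one input artifact.
  data InputMeta = InputMeta
    { imVersion :: (Int, Int, Int)  -- e.g. (0,0,1) for inputs-0-0-1.txt
    , imSha256  :: String           -- hex digest of the artifact's bytes
    , imNote    :: String           -- human-readable note
    } deriving (Read, Show)

  -- Hash the artifact file and build its metadata entry.
  mkInputMeta :: (Int, Int, Int) -> String -> FilePath -> IO InputMeta
  mkInputMeta ver note path = do
    bytes <- BS.readFile path
    pure (InputMeta ver (show (hashWith SHA256 bytes)) note)
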
Weights

Stored under:

/weights

  • weights-0-0-0.txt
  • weights-0-0-1.txt
  • weights-meta.json

Each weights artifact:

  • corresponds to exactly one run
  • is immutable
  • is append-only
  • records evaluation metadata (e.g. accuracy)
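
Append-only commits stay trivially simple. A minimal sketch, assuming one Show'd entry per line in the meta file (the roadmap notes that structured JSON is planned instead):

  -- Hypothetical metadata record for one weights artifact.
  data WeightsMeta = WeightsMeta
    { wmVersion  :: (Int, Int, Int)  -- e.g. (0,0,1) for weights-0-0-1.txt
    , wmAccuracy :: Double           -- evaluation metadata for this run
    } deriving (Read, Show)

  -- A commit appends one line; earlier entries are never rewritten.
  commitWeights :: FilePath -> WeightsMeta -> IO ()
  commitWeights metaPath entry = appendFile metaPath (show entry ++ "\n")

  -- Reading back: position in the list is the run index.
  readWeightsMeta :: FilePath -> IO [WeightsMeta]
  readWeightsMeta metaPath = map read . lines <$> readFile metaPath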

3. Run Alignment (The Key Idea)

Runs are implicitly indexed by position.

There is no run ID.
The index is the run.

Invariants:

  • inputs-meta.json and weights-meta.json are append-only
  • index i in both files represents the same run
  • a run is complete iff both entries exist at index i
  • incomplete runs are detectable by a length mismatch (see the sketch below)

This enables:

  • deterministic replay
  • crash detection
  • auditability
  • zero coordination overhead
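
Under this invariant, checking a run needs no IDs, locks, or coordination, only list lengths. A minimal sketch, assuming each meta file parses to a list whose position i is run i:

  data RunStatus = Complete | Incomplete deriving (Eq, Show)

  -- Run i is complete iff both meta lists have an entry at index i.
  runStatus :: [a] -> [b] -> Int -> Maybe RunStatus
  runStatus inputs weights i
    | i < min ni nw = Just Complete
    | i < max ni nw = Just Incomplete  -- one artifact committed, one missing
    | otherwise     = Nothing          -- no such run
    where (ni, nw) = (length inputs, length weights)

  -- Crash detection: a length mismatch means the latest run never
  -- finished committing both of its artifacts.
  hasIncompleteRun :: [a] -> [b] -> Bool
  hasIncompleteRun inputs weights = length inputs /= length weights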

4. Permutation Engine

Before each training run, inputs are transformed via a permutation engine.

Currently implemented:

  • deterministic A/B permutations
  • each permutation is committed as a new input artifact
  • hashes detect whether inputs are genuinely new (see the sketch below)

This lays the groundwork for:

  • seen-variant detection
  • recursive subdivision (halves → quarters → etc.)
  • controlled exploration of input space
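
A sketch of the hash-based novelty check from the list above, again assuming cryptonite for SHA256; the halving A/B swap shown here is a stand-in, not necessarily the permutation HML performs:

  import Crypto.Hash (SHA256 (..), hashWith)
  import qualified Data.ByteString.Char8 as BS
  import qualified Data.Set as Set

  -- A deterministic A/B permutation: swap the input's two halves.
  permuteAB :: BS.ByteString -> BS.ByteString
  permuteAB bs = b <> a
    where (a, b) = BS.splitAt (BS.length bs `div` 2) bs

  -- Commit a permuted input only if its hash is genuinely unseen.
  isNewInput :: Set.Set String -> BS.ByteString -> Bool
  isNewInput seen bytes = show (hashWith SHA256 bytes) `Set.notMember` seen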

Why This Matters

Most machine learning systems:

  • overwrite state
  • hide history
  • cannot be replayed
  • cannot be audited

HML does the opposite.

You can:

  • replay any run
  • inspect exactly what data was used
  • verify identity via hashes
  • detect partial or corrupted runs
  • extend the system without breaking invariants

Current Status

  • XOR model: complete
  • Training: functional
  • Input versioning: implemented
  • Weight versioning: implemented
  • Hashing: implemented
  • Run invariants: enforced

What Comes Next

  • recursive permutation refinement
  • hash-based pruning of seen inputs
  • accuracy-driven branching
  • promotion to a general experiment engine
  • structured JSON metadata (instead of Read/Show)

Non-Goals

  • no automatic hyperparameter search
  • no GPU acceleration
  • no black-box optimizers
  • no silent mutation

If something changes, it becomes a new artifact.


Summary

HML is not a framework.

It is a discipline:

  • structure first
  • semantics explicit
  • history preserved
  • artifacts over state

If it ran, it exists. If it exists, it can be audited. If it can’t be audited, it doesn’t belong here.
