A dynamic, enriched intelligence system mapping the foundation model landscape. Trust. Trace. Transform.

adrianwedd/ModelAtlas

🌐 ModelAtlas

Map the Modelverse. Trace the Truth. Shape the Future.

⸻

🧬 Introduction

ModelAtlas is a modular intelligence framework for parsing, enriching, auditing, and visualizing metadata about foundation models, with every enrichment step recorded so that results can be audited.

Built for researchers, engineers, analysts, and agentic systems, it connects raw metadata with recursive enrichment and provenance tracking, creating an inspectable, extensible, and trust-aware knowledge layer for the open model ecosystem.

💡 Trust. Trace. Transform.

⚡ Quick Start

pip install -r requirements.txt && playwright install
python enrich/main.py
python -m atlas search "llama"

Create a .env file for API keys:

cp .env.example .env  # or run `atlas init`

The resulting .env file holds configuration keys such as LLM_API_KEY, OPENAI_API_KEY, HUGGING_FACE_API_KEY, and the optional PLAYWRIGHT_BROWSERS_PATH. Populate these values before running the enrichment pipeline or CLI tools.
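
Before running the pipeline, it can help to confirm that the keys are actually populated. The snippet below is a minimal sketch and not part of ModelAtlas: `parse_env` and `missing_keys` are hypothetical helpers, and treating the three API keys as required is an assumption (the README only marks PLAYWRIGHT_BROWSERS_PATH as optional).

```python
from pathlib import Path

# Key names come from the README; treating all three as required is an assumption.
REQUIRED_KEYS = ("LLM_API_KEY", "OPENAI_API_KEY", "HUGGING_FACE_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("Missing keys:", missing_keys(env) or "none")
```

Run it from the repository root after copying .env.example; any key it lists still needs a value.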

Run make test-deps if you need all packages for the test suite.

enrich/main.py runs the enrichment trace, and CLI commands reside in the atlas_cli/ package.

🐳 Docker Usage

Build a container with all Python dependencies and Playwright browsers preinstalled:

docker build -t modelatlas .

Launch an interactive shell inside the container:

docker run --rm -it modelatlas

You can then run the enrichment trace or CLI tools just as on the host system:

python enrich/main.py
atlas search "llama"

⸻

🧠 System Overview

The diagram below is generated from architecture.mmd. You can edit it there to update the visual representation of the system.

flowchart TD
  subgraph Trace [ModelAtlas Enrichment Trace]
    A[🌐 Ollama Scraper<br/>Collects raw metadata from Ollama.com] --> B[πŸ“¦ Raw Model Data]
    B --> C[🧠 RECURSOR-1<br/>Recursive Enrichment Agent]
    C --> D[πŸ“ Enriched JSON Records]
    D --> E[πŸ›‘οΈ TrustForge<br/>Score models based on heuristic fusion]
    D --> F[πŸ” TracePoint<br/>Lineage & Provenance Debugger]
    D --> G[πŸ“Š AtlasView<br/>Visual Analytics Dashboard]
  end
  G --> H[πŸ‘€ Developer / Analyst]
  F --> H
  E --> H
  • Ollama Scraper: Harvests raw model data, including tags, manifests, and configuration files.
  • RECURSOR-1: Normalizes fields, infers missing data, and leverages LLMs for comprehensive enrichment.
  • TrustForge: Computes trust scores by fusing heuristic metrics from multiple data sources. It now runs automatically inside the enrichment trace.
  • TracePoint: Tracks enrichment lineage, prompt decision paths, and source deltas for transparent provenance.
  • AtlasView: A web-based dashboard enabling search, filtering, comparative analysis, and visual audits.

⸻

🧭 Core Components

atlas - 🌐 Semantic Search Subcommand

  • Enables powerful search across enriched model metadata fields.
  • Supports embeddings, advanced filters, and fuzzy matching techniques.
  • Example usage:
    atlas search "open model for code completion"
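
To make the fuzzy-matching idea concrete, here is a minimal sketch using only the standard library's `difflib`. It is not the `atlas` implementation: the records, field names, and the `score` and `search` helpers are hypothetical, and it covers string similarity only, not embeddings or filters.

```python
from difflib import SequenceMatcher

# Hypothetical records for illustration; the real schema is described in docs/schema.md.
MODELS = [
    {"name": "llama3:8b", "description": "general purpose chat model"},
    {"name": "codellama:7b", "description": "open model for code completion"},
    {"name": "gemma:2b", "description": "small multilingual model"},
]


def score(query: str, record: dict) -> float:
    """Best similarity ratio between the query and any field of the record."""
    q = query.lower()
    return max(SequenceMatcher(None, q, str(v).lower()).ratio() for v in record.values())


def search(query: str, records: list[dict], limit: int = 3) -> list[dict]:
    """Rank records by fuzzy similarity, best match first."""
    return sorted(records, key=lambda r: score(query, r), reverse=True)[:limit]


if __name__ == "__main__":
    for hit in search("open model for code completion", MODELS):
        print(hit["name"])
```

Character-level similarity tolerates typos but does not capture meaning, which is where an embedding-based ranker would come in.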

trustforge - 🛡️ Trust Score Engine

  • Aggregates and fuses diverse metrics including:
    • License compliance and compatibility
    • Download statistics and popularity
    • Upstream lineage and provenance
    • LLM-inferred risk assessments
  • Produces a comprehensive trust_score for each model.
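
One simple way to fuse such metrics is a weighted mean. The sketch below illustrates that idea only; it is not TrustForge's actual formula. The signal names loosely mirror the list above, the weights are made up, and each signal is assumed to be pre-normalized to [0, 1] with 1.0 as the best value (so LLM-inferred risk is expressed as `llm_safety`, that is, one minus the risk).

```python
# Illustrative weights; these are assumptions, not values used by ModelAtlas.
WEIGHTS = {"license": 0.4, "popularity": 0.2, "lineage": 0.2, "llm_safety": 0.2}


def trust_score(signals: dict[str, float]) -> float:
    """Weighted mean of the available signals, each clamped to [0, 1].

    Signals that are missing are skipped and the remaining weights
    are renormalized, so a partial record still gets a score.
    """
    total = 0.0
    weight_sum = 0.0
    for name, weight in WEIGHTS.items():
        if name not in signals:
            continue
        value = min(1.0, max(0.0, signals[name]))
        total += weight * value
        weight_sum += weight
    return round(total / weight_sum, 3) if weight_sum else 0.0


if __name__ == "__main__":
    print(trust_score({"license": 1.0, "popularity": 0.5, "lineage": 0.5, "llm_safety": 0.0}))
```

Renormalizing over the signals that are present keeps scores comparable between fully and partially enriched models, at the cost of hiding how much evidence backs each score.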

recursor - 🔁 Recursive Enrichment Agent

  • Parses manifests and configuration blobs to extract detailed metadata.
  • Enriches attributes such as context length, base model lineage, quantization details, and architecture specifics.
  • Suggests tasks.yml patches to correct or complete missing fields.
  • Employs LLMs to intelligently infer and validate metadata where necessary.
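
As a rough illustration of rule-based enrichment, the sketch below fills in missing fields from a model tag and records which fields were inferred. Everything here is an assumption made for the example: the `enrich` helper, the `tag`, `parameters_b`, `quantization`, and `inferred_fields` names, and the two regular expressions. It does not cover the LLM-assisted path.

```python
import re

# Hypothetical pattern: parameter counts such as "8b" or "0.5b" inside a tag.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)b\b", re.IGNORECASE)
# Hypothetical pattern: quantization suffixes such as "q4_0" or "q4_K_M".
QUANT_RE = re.compile(r"\b(q\d+(?:_[a-z0-9]+)*)\b", re.IGNORECASE)


def enrich(record: dict) -> dict:
    """Fill missing fields from the tag and note which ones were inferred."""
    enriched = dict(record)  # never mutate the raw record
    inferred = []
    tag = record.get("tag", "")

    if "parameters_b" not in enriched:
        match = SIZE_RE.search(tag)
        if match:
            enriched["parameters_b"] = float(match.group(1))
            inferred.append("parameters_b")

    if "quantization" not in enriched:
        match = QUANT_RE.search(tag)
        if match:
            enriched["quantization"] = match.group(1).upper()
            inferred.append("quantization")

    enriched["inferred_fields"] = inferred
    return enriched


if __name__ == "__main__":
    print(enrich({"tag": "llama3:8b-instruct-q4_0"}))
```

Keeping the list of inferred fields separate from the values makes it possible to distinguish scraped facts from guesses later on.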

tracepoint - 🔍 Provenance & Lineage Debugger

  • Enables deep inspection of any model's provenance by tracing:
    • Original raw scrape data
    • Config blob origins
    • Step-by-step enrichment history
    • Prompt decision trees and rationale
  • Usage example:
    tracepoint llama3:8b --lineage
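
A step-by-step enrichment history can be modeled as an ordered list of change records. The sketch below is one hypothetical shape for such a log, not TracePoint's actual data model: `EnrichmentStep`, `field_history`, `replay`, and the sample values are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class EnrichmentStep:
    """One recorded change to a model record (hypothetical structure)."""
    stage: str        # e.g. "scrape", "config_blob", "recursor"
    field_name: str   # which metadata field changed
    old: Any          # value before the step (None if unset)
    new: Any          # value after the step
    rationale: str    # why the change was made


def field_history(steps: list[EnrichmentStep], field_name: str) -> list[EnrichmentStep]:
    """Return the steps that touched one field, in application order."""
    return [step for step in steps if step.field_name == field_name]


def replay(steps: list[EnrichmentStep]) -> dict[str, Any]:
    """Rebuild the final record by applying every step in order."""
    record: dict[str, Any] = {}
    for step in steps:
        record[step.field_name] = step.new
    return record


# Made-up history for a fictional model.
HISTORY = [
    EnrichmentStep("scrape", "license", None, "unknown", "not stated on model page"),
    EnrichmentStep("config_blob", "context_length", None, 8192, "read from config"),
    EnrichmentStep("recursor", "license", "unknown", "custom", "inferred from base model"),
]

if __name__ == "__main__":
    for step in field_history(HISTORY, "license"):
        print(f"{step.stage}: {step.old!r} -> {step.new!r} ({step.rationale})")
```

Because the log is append-only, the final record can always be rebuilt with `replay`, and any single value can be traced back to the stage that set it.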

atlasview - 📊 React Dashboard

  • Developed with Tailwind CSS and Recharts for responsive and interactive visualizations.
  • Presents:
    • Model landscape visualizations by size, trust score, and license type
    • Detailed lineage trees illustrating model ancestry
    • Metadata completeness and quality indicators

Note: The dashboard code is currently under active development and is not yet included in this repository. It will be released in a future update.


📁 Project Structure

modelatlas/
├── atlas_cli/             # CLI Tool for semantic search and inspection
├── enrich/                # Recursive enrichers and prompt injectors
├── trustforge/            # Trust scoring engine and heuristics
├── recursor/              # Autonomous enrichment agent logic
├── tracepoint/            # Model inspection and audit trail tools
├── dashboards/            # React-based frontend UI components
├── data/
│   ├── models_raw.json         # Raw, unprocessed scrape data
│   └── models_enriched.json    # Post-enrichment metadata output
├── docs/
│   ├── naming.md
│   ├── schema.md
│   ├── usage_examples.md
│   └── PHASE_2_DESIGN.md
├── AGENTS.md
├── tasks.yml
├── README.md
└── atlas.config.json

⸻

🛠️ Tech Stack

| Layer | Technology Stack |
| --- | --- |
| Backend | Python (requests, asyncio, typer) |
| Dashboard | React, Tailwind CSS, Recharts |
| LLMs | OpenAI, DeepSeek, Gemma, Ollama (local) |
| CLI UX | typer, rich, fuzzy search |
| Storage | JSON (canonical), YAML (tasks), Git |
| Agents | RECURSOR-1 + Codex-style patchers |
| DevOps | GitHub Actions, local runners |

⸻

🧪 Example Commands

# Run enrichment trace (includes trust scoring)
python enrich/main.py

# Perform semantic search for multilingual open-license models
atlas search "multilingual open license"

# Trace enrichment lineage for a specific model
tracepoint gemma:2b --lineage

# Launch the interactive dashboard locally (requires the dashboard code, which is not yet in this repository)
cd dashboards && npm run dev

All HTTP requests made by the scrapers are cached in .cache/http.sqlite by default. Use --no-cache to disable caching.
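
For readers curious how a SQLite-backed HTTP cache of this kind can work, here is a minimal sketch built on the standard library. It is an illustration under assumptions, not the scrapers' actual code: the `HttpCache` class, the `responses` table layout, and the `use_cache` parameter (standing in for the effect of --no-cache) are all hypothetical.

```python
import sqlite3
import urllib.request
from pathlib import Path
from typing import Callable


def _download(url: str) -> bytes:
    """Fetch a URL over the network and return the raw body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


class HttpCache:
    """URL -> body cache stored in one SQLite table (hypothetical layout)."""

    def __init__(self, path: str = ".cache/http.sqlite",
                 fetch: Callable[[str], bytes] = _download):
        if path != ":memory:":
            Path(path).parent.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS responses (url TEXT PRIMARY KEY, body BLOB)"
        )
        self.fetch = fetch

    def get(self, url: str, use_cache: bool = True) -> bytes:
        """Return the cached body if present; otherwise fetch and store it."""
        if use_cache:
            row = self.db.execute(
                "SELECT body FROM responses WHERE url = ?", (url,)
            ).fetchone()
            if row is not None:
                return row[0]
        body = self.fetch(url)
        self.db.execute(
            "INSERT OR REPLACE INTO responses (url, body) VALUES (?, ?)", (url, body)
        )
        self.db.commit()
        return body
```

Passing `use_cache=False` forces a fresh download and refreshes the stored copy. The `fetch` argument is injectable so the cache can be exercised without network access.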

✅ Running Tests

Execute the test suite with pytest from the repository root:

pytest

📦 Git LFS Setup

This repository uses Git LFS to version large JSON artifacts. Run the following commands after cloning:

git lfs install
git lfs pull

All files in data/ and enriched_outputs/ are tracked via LFS, so new assets in these directories are stored automatically.

⸻

🚀 Naming Subsystem

| Subsystem | Designation | Role Description |
| --- | --- | --- |
| Full System | ModelAtlas | The overarching meta-system |
| Enrichment | Recursor | Autonomous recursive enrichment agent |
| Trust Engine | TrustForge | Assigns and computes trust scores |
| Lineage Tool | TracePoint | Debugs provenance and lineage |
| Dashboard | AtlasView | Frontend user interface |
| CLI | atlas-cli | Search and inspection command-line tool |

⸻

📋 Meta Documentation

| Documentation File | Purpose |
| --- | --- |
| AGENTS.md | Details on enrichment agents, memory, and state logic |
| tasks.yml | Canonical task graph defining enrichment traces |
| naming.md | Naming philosophy and conventions |
| schema.md | Data schema specification for enriched model entries |
| usage_examples.md | Real-world CLI traces and usage patterns |
| PHASE_2_DESIGN.md | Design notes on manifest decoding, tag repair, and scoring implementation |
| CODE_OF_CONDUCT.md | Community expectations and enforcement policy |
| SECURITY.md | How to report vulnerabilities |

⸻

🧠 Philosophy

ModelAtlas is founded on these core principles:

  • 🔎 Transparency over Obfuscation
  • ♻️ Recursive Enrichment is Integral, Not Optional
  • 🛡️ Trust Must Be Quantifiable and Measurable
  • 🧠 LLMs Are Tools That Can Self-Improve and Assist

We hold that metadata is critical infrastructure, and that systems should be able to explain their own construction with clarity and rigor.


🀝 Community


Map the modelscape. Trace the truth. Shape the future. 🧭 Welcome to the Atlas.

βš–οΈ License

This project is licensed under the MIT License.
