Interactive 2D visualization of embedding spaces. Generate static HTML explorers for embedding data with support for UMAP and t-SNE dimensionality reduction, interactive filtering, and configurable display options.
- Interactive Plotly.js scatter plots - Pan, zoom, hover for details
- Multiple embedding spaces - Compare different embeddings side by side
- UMAP and t-SNE - Toggle between dimensionality reduction methods
- Dynamic coloring - Color by any metadata field
- Filter by groups - Checkbox filters for categories
- Search - Filter points by name
- Toggle labels - Show/hide point labels
- Click navigation - Configure links to detail pages
- Static output - Works on GitHub Pages (no server required)
pip install linkml-embeddings-explorer
Or with uv:
uv add linkml-embeddings-explorer
Create a JSON file with your embeddings and metadata:
{
"spaces": {
"main": {
"embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...], ...],
"metadata": [
{"name": "Item 1", "category": "A", "description": "..."},
{"name": "Item 2", "category": "B", "description": "..."}
]
}
}
}
If you have pre-computed 2D coordinates, include them instead of raw embeddings:
{
"spaces": {
"main": {
"umap": [[1.2, 3.4], [2.3, 4.5], ...],
"tsne": [[0.1, 0.2], [0.3, 0.4], ...],
"metadata": [...]
}
}
}
linkml-embeddings-explorer deploy embeddings.json output/
open output/index.html
Generate an embedding explorer from an embeddings file.
linkml-embeddings-explorer deploy embeddings.json output/
# With custom title
linkml-embeddings-explorer deploy embeddings.json output/ --title "My Data Explorer"
# With config file
linkml-embeddings-explorer deploy embeddings.json output/ --config config.json
Generate a config template from your embeddings file.
linkml-embeddings-explorer init-config embeddings.json -o config.json
Show information about an embeddings file.
linkml-embeddings-explorer info embeddings.json
Index records with linkml-store and cache embeddings using a pipeline config.
linkml-embeddings-explorer index pipeline.yaml
linkml-embeddings-explorer index pipeline.yaml --space gocam --recreate
Export embeddings and metadata to embeddings.json or data.js.
linkml-embeddings-explorer app-data pipeline.yaml -o embeddings.json
linkml-embeddings-explorer app-data pipeline.yaml -o data.js --format js --no-tsne
Create a config.json to customize the explorer:
{
"title": "My Embedding Explorer",
"description": "Explore items in embedding space",
"colorFields": ["category", "type", "_group"],
"defaultColorField": "category",
"hoverFields": ["name", "description", "category"],
"labelField": "name",
"linkTemplate": "../items/{name}.html",
"defaultSpace": "main",
"defaultMethod": "umap",
"backLink": "../index.html",
"backText": "Back to Home"
}

| Option | Description |
|---|---|
| title | Page title |
| description | Description shown in header |
| colorFields | Fields available in "Color By" dropdown |
| defaultColorField | Initial color field |
| hoverFields | Fields shown on hover |
| labelField | Field used for point labels |
| linkTemplate | URL template for click navigation (use {field} placeholders) |
| defaultSpace | Initial embedding space (for multi-space explorers) |
| defaultMethod | Initial reduction method (umap or tsne) |
| backLink | URL for back link in header |
| backText | Text for back link |
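
To make the {field} placeholders in linkTemplate concrete, here is a minimal Python sketch of how such a template could be expanded against one metadata record. The explorer itself does this substitution in the browser. The helper name resolve_link, the URL-quoting of values, and the handling of missing fields are all assumptions made for illustration, not the tool's actual implementation.

```python
import re
from urllib.parse import quote


def resolve_link(template: str, record: dict) -> str:
    """Expand {field} placeholders from a metadata record (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        field = match.group(1)
        # Assumption: a missing field becomes an empty string,
        # and values are percent-encoded so they are safe in a URL path.
        return quote(str(record.get(field, "")), safe="")

    return re.sub(r"\{(\w+)\}", substitute, template)


record = {"name": "Item 1", "category": "A"}
link = resolve_link("../items/{name}.html", record)
print(link)
```

If your detail pages are named after a field other than name, the same pattern applies: use that field in the template and make sure every metadata record carries it.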
from pathlib import Path
import numpy as np
from linkml_embeddings_explorer import EmbeddingExplorerGenerator
# Create from embeddings and metadata
embeddings = np.random.randn(100, 384)
metadata = [{"name": f"Item {i}", "category": "A" if i < 50 else "B"} for i in range(100)]
generator = EmbeddingExplorerGenerator(embeddings, metadata)
generator.generate(Path("output/"), title="My Explorer")
# Or add multiple spaces (emb1/meta1 and emb2/meta2 are your own arrays and metadata lists)
generator = EmbeddingExplorerGenerator()
generator.add_space("pathophysiology", embeddings=emb1, metadata=meta1)
generator.add_space("phenotypes", embeddings=emb2, metadata=meta2)
generator.generate(Path("output/"), title="Multi-Space Explorer")
If you're using linkml-store with LLMIndexer, you can export embeddings for visualization:
# Export embeddings from linkml-store collection
from pathlib import Path
import duckdb
import numpy as np
from linkml_embeddings_explorer.core import EmbeddingExplorerGenerator
# Read embeddings from linkml-store cache
conn = duckdb.connect("cache/embeddings.db", read_only=True)
rows = conn.execute("SELECT text, embedding FROM all_embeddings").fetchall()
conn.close()
# Parse names from text and build metadata
embeddings = []
metadata = []
for text, embedding in rows:
    # The first line of the indexed text is expected to look like "Name: <name>"
    name = text.split("\n")[0].replace("Name: ", "").strip()
    embeddings.append(list(embedding))
    metadata.append({"name": name, "category": "..."})
# Generate explorer
generator = EmbeddingExplorerGenerator(np.array(embeddings), metadata)
generator.generate(Path("explorer/"))
This repo now includes a config-driven pipeline that mirrors the pattern in dismech:
- Index records with linkml-store + LLMIndexer (embeddings cached in DuckDB).
- Export app data (UMAP/t-SNE + metadata) to embeddings.json or data.js.
- Generate the static explorer via deploy.
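
The three steps above can also be chained from a small Python driver. The sketch below only assembles and runs the CLI invocations documented in this README (index, app-data with -o, deploy with --title). The helper names pipeline_commands and run_pipeline are hypothetical and not part of the package.

```python
import subprocess

CLI = "linkml-embeddings-explorer"


def pipeline_commands(config: str, data_path: str, app_dir: str, title: str) -> list[list[str]]:
    """Build the argv lists for the index, app-data, and deploy steps, in order."""
    return [
        [CLI, "index", config],
        [CLI, "app-data", config, "-o", data_path],
        [CLI, "deploy", data_path, app_dir, "--title", title],
    ]


def run_pipeline(config: str, data_path: str, app_dir: str, title: str) -> None:
    """Run each step in order; check=True stops at the first failing command."""
    for argv in pipeline_commands(config, data_path, app_dir, title):
        subprocess.run(argv, check=True)


# Print the commands without executing them (a dry run).
commands = pipeline_commands("pipeline.yaml", "embeddings.json", "output/", "My Data Explorer")
for argv in commands:
    print(" ".join(argv))
```

Calling run_pipeline with the same arguments executes the steps for real, which requires the package and its indexing dependencies to be installed.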
Install the extra dependencies for indexing:
just install-dev
uv sync --group embeddings
Example config (examples/gocams/pipeline.yaml):
source:
  path: /Users/cjm/repos/go-cam-browser/public/data.json
  format: json
store:
  database: cache/gocam_embeddings.duckdb
  alias: gocam
spaces:
  gocam:
    collection: gocams
    template: templates/gocam.j2
    index_name: gocam_index
    cache_db: cache/gocam_cache.db
    embedding_model_name: text-embedding-3-small
    text_template_syntax: jinja2
    metadata_fields:
      - id
      - title
      - taxon_label
      - enabled_by_gene_labels
      - part_of_term_labels
      - occurs_in_term_labels
    add_name_from: title
    group_by: taxon_label
    groups: []
app_data:
  output: embeddings.json
  format: json
  include_tsne: true
Template note: a starter GO-CAM template is provided at templates/gocam.j2.
Example config is available at examples/gocams/pipeline.yaml.
Run the pipeline:
linkml-embeddings-explorer index examples/gocams/pipeline.yaml
linkml-embeddings-explorer app-data examples/gocams/pipeline.yaml -o examples/gocams/embeddings.json
linkml-embeddings-explorer deploy examples/gocams/embeddings.json examples/gocams/app/ --title "GO-CAM Explorer"
Notes:
- If you set app_data.output to a .js file (or pass --format js), it writes window.EMBEDDING_DATA directly.
- Use --no-tsne on app-data to skip t-SNE for large datasets.
- The YAML maps 1:1 onto a Pydantic model (PipelineConfig).
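
As an illustration of the first note, the sketch below wraps a JSON payload in an assignment to window.EMBEDDING_DATA, which is the global the note names. The exact layout of the file the tool writes is not shown in this README, so the single-statement form and the helper name to_data_js are assumptions.

```python
import json


def to_data_js(data: dict) -> str:
    """Wrap a JSON-serializable payload in a window.EMBEDDING_DATA assignment."""
    # Assumption: one assignment statement; the tool's real output may differ.
    return "window.EMBEDDING_DATA = " + json.dumps(data) + ";\n"


payload = {
    "spaces": {
        "main": {
            "umap": [[1.2, 3.4], [2.3, 4.5]],
            "metadata": [{"name": "Item 1"}, {"name": "Item 2"}],
        }
    }
}
js_text = to_data_js(payload)
print(js_text)
```

Because JSON is valid JavaScript expression syntax, the payload needs no further conversion before being assigned to the global.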
Other examples:
- examples/fake/ is a self-contained tiny dataset for quick testing.
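
If you want a throwaway dataset of your own for quick testing, the following sketch builds a small payload in the spaces / embeddings / metadata layout described earlier, using only the standard library. It is not the contents of examples/fake/. The function name make_fake_dataset, the sizes, and the two-category split are illustrative assumptions.

```python
import json
import random


def make_fake_dataset(n_items: int = 20, dim: int = 8, seed: int = 0) -> dict:
    """Build a tiny payload with one space named "main" (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same vectors
    embeddings = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_items)]
    metadata = [
        {"name": f"Item {i}", "category": "A" if i < n_items // 2 else "B"}
        for i in range(n_items)
    ]
    return {"spaces": {"main": {"embeddings": embeddings, "metadata": metadata}}}


dataset = make_fake_dataset()
text = json.dumps(dataset)
print(len(dataset["spaces"]["main"]["embeddings"]), "items serialized")
```

Saving text to a file such as embeddings.json gives you an input you can pass to the info or deploy commands.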
Justfile snippet (drop into your project):
pipeline-index config:
    uv run linkml-embeddings-explorer index {{config}}
pipeline-app-data config output="embeddings.json":
    uv run linkml-embeddings-explorer app-data {{config}} -o {{output}}
# Clone repository
git clone https://github.com/linkml/linkml-embeddings-explorer
cd linkml-embeddings-explorer
# Install with dev dependencies
just install-dev
# Run tests
just test
# Run all QC checks
just qc
# Generate example
just example
See examples/README.md for the list of demos and how to run them.
The docs landing page lives at docs/index.html with a small gallery.
MIT License