Vizard: A DSL that Compiles High-Level Declarations to Altair & Matplotlib Code

A stateful, declarative language for LLM-driven generation of Python visualization code, combining structured keywords with natural language.

Vizard lets you create data visualizations by describing what you want in a mix of CAPITALIZED keywords and natural language, with intelligent defaults and stateful keyword persistence for iterative figure development.

Features

🗣️ Natural + Structured: Mix CAPITALIZED keywords with plain English
💾 Stateful: Keywords persist across calls, enabling iterative refinement
🐻‍❄️ Polars-first: Modern, fast dataframe operations with streaming/chaining
📊 Multi-engine: Altair (default), Matplotlib, and Seaborn support
🔧 Flexible: Supports minimal to highly detailed specifications
🤖 Intelligent: LLM fills gaps with sensible defaults
🔄 Conversational: Refine figures through natural dialogue

Installation

git clone https://github.com/prairie-guy/vizard
cd vizard
./setup.sh

This installs:

vizard CLI tool to ~/.local/bin/vizard
vizard_magic Python package globally
cc_jupyter with patches applied

Verify installation:

vizard version

Note: Ensure ~/.local/bin is in your PATH:

export PATH="$HOME/.local/bin:$PATH"  # Add to ~/.bashrc or ~/.zshrc

Quick Start

Choose your preferred workflow:

Option A: Use Existing Jupyter/IPython (Fastest)

Best for: Quick exploration, working across multiple projects

Start Jupyter/IPython

jupyter lab
# or: jupyter notebook, ipython

Load vizard extension
```
%load_ext vizard_magic
```
Start creating visualizations (see Example Session below)

Option B: Use vizard CLI (Per-Project)

Best for: Reproducible research, isolated environments

Navigate to project and start vizard
```
cd ~/my-project
vizard start
```
This creates an isolated environment with:
- Project-local .venv/ with dependencies
- Vizard templates (pyproject.toml, CLAUDE.md)
- JupyterLab server with connection URL
Open the URL in your browser
Load vizard extension in notebook
```
%load_ext vizard_magic
```
When done:
```
vizard stop
```

Example Session

%load_ext vizard_magic

Create a simple bar chart:

%%cc
VZ
DATA data/sales.csv
PLOT bar
X product Y revenue
FILENAME images/most_basic_bar_chart_basic.png

Generated code:

df = pl.read_csv('data/sales.csv')

chart = alt.Chart(df).mark_bar(color='steelblue').encode(
    x=alt.X('product:N', title='Product'),
    y=alt.Y('revenue:Q', title='Revenue')
).properties(width=600, height=400)

chart.save('images/most_basic_bar_chart_basic.png', scale_factor=2.0)

chart

Check current state:

%cc KEYS

Output:

ENGINE: altair
DF: polars
WIDTH: 600
HEIGHT: 400
FUNCTION: false
IMPORT: false
OUTPUT: display
DATA: data/sales.csv
PLOT: bar
X: product
Y: revenue
FILENAME: images/most_basic_bar_chart_basic.png

Refine by adding color:

%%cc
COLOR category
FILENAME images/bar_chart_basic.png

Generated code:

df = pl.read_csv('data/sales.csv')

chart = alt.Chart(df).mark_bar().encode(
    x=alt.X('product:N', title='Product'),
    y=alt.Y('revenue:Q', title='Revenue'),
    color=alt.Color('category:N', title='Category')
).properties(width=600, height=400)

chart.save('images/bar_chart_basic.png', scale_factor=2.0)

chart

Preprocessing Data with ||

Vizard supports data preprocessing using the || delimiter to separate Polars data manipulation from Altair visualization:

Simple Examples

Filter before plotting:

%%cc
DATA genes.csv
FILTER pvalue < 0.05
|| PLOT scatter X expression Y pvalue TITLE Significant Genes

Select columns and add computed field:

%%cc
DATA raw_data.csv
SELECT gene_name, fold_change, pvalue
ADD log2_fc as log2(fold_change)
|| PLOT bar X gene_name Y log2_fc

Group and aggregate:

%%cc
DATA sales.csv
GROUP by category aggregating sum(revenue) as total
|| PLOT bar X category Y total
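
The three specifications above share one shape: preprocessing keywords first, then `||`, then the plot specification. Vizard leaves interpretation to the LLM, so the following is only a rough model of that split, not Vizard's implementation; `split_spec` is a hypothetical helper introduced for illustration.

```python
def split_spec(spec: str) -> tuple[str, str]:
    """Split a cell body into (preprocessing, plot) parts at the first '||'.

    Hypothetical helper for illustration only; Vizard itself relies on the
    LLM to interpret the specification.
    """
    before, sep, after = spec.partition("||")
    if not sep:
        # No delimiter: treat the whole cell as a plot specification.
        return "", spec.strip()
    return before.strip(), after.strip()


spec = """DATA sales.csv
GROUP by category aggregating sum(revenue) as total
|| PLOT bar X category Y total"""

prep, plot = split_spec(spec)
print(prep)  # the Polars-side instructions
print(plot)  # the Altair-side instructions
```

Everything in `prep` describes dataframe work; everything in `plot` describes the chart.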

Preprocessing Keywords

FILTER - Filter rows: FILTER pvalue < 0.05 and expression > 2
SELECT - Keep columns: SELECT gene_name, expression, pvalue
DROP - Remove columns: DROP columns internal_id, debug_flag
SORT - Sort data: SORT by pvalue descending
ADD - Computed columns: ADD log2_expr as log2(expression)
GROUP - Aggregate: GROUP by condition aggregating mean(expression)
SAVE - Save to file: SAVE processed_data.csv

Complex Example

%%cc
DATA diff_expression.csv
SELECT gene_name, log2fc, pvalue
FILTER pvalue < 0.05 and abs(log2fc) > 1.5
ADD neg_log10_pv as -log10(pvalue)
SORT by neg_log10_pv descending
|| PLOT scatter X log2fc Y neg_log10_pv TITLE Volcano Plot

This generates chained Polars preprocessing code followed by Altair visualization code.

See CLAUDE.md for complete documentation and examples.

Examples

Scatter Plot

%%cc
DATA genes.csv
PLOT scatter
X expression Y pvalue
COLOR significant
Add tooltips with gene names

Generated Altair code:

df = pl.read_csv('genes.csv')

chart = alt.Chart(df).mark_point(size=60).encode(
    x=alt.X('expression:Q', title='Expression'),
    y=alt.Y('pvalue:Q', title='P-value'),
    color=alt.Color('significant:N', title='Significant'),
    tooltip=['gene_name:N', 'expression:Q', 'pvalue:Q']
).properties(
    title='Gene Expression vs P-value',
    width=600,
    height=400
)

chart

Line Chart (Time Series)

%%cc
DATA timeseries.csv
PLOT line
X date Y temperature
COLOR location

Generated Altair code:

df = pl.read_csv('timeseries.csv')

chart = alt.Chart(df).mark_line().encode(
    x=alt.X('date:T', title='Date'),
    y=alt.Y('temperature:Q', title='Temperature'),
    color=alt.Color('location:N', title='Location')
).properties(
    title='Temperature Over Time',
    width=600,
    height=400
)

chart

Grouped Bar Chart

%%cc
DATA expression.csv
PLOT bar
X gene_name Y expression_level
COLOR condition
BAR_LAYOUT grouped

Generated Altair code:

df = pl.read_csv('expression.csv')

chart = alt.Chart(df).mark_bar().encode(
    x=alt.X('gene_name:N', title='Gene Name'),
    xOffset=alt.XOffset('condition:N'),
    y=alt.Y('expression_level:Q', title='Expression Level'),
    color=alt.Color('condition:N', title='Condition')
).properties(
    title='Gene Expression by Condition',
    width=600,
    height=400
)

chart

Faceted Plot (Small Multiples)

%%cc
DATA data.csv
PLOT scatter
X value1 Y value2
ROW condition
COLUMN replicate

Multi-Series Line Chart

%%cc
DATA stocks
PLOT line
X date Y price
SERIES symbol
COLOR symbol
TITLE Stock Prices Over Time

Note: Uses Altair's built-in stocks dataset. SERIES groups lines by symbol.

Bar Chart with Text Labels

%%cc
DATA sales.csv
PLOT bar
X product Y revenue
TEXT revenue
Show revenue values on top of bars

Error Bars (Range Encodings)

%%cc
DATA measurements.csv
PLOT point
X category Y mean_value
Y2 upper_ci
Add error bars showing confidence intervals

Window Transformations

%%cc
DATA timeseries.csv
PLOT line
X date Y value
WINDOW cumsum
Show cumulative sum over time

Heatmap

%%cc
DATA expression_matrix.csv
PLOT heatmap
X sample Y gene
COLOR expression
Use viridis color scheme
TITLE Gene Expression Heatmap

Box Plot

%%cc
DATA measurements.csv
PLOT box
X group Y value
TITLE Measurement Distributions by Group

Volcano Plot with Iterative Refinement

%%cc
RESET

%%cc
DATA diff_expression.csv
PLOT volcano
X log2fc Y neg_log10_pvalue
IMPORT

%%cc
Add threshold lines at x=±1.5 and y=1.3

%%cc
Color upregulated red, downregulated blue, non-significant gray

%%cc
TITLE Differential Gene Expression Analysis
WIDTH 800
HEIGHT 800

%%cc
OUTPUT save
FILENAME figure1_volcano.png

Using Vizard Specifications

Basic Syntax

Mix CAPITALIZED keywords with natural language:

# All keywords
%%cc DATA genes.csv PLOT bar X gene_name Y expression_level

# All natural language
%%cc Create a bar chart from genes.csv showing gene_name vs expression_level

# Mixed (recommended)
%%cc DATA genes.csv - make a bar chart with X gene_name and Y expression_level, sorted by value

Essential Keywords

Keywords control behavior and persist in state (.vizard_state.json):

Data & Plot:

DATA - Data source (file path, URL, variable name, or Altair dataset)
PLOT - Chart type: bar, scatter, line, histogram, volcano, heatmap, box, etc.
DF - Dataframe library: polars (default), pandas

Visual Encodings:

X, Y - Columns for axes
X2, Y2 - Secondary positions for range encodings (error bars, Gantt charts)
COLOR - Column to color by
SIZE - Column for point/mark size
SHAPE - Column for point shape
OPACITY - Column for transparency
SERIES - Column for grouping without visual encoding (multi-series line charts)
TEXT - Column for text labels on marks
ROW, COLUMN - Faceting (small multiples)
- ROW arranges plots horizontally in a row
- COLUMN arranges plots vertically in a column

Grouping & Transformations:

BAR_LAYOUT - Bar chart layout: grouped, stacked, normalized
WINDOW - Window transformations: cumsum, rank, row_number, mean, lag, lead

Styling:

WIDTH - Chart width in pixels (default: 600)
HEIGHT - Chart height in pixels (default: 400)
TITLE - Chart title
ENGINE - Visualization library: altair (default), matplotlib, seaborn

Code Generation:

FUNCTION - Generate reusable function (default: false)
IMPORT - Include imports (default: false)

Meta Commands:

KEYWORDS or KEYS - Show current state
RESET - Clear state and restore defaults
HELP - Show help documentation

Iterative Refinement

Keywords persist across cells - refine your figure step by step:

# Start with basic plot
%%cc
DATA mydata.csv
PLOT bar
X category Y value

# Add color (other keywords persist)
%%cc
COLOR group

# Adjust size
%%cc
WIDTH 800
HEIGHT 500

# Add natural language styling
%%cc
Make the bars green and add value labels on top

# Check what's in state
%%cc
KEYWORDS

# Start fresh for new figure
%%cc
RESET

State Management

Vizard maintains state in .vizard_state.json:

%%cc WIDTH 700 HEIGHT 450 DATA mydata.csv

%%cc PLOT bar X category Y value
# ↑ Automatically uses WIDTH: 700, HEIGHT: 450 from previous cell

%%cc KEYWORDS
# Shows all current keyword values

%%cc RESET
# Clears state, restores defaults

Workflow:

Iterate on a figure → State accumulates
Figure complete → Use it
Start new figure → RESET → Fresh state
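
To make the accumulate-then-reset workflow concrete, here is a minimal model of keyword state kept in a JSON file. The real layout of .vizard_state.json is not documented here, so the flat keyword-to-value mapping and the helper names (`load_state`, `update_state`, `reset_state`) are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

# Defaults as listed under "Default Values".
DEFAULTS = {
    "ENGINE": "altair", "DF": "polars", "WIDTH": 600, "HEIGHT": 400,
    "FUNCTION": False, "IMPORT": False, "OUTPUT": "display",
}


def load_state(path: Path) -> dict:
    """Return saved state, or a copy of the defaults if no file exists."""
    if path.exists():
        return json.loads(path.read_text())
    return dict(DEFAULTS)


def update_state(path: Path, **keywords) -> dict:
    """Merge new keywords over the saved state and write it back."""
    state = load_state(path)
    state.update(keywords)
    path.write_text(json.dumps(state, indent=2))
    return state


def reset_state(path: Path) -> dict:
    """RESET: discard accumulated keywords and restore the defaults."""
    path.write_text(json.dumps(DEFAULTS, indent=2))
    return dict(DEFAULTS)


# Use a temporary directory so the demo leaves no file behind.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / ".vizard_state.json"
    update_state(state_file, WIDTH=700, HEIGHT=450, DATA="mydata.csv")
    state = update_state(state_file, PLOT="bar", X="category", Y="value")
    after_reset = reset_state(state_file)

print(state["WIDTH"], state["PLOT"])   # values from both cells are present
print("DATA" in after_reset)           # accumulated keywords are gone
```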

Code Generation Options

# Default: No imports, script-style code
%%cc
DATA data.csv
PLOT bar
X category Y value

# With imports (for copy-paste to .py files)
%%cc
DATA data.csv
PLOT bar
X category Y value
IMPORT

# Generate reusable function
%%cc
DATA data.csv
PLOT bar
X category Y value
FUNCTION
IMPORT

Natural Language + Keywords

# Natural language for details
%%cc
DATA genes.csv
Create a volcano plot showing log2fc vs pvalue
Color upregulated genes red and downregulated blue
Add threshold lines at x=±1.5 and y=1.3

# Keywords for structure
%%cc
DATA genes.csv
PLOT volcano
X log2fc
Y pvalue
THRESHOLD_FC 1.5
THRESHOLD_P 1.3

# Mix both (recommended)
%%cc
DATA genes.csv PLOT volcano X log2fc Y pvalue
Color significant genes red, add threshold lines at 1.5 and 1.3

Dynamic Keywords

Any CAPITALIZED word becomes a keyword and persists in state:

%%cc
DATA results.csv
PLOT scatter
X log2fc Y pvalue
THRESHOLD 0.05
Highlight points where pvalue < THRESHOLD in red

%%cc
THRESHOLD 0.01
# Now uses new threshold value

%%cc KEYWORDS
# Shows: THRESHOLD: 0.01
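
The persistence rule can be approximated in a few lines. In Vizard the LLM does the interpreting, so this is a toy model, not the actual mechanism: it assumes one keyword per line at the start of the line, which means it would not handle combined lines such as `X log2fc Y pvalue`. `extract_keywords` is a hypothetical helper.

```python
import re

# Assumed rule: an all-caps token (letters, digits, underscore) at the start
# of a line is a keyword, and the rest of the line is its value.
KEYWORD_LINE = re.compile(r"^([A-Z][A-Z0-9_]*)\s+(.+)$")


def extract_keywords(cell: str) -> dict:
    """Collect KEYWORD value pairs from a cell body (values stay strings)."""
    found = {}
    for line in cell.splitlines():
        match = KEYWORD_LINE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


state = {}

# First cell: THRESHOLD is not a built-in keyword but is stored anyway.
state.update(extract_keywords("""DATA results.csv
PLOT scatter
THRESHOLD 0.05
Highlight points where pvalue < THRESHOLD in red"""))

# Second cell: only the threshold changes; everything else persists.
state.update(extract_keywords("THRESHOLD 0.01"))

print(state)
```

The natural-language line is ignored by the pattern because `Highlight` is not fully capitalized.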

Vizard CLI Commands

The vizard command manages project environments and JupyterLab:

vizard start [options]     # Start JupyterLab server
  -p, --port PORT          # Custom port (default: 9999)
  -t, --token TOKEN        # Custom token (default: auto-generated)
  --host HOST              # Custom hostname (default: system hostname)
  -f, --foreground         # Run in foreground

vizard stop [options]      # Stop JupyterLab server
  -p, --port PORT          # Stop server on specific port

vizard status              # Show server status

vizard clean [options]     # Remove runtime files
  --purge                  # Remove all vizard files including .venv

vizard update              # Update CLAUDE.md and vizard executable

vizard version             # Show version
vizard help                # Show help

Examples:

# Start with custom port for remote server
vizard start --port 8888 --host myserver.example.com

# Check status
vizard status

# Clean up runtime files (keeps notebooks and .venv)
vizard clean

# Full cleanup (removes everything except notebooks)
vizard clean --purge

# Stop server
vizard stop

Advanced Features

Polars Data Manipulation

Vizard generates Polars streaming/chaining code when data prep is needed:

%%cc
DATA results.csv
Filter to rows where pvalue < 0.05
Create a volcano plot showing log2fc vs pvalue
Color significant genes red

# Generated code uses Polars chaining:
# df = (pl.read_csv('results.csv')
#     .filter(pl.col('pvalue') < 0.05)
#     .with_columns([...]))

Spelling Tolerance

Common misspellings and spelling variants of keywords are recognized:

%%cc
DATA data.csv
PLOT bar
X cat Y val
COLOUR blue
TITEL My Chart
HIGHT 450
# Works! Recognizes COLOUR→COLOR, TITEL→TITLE, HIGHT→HEIGHT
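
Vizard gets this tolerance from the LLM rather than from a lookup table. For intuition, a similar effect can be had with fuzzy matching from Python's standard library; the keyword list, the 0.75 cutoff, and the `normalize_keyword` helper below are assumptions for illustration.

```python
import difflib

KNOWN_KEYWORDS = [
    "DATA", "PLOT", "DF", "X", "Y", "X2", "Y2", "COLOR", "SIZE", "SHAPE",
    "OPACITY", "SERIES", "TEXT", "ROW", "COLUMN", "BAR_LAYOUT", "WINDOW",
    "WIDTH", "HEIGHT", "TITLE", "ENGINE", "FUNCTION", "IMPORT",
]


def normalize_keyword(word: str, cutoff: float = 0.75) -> str:
    """Map a near-miss spelling to a known keyword; keep unknown words as-is."""
    matches = difflib.get_close_matches(word, KNOWN_KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word


for typed in ["COLOUR", "TITEL", "HIGHT", "THRESHOLD"]:
    print(typed, "->", normalize_keyword(typed))
```

Words with no close match, such as a dynamic keyword like THRESHOLD, pass through unchanged.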

Multi-Engine Support

# Altair (default) - declarative, interactive
%%cc
ENGINE altair
DATA data.csv
PLOT scatter
X value1 Y value2

# Matplotlib - publication-quality
%%cc
ENGINE matplotlib
DATA data.csv
PLOT bar
X category Y value

# Seaborn - statistical plots
%%cc
ENGINE seaborn
DATA data.csv
PLOT box
X group Y measurement

Default Values

ENGINE: altair
DF: polars
WIDTH: 600
HEIGHT: 400
FUNCTION: false
IMPORT: false
OUTPUT: display

Other keywords (X, Y, COLOR, etc.) have no defaults—they only appear in state when specified.

Supported Plot Types

✅ Bar charts - Simple, stacked, grouped, normalized, with text labels
✅ Scatter plots - With size, color, shape, opacity encodings
✅ Line charts - Time series, multi-series (SERIES keyword)
✅ Histograms - Configurable bins
✅ Volcano plots - Bioinformatics differential expression
✅ Heatmaps - Matrix visualizations
✅ Box plots - Distribution comparisons
✅ Faceted plots - Small multiples (ROW/COLUMN)
✅ Error bars - Range encodings with X2/Y2
✅ Window functions - Cumulative sums, rankings, rolling calculations

Coming soon: Violin plots, ridgeline plots, chord diagrams

Workflow Tips

Start simple: Begin with minimal specification, iterate
Use KEYWORDS often: Check state to understand what's persisted
RESET between figures: Clear state when starting new visualization
Mix styles: Keywords for structure, natural language for styling
Leverage state: Set common parameters (WIDTH, HEIGHT) once, use many times
Generate functions: Use FUNCTION for reusable plotting code

Troubleshooting

Q: %load_ext vizard_magic gives "No module named 'vizard_magic'"

Run ./setup.sh again to ensure vizard_magic is installed
Check: python3 -c "import vizard_magic" should work

Q: My plot isn't using the right dimensions

Check state with %%cc KEYWORDS - are WIDTH/HEIGHT set?
Use %%cc RESET to clear old dimensions

Q: Code has imports but I don't want them

IMPORT defaults to false - don't include IMPORT keyword
Check if IMPORT is in state: %%cc KEYWORDS

Q: Keywords not persisting

Ensure keywords are CAPITALIZED
Check .vizard_state.json exists in directory

Q: Permission denied: '/root/code'

This is fixed by patches applied during setup.sh or vizard start
If you see this error, re-run setup.sh or check patch output

Q: Different Python versions causing issues

Global mode (setup.sh) installs for whichever Python version python3 resolves to
Per-project mode (vizard start) uses the .venv's Python version
If using multiple Python versions, use Per-Project mode for each project

Technical

Installation Modes

Vizard supports two installation modes:

Mode 1: Global Installation (Quick & Convenient)

setup.sh installs cc_jupyter and vizard_magic to ~/.local/lib/python*/site-packages/ and applies patches globally. Works in any Jupyter/IPython environment.

✅ Quick data exploration across projects
✅ Existing Jupyter workflows
⚠️ Single cc_jupyter version across all projects

Mode 2: Per-Project Isolated (Reproducible & Safe)

vizard start creates project-local .venv/ with cc_jupyter installed and patched in the project's virtual environment. Each project has its own dependency versions.

✅ Published research / production
✅ Reproducible environments with version-pinned dependencies
⚠️ Requires vizard start per project

Note: Both modes can coexist. Use Global for daily work, Per-Project for important analyses.

Version Management

setup.sh installs a pinned version of cc_jupyter (0.0.1) tested with Vizard's patches.

If you see a version mismatch warning during setup:

⚠ Version mismatch detected
Patches are tested with version 0.0.1

Force reinstall the vendored version:

pip install --user --force-reinstall ~/.local/share/vizard/lib/vendor/claude_code_jupyter_staging-0.0.1-py3-none-any.whl
~/.local/share/vizard/lib/patch_global_cc_jupyter.sh

Patching Mechanism

Vizard modifies cc_jupyter to enable the %%cc magic command for Vizard specifications:

Global Patching:

lib/patch_global_cc_jupyter.sh - Patches globally installed cc_jupyter
Applied once during setup.sh
Affects all projects using global installation

Per-Project Patching:

lib/patch_jupyter_magic.sh - Patches project-local cc_jupyter
Applied during vizard start for each project
Isolated to project's .venv/

What gets patched:

Registers the vizard_magic IPython extension
Enables %%cc cell magic in Jupyter notebooks
Loads the Vizard specification (CLAUDE.md) into Claude Code's context

Repository Structure

vizard/
├── vizard                         # Main executable (bash script)
├── setup.sh                       # Installation script
├── uninstall.sh                   # Uninstallation script
├── README.md                      # This file
├── lib/
│   ├── vizard_magic/
│   │   └── __init__.py            # Jupyter IPython extension
│   ├── patch_jupyter_magic.sh     # Per-project cc_jupyter patcher
│   ├── patch_global_cc_jupyter.sh # Global cc_jupyter patcher
│   └── vendor/
│       └── claude_code_jupyter_staging-0.0.1-py3-none-any.whl
├── templates/
│   ├── CLAUDE.md                  # Vizard specification (~37KB)
│   ├── pyproject.toml             # Project dependencies template
│   ├── vizard_template.ipynb      # Notebook template
│   └── purge_manifest.txt         # Cleanup manifest
└── test/
    ├── data/                      # Synthetic test datasets
    ├── generate_sample_data.py    # Test data generator
    ├── generate_images.ipynb      # Image generation notebook
    └── vizards_test.ipynb         # Comprehensive test suite (60+ tests)

Per-Project Files (created by vizard start):

<your-project>/
├── .env.jupyter              # Jupyter configuration
├── .jupyter.pid              # Process ID
├── .jupyter.log              # Server logs
├── pyproject.toml            # Python dependencies
├── uv.lock                   # Dependency lock file
├── .venv/                    # Virtual environment
├── CLAUDE.md                 # Vizard specification
├── .vizard_state.json        # Keyword state
├── .vizard_template.ipynb    # Notebook template
└── .claude/
    └── settings.json         # Claude Code permissions

Design Philosophy

Vizard is NOT:

❌ A rigid DSL with one-to-one code mapping
❌ A replacement for learning Altair/Matplotlib/Seaborn
❌ Guaranteed to produce identical code each time

Vizard IS:

✅ Structured guidance via keywords
✅ LLM reasoning for intelligent defaults
✅ Balance of consistency and flexibility
✅ Iterative figure development workflow
✅ Natural language + structure hybrid

Testing

Run the comprehensive test suite:

cd vizard/test
jupyter lab vizards_test.ipynb

Test coverage (60+ tests across 12 sections):

Setup & Configuration
Dataset Loading (CARS, STOCKS, MOVIES from Altair)
Core Visual Encodings (X, Y, COLOR, SIZE, SHAPE, OPACITY, ROW, COLUMN)
Range Encodings (X2, Y2)
Advanced Encodings (SERIES, TEXT)
Layout & Grouping (BAR_LAYOUT: stacked, grouped, normalized)
Data Transformations (WINDOW: cumsum, rank, row_number)
Meta Commands (KEYWORDS, KEYS, RESET, HELP)
Code Generation (FUNCTION, IMPORT)
State Management & Persistence
Syntax Flexibility (keywords, natural language, mixed)
Edge Cases (nulls, many encodings, ENGINE switching, spelling tolerance)

Future Roadmap

Phase 2: Refinement (based on testing feedback)

Gallery fetching for uncommon plot types
Additional plot type examples
Enhanced dynamic keywords

Phase 3: Expansion

Additional plot types for all engines
Multi-panel layout improvements
Interactive features (brush, zoom, tooltips)

Phase 4: Publication Mode

DPI control
Panel labels (A, B, C)
Journal-specific formats
Fine-grained typography

Contributing

This is an early prototype. Feedback welcome on:

Keyword design
Default values
Natural language parsing
Code quality
Missing features

License

To be determined.

Acknowledgments

Built with:

Claude (Anthropic) - LLM interpreter
Altair - Declarative visualization
Matplotlib - Comprehensive visualization
Seaborn - Statistical data visualization
Polars - Fast dataframes
Claude Code - CLI tool

Quick Reference Card:

# Essential commands
%load_ext vizard_magic    # Load extension (once per kernel)
%%cc KEYWORDS              # Show current state
%%cc RESET                 # Clear state and start fresh
%%cc HELP                  # Show help

# Basic syntax
%%cc DATA file.csv PLOT bar X col1 Y col2

# Natural + keywords
%%cc DATA file.csv - create a scatter plot with X val1 and Y val2 colored by group

# Iterate on a figure
%%cc WIDTH 800 COLOR category TITLE My Chart

# Generate code with imports
%%cc IMPORT

Ready to create beautiful visualizations? Start with Option A above - it takes 30 seconds!
