Jot

License: MIT

Jot is a documentation generator that converts Markdown files into modern, searchable documentation websites. It was built as a replacement for JetBrains' deprecated Writerside IDE.

Features

  • Automatic TOC Generation - Hierarchical table of contents from your file structure
  • Multiple Export Formats - HTML, JSON, YAML, llms.txt, JSONL, and enriched Markdown
  • LLM-Optimized Exports - Token-accurate chunking with multiple strategies for AI/ML workflows
  • Vector Database Ready - JSONL export with metadata for Pinecone, Weaviate, Qdrant
  • Pluggable Chunking - Fixed, semantic, markdown-headers, and recursive strategies
  • Workflow Presets - --for-rag, --for-context, --for-training for common use cases
  • Token-Based Chunking - Token counting with tiktoken-go (exact for GPT-4; approximate for Claude, which uses a different tokenizer)
  • Auto-Generation - LLM exports automatically generated during jot build
  • Large Scale - Proven on thousands of documents
  • Zero-Copy Markdown - Symlink support for direct markdown access

Installation

macOS

Binary Installation (Intel):

curl -L https://github.com/onedusk/jot/releases/latest/download/jot-darwin-amd64 -o jot
chmod +x jot
sudo mv jot /usr/local/bin/

Binary Installation (Apple Silicon):

curl -L https://github.com/onedusk/jot/releases/latest/download/jot-darwin-arm64 -o jot
chmod +x jot
sudo mv jot /usr/local/bin/

Linux

Binary Installation (amd64):

curl -L https://github.com/onedusk/jot/releases/latest/download/jot-linux-amd64 -o jot
chmod +x jot
sudo mv jot /usr/local/bin/

Binary Installation (arm64):

curl -L https://github.com/onedusk/jot/releases/latest/download/jot-linux-arm64 -o jot
chmod +x jot
sudo mv jot /usr/local/bin/

Windows

Binary Installation (PowerShell):

Invoke-WebRequest -Uri https://github.com/onedusk/jot/releases/latest/download/jot-windows-amd64.exe -OutFile jot.exe
# Add jot.exe to your PATH or move to a directory in PATH

Build from Source:

# Clone repository
git clone https://github.com/onedusk/jot
cd jot

# Install dependencies
go mod download

# Build binary
go build -o jot ./cmd/jot

# Run tests
go test ./...

Initialize a Project

# Create a new documentation project
jot init

# This creates:
# - jot.yml (configuration)
# - .jotignore (ignore patterns)
# - docs/ (documentation directory)
# - README.md (example file)

Build Documentation

# Scan and build documentation
jot build

# Output is generated in ./dist/

# Use custom configuration file
jot build --config my-config.yaml

# Enable verbose output for detailed logging
jot build --verbose

# By default, jot looks for jot.yml in the current directory

Export Documentation

# Export as JSON
jot export --format json --output docs.json

# Export as YAML
jot export --format yaml --output docs.yaml

# Export to llms.txt format (lightweight index per llmstxt.org)
jot export --format llms-txt --output llms.txt

# Export to llms-full.txt (complete documentation for LLM context)
jot export --format llms-full --output llms-full.txt

# Export to JSONL for vector databases (Pinecone, Weaviate, Qdrant)
jot export --format jsonl --output docs.jsonl

# Export to enriched markdown with YAML frontmatter
jot export --format markdown --output docs.md

# Use presets for common workflows
jot export --for-rag --output rag-ready.jsonl      # RAG: semantic chunking, 512 tokens
jot export --for-context --output context.md       # Context: header chunking, 1024 tokens
jot export --for-training --output training.jsonl  # Training: fixed chunking, 256 tokens

# Advanced: Custom chunking strategies
jot export --format jsonl --strategy semantic --chunk-size 1024 --chunk-overlap 256 --output custom.jsonl
jot export --format markdown --strategy markdown-headers --output docs-headers.md

# Advanced: Include embeddings (warning: API costs apply)
jot export --format jsonl --include-embeddings --output embeddings.jsonl

Generate Table of Contents

# Generate toc.xml in each directory with markdown files
jot toc

# Preview without writing files
jot toc --dry-run

# Include subdirectories in each toc.xml (recursive)
jot toc --recursive

# The command will:
# - Generate toc.xml in each directory containing .md files
# - Create toc.json at project root with paths to all toc.xml files
# - Use paths from jot.yml configuration

The toc.json file acts like a lock file, tracking all generated toc.xml locations with relative paths from the project root.

Configuration

Edit jot.yml to customize your documentation:

version: 0.1.0  # Configuration version (required)

project:
  name: "My Documentation"        # Project name (required)
  description: "Project documentation"  # Brief description (optional)
  author: "Your Name"             # Author name (optional)

input:
  paths:
    - "docs"        # Source paths to scan (required)
    - "README.md"   # Supports files and directories
  ignore:
    - "**/_*.md"    # Glob patterns to ignore (optional)
    - "**/drafts/**"
    - "**/node_modules/**"

output:
  path: "dist"     # Output directory (default: "dist")
  format: "html"   # Output format: html, json, yaml (default: "html")
  theme: "default" # Theme name (default: "default")

features:
  search: true      # Enable full-text search (default: true)
  llm_export: true  # Auto-generate llms.txt and llms-full.txt during build (default: true)
  toc: true         # Generate table of contents (default: true)

llm:
  chunk_size: 512   # Maximum tokens per chunk (default: 512)
  overlap: 128      # Token overlap between chunks (default: 128)
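
To make the chunk_size and overlap settings concrete, here is a minimal Python sketch of a sliding window over a token list. It is an illustration only, not Jot's implementation: the function name window_tokens is hypothetical, and it assumes each chunk starts chunk_size - overlap tokens after the previous one.

```python
def window_tokens(tokens, chunk_size=512, overlap=128):
    """Split a token list into chunks of at most chunk_size tokens.

    Consecutive chunks share `overlap` tokens, so each new chunk starts
    (chunk_size - overlap) tokens after the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    stride = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the document
        start += stride
    return chunks
```

With the defaults above, a 1000-token document yields three chunks starting at tokens 0, 384, and 768.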

Project Structure

my-project/
  jot.yml             # Configuration
  docs/               # Markdown sources
    installation.md
  dist/               # Generated output

Markdown Features

Jot supports standard markdown with extensions:

  • Frontmatter - YAML metadata in documents
  • Code Highlighting - Syntax highlighting for code blocks
  • Tables - GitHub-flavored markdown tables
  • Task Lists - Checkboxes in lists
  • Footnotes - Reference-style footnotes

LLM Integration

Jot provides comprehensive LLM-optimized export formats with token-accurate chunking:

Automatic Export During Build

# Build automatically generates llms.txt and llms-full.txt
jot build

# Skip LLM export if needed
jot build --skip-llms-txt

Export Formats

llms.txt - Lightweight index per llmstxt.org specification:

  • H1 header with project name
  • Blockquote with description
  • Grouped by directory with markdown links
  • Optimized for quick LLM scanning
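
The layout described above can be sketched in a few lines of Python. This is a hedged illustration of the structure, not Jot's generator: render_llms_txt, the "Root" heading for top-level files, and the use of H2 headings for directory groups are assumptions.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def render_llms_txt(name, description, docs):
    """Render an llms.txt-style index.

    docs is an iterable of (title, relative_path) pairs.
    """
    # Group documents by their parent directory.
    groups = defaultdict(list)
    for title, path in docs:
        directory = str(PurePosixPath(path).parent)
        groups[directory].append((title, path))

    # H1 with the project name, then a blockquote with the description.
    lines = [f"# {name}", "", f"> {description}", ""]
    for directory in sorted(groups):
        heading = "Root" if directory == "." else directory
        lines.append(f"## {heading}")
        lines.append("")
        for title, path in sorted(groups[directory], key=lambda d: d[1]):
            lines.append(f"- [{title}]({path})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```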

llms-full.txt - Complete documentation for LLM context:

  • Full content concatenation with separators
  • README.md appears first
  • Preserves all markdown formatting
  • Size warnings for large outputs (>1MB)

JSONL - Vector database ingestion format:

  • One JSON object per line (streaming-friendly)
  • Token counts for each chunk
  • Navigation fields (prev/next chunk IDs)
  • Vector field for embeddings (optional)
  • Compatible with Pinecone, Weaviate, Qdrant
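
Because each line is a standalone JSON object, the export can be consumed as a stream. The sketch below reads records one at a time and sums their token counts. The field name token_count is an assumption borrowed from the enriched Markdown metadata; check an actual export for the real JSONL schema.

```python
import json


def read_chunks(lines):
    """Yield one dict per non-empty JSONL line."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {lineno}: expected a JSON object")
        yield record


def total_tokens(records, field="token_count"):
    """Sum an integer field across records; `field` is an assumed name."""
    return sum(int(r.get(field, 0)) for r in records)
```

An open file handle can be passed directly to read_chunks, since files iterate line by line.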

Enriched Markdown - Markdown with YAML frontmatter:

  • Metadata: source, section, chunk_id, token_count, modified
  • Preserved markdown formatting
  • Table of contents with anchor links
  • Ready for contextual enrichment
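
A consumer of the enriched Markdown export needs to separate the frontmatter from the body. The following is a minimal sketch that handles only flat key: value pairs between --- delimiters. A real YAML parser is the safer choice for anything more complex, and the exact layout of Jot's output is assumed here rather than confirmed.

```python
def split_frontmatter(text):
    """Return (metadata, body) for a document with leading frontmatter.

    Only flat `key: value` lines are understood. Documents without a
    complete frontmatter block are returned unchanged with empty metadata.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening delimiter without a closing one

    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return meta, body
```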

Chunking Strategies

  • Fixed: Token-based fixed-size chunks with word boundaries (default)
  • Semantic: Embedding-based boundary detection for natural breaks
  • Markdown-headers: Split at header boundaries (# to ######)
  • Recursive: Hierarchical splitting (paragraph → line → space → character)
  • Contextual: Context-aware chunking (alias for semantic)
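
As an illustration of the recursive strategy, the sketch below tries the coarsest separator first (paragraph), falls back to finer ones (line, then space), and cuts at character level only as a last resort. It measures length in characters to stay self-contained, whereas Jot counts tokens. The function name and signature are hypothetical.

```python
def recursive_split(text, max_len, separators=("\n\n", "\n", " ")):
    """Split text into pieces of at most max_len characters.

    Separators are tried from coarsest to finest. Adjacent parts are
    merged back together while they still fit within max_len.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # Last resort: hard cut at character level.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]

    sep, finer = separators[0], separators[1:]
    pieces = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_len:
            current = candidate
            continue
        if current:
            pieces.append(current)
            current = ""
        if len(part) <= max_len:
            current = part
        else:
            # This part is still too long: retry with finer separators.
            pieces.extend(recursive_split(part, max_len, finer))
    if current:
        pieces.append(current)
    return pieces
```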

Workflow Presets

# RAG workflows: JSONL + semantic + 512 tokens
jot export --for-rag --output rag.jsonl

# Context window optimization: Markdown + headers + 1024 tokens
jot export --for-context --output context.md

# Training datasets: JSONL + fixed + 256 tokens
jot export --for-training --output training.jsonl
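
The presets can be read as shorthand for explicit flag combinations. The table below restates the comments above as data and builds the corresponding long-form command. Treat the equivalence as an assumption: a preset may also set options not listed here (for example, overlap), and the helper expand_preset is hypothetical.

```python
# Preset settings as described in the comments above.
PRESETS = {
    "--for-rag": {"format": "jsonl", "strategy": "semantic", "chunk_size": 512},
    "--for-context": {"format": "markdown", "strategy": "markdown-headers", "chunk_size": 1024},
    "--for-training": {"format": "jsonl", "strategy": "fixed", "chunk_size": 256},
}


def expand_preset(preset, output):
    """Build the explicit `jot export` argument list for a preset."""
    cfg = PRESETS[preset]
    return [
        "jot", "export",
        "--format", cfg["format"],
        "--strategy", cfg["strategy"],
        "--chunk-size", str(cfg["chunk_size"]),
        "--output", output,
    ]
```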

Token Accuracy

Jot uses tiktoken-go with cl100k_base encoding for accurate token counting:

  • Matches the tokenizer used by GPT-4 and GPT-3.5-turbo; counts are approximate for Claude models, which use a different tokenizer
  • Binary search algorithm for efficient chunking
  • Word boundary preservation to avoid splitting mid-word
  • Configurable chunk size and overlap
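
The combination of binary search and word boundaries can be sketched as follows: for each chunk, binary-search the largest number of whole words whose token count fits the budget. This is an illustration under stated assumptions, not Jot's code. count_tokens is a stand-in that counts whitespace-separated words (Jot uses tiktoken-go), and the search assumes that adding words never lowers the token count.

```python
def count_tokens(text):
    """Stand-in token counter: one token per whitespace-separated word."""
    return len(text.split())


def longest_prefix_within(words, budget, counter=count_tokens):
    """Return the largest k such that the first k words fit in `budget` tokens."""
    lo, hi = 0, len(words)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if counter(" ".join(words[:mid])) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return lo


def chunk_by_tokens(text, budget, counter=count_tokens):
    """Greedily pack whole words into chunks of at most `budget` tokens."""
    words = text.split()
    chunks = []
    i = 0
    while i < len(words):
        k = longest_prefix_within(words[i:], budget, counter)
        k = max(k, 1)  # a single oversized word still becomes its own chunk
        chunks.append(" ".join(words[i:i + k]))
        i += k
    return chunks
```

Passing a different counter, such as one backed by a real tokenizer, changes the budget accounting without touching the search.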

Troubleshooting

Build fails with "config file not found"

  • Ensure jot.yml exists in your project root
  • Use --config flag to specify custom config location
  • Run jot init to create a default configuration

No documents found during build

  • Check that input paths in jot.yml are correct
  • Verify markdown files aren't being ignored by patterns
  • Use --verbose flag to see which files are being scanned

Search not working in generated site

  • Ensure features.search: true in jot.yml
  • Check that JavaScript is enabled in browser
  • Verify search index was generated in output directory

Permission denied errors

  • On macOS/Linux: Run chmod +x jot after downloading binary
  • On Windows: Check that antivirus isn't blocking the executable
  • Ensure write permissions for output directory

FAQ

Can Jot handle large documentation sets?

Yes, Jot is designed to handle thousands of documents efficiently. It uses optimized scanning and rendering algorithms.

Does Jot support custom themes?

Currently Jot uses a default theme. Custom theme support is planned for future releases.

Can I use Jot with CI/CD pipelines?

Yes! Jot is a CLI tool that integrates easily with CI/CD. Run jot build in your pipeline to generate docs automatically.

What markdown flavors are supported?

Jot supports GitHub-flavored markdown with extensions for frontmatter, code highlighting, tables, task lists, and footnotes.

Can I export to formats other than HTML?

Yes, Jot supports JSON, YAML, and LLM-optimized formats via the jot export command.

How does LLM export work?

Jot provides multiple LLM-optimized formats:

  • llms.txt: Lightweight index for quick LLM scanning
  • llms-full.txt: Complete docs with full context
  • JSONL: Token-accurate chunks for vector databases
  • Enriched Markdown: Metadata-rich markdown with frontmatter

All formats use token-based chunking (default: 512 tokens with a 128-token overlap), with token counts from tiktoken-go. You can choose a chunking strategy (fixed, semantic, markdown-headers, recursive) or use a workflow preset (--for-rag, --for-context, --for-training).

Is there a watch mode for development?

Yes, use jot watch to automatically rebuild when files change (requires the serve command to be running).
