Klokinator/KlokCaptionTagger

 
 

Klok Caption Tagger v1.1 ⏰

A powerful, browser-based tool for creating and managing training datasets for AI image generation, such as making LoRAs for the Z-Image model, Stable Diffusion, Flux, Illustrious, and more!

Join Discord

Klok Caption Tagger (KCT) is a lightweight, single-file HTML application with tools for AI-powered auto-tagging and auto-captioning, image cropping, and blotting out watermarks. It runs in any browser and is ideal for tagging and captioning images when training LoRAs or generating images locally. It also works well on low-end computers and laptops.

Upload images from a folder/directory, individual images and text files, or a zip file.

Choose from five free Gemini models, or enter a paid API key for OpenAI or Anthropic Claude models.

Note: If you want local caption generation, check out TagGUI. KCT is not suited to running local models, but TagGUI fits that purpose perfectly!

Built-in system prompts come with helpful descriptions you can edit or outright delete. Make your own!

KCT includes helpful information for new users who don't know how to caption properly!

Crop images and edit out text/watermarks with a simple built-in Paint tool.

Key Features

  • Recursive Folder Support: Load a root folder and KCT will load all images within it, maintaining the subfolder structure.
  • Multi-Model AI Support: Connect to hosted models from three providers.
    • Google Gemini: Free tier available (Gemini 2.0 Flash, etc.).
    • OpenAI: Support for GPT-4o and variants (requires a paid API key).
    • Anthropic: Support for Claude 3.5 Sonnet/Haiku (requires a paid API key).
  • Smart Auto-Captioning & Tagging:
    • Batch Control: Choose to Ignore existing text, Append to the end, or Overwrite completely.
    • Retry Logic: KCT automatically retries failed API requests (up to 3 times) with a cooldown to handle rate limits gracefully.
    • Prompt Presets: Save your favorite system prompts (e.g., "Danbooru Style", "Natural Language") and switch between them instantly.
  • Flexible Exports:
    • Save to Zip: Download the whole dataset structure as a compressed file.
    • Save to Folder: Export text files directly back to your local folder, preserving the directory layout and existing filenames.
  • Global Trigger Word: Easily set a trigger word that is automatically enforced as the first element of every caption.
  • Data Persistence: The tool remembers your API keys, model choices, and presets between sessions.
  • Tag Management:
    • Tag Viewer: Analyze tag frequency across your dataset.
    • Tag Mode vs Caption Mode: Switch the UI to optimize for comma-separated tags or block text captions.
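
The Retry Logic feature above (up to 3 retries with a cooldown) could look roughly like the following. This is a minimal sketch, not KCT's actual code: the names `sleep` and `withRetry`, the options object, and the 2000 ms default cooldown are all assumptions made for illustration.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Calls `requestFn` and retries it when it throws or rejects.
 * `maxRetries` mirrors the "up to 3 times" in the feature list;
 * `cooldownMs` is an illustrative value, not KCT's real setting.
 */
async function withRetry(requestFn, { maxRetries = 3, cooldownMs = 2000 } = {}) {
  let lastError;
  // One initial attempt plus up to `maxRetries` retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await requestFn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Pause before the next attempt so rate limits can reset.
        await sleep(cooldownMs);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A caller would wrap each API request, for example `await withRetry(() => fetch(url, options))`, where `url` and `options` stand in for whatever the chosen provider requires.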

Changes from TagPilot (Original tool by vavo)

  • No Image Renaming: KCT respects your existing filenames. When exporting, it keeps the original names and directory structure.
  • Direct File System Access: Now supports writing directly to disk (Chrome/Edge/Brave only) in addition to exporting images and .txt files in a zip file.
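
To illustrate what keeping the original names and directory structure means for the exported text files, here is a small sketch that derives a caption file path from an image path. The function name `captionPathFor` and the assumption that paths use forward slashes are illustrative and not taken from KCT's source.

```javascript
/**
 * Returns the path of the .txt caption that sits next to an image,
 * keeping the folder and the base filename unchanged.
 * Assumes forward-slash separated relative paths.
 */
function captionPathFor(imagePath) {
  const slash = imagePath.lastIndexOf("/");
  const dir = imagePath.slice(0, slash + 1); // "" when there is no folder
  const file = imagePath.slice(slash + 1);
  const dot = file.lastIndexOf(".");
  // Strip only the final extension; keep dotfiles and extensionless names whole.
  const stem = dot > 0 ? file.slice(0, dot) : file;
  return `${dir}${stem}.txt`;
}
```

For example, `captionPathFor("dataset/style_a/cat 01.png")` returns `"dataset/style_a/cat 01.txt"`.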

Setup & Installation

KCT is a client-side application, so no Python backend or Node.js server is required.

  1. Clone or Download the Repository
    git clone https://github.com/Klokinator/KlokCaptionTagger.git
  2. Run the Application
    Double-click KlokCaptionTagger.html to open it in your web browser.

Usage Workflow

  1. Load Data:
    • Choose Folder: Select a folder to recursively load images and existing .txt files.
    • Upload Zip: Upload a generic dataset zip.
  2. Configure Models:
    • Click "Model Settings".
    • Select your provider (Gemini, OpenAI, or Anthropic).
    • Enter your API Key.
    • Select the specific model ID you wish to use from the table.
  3. Tagging/Captioning:
    • Click "Tag All" or "Caption All".
    • Select a Preset or write a custom System Prompt.
    • Click the [i] beside System Prompt for more information.
    • Read the Descriptions for each preset prompt to understand what they do.
    • Write your own prompt descriptions and edit the default ones too!
    • Set your Limits (Max tags or Max words).
    • Choose a mode (Ignore/Append/Overwrite) and start.
  4. Cropping and Painting:
    • Click the Crop/Edit button to crop images or to blot out unwanted text and watermarks with a simple paint tool.
    • (A dedicated image editor is still recommended for larger image cleanup jobs.)
  5. Review:
    • Use the Tag Viewer to clean up unwanted tags globally.
    • Click on individual text boxes to perform manual edits.
  6. Export:
    • Save to Folder: Updates the .txt files in your actual directory (requires permission).
    • Save to Zip: Downloads a standalone package.
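
The batch modes from step 3 and the global trigger word can be sketched as two small functions. This is an illustration under stated assumptions, not KCT's implementation: the names `applyCaption` and `enforceTriggerWord` are hypothetical, "ignore" is assumed to mean that existing text is left untouched, and captions are assumed to be comma-separated.

```javascript
/**
 * Combines existing text with newly generated text.
 * Assumed semantics: "ignore" keeps existing text and only fills empty
 * captions, "append" adds to the end, "overwrite" replaces everything.
 */
function applyCaption(existing, generated, mode) {
  const current = existing.trim();
  const fresh = generated.trim();
  switch (mode) {
    case "ignore":
      return current === "" ? fresh : current;
    case "append":
      // The ", " separator is an assumption that suits tag-style captions.
      return current === "" ? fresh : `${current}, ${fresh}`;
    case "overwrite":
      return fresh;
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
}

/**
 * Moves (or adds) the trigger word to the front of a comma-separated caption.
 */
function enforceTriggerWord(caption, trigger) {
  const word = trigger.trim();
  if (word === "") return caption.trim();
  const parts = caption
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part !== "" && part !== word); // drop blanks and duplicates
  return [word, ...parts].join(", ");
}
```

For example, `enforceTriggerWord("1girl, klokstyle, smile", "klokstyle")` returns `"klokstyle, 1girl, smile"`, where `klokstyle` is a made-up trigger word.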

Technology Stack

  • HTML5 / JavaScript (ES6+): Core logic.
  • Tailwind CSS: UI styling (loaded via CDN).
  • JSZip: Client-side archiving.
  • File System Access API: For direct folder reading/writing.
  • LocalStorage: For saving user preferences.
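
As an illustration of how preferences might be kept in LocalStorage between sessions, the sketch below serializes a settings object to JSON. The key name `kct-settings`, the function names, and the shape of the settings object are assumptions for this example. The storage object is passed in as a parameter so the same code works with `window.localStorage` in a browser or with a stand-in object elsewhere.

```javascript
const SETTINGS_KEY = "kct-settings"; // hypothetical key name

/** Stores the settings object as a JSON string. */
function saveSettings(storage, settings) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

/** Reads settings back, merging them over `defaults`. */
function loadSettings(storage, defaults = {}) {
  const raw = storage.getItem(SETTINGS_KEY);
  if (raw === null) return { ...defaults }; // nothing saved yet
  try {
    return { ...defaults, ...JSON.parse(raw) };
  } catch {
    // Unreadable entry: fall back to the defaults.
    return { ...defaults };
  }
}
```

In a browser this would be called as `saveSettings(window.localStorage, { provider: "gemini" })` and later `loadSettings(window.localStorage, { provider: "gemini" })`.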

Klok Caption Tagger is based on the excellent TagPilot HTML code by vavo on GitHub!
