Image Captioner

Image Captioner makes it trivial to generate image captions from Rust code using the BLIP model from Salesforce. All processing happens on your device. After the initial model download, processing an image takes ~5 seconds on an M1 MacBook Pro, no GPU required.

The captions are pretty good. For this image, the automatically generated caption is "a laptop on fire".

Example Usage

Assuming you have an image in your crate root called image.jpg:

use image_captioner::get_caption;
use std::path::Path;

fn main() {
    // This path is relative to the directory you run your Rust application from,
    // usually the crate root.
    let image_path = Path::new("./image.jpg");

    // The first run is slow because it has to download the model,
    // which is 990 MB.
    match get_caption(image_path) {
        Ok(caption) => println!("Caption: {}", caption),
        Err(err) => eprintln!("Error: {:?}", err),
    }
}
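
To caption several images in one run, a small helper can collect a result per image so that one unreadable file does not stop the rest. The sketch below is not part of the crate: caption_all is a hypothetical name, and the helper takes the captioning function as a parameter. It assumes get_caption can be passed as a function from &Path to a Result holding a String, which matches how the example above uses it but is not confirmed by the source.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of image_captioner): runs `captioner`
/// on every path and keeps each outcome next to the path it came from,
/// so a failure on one image does not abort the batch.
fn caption_all<F, E>(paths: &[PathBuf], captioner: F) -> Vec<(PathBuf, Result<String, E>)>
where
    F: Fn(&Path) -> Result<String, E>,
{
    paths
        .iter()
        .map(|p| (p.clone(), captioner(p.as_path())))
        .collect()
}
```

In an application this would be called as caption_all(&paths, get_caption), or with a closure wrapping get_caption if its real signature differs from the assumption above.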

About the BLIP Deep Learning Model

BLIP (Bootstrapping Language-Image Pre-training) is a model released by Salesforce in 2022 that excels at a number of vision + language tasks, including image captioning. It's permissively licensed (BSD 3-Clause), allowing use in both personal and commercial projects.

For more info, see the BLIP model card on Hugging Face.

Model Download

The first time you run image_captioner, it automatically downloads the BLIP model. This process requires an internet connection. The model is 990 MB, so the download may take some time.

The model is downloaded and cached by the transformers Python library. The default cache location is typically:

Linux/macOS: ~/.cache/huggingface/hub
Windows: C:\Users\<username>\.cache\huggingface\hub

This location may vary based on your system configuration and environment settings. The transformers library manages the cache automatically, so later runs load the model from disk instead of downloading it again.
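
To check from Rust where the model is likely to have been cached, the lookup order used by the Hugging Face libraries can be approximated as follows. This is a sketch, not something image_captioner provides: hf_hub_cache_dir is a hypothetical name, and the order (HF_HUB_CACHE, then HF_HOME, then XDG_CACHE_HOME, then the home directory) reflects the documented Hugging Face environment variables as I understand them. Older overrides such as TRANSFORMERS_CACHE are not handled.

```rust
use std::path::PathBuf;

/// Hypothetical helper: guesses the Hugging Face hub cache directory.
/// `get_var` looks up an environment variable by name; passing it in
/// keeps the function easy to test without touching the real environment.
fn hf_hub_cache_dir<F>(get_var: F) -> Option<PathBuf>
where
    F: Fn(&str) -> Option<String>,
{
    // An explicit hub cache override wins.
    if let Some(dir) = get_var("HF_HUB_CACHE") {
        return Some(PathBuf::from(dir));
    }
    // Otherwise the hub cache lives in the "hub" folder under HF_HOME.
    if let Some(home) = get_var("HF_HOME") {
        return Some(PathBuf::from(home).join("hub"));
    }
    // HF_HOME itself defaults to "huggingface" under the XDG cache dir.
    if let Some(xdg) = get_var("XDG_CACHE_HOME") {
        return Some(PathBuf::from(xdg).join("huggingface").join("hub"));
    }
    // Fall back to ~/.cache/huggingface/hub (USERPROFILE on Windows).
    let home = get_var("HOME").or_else(|| get_var("USERPROFILE"))?;
    Some(PathBuf::from(home).join(".cache").join("huggingface").join("hub"))
}
```

With the real environment, call it as hf_hub_cache_dir(|name| std::env::var(name).ok()). It returns None only when none of the variables above are set.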

abusch419/image_captioner
