Calliope is a framework meant to make modern AI tools like generative AI (large language and image generation models), computer vision, and vector databases accessible for use by artists creating interactive art works.

chrisimmel/calliope

Calliope

(Image: a Calliope self-portrait)

In Greek mythology, Calliope (/kəˈlaɪ.əpi/ kə-LY-ə-pee; Ancient Greek: Καλλιόπη, romanized: Kalliópē, lit. 'beautiful-voiced') is the Muse who presides over eloquence and epic poetry; so called from the ecstatic harmony of her voice. Hesiod and Ovid called her the "Chief of all Muses".

Calliope is a framework meant to make modern AI tools like generative AI (large language models and image generation models), computer vision, and vector databases accessible for use by artists creating interactive art works. The core system is a flexible framework, service, and API that enables an artist to build repeatable interaction strategies. The API can accept inputs such as images, text, and voice, then process these through an artist-defined pipeline of AI models to generate text and image output.

The focus is on enabling the creation of works that are "aware" of the environment in which they are installed or running, in the sense that they can see, hear, and react to things or people in that environment. This is currently limited to image input, such as from a webcam, but the hope is to extend it to cover audio input as well, including speech recognition.

  • Processing is driven by pluggable modules called story strategies (or "storytellers" in Clio parlance), meant to be experimented with and extended by the artist-engineers who make use of the framework.

  • AI models can be commercial or open models accessed via APIs (HuggingFace, OpenAI, Stability, Replicate, Azure, etc.), or open source and/or fine-tuned models hosted locally or in the cloud. Examples include GPT-4, GPT-3, Stable Diffusion, DALL-E 2, MiniGPT-4, LLaMa 2, Claude, and FLAN.

  • Images are interpreted by a combination of a multimodal LLM (MiniGPT-4) and the Azure computer vision API to generate a rich text description, lists of recognized objects and text, and metadata that can be passed to other components as input.

  • Large language model prompts are stored, manipulated, and applied along with the other processing modules in graphs (prompt chaining) that can be pre-programmed or dynamically created at runtime. LLMs are driven via LangChain (although this isn't central to the framework).

  • A semantic search facility is provided using the Pinecone vector database, with a scheduled ETL pipeline to index generated media.
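To make the idea of pluggable story strategies and prompt chaining concrete, here is a minimal sketch in Python. It is not Calliope's actual code: `FrameContext`, `STRATEGIES`, `register_strategy`, `run_strategy`, and the two steps are hypothetical names invented for illustration, and the "LLM" is a stub that makes no model call. It shows only a linear chain, the simplest case of the graphs described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FrameContext:
    """Hypothetical state passed from step to step while building one frame."""
    story_so_far: str = ""
    image_description: Optional[str] = None  # e.g. text produced by image analysis
    prompt: str = ""
    text: str = ""


# A step takes the context and returns it, possibly modified.
Step = Callable[[FrameContext], FrameContext]

# Hypothetical registry mapping a strategy name to its ordered steps.
STRATEGIES: Dict[str, List[Step]] = {}


def register_strategy(name: str, steps: List[Step]) -> None:
    """Make a strategy available under the given name."""
    STRATEGIES[name] = list(steps)


def build_prompt(ctx: FrameContext) -> FrameContext:
    """Fold the story so far and the image description into one prompt."""
    scene = ctx.image_description or "nothing in particular"
    ctx.prompt = (
        f"Story so far: {ctx.story_so_far}\n"
        f"The camera sees: {scene}\n"
        "Continue the story in one or two sentences."
    )
    return ctx


def stub_llm(ctx: FrameContext) -> FrameContext:
    """Stand-in for a real model call; it only reports the prompt size."""
    ctx.text = f"[stub continuation from a {len(ctx.prompt)}-character prompt]"
    return ctx


def run_strategy(name: str, ctx: FrameContext) -> FrameContext:
    """Run each step of the named strategy in order."""
    for step in STRATEGIES[name]:
        ctx = step(ctx)
    return ctx


register_strategy("demo", [build_prompt, stub_llm])

frame = run_strategy(
    "demo",
    FrameContext(story_so_far="The lighthouse was dark.", image_description="a red bicycle"),
)
print(frame.text)
```

In a design like this, an artist could experiment by registering a new list of steps under a new name, without touching the code that runs them.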

In its present incarnation, Calliope invents and recites stories. It can do so through any client of its story API. The two existing clients are:

  • An ESP32-Sparrow -- one of a family of bespoke hardware devices with a screen and optional input sensors such as camera and microphone.
  • Clio -- a small TypeScript client included in this repo, runnable in any browser on desktop or mobile devices. Clio accepts image input from any accessible webcam and passes this with its request for a story continuation. Calliope uses this input to condition its continuation of the story.
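As a rough illustration of what "passing an image with a request for a story continuation" could look like from a client's side, the sketch below assembles a JSON request body in Python. Everything about the shape of this body is an assumption: the field names `story_id`, `strategy`, and `input_image`, and the choice of base64 encoding, are hypothetical and not taken from Calliope's API. No network call is made.

```python
import base64
import json
from typing import Optional


def build_frame_request(
    story_id: Optional[str] = None,
    image_bytes: Optional[bytes] = None,
    strategy: Optional[str] = None,
) -> str:
    """Build a JSON body for a hypothetical "next frame" request.

    All field names here are illustrative assumptions, not Calliope's schema.
    """
    body = {}
    if story_id is not None:
        body["story_id"] = story_id
    if strategy is not None:
        body["strategy"] = strategy
    if image_bytes is not None:
        # Binary image data is base64-encoded so it can travel inside JSON.
        body["input_image"] = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps(body)


payload = build_frame_request(story_id="example-story", image_bytes=b"\x89PNG fake bytes")
print(payload)
```

Optional fields are left out of the body entirely when they are not supplied, so a request with no image simply asks the storyteller to continue on its own.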

Try it Out!

You can try Calliope and Clio at https://calliope.chrisimmel.com/clio/.

Hints:

  • Clio works with "storytellers" in Calliope to construct a story for you, one frame at a time. You request a new frame by tapping any of the buttons at the bottom of the screen.

    • Click the plus icon to let the storyteller simply continue along its current train of thought.
    • Click the microphone icon and speak a few words to give the storyteller an idea or inspiration.
    • Click the camera icon to take a photo and send it to the storyteller for inspiration.
  • After you do this, Calliope will work for several seconds and give you a new frame.

  • You can review previous frames by swiping to the right. (Clicking the arrows also works.)

  • Input images and sounds are not kept on the Calliope server, so you don't need to worry about it hoarding a cache of your photos and sound clips!

  • You can start a new story at any time from the menu. The "Create New Story" option lets you start a new story using the storyteller of your choice, from either a photo, a spoken sound clip, or "thin air".

  • You can browse stories you've created in the past and select them to either review or update them.

  • Coming soon:

    • Bookmark stories and share them with friends.
