A Python tool for computing "runs" of text->image and image->text models (with outputs fed recursively back in as inputs) and analysing the resulting text-image-text-image trajectories using topological data analysis.
If you've got a sufficiently capable rig, you can use this tool to:
- specify text->image and image->text generative AI models in various "network topologies"
- starting from a specified initial prompt & random seed, recursively feed the output of one model in as the input of the next to create a "run" of model invocations
- embed each (text) output into a joint embedding space using a semantic embedding model
The results of all the above computations will be stored in a local sqlite database for further analysis (see the datavis module for existing visualizations, or write your own).
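Conceptually, a run looks something like the sketch below. The function names here are placeholders to illustrate the feedback loop, not the package's actual API (the real pipeline lives in the panic_tda package and persists everything to the sqlite database):

def text_to_image(prompt, seed):
    # stand-in for a real t2i model invocation (e.g. FluxDev)
    return f"<image generated from {prompt!r} with seed {seed}>"

def image_to_text(image):
    # stand-in for a real i2t model invocation (e.g. Moondream)
    return f"a caption describing {image}"

def embed_text(text):
    # stand-in for a real embedding model (e.g. Nomic); returns a vector
    return [float(len(text))]

def perform_run(initial_prompt, seed, max_length):
    trajectory = []
    text = initial_prompt
    for _ in range(max_length // 2):
        image = text_to_image(text, seed)  # t2i step
        text = image_to_text(image)        # i2t step; fed back in as the next input
        trajectory.append((image, text, embed_text(text)))
    return trajectory

trajectory = perform_run("Test prompt 1", seed=42, max_length=10)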
Note: this tool was initially motivated by the PANIC! art installation (first exhibited 2022); see DESIGN.md for more details. Watching PANIC! in action, it's clear that there is some structure to the trajectories that the genAI model outputs "trace out". This tool is an attempt to quantify and understand that structure (see why? below).
- python 3.12 (the giotto-ph dependency doesn't have wheels for 3.13)
- a GPU which supports CUDA 12.7 (earlier versions may work, but are untested)
- sqlite3
- ffmpeg (for generating the output videos)
With uv, it's just:
# install the package
uv pip install -e .
# run the main CLI
uv run panic-tda --help
If you're the sort of person who has alternate (strong) opinions on how to manage python environments, then you can probably figure out how to do it your preferred way. Godspeed to you.
The main CLI is panic-tda.
$ uv run panic-tda --help
Usage: panic-tda [OPTIONS] COMMAND [ARGS]...
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ experiment Manage experiments │
│ run Manage runs │
│ cluster Manage clustering │
│ model Manage models │
│ export Export data and visualizations │
│ script │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
The most useful subcommands are:
- experiment perform: Run a panic-tda experiment defined in a configuration file
- experiment show: Get the status of an experiment (% complete, broken down by stage: invocation/embedding/persistence diagram)
- experiment list: List all experiments stored in the database, with options for detailed output
- model list: List all the supported models (both genAI t2i/i2t and embedding models)
- export video: Export a "mosaic" video from all the runs in a given experiment
To run an experiment, you'll need to create a configuration file. Here's an example:
// experiment1.json
{
"networks": [["FluxDev", "Moondream"]],
"seeds": [42, 56, 545654, 6545],
"prompts": ["Test prompt 1"],
"embedding_models": ["Nomic"],
"max_length": 100
}
The schema for the configuration file is based on ExperimentConfig in the schema module (that's the class it needs to parse cleanly into).
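If you'd rather generate configuration files programmatically (e.g. to sweep over many prompts or seeds), a short sketch like the following works; the field names below are just those from the example above, so check ExperimentConfig for the authoritative schema:

import json

# build a config dict with the same fields as the example above
config = {
    "networks": [["FluxDev", "Moondream"]],  # each network is a list of t2i/i2t model names
    "seeds": [42, 56, 545654, 6545],         # random seeds for the runs
    "prompts": ["Test prompt 1"],            # initial prompt(s)
    "embedding_models": ["Nomic"],           # embedding model(s) for the text outputs
    "max_length": 100,                       # length of each run
}

with open("experiment1.json", "w") as f:
    json.dump(config, f, indent=2)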
Then, to "run" the experiment:
# Run an experiment with the above configuration
uv run panic-tda experiment perform experiment1.json
# check the status of the experiment
uv run panic-tda experiment show
# List all experiments in the database
uv run panic-tda experiment list
# Export a mosaic video from a specific experiment
uv run panic-tda export video 123e4567-e89b-12d3-a456-426614174000
If you're running it on a remote machine and kicking it off via ssh, you'll probably want to use nohup or tmux or something to keep it running after you log out (look at the perform-experiment.sh file for ideas on how to do this).
There are (relatively) comprehensive tests in the tests directory. They can be run with pytest:
uv run pytest
A few of the tests (anything which involves actual GPU processing) are marked slow and won't run by default. To run them, use:
uv run pytest -m slow
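If you add a GPU-heavy test of your own, give it the same marker so it stays out of the default test run (the marker name is slow, as in the command above; the test body here is just a placeholder):

import pytest

@pytest.mark.slow  # selected with `pytest -m slow`, skipped in the default run
def test_my_gpu_heavy_thing():
    # placeholder: the real slow tests invoke actual models on the GPU
    assert True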
If you'd like to add a new genAI or embedding model, have a look in genai_models or embeddings respectively, and add a new subclass which implements the desired interface. Look at the existing models and work from there - and don't forget to add a new test to tests/test_genai_models.py or tests/test_embeddings.py.
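Purely as an illustration of the general shape (the class name, attribute and method signature below are hypothetical; copy the actual interface from an existing model in the relevant module):

import numpy as np

class MyEmbeddingModel:
    """Hypothetical embedding model: maps a batch of texts to fixed-size vectors."""

    dimension = 768  # dimensionality of the vectors this model produces

    def embed(self, texts):
        # a real implementation would call an actual embedding model here;
        # this placeholder just returns deterministic dummy vectors
        rng = np.random.default_rng(sum(len(t) for t in texts))
        return rng.normal(size=(len(texts), self.dimension))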
For further info, see the design doc.
At the School of Cybernetics we love thinking about the way that feedback loops (and the connections between things) define the behaviour of the systems in which we live, work and create. That interest sits behind the design of PANIC! as a tool for making (and breaking!) networks of hosted generative AI models.
Anyone who's played with (or watched others play with) PANIC! has probably had one of the following questions cross their mind at some point. One goal in building PANIC! is to provide answers to these questions which are both quantifiable and satisfying (i.e. answers which feel like they represent deeper truths about the process).
- was it predictable that it would end up here?
- how sensitive is it to the input, i.e. would it still have ended up here with a slightly different prompt?
- how sensitive is it to the random seed(s) of the models?
- the text/images it's generating now seem to be "semantically stable"; will it ever move on to a different thing?
- can we predict in advance which initial prompts lead to a "stuck" trajectory?
- how similar is this run's trajectory to previous runs?
- what determines whether they'll be similar? the initial prompt, or something else?
- does a certain genAI model "dominate" the behaviour of the network? or is the prompt more important? or the random seed? or is it an emergent property of the interactions between all models in the network?
Ben Swift wrote the code, and Sunyeon Hong is the mastermind behind the TDA stuff.
MIT