Motorhead is a memory and information retrieval server for LLMs.
When building chat applications using LLMs, memory handling is something that has to be built every time. Motorhead is a server to assist with that process. It provides 3 simple APIs:
- GET
/sessions/:id/memory
returns messages up toMAX_WINDOW_SIZE
.
{
"messages": [
{
"role": "AI",
"content": "Electronic music and salsa are two very different genres of music, and the way people dance to them is also quite different."
},
{
"role": "Human",
"content": "how does it compare to salsa?"
},
{
"role": "AI",
"content": "Electronic music is a broad genre that encompasses many different styles, so there is no one \"right\" way to dance to it."
},
{
"role": "Human",
"content": "how do you dance electronic music?"
},
{
"role": "AI",
"content": "Colombia has a vibrant electronic music scene, and there are many talented DJs and producers who have gained international recognition."
},
{
"role": "Human",
"content": "What are some famous djs from Colombia?"
},
{
"role": "AI",
"content": "Baum opened its doors in 2014 and has quickly become one of the most popular clubs for electronic music in Bogotá."
}
],
"context": "The conversation covers topics such as clubs for electronic music in Bogotá, popular tourist attractions in the city, and general information about Colombia. The AI provides information about popular electronic music clubs such as Baum and Video Club, as well as electronic music festivals that take place in Bogotá. The AI also recommends tourist attractions such as La Candelaria, Monserrate and the Salt Cathedral of Zipaquirá, and provides general information about Colombia's diverse culture, landscape and wildlife.",
"tokens": 744 // tokens used for incremental summarization
}
- POST
/sessions/:id/memory
- Send an array of messages to Motorhead to store.
curl --location 'localhost:8080/sessions/${SESSION_ID}/memory' \
--header 'Content-Type: application/json' \
--data '{
"messages": [{ "role": "Human", "content": "ping" }, { "role": "AI", "content": "pong" }]
}'
Either an existing or new SESSION_ID
can be used when storing messages, and the session is automatically created if it did not previously exist.
Optionally, context
can be send in if it needs to get loaded from another datastore.
- DELETE
/sessions/:id/memory
- deletes the session's message list.
A max window_size
is set for the LLM to keep track of the conversation. Once that max is hit, Motorhead will process (window_size / 2
messages) and summarize them. Subsequent summaries, as the messages grow, are incremental.
- POST
/sessions/:id/retrieval
- searches by text query using VSS.
curl --location 'localhost:8080/sessions/${SESSION_ID}/retrieval' \
--header 'Content-Type: application/json' \
--data '{
"text": "Generals gathered in their masses, just like witches in black masses"
}'
Searches are segmented (filtered) by the session id provided automatically.
MOTORHEAD_MAX_WINDOW_SIZE
(default:12) - Number of max messages returned by the server. When this number is reached, a job is triggered to halve it.MOTORHEAD_LONG_TERM_MEMORY
(default:false) - Enables long term memory using Redisearch VSS.MOTORHEAD_MODEL
(default:gpt-3.5-turbo) - Model used to run the incremental summarization. Usegpt-3.5-turbo
orgpt-4
- otherwise some weird things might happen.PORT
(default:8000) - Motorhead Server PortOPENAI_API_KEY
(required)- Your api key to connect to OpenAI.REDIS_URL
(required)- URL used to connect toredis
.
With docker-compose:
docker-compose build && docker-compose up
Or you can use the image docker pull ghcr.io/getmetal/motorhead:latest
directly:
docker run --name motorhead -p 8080:8080 -e PORT=8080 -e REDIS_URL='redis://redis:6379' -d ghcr.io/getmetal/motorhead:latest