An API for translating between arbitrary inputs and a local LLM over HTTP.

gabrilend/bot-chat-api

chatbot

A terminal-based LLM chat client written in Lua that connects to Ollama for local language model inference. Features peer-to-peer operation exchange, distributed inference across multiple machines, recursive task decomposition, and a rich terminal UI with markdown rendering and syntax highlighting.

Features

  • Interactive chat with any Ollama-hosted model, including streaming responses and thinking/reasoning mode
  • Peer-to-peer mode — two nodes exchange operations and execute against their own local contexts, tracking divergences rather than forcing consensus
  • Distributed LLM inference — split model layers across multiple GPUs/machines with pipeline parallelism
  • Recursive task decomposition — break complex tasks into independent sub-tasks, each with its own LLM context and full tool access
  • Tool system — automatic discovery of executable tools, with built-in file I/O, code writing, and custom tool support
  • Rich terminal UI — real-time markdown rendering, syntax highlighting (Lua, C, Bash, Python, etc.), table formatting, and interactive model selection
  • Blind mode — hide input while typing for voice input or privacy
  • Context management — local resource access including filesystem, environment, processes, and system info

Requirements

  • LuaJIT or Lua 5.1+
  • Ollama running locally or on a reachable host
  • Linux (Wayland or X11)

Setup

  1. Clone the repository:
git clone https://github.com/gabrilend/bot-chat-api.git
cd bot-chat-api
  2. Install dependencies:
./scripts/install-libs.sh
  3. Initialize configuration:
./chatbot.lua --init
  4. Edit config/library_config.lua to point at your Ollama instance:
return {
    host = "localhost",
    port = 11434,
    model = "llama3",
    timeout = 120,
}

Usage

# Start an interactive chat session
./chatbot.lua

# Select a specific model
./chatbot.lua --model gemma2

# Hide input while typing (blind/speak mode)
./chatbot.lua --blind

# Listen for a peer connection
./chatbot.lua --peer-listen=9000

# Connect to a peer
./chatbot.lua --peer-connect=192.168.1.10:9000

Environment variables

Variable          Description
CHAT_HOST         Override Ollama host
CHAT_PORT         Override Ollama port
CHAT_MODEL        Override default model
CHATBOT_DEBUG=1   Enable debug logging
CHATBOT_BLIND=1   Enable blind mode
PEER_LISTEN       Port to listen on for peer connections
PEER_CONNECT      host:port of a peer to connect to
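The precedence between these variables and the config file is not spelled out here; a common convention, assumed in this sketch, is that an environment variable wins over the config-file default when set:

```shell
# Sketch of the assumed precedence; "gpu-box" is a hypothetical host name.
unset CHAT_HOST
config_host="localhost"             # default from config/library_config.lua
host="${CHAT_HOST:-$config_host}"   # use CHAT_HOST if set, else the default
echo "with no override: $host"

CHAT_HOST="gpu-box"                 # simulate exporting an override
host="${CHAT_HOST:-$config_host}"
echo "with CHAT_HOST set: $host"
```

The `${VAR:-default}` expansion mirrors the usual "environment beats config" lookup order.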

Architecture

chatbot.lua              CLI entry point, model selection UI
core/
  chat.lua               Chat client, tool discovery, peer integration
  ui.lua                 Terminal UI, markdown rendering, syntax highlighting
  peer.lua               WebSocket peer connection management
  operation.lua          Operation abstraction (everything is a tool call)
  executor.lua           Execute operations against local context
  context.lua            Local resource access (fs, env, proc, sys)
  divergence.lua         Track result divergences between peers
  transport.lua          TCP/WebSocket transport layer
  tasklist.lua           Recursive task decomposition (make_list tool)
  distributed/
    coordinator.lua      Distributed inference session management
    tensor.lua           Tensor serialization for network transfer
config/
  chatbot_config.lua     Application settings
  library_config.lua     Ollama/model settings
libs/                    Bundled dependencies (luasocket, dkjson, cJSON, etc.)
wrappers/                C, Bash, and Lua API bindings
docs/                    Guides for tools, configuration, and design

Peer-to-peer mode

Two nodes connect over WebSocket and exchange operations. Each node executes operations against its own local context. When results differ, divergences are tracked — never forcibly reconciled. This preserves each node's local truth while maintaining awareness of the other's perspective.
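The divergence idea can be illustrated with a conceptual sketch — not the actual peer protocol — in which both peers run the same operation ("list a directory") against their own local context and a mismatch is simply recorded, never reconciled:

```shell
# Two simulated peer contexts; peer B has an extra local file.
mkdir -p peer_a/notes peer_b/notes
echo "shared" > peer_a/notes/shared.txt
echo "shared" > peer_b/notes/shared.txt
echo "only here" > peer_b/notes/local.txt

# Each "peer" executes the same operation against its own context.
result_a=$(ls peer_a/notes)
result_b=$(ls peer_b/notes)

if [ "$result_a" = "$result_b" ]; then
  echo "results agree"
else
  echo "divergence recorded"   # each peer keeps its own local truth
fi
```

In the real system this comparison happens per operation over the WebSocket link, with core/divergence.lua keeping the record.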

Distributed inference

Model layers can be split between two machines using pipeline parallelism. A coordinator assigns layer ranges and manages the inference session while activation tensors are serialized and transferred over the network. Tokens stream to both peers as they are generated.
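As a rough sketch of what the coordinator decides, consider an even split of a hypothetical 32-layer model between two peers; the real assignment logic lives in core/distributed/coordinator.lua and may choose ranges differently:

```shell
# Hypothetical even layer split for pipeline parallelism across two peers.
TOTAL_LAYERS=32
SPLIT=$((TOTAL_LAYERS / 2))
echo "peer A runs layers 0-$((SPLIT - 1))"              # first pipeline stage
echo "peer B runs layers $SPLIT-$((TOTAL_LAYERS - 1))"  # second pipeline stage
```

Peer A's final-layer activations are what get serialized by core/distributed/tensor.lua and shipped to peer B each step.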

Tool system

Tools are executables that respond to --tool-info with a JSON description and accept JSON arguments on stdin. The chatbot automatically discovers tools in libs/tools/ and project-level tools/ directories. See docs/tools-guide.md for details on creating custom tools.
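The contract above can be sketched with a minimal tool. The `greet` name, its JSON schema, and the sed-based argument parsing are made up for illustration; only the `--tool-info` / JSON-on-stdin shape follows the description above:

```shell
# Write a minimal tool (hypothetical name: greet) into the current directory.
cat > greet <<'EOF'
#!/bin/sh
if [ "$1" = "--tool-info" ]; then
  # Describe the tool as JSON when asked.
  echo '{"name":"greet","description":"Greets a user","parameters":{"name":{"type":"string"}}}'
  exit 0
fi
# Otherwise read JSON arguments from stdin; naive extraction of "name".
args=$(cat)
name=$(echo "$args" | sed -n 's/.*"name"[": ]*"\([^"]*\)".*/\1/p')
echo "Hello, $name!"
EOF
chmod +x greet

# Ask the tool to describe itself, then invoke it with JSON arguments.
./greet --tool-info
echo '{"name":"world"}' | ./greet
```

A real tool would use a proper JSON parser rather than sed, and would live in libs/tools/ or a project-level tools/ directory so the chatbot can discover it.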

Configuration

config/chatbot_config.lua:

return {
    output_line_width = 100,   -- terminal text wrapping width
    format_tables = true,      -- render markdown tables
    show_vision_debug = false, -- show vision model descriptions
}

See docs/configuration.md for the full reference.

License

See the repository for license details.
