human-bee/PRESENT

custom Voice AI Application

A sophisticated Next.js application that combines custom AI's generative UI capabilities with LiveKit's real-time voice agents and Model Context Protocol (MCP) integration.

🎯 Features

  • Voice-Enabled AI Agent: Real-time voice interactions powered by LiveKit and OpenAI
  • Generative UI Components: Dynamic UI generation through custom AI
  • MCP Integration: Connect to various AI tools and services via Model Context Protocol
  • MCP Apps: Render tool-provided UI views in sandboxed iframes via McpAppWidget (see docs/mcp-apps.md)
  • Multi-Modal Interactions: Support for both chat and voice interfaces
  • Canvas Collaboration: Interactive canvas with AI-generated components
  • Demo Showcases: Live captions, presentation deck, and toolbar demonstrations

🚀 Getting Started

Prerequisites

  • Node.js 18+
  • Valid API keys for:
    • custom AI
    • LiveKit (Cloud or self-hosted)
    • OpenAI
    • Supabase (for auth/storage)

Installation

  1. Clone the repository
git clone <your-repo-url>
cd PRESENT
  2. Install dependencies
npm install
  3. Set up environment variables
  • Copy example.env.local to .env.local

  • Fill in all required API keys:

    NEXT_PUBLIC_custom_API_KEY=
    LIVEKIT_API_KEY=
    LIVEKIT_API_SECRET=
    LIVEKIT_URL=
    OPENAI_API_KEY=
    ANTHROPIC_API_KEY=           # optional, enables Claude models for the canvas steward
    NEXT_PUBLIC_SUPABASE_URL=
    NEXT_PUBLIC_SUPABASE_ANON_KEY=
    # Optional: voice realtime tuning
    VOICE_AGENT_TRANSCRIPTION_ENABLED=true
    VOICE_AGENT_INPUT_TRANSCRIPTION_MODEL=gpt-4o-mini-transcribe
    VOICE_AGENT_MIC_PROFILE=noisy_room
    

    Optional canvas steward controls (the legacy browser TLDraw agent is archived and disabled by default):

    CANVAS_STEWARD_MODEL=claude-haiku-4-5        # override the default model (falls back if provider unavailable)
    NEXT_PUBLIC_CANVAS_AGENT_CLIENT_ENABLED=false # legacy DOM TLDraw agent (leave false unless debugging an edge case; enabling also skips the server steward so the browser is the sole executor)
    NEXT_PUBLIC_CANVAS_AGENT_THEME_ENABLED=true   # keep TLDraw branding even when the client agent is disabled
    CANVAS_QUEUE_DIRECT_FALLBACK=false            # set true to execute the steward immediately when queue inserts fail (may duplicate actions)
    CANVAS_STEWARD_DEBUG=false                   # set true to dump prompts/actions to the server logs
    

Running the Application

Development Mode (Recommended)

Run in three terminals (new architecture):

Terminal 1 - Voice Agent (Realtime, start first):

npm run agent:realtime

Terminal 2 - Conductor (Agents SDK):

npm run agent:conductor

Terminal 3 - Next.js App:

npm run dev

Visit http://localhost:3000

Important: Start the agents before the web app to ensure proper connection.

Optional - TLDraw Sync Server:

npm run sync:dev

Runs the local TLDraw sync server so the canvas stays in sync across sessions.

Launch Entire Stack at Once

Prefer running everything in the background? Use the helper script:

npm run stack:start

This boots livekit-server --dev (as lk:server:dev), sync:dev, agent:conductor, agent:realtime, and next dev concurrently, writing output to logs/*.log so you can tail the services you care about.

For a foreground-supervised run with auto-restart on crash (and process lock ownership across worktrees), use:

STACK_MONITOR=1 npm run stack:start -- --realtime --sync --conductor --livekit --web

To stop this mode cleanly, run npm run stack:stop from the same workspace; it now terminates any active monitor process bound to that tree first.

If you intentionally mix ownership across worktrees, set STACK_AUTO_STOP_FOREIGN_MONITORS=0 to fail fast instead of auto-stopping monitor/stack processes that belong to another worktree. For service-level enforcement, set STACK_AUTO_STOP_FOREIGN_SERVICES=0 to fail fast when matching services were already started from another workspace; leave it at 1 (the default) to stop and restart them automatically.

To stop all background services cleanly, run:

npm run stack:stop

The script reads the PID files in logs/ and terminates each dev process, removing stale entries along the way.

Need to collaborate from multiple devices? Run the share helper:

npm run stack:share

This restarts the entire stack and spins up ngrok tunnels for the Next.js app (:3000), TLDraw sync server (:3100), and LiveKit control ports (:7880/:7882). Install ngrok locally (and set your authtoken) beforehand; the script prints the public URLs plus the dashboard address so you can distribute the links quickly.

Production Mode

Terminal 1 - Voice Agent:

npm run agent:realtime

Terminal 2 - Conductor:

npm run agent:conductor

Terminal 3 - Next.js App:

npm run build    # Build once
npm run start    # Run production server

📱 Key Features & Pages

  • / - Landing page with setup checklist
  • /chat - custom AI chat interface with MCP integration
  • /voice - Voice assistant with speech-to-text display
  • /canvas - Interactive canvas with voice agent integration
  • /mcp-config - Configure MCP servers
  • /demo/live-captions - Real-time transcription demo
  • /demo/presentation-deck - Interactive presentation system
  • /demo/livekit-toolbar - LiveKit UI components testing

Configure Model Context Protocol (MCP) Servers

Navigate to http://app.present.best/mcp-config to add MCP servers.

For the demo above, we used smithery.ai's brave-search-mcp.

You can use any MCP compatible server that supports SSE or HTTP.

Our MCP config page is built using the custom-ai/react/mcp package:

// In your chat page
<customProvider
  apiKey={process.env.NEXT_PUBLIC_custom_API_KEY!}
  components={components}
>
  <customMcpProvider mcpServers={mcpServers}>
    <MessageThreadFull contextKey="custom-template" />
  </customMcpProvider>
</customProvider>

In this example, MCP servers are stored in browser localStorage and loaded when the application starts.

You could have these servers be stored in a database or fetched from an API.
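
To make the storage step concrete, here is a minimal sketch of loading a saved server list defensively. The storage key `mcp-servers`, the `McpServerConfig` shape, and the function name are all assumptions for illustration; the app's actual key and types may differ.

```typescript
// Hypothetical shapes and storage key; the real app may differ.
type McpTransport = "sse" | "http";

interface McpServerConfig {
  url: string;
  transport: McpTransport;
}

// Minimal subset of the Web Storage API so the function also works with test doubles.
interface StorageLike {
  getItem(key: string): string | null;
}

const MCP_SERVERS_KEY = "mcp-servers"; // assumed key name

// Reads the saved server list, returning [] on missing or malformed data.
function loadMcpServers(storage: StorageLike): McpServerConfig[] {
  const raw = storage.getItem(MCP_SERVERS_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    // Keep only entries that have a string URL and a supported transport.
    return parsed.filter(
      (entry): entry is McpServerConfig =>
        typeof entry === "object" &&
        entry !== null &&
        typeof (entry as McpServerConfig).url === "string" &&
        ((entry as McpServerConfig).transport === "sse" ||
          (entry as McpServerConfig).transport === "http"),
    );
  } catch {
    return []; // corrupted JSON should not break app start-up
  }
}
```

In the browser this could be called as `loadMcpServers(window.localStorage)`. Because the function depends only on `getItem`, the same validation could sit in front of a database or API response by adapting the source to the `StorageLike` interface.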

For more detailed documentation, visit custom's official docs.

Customizing

Change what components custom can control

You can see how the Graph component is registered with custom in src/lib/custom.ts:

const components: customComponent[] = [
  {
    name: "Graph",
    description:
      "A component that renders various types of charts (bar, line, pie) using Recharts. Supports customizable data visualization with labels, datasets, and styling options.",
    component: Graph,
    propsSchema: graphSchema, // zod schema for the component props
  },
  // Add more components
];

You can find more information about the options here

Canvas branding (TLDraw defaults and tiny UI tweaks)

Set TLDraw’s default look-and-feel and a few tasteful UI tweaks via a focused hook.

  • Hook: src/components/ui/canvas/hooks/useTldrawBranding.ts
  • Used at: src/components/ui/canvas/canvas-space.tsx:~280 (passed to onMount)
  • Server-side macros + steward capabilities live in docs/canvas-agent.md; review that section if you need to extend the agent’s “hands” (apply presets, retries, screenshot tuning, etc.).

What it sets by default

  • “Next shape” defaults on editor mount: font: 'mono', size: 'm', dash: 'dotted', color: 'red' (mapped to the brutalist deep-orange swatch).
  • Optional: remap built-in color names (e.g., change what “violet” points to), and nudge selection highlight via CSS variables (currently orange by default).

Change the defaults

Edit the useTldrawBranding call in src/components/ui/canvas/canvas-space.tsx and pass your preferences:

// src/components/ui/canvas/canvas-space.tsx
const branding = useTldrawBranding({
  defaultFont: 'serif',     // 'draw' | 'mono' | 'sans' | 'serif'
  defaultSize: 'm',         // 's' | 'm' | 'l' | 'xl'
  defaultDash: 'solid',     // 'solid' | 'dashed' | 'dotted'
  defaultColor: 'violet',   // see TLColor union in the hook
  palette: {
    violet: '#6a5acd',      // optional: remap built-in named colors
    blue: '#2563eb',
  },
  paletteEnabled: true,     // tie this to the @canvas-agent toggle if needed
  selectionCssVars: {
    '--tl-color-selection': '#7b66dc33',
    '--tl-color-selection-stroke': '#7b66dc',
  },
})

Scope & notes

  • Uses TLDraw v4 Editor APIs (editor.setStyleForNextShapes) and v4 theme palette (DefaultColorThemePalette).
  • Palette remaps apply once per page load and affect all canvases on the page (intended).
  • Branding respects the @canvas-agent toggle via NEXT_PUBLIC_CANVAS_AGENT_THEME_ENABLED (falls back to NEXT_PUBLIC_CANVAS_AGENT_CLIENT_ENABLED for legacy behaviour). Set either env to false to revert instantly to TLDraw stock colors and selection styles.
  • For deeper menu/control edits, compose TLDraw components and overrides. We already apply collaboration overrides at src/components/ui/tldraw/utils/collaborationOverrides.ts.
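
The flag fallback described in the notes above can be sketched as a small resolver. This is an illustration only: the helper names are hypothetical, and defaulting to enabled when neither variable is set is an assumption of the sketch, not documented behaviour.

```typescript
// Hypothetical helper illustrating the fallback order described above.
type EnvVars = Record<string, string | undefined>;

// Parses "true"/"false" (case-insensitive); anything else counts as unset.
function parseFlag(value: string | undefined): boolean | undefined {
  const normalized = value?.trim().toLowerCase();
  if (normalized === "true") return true;
  if (normalized === "false") return false;
  return undefined;
}

// The THEME flag wins when set; otherwise fall back to the legacy CLIENT flag.
// Defaulting to `true` when neither is set is an assumption of this sketch.
function isBrandingEnabled(env: EnvVars): boolean {
  return (
    parseFlag(env.NEXT_PUBLIC_CANVAS_AGENT_THEME_ENABLED) ??
    parseFlag(env.NEXT_PUBLIC_CANVAS_AGENT_CLIENT_ENABLED) ??
    true
  );
}
```

Under this reading, `THEME_ENABLED=true` with `CLIENT_ENABLED=false` keeps the branding, which matches the comment on NEXT_PUBLIC_CANVAS_AGENT_THEME_ENABLED in the environment variable list.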

🎙️ Voice + Steward Architecture

The production pipeline now runs as two lightweight Node processes plus the client dispatcher:

  1. Voice Agent (Realtime) – src/lib/agents/realtime/voice-agent.ts

    • Uses the LiveKit Agents Realtime API.
    • Listens to room audio, transcribes, and calls exactly two UI tools: create_component and update_component.
    • Can optionally hand off server-side work by calling dispatch_to_conductor.
    • Realtime STT is the primary path. VOICE_AGENT_TRANSCRIPTION_MODE is deprecated and has no runtime effect.
  2. Conductor + Stewards – src/lib/agents/conductor/ and src/lib/agents/subagents/

    • Conductor is a tiny router (Agents SDK) that delegates to domain stewards via handoffs.
    • Stewards (e.g., Flowchart Steward) read state from Supabase, reason holistically, and emit one structured UI patch or component creation. Flowchart commits trigger /api/steward/commit, which broadcasts the update over LiveKit.
  3. Browser ToolDispatcher – src/components/tool-dispatcher.tsx

    • Executes the two UI tools, updates TLDraw or React components, and returns tool_result/tool_error events.
    • All other logic (diagram merging, lookups, narration) lives in stewards.
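
The dispatcher's contract (two tools in, `tool_result` or `tool_error` out) can be sketched with discriminated unions. The message shapes, the in-memory store, and the function name below are assumptions for illustration; the real types in src/components/tool-dispatcher.tsx are not shown in this README.

```typescript
// Hypothetical message shapes; the real ToolDispatcher's types are not shown in this README.
type ToolCall =
  | { tool: "create_component"; id: string; type: string; props: Record<string, unknown> }
  | { tool: "update_component"; id: string; patch: Record<string, unknown> };

type ToolEvent =
  | { event: "tool_result"; id: string }
  | { event: "tool_error"; id: string; message: string };

interface ComponentState {
  type: string;
  props: Record<string, unknown>;
}

// Applies one tool call to an in-memory component store and reports the outcome.
function dispatch(store: Map<string, ComponentState>, call: ToolCall): ToolEvent {
  switch (call.tool) {
    case "create_component":
      if (store.has(call.id)) {
        return { event: "tool_error", id: call.id, message: "component already exists" };
      }
      store.set(call.id, { type: call.type, props: { ...call.props } });
      return { event: "tool_result", id: call.id };
    case "update_component": {
      const existing = store.get(call.id);
      if (!existing) {
        return { event: "tool_error", id: call.id, message: "unknown component" };
      }
      // Shallow-merge the patch; deeper reasoning stays with the stewards.
      store.set(call.id, { ...existing, props: { ...existing.props, ...call.patch } });
      return { event: "tool_result", id: call.id };
    }
  }
}
```

Keeping the union closed to two tools means the compiler flags any attempt to add browser-side logic for a third tool, which mirrors the split described above: the browser executes, the stewards decide.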

Legacy docs for the original three-agent setup now live in docs/THREE_AGENT_ARCHITECTURE.md under the archived section.

Project Structure

src/
├── lib/
│   ├── agents/                  # Voice agent + conductor + stewards
│   ├── system-registry.ts       # Single source of truth
│   └── shared-state.ts          # State synchronization types
├── components/
│   ├── tool-dispatcher.tsx      # Tool Dispatcher (Agent #3)
│   └── ui/                      # custom components
└── app/                         # Next.js pages

🀝 Contributing

See CONTRIBUTING.md for development guidelines.

📄 License

This project is licensed under the MIT License.

Supabase Session Sync

Set NEXT_PUBLIC_SUPABASE_URL and NEXT_PUBLIC_SUPABASE_ANON_KEY in .env.local. Create a canvas_sessions table to track each meeting session canvas:

create table if not exists public.canvas_sessions (
  id uuid primary key default uuid_generate_v4(),
  canvas_id uuid references public.canvases(id),
  room_name text not null,
  participants jsonb not null default '[]',
  transcript jsonb not null default '[]',
  canvas_state jsonb,
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

create unique index if not exists canvas_sessions_room_canvas_uidx
  on public.canvas_sessions(room_name, canvas_id);

The headless SessionSync component will insert/update this row and stream:

  • LiveKit participants
  • LiveKit transcription bus messages
  • TLDraw canvas snapshot on save
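
As a rough illustration of the insert/update step, the sketch below shapes a row for `public.canvas_sessions`. The column names come from the table definition above; the `SessionSnapshot` input type, the transcript entry shape, and the function name are assumptions, and the actual SessionSync component may build its payload differently.

```typescript
// Hypothetical helper: shapes the row that a sync component could upsert into
// public.canvas_sessions. Column names come from the table above; the input
// type and function name are assumptions.
interface TranscriptEntry {
  speaker: string;
  text: string;
}

interface SessionSnapshot {
  roomName: string;
  canvasId: string | null;
  participants: string[];
  transcript: TranscriptEntry[];
  canvasState?: unknown;
}

interface CanvasSessionRow {
  room_name: string;
  canvas_id: string | null;
  participants: string[];
  transcript: TranscriptEntry[];
  canvas_state: unknown;
}

function toCanvasSessionRow(snapshot: SessionSnapshot): CanvasSessionRow {
  if (snapshot.roomName.trim() === "") {
    throw new Error("room_name is NOT NULL in the schema; refusing an empty name");
  }
  return {
    room_name: snapshot.roomName,
    canvas_id: snapshot.canvasId,
    // De-duplicate so reconnecting participants are not listed twice.
    participants: Array.from(new Set(snapshot.participants)),
    transcript: snapshot.transcript,
    canvas_state: snapshot.canvasState ?? null,
  };
}
```

An upsert would naturally target the `(room_name, canvas_id)` unique index. Note that PostgreSQL treats NULLs as distinct in unique indexes by default, so rows with a null `canvas_id` are not de-duplicated by that index.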

RLS and triggers (recommended)

-- Enable RLS
alter table public.canvas_sessions enable row level security;

-- Example policy: user can read rows where the linked canvas belongs to them
-- Adjust to your auth schema; this assumes canvases.user_id = auth.uid()
-- PostgreSQL has no CREATE POLICY IF NOT EXISTS, so drop first to keep this re-runnable
drop policy if exists canvas_sessions_read_own on public.canvas_sessions;
create policy canvas_sessions_read_own
  on public.canvas_sessions
  for select
  using (
    canvas_id is null
    or exists (
      select 1 from public.canvases c
      where c.id = canvas_id and c.user_id = auth.uid()
    )
  );

-- Example write policy: allow owner to update
drop policy if exists canvas_sessions_update_own on public.canvas_sessions;
create policy canvas_sessions_update_own
  on public.canvas_sessions
  for update
  using (
    canvas_id is null
    or exists (
      select 1 from public.canvases c
      where c.id = canvas_id and c.user_id = auth.uid()
    )
  );

-- Auto-update updated_at
create or replace function public.set_updated_at()
returns trigger as $$
begin
  new.updated_at = now();
  return new;
end;
$$ language plpgsql;

drop trigger if exists set_canvas_sessions_updated_at on public.canvas_sessions;
create trigger set_canvas_sessions_updated_at
before update on public.canvas_sessions
for each row execute function public.set_updated_at();

About

REALTIME MEETING PRODUCER?ASSISTANT AGENT PROJECT
