dasilva333/airi

Project AIRI

Re-creating Neuro-sama, a soul container of AI waifu / virtual characters to bring them into our world.

This repository is a maintained downstream fork of moeru-ai/airi, focused on keeping the desktop experience usable, integrating high-value upstream ideas selectively, and shipping heavily tested improvements on top of the original project.

Important

Fork context: This build still credits and depends on the original moeru-ai/airi project for its foundation, vision, and broad architecture. The goal here is not to erase that lineage, but to provide a working fork that continues to land practical desktop-focused improvements while upstream changes are reviewed more selectively.

Why This Fork Exists

This fork exists to keep AIRI moving as a practical daily-driver build instead of waiting on broad upstream history churn to settle.

In this workspace, the priority is:

  • keep the desktop path stable and testable
  • preserve upstream intent where it is genuinely useful
  • selectively forward-port worthwhile upstream work instead of blindly rebasing
  • ship tangible UX, performance, and workflow improvements for real usage

If you want the original project history and broader upstream context, see moeru-ai/airi. If you want the branch that is being actively tuned for usability, this repository is that branch.

What This Fork Adds

This fork is not just a bugfix branch. It meaningfully expands what AIRI can do on desktop, especially around character setup, stage presentation, speech, proactivity, and everyday usability.

AIRI Cards Are A Real Character System Here

In upstream, AIRI cards are much thinner. In this fork, the AIRI card flow is treated like an actual character-management system. Cards can be imported, edited, previewed, and exported in ways that are useful for real users instead of just existing as a placeholder.

Each card now has working import and per-card export paths for both AIRI-native JSON and SillyTavern-compatible chara_card_v2 PNG. Hovering a card surfaces its picture, framed PNG export is supported, and the card itself can carry more of the character's real identity across machines. Setting a card is no longer just a name/description swap either: it can drive the active model, the character's preferred background, and the wider stage presentation around that character. Technical format details live in docs/AIRI_Card_Import_Export.md.

This fork also extends card portability well beyond the upstream baseline. AIRI-specific exports preserve things like thumbnails, preferred backgrounds, acting metadata, and other custom extensions. The PNG compatibility path is designed so you can participate in the broader card ecosystem without giving up AIRI-specific data.
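To make the portability idea concrete, here is a minimal sketch of how AIRI-specific data could ride along inside a `chara_card_v2` payload. It assumes the AIRI fields are namespaced under an `extensions.airi` key; that key and the `AiriExtensions` field names are illustrative assumptions, not the schema documented in docs/AIRI_Card_Import_Export.md, and only a subset of the card fields is shown.

```typescript
// Hypothetical AIRI-only data; these field names are illustrative.
interface AiriExtensions {
  thumbnail?: string
  preferredBackgroundId?: string
  acting?: { modelExpressionPrompt?: string }
}

// Subset of the chara_card_v2 layout (a real card carries more data fields).
interface CharaCardV2 {
  spec: 'chara_card_v2'
  spec_version: '2.0'
  data: {
    name: string
    description: string
    extensions: Record<string, unknown>
  }
}

// Assumption: AIRI data lives under extensions.airi so other tools can ignore it.
function toCharaCardV2(name: string, description: string, airi: AiriExtensions): CharaCardV2 {
  return {
    spec: 'chara_card_v2',
    spec_version: '2.0',
    data: { name, description, extensions: { airi } },
  }
}

// Cards from other tools have no AIRI block, so fall back to an empty object.
function readAiriExtensions(card: CharaCardV2): AiriExtensions {
  const raw = card.data.extensions.airi
  return typeof raw === 'object' && raw !== null ? (raw as AiriExtensions) : {}
}
```

Keeping the custom data in its own namespace is one way a card could stay readable by other tools while still restoring AIRI-specific settings on import.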

New AIRI Card Tabs That Actually Matter

The card editor has been expanded into a multi-tab configuration surface instead of a thin metadata form.

Acting is where you teach AIRI how to perform. It has three prompt layers: one for model expressions and ACT tokens, one for speech-expression tags, and one for speech mannerisms. The point is not just "more fields"; it is to let the same personality speak differently depending on the selected VRM/Live2D model and the active speech provider. The prompts are there so you can align the model's face, the TTS delivery, and the character's writing style instead of leaving all of that implicit.

Modules lets a card carry its own model and stage choices. That includes the active chat model, speech provider/model/voice, the selected VRM or Live2D avatar, and the character's preferred background. In practice this means switching characters can switch the whole presentation, not just the text persona.

Artistry is brand new and turns image generation into a first-class character capability. Each card can carry an image provider, optional model override, default prompt prefix, widget instruction, and provider-specific JSON options. Right now that is wired around providers like Replicate, with the system shaped so other image APIs can plug in cleanly. ComfyUI support is part of the intended path forward here rather than an afterthought.

Proactivity is one of the most important additions in this fork. Instead of treating AIRI as a thing that only answers when spoken to, this tab lets you define when and how the character should decide to speak on her own. The system can inject real context like window history, system load, volume state, and usage metrics into heartbeat evaluations, and the prompt can be tuned so AIRI knows when to remain silent versus when to comment. The goal is not random interruptions; it is to give the character a believable sense of timing, awareness, and restraint.
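As a rough illustration of the heartbeat idea, the sketch below builds a prompt from injected context and interprets the reply as either speech or silence. The `HeartbeatContext` fields, the `[SILENT]` sentinel, and both function names are assumptions made for this example; the fork's actual prompt format may differ.

```typescript
// Hypothetical context snapshot; field names are illustrative, not the fork's schema.
interface HeartbeatContext {
  recentWindowTitles: string[]
  systemLoadPercent: number
  volumeMuted: boolean
  minutesSinceLastMessage: number
}

// Assumed sentinel the model is asked to emit when it chooses to stay quiet.
const SILENT_TOKEN = '[SILENT]'

function buildHeartbeatPrompt(ctx: HeartbeatContext): string {
  return [
    'Heartbeat check. Decide whether to say something right now.',
    `Recent windows: ${ctx.recentWindowTitles.join(', ') || 'none'}`,
    `System load: ${ctx.systemLoadPercent}%`,
    `Volume: ${ctx.volumeMuted ? 'muted' : 'audible'}`,
    `Minutes since last message: ${ctx.minutesSinceLastMessage}`,
    `If nothing is worth saying, reply with exactly ${SILENT_TOKEN}.`,
  ].join('\n')
}

interface HeartbeatDecision {
  speak: boolean
  text: string
}

// An empty reply or the sentinel means "remain silent"; anything else is spoken.
function parseHeartbeatReply(reply: string): HeartbeatDecision {
  const text = reply.trim()
  if (text === '' || text.startsWith(SILENT_TOKEN)) {
    return { speak: false, text: '' }
  }
  return { speak: true, text }
}
```

Treating silence as an explicit, parseable outcome is one way to make restraint the default rather than an afterthought.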

Memory Is No Longer Just A Placeholder

This fork now has the beginning of a real character-centric memory system instead of a generic future promise. Memory is scoped per character, so one card's continuity is not silently mixed into another's.

Short-term memory now supports rebuilding daily continuity blocks from existing chat history. Those blocks are stored durably and injected back into new or reset sessions so a character can recover recent continuity without replaying the full raw logs. The current implementation is intentionally simple: rebuild from chat history first, store one summary block per day, and keep the prompt-side injection bounded and understandable.
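The rebuild-then-inject flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the type names, the `summarize` callback (a stand-in for whatever produces the summary), and the `maxBlocks` / `maxChars` limits are all hypothetical and not taken from the fork's code.

```typescript
interface ChatMessage {
  role: 'user' | 'assistant'
  content: string
  createdAt: string // ISO 8601 timestamp, e.g. "2024-05-01T09:30:00Z"
}

interface DailyBlock {
  characterId: string
  day: string // YYYY-MM-DD
  summary: string
}

// Group one character's history by calendar day (as written in the timestamp)
// and produce exactly one summary block per day, oldest first.
function rebuildDailyBlocks(
  characterId: string,
  messages: ChatMessage[],
  summarize: (messages: ChatMessage[]) => string,
): DailyBlock[] {
  const byDay = new Map<string, ChatMessage[]>()
  for (const message of messages) {
    const day = message.createdAt.slice(0, 10)
    const bucket = byDay.get(day) ?? []
    bucket.push(message)
    byDay.set(day, bucket)
  }
  return Array.from(byDay.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([day, dayMessages]) => ({ characterId, day, summary: summarize(dayMessages) }))
}

// Keep the prompt-side injection bounded: only the most recent blocks,
// and never more than maxChars characters (oldest text is cut first).
function buildContinuityInjection(blocks: DailyBlock[], maxBlocks = 3, maxChars = 1200): string {
  const text = blocks
    .slice(-maxBlocks)
    .map(block => `[${block.day}] ${block.summary}`)
    .join('\n')
  return text.length > maxChars ? text.slice(-maxChars) : text
}
```

Bounding the injection by both block count and character count keeps the prompt cost predictable regardless of how long the chat history grows.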

Long-term memory now has a working append-only journal path through the new text_journal tool. Characters can create journal entries and search them later by keyword, and the Memory settings area now has a real long-term archive view instead of a WIP shell. This is designed as a lightweight memory layer that is useful right now, without requiring a separate memory server or a heavyweight external stack just to get durable recall. A sibling image_journal feature is also in development to provide similar durable storage for AI-generated art.

Unified memory lookup already works across both layers: a search checks the active character's long-term journal first and falls back into short-term memory blocks when the journal has no relevant hit, so retrieval feels like one memory system even though storage remains split.
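The journal and the fallback lookup can be pictured with a small in-memory sketch. The class and function names here are hypothetical, matching is plain case-insensitive substring search, and storage is an array; the real `text_journal` tool persists entries durably and may match differently.

```typescript
interface JournalEntry {
  characterId: string
  createdAt: string
  text: string
}

interface ShortTermBlock {
  characterId: string
  day: string
  summary: string
}

// Append-only: entries can be created and searched, never edited or deleted.
class TextJournal {
  private readonly entries: JournalEntry[] = []

  create(characterId: string, text: string, createdAt: string = new Date().toISOString()): JournalEntry {
    const entry: JournalEntry = { characterId, createdAt, text }
    this.entries.push(entry)
    return entry
  }

  search(characterId: string, keyword: string): JournalEntry[] {
    const needle = keyword.toLowerCase()
    return this.entries.filter(
      entry => entry.characterId === characterId && entry.text.toLowerCase().includes(needle),
    )
  }
}

interface MemoryLookupResult {
  source: 'long-term' | 'short-term' | 'none'
  hits: string[]
}

// Check the active character's journal first; fall back to short-term blocks
// only when the journal has no hit.
function lookupMemory(
  journal: TextJournal,
  shortTerm: ShortTermBlock[],
  characterId: string,
  keyword: string,
): MemoryLookupResult {
  const journalHits = journal.search(characterId, keyword)
  if (journalHits.length > 0) {
    return { source: 'long-term', hits: journalHits.map(entry => entry.text) }
  }
  const needle = keyword.toLowerCase()
  const blockHits = shortTerm.filter(
    block => block.characterId === characterId && block.summary.toLowerCase().includes(needle),
  )
  if (blockHits.length > 0) {
    return { source: 'short-term', hits: blockHits.map(block => block.summary) }
  }
  return { source: 'none', hits: [] }
}
```

Filtering every query by `characterId` is what keeps one card's continuity from leaking into another's, even though both layers share the same lookup entry point.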

The broader plan for semantic retrieval, short-term automation, and future image journaling is documented in docs/memory-architecture.md, docs/short-term-memory.md, docs/long-term-memory.md, and docs/image-journal-proposal.md.

Scene System Is New In This Fork

The Scene Manager is a real feature added here, not a minor tweak. Upstream did not have this broader character-aware background workflow in the same way.

You can upload scene backgrounds, manage them in a gallery, preserve readable names, choose a global default, and override that per character through AIRI cards. Cards can also carry preferred background metadata across export/import so the scene can be restored later instead of being lost as soon as you move machines. Editing the active card can live-apply the selected background so you are not making blind choices.
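The override order described above (per-character preference over the global default) could be resolved along these lines. The function name and the rule that a preferred background missing from the local gallery falls back to the global default are assumptions for illustration only.

```typescript
// Resolve which background to show for the active character.
// Assumption: a card's preferred background only wins if it exists in the
// local gallery (for example, it may be missing right after moving machines).
function resolveBackground(
  galleryIds: ReadonlySet<string>,
  globalDefaultId: string | null,
  cardPreferredId?: string,
): string | null {
  if (cardPreferredId !== undefined && galleryIds.has(cardPreferredId)) {
    return cardPreferredId
  }
  if (globalDefaultId !== null && galleryIds.has(globalDefaultId)) {
    return globalDefaultId
  }
  return null // no usable background; the stage would use its built-in look
}
```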

This makes stage composition much more intentional. Characters can now have their own look and environment instead of sharing one undifferentiated global presentation.

Tamagotchi / Desktop Stage Improvements

The desktop Tamagotchi experience is one of the areas this fork pushes the hardest. The Control Island is no longer just a minimal tray of generic actions. It has grown into a character-facing stage control surface with new dedicated icons for quick emotions, favorites, and idle-loop cycling, along with stronger refresh behavior, feedback toasts, and better resize handling for the floating desktop window.

That matters because these are the interactions people actually touch all day. The goal of this fork is not only to expose settings somewhere deep in a menu, but to make the live desktop character feel responsive and playful while she is on screen.

Models, Motion, And Character Presentation Are Much Deeper

This fork pushes both VRM and Live2D much further than the stock setup.

For Live2D, there are dedicated customization surfaces and expression-oriented tools so a model can be tuned instead of merely loaded. For VRM, there are expression controls, better recovery/reset behavior, and a growing motion ecosystem. Idle behavior is no longer just one hardcoded file forever: there is support for customizable idle loops, random idle cycling, and a broader direction toward VRMA-driven character motion. On the desktop side, those motion systems are also surfaced through the Control Island so they are not trapped in settings-only workflows.
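Random idle cycling can be sketched in a few lines. This example assumes idle loops are identified by name and that repeating the current loop back-to-back is undesirable; both the function name and that no-repeat rule are assumptions, not the fork's confirmed behavior.

```typescript
// Pick the next idle loop at random, avoiding an immediate repeat when
// more than one loop is available. The random source is injectable so the
// choice can be made deterministic in tests.
function pickNextIdle(
  loops: string[],
  current: string | null,
  random: () => number = Math.random,
): string | null {
  if (loops.length === 0) {
    return null
  }
  const candidates = loops.length > 1 ? loops.filter(loop => loop !== current) : loops
  const index = Math.min(candidates.length - 1, Math.floor(random() * candidates.length))
  return candidates[index] ?? null
}
```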

The model selector itself is also significantly improved. This fork adds a denser multi-column library layout, better filtering and sorting, and an Explore tab for discovering models and related assets instead of forcing everything through one cramped view. VRMA add-file support is also part of the roadmap.

Speech And Provider Plumbing Are More Practical

One of the biggest quality-of-life fixes in this fork is speech quality. A major problem in the original stack was the audio degradation introduced by a library choice in the speech path; this fork replaced that weak point so TTS playback quality is materially better.

On top of that, OpenAI-compatible speech providers that expose a voices endpoint can now surface selectable voices in the UI instead of making users guess IDs manually. There is also broader provider work throughout the fork, including Chatterbox integration, local-server quality-of-life improvements, and other compatibility hardening so the app is easier to run with real-world backends.

Streaming support for DeepSeek and GLM-4 models is also hardened in this fork, including proper handling of reasoning-delta events and tolerance for malformed ACT tag typos that previously caused prompt stalls.

Widget System Lets AIRI Compose Desktop Views

The stage_widgets tool gives AIRI the ability to spawn, update, and remove floating desktop widgets during conversation. Pre-built widget components exist for weather and map views, and a generic JSON fallback renders any unknown component name as a styled info-card, so the model can "compose" a view for stocks, notes, or anything else without requiring a bespoke UI component.

Widgets are managed through a Tool → IPC → Main Process → Renderer pipeline, with each widget identified by a human-readable id so AIRI can update or remove it later. The system is documented in docs/widget-system-report.md.
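The spawn / update / remove lifecycle and the generic fallback can be modeled with a small state reducer. This sketch covers only the bookkeeping at the end of the pipeline, not the IPC hops; the action shape, the `generic-json` renderer name, and `applyWidgetAction` are assumptions for illustration rather than the tool's real schema.

```typescript
type WidgetAction =
  | { action: 'spawn', id: string, component: string, props: Record<string, unknown> }
  | { action: 'update', id: string, props: Record<string, unknown> }
  | { action: 'remove', id: string }

interface WidgetState {
  id: string
  component: string
  renderer: string // which UI component actually draws the widget
  props: Record<string, unknown>
}

// Components with a bespoke UI; anything else uses the generic info-card.
const KNOWN_COMPONENTS = new Set(['weather', 'map'])

function applyWidgetAction(
  widgets: ReadonlyMap<string, WidgetState>,
  input: WidgetAction,
): Map<string, WidgetState> {
  const next = new Map(widgets)
  if (input.action === 'spawn') {
    const renderer = KNOWN_COMPONENTS.has(input.component) ? input.component : 'generic-json'
    next.set(input.id, { id: input.id, component: input.component, renderer, props: input.props })
  }
  else if (input.action === 'update') {
    const existing = next.get(input.id)
    if (existing) {
      // Merge new props over old ones; unknown ids are ignored.
      next.set(input.id, { ...existing, props: { ...existing.props, ...input.props } })
    }
  }
  else {
    next.delete(input.id)
  }
  return next
}
```

Because every action carries the human-readable id, a later turn in the conversation can update or remove a widget without any other handle to it.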

Per-Character LLM Generation Settings

A planned Generation tab in the AIRI card editor will let each character carry its own chat-generation tuning — provider, model, max tokens, temperature, and top-p — instead of relying solely on global defaults. The schema is designed to grow toward SillyTavern preset import and advanced provider-specific JSON later. Design details are in docs/Character Configurable LLM.md.
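Since the Generation tab is still planned, the following is only a sketch of how per-card values might layer over global defaults. The `GenerationSettings` field names and `resolveGenerationSettings` are hypothetical; the actual schema is described in docs/Character Configurable LLM.md.

```typescript
interface GenerationSettings {
  provider: string
  model: string
  maxTokens: number
  temperature: number
  topP: number
}

// Any field the card leaves unset falls back to the global default.
function resolveGenerationSettings(
  globalDefaults: GenerationSettings,
  card: Partial<GenerationSettings> = {},
): GenerationSettings {
  return {
    provider: card.provider ?? globalDefaults.provider,
    model: card.model ?? globalDefaults.model,
    maxTokens: card.maxTokens ?? globalDefaults.maxTokens,
    temperature: card.temperature ?? globalDefaults.temperature,
    topP: card.topP ?? globalDefaults.topP,
  }
}
```

Using `??` rather than `||` matters here: a card that deliberately sets `temperature` to `0` keeps that value instead of silently inheriting the default.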

All Interaction Pipelines Share One Toolchain

In this fork, typed chat, STT-triggered chat, and proactivity heartbeats all consume the same shared builtinTools surface. That means new builtin tools like text_journal or stage_widgets are automatically available across every interaction pipeline without per-surface wiring. The pipeline architecture and common failure-mode documentation lives in docs/Chat-STT-Proactive-Pipelines-Design.md.
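The "register once, available everywhere" property can be illustrated with a single shared registry. Only the name `builtinTools` comes from the text above; the `BuiltinTool` shape, `registerBuiltinTool`, and `toolsForPipeline` are hypothetical helpers for this sketch.

```typescript
interface BuiltinTool {
  name: string
  description: string
  execute: (args: Record<string, unknown>) => Promise<string>
}

type Pipeline = 'chat' | 'stt' | 'proactivity'

// One shared list: there is no per-pipeline copy to keep in sync.
const builtinTools: BuiltinTool[] = []

function registerBuiltinTool(tool: BuiltinTool): void {
  if (builtinTools.some(existing => existing.name === tool.name)) {
    throw new Error(`Builtin tool already registered: ${tool.name}`)
  }
  builtinTools.push(tool)
}

// Every pipeline receives the same surface, so a newly registered tool
// shows up in all of them without extra wiring.
function toolsForPipeline(_pipeline: Pipeline): readonly BuiltinTool[] {
  return builtinTools
}
```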

Privacy And Daily-Driver Defaults

This fork also carries smaller but important product decisions that make it easier to recommend as an everyday build. One example is the option in Settings -> General to disable cloud sync, with this fork favoring a privacy-respecting local-default posture rather than assuming remote sync should be on.

Overall, the fork is trying to turn AIRI from an interesting base project into something more complete, more character-driven, and more usable as an actual personal desktop companion.

If you want the running journal for what is actively being built, refined, and thought through in this fork, start with docs/AIRI_PROGRESS.md. For the selective upstream strategy used here, see docs/PR/Selective Upstream Sync.md.

[Join Discord Server] [Try it] [简体中文] [日本語] [Русский] [Tiếng Việt] [Français] [한국어]

Heavily inspired by Neuro-sama

Warning

Attention: We do not have any officially minted cryptocurrency or token associated with this project. Please check the information and proceed with caution.

Note

We've got a whole dedicated organization @proj-airi for all the sub-projects born from Project AIRI. Check it out!

RAG, memory system, embedded database, icons, Live2D utilities, and more!

Tip

We have a translation project on Crowdin. If you find any inaccurate translations, feel free to contribute improvements there. Crowdin | Agile localization for tech companies

Have you dreamed about having a cyber living being (cyber waifu, digital pet) or digital companion that could play with and talk to you?

With the power of modern large language models like ChatGPT and the well-known Claude, asking a virtual being to roleplay and chat with us is already easy enough for everyone. Platforms like Character.ai (a.k.a. c.ai) and JanitorAI, as well as local playgrounds like SillyTavern, are already good-enough solutions for a chat-based or visual-adventure-game-like experience.

But what about the ability to play games, or to see what you are coding? What about chatting while playing games, watching videos, and doing many other things?

Perhaps you know Neuro-sama already. She is currently the best virtual streamer capable of playing games, chatting, and interacting with you and the participants. Some also call this kind of being "digital human." Sadly, as it's not open sourced, you cannot interact with her after her live streams go offline.

Therefore, this project, AIRI, offers another possibility here: let you own your digital life, cyber living, easily, anywhere, anytime.

DevLogs We Posted & Recent Updates

What's So Special About This Project?

Unlike other AI-driven VTuber open-source projects, アイリ has been built on Web technologies such as WebGPU, WebAudio, Web Workers, WebAssembly, and WebSocket from day one.

Tip

Worried about a performance drop because we are using Web technologies?

Don't worry. The Web browser version is meant to show how much we can push and do inside browsers and webviews, but we will never rely on it entirely. The desktop version of AIRI can use native NVIDIA CUDA and Apple Metal by default (thanks to HuggingFace and the beloved candle project) without any complex dependency management. As a trade-off, it is only partially powered by Web technologies: for graphics, layouts, animations, and the WIP plugin system that lets everyone integrate things.

This means that アイリ can run on modern browsers and devices, including mobile devices (already done with PWA support). This opens up many possibilities for us (the developers) to build and extend the power of アイリ VTuber to the next level, while still leaving users the flexibility to enable features that require TCP connections or other non-Web technologies, such as connecting to a Discord voice channel or playing Minecraft and Factorio with friends.

Note

We are still in the early stage of development where we are seeking out talented developers to join us and help us to make アイリ a reality.

It's OK if you are not familiar with Vue.js, TypeScript, and the devtools required for this project: you can join us as an artist or designer, or even help us launch our first live stream.

Even if you are a big fan of React, Svelte or even Solid, we welcome you. You can open a sub-directory to add features that you want to see in アイリ, or would like to experiment with.

Fields (and related projects) that we are looking for:

  • Live2D modeller
  • VRM modeller
  • VRChat avatar designer
  • Computer Vision
  • Reinforcement Learning
  • Speech Recognition
  • Speech Synthesis
  • ONNX Runtime
  • Transformers.js
  • vLLM
  • WebGPU
  • Three.js
  • WebXR (check out the other project we have under the @moeru-ai organization)

If you are interested, why not introduce yourself here? Would you like to join us in building AIRI?

Current Progress

Capable of

  • Brain
    • Play Minecraft
    • Play Factorio (WIP, but PoC and demo available)
    • Chat in Telegram
    • Chat in Discord
    • Memory
      • Pure in-browser database support (DuckDB WASM | pglite)
      • Short-term memory rebuild from per-character chat history
      • Short-term continuity injection into new/reset sessions
      • Long-term text_journal create/search tools
      • Long-term per-character journal archive UI
      • Unified memory lookup fallback (long-term → short-term)
      • Memory Alaya (WIP)
    • Pure in-browser local (WebGPU) inference
  • Ears
    • Audio input from browser
    • Audio input from Discord
    • Client side speech recognition
    • Client side talking detection
  • Mouth
  • Body
    • VRM support
      • Control VRM model
    • VRM model animations
      • Auto blink
      • Auto look at
      • Idle eye movement
    • Live2D support
      • Control Live2D model
    • Live2D model animations
      • Auto blink
      • Auto look at
      • Idle eye movement
  • Desktop widgets
    • stage_widgets tool (spawn / update / remove)
    • Pre-built weather and map widgets
    • Generic JSON fallback for arbitrary data
    • Artistry / image generation via widget pipeline
  • Shared builtin toolchain across chat, STT, and proactivity pipelines (Proactivity: DISABLED for tuning)
  • Refactor proactivity sensors (planned): move from PowerShell to native integration (Injeca/Eventa); the PowerShell-based sensors currently cause main-thread lag (INP > 1 s).

Development

For detailed instructions to develop this project, follow CONTRIBUTING.md

Note

By default, pnpm dev will start the development server for the Stage Web (browser version). If you would like to try developing the desktop version, please make sure you read CONTRIBUTING.md to set up the environment correctly.

pnpm i
pnpm dev

Stage Web (Browser Version at airi.moeru.ai)

pnpm dev

Stage Tamagotchi (Desktop Version)

pnpm dev:tamagotchi

A Nix package for Tamagotchi is included. To run airi with Nix, first make sure to enable flakes, then run:

nix run github:moeru-ai/airi

NixOS

Electron requires shared libraries that aren't in standard paths on NixOS. Use the FHS shell defined in flake.nix:

nix develop .#fhs
pnpm dev:tamagotchi

Stage Pocket (Mobile Version)

Start the Capacitor development server:

pnpm dev:pocket:ios <DEVICE_ID_OR_SIMULATOR_NAME>
# Or
CAPACITOR_DEVICE_ID=<DEVICE_ID_OR_SIMULATOR_NAME> pnpm dev:pocket:ios

You can see the list of available devices and simulators by running pnpm exec cap run ios --list.

If you need to connect to the server channel from Pocket in wireless mode, start Tamagotchi as root:

sudo pnpm dev:tamagotchi

Then enable secure WebSocket in Tamagotchi under settings/system/general.

Documentation Site

pnpm dev:docs

Publish

Please update the version in Cargo.toml after running bumpp:

npx bumpp --no-commit --no-tag

Support of LLM API Providers (powered by xsai)

Sub-projects Born from This Project

%%{ init: { 'flowchart': { 'curve': 'catmullRom' } } }%%

flowchart TD
  Core("Core")
  Unspeech("unspeech")
  DBDriver("@proj-airi/drizzle-duckdb-wasm")
  MemoryDriver("[WIP] Memory Alaya")
  DB1("@proj-airi/duckdb-wasm")
  SVRT("@proj-airi/server-runtime")
  Memory("Memory")
  STT("STT")
  Stage("Stage")
  StageUI("@proj-airi/stage-ui")
  UI("@proj-airi/ui")

  subgraph AIRI
    DB1 --> DBDriver --> MemoryDriver --> Memory --> Core
    UI --> StageUI --> Stage --> Core
    Core --> STT
    Core --> SVRT
  end

  subgraph UI_Components
    UI --> StageUI
    UITransitions("@proj-airi/ui-transitions") --> StageUI
    UILoadingScreens("@proj-airi/ui-loading-screens") --> StageUI
    FontCJK("@proj-airi/font-cjkfonts-allseto") --> StageUI
    FontXiaolai("@proj-airi/font-xiaolai") --> StageUI
  end

  subgraph Apps
    Stage --> StageWeb("@proj-airi/stage-web")
    Stage --> StageTamagotchi("@proj-airi/stage-tamagotchi")
    Core --> RealtimeAudio("@proj-airi/realtime-audio")
    Core --> PromptEngineering("@proj-airi/playground-prompt-engineering")
  end

  subgraph Server_Components
    Core --> ServerSDK("@proj-airi/server-sdk")
    ServerShared("@proj-airi/server-shared") --> SVRT
    ServerShared --> ServerSDK
  end

  STT -->|Speaking| Unspeech
  SVRT -->|Playing Factorio| F_AGENT
  SVRT -->|Playing Minecraft| MC_AGENT

  subgraph Factorio_Agent
    F_AGENT("Factorio Agent")
    F_API("Factorio RCON API")
    factorio-server("factorio-server")
    F_MOD1("autorio")

    F_AGENT --> F_API -.-> factorio-server
    F_MOD1 -.-> factorio-server
  end

  subgraph Minecraft_Agent
    MC_AGENT("Minecraft Agent")
    Mineflayer("Mineflayer")
    minecraft-server("minecraft-server")

    MC_AGENT --> Mineflayer -.-> minecraft-server
  end

  XSAI("xsAI") --> Core
  XSAI --> F_AGENT
  XSAI --> MC_AGENT

  Core --> TauriMCP("@proj-airi/tauri-plugin-mcp")
  Memory_PGVector("@proj-airi/memory-pgvector") --> Memory

  style Core fill:#f9d4d4,stroke:#333,stroke-width:1px
  style AIRI fill:#fcf7f7,stroke:#333,stroke-width:1px
  style UI fill:#d4f9d4,stroke:#333,stroke-width:1px
  style Stage fill:#d4f9d4,stroke:#333,stroke-width:1px
  style UI_Components fill:#d4f9d4,stroke:#333,stroke-width:1px
  style Server_Components fill:#d4e6f9,stroke:#333,stroke-width:1px
  style Apps fill:#d4d4f9,stroke:#333,stroke-width:1px
  style Factorio_Agent fill:#f9d4f2,stroke:#333,stroke-width:1px
  style Minecraft_Agent fill:#f9d4f2,stroke:#333,stroke-width:1px

  style DBDriver fill:#f9f9d4,stroke:#333,stroke-width:1px
  style MemoryDriver fill:#f9f9d4,stroke:#333,stroke-width:1px
  style DB1 fill:#f9f9d4,stroke:#333,stroke-width:1px
  style Memory fill:#f9f9d4,stroke:#333,stroke-width:1px
  style Memory_PGVector fill:#f9f9d4,stroke:#333,stroke-width:1px

Similar Projects

Open sourced ones

Non-open-sourced ones

Project Status

Repobeats analytics image

Acknowledgements

Star History

Star History Chart

About

💖🧸 Self hosted, you-owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minecraft, Factorio playing. Web / macOS / Windows supported.

Languages

  • TypeScript 57.5%
  • Vue 38.6%
  • JavaScript 1.6%
  • Rust 1.0%
  • CSS 0.6%
  • HTML 0.4%
  • Other 0.3%