A ROS 2 package for speech-driven human-robot interaction, combining:
- mic_node: captures live microphone audio
- asr_node: performs real-time transcription with Whisper (via faster-whisper)
- llm_node: processes transcribed text with an LLM and generates robotic behavior trees
This pipeline enables a home robot to understand spoken input and respond with context-aware actions.
```
┌─────────────┐   /audio_data   ┌─────────────┐   /transcription    ┌─────────────┐
│  mic_node   │ ──────────────→ │  asr_node   │ ──────────────────→ │  llm_node   │
│             │                 │             │     /llm_input      │             │
│ Microphone  │                 │  Whisper    │                     │  LLM + BT   │
│   Audio     │                 │ Streaming   │                     │ Generation  │
│  Capture    │                 │    ASR      │                     │             │
└─────────────┘                 └─────────────┘                     └─────────────┘
```
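As a quick orientation to the pipeline's interface, here is a minimal sketch of a downstream consumer of the final stage. It assumes `/transcription` carries `std_msgs/String` messages; that message type is an assumption, and the package may define its own.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String


class TranscriptListener(Node):
    def __init__(self):
        super().__init__("transcript_listener")
        # Topic name from the diagram; String is an assumed message type.
        self.create_subscription(String, "/transcription", self.on_text, 10)

    def on_text(self, msg: String) -> None:
        self.get_logger().info(f"heard: {msg.data}")


def main():
    rclpy.init()
    rclpy.spin(TranscriptListener())


if __name__ == "__main__":
    main()
```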
Launch all nodes together with default parameters:
```bash
ros2 launch home_nlp launch.py
```

Customize parameters:

```bash
ros2 launch home_nlp launch.py \
  sample_rate:=48000 \
  device:="USB Composite Device" \
  language:="en" \
  asr_model:="large-v2" \
  llm_model:="google/gemma-3-4b-it"
```

View all available parameters:

```bash
ros2 launch home_nlp launch.py --show-args
```
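For reference, a launch file along these lines could wire the three nodes to the arguments above. This is an illustrative sketch, not the package's actual `launch.py`; the argument names simply mirror the CLI overrides shown earlier.

```python
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue


def generate_launch_description():
    return LaunchDescription([
        # Arguments mirror the CLI overrides shown above.
        DeclareLaunchArgument("sample_rate", default_value="48000"),
        DeclareLaunchArgument("language", default_value="en"),
        DeclareLaunchArgument("asr_model", default_value="large-v2"),
        DeclareLaunchArgument("llm_model", default_value="google/gemma-3-4b-it"),
        Node(
            package="home_nlp",
            executable="mic_node",
            # ParameterValue coerces the launch-argument string to an int.
            parameters=[{"sample_rate": ParameterValue(
                LaunchConfiguration("sample_rate"), value_type=int)}],
        ),
        Node(
            package="home_nlp",
            executable="asr_node",
            parameters=[{
                "language": LaunchConfiguration("language"),
                "model": LaunchConfiguration("asr_model"),
            }],
        ),
        Node(
            package="home_nlp",
            executable="llm_node",
            parameters=[{"model": LaunchConfiguration("llm_model")}],
        ),
    ])
```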
Launch the mic_node:

```bash
ros2 run home_nlp mic_node --ros-args \
  -p sample_rate:=48000 \
  -p block_duration:=1.0 \
  -p num_channel:=1 \
  -p device:="USB Composite Device"
```
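To make the parameters concrete, here is a minimal sketch of how a microphone publisher like mic_node can be built with rclpy and sounddevice. It is not the package's implementation, and `std_msgs/Float32MultiArray` on `/audio_data` is an assumed message type.

```python
import numpy as np
import rclpy
import sounddevice as sd
from rclpy.node import Node
from std_msgs.msg import Float32MultiArray


class MicNode(Node):
    def __init__(self):
        super().__init__("mic_node")
        # Same parameters as the command above.
        self.declare_parameter("sample_rate", 48000)
        self.declare_parameter("block_duration", 1.0)
        self.declare_parameter("num_channel", 1)
        self.declare_parameter("device", "USB Composite Device")
        rate = self.get_parameter("sample_rate").value
        block = int(rate * self.get_parameter("block_duration").value)
        self.pub = self.create_publisher(Float32MultiArray, "/audio_data", 10)
        # sounddevice invokes the callback once per block of samples.
        self.stream = sd.InputStream(
            samplerate=rate,
            blocksize=block,
            channels=self.get_parameter("num_channel").value,
            device=self.get_parameter("device").value,
            callback=self.on_audio,
        )
        self.stream.start()

    def on_audio(self, indata, frames, time, status):
        # Publish the first channel as a flat float32 array.
        msg = Float32MultiArray()
        msg.data = indata[:, 0].astype(np.float32).tolist()
        self.pub.publish(msg)


def main():
    rclpy.init()
    rclpy.spin(MicNode())


if __name__ == "__main__":
    main()
```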
Launch the asr_node:

```bash
ros2 run home_nlp asr_node --ros-args \
  -p language:="zh" \
  -p model:="large-v2" \
  -p sample_rate:=48000 \
  -p block_duration:=1.0 \
  -p period:=1.0 \
  -p max_empty_count:=0
```
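The heart of asr_node is faster-whisper. Below is a minimal sketch of just the transcription step, omitting the streaming buffer policy (which the package borrows from whisper_streaming, credited at the end of this README).

```python
import numpy as np
from faster_whisper import WhisperModel

# "large-v2" matches the model parameter above; use device="cpu" without a GPU.
model = WhisperModel("large-v2", device="cuda", compute_type="float16")


def transcribe_chunk(audio: np.ndarray, language: str = "zh") -> str:
    # faster-whisper expects a mono float32 waveform at 16 kHz, so audio
    # captured at 48 kHz (as configured above) must be resampled first.
    segments, _info = model.transcribe(audio, language=language)
    return "".join(segment.text for segment in segments)
```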
Launch the llm_node:

```bash
ros2 run home_nlp llm_node --ros-args \
  -p period:=1.0 \
  -p model:="google/gemma-3-1b-it"
```
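A minimal sketch of the generation step in llm_node, using the Hugging Face transformers pipeline. The prompt and output handling here are illustrative assumptions, not the package's actual code; Gemma models are gated, which is why the Docker section below passes an HF_TOKEN.

```python
from transformers import pipeline

# Model name matches the parameter above; gated models need an HF token.
generator = pipeline("text-generation", model="google/gemma-3-1b-it")


def generate_behavior_tree(transcript: str) -> str:
    # Hypothetical prompt: the package's real prompt template is not shown here.
    messages = [{
        "role": "user",
        "content": f"Write a behavior tree as XML for this request: {transcript}",
    }]
    out = generator(messages, max_new_tokens=512)
    # The pipeline appends the assistant's reply to the message list.
    return out[0]["generated_text"][-1]["content"]
```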
Build the image:

```bash
docker build -t lnfu/home_nlp .
```

Run individual nodes:

```bash
docker run --rm lnfu/home_nlp ros2 run home_nlp mic_node
docker run --rm lnfu/home_nlp ros2 run home_nlp asr_node
docker run --rm -e HF_TOKEN="${HF_TOKEN}" lnfu/home_nlp ros2 run home_nlp llm_node
```

Valid XML indicates the percentage of runs (out of 100) that produced syntactically valid XML.
Note: this does not verify semantic correctness of the behavior tree; see the validity-check sketch after the table.
| Model | Loading Time (s) | Response Time (s) | VRAM Usage (MB) | RAM Usage (MB) | Valid XML (%) |
|---|---|---|---|---|---|
| gemma-3-1b-it | 4.19 | 3.22 | 2482 | 2363 | 66 |
| gemma-3-4b-it | 7.25 | 4.34 | 9480 | 5455 | 99 |
| deepseek-6.7b-it | 15.64 | 3.03 | 14096 | 10094 | 76 |
| Phi-4-mini-it | 8.26 | 3.34 | 8735 | 5251 | 60 |
| Mistral-7B-it-v0.3 | 5.68 | 2.02 | 14135 | 5344 | 98 |
| Llama-3.1-8B-it | 6.08 | 1.80 | 15623 | 5350 | 99 |
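The syntactic check behind the Valid XML column can be as simple as attempting to parse each generated tree. A sketch using the standard library (the exact evaluation harness is not shown in this README):

```python
import xml.etree.ElementTree as ET


def is_valid_xml(text: str) -> bool:
    # Parses or it doesn't: this catches malformed XML only, not behavior
    # trees that are well-formed yet semantically wrong.
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    samples = ["<root><Sequence/></root>", "<root><Sequence></root>"]
    valid_pct = 100 * sum(map(is_valid_xml, samples)) / len(samples)
    print(f"Valid XML: {valid_pct:.0f}%")  # 50%
```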
This project integrates ideas and components from [ufal/whisper_streaming](https://github.com/ufal/whisper_streaming), which provides the foundation for real-time Whisper transcription.
```bibtex
@inproceedings{machacek-etal-2023-turning,
    title = "Turning Whisper into Real-Time Transcription System",
    author = "Mach{\'a}{\v{c}}ek, Dominik and
      Dabre, Raj and
      Bojar, Ond{\v{r}}ej",
    editor = "Saha, Sriparna and
      Sujaini, Herry",
    booktitle = "Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = nov,
    year = "2023",
    address = "Bali, Indonesia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.ijcnlp-demo.3",
    pages = "17--24",
}
```