
jobot-ai-elephant

myCobot280 RISCV Smart Retail Scene System

Install the Code

  • Clone with git:
git clone https://github.com/elephantrobotics/jobot-ai-elephant.git

Environment Setup

sudo apt install -y \
    spacemit-ollama-toolkit \
    portaudio19-dev \
    python3-dev \
    libopenblas-dev \
    ffmpeg \
    python3-venv \
    python3-spacemit-ort \
    libceres-dev \
    libopencv-dev

Large Model Dependency Installation

cd ~/jobot-ai-elephant/spacemit_audio
bash ollama.sh

Python Dependency Installation

cd ~/jobot-ai-elephant
python3 -m venv ~/asr_env
source ~/asr_env/bin/activate
pip install -r requirements.txt

Add User to Audio Group

sudo usermod -aG audio $USER

Using the Code

Check Recording Devices

The recording device is detected automatically. If automatic detection fails, set the device manually:

arecord -l

Sample output:

Devices with "Camera" in the name are camera microphones and should not be selected; in this example, card 3 is the usable one.

**** List of CAPTURE Hardware Devices ****
card 1: Camera [USB Camera], device 0: USB Audio [USB Audio]
    Subdevices: 1/1
    Subdevice #0: subdevice #0
card 2: Camera_1 [USB 2.0 Camera], device 0: USB Audio [USB Audio]
    Subdevices: 1/1
    Subdevice #0: subdevice #0
card 3: Device [USB PnP Sound Device], device 0: USB Audio [USB Audio]
    Subdevices: 1/1
    Subdevice #0: subdevice #0

Modify the recording device index to 3 in the smart_main_asr.py file:

...
record_device = 3  # Recording device index; set to the card number from arecord -l
rec_audio = RecAudioThreadPipeLine(vad_mode=1, sld=2, max_time=2, channels=1, rate=48000, device_index=record_device)
...
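Device selection can also be automated by parsing the `arecord -l` output and skipping camera microphones. The sketch below is illustrative only; the helper name and regex are not part of the project:

```python
import re

# Sample `arecord -l` output (from the example above).
SAMPLE = """\
**** List of CAPTURE Hardware Devices ****
card 1: Camera [USB Camera], device 0: USB Audio [USB Audio]
card 2: Camera_1 [USB 2.0 Camera], device 0: USB Audio [USB Audio]
card 3: Device [USB PnP Sound Device], device 0: USB Audio [USB Audio]
"""

def pick_record_card(arecord_output):
    """Return the first capture card whose name does not mention a camera."""
    for line in arecord_output.splitlines():
        m = re.match(r"card (\d+): (\S+) \[([^\]]+)\]", line)
        if m and "camera" not in m.group(3).lower():
            return int(m.group(1))
    return None

print(pick_record_card(SAMPLE))  # → 3
```

In practice you would feed this the output of `subprocess.run(["arecord", "-l"], capture_output=True, text=True).stdout` instead of the sample string.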

Control Maximum Recording Duration

rec_audio.max_time_record = 3  # Maximum recording time in seconds

Recording runs in non-blocking mode by default. For most applications, serial execution is more common—use join() to wait for recording to finish:

# Start recording user audio
rec_audio.max_time_record = 3
rec_audio.frame_is_append = True
rec_audio.start_recording()
rec_audio.thread.join()  # Wait for recording to complete
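The non-blocking-then-join pattern above can be sketched with the standard threading module; the worker function here is a hypothetical stand-in for RecAudioThreadPipeLine's capture loop:

```python
import threading
import time

def record(seconds, frames):
    # Hypothetical stand-in for the real audio capture loop.
    time.sleep(seconds)
    frames.append(b"...audio...")

frames = []
t = threading.Thread(target=record, args=(0.1, frames))
t.start()           # non-blocking: returns immediately while recording runs
t.join()            # serial execution: block until recording finishes
print(len(frames))  # → 1
```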

Check Playback Devices

The playback device is detected automatically. If automatic detection fails, set the device manually:

aplay -l

Sample output:

(asr_env) jobot-ai-pipeline git:(main) aplay -l
card 0: sndes8326 [snd-es8326], device 0: i2s-dai0-ES8326 HiFi ES8326 HiFi-0 []
    Subdevices: 1/1
    Subdevice #0: subdevice #0
card 2: Device [USB Audio Device], device 0: USB Audio [USB Audio]
    Subdevices: 1/1
    Subdevice #0: subdevice #0

The USB speaker corresponds to card 2. Therefore, set:
play_device = 'plughw:2,0'

Update the following files accordingly:

# smart_main_asr.py
play_device = 'plughw:2,0'  # Playback device (card 2, device 0)

Run the Code

cd ~/jobot-ai-elephant
source ~/asr_env/bin/activate  # Run within the virtual environment
python smart_main_asr.py

Press Enter without typing anything to start recording (3 seconds by default).

Examples of commands handled by fuzzy matching:

"Grab the orange", "Grab the apple", "Checkout", etc.

Commands supported by the large language model:

  1. "Give me an apple", "And an orange" ...
  2. The large model can recognize object names.
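Fuzzy matching of the kind shown above can be sketched with the standard library's difflib; the command list and cutoff here are illustrative, not the project's actual implementation (which lives in test_match.py and the main scripts):

```python
import difflib

# Illustrative command templates, not the project's real list.
COMMANDS = ["grab the orange", "grab the apple", "checkout"]

def match_command(heard):
    """Map a possibly misheard phrase to the closest known command."""
    hits = difflib.get_close_matches(heard.lower(), COMMANDS, n=1, cutoff=0.6)
    return hits[0] if hits else None

print(match_command("Grab the ornge"))  # → grab the orange
```

Tolerating small ASR errors this way ("ornge" still resolves to "orange") is what lets the fixed command set work without invoking the large model.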

Startup Script Description

smart_main_asr.py: Chinese voice input; covers the full pipeline of speech-to-text, LLM, object detection, grasping, QR code recognition, and OCR text recognition

smart_main.py: English text input; covers the full pipeline of LLM, object detection, grasping, QR code recognition, and OCR text recognition

smart_simple_asr.py: Chinese voice input; covers only speech-to-text, LLM, object detection, and grasping, for quick demonstration

smart_simple.py: English text input; covers only LLM, object detection, and grasping, for quick demonstration

Project Directory Structure

├── spacemit_audio          # Audio module: recording, playback, ASR
├── spacemit_cv             # Computer vision module
├── spacemit_llm            # Large language model module
├── spacemit_orc            # OCR module
├── tools                   # Utilities
├── feedback_wav            # Feedback audio clips
├── cv_robot_arm_demo.py
├── ocr_demo.py             # Standalone OCR test
├── README_EN.md            # English documentation
├── README.md               # Chinese documentation
├── smart_main_asr.py       # Main retail program (voice interaction)
├── smart_main.py           # Main retail program (text input, no voice)
├── smart_simple_asr.py     # Simple recognition and grasping example (voice interaction)
├── smart_simple.py         # Simple recognition and grasping example (text input)
├── test_asr.py             # Test recording separately
├── test_llm.py             # Test the large model separately
├── test_match.py           # Test function matching separately
├── test_play.py            # Test playback separately
└── to_zero.py              # Return the robotic arm to its recognition zero point
