An easy-to-use virtual avatar platform driven by Large Language Models.
Official code for the paper:
SAPIEN: Affective Virtual Agents Powered by Large Language Models
Masum Hasan, Cengiz Ozel, Sammy Potter, Ehsan Hoque (ACIIW 2023)
Works on any OS. This fork uses Gemini + ElevenLabs + Whisper (original SAPIEN used Azure).
git clone https://github.com/GilJob-E/SAPIEN.git
cd SAPIEN
pip install -r requirements.txt
pip install --upgrade transformers  # for sentence-transformers compatibility
Create start_app/dialogue_manager/keys.py and set your API keys:
import os
os.environ["GOOGLE_API_KEY"] = "your-gemini-api-key"
os.environ["ELEVENLABS_API_KEY"] = "your-elevenlabs-api-key"
os.environ["ELEVENLABS_VOICE_ID"] = "your-voice-id"
- Create an OAuth 2.0 Client ID in the Google Cloud Console.
- Save the downloaded JSON as start_app/client_secret.json.
- Add http://localhost:5001/callback to the authorized redirect URIs.
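The key and OAuth steps above can be sanity-checked before the first launch. The following is a minimal sketch, not part of SAPIEN: the helper name `missing_settings` is hypothetical, and it assumes the three variables from keys.py are already present in the environment (for example because keys.py has been imported) and that it is run from the repository root.

```python
import os
from pathlib import Path

# Variable names and the file path come from the setup steps above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")
CLIENT_SECRET = Path("start_app") / "client_secret.json"


def missing_settings(env=os.environ, repo_root="."):
    """Hypothetical helper: list setup problems; an empty list means none found."""
    problems = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "")
        # Unset, empty, and untouched "your-..." placeholders all count as missing.
        if not value or value.startswith("your-"):
            problems.append(f"{name} is not set")
    if not (Path(repo_root) / CLIENT_SECRET).is_file():
        problems.append(f"{CLIENT_SECRET.as_posix()} not found")
    return problems


if __name__ == "__main__":
    for problem in missing_settings():
        print("setup problem:", problem)
```

The check only confirms that values and the file exist; it does not verify that the keys are valid or that the redirect URI is registered.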
cp start_app/files/local_mode_dummy.json start_app/files/local_mode.json
- Download: https://rochester.box.com/v/sapien-videos
- Place the static and speaking folders under: start_app/static/video/Metahumans
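To confirm the video folders landed in the right place, a small check like the one below may help. It is a sketch under the assumption that it runs from the repository root; the function name `missing_video_folders` is hypothetical and not part of the project.

```python
from pathlib import Path

# Target directory and folder names are taken from the step above.
VIDEO_ROOT = Path("start_app") / "static" / "video" / "Metahumans"
EXPECTED_FOLDERS = ("static", "speaking")


def missing_video_folders(repo_root="."):
    """Hypothetical helper: return expected folder names absent under VIDEO_ROOT."""
    base = Path(repo_root) / VIDEO_ROOT
    return [name for name in EXPECTED_FOLDERS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_video_folders()
    print("missing folders:", missing if missing else "none")
```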
cd start_app
TOKENIZERS_PARALLELISM=false python app.py
Open http://localhost:5001 and log in with Google to use the app.
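The inline `VAR=value command` form used above is POSIX shell syntax and is not accepted by Windows cmd or PowerShell. One portable alternative is to set the variable from Python early in startup, before the Hugging Face libraries are imported. This is a sketch only: the helper name `disable_tokenizer_parallelism` is hypothetical, and whether app.py already does something similar is not stated here.

```python
import os


def disable_tokenizer_parallelism(env=os.environ):
    """Hypothetical helper: default TOKENIZERS_PARALLELISM to "false".

    setdefault keeps any value the user already exported, so an explicit
    shell setting still takes precedence.
    """
    env.setdefault("TOKENIZERS_PARALLELISM", "false")
    return env["TOKENIZERS_PARALLELISM"]


if __name__ == "__main__":
    # Call this before importing transformers or sentence-transformers.
    print("TOKENIZERS_PARALLELISM =", disable_tokenizer_parallelism())
```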
cd start_app
python -m pytest tests/ -v -m "not api"
- Install ffmpeg and add it to your PATH.
- On macOS, port 5000 is occupied by AirPlay, so the app uses port 5001.
- TOKENIZERS_PARALLELISM=false is required to prevent a sentence-transformers mutex deadlock.
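If the app fails to start, it can be useful to see whether the port is already taken, as happens with port 5000 on macOS. The sketch below uses only the standard library; `port_is_free` is a hypothetical name, and the check is approximate because another process can claim the port between the check and the actual launch.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Hypothetical helper: True if a TCP socket can bind to (host, port) now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
        return True


if __name__ == "__main__":
    for candidate in (5000, 5001):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"port {candidate}: {state}")
```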
- Masum Hasan
- Cengiz Ozel
- Sammy Potter
- Sara Jeiter-Johnson
- Kate Giugno
- Erman Ural
- Richard Chuong
Developed at the Roc-HCI lab, University of Rochester.
Supervised by Prof. Ehsan Hoque.
If you use this work, please cite the following paper:
@misc{hasan2023sapien,
title={SAPIEN: Affective Virtual Agents Powered by Large Language Models},
author={Masum Hasan and Cengiz Ozel and Sammy Potter and Ehsan Hoque},
year={2023},
eprint={2308.03022},
archivePrefix={arXiv},
primaryClass={cs.HC}
}
MIT License
Copyright (c) 2023 University of Rochester
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
SAPIEN™ is a trademark owned by SAPIEN Coach LLC, which is being soft-licensed to the University of Rochester. Using the name outside this project is prohibited.