2023.01
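Fay2.0:
1. NAT traversal for the controller PC, so remote audio input/output devices can connect directly;
2. Android sample project for audio input/output;
3. Python sample project for audio input/output (usable on remote PCs, Raspberry Pi, etc.);
4. Re-added the 1.0 voice-command music playback module (remote playback is not supported yet);
5. Refactored and added several utility modules: websocket, multithreading, buffer, audio stream recorder, etc.;
6. Fixed multiple bugs from the 1.x releases.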
xszyou committed Jan 31, 2023
1 parent 09fecff commit 55fb089
Showing 120 changed files with 29,212 additions and 166 deletions.
40 changes: 36 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
<br>
<img src="images/icon.png" alt="Fay">
<h1>FAY</h1>
<h3>Digital Human Controller (Is this the metaverse?)</h3>
<h3>Digital Human Fay Controller (Is this the metaverse?)</h3>
</div>


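@@ -20,9 +20,18 @@

2. [Installing the Fay digital human open-source project on macOS (郭泽斌之心's blog, CSDN)](https://blog.csdn.net/aa84758481/article/details/127551258)

The latest version is 2.0. This release introduces a new architecture in which anyone can run the Fay controller on their own personal computer (we may offer a dedicated terminal in the future), making that computer the host of their digital assistant. All of your devices (watch, phone, glasses, laptop) can talk to the digital assistant at any time, and the assistant handles everything in your digital world through the computer. (Jarvis? Her?)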
![](images/20230122074644.png)


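Recent updates:
2023.01
1. NAT traversal for the controller PC, so remote audio input/output devices can connect directly;
2. Android sample project for audio input/output;
3. Python sample project for audio input/output (usable on remote PCs, Raspberry Pi, etc.);
4. Re-added the 1.0 voice-command music playback module (remote playback is not supported yet);
5. Refactored and added several utility modules: websocket, multithreading, buffer, audio stream recorder, etc.;
6. Fixed multiple bugs from the 1.x releases.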

2022.12

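@@ -161,6 +170,13 @@ python main.py



#### Remote audio input over socket

Remote audio input and remote audio output devices can be connected.




#### Product panel

Enter a product description and the digital human will present the product automatically.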
@@ -175,18 +191,32 @@ python main.py



Fill in the application keys before starting
Fill in the application keys in [`system.conf`](https://github.com/TheRamU/Fay/blob/main/system.conf) before starting

| Module | Description | Link |
| Code module | Description | Link |
| ------------------------- | -------------------------- | ------------------------------------------------------------ |
| ./ai_module/ali_nls.py | Alibaba Cloud real-time speech recognition | https://ai.aliyun.com/nls/trans |
| ./ai_module/ms_tts_sdk.py | Microsoft text-to-speech (SDK-based) | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
| ./ai_module/xf_aiui.py | iFLYTEK human-computer interaction / natural language processing | https://aiui.xfyun.cn/solution/webapi |
| ./ai_module/xf_ltp.py | iFLYTEK sentiment analysis | https://www.xfyun.cn/service/emotion-analysis |
| ./utils/ngrok_util.py | ngrok.cc public-network tunneling | http://ngrok.cc |
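
Since every module in the table needs a key from `system.conf`, it can help to check for missing keys before starting the controller. The following is a minimal sketch only: the `[key]` section name, the option names `ms_tts_key` and `ms_tts_region`, and the helper `load_keys` are assumptions made for illustration, not the actual layout of `system.conf`.

```python
import configparser

# Hypothetical contents; the real system.conf layout may differ.
SAMPLE_CONF = """
[key]
ms_tts_key = your-azure-speech-key
ms_tts_region = eastasia
"""

# Hypothetical option names, chosen only for this sketch.
REQUIRED = ("ms_tts_key", "ms_tts_region")


def load_keys(text, section="key"):
    """Parse INI-style text and return the required keys, or raise ValueError."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        raise ValueError("missing section: " + section)
    missing = [
        name for name in REQUIRED
        if not parser.get(section, name, fallback="").strip()
    ]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return {name: parser.get(section, name).strip() for name in REQUIRED}
```

Calling `load_keys(SAMPLE_CONF)` returns a dictionary of the two values, while a file with an empty or absent option raises `ValueError` naming the missing entries.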





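## Connecting to remote audio input/output devices (optional; access from outside the LAN requires the clientid of an http://ngrok.cc ngrok TCP tunnel)

The controller communicates with audio output devices over a plain socket (not websocket)

LAN address: [`ws://127.0.0.1:10001`](ws://127.0.0.1:10001)

Public address: obtained through http://ngrok.cc

![](images/Dingtalk_20230131122109.jpg)


Message format: see [remote_audio.py](https://github.com/TheRamU/Fay/blob/main/python_connector_demo/remote_audio.py)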
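
The section above says the controller talks to audio devices over a plain TCP socket on port 10001 and refers to `remote_audio.py` for the message format. As a rough illustration only, the sketch below streams raw audio bytes to such a socket in fixed-size chunks. The chunk size, the helper names `iter_chunks`, `send_audio` and `connect`, and the absence of any framing or marker bytes are all assumptions; the real framing is whatever `remote_audio.py` defines.

```python
import socket

CHUNK_SIZE = 1024  # assumed chunk size, not taken from remote_audio.py


def iter_chunks(data, size=CHUNK_SIZE):
    """Yield consecutive slices of `data`, each at most `size` bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def send_audio(sock, pcm, size=CHUNK_SIZE):
    """Send raw audio bytes over a connected socket; return the byte count."""
    sent = 0
    for chunk in iter_chunks(pcm, size):
        sock.sendall(chunk)  # sendall retries until the whole chunk is written
        sent += len(chunk)
    return sent


def connect(host="127.0.0.1", port=10001, timeout=5.0):
    """Open a plain TCP connection to the controller's LAN address."""
    return socket.create_connection((host, port), timeout=timeout)
```

With a controller running locally, `send_audio(connect(), pcm_bytes)` would push the bytes to it; for access from outside the LAN, the host and port would be the ones reported by the ngrok tunnel.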

## Communicating with the digital avatar (optional; the controller's "panel playback" must be turned off)

@@ -202,6 +232,8 @@ python main.py





## Directory structure

```
@@ -238,7 +270,7 @@ python main.py

Technical discussion group

<img src="images/20230116105510.jpg" alt="WeChat group">
<img src="images/-1101731868-3469777.png" alt="WeChat group">
v2.0: join us on Tencent Meeting at 10 pm on January 25, 2023: https://meeting.tencent.com/dm/y2Vq5Iut8mN0


3 changes: 3 additions & 0 deletions [Start] PowerShell.bat
@@ -0,0 +1,3 @@
start powershell ^
$host.ui.RawUI.WindowTitle='FeiFei Alpha';^
python ./main.py;^
3 changes: 3 additions & 0 deletions [Start].bat
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
echo off
cls
start ./bin/Start.vbs
11 changes: 9 additions & 2 deletions ai_module/ali_nls.py
@@ -8,7 +8,7 @@
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

from core import wsa_server
from core import wsa_server, song_player
from scheduler.thread_manager import MyThread
from utils import util
from utils import config_util as cfg
@@ -69,6 +69,10 @@ def __create_header(self, name):
        }
        return header

    def __on_msg(self):
        if "暂停" in self.finalResults or "不想听了" in self.finalResults or "别唱了" in self.finalResults:
            song_player.stop()

    # Handle incoming websocket messages
    def on_message(self, ws, message):
        try:
@@ -79,9 +83,11 @@ def on_message(self, ws, message):
                self.done = True
                self.finalResults = data['payload']['result']
                wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
                self.__on_msg()
            elif name == 'TranscriptionResultChanged':
                self.finalResults = data['payload']['result']
                wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
                self.__on_msg()

        except Exception as e:
            print(e)
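
In the hunk above, `__on_msg` checks the recognised text for three stop phrases ("暂停", pause; "不想听了", I don't want to listen any more; "别唱了", stop singing) and runs on both interim and final results, so `song_player.stop()` may be called several times for one utterance. If that ever matters, one possible variation is to fire at most once per sentence. This is a sketch under assumptions: `StopCommandHandler`, `FakePlayer`, `on_partial` and `on_final` are hypothetical names, and `FakePlayer` only stands in for `song_player`.

```python
STOP_KEYWORDS = ("暂停", "不想听了", "别唱了")  # phrases taken from the diff


class FakePlayer:
    """Hypothetical stand-in for song_player that counts stop() calls."""

    def __init__(self):
        self.stop_calls = 0

    def stop(self):
        self.stop_calls += 1


class StopCommandHandler:
    """Call player.stop() at most once per recognised sentence."""

    def __init__(self, player, keywords=STOP_KEYWORDS):
        self._player = player
        self._keywords = keywords
        self._fired = False

    def _check(self, text):
        # Substring match, same test as the `in` checks in __on_msg.
        if not self._fired and any(word in text for word in self._keywords):
            self._player.stop()
            self._fired = True

    def on_partial(self, text):
        """Handle an interim result (TranscriptionResultChanged)."""
        self._check(text)

    def on_final(self, text):
        """Handle a final result (SentenceEnd), then reset for the next one."""
        self._check(text)
        self._fired = False
```

Feeding the interim results "别" and "别唱了" followed by the final result "别唱了" triggers a single `stop()` call, and the handler is ready again for the next sentence.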
@@ -112,12 +118,13 @@ def run(*args):
                try:
                    if len(self.__frames) > 0:
                        frame = self.__frames[0]

                        self.__frames.pop(0)
                        if type(frame) == dict:
                            ws.send(json.dumps(frame))
                        elif type(frame) == bytes:
                            ws.send(frame, websocket.ABNF.OPCODE_BINARY)
                        # print('sent ------> ' + str(type(frame)))
                        #print('sent ------> ' + str(type(frame)))
                except Exception as e:
                    print(e)
                time.sleep(0.04)
27 changes: 25 additions & 2 deletions ai_module/ms_tts_sdk.py
@@ -6,14 +6,16 @@
from core.tts_voice import EnumVoice
from utils import util, config_util
from utils import config_util as cfg
import pygame



class Speech:
    def __init__(self):
        self.__speech_config = speechsdk.SpeechConfig(subscription=cfg.key_ms_tts_key, region=cfg.key_ms_tts_region)
        self.__speech_config.speech_recognition_language = "zh-CN"
        self.__speech_config.speech_synthesis_voice_name = "zh-CN-XiaoxiaoNeural"
        self.__speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)
        self.__speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Riff16Khz16BitMonoPcm)
        self.__synthesizer = speechsdk.SpeechSynthesizer(speech_config=self.__speech_config, audio_config=None)
        self.__connection = None
        self.__history_data = []
@@ -57,7 +59,8 @@ def to_sample(self, text, style):
               '</speak>'.format(voice_name, style, 1.8, text)
        result = self.__synthesizer.speak_ssml(ssml)
        audio_data_stream = speechsdk.AudioDataStream(result)
        file_url = './samples/sample-' + str(int(time.time() * 1000)) + '.mp3'

        file_url = './samples/sample-' + str(int(time.time() * 1000)) + '.wav'
        audio_data_stream.save_to_wav_file(file_url)
        if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            self.__history_data.append((voice_name, style, text, file_url))
@@ -66,3 +69,23 @@ def to_sample(self, text, style):
            util.log(1, "[x] 语音转换失败!")
            util.log(1, "[x] 原因: " + str(result.reason))
            return None
if __name__ == '__main__':
    cfg.load_config()
    sp = Speech()
    sp.connect()
    pygame.init()
    text = """一座城市,总有一条标志性道路,它见证着这座城市的时代变迁,并随着城市历史积淀砥砺前行,承载起城市的非凡荣耀。季华路,见证了佛山的崛起,从而也被誉为“最代表佛山城市发展的一条路”。季华路位于佛山市禅城区,是佛山市总体道路规划网中东西走向的城市主干道,全长20公里,是佛山市公路网络规划"四纵、九横、两环"主骨架中的重要组成部分,西接禅城南庄、高明、三水,东连南海、广州,横跨佛山一环、禅西大道、佛山大道、岭南大道、南海大道五大主干道,贯穿中心城区四个镇街,沿途经过多处文化古迹和重要产业区,是名副其实的“交通动脉”。同时季华路也是佛山的经济“大动脉”,代表着佛山蓬勃发展的现在,也影响着佛山日新月异的未来。
季华六路起于南海大道到文华北截至,道路为东西走向,全长1.5公里,该路段为1996年完成建设并投入使用,该道路为一级公路,路面使用混凝土材质,道路为双向5车道,路宽30米,途径1个行政单位,一条隧道,该路段设有格栅518个,两边护栏1188米,沙井盖158个,其中供水26个,市政77个,移动通讯2个,联通通讯3个,电信通讯3个,交通信号灯1个,人行天桥2个,电梯4台,标志牌18个,标线为1.64万米。
道路南行是文华中路,可通往亚洲艺术公园,亚洲艺术公园位于佛山市发展区的中心,占地40公顷,其中水体面积26.6公顷,以岭南水乡为文脉,以水上森林为绿脉,以龙舟竞渡为水脉,通过建筑、雕塑、植物、桥梁等设计要素,营造出一个具有亚洲艺术风采的艺术园地。曾获选佛山十大最美公园之一。
道路北行是文华北路,可通往佛山市委市政府。佛山市委市政府是广东省佛山市的行政管理机关。
道路西行到达文华公园。佛山市文华公园位于佛山市禅城区季华路以南(电视塔旁)、文华路以西,大福路以东路段,建设面积约11万平方米,主要将传统文化和现代园林有机结合,全园布局以大树木、大草坪、多彩植被和人工湖为表现主体,精致的溪涧、小桥、亲水平台点缀其间,通过棕榈植物错落有序的巧妙搭配,令园区既蕴涵亚热带曼妙风情,又不失岭南园艺的独特风采。通过“借景”、“透景”造园手法,与邻近的电视塔相映成趣,它的落成,为附近市民的休闲生活添上了色彩绚丽的一笔。
季华五路是季华路最先建设的一段道路,起于岭南大道到佛山大道截至,道路为东西走向,全长2.1公里,该路段为1993年完成建设并投入使用,该道路为一级公路,路面使用混凝土材质,道路为双向5车道,路宽30米,途径1个行政单位,该路段设有格栅634个,两边护栏1310米,沙井盖180个,其中供水30个,市政81个,移动通讯5个,联通通讯3个,交通信号灯2个,人行天桥3个,电梯12台,标志牌26个,标线为2.131万米。
沿途经过季华园,季华园即佛山季华公园,位于佛山市城南新区,1994年5月建成。占地200多亩。场内所有设施免费使用。景点介绍风格清新、意境优雅季华公园是具有亚热带风光的大型开放游览性公园。由于场内所有设施免费使用,地方广阔,每天都吸引着众多的游人前来休闲、运动等。
道路南行是佛山大道中,可通往乐从方向乐从镇,地处珠三角腹地,广佛经济圈核心带,是国家级重大国际产业、城市发展合作平台--中德工业服务区、中欧城镇化合作示范区的核心。
道路北行佛山大道中,可通往佛山火车站,佛山火车站是广东省的铁路枢纽之一,广三铁路经过该站。"""
    s = sp.to_sample(text, "cheerful")
    print(s)
    pygame.mixer.music.load(s)
    pygame.mixer.music.play()
    sp.close()
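
The change in this file switches the synthesis output from MP3 to `Riff16Khz16BitMonoPcm`, which matches the existing `save_to_wav_file` call and the new `.wav` extension. To check that a sample on disk really is 16 kHz, 16-bit, mono PCM, the standard-library `wave` module is enough. The helpers `write_pcm_wav` and `describe_wav` below are hypothetical names for this sketch and are not part of the Fay code base.

```python
import io
import wave


def write_pcm_wav(target, pcm, sample_rate=16000, channels=1, sample_width=2):
    """Write raw PCM bytes as a RIFF/WAV file (path or binary file object)."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


def describe_wav(source):
    """Return the basic format parameters of a WAV file."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }
```

For a file produced by `to_sample`, `describe_wav(path)` would be expected to report one channel, a sample width of 2 bytes and a sample rate of 16000, assuming the SDK wrote the format named in the diff.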
@@ -0,0 +1,15 @@
*.iml
.gradle
/local.properties
/.idea/caches
/.idea/libraries
/.idea/modules.xml
/.idea/workspace.xml
/.idea/navEditor.xml
/.idea/assetWizardSettings.xml
.DS_Store
/build
/captures
.externalNativeBuild
.cxx
local.properties

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

@@ -0,0 +1 @@
/build
@@ -0,0 +1,38 @@
plugins {
    id 'com.android.application'
}

android {
    compileSdk 32

    defaultConfig {
        applicationId "com.yaheen.fayconnectordemo"
        minSdk 29
        targetSdk 32
        versionCode 1
        versionName "1.0"

        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
    }
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

dependencies {

    implementation 'androidx.appcompat:appcompat:1.3.0'
    implementation 'com.google.android.material:material:1.4.0'
    implementation 'androidx.constraintlayout:constraintlayout:2.0.4'
    testImplementation 'junit:junit:4.13.2'
    androidTestImplementation 'androidx.test.ext:junit:1.1.3'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.4.0'
}
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
# http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
# public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile
@@ -0,0 +1,26 @@
package com.yaheen.fayconnectordemo;

import android.content.Context;

import androidx.test.platform.app.InstrumentationRegistry;
import androidx.test.ext.junit.runners.AndroidJUnit4;

import org.junit.Test;
import org.junit.runner.RunWith;

import static org.junit.Assert.*;

/**
 * Instrumented test, which will execute on an Android device.
 *
 * @see <a href="http://d.android.com/tools/testing">Testing documentation</a>
 */
@RunWith(AndroidJUnit4.class)
public class ExampleInstrumentedTest {
    @Test
    public void useAppContext() {
        // Context of the app under test.
        Context appContext = InstrumentationRegistry.getInstrumentation().getTargetContext();
        assertEquals("com.yaheen.fayconnectordemo", appContext.getPackageName());
    }
}