2023.01

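Fay 2.0: 1. NAT traversal for the controller PC, so remote audio input/output devices can connect directly; 2. Android sample project for audio input/output; 3. Python sample project for audio input/output (usable on a remote PC, Raspberry Pi, etc.); 4. Re-added the 1.0 voice-command music playback module (remote playback is not yet supported); 5. Refactored and added several utility modules: websocket, multithreading, buffer, audio stream recorder, etc.; 6. Fixed multiple bugs from the 1.x releases.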
keyman9848 · Jan 31, 2023 · 55fb089
1 parent 09fecff
commit 55fb089

Showing 120 changed files with 29,212 additions and 166 deletions.
diff --git a/README.md b/README.md
@@ -2,7 +2,7 @@
     <br>
     <img src="images/icon.png" alt="Fay">
     <h1>FAY</h1>
-	<h3>Digital Human Controller (Is this the metaverse?)</h3>
+	<h3>Digital Human Fay Controller (Is this the metaverse?)</h3>
 </div>
 
 
@@ -20,9 +20,18 @@
 
 2. [Installing the Fay digital human open-source project on Mac (CSDN blog of 郭泽斌之心)](https://blog.csdn.net/aa84758481/article/details/127551258)
 
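+The latest version is 2.0. This release introduces a new architecture: anyone can run the Fay controller on their own personal computer (we may offer a dedicated terminal in the future), making that computer the host of their digital assistant. All of your devices (watch, phone, glasses, laptop) can talk to the assistant at any time, and the assistant handles tasks in the digital world for you through the computer. (Jarvis? Her?)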
+![](images/20230122074644.png)
 
 
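 Recent updates:
+2023.01
+1. NAT traversal for the controller PC, so remote audio input/output devices can connect directly;
+2. Android sample project for audio input/output;
+3. Python sample project for audio input/output (usable on a remote PC, Raspberry Pi, etc.);
+4. Re-added the 1.0 voice-command music playback module (remote playback is not yet supported);
+5. Refactored and added several utility modules: websocket, multithreading, buffer, audio stream recorder, etc.;
+6. Fixed multiple bugs from the 1.x releases.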
 
 2022.12
 
@@ -161,6 +170,13 @@ python main.py
 
 
 
+#### Remote audio input over socket
+
+Remote audio input and remote audio output devices can be connected.
+
+
+
+
 #### Product panel
 
 Enter a product description and the digital human will present the product automatically.
@@ -175,18 +191,32 @@ python main.py
 
 
 
-Fill in the application keys before starting
+Fill in the application keys in [`system.conf`](https://github.com/TheRamU/Fay/blob/main/system.conf) before starting
 
-| Module                    | Description                                                    | Link                                                         |
+| Code module               | Description                                                    | Link                                                         |
 | ------------------------- | -------------------------------------------------------------- | ------------------------------------------------------------ |
 | ./ai_module/ali_nls.py    | Alibaba Cloud real-time speech recognition                     | https://ai.aliyun.com/nls/trans                              |
 | ./ai_module/ms_tts_sdk.py | Microsoft text-to-speech (SDK-based)                           | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
 | ./ai_module/xf_aiui.py    | iFLYTEK human-computer interaction, natural language processing | https://aiui.xfyun.cn/solution/webapi                        |
 | ./ai_module/xf_ltp.py     | iFLYTEK sentiment analysis                                     | https://www.xfyun.cn/service/emotion-analysis                |
+| ./utils/ngrok_util.py     | ngrok.cc tunneling for external network access                 | http://ngrok.cc                                              |
+
+
 
 
 
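+## Connecting remote audio input/output devices (optional; access from outside the LAN requires configuring the clientid of an ngrok TCP tunnel from http://ngrok.cc)
 
+The controller communicates with remote audio input/output devices over a plain socket (not websocket).
+
+LAN address: [`ws://127.0.0.1:10001`](ws://127.0.0.1:10001)
+
+External address: obtained from http://ngrok.cc
+
+![](images/Dingtalk_20230131122109.jpg)
+
+
+Message format: see [remote_audio.py](https://github.com/TheRamU/Fay/blob/main/python_connector_demo/remote_audio.py)
 
 ## Communicating with the digital avatar (optional; "panel playback" must be turned off in the controller)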
 
@@ -202,6 +232,8 @@ python main.py
 
 
 
+
+
 ## Directory structure
 
 ```
@@ -238,7 +270,7 @@ python main.py
 
 Technical discussion group
 
-<img src="images/20230116105510.jpg" alt="WeChat group">
+<img src="images/-1101731868-3469777.png" alt="WeChat group">
 v2.0: join us on Tencent Meeting at 10 p.m. on January 25, 2023: https://meeting.tencent.com/dm/y2Vq5Iut8mN0
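
Note on the remote audio section of the README above: it says the controller talks to remote audio devices over a plain socket at `127.0.0.1:10001`, but the message format lives in `remote_audio.py`, which is not shown in this part of the diff. The following is only a minimal sketch of opening such a TCP connection and pushing bytes through it. The function name `stream_audio` and the framing (raw bytes, no header or handshake) are assumptions made for illustration, not the project's protocol.

```python
import socket

# Address taken from the README; assumed here to be a plain TCP endpoint.
FAY_HOST = "127.0.0.1"
FAY_PORT = 10001


def stream_audio(chunks, host=FAY_HOST, port=FAY_PORT, timeout=5.0):
    """Send an iterable of byte chunks over a single TCP connection.

    Returns the total number of bytes sent. The framing is hypothetical:
    consult remote_audio.py for the message format the controller expects.
    """
    sent = 0
    # create_connection resolves the address and connects with a timeout.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for chunk in chunks:
            conn.sendall(chunk)  # sendall retries until the whole chunk is written
            sent += len(chunk)
    return sent
```

When connecting from outside the LAN, the host and port would be replaced by the address of the ngrok TCP tunnel.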
 
 
diff --git a/[Start] PowerShell.bat b/[Start] PowerShell.bat
@@ -0,0 +1,3 @@
+start powershell ^
+$host.ui.RawUI.WindowTitle='FeiFei Alpha';^
+python ./main.py;^
diff --git a/[Start].bat b/[Start].bat
@@ -0,0 +1,3 @@
+echo off
+cls
+start ./bin/Start.vbs
diff --git a/ai_module/ali_nls.py b/ai_module/ali_nls.py
@@ -8,7 +8,7 @@
 from aliyunsdkcore.client import AcsClient
 from aliyunsdkcore.request import CommonRequest
 
-from core import wsa_server
+from core import wsa_server, song_player
 from scheduler.thread_manager import MyThread
 from utils import util
 from utils import config_util as cfg
@@ -69,6 +69,10 @@ def __create_header(self, name):
         }
         return header
 
+    def __on_msg(self):
+        if "暂停" in self.finalResults or "不想听了" in self.finalResults or "别唱了" in self.finalResults:
+            song_player.stop()
+
     # Handle incoming websocket messages
     def on_message(self, ws, message):
         try:
@@ -79,9 +83,11 @@ def on_message(self, ws, message):
                 self.done = True
                 self.finalResults = data['payload']['result']
                 wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
+                self.__on_msg()
             elif name == 'TranscriptionResultChanged':
                 self.finalResults = data['payload']['result']
                 wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
+                self.__on_msg()
 
         except Exception as e:
             print(e)
@@ -112,12 +118,13 @@ def run(*args):
                 try:
                     if len(self.__frames) > 0:
                         frame = self.__frames[0]
+
                         self.__frames.pop(0)
                         if type(frame) == dict:
                             ws.send(json.dumps(frame))
                         elif type(frame) == bytes:
                             ws.send(frame, websocket.ABNF.OPCODE_BINARY)
-                        # print('发送 ------> ' + str(type(frame)))
+                        #print('发送 ------> ' + str(type(frame)))
                 except Exception as e:
                     print(e)
                 time.sleep(0.04)
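
Note on `__on_msg` above: the method stops the song player when the recognized text contains one of three stop phrases. The same check can be sketched as a standalone helper, which makes the phrase list easier to extend and to test in isolation. The names `STOP_PHRASES` and `wants_stop` are hypothetical and do not appear in the commit.

```python
# Stop phrases copied from the diff above ("pause", "don't want to listen", "stop singing").
STOP_PHRASES = ("暂停", "不想听了", "别唱了")


def wants_stop(transcript, phrases=STOP_PHRASES):
    """Return True if any stop phrase occurs in the recognized text."""
    if not transcript:
        # No text recognized yet (None or empty string).
        return False
    return any(phrase in transcript for phrase in phrases)
```

With such a helper, the body of `__on_msg` would reduce to `if wants_stop(self.finalResults): song_player.stop()`.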

diff --git a/ai_module/ms_tts_sdk.py b/ai_module/ms_tts_sdk.py
@@ -6,14 +6,16 @@
 from core.tts_voice import EnumVoice
 from utils import util, config_util
 from utils import config_util as cfg
+import pygame
+
 
 
 class Speech:
     def __init__(self):
         self.__speech_config = speechsdk.SpeechConfig(subscription=cfg.key_ms_tts_key, region=cfg.key_ms_tts_region)
         self.__speech_config.speech_recognition_language = "zh-CN"
         self.__speech_config.speech_synthesis_voice_name = "zh-CN-XiaoxiaoNeural"
-        self.__speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)
+        self.__speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Riff16Khz16BitMonoPcm)
         self.__synthesizer = speechsdk.SpeechSynthesizer(speech_config=self.__speech_config, audio_config=None)
         self.__connection = None
         self.__history_data = []
@@ -57,7 +59,8 @@ def to_sample(self, text, style):
                '</speak>'.format(voice_name, style, 1.8, text)
         result = self.__synthesizer.speak_ssml(ssml)
         audio_data_stream = speechsdk.AudioDataStream(result)
-        file_url = './samples/sample-' + str(int(time.time() * 1000)) + '.mp3'
+
+        file_url = './samples/sample-' + str(int(time.time() * 1000)) + '.wav'
         audio_data_stream.save_to_wav_file(file_url)
         if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
             self.__history_data.append((voice_name, style, text, file_url))
@@ -66,3 +69,23 @@ def to_sample(self, text, style):
             util.log(1, "[x] 语音转换失败！")
             util.log(1, "[x] 原因: " + str(result.reason))
             return None
+if __name__ == '__main__':
+    cfg.load_config()
+    sp = Speech()
+    sp.connect()
+    pygame.init()
+    text = """一座城市，总有一条标志性道路，它见证着这座城市的时代变迁，并随着城市历史积淀砥砺前行，承载起城市的非凡荣耀。季华路，见证了佛山的崛起，从而也被誉为“最代表佛山城市发展的一条路”。季华路位于佛山市禅城区，是佛山市总体道路规划网中东西走向的城市主干道，全长20公里，是佛山市公路网络规划"四纵、九横、两环"主骨架中的重要组成部分，西接禅城南庄、高明、三水，东连南海、广州，横跨佛山一环、禅西大道、佛山大道、岭南大道、南海大道五大主干道，贯穿中心城区四个镇街，沿途经过多处文化古迹和重要产业区，是名副其实的“交通动脉”。同时季华路也是佛山的经济“大动脉”，代表着佛山蓬勃发展的现在，也影响着佛山日新月异的未来。
+        季华六路起于南海大道到文华北截至，道路为东西走向，全长1.5公里，该路段为1996年完成建设并投入使用，该道路为一级公路，路面使用混凝土材质，道路为双向5车道，路宽30米，途径1个行政单位，一条隧道，该路段设有格栅518个，两边护栏1188米，沙井盖158个，其中供水26个，市政77个，移动通讯2个，联通通讯3个，电信通讯3个，交通信号灯1个，人行天桥2个，电梯4台，标志牌18个，标线为1.64万米。
+        道路南行是文华中路，可通往亚洲艺术公园，亚洲艺术公园位于佛山市发展区的中心，占地40公顷，其中水体面积26.6公顷，以岭南水乡为文脉，以水上森林为绿脉，以龙舟竞渡为水脉，通过建筑、雕塑、植物、桥梁等设计要素，营造出一个具有亚洲艺术风采的艺术园地。曾获选佛山十大最美公园之一。
+        道路北行是文华北路，可通往佛山市委市政府。佛山市委市政府是广东省佛山市的行政管理机关。
+        道路西行到达文华公园。佛山市文华公园位于佛山市禅城区季华路以南（电视塔旁）、文华路以西，大福路以东路段，建设面积约11万平方米，主要将传统文化和现代园林有机结合，全园布局以大树木、大草坪、多彩植被和人工湖为表现主体，精致的溪涧、小桥、亲水平台点缀其间，通过棕榈植物错落有序的巧妙搭配，令园区既蕴涵亚热带曼妙风情，又不失岭南园艺的独特风采。通过“借景”、“透景”造园手法，与邻近的电视塔相映成趣，它的落成，为附近市民的休闲生活添上了色彩绚丽的一笔。
+
+        季华五路是季华路最先建设的一段道路，起于岭南大道到佛山大道截至，道路为东西走向，全长2.1公里，该路段为1993年完成建设并投入使用，该道路为一级公路，路面使用混凝土材质，道路为双向5车道，路宽30米，途径1个行政单位，该路段设有格栅634个，两边护栏1310米，沙井盖180个，其中供水30个，市政81个，移动通讯5个，联通通讯3个，交通信号灯2个，人行天桥3个，电梯12台，标志牌26个，标线为2.131万米。
+        沿途经过季华园，季华园即佛山季华公园，位于佛山市城南新区，1994年5月建成。占地200多亩。场内所有设施免费使用。景点介绍风格清新、意境优雅季华公园是具有亚热带风光的大型开放游览性公园。由于场内所有设施免费使用，地方广阔，每天都吸引着众多的游人前来休闲、运动等。
+        道路南行是佛山大道中，可通往乐从方向乐从镇，地处珠三角腹地，广佛经济圈核心带，是国家级重大国际产业、城市发展合作平台--中德工业服务区、中欧城镇化合作示范区的核心。
+        道路北行佛山大道中，可通往佛山火车站，佛山火车站是广东省的铁路枢纽之一，广三铁路经过该站。"""
+    s = sp.to_sample(text, "cheerful")
+    print(s)
+    pygame.mixer.music.load(s)
+    pygame.mixer.music.play()
+    sp.close()
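
Note on the `__main__` block above: `pygame.mixer.music.play()` returns without waiting for playback to finish, so the script as written reaches `sp.close()` and may exit before the sample has been heard. One way to handle this is to poll until the mixer reports that it is idle. The helper below is a sketch under that assumption; `wait_until_idle` is a hypothetical name, and it takes the busy check as a callable so that it does not depend on an audio device.

```python
import time


def wait_until_idle(is_busy, poll_s=0.1, timeout_s=None):
    """Poll is_busy() until it returns False.

    Returns True once idle, or False if timeout_s elapses first.
    """
    deadline = None if timeout_s is None else time.monotonic() + timeout_s
    while is_busy():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

In the demo, this could be called as `wait_until_idle(pygame.mixer.music.get_busy)` between `play()` and `sp.close()`. Since `to_sample` returns `None` on failure, checking its result before calling `pygame.mixer.music.load` would also be prudent.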
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.gitignore b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.gitignore
@@ -0,0 +1,15 @@
+*.iml
+.gradle
+/local.properties
+/.idea/caches
+/.idea/libraries
+/.idea/modules.xml
+/.idea/workspace.xml
+/.idea/navEditor.xml
+/.idea/assetWizardSettings.xml
+.DS_Store
+/build
+/captures
+.externalNativeBuild
+.cxx
+local.properties
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/.gitignore b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/.gitignore
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/compiler.xml b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/compiler.xml
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/gradle.xml b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/gradle.xml
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/misc.xml b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/.idea/misc.xml
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/.gitignore b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/.gitignore
@@ -0,0 +1 @@
+/build
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/build.gradle b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/build.gradle
@@ -0,0 +1,38 @@
+plugins {
+    id 'com.android.application'
+}
+
+android {
+    compileSdk 32
+
+    defaultConfig {
+        applicationId "com.yaheen.fayconnectordemo"
+        minSdk 29
+        targetSdk 32
+        versionCode 1
+        versionName "1.0"
+
+        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
+    }
+
+    buildTypes {
+        release {
+            minifyEnabled false
+            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
+        }
+    }
+    compileOptions {
+        sourceCompatibility JavaVersion.VERSION_1_8
+        targetCompatibility JavaVersion.VERSION_1_8
+    }
+}
+
+dependencies {
+
+    implementation 'androidx.appcompat:appcompat:1.3.0'
+    implementation 'com.google.android.material:material:1.4.0'
+    implementation 'androidx.constraintlayout:constraintlayout:2.0.4'
+    testImplementation 'junit:junit:4.13.2'
+    androidTestImplementation 'androidx.test.ext:junit:1.1.3'
+    androidTestImplementation 'androidx.test.espresso:espresso-core:3.4.0'
+}
diff --git a/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/proguard-rules.pro b/android_connector_demo/fayConnectorDemo-蓝牙service后台运行版/app/proguard-rules.pro
@@ -0,0 +1,21 @@
+# Add project specific ProGuard rules here.
+# You can control the set of applied configuration files using the
+# proguardFiles setting in build.gradle.
+#
+# For more details, see
+#   http://developer.android.com/guide/developing/tools/proguard.html
+
+# If your project uses WebView with JS, uncomment the following
+# and specify the fully qualified class name to the JavaScript interface
+# class:
+#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
+#   public *;
+#}
+
+# Uncomment this to preserve the line number information for
+# debugging stack traces.
+#-keepattributes SourceFile,LineNumberTable
+
+# If you keep the line number information, uncomment this to
+# hide the original source file name.
+#-renamesourcefileattribute SourceFile
diff --git a/...ce后台运行版/app/src/androidTest/java/com/yaheen/fayconnectordemo/ExampleInstrumentedTest.java b/...ce后台运行版/app/src/androidTest/java/com/yaheen/fayconnectordemo/ExampleInstrumentedTest.java
@@ -0,0 +1,26 @@
+package com.yaheen.fayconnectordemo;
+
+import android.content.Context;
+
+import androidx.test.platform.app.InstrumentationRegistry;
+import androidx.test.ext.junit.runners.AndroidJUnit4;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+
+import static org.junit.Assert.*;
+
+/**
+ * Instrumented test, which will execute on an Android device.
+ *
+ * @see <a href="http://d.android.com/tools/testing">Testing documentation</a>
+ */
+@RunWith(AndroidJUnit4.class)
+public class ExampleInstrumentedTest {
+    @Test
+    public void useAppContext() {
+        // Context of the app under test.
+        Context appContext = InstrumentationRegistry.getInstrumentation().getTargetContext();
+        assertEquals("com.yaheen.fayconnectordemo", appContext.getPackageName());
+    }
+}