Happy weekend

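1. Visemes for lip-shape calculation switched to 33 ms;
2. The built-in rwkv_api NLP module can be used directly;
3. Reduced how often mood values are pushed to the digital human client;
4. No interface messages are generated while no digital human is connected;
5. Fixed playback information occasionally not being pushed to the digital human client because of malformed mp3 files;
6. Fixed the user's question not being pushed to the digital human client when commands such as mute ended the NLP logic early;
7. Added cleanup of wav files at startup;
8. Upgraded and improved the WebSocket utility class.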
xiaojiayang · Aug 4, 2023 · ba6972a
1 parent f8ff5a2
commit ba6972a
Showing 38 changed files with 309 additions and 246 deletions.
diff --git a/README.md b/README.md
@@ -120,43 +120,42 @@ Remote Android　　[Live2D](https://www.bilibili.com/video/BV1sx4y1d775/?vd_sou
 
 ## **III. Upgrade Log**
 
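+**2023.08.04:**
+
++ Updated the UE5 project;
++ Visemes for lip-shape calculation switched to 33 ms;
++ The built-in rwkv_api NLP module can be used directly;
++ Reduced how often mood values are pushed to the digital human client;
++ No interface messages are generated while no digital human is connected;
++ Fixed playback information occasionally not being pushed to the digital human client because of malformed mp3 files;
++ Fixed the user's question not being pushed to the digital human client when commands such as mute ended the NLP logic early;
++ Added cleanup of wav files at startup;
++ Upgraded and improved the WebSocket utility class.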
+
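 **2023.07.28:**
 
 + Added automatic cleanup of the UI cache at runtime;
 + The GPT proxy setting may now be left empty;
 + Improved the stability of the Lingju integration.
 
-**2023.07.26:**
-
 + Fixed a flood of WS messages generated before a digital human connects;
 + Added digital human (ue, live2d, xuniren) communication interface: real-time logs;
 + Updated digital human (ue, live2d, xuniren) communication interface: audio push.
 
-**2023.07.21:**
-
 + Multiple updates to the merchandise edition;
 
-
-**2023.07.19:**
-
 + Fixed remote voice input not being recognized;
 + Fixed intermittent ASR failures;
 + Removed the singing command.
 
-**2023.07.14:**
-
 + Fixed errors when running on Linux and macOS;
 + Fixed execution halting after a lip-shape error;
 + Provided an RWKV integration option.
 
-**2023.07.12:**
-
 + Fixed assistant-edition text input not reading persona replies;
 + Fixed assistant-edition text input not reading QA replies;
 + Improved microphone input stability.
 
-**2023.07.05:**
-
 + Fixed audio not playing when the lip-shape algorithm failed to run.
 
 **2023.06:**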

diff --git a/README_EN.md b/README_EN.md
@@ -122,40 +122,43 @@ Message format: View [WebSocket.md](https://github.com/TheRamU/Fay/blob/main/Web
 
 ## **Upgrade Log**
 
-**2023.07.28：**
+**2023.08.04:**
+
+- Updated the UE5 project.
+- Visemes for lip-shape calculation switched to 33 ms.
+- The built-in rwkv_api NLP module can be used directly.
+- Reduced how often mood values are pushed to the digital human client.
+- No interface messages are generated while no digital human is connected.
+- Fixed playback information occasionally not being pushed to the digital human client because of malformed mp3 files.
+- Fixed the user's question not being pushed to the digital human client when commands such as mute ended the NLP logic early.
+- Added cleanup of wav files at startup.
+- Upgraded and improved the WebSocket utility class.
+
+**2023.07：**
 
 + Added automatic cleanup of the UI cache at runtime;
 + The GPT proxy setting may now be left empty;
 + Improved the stability of the Lingju integration.
 
-**2023.07.21：**
-
 + Fixed the problem of generating a large amount of WS information before connecting digital humans;
 +  Add digital human (UE, Live2D, Xuniren) communication interface: real-time logs;
 + Update digital human (UE, Live2D, Xuniren) communication interface: audio push.
 
-**2023.07.21：**
-
 + Multiple updates for the merchandise version.
 
-**2023.07.19：**
 + Fixed remote voice input not being recognized.
 + Fixed the issue of occasional unresponsiveness during ASR (Automatic Speech Recognition).
 + Removed the singing command.
 
-**2023.07.14：**
-
 + Fixed Linux and macOS runtime errors.
 + Fixed the issue of being unable to continue execution due to lip-sync errors.
 + Provided an integration solution for RWKV.
 
-**2023.07.12：**
-
 + Fixed an issue in Assistant Edition where text input does not read persona responses.
 + Fixed an issue in Assistant Edition where text input does not read QA responses.
 + Enhanced microphone stability.
 
-**2023.07.05：**
+****
 
 + Fixed a sound playback issue caused by the inability to run the lip-sync algorithm.
 

diff --git a/[Start] PowerShell.bat b/[Start] PowerShell.bat
diff --git a/[Start].bat b/[Start].bat
diff --git a/ai_module/nlp_rwkv_api.py b/ai_module/nlp_rwkv_api.py
@@ -43,7 +43,7 @@ def question(cont):
 
     print("接口调用耗时 :" + str(time.time() - starttime))
 
-    return response_text
+    return response_text.strip()
 
 if __name__ == "__main__":
     for i in range(3):

diff --git a/core/fay_core.py b/core/fay_core.py
@@ -36,6 +36,7 @@
 from ai_module import yolov8
 from ai_module import nlp_VisualGLM
 from ai_module import nlp_lingju
+from ai_module import nlp_rwkv_api
 
 import platform
 if platform.system() == "Windows":
@@ -51,7 +52,8 @@
     "nlp_chatgpt": nlp_chatgpt,
     "nlp_rasa": nlp_rasa,
     "nlp_VisualGLM": nlp_VisualGLM,
-    "nlp_lingju": nlp_lingju
+    "nlp_lingju": nlp_lingju,
+    "nlp_rwkv_api":nlp_rwkv_api
 }
 
 
@@ -125,6 +127,8 @@ def __init__(self):
         self.q_msg = '你叫什么名字？'
         self.a_msg = 'hi,我叫菲菲，英文名是fay'
         self.mood = 0.0  # mood value
+        self.old_mood = 0.0
+        self.connect = False
         self.item_index = 0
         self.deviceSocket = None
         self.deviceConnect = None
@@ -228,7 +232,9 @@ def __auto_speak(self):
                     index = interact.interact_type
                     if index == 1:
                         self.q_msg = interact.data["msg"]
-
+                        if not config_util.config["interact"]["playSound"]: # not panel playback
+                            content = {'Topic': 'Unreal', 'Data': {'Key': 'question', 'Value': self.q_msg}}
+                            wsa_server.get_instance().add_cmd(content)
                         #fay eyes
                         fay_eyes = yolov8.new_instance()
                         if fay_eyes.get_status():# YOLO is running
@@ -251,10 +257,6 @@ def __auto_speak(self):
                         contentdb = Content_Db()    
                         contentdb.add_content('member','speak',self.q_msg)
                         wsa_server.get_web_instance().add_cmd({"panelReply": {"type":"member","content":self.q_msg}})
-                        if not config_util.config["interact"]["playSound"]: # not panel playback
-                            content = {'Topic': 'Unreal', 'Data': {'Key': 'question', 'Value': self.q_msg}}
-                            wsa_server.get_instance().add_cmd(content)
-
                         text = ''
                         textlist = []
                         self.speaking = True
@@ -304,11 +306,20 @@ def __fay(self, index):
 
     # Send mood
     def __send_mood(self):
-        while self.__running:
+         while self.__running:
             time.sleep(3)
             if not self.sleep and not config_util.config["interact"]["playSound"] and wsa_server.get_instance().isConnect:
                 content = {'Topic': 'Unreal', 'Data': {'Key': 'mood', 'Value': self.mood}}
-                wsa_server.get_instance().add_cmd(content)
+                if not self.connect:
+                    # first push after a (re)connect
+                    wsa_server.get_instance().add_cmd(content)
+                    self.connect = True
+                else:
+                    # afterwards push only when the mood value changed
+                    if self.old_mood != self.mood:
+                        wsa_server.get_instance().add_cmd(content)
+                        self.old_mood = self.mood
+
+            else:
+                self.connect = False
 
     # Update mood
     def __update_mood(self, typeIndex):
@@ -364,13 +375,13 @@ def __say(self, styleType):
                 self.speaking = False
             else:
                 util.printInfo(1, '菲菲', '({}) {}'.format(self.__get_mood_voice(), self.a_msg))
+                if not config_util.config["interact"]["playSound"]: # not panel playback
+                    content = {'Topic': 'Unreal', 'Data': {'Key': 'text', 'Value': self.a_msg}}
+                    wsa_server.get_instance().add_cmd(content)
                 MyThread(target=storer.storage_live_interact, args=[Interact('Fay', 0, {'user': 'Fay', 'msg': self.a_msg})]).start()
                 util.log(1, '合成音频...')
                 tm = time.time()
                 # also push the text out, for UE5
-                if not config_util.config["interact"]["playSound"]: # not panel playback
-                    content = {'Topic': 'Unreal', 'Data': {'Key': 'text', 'Value': self.a_msg}}
-                    wsa_server.get_instance().add_cmd(content)
                 result = self.sp.to_sample(self.a_msg, self.__get_mood_voice())
                 util.log(1, '合成音频完成. 耗时: {} ms 文件:{}'.format(math.floor((time.time() - tm) * 1000), result))
                 if result is not None:            
@@ -390,7 +401,10 @@ def __play_sound(self, file_url):
 
     def __send_or_play_audio(self, file_url, say_type):
         try:
-            audio_length = eyed3.load(file_url).info.time_secs # mp3 audio length
+            try:
+                audio_length = eyed3.load(file_url).info.time_secs # mp3 audio length
+            except Exception:
+                audio_length = 3 # fall back to 3 seconds when the mp3 cannot be parsed
             # with wave.open(file_url, 'rb') as wav_file: # wav audio length
             #     audio_length = wav_file.getnframes() / float(wav_file.getframerate())
             #     print(audio_length)
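
The `__send_mood` change above pushes the mood once after a connection is established and afterwards only when the value changes. The following is a minimal standalone sketch of that decision, not the project's code: `MoodPusher`, `should_send`, `synced`, and `last_sent` are hypothetical names introduced for illustration. Unlike the diff, this sketch also records the value sent on the first push, which is an assumption made here to avoid sending the same value twice.

```python
class MoodPusher:
    """Decides whether a mood value should be sent on a given polling tick."""

    def __init__(self):
        self.synced = False    # True once the current connection has received a value
        self.last_sent = None  # last mood value actually pushed

    def should_send(self, connected, mood):
        if not connected:
            # Forget the connection state so the next connect gets a fresh push.
            self.synced = False
            return False
        if not self.synced or mood != self.last_sent:
            self.synced = True
            self.last_sent = mood
            return True
        return False


pusher = MoodPusher()
# (connected, mood) pairs, one per polling tick
ticks = [(False, 0.0), (True, 0.0), (True, 0.0), (True, 0.5), (False, 0.5), (True, 0.5)]
sent = [pusher.should_send(connected, mood) for connected, mood in ticks]
print(sent)
```

Keeping the decision in a small pure function makes the throttling testable without a running WebSocket server.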

diff --git a/core/wsa_server.py b/core/wsa_server.py
@@ -25,51 +25,57 @@ def __init__(self, host='0.0.0.0', port=10000):
     def __del__(self):
         self.stop_server()
 
+    # Receive handling
     async def __consumer_handler(self, websocket, path):
         async for message in websocket:
             await self.__consumer(message)
 
+    # Send handling
     async def __producer_handler(self, websocket, path):
         while self.__running:
             await asyncio.sleep(0.000001)
             message = await self.__producer()
             if message:
                 await websocket.send(message)
-                # util.log('发送 {}'.format(message))
+
 
     async def __handler(self, websocket, path):
-        isConnect = True
+        self.isConnect = True
         util.log(1,"websocket连接上:{}".format(self.__port))
         self.on_connect_handler()
         consumer_task = asyncio.ensure_future(self.__consumer_handler(websocket, path))
         producer_task = asyncio.ensure_future(self.__producer_handler(websocket, path))
         done, self.__pending = await asyncio.wait([consumer_task, producer_task], return_when=asyncio.FIRST_COMPLETED, )
         for task in self.__pending:
             task.cancel()
-            isConnect = False
+            self.isConnect = False
             util.log(1,"websocket连接断开:{}".format(self.__port))
-
-    # Receive handling
+
     async def __consumer(self, message):
         self.on_revice_handler(message)
-
-    # Send handling
+
     async def __producer(self):
         if len(self.__listCmd) > 0:
-            return self.__listCmd.pop(0)
+            message = self.on_send_handler(self.__listCmd.pop(0))
+            return message
         else:
             return None
 
 
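-    #Edit by xszyou on 20230113: subclass this class to implement the server's receive handling logic
+    #Edit by xszyou on 20230113: subclass this class to implement the server's post-receive handling logic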
     @abstractmethod
     def on_revice_handler(self, message):
         pass
+
     #Edit by xszyou on 20230114: subclass this class to implement the server's connection handling logic
     @abstractmethod
     def on_connect_handler(self):
         pass
 
+    #Edit by xszyou on 20230804: subclass this class to implement the server's pre-send handling logic
+    @abstractmethod
+    def on_send_handler(self, message):
+        return message
 
     # Create the server
     def __connect(self):
@@ -98,7 +104,7 @@ def start_server(self):
     # Stop the service
     def stop_server(self):
         self.__running = False
-        isConnect = False
+        self.isConnect = False
         if self.__server is None:
             return
         self.__server.ws_server.close()
@@ -114,6 +120,7 @@ def stop_server(self):
         except BaseException as e:
             print("Error: {}".format(e))
 
+# Digital human server
 class HumanServer(MyServer):
     def __init__(self, host='0.0.0.0', port=10000):
         super().__init__(host, port)
@@ -124,6 +131,11 @@ def on_revice_handler(self, message):
     def on_connect_handler(self):
         pass
 
+    def on_send_handler(self, message):
+        # util.log(1, '向human发送 {}'.format(message))
+        return message
+
+# UI server
 class WebServer(MyServer):
     def __init__(self, host='0.0.0.0', port=10000):
         super().__init__(host, port)
@@ -134,6 +146,10 @@ def on_revice_handler(self, message):
     def on_connect_handler(self):
         self.add_cmd({"panelMsg": "使用提示：直播，请关闭麦克风。连接数字人，请关闭面板播放。"})
 
+    def on_send_handler(self, message):
+        return message
+
+# Test
 class TestServer(MyServer):
     def __init__(self, host='0.0.0.0', port=10000):
         super().__init__(host, port)
@@ -143,9 +159,12 @@ def on_revice_handler(self, message):
 
     def on_connect_handler(self):
         print("连接上了")
+
+    def on_send_handler(self, message):
+        return message
 
 
-
+# Singleton
 
 __instance: MyServer = None
 __web_instance: MyServer = None
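
The `on_send_handler` hook added above lets each server subclass inspect or transform a queued message just before it goes out. Here is a simplified, synchronous sketch of that pattern under stated assumptions: `BaseServer`, `LoggingServer`, `next_message`, and `sent_log` are hypothetical names, and the asyncio and WebSocket parts of the real class are left out. Only `add_cmd` and `on_send_handler` come from the diff.

```python
from abc import ABC, abstractmethod


class BaseServer(ABC):
    """Queues outgoing messages and passes each one through a pre-send hook."""

    def __init__(self):
        self._queue = []

    def add_cmd(self, content):
        self._queue.append(content)

    def next_message(self):
        # Mirrors the producer step: pop the oldest message and let the hook see it.
        if not self._queue:
            return None
        return self.on_send_handler(self._queue.pop(0))

    @abstractmethod
    def on_send_handler(self, message):
        """Return the message to send (possibly modified)."""


class LoggingServer(BaseServer):
    """Example subclass that records every message before it is sent."""

    def __init__(self):
        super().__init__()
        self.sent_log = []

    def on_send_handler(self, message):
        self.sent_log.append(message)
        return message


server = LoggingServer()
server.add_cmd({'Topic': 'Unreal', 'Data': {'Key': 'mood', 'Value': 0.5}})
first = server.next_message()
second = server.next_message()
print(first, second, len(server.sent_log))
```

Because the hook is abstract, every subclass has to state what it does with outgoing messages, even if that is only returning them unchanged.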

diff --git a/fay_booter.py b/fay_booter.py
@@ -6,11 +6,12 @@
 from scheduler.thread_manager import MyThread
 from utils import util, config_util, stream_util, ngrok_util
 from core.wsa_server import MyServer
+from scheduler.thread_manager import MyThread
 
 feiFei: FeiFei = None
 recorderListener: Recorder = None
 
-__running = True
+__running = False
 
 # Record microphone audio input and pass it to aliyun
 class RecorderListener(Recorder):
@@ -19,6 +20,7 @@ def __init__(self, device, fei):
         self.__device = device
         self.__RATE = 16000
         self.__FORMAT = pyaudio.paInt16
+        self.__running = False
 
         super().__init__(fei)
 
@@ -39,8 +41,15 @@ def get_stream(self):
             util.log(1, '请检查设备是否有误，再重新启动!')
             return
         self.stream = self.paudio.open(input_device_index=device_id, rate=self.__RATE, format=self.__FORMAT, channels=channels, input=True)
+        self.__running = True
+        MyThread(target=self.__pyaudio_clear).start()
         return self.stream
 
+    def __pyaudio_clear(self):
+        while self.__running:
+            time.sleep(30)
+
+
     def __findInternalRecordingDevice(self, p):
         for i in range(p.get_device_count()):
             devInfo = p.get_device_info_by_index(i)
@@ -53,6 +62,7 @@ def __findInternalRecordingDevice(self, p):
 
     def stop(self):
         super().stop()
+        self.__running = False
         try:
             self.stream.stop_stream()
             self.stream.close()
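
The `__pyaudio_clear` loop above checks a boolean flag between 30-second sleeps, so the thread may keep running for up to 30 seconds after `stop()` clears the flag. As an alternative to that flag (not what the commit does), a `threading.Event` lets the worker wake up as soon as stop is requested. `PeriodicWorker` and its members are hypothetical names used only for this sketch.

```python
import threading
import time


class PeriodicWorker:
    """Runs a task every `interval` seconds until stop() is called."""

    def __init__(self, interval, task):
        self._interval = interval
        self._task = task
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait returns True once stop() sets the event, False on timeout,
        # so the loop body runs only after a full interval without a stop request.
        while not self._stop.wait(self._interval):
            self._task()

    def stop(self):
        self._stop.set()
        self._thread.join()


calls = []
worker = PeriodicWorker(30, lambda: calls.append(1))
worker.start()
started = time.monotonic()
worker.stop()  # returns promptly instead of waiting out the 30-second interval
elapsed = time.monotonic() - started
print(elapsed < 5, calls)
```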