Docs to Transcribe Streaming Audio from Microphone and Performing Speech Recognition for Speech v2 API #11389
Comments
I have the same issue with speech-to-text-v2. I'll try to provide a bit more context: I have multiple IoT devices in different locations. Some work and some don't, and I have no idea why or what the difference is. Software and hardware are the same on all devices.
Note: I removed the IPv6 address from the error message.
Output of pip3 freeze | grep google:
I ran into this same problem. As in the examples, I feed audio data via a generator:

    def generator(self):
        """Act as a blocking generator for buffered audio data.

        When no data is available, the generator blocks until new data arrives.
        It uses queue.Queue, so it is thread-safe.

        Yields:
            bytes: the buffered audio
        """
        while not self.closed:
            # use a blocking get
            chunk = self._buff.get()
            # return when the stop signal (None) is detected
            if chunk is None:
                return
            data = [chunk]
            # consume the rest of the queue
            while True:
                try:
                    chunk = self._buff.get(block=False)
                    if chunk is None:
                        return
                    data.append(chunk)
                except queue.Empty:
                    break
            # yield result
            yield b"".join(data)
The documentation here states that 25 KB is the maximum chunk size. I attempted a fix:

    # yield result, split into chunks of at most 25600 bytes
    bytes_chunk = b"".join(data)
    for x in range(0, len(bytes_chunk), 25600):
        yield bytes_chunk[x:x + 25600]

This gets rid of that exact error, but then we just get another error:
Note: I removed the IPv6 address from the error message.
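For anyone who wants to try the chunk-splitting workaround in isolation, here is a minimal, self-contained sketch of the buffered generator with the 25600-byte cap applied. The class name `BufferedAudioStream`, the `write` and `close` methods, and the `MAX_CHUNK_BYTES` constant are hypothetical names chosen for illustration; they are not part of the Speech client library. It only addresses the chunk-size error, not the second error mentioned above.

```python
import queue

# Assumption: 25600 bytes is the per-chunk limit quoted in the API error message.
MAX_CHUNK_BYTES = 25600


class BufferedAudioStream:
    """Hypothetical thread-safe buffer between an audio callback and a consumer."""

    def __init__(self):
        self._buff = queue.Queue()

    def write(self, data: bytes) -> None:
        """Store raw audio bytes; meant to be called from the audio callback."""
        self._buff.put(data)

    def close(self) -> None:
        """Signal the generator to stop by enqueueing None."""
        self._buff.put(None)

    def generator(self):
        """Yield buffered audio in pieces of at most MAX_CHUNK_BYTES bytes."""
        while True:
            # Block until audio or the stop signal arrives.
            chunk = self._buff.get()
            if chunk is None:
                return
            data = [chunk]
            stop = False
            # Drain whatever else is already queued, without blocking.
            while True:
                try:
                    chunk = self._buff.get(block=False)
                except queue.Empty:
                    break
                if chunk is None:
                    stop = True
                    break
                data.append(chunk)
            joined = b"".join(data)
            # Split the joined buffer so no yielded piece exceeds the limit.
            for start in range(0, len(joined), MAX_CHUNK_BYTES):
                yield joined[start:start + MAX_CHUNK_BYTES]
            if stop:
                return
```

One deliberate difference from the snippet above: when the stop signal shows up while draining the queue, this version still yields the audio collected so far before returning, instead of dropping it.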
I searched all over the internet, but all I could find were other people with the same problem. Speech v2 was released recently, and there is sample code for various tasks. The most relevant sample is streaming speech recognition on a local file.
Whenever I try to implement it for a microphone, as we did with speech_v1p1beta1, an error occurs. The last error I got stuck on is:
Google Speech Error: 400 Audio chunk can be of a a maximum of 25600 bytes. Received audio of 253952 bytes instead.
I assume it occurs because I cannot define a chunk size for the incoming microphone audio or split it into chunks.
The docs need sample code for streaming audio from a microphone and performing speech recognition with the Speech v2 API.
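As a rough aid for sizing the microphone buffer against the 25600-byte limit from the error message, the arithmetic can be sketched as follows. The function names are hypothetical, and the 16-bit mono audio at 16 kHz used in the example is only an assumption; substitute the values of the actual capture setup.

```python
# Assumption: 25600 bytes is the per-chunk limit quoted in the API error message.
MAX_CHUNK_BYTES = 25600


def max_frames_per_chunk(sample_width_bytes: int, channels: int,
                         limit: int = MAX_CHUNK_BYTES) -> int:
    """Return how many whole audio frames fit into one chunk of `limit` bytes."""
    bytes_per_frame = sample_width_bytes * channels
    return limit // bytes_per_frame


def chunk_duration_seconds(frames: int, sample_rate_hz: int) -> float:
    """Return the duration in seconds covered by `frames` audio frames."""
    return frames / sample_rate_hz


# Example under the stated assumption: 16-bit (2-byte) mono samples at 16 kHz.
frames = max_frames_per_chunk(sample_width_bytes=2, channels=1)
seconds = chunk_duration_seconds(frames, sample_rate_hz=16000)
```

Many audio libraries expose a block-size or frames-per-buffer setting that could be kept at or below this frame count. That alone may not be enough if several captured blocks are joined before sending, so splitting on the sending side is probably still needed.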