Speech GAPIC to master (#3607)
* Vendor the GAPIC for Speech.

* Speech Partial Veneer (#3483)

* Update to docs based on @dhermes catch.

* Fix incorrect variable.

* Fix the docs.

* Style fixes to unit tests.

* More PR review from me.
lukesneeringer authored and dhermes committed Jul 14, 2017
1 parent 66a9258 commit 401bf40
Showing 35 changed files with 2,589 additions and 288 deletions.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -12,7 +12,7 @@
resource-manager/api
runtimeconfig/usage
spanner/usage
speech/usage
speech/index
error-reporting/usage
monitoring/usage
logging/usage
7 changes: 0 additions & 7 deletions docs/speech/alternative.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/client.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/encoding.rst

This file was deleted.

6 changes: 6 additions & 0 deletions docs/speech/gapic/api.rst
@@ -0,0 +1,6 @@
Speech Client API
=================

.. automodule:: google.cloud.speech_v1
:members:
:inherited-members:
5 changes: 5 additions & 0 deletions docs/speech/gapic/types.rst
@@ -0,0 +1,5 @@
Speech Client Types
===================

.. automodule:: google.cloud.speech_v1.types
:members:
228 changes: 140 additions & 88 deletions docs/speech/usage.rst → docs/speech/index.rst
@@ -1,49 +1,41 @@
######
Speech
======

.. toctree::
:maxdepth: 2
:hidden:

client
encoding
operation
result
sample
alternative
######

The `Google Speech`_ API enables developers to convert audio to text.
The API recognizes over 80 languages and variants, to support your global user
base.

.. _Google Speech: https://cloud.google.com/speech/docs/getting-started

Client
------

:class:`~google.cloud.speech.client.Client` objects provide a
Authentication and Configuration
--------------------------------

:class:`~google.cloud.speech_v1.SpeechClient` objects provide a
means to configure your application. Each instance holds
an authenticated connection to the Cloud Speech Service.

For an overview of authentication in ``google-cloud-python``, see
:doc:`/core/auth`.

Assuming your environment is set up as described in that document,
create an instance of :class:`~google.cloud.speech.client.Client`.
create an instance of :class:`~.speech_v1.SpeechClient`.

.. code-block:: python
>>> from google.cloud import speech
>>> client = speech.Client()
>>> client = speech.SpeechClient()
Asynchronous Recognition
------------------------

The :meth:`~google.cloud.speech.Client.long_running_recognize` sends audio
data to the Speech API and initiates a Long Running Operation. Using this
operation, you can periodically poll for recognition results. Use asynchronous
requests for audio data of any duration up to 80 minutes.
The :meth:`~.speech_v1.SpeechClient.long_running_recognize` method
sends audio data to the Speech API and initiates a Long Running Operation.

Using this operation, you can periodically poll for recognition results.
Use asynchronous requests for audio data of any duration up to 80 minutes.

See: `Speech Asynchronous Recognize`_
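
The polling pattern described above can be sketched without touching the network. This is a minimal illustration, not the library's implementation: ``FakeOperation`` and ``wait_for`` are hypothetical names, and the ``poll()`` method and ``complete`` attribute are assumptions modeled on the retry loop in the example that follows.

```python
import time


class FakeOperation:
    """Hypothetical stand-in for a long-running operation handle."""

    def __init__(self, polls_needed):
        self._polls_left = polls_needed
        self.complete = False

    def poll(self):
        # Pretend the service finishes after a fixed number of polls.
        self._polls_left -= 1
        if self._polls_left <= 0:
            self.complete = True


def wait_for(operation, retries=100, delay=0.0):
    """Poll until the operation completes or the retry budget runs out."""
    while retries > 0 and not operation.complete:
        retries -= 1
        time.sleep(delay)
        operation.poll()
    return operation.complete
```

Bounding the loop with a retry budget keeps a stalled operation from blocking the caller forever; in real use a non-zero ``delay`` between polls would be appropriate.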

@@ -52,13 +44,16 @@ See: `Speech Asynchronous Recognize`_
>>> import time
>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=44100)
>>> operation = sample.long_running_recognize(
... language_code='en-US',
... max_alternatives=2,
>>> client = speech.SpeechClient()
>>> operation = client.long_running_recognize(
... audio=speech.types.RecognitionAudio(
... uri='gs://my-bucket/recording.flac',
... ),
... config=speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... sample_rate_hertz=44100,
... ),
... )
>>> retry_count = 100
>>> while retry_count > 0 and not operation.complete:
@@ -80,7 +75,7 @@ See: `Speech Asynchronous Recognize`_
Synchronous Recognition
-----------------------

The :meth:`~google.cloud.speech.Client.recognize` method converts speech
The :meth:`~.speech_v1.SpeechClient.recognize` method converts speech
data to text and returns alternative text transcriptions.

This example uses ``language_code='en-GB'`` to better recognize a dialect from
@@ -89,12 +84,17 @@ Great Britain.
.. code-block:: python
>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> results = sample.recognize(
... language_code='en-GB', max_alternatives=2)
>>> client = speech.SpeechClient()
>>> results = client.recognize(
... audio=speech.types.RecognitionAudio(
... uri='gs://my-bucket/recording.flac',
... ),
... config=speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-GB',
... sample_rate_hertz=44100,
... ),
... )
>>> for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
@@ -112,14 +112,17 @@ Example of using the profanity filter.
.. code-block:: python
>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> results = sample.recognize(
... language_code='en-US',
... max_alternatives=1,
... profanity_filter=True,
>>> client = speech.SpeechClient()
>>> results = client.recognize(
... audio=speech.types.RecognitionAudio(
... uri='gs://my-bucket/recording.flac',
... ),
... config=speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... profanity_filter=True,
... sample_rate_hertz=44100,
... ),
... )
>>> for result in results:
... for alternative in result.alternatives:
@@ -137,15 +140,20 @@ words to the vocabulary of the recognizer.
.. code-block:: python
>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> hints = ['hi', 'good afternoon']
>>> results = sample.recognize(
... language_code='en-US',
... max_alternatives=2,
... speech_contexts=hints,
>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> results = client.recognize(
... audio=speech.types.RecognitionAudio(
... uri='gs://my-bucket/recording.flac',
... ),
... config=speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... sample_rate_hertz=44100,
... speech_contexts=[speech.types.SpeechContext(
... phrases=['hi', 'good afternoon'],
... )],
... ),
... )
>>> for result in results:
... for alternative in result.alternatives:
@@ -160,7 +168,7 @@ words to the vocabulary of the recognizer.
Streaming Recognition
---------------------

The :meth:`~google.cloud.speech.Client.streaming_recognize` method converts
The :meth:`~.speech_v1.SpeechClient.streaming_recognize` method converts
speech data to possible text alternatives on the fly.
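
The streaming examples in this section send the whole file as a single request. As a rough sketch of the alternative, the generator below reads a binary stream in fixed-size pieces; each piece could then be wrapped in its own ``StreamingRecognizeRequest(audio_content=chunk)``. The helper name ``audio_chunks`` and the 4096-byte chunk size are assumptions for illustration, and an in-memory buffer stands in for an audio file.

```python
import io


def audio_chunks(stream, chunk_size=4096):
    """Yield successive byte chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# In-memory bytes stand in for an opened audio file.
fake_audio = io.BytesIO(b'\x00' * 10000)
sizes = [len(chunk) for chunk in audio_chunks(fake_audio)]
```

Because the helper is a generator, audio is read lazily, which suits sources such as a microphone where the full recording is not available up front.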

.. note::
@@ -170,18 +178,27 @@ speech data to possible text alternatives on the fly.

.. code-block:: python
>>> import io
>>> from google.cloud import speech
>>> client = speech.Client()
>>> with open('./hello.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=16000)
... results = sample.streaming_recognize(language_code='en-US')
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... sample_rate_hertz=44100,
... )
>>> with io.open('./hello.wav', 'rb') as stream:
... requests = [speech.types.StreamingRecognizeRequest(
... audio_content=stream.read(),
... )]
>>> results = client.streaming_recognize(
... config=speech.types.StreamingRecognitionConfig(config=config),
... requests=requests,
... )
>>> for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
====================
transcript: hello thank you for using Google Cloud platform
confidence: 0.927983105183
@@ -193,20 +210,36 @@ until the client closes the output stream or until the maximum time limit has
been reached.
If you only want to recognize a single utterance you can set
``single_utterance`` to :data:`True` and only one result will be returned.
``single_utterance`` to :data:`True` and only one result will be returned.
See: `Single Utterance`_
.. code-block:: python
>>> with open('./hello_pause_goodbye.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=16000)
... results = sample.streaming_recognize(
... language_code='en-US',
... single_utterance=True,
... )
>>> import io
>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... sample_rate_hertz=44100,
... )
>>> with io.open('./hello-pause-goodbye.wav', 'rb') as stream:
... requests = [speech.types.StreamingRecognizeRequest(
... audio_content=stream.read(),
... )]
>>> results = client.streaming_recognize(
... config=speech.types.StreamingRecognitionConfig(
... config=config,
... single_utterance=True,
... ),
... requests=requests,
... )
>>> for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
@@ -221,22 +254,31 @@ If ``interim_results`` is set to :data:`True`, interim results
.. code-block:: python
>>> import io
>>> from google.cloud import speech
>>> client = speech.Client()
>>> with open('./hello.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate=16000)
... results = sample.streaming_recognize(
... interim_results=True,
... language_code='en-US',
... )
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
... print('is_final:' + str(result.is_final))
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
... encoding='LINEAR16',
... language_code='en-US',
... sample_rate_hertz=44100,
... )
>>> with io.open('./hello.wav', 'rb') as stream:
... requests = [speech.types.StreamingRecognizeRequest(
... audio_content=stream.read(),
... )]
>>> results = client.streaming_recognize(
... config=speech.types.StreamingRecognitionConfig(
... config=config,
... interim_results=True,
... ),
... requests=requests,
... )
>>> for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
... print('is_final:' + str(result.is_final))
====================
'he'
None
@@ -254,3 +296,13 @@ If ``interim_results`` is set to :data:`True`, interim results
.. _Single Utterance: https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#streamingrecognitionconfig
.. _sync_recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/syncrecognize
.. _Speech Asynchronous Recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/asyncrecognize
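
When interim results are enabled, a caller often wants only the transcripts flagged as final. The sketch below shows one way to filter them. ``Alternative`` and ``Result`` are hypothetical stand-ins that mirror only the attributes printed in the interim-results example (``transcript``, ``confidence``, ``is_final``); they are not the library's types, and the sample values are made up.

```python
from collections import namedtuple

# Hypothetical stand-ins that mirror only the attributes used in the example.
Alternative = namedtuple('Alternative', ['transcript', 'confidence'])
Result = namedtuple('Result', ['alternatives', 'is_final'])


def final_transcripts(results):
    """Return the top transcript of every result marked as final."""
    return [
        result.alternatives[0].transcript
        for result in results
        if result.is_final and result.alternatives
    ]


# Made-up sample data: two interim results followed by a final one.
sample_results = [
    Result([Alternative('he', None)], False),
    Result([Alternative('hell', None)], False),
    Result([Alternative('hello', 0.97)], True),
]
transcripts = final_transcripts(sample_results)
```

Interim results carry no confidence here, matching the ``None`` shown in the example output, so filtering on ``is_final`` also avoids formatting a missing confidence value.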
API Reference
-------------
.. toctree::
:maxdepth: 2
gapic/api
gapic/types
7 changes: 0 additions & 7 deletions docs/speech/operation.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/result.rst

This file was deleted.
