Tracking Speech API development. #2522

Closed
14 tasks done
daspecster opened this issue Oct 10, 2016 · 10 comments
Labels: api: speech (Issues related to the Speech-to-Text API.)

@daspecster
Contributor

daspecster commented Oct 10, 2016

These are the items required to reach feature completion.

Cleanup

Future Updates:

@daspecster daspecster added the api: speech Issues related to the Speech-to-Text API. label Oct 10, 2016
@daspecster daspecster self-assigned this Oct 10, 2016
@daspecster
Contributor Author

@dhermes I commented on your gist from today but I'll copy it here as well...
Ref: https://gist.github.com/dhermes/09c964d6d27003ae817b650424fda7c3


Thank you for this!

I have another question.
As you saw in your output here, the final result has a response attribute.

In the sample, the response is unpacked via:

response = cloud_speech_pb2.AsyncRecognizeResponse()
operation.response.Unpack(response)

But that has to be done after the Operation is completed.
How would we handle that in google-cloud-python?

The point of async is not to block, right? If our lib has to wait until the operation is complete before the data can be unpacked, then it would block, right?

One thought I had was to add a helper of some kind that unpacks the data for us.

import time
from google.cloud import speech
from google.cloud.speech.encoding import Encoding

client = speech.Client()
sample = client.sample(
    source_uri='gs://ferrous-arena-my-test-bucket/sample.raw',
    encoding=Encoding.LINEAR16,
    sample_rate=16000)

operation = client.async_recognize(sample, max_alternatives=2)

retry_count = 10
while retry_count > 0 and not operation.complete:
    retry_count -= 1
    time.sleep(1)

    operation.poll()  # API call

for result in speech.unpack_async(operation):  # The proposed helper `unpack_async`.
    print('Result:')
    for alternative in result.alternatives:
        print(u'  ({}): {}'.format(
            alternative.confidence, alternative.transcript))

But otherwise I'm not sure how to get the data without blocking unless we do some kind of weird pubsub design.
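
The retry loop above could be factored into a small helper. What follows is only a minimal sketch, not the library's API: `FakeOperation` is a hypothetical stand-in for the real operation object, and `poll_until_complete` is a made-up name. Only the `complete` attribute and the `poll()` method mirror the snippet above.

```python
import time


class FakeOperation(object):
    """Hypothetical stand-in for a long-running operation (not a real class)."""

    def __init__(self, polls_needed):
        self._polls_left = polls_needed
        self.complete = False

    def poll(self):
        """Simulate one API call; the operation finishes after enough calls."""
        self._polls_left -= 1
        if self._polls_left <= 0:
            self.complete = True
        return self.complete


def poll_until_complete(operation, retries=10, delay=1.0, sleep=time.sleep):
    """Poll `operation` at most `retries` times; return True if it finished."""
    for _ in range(retries):
        if operation.complete:
            break
        sleep(delay)
        operation.poll()  # One (simulated) API call per iteration.
    return operation.complete
```

The helper still blocks the calling thread for up to `retries * delay` seconds, so it only tidies up the loop; it does not make the call non-blocking.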

@daspecster
Contributor Author

Scratch that... we could just parse it in Operation.poll(), right? I think I would have to either override google.cloud.core.operation.Operation.poll or add some kind of polling proxy method.
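
A rough sketch of the "parse it in poll()" idea, under heavy assumptions: `BaseOperation` below is a hypothetical stand-in for the core operation class (it replays canned states instead of calling the API), `SpeechOperation` and its `results` attribute are made-up names, and plain dicts take the place of the protobuf response.

```python
class BaseOperation(object):
    """Hypothetical stand-in for the core operation class."""

    def __init__(self, states):
        self._states = iter(states)  # Canned responses, one per poll() call.
        self.complete = False
        self.response = None

    def poll(self):
        """Consume the next canned state, as if it came from the API."""
        state = next(self._states)
        self.complete = state.get('done', False)
        self.response = state.get('response')
        return self.complete


class SpeechOperation(BaseOperation):
    """Hypothetical subclass that parses the response once it is available."""

    results = None

    def poll(self):
        complete = super(SpeechOperation, self).poll()
        if complete and self.response is not None:
            # Flatten the response into (transcript, confidence) pairs.
            self.results = [
                (alt['transcript'], alt['confidence'])
                for result in self.response['results']
                for alt in result['alternatives']
            ]
        return complete
```

With this shape the caller never unpacks anything by hand: `results` stays `None` until a `poll()` call reports completion, and is populated as a side effect of that same call.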

@dhermes
Contributor

dhermes commented Oct 28, 2016

@daspecster It's not up to us to get the data; just give the user the Operation and let them decide how to poll.
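
As one illustration of leaving the polling strategy to the user, a caller who does not want to block could poll on a background thread. This is a sketch under assumptions, not anything the library provides: `FakeOperation` and `poll_in_background` are hypothetical names, and only `complete` and `poll()` are taken from the discussion above.

```python
import threading
import time


class FakeOperation(object):
    """Hypothetical operation that completes after a fixed number of polls."""

    def __init__(self, polls_needed):
        self._polls_left = polls_needed
        self.complete = False

    def poll(self):
        self._polls_left -= 1
        self.complete = self._polls_left <= 0
        return self.complete


def poll_in_background(operation, on_done, interval=1.0):
    """Poll on a daemon thread and call `on_done(operation)` when finished."""

    def worker():
        while not operation.complete:
            time.sleep(interval)
            operation.poll()
        on_done(operation)

    thread = threading.Thread(target=worker)
    thread.daemon = True  # Do not keep the process alive just for polling.
    thread.start()
    return thread
```

The main thread is free to do other work after `poll_in_background` returns; the callback runs on the worker thread, so it should not touch state that is not thread-safe.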

@fehrenbacher

Is this available yet? I've installed google-cloud 0.20.0, but can't find the google.cloud.speech.client.Client class. I've also found this page for a separate google-cloud-speech library, but pip can't find any actual releases there. The documentation sure makes it sound like this is available to use now...

@gw00207

gw00207 commented Nov 9, 2016

@fehrenbacher yes, that documentation is confusing. hopefully a message is added soon to explain: #2620

@daspecster
Contributor Author

@fehrenbacher it's not released yet; however, if you install from source you can play with the API.

As you can see in the first link, it's at the "Planning" stage. That package is a placeholder.

The documentation message hasn't gone up because there hasn't been a release yet. We're working on a better release strategy as well.

@fehrenbacher

K thanks for clearing that up!

@daspecster
Contributor Author

@fehrenbacher having the docs out there was my mistake. I think I have a better process for next time. If you decide to install from source and play with it, please let me know if you run into any issues!

I have a handful of things I'm working to resolve this week and then I'm hoping we will be able to release it.

See: https://github.com/GoogleCloudPlatform/google-cloud-python/issues?q=is%3Aissue+is%3Aopen+label%3Aspeech

@gw00207

gw00207 commented Nov 11, 2016

@daspecster nice :-) looking forward to it!

@daspecster
Contributor Author

Closing this since everything is complete and I opened #2842 to track the last few little changes.

4 participants