Add speech beta samples #1151
…rd level confidence
|
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.
|
nnegrey left a comment
A couple of small things here and there; otherwise it looks good. Thanks for updating the code.
speech/cloud-client/pom.xml
Outdated
| <groupId>com.google.cloud</groupId> | ||
| <artifactId>google-cloud-speech</artifactId> | ||
| <version>0.52.0-alpha</version> | ||
| <version>0.52.1-alpha-SNAPSHOT</version> |
Update lib version when released.
| .setSampleRateHertz(8000) | ||
| .setEnableSpeakerDiarization(true) | ||
| .setDiarizationSpeakerCount(2) | ||
| .setEnableAutomaticPunctuation(true) |
Let's remove the automatic punctuation for this sample, unless it's needed.
+1 Especially in a sample, it should only have the minimum needed to get the feature you're demonstrating working.
| .setEnableSpeakerDiarization(true) | ||
| .setDiarizationSpeakerCount(2) | ||
| .setEnableAutomaticPunctuation(true) | ||
| .setModel("phone_call") |
Are models required here? If not, remove.
| "Speaker Tag : %s \n", | ||
| alternative.getWords((alternative.getWordsCount() - 1)).getSpeakerTag()); | ||
| System.out.format( | ||
| "Word: %s\n\n", alternative.getWords((alternative.getWordsCount() - 1)).getWord()); |
Lines 838-841:
Is this printing out the last speaker and their last word?
Is it possible to do something like this instead? (Not sure what the results look like.)
Speaker Tag ###: Hey, how are you?
Speaker Tag ***: I'm doing good.
Thoughts?
It is printing out the last speaker and the last word. The words array contains the entire transcript up until that point.
The "Speaker Tag ###: Hey, how are you?" format definitely makes more sense. Will switch it out.
Can you add this as a comment? That is, the explanation of why you're getting the last word of the alternative instead of, say, all the words or the first word.
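To make the suggested "Speaker Tag ###: Hey, how are you?" output concrete, here is a minimal sketch of grouping consecutive words by speaker tag. It does not use the Speech client library: the `Word` class is a hypothetical stand-in for the per-word result that the sample reads through `alternative.getWords(i).getWord()` and `getSpeakerTag()`, and `SpeakerTurns` and `formatTurns` are illustrative names, not part of the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SpeakerTurns {

  /** Hypothetical stand-in for the API's per-word result (word text plus speaker tag). */
  static final class Word {
    final String text;
    final int speakerTag;

    Word(String text, int speakerTag) {
      this.text = text;
      this.speakerTag = speakerTag;
    }
  }

  /** Groups consecutive words that share a speaker tag into one line per turn. */
  static List<String> formatTurns(List<Word> words) {
    List<String> lines = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int currentTag = -1;
    for (Word w : words) {
      // A change of speaker tag closes the current turn.
      if (current.length() > 0 && w.speakerTag != currentTag) {
        lines.add(String.format("Speaker Tag %d: %s", currentTag, current));
        current.setLength(0);
      }
      if (current.length() > 0) {
        current.append(' ');
      }
      current.append(w.text);
      currentTag = w.speakerTag;
    }
    // Flush the final turn, if any.
    if (current.length() > 0) {
      lines.add(String.format("Speaker Tag %d: %s", currentTag, current));
    }
    return lines;
  }

  public static void main(String[] args) {
    List<Word> words = Arrays.asList(
        new Word("Hey,", 1), new Word("how", 1), new Word("are", 1), new Word("you?", 1),
        new Word("I'm", 2), new Word("doing", 2), new Word("good.", 2));
    for (String line : formatTurns(words)) {
      System.out.println(line);
    }
  }
}
```

Because the words array holds the whole transcript so far, a loop like this would only need to run over the final result rather than over every intermediate one; whether that matches the API's actual response shape is an assumption here.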
| .setEnableSpeakerDiarization(true) | ||
| .setDiarizationSpeakerCount(2) | ||
| .setEnableAutomaticPunctuation(true) | ||
| .setModel("phone_call") |
Same as above: remove setEnableAutomaticPunctuation and setModel (if possible).
| .setSampleRateHertz(44100) | ||
| .setAudioChannelCount(2) | ||
| .setEnableSeparateRecognitionPerChannel(true) | ||
| .setEnableAutomaticPunctuation(true) |
Remove setEnableAutomaticPunctuation (if possible)
| RecognitionAudio recognitionAudio = | ||
| RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build(); | ||
|
|
||
| // Configure request to enable enhanced models |
// Configure request to enable multiple channels
| .setSampleRateHertz(44100) | ||
| .setAudioChannelCount(2) | ||
| .setEnableSeparateRecognitionPerChannel(true) | ||
| .setEnableAutomaticPunctuation(true) |
| try (SpeechClient speechClient = SpeechClient.create()) { | ||
| RecognitionAudio recognitionAudio = | ||
| RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build(); | ||
| // Configure request to enable multiple channels |
| public static void transcribeWordLevelConfidenceGcs(String gcsUri) throws Exception { | ||
| try (SpeechClient speechClient = SpeechClient.create()) { | ||
|
|
||
| // Configure request to enable multiple channels |
| StreamingRecognitionConfig config = StreamingRecognitionConfig.newBuilder() | ||
| .setConfig(recConfig) | ||
| .build(); | ||
| RecognitionConfig recConfig = |
It'd be helpful if you could do these formatting changes in a separate PR, so that reviewers don't have to go through a giant diff of unrelated changes.
Sorry, given my access issues, I'm not sure how it may affect creating a separate PR. Putting everything together in one for now.
|
All changes done. Please let me know if this is good to merge. |
| System.out.format("Transcript : %s\n", alternative.getTranscript()); | ||
| // The words array contains the entire transcript up until that point. | ||
| //Referencing the last spoken word to get the associated Speaker tag | ||
| System.out.format("Speaker Tag %s:%s\n", |
Add a space between %s:%s --> %s: %s
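As a small illustration of the last-word approach described in the code comments above, combined with the requested `%s: %s` spacing, the sketch below formats one line from the final word's speaker tag. `Word` and `LastSpeakerLine` are hypothetical stand-ins rather than library types, and printing the transcript as the second argument is an assumption, since the diff line is cut off before its arguments.

```java
import java.util.Arrays;
import java.util.List;

class LastSpeakerLine {

  /** Hypothetical stand-in for a per-word result carrying a speaker tag. */
  static final class Word {
    final String text;
    final int speakerTag;

    Word(String text, int speakerTag) {
      this.text = text;
      this.speakerTag = speakerTag;
    }
  }

  /** Labels a transcript with the speaker tag of the last word in the list. */
  static String describe(String transcript, List<Word> words) {
    if (words.isEmpty()) {
      throw new IllegalArgumentException("words must not be empty");
    }
    // The words list holds everything spoken so far, so the last entry
    // identifies the most recent speaker.
    Word last = words.get(words.size() - 1);
    // Note the space after the colon, as requested in the review.
    return String.format("Speaker Tag %s: %s", last.speakerTag, transcript);
  }

  public static void main(String[] args) {
    List<Word> words = Arrays.asList(new Word("I'm", 2), new Word("good.", 2));
    System.out.println(describe("I'm good.", words));
  }
}
```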
|
I signed it! |
|
All changes have been made. |
|
Checking if this triggers the CLA bot |
|
@nnegrey - Updated the client library too. Please let me know if this is good to merge |
|
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
|
Per an email from OSPO, nirupa-kumar has a signed CLA. |
|
* Speech beta samples: Diarization, Multi-channel, Multi-language and Word level confidence
* Update client library
* Updates after review
* Updates after review : Please let this be the last one :)
* Update to released client library
* Update to Inc.
* Update to Inc.
* Update to reference bucket for files
Diarization, Multi-channel, Multi-language, and Word-level confidence
@nnegrey Please review