Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Rough Draft of README #157

Merged
merged 4 commits into from
Aug 19, 2020
Merged

WIP - Rough Draft of README #157

merged 4 commits into from
Aug 19, 2020

Conversation

rbracco
Copy link
Contributor

@rbracco rbracco commented Aug 12, 2020

  • Updates the README to explain the expected inputs, outputs of CTCDecode, as well as their shapes.
  • Adds an installation command that works on Google Colab
  • Adds a resources section for people less familiar with Connectionist Temporal Classification (CTC) and Beam Search.

@SeanNaren
Copy link
Collaborator

Epic! I see your TODOs, will fill them in tomorrow

… duplicate install instructions, formatted usage example
@SeanNaren
Copy link
Collaborator

This is so awesome, thanks so much for doing this.

I've added some changes on top of your branch, could you have a look? I took a lot of things from the original code:

https://github.com/PaddlePaddle/DeepSpeech/blob/develop/decoders/swig_wrapper.py

As well as some descriptions that I've made. The only one I'm still a bit unsure (but is an important param) is the beta parameter. I'll try dig up a better description but I think currently it's better than nothing!

@rbracco
Copy link
Contributor Author

rbracco commented Aug 13, 2020

Thank you for the help. Those changes look great and clarified things a lot. I showed this to a friend and they suggested adding the parameter descriptions as a docstring for easier usage. Let me know which option you prefer from A. leaving things as they are B. Add parameters to docstring and leave them in the README. C. Add parameters to the docstring and remove them from the README.

Also are there any other things you'd like to see in the README? I've done some timing tests on various lengths of audio and beam sizes. I could include the results of that as well. The one variable I didn't consider was the vocab size. Please let me know your thoughts and thanks again.

image

@SeanNaren
Copy link
Collaborator

100% agreed, we could add docstrings here and I think it would be very beneficial: https://github.com/parlance/ctcdecode/blob/master/ctcdecode/__init__.py#L6

Is this something you can tackle?

And for option A/B or C, which would you think makes sense?

That graph is very informative, and something useful for sure! Feel free to add additional information :)

@rbracco
Copy link
Contributor Author

rbracco commented Aug 19, 2020

Thanks for the input. I added the docstrings but wasn't entirely sure about the format. I ended up deciding to keep the docstrings shorter and drier and leave the readme longer. I think I'll leave the timing stuff out for now. Is there anything else you would like to see added?

@SeanNaren
Copy link
Collaborator

LGTM! thanks so much @rbracco :)

@SeanNaren SeanNaren merged commit 4487b31 into parlance:master Aug 19, 2020
@SeanNaren
Copy link
Collaborator

Realised there was one TODO in their, will update and make a PR

@SeanNaren SeanNaren mentioned this pull request Aug 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants