GitHub - annagrr/ctcdecode: PyTorch CTC Decoder bindings

Fixes a bug that occurs when the target vocabulary list contains whole words or Chinese characters.

In the original code, the vocabulary can contain only digits, letters, or other single-byte characters, because the labels are joined into one string and encoded to bytes when they are passed from Python to the C++ code:

self._labels = ''.join(labels).encode()

I changed this line to pass the Python list directly, and in the binding the accepted parameter type was changed from char to vector.
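The effect of the original line can be seen in plain Python, without the library. The sketch below is only an illustration: the sample labels are made up, and it assumes the receiving side treats each byte of the buffer as one label, which is what the single-byte restriction above implies.

```python
# Hypothetical vocabulary: a blank symbol, two Chinese characters, one whole word.
labels = ["_", "你", "好", "hello"]

# Original approach: concatenate every label and encode to UTF-8 bytes.
packed = "".join(labels).encode()

# The label boundaries are lost. Each Chinese character becomes 3 bytes,
# and the word "hello" becomes 5 separate bytes.
print(len(labels))  # number of labels the model emits
print(len(packed))  # number of one-byte "labels" seen on the other side

# Reading the buffer back one byte at a time (assumed behaviour of a char-based API).
byte_labels = [packed[i:i + 1] for i in range(len(packed))]

# Passing the list itself keeps one entry per label, whatever its length.
list_labels = list(labels)
print(list_labels[3])
```

With single-byte labels such as "a", "b", "c" the two counts match, which is why the problem only shows up for word-level or multi-byte vocabularies.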

======================================================================================================

ctcdecode

ctcdecode is an implementation of CTC (Connectionist Temporal Classification) beam search decoding for PyTorch. The C++ code is borrowed liberally from PaddlePaddle's DeepSpeech. It includes swappable scorer support, enabling both standard beam search and KenLM-based decoding.
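For readers new to CTC beam search, here is a minimal pure-Python sketch of prefix beam search with no language model scorer. It is not this library's C++ implementation or its API. The function name, its parameters, and the toy probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def ctc_prefix_beam_search(probs, labels, beam_width=3, blank_id=0):
    """Illustrative CTC prefix beam search (hypothetical helper, no LM scoring).

    probs: list of time steps, each a list of per-label probabilities.
    labels: list of label strings; labels[blank_id] is the CTC blank.
    """
    # Each prefix keeps two scores: ending in blank, ending in non-blank.
    beams = {(): (1.0, 0.0)}
    for row in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            for idx, p in enumerate(row):
                if idx == blank_id:
                    # A blank leaves the prefix unchanged.
                    next_beams[prefix][0] += (p_blank + p_non_blank) * p
                elif prefix and prefix[-1] == idx:
                    # Repeated label: collapses unless a blank separated them.
                    next_beams[prefix][1] += p_non_blank * p
                    next_beams[prefix + (idx,)][1] += p_blank * p
                else:
                    # A new label extends the prefix.
                    next_beams[prefix + (idx,)][1] += (p_blank + p_non_blank) * p
        # Keep only the beam_width most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(score) for prefix, score in ranked[:beam_width]}
    best_prefix, best_score = max(beams.items(), key=lambda kv: sum(kv[1]))
    return [labels[i] for i in best_prefix], sum(best_score)


# Toy input: 4 time steps over the labels blank, "a", "b".
toy_labels = ["_", "a", "b"]
toy_probs = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
]
decoded, score = ctc_prefix_beam_search(toy_probs, toy_labels)
print(decoded)
```

Because the result is built from entries of the label list rather than from single characters, the same sketch works unchanged when a label is a whole word or a multi-byte character.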

Installation

The library is largely self-contained and requires only PyTorch 1.0. Building the C++ library requires gcc or clang. KenLM language modeling support is also optionally included, and enabled by default.

# get the code
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode
pip install .

