Added a github action, improved index.rst, added common_issues.rst to the menu.
hadware committed May 16, 2022
1 parent f735c42 commit 0661054
Showing 3 changed files with 112 additions and 11 deletions.
45 changes: 45 additions & 0 deletions .github/workflows/doc.yaml
@@ -0,0 +1,45 @@
# Builds the Sphinx documentation and pushes it to a doc branch, which is then used by GitHub Pages

name: Doc

on: [ push, pull_request ]

jobs:
  docs:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 4
      matrix:
        python-version: [ 3.7 ]
    steps:
      - uses: actions/checkout@v1
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v1
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install
        run: |
          python -m pip install --upgrade pip
          pip install .[doc]
      - name: Build documentation
        run: |
          make --directory=docs html
          touch ./docs/build/html/.nojekyll
      - name: Commit documentation changes
        run: |
          git clone https://github.com/bootphon/phonemizer.git --branch doc --single-branch doc
          cp -r docs/build/html/* doc
          cd doc
          touch .nojekyll
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add .
          git commit -m "Update documentation" -a || true
          # The above command will fail if no changes were present, so we ignore
          # the return code.
      - name: Push changes
        uses: ad-m/github-push-action@master
        with:
          branch: doc
          directory: doc
          github_token: ${{ secrets.GITHUB_TOKEN }}
30 changes: 29 additions & 1 deletion docs/source/common_issues.rst
@@ -1,5 +1,33 @@
==============
-Command Issues
+Common Issues
==============


Phonemization is slow
---------------------

You may have noticed that a large number of calls to the ``phonemize``
function makes for very slow execution. Initializing the phonemization
backend can be expensive, especially for espeak, and every call to
``phonemize`` pays that cost again. It is much more efficient to either:

- group all the calls into one using a list of strings
- "manually" instantiate your backend of choice, then call its own ``phonemize`` method

.. code-block:: python

    from phonemizer import phonemize
    from phonemizer.backend import EspeakBackend

    text = [line1, line2, ...]

    # Do this:
    phonemized = phonemize(text, ...)

    # Not this:
    phonemized = [phonemize(line, ...) for line in text]

    # An alternative is to directly instantiate the backend and to call its
    # phonemize method, which expects a list of strings:
    backend = EspeakBackend('en-us', ...)
    phonemized = [backend.phonemize([line], ...)[0] for line in text]
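The effect of repeated initialization can be illustrated without any backend installed. The following is a minimal sketch, not phonemizer's actual code: ``CountingBackend`` and the local ``phonemize`` function are hypothetical stand-ins that only count how many times a backend is set up under each calling pattern.

```python
class CountingBackend:
    """Hypothetical stand-in for a real backend; counts how often it is set up."""

    setups = 0

    def __init__(self, language):
        # A real backend (e.g. espeak) would do its expensive setup here.
        CountingBackend.setups += 1
        self.language = language

    def phonemize(self, lines):
        # Placeholder transformation, not real phonemization.
        return [line.lower() for line in lines]


def phonemize(lines, language="en-us"):
    """Mimics a convenience function that builds a fresh backend on every call."""
    return CountingBackend(language).phonemize(lines)


text = ["Hello world", "How are you"]

# One call per line: one backend setup per line.
CountingBackend.setups = 0
per_line = [phonemize([line])[0] for line in text]
setups_per_line = CountingBackend.setups

# One batched call: a single setup.
CountingBackend.setups = 0
batched = phonemize(text)
setups_batched = CountingBackend.setups

# One reusable backend: a single setup, however many calls follow.
CountingBackend.setups = 0
backend = CountingBackend("en-us")
reused = [backend.phonemize([line])[0] for line in text]
setups_reused = CountingBackend.setups

print(setups_per_line, setups_batched, setups_reused)
```

All three patterns return the same result; only the number of setups differs, and that count is what dominates the running time when setup is expensive.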
48 changes: 38 additions & 10 deletions docs/source/index.rst
@@ -6,6 +6,44 @@
Welcome to Phonemizer's documentation!
======================================


* ``phonemizer`` allows simple phonemization of words and texts in many languages.

* It provides both the ``phonemize`` command-line tool and the Python function
  ``phonemizer.phonemize``. See :ref:`phonemize`.

* It is based on four backends: **espeak**, **espeak-mbrola**, **festival** and
  **segments**. The backends have different properties and capabilities,
  summarized below. The choice of backend is left to the user.

  * `espeak-ng <https://github.com/espeak-ng/espeak-ng>`_ is a text-to-speech
    program supporting many languages and IPA (International Phonetic
    Alphabet) output.

  * `espeak-ng-mbrola <https://github.com/espeak-ng/espeak-ng/blob/master/docs/mbrola.md>`_
    uses the SAMPA phonetic alphabet instead of IPA but does not preserve word
    boundaries.

  * `festival <http://www.cstr.ed.ac.uk/projects/festival>`_ is another
    text-to-speech engine. Its phonemizer backend currently supports only
    American English. It uses a custom phoneset, but it allows tokenization
    at the syllable level.

  * `segments <https://github.com/cldf/segments>`_ is a Unicode tokenizer that
    builds a phonemization from a grapheme-to-phoneme mapping provided as a
    file by the user.
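As a rough sketch of how the backend choice looks from Python, the snippet below forwards to ``phonemizer.phonemize`` with an explicit ``backend`` argument. The helper ``phonemize_with`` and the ``EXAMPLE_LANGUAGES`` mapping are illustrative assumptions, not part of the library, and which backends and languages actually work depends on what is installed on your system.

```python
# Hypothetical helper, not part of phonemizer: it only forwards to
# phonemizer.phonemize with an explicit backend.
EXAMPLE_LANGUAGES = {
    "espeak": "en-us",    # assumes an English espeak voice is installed
    "festival": "en-us",  # the festival backend supports American English only
}


def phonemize_with(backend, lines):
    """Phonemize a list of strings with the named backend."""
    if backend not in EXAMPLE_LANGUAGES:
        raise ValueError(f"no example language configured for {backend!r}")
    # Imported lazily so the mapping above can be inspected even when
    # phonemizer or its backends are not installed.
    from phonemizer import phonemize
    return phonemize(lines, language=EXAMPLE_LANGUAGES[backend], backend=backend)
```

Calling ``phonemize_with("espeak", ["hello world"])`` requires espeak-ng to be installed; the other backends are selected the same way, with a language code that the chosen backend supports.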


.. toctree::
   :maxdepth: 2
   :caption: Contents:

   install
   cli
   python_examples
   common_issues
   api_reference

To reference ``phonemizer`` in your own work, please cite the following
`JOSS paper <https://joss.theoj.org/papers/10.21105/joss.03958>`_.

@@ -24,15 +62,5 @@ To reference ``phonemizer`` in your own work, please cite the following
journal = {Journal of Open Source Software}
}
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
-
-   install
-   cli
-   common_issues
-   api_reference

