Comparing changes

base repository: sfischer13/python-arpa
base: 0.1.0b3
head repository: sfischer13/python-arpa
compare: 0.1.0b4
  • 15 commits
  • 21 files changed
  • 1 contributor

Commits on Dec 6, 2018

  1. Fix keywords

    sfischer13 committed Dec 6, 2018
    f57b3ef
  2. Add more URLs

    sfischer13 committed Dec 6, 2018
    8838dbd

Commits on Dec 7, 2018

  1. Set zip_safe=True

    sfischer13 committed Dec 7, 2018
    cfe0966
  2. Update README

    sfischer13 committed Dec 7, 2018
    06d7966
  3. Update documentation

    sfischer13 committed Dec 7, 2018
    af2b774
  4. Update notes

    sfischer13 committed Dec 7, 2018
    2d16774

Commits on Dec 8, 2018

  1. Add support for .gz

    sfischer13 committed Dec 8, 2018
    31c4cec

Commits on Dec 9, 2018

  1. Set credits

    sfischer13 committed Dec 9, 2018
    3acbd05
  2. Add exception

    sfischer13 committed Dec 9, 2018
    9b0caa5
  3. Make code explicit

    sfischer13 committed Dec 9, 2018
    10a35ee
  4. Parse int

    sfischer13 committed Dec 9, 2018
    ad7f298

Commits on Dec 10, 2018

  1. Simplify code

    sfischer13 committed Dec 10, 2018
    6b7ab95
  2. Remove mode parameter

    sfischer13 committed Dec 10, 2018
    8b9010a

Commits on Dec 12, 2018

  1. Update Pipfile.lock

    sfischer13 committed Dec 12, 2018
    e947428
  2. Bump version

    sfischer13 committed Dec 12, 2018
    afd4873
Showing with 229 additions and 59 deletions.
  1. +5 −0 CONTRIBUTING.md
  2. +3 −0 HISTORY.md
  3. +1 −0 MANIFEST.in
  4. +9 −9 Pipfile.lock
  5. +23 −10 README.md
  6. +3 −3 arpa/__init__.py
  7. +20 −8 arpa/api.py
  8. +6 −0 arpa/exceptions.py
  9. +2 −3 arpa/models/simple.py
  10. +3 −3 arpa/parsers/quick.py
  11. +0 −6 docs/api.rst
  12. +30 −0 docs/arpa.models.rst
  13. +30 −0 docs/arpa.parsers.rst
  14. +38 −0 docs/arpa.rst
  15. +7 −1 docs/conf.py
  16. +3 −4 docs/index.rst
  17. +9 −4 setup.py
  18. +1 −0 tests/.gitignore
  19. +3 −0 tests/data/download.sh
  20. +32 −7 tests/test_arpa.py
  21. +1 −1 tests/test_model_simple.py
5 changes: 5 additions & 0 deletions CONTRIBUTING.md
@@ -49,6 +49,11 @@ pipenv run flake8 .
 Documentation
 -------------

+```sh
+cd docs
+pipenv run sphinx-apidoc -f -o . ../arpa
+```
+
 ```sh
 cd docs
 pipenv run make html
3 changes: 3 additions & 0 deletions HISTORY.md
@@ -20,6 +20,9 @@ You should [Keep a CHANGELOG](https://keepachangelog.com/), too!

 ### Security

+[0.1.0b4](https://github.com/sfischer13/python-arpa/compare/0.1.0b3...0.1.0b4) - 2018-12-12
+-------------------------------------------------------------------------------------------
+
 [0.1.0b3](https://github.com/sfischer13/python-arpa/compare/0.1.0b2...0.1.0b3) - 2018-12-06
 -------------------------------------------------------------------------------------------

1 change: 1 addition & 0 deletions MANIFEST.in
@@ -22,6 +22,7 @@ recursive-include docs *.rst

 recursive-include tests *
 recursive-exclude tests *.arpa
+recursive-exclude tests *.arpa.*

 recursive-exclude * __pycache__
 recursive-exclude * *.py[co]
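
The added `recursive-exclude tests *.arpa.*` line presumably keeps suffixed model files such as `foo.arpa.gz` out of the source distribution, which the existing `*.arpa` pattern alone would not cover. The following is only a rough illustration using Python's `fnmatch`; setuptools applies its own glob handling to MANIFEST.in patterns, and the file names below are made up.

```python
from fnmatch import fnmatch

# Hypothetical file names, used only to illustrate the two patterns.
names = ["model.arpa", "model.arpa.gz", "download.sh"]
patterns = ["*.arpa", "*.arpa.*"]


def excluded(name):
    """Return True if any exclude pattern matches the file name."""
    return any(fnmatch(name, pattern) for pattern in patterns)


# Only files that match neither pattern would be kept.
kept = [name for name in names if not excluded(name)]
print(kept)
```

Neither pattern subsumes the other: `*.arpa` needs the name to end in `.arpa`, while `*.arpa.*` needs a further dot after it.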
18 changes: 9 additions & 9 deletions Pipfile.lock

Some generated files are not rendered by default.

33 changes: 23 additions & 10 deletions README.md
@@ -1,12 +1,9 @@
 Python ARPA Package
 ===================

-[![PyPI Version](https://img.shields.io/pypi/v/arpa.svg)](https://pypi.python.org/pypi/arpa) [![Documentation Status](https://readthedocs.org/projects/arpa/badge/?version=latest)](https://arpa.readthedocs.io/en/latest/?badge=latest) [![Travis](https://img.shields.io/travis/sfischer13/python-arpa.svg)](https://travis-ci.org/sfischer13/python-arpa) [![Coverage Status](https://coveralls.io/repos/sfischer13/python-arpa/badge.svg?branch=master&service=github)](https://coveralls.io/github/sfischer13/python-arpa?branch=master)
-
 Python library for reading ARPA n-gram models.
-It was initiated by Stefan Fischer and is developed and maintained by many others.

-- [Documentation](https://readthedocs.org/projects/arpa/badge/?version=latest) is available.
+- [Documentation](https://arpa.readthedocs.io/en/latest/) is available.
 - [Changes](https://github.com/sfischer13/python-arpa/blob/master/HISTORY.md) between releases are documented.
 - [Bugs](https://github.com/sfischer13/python-arpa/issues) can be reported on the issue tracker.
 - [Questions](mailto:sfischer13@ymail.com) can be asked via e-mail.
@@ -15,18 +12,31 @@ It was initiated by Stefan Fischer and is developed and maintained by many other
 Setup
 -----

-[![PyPI Python Versions](https://img.shields.io/pypi/pyversions/arpa.svg)](https://pypi.python.org/pypi/arpa)
+### Python 3.4+
+
+[![PyPI Python Versions](https://img.shields.io/pypi/pyversions/arpa.svg)](https://pypi.python.org/pypi/arpa) [![PyPI Version](https://img.shields.io/pypi/v/arpa.svg)](https://pypi.python.org/pypi/arpa)
+
+In order to install the Python 3 version:
+
+    $ pip install --user -U arpa

-The package is available on [PyPI](https://pypi.python.org/pypi/arpa):
+### Python 2.7

-    $ pip install arpa
+[![PyPI Python Versions](https://img.shields.io/pypi/pyversions/arpa-backport.svg)](https://pypi.python.org/pypi/arpa-backport) [![PyPI Version](https://img.shields.io/pypi/v/arpa-backport.svg)](https://pypi.python.org/pypi/arpa-backport)
+
+In order to install the Python 2.7 version:
+
+    $ pip install --user -U arpa-backport

 Usage
 -----

 The package may be imported directly:

-    import arpa
+    import arpa # Python 3.4+
+    # OR
+    import arpa_backport as arpa # Python 2.7
+
     models = arpa.loadf("foo.arpa")
     lm = models[0] # ARPA files may contain several models.

@@ -42,9 +52,12 @@ The package may be imported directly:
     lm.s("This is the end .", sos=False, eos=False)
     lm.log_s("This is the end .", sos=False, eos=False)

-Contribute
-----------
+Development
+-----------

+[![Travis](https://img.shields.io/travis/sfischer13/python-arpa.svg)](https://travis-ci.org/sfischer13/python-arpa) [![Documentation Status](https://readthedocs.org/projects/arpa/badge/?version=latest)](https://arpa.readthedocs.io/en/latest/?badge=latest) [![Coverage Status](https://coveralls.io/repos/sfischer13/python-arpa/badge.svg?branch=master&service=github)](https://coveralls.io/github/sfischer13/python-arpa?branch=master)
+
+*Contributions are welcome!*
 Write a bug report or send a pull request.
 Other [contributors](https://github.com/sfischer13/python-arpa/graphs/contributors) have done so before.
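
The two import lines in the README usage snippet are alternatives, one per Python version. Code that has to run in both environments could pick whichever distribution is installed. The following is a minimal sketch of that idea; `import_first` is a hypothetical helper and is not part of `arpa` or `arpa-backport`.

```python
import importlib


def import_first(*names):
    """Import and return the first module in names that is installed."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules is installed: " + ", ".join(names))


# Usage, assuming one of the two packages from the README is installed:
# arpa = import_first("arpa", "arpa_backport")
# models = arpa.loadf("foo.arpa")
```

Trying the Python 3 name first keeps the common case cheap; the backport is only consulted when the first import fails.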

6 changes: 3 additions & 3 deletions arpa/__init__.py
@@ -41,8 +41,8 @@
 __author__ = 'Stefan Fischer'
 __contact__ = 'Stefan Fischer <sfischer13@ymail.com>'
 __copyright__ = 'Copyright (c) 2015-2018 Stefan Fischer'
-__credits__ = []
-__date__ = '2018-12-06'
+__credits__ = ['Stefan Fischer']
+__date__ = '2018-12-12'
 __license__ = 'MIT'
 __status__ = 'development'
-__version__ = '0.1.0b3'
+__version__ = '0.1.0b4'
28 changes: 20 additions & 8 deletions arpa/api.py
@@ -1,3 +1,5 @@
+import gzip
+
 from io import StringIO

 from .models.simple import ARPAModelSimple
@@ -9,10 +11,15 @@ def dump(obj, fp):
     obj.write(fp)


-def dumpf(obj, path, mode='wt', encoding=None):
-    """Serialize obj to path in ARPA format."""
-    with open(path, mode=mode, encoding=encoding) as f:
-        dump(obj, f)
+def dumpf(obj, path, encoding=None):
+    """Serialize obj to path in ARPA format (.arpa, .gz)."""
+    path = str(path)
+    if path.endswith('.gz'):
+        with gzip.open(path, mode='wt', encoding=encoding) as f:
+            return dump(obj, f)
+    else:
+        with open(path, mode='wt', encoding=encoding) as f:
+            dump(obj, f)


 def dumps(obj):
@@ -40,10 +47,15 @@ def load(fp, model=None, parser=None):
         raise ValueError


-def loadf(path, mode='rt', encoding=None, model=None, parser=None):
-    """Deserialize path (a text file) to a Python object."""
-    with open(path, mode=mode, encoding=encoding) as f:
-        return load(f, model=model, parser=parser)
+def loadf(path, encoding=None, model=None, parser=None):
+    """Deserialize path (.arpa, .gz) to a Python object."""
+    path = str(path)
+    if path.endswith('.gz'):
+        with gzip.open(path, mode='rt', encoding=encoding) as f:
+            return load(f, model=model, parser=parser)
+    else:
+        with open(path, mode='rt', encoding=encoding) as f:
+            return load(f, model=model, parser=parser)


 def loads(s, model=None, parser=None):
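
The new `dumpf` and `loadf` choose between `open` and `gzip.open` based on the file name suffix and always use text mode. The same dispatch can be sketched with the standard library alone. `open_text` below is a hypothetical helper written for illustration; it is not part of the package, and the toy file content is not a complete ARPA model.

```python
import gzip
import os
import tempfile


def open_text(path, mode, encoding=None):
    """Open path in text mode, using gzip when the name ends in .gz."""
    path = str(path)  # also accepts pathlib.Path objects
    opener = gzip.open if path.endswith('.gz') else open
    return opener(path, mode=mode, encoding=encoding)


# Round trip through a temporary .gz file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy.arpa.gz')
    with open_text(path, 'wt', encoding='utf-8') as f:
        f.write('\\data\\\nngram 1=1\n')
    with open_text(path, 'rt', encoding='utf-8') as f:
        text = f.read()
    with open(path, 'rb') as raw:
        magic = raw.read(2)  # gzip streams start with the bytes 1f 8b
```

Dispatching on the suffix is simple, but it means a gzip-compressed file without a `.gz` name would be opened as plain text.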
6 changes: 6 additions & 0 deletions arpa/exceptions.py
@@ -7,6 +7,12 @@ class ARPAException(Exception):
     pass


+class FatalException(ARPAException):
+    """This should not have happened."""
+
+    pass
+
+
 class FrozenException(ARPAException):
     """Language model is frozen."""

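
`FatalException` joins the existing hierarchy under `ARPAException`, so callers can catch the base class and still handle specific subclasses first. The sketch below redeclares the three classes locally so that it runs without the package; the base-class docstring and the `describe` helper are assumptions made for illustration.

```python
class ARPAException(Exception):
    """Base class for this sketch (mirrors the name used in the diff)."""


class FatalException(ARPAException):
    """This should not have happened."""


class FrozenException(ARPAException):
    """Language model is frozen."""


def describe(exc):
    """Map an exception instance to a short label."""
    try:
        raise exc
    except FrozenException:
        return 'frozen'
    except ARPAException:
        # Catches FatalException and any other subclass.
        return 'arpa'
    except Exception:
        return 'other'


labels = [
    describe(FatalException()),
    describe(FrozenException()),
    describe(ValueError()),
]
```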
5 changes: 2 additions & 3 deletions arpa/models/simple.py
@@ -24,10 +24,9 @@ def add_count(self, order, count):
     def add_entry(self, ngram, p, bo=None, order=None):
         if self._vocabulary is not None:
             raise FrozenException
-        key = tuple(ngram)
-        self._ps[key] = p
+        self._ps[ngram] = p
         if bo is not None:
-            self._bos[key] = bo
+            self._bos[ngram] = bo

     def counts(self):
         return sorted(self._counts.items())
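
With the `tuple(ngram)` conversion removed, `add_entry` uses `ngram` directly as a dictionary key. That works for the quick parser, which already builds a tuple, but a caller passing a list would now get a `TypeError`. The following is a minimal sketch of that behaviour; `TinyModel` is a hypothetical stand-in and not the package's model class.

```python
class TinyModel:
    """Hypothetical stand-in for a model that stores entries by n-gram."""

    def __init__(self):
        self._ps = {}
        self._bos = {}

    def add_entry(self, ngram, p, bo=None):
        # ngram is used as a dict key, so it must be hashable (a tuple).
        self._ps[ngram] = p
        if bo is not None:
            self._bos[ngram] = bo


model = TinyModel()
model.add_entry(('the', 'end'), -0.5, bo=-0.25)

try:
    model.add_entry(['the', 'end'], -0.5)  # a list is not hashable
    list_accepted = True
except TypeError:
    list_accepted = False
```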
6 changes: 3 additions & 3 deletions arpa/parsers/quick.py
@@ -64,7 +64,7 @@ def _header(self, line):
         match = self.re_header.match(line)
         if match:
             self._state = self.State.ENTRY
-            self._tmp_order = match.group(1)
+            self._tmp_order = int(match.group(1))
         elif line == '\\end\\':
             self._result.append(self._tmp_model)
             self._state = self.State.DATA
@@ -80,8 +80,8 @@ def _entry(self, line):
         if match:
             p = self._float_or_int(match.group(1))
             ngram = tuple(match.group(4).split(' '))
-            bo = match.group(7)
-            bo = self._float_or_int(bo) if bo else None
+            bo_match = match.group(7)
+            bo = self._float_or_int(bo_match) if bo_match else None
             self._tmp_model.add_entry(ngram, p, bo, self._tmp_order)
         elif not line:
             self._state = self.State.HEADER # last entry
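
The "Parse int" change matters because regular-expression groups are always strings, so without `int(...)` the n-gram order would be stored as `'2'` rather than `2`. The following is a small sketch of the difference. The pattern is an assumption for section headers such as `\2-grams:`; the parser's actual `re_header` is not shown in this diff.

```python
import re

# Assumed pattern for section headers such as "\2-grams:"; the parser's real
# re_header is not shown in this diff.
re_header = re.compile(r'^\\(\d+)-grams:$')

match = re_header.match('\\10-grams:')
raw_order = match.group(1)  # regex groups are always strings
order = int(raw_order)      # the fix: store the order as an int

# String comparison is lexicographic, so '10' sorts before '2'.
string_sorted = sorted(['2', '10', '1'])
int_sorted = sorted([2, 10, 1])
```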
6 changes: 0 additions & 6 deletions docs/api.rst

This file was deleted.

30 changes: 30 additions & 0 deletions docs/arpa.models.rst
@@ -0,0 +1,30 @@
+arpa.models package
+===================
+
+Submodules
+----------
+
+arpa.models.base module
+-----------------------
+
+.. automodule:: arpa.models.base
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+arpa.models.simple module
+-------------------------
+
+.. automodule:: arpa.models.simple
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+
+Module contents
+---------------
+
+.. automodule:: arpa.models
+    :members:
+    :undoc-members:
+    :show-inheritance:
30 changes: 30 additions & 0 deletions docs/arpa.parsers.rst
@@ -0,0 +1,30 @@
+arpa.parsers package
+====================
+
+Submodules
+----------
+
+arpa.parsers.base module
+------------------------
+
+.. automodule:: arpa.parsers.base
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+arpa.parsers.quick module
+-------------------------
+
+.. automodule:: arpa.parsers.quick
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+
+Module contents
+---------------
+
+.. automodule:: arpa.parsers
+    :members:
+    :undoc-members:
+    :show-inheritance:
38 changes: 38 additions & 0 deletions docs/arpa.rst
@@ -0,0 +1,38 @@
+arpa package
+============
+
+Subpackages
+-----------
+
+.. toctree::
+
+    arpa.models
+    arpa.parsers
+
+Submodules
+----------
+
+arpa.api module
+---------------
+
+.. automodule:: arpa.api
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+arpa.exceptions module
+----------------------
+
+.. automodule:: arpa.exceptions
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+
+Module contents
+---------------
+
+.. automodule:: arpa
+    :members:
+    :undoc-members:
+    :show-inheritance:
8 changes: 7 additions & 1 deletion docs/conf.py
@@ -25,7 +25,7 @@
 # The short X.Y version
 version = '0.1'
 # The full version, including alpha/beta/rc tags
-release = '0.1.0b3'
+release = '0.1.0b4'


 # -- General configuration ---------------------------------------------------
@@ -182,3 +182,9 @@


 # -- Extension configuration -------------------------------------------------
+
+nitpick_ignore = [
+    ('py:class', 'Exception'),
+    ('py:class', 'enum.Enum'),
+    ('py:class', 'object'),
+]
7 changes: 3 additions & 4 deletions docs/index.rst
@@ -3,16 +3,15 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Documentation for Python :mod:`arpa`
====================================
Python :mod:`arpa` package
==========================

.. toctree::
:maxdepth: 2
:caption: Contents:

setup
examples
api
arpa


Indices and tables