Commit ef08133

Merge branch 'release/0.4.4'
2 parents 284ea57 + d43104e commit ef08133

8 files changed, 150 insertions(+), 95 deletions(-)
.gitignore

Lines changed: 4 additions & 2 deletions
@@ -1,10 +1,12 @@
 __pycache__
-build
+.cache
+.tox
 .env
 .ropeproject
 *.so
 *.pyc
-dist
 MANIFEST
 *.egg*
+build
+dist
 pyemd/emd.cpp

.travis.yml

Lines changed: 6 additions & 10 deletions
@@ -1,16 +1,12 @@
 sudo: false
 language: python
 python:
-- '3.3'
-- '3.4'
-- '3.5'
-- '3.6'
-install:
-- pip install Cython
-- make
-- pip install -e .
-- pip install pytest
-script: python -m pytest
+- '2.7'
+- '3.6'
+install:
+- pip install tox-travis
+- pip install cython
+script: tox
 notifications:
   email: false
   slack:

CONTRIBUTING.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+Installation issues
+===================
+
+Before opening an issue related to installation, please try to install PyEMD in
+a fresh, empty Python 3 virtual environment and check that the problem
+persists:
+
+```shell
+pip install virtualenvwrapper
+mkvirtualenv -p `which python3` pyemd
+# Now we're an empty Python 3 virtual environment
+pip install pyemd
+```
+
+PyEMD is not officially supported for (but may nonetheless work with) the following:
+
+- Python 2
+- Anaconda distributions
+- Windows operating systems
+
+However, if you need to use it in these cases, pull requests are welcome!

README.rst

Lines changed: 42 additions & 41 deletions
@@ -1,8 +1,8 @@
-.. image:: https://travis-ci.org/wmayner/pyemd.svg?branch=develop
+.. image:: https://img.shields.io/travis/wmayner/pyemd/develop.svg?style=flat-square&maxAge=3600
     :target: https://travis-ci.org/wmayner/pyemd
-.. image:: http://img.shields.io/badge/Python%203%20-compatible-brightgreen.svg
+.. image:: https://img.shields.io/pypi/pyversions/pyemd.svg?style=flat-square&maxAge=86400
     :target: https://wiki.python.org/moin/Python2orPython3
-    :alt: Python 3 compatible
+    :alt: Python versions badge
 
 **************************
 PyEMD: Fast EMD for Python
@@ -14,10 +14,6 @@ Distance <http://en.wikipedia.org/wiki/Earth_mover%27s_distance>`_ that allows
 it to be used with NumPy. **If you use this code, please cite the papers listed
 at the end of this document.**
 
-This wrapper does not expose the full functionality of the underlying
-implementation; it can only used be with the ``np.float`` data type, and with a
-symmetric distance matrix that represents a true metric. See the documentation
-for the original Pele and Werman library for the other options it provides.
 
 Installation
 ~~~~~~~~~~~~
@@ -28,11 +24,10 @@ To install the latest release:
 
     pip install pyemd
 
-To install the latest development version:
+Before opening an issue related to installation, please try to install PyEMD in
+a fresh, empty Python 3 virtual environment and check that the problem
+persists.
 
-.. code:: bash
-
-    pip install "git+https://github.com/wmayner/pyemd@develop#egg=pyemd"
 
 Usage
 ~~~~~
@@ -41,60 +36,60 @@ Usage
 
     >>> from pyemd import emd
     >>> import numpy as np
-    >>> first_signature = np.array([0.0, 1.0])
-    >>> second_signature = np.array([5.0, 3.0])
-    >>> distance_matrix = np.array([[0.0, 0.5], [0.5, 0.0]])
-    >>> emd(first_signature, second_signature, distance_matrix)
+    >>> first_histogram = np.array([0.0, 1.0])
+    >>> second_histogram = np.array([5.0, 3.0])
+    >>> distance_matrix = np.array([[0.0, 0.5],
+    ...                             [0.5, 0.0]])
+    >>> emd(first_histogram, second_histogram, distance_matrix)
     3.5
 
 You can also get the associated minimum-cost flow:
 
 .. code:: python
 
     >>> from pyemd import emd_with_flow
-    >>> emd_with_flow(first_signature, second_signature, distance_matrix)
+    >>> emd_with_flow(first_histogram, second_histogram, distance_matrix)
     (3.5, [[0.0, 0.0], [0.0, 1.0]])
 
+
 API
 ~~~
 
 .. code:: python
 
-    emd(first_signature, second_signature, distance_matrix)
-
-- ``first_signature``: A 1-dimensional numpy array of ``np.float``, of size N.
-- ``second_signature``: A 1-dimensional numpy array of ``np.float``, of size N.
-- ``distance_matrix``: A 2-dimensional array of ``np.float``, of size NxN. Must
-  be symmetric and represent a metric.
+    emd(first_histogram, second_histogram, distance_matrix)
 
+- ``first_histogram``: A 1-dimensional numpy array of type ``np.float64``, of
+  length :math:`N`.
+- ``second_histogram``: A 1-dimensional numpy array of type ``np.float64``, of
+  length :math:`N`.
+- ``distance_matrix``: A 2-dimensional array of type ``np.float64``, of size at
+  least :math:`N \times N`. This defines the underlying metric, or ground
+  distance, by giving the pairwise distances between the histogram bins. It
+  must represent a metric; there is no warning if it doesn't.
 
-.. code:: python
-
-    emd, flow = emd_with_flow(first_signature, second_signature, distance_matrix)
-
-- ``first_signature``: A 1-dimensional numpy array of ``np.float``, of size N.
-- ``second_signature``: A 1-dimensional numpy array of ``np.float``, of size N.
-- ``distance_matrix``: A 2-dimensional array of ``np.float``, of size NxN. Must
-  be symmetric and represent a metric.
+The arguments to ``emd_with_flow`` are the same.
 
 
 Limitations and Caveats
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-- ``distance_matrix`` must be symmetric.
-- ``distance_matrix`` is assumed to represent a true metric. This must be
-  enforced by the user. See the documentation in ``pyemd/lib/emd_hat.hpp``.
+- ``distance_matrix`` is assumed to represent a metric; there is no check to
+  ensure that this is true. See the documentation in ``pyemd/lib/emd_hat.hpp``
+  for more information.
 - The flow matrix does not contain the flows to/from the extra mass bin.
-- The signatures and distance matrix must be numpy arrays of ``np.float``. The
-  original C++ template function can accept any numerical C++ type, but this
-  wrapper only instantiates the template with ``double`` (Cython converts
-  ``np.float`` to ``double``). If there's demand, I can add support for other
-  types.
+- The histograms and distance matrix must be numpy arrays of type
+  ``np.float64``. The original C++ template function can accept any numerical
+  C++ type, but this wrapper only instantiates the template with ``double``
+  (Cython converts ``np.float64`` to ``double``). If there's demand, I can add
+  support for other types.
+
 
 Contributing
 ~~~~~~~~~~~~
 
-To help develop PyEMD, fork the project on GitHub and install the requirements with ``pip``.
+To help develop PyEMD, fork the project on GitHub and install the requirements
+with ``pip``.
 
 The ``Makefile`` defines some tasks to help with development:
 
@@ -104,6 +99,8 @@ The ``Makefile`` defines some tasks to help with development:
 * ``clean``: remove the build directory and the compiled C++ extension
 * ``test``: run unit tests with ``py.test``
 
+Tests for different Python environments can be run by installing ``tox`` with
+``pip install tox`` and running the ``tox`` command.
 
 Credit
 ~~~~~~
@@ -118,7 +115,9 @@ Credit
 Please cite these papers if you use this code:
 ``````````````````````````````````````````````
 
-Ofir Pele and Michael Werman, "A linear time histogram metric for improved SIFT matching," in *Computer Vision - ECCV 2008*, Marseille, France, 2008, pp. 495-508.
+Ofir Pele and Michael Werman, "A linear time histogram metric for improved SIFT
+matching," in *Computer Vision - ECCV 2008*, Marseille, France, 2008, pp.
+495-508.
 
 .. code-block:: latex
 
@@ -132,7 +131,9 @@ Ofir Pele and Michael Werman, "A linear time histogram metric for improved SIFT
       publisher={Springer}
     }
 
-Ofir Pele and Michael Werman, "Fast and robust earth mover's distances," in *Proc. 2009 IEEE 12th Int. Conf. on Computer Vision*, Kyoto, Japan, 2009, pp. 460-467.
+Ofir Pele and Michael Werman, "Fast and robust earth mover's distances," in
+*Proc. 2009 IEEE 12th Int. Conf. on Computer Vision*, Kyoto, Japan, 2009, pp.
+460-467.
 
 .. code-block:: latex
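Note on the README usage example: the value ``3.5`` can be checked by hand from the flow that ``emd_with_flow`` reports. The sketch below is plain Python and does not call pyemd; ``emd_from_flow`` is a hypothetical helper written for illustration. It assumes that the default ``extra_mass_penalty`` of ``-1.0`` stands for the largest entry of the distance matrix, and that the result is the transport cost of the flow plus that penalty times the difference in total mass between the two histograms.

```python
def emd_from_flow(first_histogram, second_histogram, distance_matrix, flow,
                  extra_mass_penalty=-1.0):
    """Recompute an EMD value from a known flow (illustrative, not part of pyemd)."""
    if extra_mass_penalty == -1.0:
        # Assumption: -1.0 is a sentinel meaning "use the largest ground distance".
        extra_mass_penalty = max(max(row) for row in distance_matrix)
    # Cost of the mass that is actually moved between bins.
    transport_cost = sum(
        flow[i][j] * distance_matrix[i][j]
        for i in range(len(flow))
        for j in range(len(flow[i]))
    )
    # Mass that cannot be matched because the histograms have different totals.
    extra_mass = abs(sum(first_histogram) - sum(second_histogram))
    return transport_cost + extra_mass_penalty * extra_mass


first_histogram = [0.0, 1.0]
second_histogram = [5.0, 3.0]
distance_matrix = [[0.0, 0.5], [0.5, 0.0]]
flow = [[0.0, 0.0], [0.0, 1.0]]  # the flow shown in the README example

value = emd_from_flow(first_histogram, second_histogram, distance_matrix, flow)
print(value)  # 0.0 transport cost + 0.5 * |1.0 - 8.0| = 3.5
```

Under these assumptions the whole result comes from the unmatched mass: one unit stays in the second bin at zero cost, and the remaining seven units are charged at the maximum ground distance of 0.5.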

pyemd/__about__.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 """PyEMD metadata"""
 
 __title__ = 'pyemd'
-__version__ = '0.4.3'
+__version__ = '0.4.4'
 __description__ = ("A Python wrapper for Ofir Pele and Michael Werman's "
                    "implementation of the Earth Mover's Distance.")
 __author__ = 'Will Mayner'

pyemd/emd.pyx

Lines changed: 67 additions & 38 deletions
@@ -36,30 +36,39 @@ cdef extern from "lib/emd_hat.hpp":
 DEFAULT_EXTRA_MASS_PENALTY = -1.0
 
 
-def validate(first_signature, second_signature, distance_matrix):
+def validate(first_histogram, second_histogram, distance_matrix):
     """Validate input."""
-    if (first_signature.shape[0] > distance_matrix.shape[0] or
-        second_signature.shape[0] > distance_matrix.shape[0]):
-        raise ValueError('Signature dimension cannot be larger than '
-                         'dimensions of distance matrix')
-    if (first_signature.shape[0] != second_signature.shape[0]):
-        raise ValueError('Signature dimensions must be equal')
+    if (first_histogram.shape[0] > distance_matrix.shape[0] or
+        second_histogram.shape[0] > distance_matrix.shape[0]):
+        raise ValueError('Histogram lengths cannot be greater than the '
+                         'number of rows or columns of the distance matrix')
+    if (first_histogram.shape[0] != second_histogram.shape[0]):
+        raise ValueError('Histogram lengths must be equal')
 
 
-def emd(np.ndarray[np.float64_t, ndim=1, mode="c"] first_signature,
-        np.ndarray[np.float64_t, ndim=1, mode="c"] second_signature,
+def emd(np.ndarray[np.float64_t, ndim=1, mode="c"] first_histogram,
+        np.ndarray[np.float64_t, ndim=1, mode="c"] second_histogram,
         np.ndarray[np.float64_t, ndim=2, mode="c"] distance_matrix,
         extra_mass_penalty=DEFAULT_EXTRA_MASS_PENALTY):
-    """
-    Compute the EMD between signatures with the given distance matrix.
-
-    Args:
-        first_signature (np.ndarray): A 1-dimensional array of type
-            ``np.double``, of length :math:`N`.
-        second_signature (np.ndarray): A 1-dimensional array of ``np.double``,
-            also of length :math:`N`.
-        distance_matrix (np.ndarray): A 2-dimensional array of ``np.double``,
-            of size :math:`N \cross N`.
+    u"""Return the EMD between two histograms using the given distance matrix.
+
+    The Earth Mover's Distance is the minimal cost of turning one histogram
+    into another by moving around the “dirt” in the bins, where the cost of
+    moving dirt from one bin to another is given by the amount of dirt times
+    the “ground distance” between the bins.
+
+    Arguments:
+        first_histogram (np.ndarray): A 1-dimensional array of type np.float64,
+            of length N.
+        second_histogram (np.ndarray): A 1-dimensional array of np.float64,
+            also of length N.
+        distance_matrix (np.ndarray): A 2-dimensional array of np.float64, of
+            size at least N × N. This defines the underlying metric, or ground
+            distance, by giving the pairwise distances between the histogram
+            bins. It must represent a metric; there is no warning if it
+            doesn't.
+
+    Keyword Arguments:
         extra_mass_penalty: The penalty for extra mass. If you want the
             resulting distance to be a metric, it should be at least half the
             diameter of the space (maximum possible distance between any two
@@ -70,28 +79,42 @@ def emd(np.ndarray[np.float64_t, ndim=1, mode="c"] first_signature,
 
     Returns:
         float: The EMD value.
+
+    Raises:
+        ValueError: If the length of either histogram is greater than the
+        number of rows or columns of the distance matrix, or if the histograms
+        aren't the same length.
     """
-    validate(first_signature, second_signature, distance_matrix)
-    return emd_hat_gd_metric_double(first_signature,
-                                    second_signature,
+    validate(first_histogram, second_histogram, distance_matrix)
+    return emd_hat_gd_metric_double(first_histogram,
+                                    second_histogram,
                                     distance_matrix,
                                     extra_mass_penalty)
 
 
-def emd_with_flow(np.ndarray[np.float64_t, ndim=1, mode="c"] first_signature,
-                  np.ndarray[np.float64_t, ndim=1, mode="c"] second_signature,
+def emd_with_flow(np.ndarray[np.float64_t, ndim=1, mode="c"] first_histogram,
+                  np.ndarray[np.float64_t, ndim=1, mode="c"] second_histogram,
                   np.ndarray[np.float64_t, ndim=2, mode="c"] distance_matrix,
                   extra_mass_penalty=DEFAULT_EXTRA_MASS_PENALTY):
-    """
-    Compute the EMD between signatures with the given distance matrix.
-
-    Args:
-        first_signature (np.ndarray): A 1-dimensional array of type
-            ``np.double``, of length :math:`N`.
-        second_signature (np.ndarray): A 1-dimensional array of ``np.double``,
-            also of length :math:`N`.
-        distance_matrix (np.ndarray): A 2-dimensional array of ``np.double``,
-            of size :math:`N \cross N`.
+    u"""Return the EMD between two histograms using the given distance matrix.
+
+    The Earth Mover's Distance is the minimal cost of turning one histogram
+    into another by moving around the “dirt” in the bins, where the cost of
+    moving dirt from one bin to another is given by the amount of dirt times
+    the “ground distance” between the bins.
+
+    Arguments:
+        first_histogram (np.ndarray): A 1-dimensional array of type np.float64,
+            of length N.
+        second_histogram (np.ndarray): A 1-dimensional array of np.float64,
+            also of length N.
+        distance_matrix (np.ndarray): A 2-dimensional array of np.float64, of
+            size at least N × N. This defines the underlying metric, or ground
+            distance, by giving the pairwise distances between the histogram
+            bins. It must represent a metric; there is no warning if it
+            doesn't.
+
+    Keyword Arguments:
         extra_mass_penalty: The penalty for extra mass. If you want the
             resulting distance to be a metric, it should be at least half the
             diameter of the space (maximum possible distance between any two
@@ -101,10 +124,16 @@ def emd_with_flow(np.ndarray[np.float64_t, ndim=1, mode="c"] first_signature,
             used.
 
     Returns:
-        (float, list(float)): The EMD value and the associated minimum-cost flow.
+        (float, list(list(float))): The EMD value and the associated
+        minimum-cost flow.
+
+    Raises:
+        ValueError: If the length of either histogram is greater than the
+        number of rows or columns of the distance matrix, or if the histograms
+        aren't the same length.
     """
-    validate(first_signature, second_signature, distance_matrix)
-    return emd_hat_gd_metric_double_with_flow_wrapper(first_signature,
-                                                      second_signature,
+    validate(first_histogram, second_histogram, distance_matrix)
+    return emd_hat_gd_metric_double_with_flow_wrapper(first_histogram,
+                                                      second_histogram,
                                                       distance_matrix,
                                                       extra_mass_penalty)
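Note on input checking: ``validate`` only compares lengths, and the docstrings state that nothing warns when the distance matrix is not a metric. A caller who wants that check could run something like the sketch below before calling ``emd``. It is plain Python on nested lists rather than NumPy arrays; ``check_lengths`` and ``looks_like_metric`` are hypothetical names that are not part of pyemd, and the tolerance is an arbitrary choice. Like the original, ``check_lengths`` compares the histogram lengths against the number of rows only.

```python
def check_lengths(first_histogram, second_histogram, distance_matrix):
    """Mirror the two length checks in ``validate`` using plain lists."""
    n_rows = len(distance_matrix)
    if len(first_histogram) > n_rows or len(second_histogram) > n_rows:
        raise ValueError('Histogram lengths cannot be greater than the '
                         'number of rows or columns of the distance matrix')
    if len(first_histogram) != len(second_histogram):
        raise ValueError('Histogram lengths must be equal')


def looks_like_metric(distance_matrix, tolerance=1e-12):
    """Return True if a square matrix passes basic metric checks."""
    n = len(distance_matrix)
    if any(len(row) != n for row in distance_matrix):
        return False  # not square
    for i in range(n):
        if abs(distance_matrix[i][i]) > tolerance:
            return False  # distance from a bin to itself must be zero
        for j in range(n):
            d_ij = distance_matrix[i][j]
            if d_ij < 0 or abs(d_ij - distance_matrix[j][i]) > tolerance:
                return False  # negative or asymmetric entry
            for k in range(n):
                if d_ij > distance_matrix[i][k] + distance_matrix[k][j] + tolerance:
                    return False  # triangle inequality violated
    return True


distance_matrix = [[0.0, 0.5], [0.5, 0.0]]
check_lengths([0.0, 1.0], [5.0, 3.0], distance_matrix)  # passes silently
print(looks_like_metric(distance_matrix))
```

The triangle-inequality loop is cubic in the number of bins, so this is meant for occasional sanity checks on small matrices, not for use on every call.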

setup.py

Lines changed: 2 additions & 3 deletions
@@ -98,10 +98,9 @@ def no_cythonize(extensions, **_ignore):
         'Natural Language :: English',
         'License :: OSI Approved :: MIT License',
         'Programming Language :: Python',
+        'Programming Language :: Python :: 2',
+        'Programming Language :: Python :: 2.7',
         'Programming Language :: Python :: 3',
-        'Programming Language :: Python :: 3.3',
-        'Programming Language :: Python :: 3.4',
-        'Programming Language :: Python :: 3.5',
         'Programming Language :: Python :: 3.6'
     ],
 )

tox.ini

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+[tox]
+envlist = py{27,36}
+
+[testenv]
+deps = pytest
+commands = make test
+whitelist_externals = make
