@@ -1,15 +1,16 @@
- .. image:: https://img.shields.io/travis/wmayner/pyemd/develop.svg?style=flat-square&maxAge=3600
-     :target: https://travis-ci.org/wmayner/pyemd
+ .. image:: https://img.shields.io/github/actions/workflow/status/wmayner/pyemd/build_wheels.yml?style=flat-square&maxAge=86400
+     :target: https://github.com/wmayner/pyemd/actions/workflows/build_wheels.yml
+     :alt: Build status badge
  .. image:: https://img.shields.io/pypi/pyversions/pyemd.svg?style=flat-square&maxAge=86400
-     :target: https://wiki.python.org/moin/Python2orPython3
+     :target: https://pypi.org/project/pyemd/
      :alt: Python versions badge

  PyEMD: Fast EMD for Python
  ==========================

  PyEMD is a Python wrapper for `Ofir Pele and Michael Werman's implementation
- <http://ofirpele.droppages.com/>`_ of the `Earth Mover's
- Distance <http://en.wikipedia.org/wiki/Earth_mover%27s_distance>`_ that allows
+ <https://ofirpele.droppages.com/>`_ of the `Earth Mover's
+ Distance <https://en.wikipedia.org/wiki/Earth_mover%27s_distance>`_ that allows
  it to be used with NumPy. **If you use this code, please cite the papers listed
  at the end of this document.**

@@ -54,8 +55,9 @@ You can also calculate the EMD directly from two arrays of observations:
      >>> emd_samples(first_array, second_array, bins=2)
      0.5

- Documentation
- -------------
+
+ API Documentation
+ -----------------

  emd()
  ~~~~~
@@ -75,17 +77,17 @@ emd()
    *N*.
  - ``distance_matrix`` *(np.ndarray)*: A 2D array of ``np.float64,`` of size at
    least *N* × *N*. This defines the underlying metric, or ground distance, by
-   giving the pairwise distances between the histogram bins. It must represent a
-   metric; there is no warning if it doesn't.
+   giving the pairwise distances between the histogram bins.
+   **NOTE: It must represent a metric; there is no warning if it doesn't.**

  *Keyword Arguments:*

  - ``extra_mass_penalty`` *(float)*: The penalty for extra mass. If you want the
    resulting distance to be a metric, it should be at least half the diameter of
    the space (maximum possible distance between any two points). If you want
    partial matching you can set it to zero (but then the resulting distance is
-   not guaranteed to be a metric). The default value is ``-1.0``, which means the
-   maximum value in the distance matrix is used.
+   not guaranteed to be a metric). The default value is ``-1.0``, which means
+   the maximum value in the distance matrix is used.

  *Returns:* *(float)* The EMD value.

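Review note on the hunk above: since ``emd()`` gives no warning when ``distance_matrix`` is not a metric, a caller may want to validate it first. The following is a minimal sketch and not part of PyEMD. The helper names ``looks_like_metric`` and ``half_diameter`` are hypothetical, and the tolerance is an assumed default. The second helper computes the half-diameter that the ``extra_mass_penalty`` description gives as the lower bound for a metric result.

```python
def looks_like_metric(d, tol=1e-12):
    """Check zero diagonal, non-negativity, symmetry and the triangle inequality.

    ``d`` is a square matrix given as nested sequences or a 2D NumPy array.
    Hypothetical helper for illustration; PyEMD does not provide it.
    """
    n = len(d)
    if any(len(row) != n for row in d):
        return False  # not square
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False  # distance from a bin to itself must be zero
        for j in range(n):
            if d[i][j] < -tol or abs(d[i][j] - d[j][i]) > tol:
                return False  # negative or asymmetric entry
            for k in range(n):
                if d[i][k] > d[i][j] + d[j][k] + tol:
                    return False  # triangle inequality violated
    return True


def half_diameter(d):
    """Half of the largest pairwise distance in ``d``."""
    return max(max(row) for row in d) / 2.0


good = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
bad = [[0.0, 1.0, 5.0], [1.0, 0.0, 1.0], [5.0, 1.0, 0.0]]  # 5 > 1 + 1
```

The check is cubic in the number of bins, so it suits one-off validation of small matrices rather than a call before every ``emd()``.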
@@ -123,18 +125,18 @@ emd_samples()

  *Arguments:*

- - ``first_array`` *(Iterable)*: A 1D array of samples used to generate a
+ - ``first_array`` *(Iterable)*: An array of samples used to generate a
    histogram.
- - ``second_array`` *(Iterable)*: A 1D array of samples used to generate a
+ - ``second_array`` *(Iterable)*: An array of samples used to generate a
    histogram.

  *Keyword Arguments:*

  - ``extra_mass_penalty`` *(float)*: Same as for ``emd()``.
  - ``distance`` *(string or function)*: A string or function implementing
-   a metric on a 1D ``np.ndarray``. Defaults to the Euclidean distance. Currently
-   limited to 'euclidean' or your own function, which must take a 1D array and
-   return a square 2D array of pairwise distances.
+   a metric on a 1D ``np.ndarray``. Defaults to the Euclidean distance.
+   Currently limited to 'euclidean' or your own function, which must take
+   a 1D array and return a square 2D array of pairwise distances.
  - ``normalized`` (*boolean*): If true (default), treat histograms as fractions
    of the dataset. If false, treat histograms as counts. In the latter case the
    EMD will vary greatly by array length.
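Review note on the ``distance`` argument: the contract described above (take a 1D array, return a square 2D array of pairwise distances) can be illustrated with a small function. This is a sketch under assumptions: the name ``absolute_difference`` is made up here, and it has not been run against PyEMD itself.

```python
import numpy as np


def absolute_difference(x):
    """Pairwise |x_i - x_j| for a 1D array; returns an (n, n) array.

    Hypothetical example of a custom distance function.
    """
    x = np.asarray(x, dtype=np.float64)
    # Broadcasting a column against a row yields every pairwise difference.
    return np.abs(x[:, np.newaxis] - x[np.newaxis, :])


distances = absolute_difference([0.0, 1.0, 3.0])
```

Going by the argument description, such a function would be supplied as ``distance=absolute_difference``. For 1D points it coincides with the Euclidean distance.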
@@ -147,11 +149,12 @@ emd_samples()
    ``first_array`` and ``second_array``. Note: if the given range is not a
    superset of the default range, no warning will be given.

- *Returns:* *(float)* The EMD value between the histograms of ``first_array`` and
- ``second_array``.
+ *Returns:* *(float)* The EMD value between the histograms of ``first_array``
+ and ``second_array``.

  ----

+
  Limitations and Caveats
  -----------------------

@@ -163,66 +166,36 @@ Limitations and Caveats
    - The histograms and distance matrix must be numpy arrays of type
      ``np.float64``. The original C++ template function can accept any numerical
      C++ type, but this wrapper only instantiates the template with ``double``
-     (Cython converts ``np.float64`` to ``double``). If there's demand, I can add
-     support for other types.
+     (Cython converts ``np.float64`` to ``double``). If there's demand, I can
+     add support for other types.

  - ``emd_with_flow()``:

    - The flow matrix does not contain the flows to/from the extra mass bin.

  - ``emd_samples()``:

-   - Using the default ``bins='auto'`` results in an extra call to
-     ``np.histogram()`` to determine the bin lengths, since `the NumPy
-     bin-selectors are not exposed in the public API
+   - With ``numpy < 1.15.0``, using the default ``bins='auto'`` results in an
+     extra call to ``np.histogram()`` to determine the bin lengths, since `the
+     NumPy bin-selectors are not exposed in the public API
      <https://github.com/numpy/numpy/issues/10183>`_. For performance, you may
-     want to set the bins yourself.
-
-
- Contributing
- ------------
-
- To help develop PyEMD, fork the project on GitHub and install the requirements
- with ``pip install -r requirements.txt``.
-
- The ``Makefile`` defines some tasks to help with development:
-
- - ``test``: Run the test suite
- - ``build`` Generate and compile the Cython extension
- - ``clean``: Remove the compiled Cython extension
- - ``default``: Run ``build``
-
- Tests for different Python environments can be run with ``tox``.
+     want to set the bins yourself. If ``numpy >= 1.15`` is available,
+     ``np.histogram_bin_edges()`` is called instead, which is more efficient.


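Review note on the ``bins='auto'`` caveat: one way to set the bins yourself is to compute the edges once with NumPy and reuse them for both samples. A hedged sketch follows. It needs ``numpy >= 1.15`` for ``np.histogram_bin_edges()``, the sample data are made up, and it shows only the NumPy side, not PyEMD's internals.

```python
import numpy as np

# Made-up sample data for illustration.
first_array = [1.0, 2.0, 3.0, 4.0]
second_array = [2.0, 3.0, 4.0, 5.0]

# Compute one shared set of bin edges over both samples, so the automatic
# bin selection runs a single time.
combined = np.concatenate([first_array, second_array])
edges = np.histogram_bin_edges(combined, bins="auto")

# Reuse the precomputed edges for each histogram.
first_hist, _ = np.histogram(first_array, bins=edges)
second_hist, _ = np.histogram(second_array, bins=edges)
```

Whether ``emd_samples()`` accepts an array of edges for ``bins`` is not stated in this diff, so that should be checked before relying on it.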
  Credit
  ------

  - All credit for the actual algorithm and implementation goes to `Ofir Pele
-   <http://www.ariel.ac.il/sites/ofirpele/>`_ and `Michael Werman
-   <http://www.cs.huji.ac.il/~werman/>`_. See the `relevant paper
-   <http://www.seas.upenn.edu/~ofirpele/publications/ICCV2009.pdf>`_.
+   <https://ofirpele.droppages.com/>`_ and `Michael Werman
+   <https://www.cs.huji.ac.il/~werman/>`_. See the `relevant paper
+   <https://doi.org/10.1109/ICCV.2009.5459199>`_.
  - Thanks to the Cython developers for making this kind of wrapper relatively
    easy to write.

  Please cite these papers if you use this code:
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Ofir Pele and Michael Werman. A linear time histogram metric for improved SIFT
- matching. *Computer Vision - ECCV 2008*, Marseille, France, 2008, pp. 495-508.
-
- .. code-block:: latex
-
-     @INPROCEEDINGS{pele2008,
-       title={A linear time histogram metric for improved sift matching},
-       author={Pele, Ofir and Werman, Michael},
-       booktitle={Computer Vision--ECCV 2008},
-       pages={495--508},
-       year={2008},
-       month={October},
-       publisher={Springer}
-     }
-
  Ofir Pele and Michael Werman. Fast and robust earth mover's distances. *Proc.
  2009 IEEE 12th Int. Conf. on Computer Vision*, Kyoto, Japan, 2009, pp. 460-467.

@@ -237,3 +210,18 @@ Ofir Pele and Michael Werman. Fast and robust earth mover's distances. *Proc.
        month={September},
        organization={IEEE}
      }
+
+ Ofir Pele and Michael Werman. A linear time histogram metric for improved SIFT
+ matching. *Computer Vision - ECCV 2008*, Marseille, France, 2008, pp. 495-508.
+
+ .. code-block:: latex
+
+     @INPROCEEDINGS{pele2008,
+       title={A linear time histogram metric for improved sift matching},
+       author={Pele, Ofir and Werman, Michael},
+       booktitle={Computer Vision--ECCV 2008},
+       pages={495--508},
+       year={2008},
+       month={October},
+       publisher={Springer}
+     }