Commit 0a26bf8

Author: Pyry Kovanen
Commit message: Merge remote-tracking branch 'upstream/master' into empty-json-empty-df-fix
2 parents fc15ba0 + abfac97 commit 0a26bf8

96 files changed, +3113 -382 lines changed

Makefile

Lines changed: 1 addition & 0 deletions
@@ -23,3 +23,4 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
+	python make.py spellcheck

doc/make.py

Lines changed: 15 additions & 2 deletions
@@ -224,8 +224,9 @@ def _sphinx_build(self, kind):
         --------
         >>> DocBuilder(num_jobs=4)._sphinx_build('html')
         """
-        if kind not in ('html', 'latex'):
-            raise ValueError('kind must be html or latex, not {}'.format(kind))
+        if kind not in ('html', 'latex', 'spelling'):
+            raise ValueError('kind must be html, latex or '
+                             'spelling, not {}'.format(kind))
 
         self._run_os('sphinx-build',
                      '-j{}'.format(self.num_jobs),
@@ -304,6 +305,18 @@ def zip_html(self):
                      '-q',
                      *fnames)
 
+    def spellcheck(self):
+        """Spell check the documentation."""
+        self._sphinx_build('spelling')
+        output_location = os.path.join('build', 'spelling', 'output.txt')
+        with open(output_location) as output:
+            lines = output.readlines()
+            if lines:
+                raise SyntaxError(
+                    'Found misspelled words.'
+                    ' Check pandas/doc/build/spelling/output.txt'
+                    ' for more details.')
+
 
 def main():
     cmds = [method for method in dir(DocBuilder) if not method.startswith('_')]
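
Note: for readers who want to see the check that `spellcheck` performs in isolation, here is a minimal standalone sketch. It assumes only what the hunk above shows: the spelling report is a text file, and any line in it counts as a failure. The names `fail_on_misspellings` and `demo` are hypothetical and not part of `doc/make.py`, and the sketch raises `RuntimeError` where the commit raises `SyntaxError`.

```python
import os
import tempfile


def fail_on_misspellings(output_location):
    """Raise if the report at output_location contains any lines.

    Hypothetical helper for illustration; not part of doc/make.py.
    """
    with open(output_location) as output:
        lines = output.readlines()
    if lines:
        raise RuntimeError(
            'Found {} line(s) of misspelled words in {}'.format(
                len(lines), output_location))
    return True


def demo():
    """Run the check against an empty and a non-empty report."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        report = os.path.join(tmp, 'output.txt')
        # An empty report means the spell check passed.
        open(report, 'w').close()
        results.append(fail_on_misspellings(report))
        # Any content in the report is treated as a failure.
        with open(report, 'w') as handle:
            handle.write('made-up report line\n')
        try:
            fail_on_misspellings(report)
        except RuntimeError:
            results.append(False)
    return results
```

Calling `demo()` exercises both branches: the empty report passes and the non-empty one raises.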

doc/source/advanced.rst

Lines changed: 2 additions & 2 deletions
@@ -342,7 +342,7 @@ As usual, **both sides** of the slicers are included as this is label indexing.
                       columns=micolumns).sort_index().sort_index(axis=1)
    dfmi
 
-Basic multi-index slicing using slices, lists, and labels.
+Basic MultiIndex slicing using slices, lists, and labels.
 
 .. ipython:: python
 
@@ -1039,7 +1039,7 @@ On the other hand, if the index is not monotonic, then both slice bounds must be
     KeyError: 'Cannot get right slice bound for non-unique label: 3'
 
 :meth:`Index.is_monotonic_increasing` and :meth:`Index.is_monotonic_decreasing` only check that
-an index is weakly monotonic. To check for strict montonicity, you can combine one of those with
+an index is weakly monotonic. To check for strict monotonicity, you can combine one of those with
 :meth:`Index.is_unique`
 
 .. ipython:: python
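
Note: the distinction between weak and strict monotonicity in the second hunk can be illustrated without pandas. The sketch below is a plain-Python analogue, and its function names are hypothetical rather than pandas API. With a pandas `Index` named `idx`, the combined check described in the text would be `idx.is_monotonic_increasing and idx.is_unique`.

```python
def is_weakly_increasing(values):
    """True if each value is >= the one before it (ties allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_unique(values):
    """True if no value appears more than once."""
    return len(set(values)) == len(values)


def is_strictly_increasing(values):
    """Weakly increasing with no repeats means strictly increasing."""
    return is_weakly_increasing(values) and is_unique(values)


# A repeated label keeps the sequence weakly monotonic but not strictly so.
labels_with_tie = [1, 2, 2, 3]
labels_strict = [1, 2, 3]
```

Here `labels_with_tie` passes the weak check and fails the strict one, while `labels_strict` passes both.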

doc/source/basics.rst

Lines changed: 1 addition & 1 deletion
@@ -593,7 +593,7 @@ categorical columns:
    frame = pd.DataFrame({'a': ['Yes', 'Yes', 'No', 'No'], 'b': range(4)})
    frame.describe()
 
-This behaviour can be controlled by providing a list of types as ``include``/``exclude``
+This behavior can be controlled by providing a list of types as ``include``/``exclude``
 arguments. The special value ``all`` can also be used:
 
 .. ipython:: python

doc/source/conf.py

Lines changed: 4 additions & 0 deletions
@@ -73,10 +73,14 @@
               'sphinx.ext.ifconfig',
               'sphinx.ext.linkcode',
               'nbsphinx',
+              'sphinxcontrib.spelling'
               ]
 
 exclude_patterns = ['**.ipynb_checkpoints']
 
+spelling_word_list_filename = ['spelling_wordlist.txt', 'names_wordlist.txt']
+spelling_ignore_pypi_package_names = True
+
 with open("index.rst") as f:
     index_rst_lines = f.readlines()
 

doc/source/contributing.rst

Lines changed: 19 additions & 0 deletions
@@ -436,6 +436,25 @@ the documentation are also built by Travis-CI. These docs are then hosted `here
 <http://pandas-docs.github.io/pandas-docs-travis>`__, see also
 the :ref:`Continuous Integration <contributing.ci>` section.
 
+Spell checking documentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When contributing to documentation to **pandas** it's good to check if your work
+contains any spelling errors. Sphinx provides an easy way to spell check documentation
+and docstrings.
+
+Running the spell check is easy. Just navigate to your local ``pandas/doc/`` directory and run::
+
+    python make.py spellcheck
+
+The spellcheck will take a few minutes to run (between 1 to 6 minutes). Sphinx will alert you
+with warnings and misspelt words - these misspelt words will be added to a file called
+``output.txt`` and you can find it on your local directory ``pandas/doc/build/spelling/``.
+
+The Sphinx spelling extension uses an EN-US dictionary to correct words, what means that in
+some cases you might need to add a word to this dictionary. You can do so by adding the word to
+the bag-of-words file named ``spelling_wordlist.txt`` located in the folder ``pandas/doc/``.
+
 .. _contributing.code:
 
 Contributing to the code base

doc/source/contributing_docstring.rst

Lines changed: 3 additions & 3 deletions
@@ -103,7 +103,7 @@ left before or after the docstring. The text starts in the next line after the
 opening quotes. The closing quotes have their own line
 (meaning that they are not at the end of the last sentence).
 
-In rare occasions reST styles like bold text or itallics will be used in
+In rare occasions reST styles like bold text or italics will be used in
 docstrings, but is it common to have inline code, which is presented between
 backticks. It is considered inline code:
 
@@ -706,7 +706,7 @@ than 5, to show the example with the default values. If doing the ``mean``, we
 could use something like ``[1, 2, 3]``, so it is easy to see that the value
 returned is the mean.
 
-For more complex examples (groupping for example), avoid using data without
+For more complex examples (grouping for example), avoid using data without
 interpretation, like a matrix of random numbers with columns A, B, C, D...
 And instead use a meaningful example, which makes it easier to understand the
 concept. Unless required by the example, use names of animals, to keep examples
@@ -877,7 +877,7 @@ be tricky. Here are some attention points:
   the actual error only the error name is sufficient.
 
 * If there is a small part of the result that can vary (e.g. a hash in an object
-  represenation), you can use ``...`` to represent this part.
+  representation), you can use ``...`` to represent this part.
 
   If you want to show that ``s.plot()`` returns a matplotlib AxesSubplot object,
   this will fail the doctest ::

doc/source/cookbook.rst

Lines changed: 11 additions & 11 deletions
@@ -286,7 +286,7 @@ New Columns
    df = pd.DataFrame(
         {'AAA' : [1,1,1,2,2,2,3,3], 'BBB' : [2,1,3,4,5,1,2,3]}); df
 
-Method 1 : idxmin() to get the index of the mins
+Method 1 : idxmin() to get the index of the minimums
 
 .. ipython:: python
 
@@ -307,7 +307,7 @@ MultiIndexing
 
 The :ref:`multindexing <advanced.hierarchical>` docs.
 
-`Creating a multi-index from a labeled frame
+`Creating a MultiIndex from a labeled frame
 <http://stackoverflow.com/questions/14916358/reshaping-dataframes-in-pandas-based-on-column-labels>`__
 
 .. ipython:: python
@@ -330,7 +330,7 @@ The :ref:`multindexing <advanced.hierarchical>` docs.
 Arithmetic
 **********
 
-`Performing arithmetic with a multi-index that needs broadcasting
+`Performing arithmetic with a MultiIndex that needs broadcasting
 <http://stackoverflow.com/questions/19501510/divide-entire-pandas-multiindex-dataframe-by-dataframe-variable/19502176#19502176>`__
 
 .. ipython:: python
@@ -342,7 +342,7 @@ Arithmetic
 Slicing
 *******
 
-`Slicing a multi-index with xs
+`Slicing a MultiIndex with xs
 <http://stackoverflow.com/questions/12590131/how-to-slice-multindex-columns-in-pandas-dataframes>`__
 
 .. ipython:: python
@@ -363,7 +363,7 @@ To take the cross section of the 1st level and 1st axis the index:
 
    df.xs('six',level=1,axis=0)
 
-`Slicing a multi-index with xs, method #2
+`Slicing a MultiIndex with xs, method #2
 <http://stackoverflow.com/questions/14964493/multiindex-based-indexing-in-pandas>`__
 
 .. ipython:: python
@@ -386,13 +386,13 @@ To take the cross section of the 1st level and 1st axis the index:
    df.loc[(All,'Math'),('Exams')]
    df.loc[(All,'Math'),(All,'II')]
 
-`Setting portions of a multi-index with xs
+`Setting portions of a MultiIndex with xs
 <http://stackoverflow.com/questions/19319432/pandas-selecting-a-lower-level-in-a-dataframe-to-do-a-ffill>`__
 
 Sorting
 *******
 
-`Sort by specific column or an ordered list of columns, with a multi-index
+`Sort by specific column or an ordered list of columns, with a MultiIndex
 <http://stackoverflow.com/questions/14733871/mutli-index-sorting-in-pandas>`__
 
 .. ipython:: python
@@ -664,7 +664,7 @@ The :ref:`Pivot <reshaping.pivot>` docs.
 `Plot pandas DataFrame with year over year data
 <http://stackoverflow.com/questions/30379789/plot-pandas-data-frame-with-year-over-year-data>`__
 
-To create year and month crosstabulation:
+To create year and month cross tabulation:
 
 .. ipython:: python
 
@@ -677,7 +677,7 @@ To create year and month crosstabulation:
 Apply
 *****
 
-`Rolling Apply to Organize - Turning embedded lists into a multi-index frame
+`Rolling Apply to Organize - Turning embedded lists into a MultiIndex frame
 <http://stackoverflow.com/questions/17349981/converting-pandas-dataframe-with-categorical-values-into-binary-values>`__
 
 .. ipython:: python
@@ -1029,8 +1029,8 @@ Skip row between header and data
     01.01.1990 05:00;21;11;12;13
     """
 
-Option 1: pass rows explicitly to skiprows
-""""""""""""""""""""""""""""""""""""""""""
+Option 1: pass rows explicitly to skip rows
+"""""""""""""""""""""""""""""""""""""""""""
 
 .. ipython:: python
 

doc/source/dsintro.rst

Lines changed: 2 additions & 2 deletions
@@ -1014,7 +1014,7 @@ Deprecate Panel
 Over the last few years, pandas has increased in both breadth and depth, with new features,
 datatype support, and manipulation routines. As a result, supporting efficient indexing and functional
 routines for ``Series``, ``DataFrame`` and ``Panel`` has contributed to an increasingly fragmented and
-difficult-to-understand codebase.
+difficult-to-understand code base.
 
 The 3-D structure of a ``Panel`` is much less common for many types of data analysis,
 than the 1-D of the ``Series`` or the 2-D of the ``DataFrame``. Going forward it makes sense for
@@ -1023,7 +1023,7 @@ pandas to focus on these areas exclusively.
 Oftentimes, one can simply use a MultiIndex ``DataFrame`` for easily working with higher dimensional data.
 
 In addition, the ``xarray`` package was built from the ground up, specifically in order to
-support the multi-dimensional analysis that is one of ``Panel`` s main usecases.
+support the multi-dimensional analysis that is one of ``Panel`` s main use cases.
 `Here is a link to the xarray panel-transition documentation <http://xarray.pydata.org/en/stable/pandas.html#panel-transition>`__.
 
 .. ipython:: python

doc/source/ecosystem.rst

Lines changed: 3 additions & 3 deletions
@@ -187,8 +187,8 @@ and metadata disseminated in
 `SDMX <http://www.sdmx.org>`_ 2.1, an ISO-standard
 widely used by institutions such as statistics offices, central banks,
 and international organisations. pandaSDMX can expose datasets and related
-structural metadata including dataflows, code-lists,
-and datastructure definitions as pandas Series
+structural metadata including data flows, code-lists,
+and data structure definitions as pandas Series
 or multi-indexed DataFrames.
 
 `fredapi <https://github.com/mortada/fredapi>`__
@@ -263,7 +263,7 @@ Data validation
 `Engarde <http://engarde.readthedocs.io/en/latest/>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Engarde is a lightweight library used to explicitly state your assumptions abour your datasets
+Engarde is a lightweight library used to explicitly state your assumptions about your datasets
 and check that they're *actually* true.
 
 .. _ecosystem.extensions:
