Skip to content

DOC: update and seperate the Series.drop and Dataframe.drop docstring #20120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 13, 2018

Conversation

JennaVergeynst
Copy link
Contributor

@JennaVergeynst JennaVergeynst commented Mar 10, 2018

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

Errors found:
Errors in parameters section
Parameter "columns" description should finish with "."

This error is caused by .. versionadded:: 0.21.0, which needs to stay in place.

@pep8speaks
Copy link

pep8speaks commented Mar 10, 2018

Hello @JennaVergeynst! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 12, 2018 at 21:59 Hours UTC

weight 1.0 0.8
"""
return super(DataFrame,
self).drop(labels=labels, axis=axis, index=index,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indendation is a bit strange here. I would set the self on the same line

Whether to drop labels from the index (0 / 'index') or
columns (1 / 'columns').
index : None
Redundant for application on Series, but index can be used instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is now in the data.frame section, this is probably not redundant?

Redundant for application on Series, but index can be used instead
of labels.
columns : None
Redundant for application on Series, but index can be used instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is now in the data.frame section, this is probably not redundant?


.. versionadded:: 0.21.0
level : int or level name, optional
For MultiIndex.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it useful to make this more detailed? e.g. Level for which the labels will be removed


Returns
-------
dropped : type of caller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the generic, this was the same as the caller, but here we know it is a dataframe

weight 1.0 0.8
length 0.3 0.2

>>> df.drop(index='cow',columns='small')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing space after ,


Drop columns and/or rows of MultiIndex

>>> midx = pd.MultiIndex(levels=[['lama','cow','falcon'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing spaces after ,


Returns
-------
dropped : type of caller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas.Series

Index or column labels to drop.
axis : int or axis name, default 0
Whether to drop labels from the index (0 / 'index') or
columns (1 / 'columns').
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write (0 or 'index') and columns (1 or 'columns')

----------
labels : single label or list-like
Index or column labels to drop.
axis : int or axis name, default 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfr. gitter discussion: axis : {0 or 'index', 1 or 'columns'}, default 0

DataFrame.dropna : Return DataFrame with labels on given axis omitted
where (all or any) data are missing
DataFrame.drop_duplicates : Return DataFrame with duplicate rows
removed, optionally only considering certain columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference DataFrame.loc (first) as well as this is an indexing operations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also Series.drop

A B C D
2 8 9 10 11

Drop columns and/or rows of MultiIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MultiIndexed DataFrame

[0.3,0.2]])
>>> df
big small
lama speed 45.0 30.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is layed out correctly, a copy-paste issue?

>>> midx = pd.MultiIndex(levels=[['lama','cow','falcon'],
... ['speed','weight','length']],
... labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
... [0, 1, 2, 0, 1, 2, 0, 1, 2]])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this example a bit simpler, maybe using integers for the value column (and order them, e.g. use np.arange(len(midx)) to create (big/small)

DataFrame.dropna : Return DataFrame with labels on given axis omitted
where (all or any) data are missing
DataFrame.drop_duplicates : Return DataFrame with duplicate rows
removed, optionally only considering certain columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also Series.drop

@jreback jreback added Docs Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Mar 10, 2018
@codecov
Copy link

codecov bot commented Mar 12, 2018

Codecov Report

Merging #20120 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20120      +/-   ##
==========================================
- Coverage   91.72%    91.7%   -0.03%     
==========================================
  Files         150      150              
  Lines       49156    49156              
==========================================
- Hits        45090    45078      -12     
- Misses       4066     4078      +12
Flag Coverage Δ
#multiple 90.08% <100%> (-0.03%) ⬇️
#single 41.85% <54.54%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 95.84% <ø> (ø) ⬆️
pandas/core/frame.py 97.18% <100%> (ø) ⬆️
pandas/core/series.py 93.86% <100%> (ø) ⬆️
pandas/plotting/_converter.py 65.07% <0%> (-1.74%) ⬇️
pandas/core/base.py 96.78% <0%> (-0.02%) ⬇️
pandas/core/indexes/base.py 96.66% <0%> (-0.01%) ⬇️
pandas/core/indexes/multi.py 95.06% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 840d432...643db5c. Read the comment docs.

@jorisvandenbossche jorisvandenbossche added this to the 0.23.0 milestone Mar 13, 2018
@jorisvandenbossche jorisvandenbossche merged commit c6eec25 into pandas-dev:master Mar 13, 2018
@jorisvandenbossche
Copy link
Member

@JennaVergeynst Thanks for the PR!

@JennaVergeynst JennaVergeynst deleted the DOC_series_drop branch March 13, 2018 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants