ENH: GH17054: read_html() handles rowspan/colspan and infers headers #17089

Closed
145 commits
4bf2f2e
ENH: GH17054: read_html() handles rowspan/colspan and infers headers
jowens Jul 26, 2017
80d9c2b
in python 3, lambdas no longer take tuples as args. thanks pep 3113.
jowens Jul 27, 2017
26d1f6a
fixing lint error
jowens Jul 27, 2017
37af4ea
in python3, zip does not return a list, so list(zip(...))
jowens Jul 27, 2017
86dee93
Merge branch 'master' into read_html_with_colspan_rowspan
jowens Aug 29, 2017
d3eca72
Merge branch 'master' into read_html_with_colspan_rowspan
jowens Sep 6, 2017
f064562
documentation changes only
jowens Sep 6, 2017
67c8a59
Merge branch 'read_html_with_colspan_rowspan' of github.com:jowens/pa…
jowens Sep 6, 2017
5a38278
documentation changes only
jowens Sep 7, 2017
39f7814
documentation changes only, limited to 80 cols
jowens Sep 7, 2017
531863f
more documentation edits
jowens Sep 8, 2017
818d394
minor documentation edits
jowens Sep 9, 2017
f3a6aa3
better return type explanation in code, added issue number to tests
jowens Sep 9, 2017
2f904b2
cleaning up legacy documentation issues
jowens Sep 18, 2017
f4e7592
remove 'if'
jowens Sep 18, 2017
293d9e4
newlines for clarity
jowens Sep 18, 2017
efabae4
DOC: whatsnew typos
jreback Jul 26, 2017
552677f
ENH: GH17054: read_html() handles rowspan/colspan and infers headers
jowens Jul 26, 2017
1aacf17
TST: Check more error messages in tests (#17075)
gfyoung Jul 26, 2017
359890f
BUG: Respect dtype when calling pivot_table with margins=True
toobaz Jul 26, 2017
3fd2612
MAINT: Add missing space in parsers.pyx
gfyoung Jul 27, 2017
76249bf
MAINT: Add missing paren around print statement
gfyoung Jul 27, 2017
77d16d4
DOC: fix typos in missing.rst
jreback Jul 27, 2017
bd50a4f
in python 3, lambdas no longer take tuples as args. thanks pep 3113.
jowens Jul 27, 2017
452e08d
fixing lint error
jowens Jul 27, 2017
ecfaa4c
in python3, zip does not return a list, so list(zip(...))
jowens Jul 27, 2017
69cd83c
DOC: further clean-up null/na changes (#17113)
jorisvandenbossche Jul 29, 2017
1e5cfa1
BUG: Allow pd.unique to accept tuple of strings (#17108)
mroeschke Jul 30, 2017
c502dba
BUG: Allow Series with same name with crosstab (#16028)
mroeschke Jul 30, 2017
2155c3e
COMPAT: make sure use_inf_as_null is deprecated (#17126)
jreback Aug 1, 2017
3ed9f53
CI: bump version of xlsxwriter to 0.5.2 (#17142)
jreback Aug 1, 2017
9a50c21
DOC: Clean up instructions in ISSUE_TEMPLATE (#17146)
gfyoung Aug 1, 2017
5759eff
Add missing space to the NotImplementedError's message for compound d…
FKint Aug 1, 2017
3855039
DOC: (de)type the return value of concat (#17079) (#17119)
jebob Aug 1, 2017
d7cb627
BUG: Thoroughly dedup column names in read_csv (#17095)
gfyoung Aug 1, 2017
9d32df6
DOC: Additions/updates to documentation (#17150)
alanyee Aug 2, 2017
5ce00e1
ENH: add to/from_parquet with pyarrow & fastparquet (#15838)
jreback Aug 2, 2017
9aadb64
DOC: doc typos, xref #15838
jreback Aug 2, 2017
89fa421
TST: test for categorical index monotonicity (#17152)
jreback Aug 3, 2017
ccdae36
MAINT: Remove non-standard and inconsistently-used imports (#17085)
jbrockmendel Aug 3, 2017
5b42bdf
DOC: typos in whatsnew
Aug 3, 2017
56957cf
DOC: whatsnew 0.21.0 fixes
jreback Aug 3, 2017
d2e21c3
BUG: Fix CSV parsing of singleton list header (#17090)
threecgreen Aug 3, 2017
20487bf
ENH: Support strings containing '%' in add_prefix/add_suffix (#17151)…
jschendel Aug 3, 2017
b4b4c77
REF: repr - allow block to override values that get formatted (#17143)
jorisvandenbossche Aug 4, 2017
b720f0d
MAINT: Drop unnecessary newlines in issue template
gfyoung Aug 7, 2017
43dab45
remove direct import of nan
jbrockmendel Aug 7, 2017
94a734a
use == to test String equality (#17171)
jhelie Aug 7, 2017
e143ee1
ENH: Add warning when setting into nonexistent attribute (#16951)
deniederhut Aug 7, 2017
5a523bb
DOC: added string processing comparison with SAS (#16497)
natethedrummer Aug 7, 2017
0bfad7c
CLN: remove unused get methods in internals (#17169)
jbrockmendel Aug 7, 2017
a4e4909
TST: Partial Boolean DataFrame Indexing (#17186)
mroeschke Aug 7, 2017
e8fab8a
CLN: Reformat docstring for IPython fixture
gfyoung Aug 7, 2017
d089d44
Define Series.plot and Series.hist in class definition (#17199)
jbrockmendel Aug 8, 2017
b09b274
BUG: support pandas objects in iloc with old numpy versions (#17194)
toobaz Aug 8, 2017
cc8c5d7
Implement _make_accessor classmethod for PandasDelegate (#17166)
jbrockmendel Aug 8, 2017
df9710b
Create ABCDateOffset (#17165)
jbrockmendel Aug 9, 2017
e71e6d7
BUG: resample and apply modify the index type for empty Series (#17149)
discort Aug 9, 2017
e9c7f29
DOC: Updated NDFrame.astype docs (#17203)
topper-123 Aug 9, 2017
38293d3
MAINT: Minor touch-ups to GitHub PULL_REQUEST_TEMPLATE (#17207)
dhimmel Aug 9, 2017
7280e6c
CLN: replace %s syntax with .format in core.computation (#17209)
jschendel Aug 10, 2017
421dcf4
Bugfix for multilevel columns with empty strings in Python 2 (#17099)
chrisjbillington Aug 10, 2017
d5733ee
CLN/ASV clean-up frame stat ops benchmarks (#17205)
jorisvandenbossche Aug 10, 2017
9f69583
BUG: Rolling apply on DataFrame with Datetime index returns NaN (#17156)
FXocena Aug 10, 2017
1e1ce40
CLN: Remove import exception handling (#17218)
dhimmel Aug 10, 2017
a1509dc
MAINT: Remove extra the's in deprecation messages (#17222)
gfyoung Aug 11, 2017
6788533
DOC: Patch docs in _decorators.py
gfyoung Aug 11, 2017
619e031
CLN: replace %s syntax with .format in pandas.util (#17224)
jschendel Aug 11, 2017
9e26997
Add 'See also' sections (#17223)
topper-123 Aug 11, 2017
a7311d2
move pivot_table doc-string to DataFrame (#17174)
jbrockmendel Aug 11, 2017
1ac9ede
Remove import of pandas as pd in core.window (#17233)
jbrockmendel Aug 12, 2017
a2d8d23
TST: Move more frame tests to SharedWithSparse (#17227)
kernc Aug 12, 2017
013b983
REF: _get_objs_combined_axis (#17217)
toobaz Aug 12, 2017
fddb66d
ENH/PERF: Remove frequency inference from .dt accessor (#17210)
cpcloud Aug 14, 2017
2e55156
Fix apparent typo in tests (#17247)
jbrockmendel Aug 14, 2017
b49446e
COMPAT: avoid calling getsizeof() on PyPy
mattip Aug 15, 2017
536b761
CLN: replace %s syntax with .format in pandas.core.reshape (#17252)
jschendel Aug 15, 2017
a1ff671
ENH: Infer compression from non-string paths (#17206)
dhimmel Aug 15, 2017
df1b0dc
Fix bugs in IntervalIndex.is_non_overlapping_monotonic (#17238)
jschendel Aug 15, 2017
8fe1cc3
BUG: Fix behavior of argmax and argmin with inf (#16449) (#16449)
DGrady Aug 15, 2017
357e7ae
CLN: Remove have_pytz (#17266)
jbrockmendel Aug 16, 2017
aa97aa6
CLN: replace %s syntax with .format in core.dtypes and core.sparse (#…
jschendel Aug 17, 2017
a618bec
Replace imports of * with explicit imports (#17269)
jbrockmendel Aug 17, 2017
db3ea2f
TST: pytest deprecation warnings GH17197 (#17253)
swyoon Aug 17, 2017
de60666
Handle more date/datetime/time formats (#15871)
Winand Aug 18, 2017
0bbda54
DOC: add example on json_normalize (#16438)
zzgao Aug 18, 2017
c148dd2
BUG: Have object dtype for empty Categorical.categories (#17249)
TomAugspurger Aug 19, 2017
155c11a
CLN: replace %s syntax with .format in pandas.tseries (#17290)
jschendel Aug 19, 2017
e4aeed2
TST: parameterize consistency tests for rolling/expanding windows (#1…
jreback Aug 19, 2017
db11418
FIX: define `DataFrame.items` for all versions of python (#17214)
tacaswell Aug 19, 2017
a256e26
PERF: Update ASV publish config (#17293)
TomAugspurger Aug 20, 2017
75d46a6
DOC: Expand docstrings for head / tail methods (#16941)
yosukeBaya4 Aug 21, 2017
172abfb
MAINT: Use set literal for unsupported + depr args
gfyoung Aug 21, 2017
1982aca
DOC: Add proper docstring to maybe_convert_indices
gfyoung Aug 21, 2017
393bb19
DOC: Improving docstring of take method (#16948)
matagus Aug 21, 2017
595e0a4
BUG: Fixed regex in asv.conf.json (#17300)
TomAugspurger Aug 21, 2017
6a45d36
Remove unnecessary usage of _TSObject (#17297)
jbrockmendel Aug 21, 2017
5f077f3
BUG: clip should handle null values
mgasvoda Aug 21, 2017
a10fa92
BUG: fillna returns frame when inplace=True if value is a dict (#1615…
Aug 21, 2017
8dfb95b
CLN: Index.append() refactoring (#16236)
toobaz Aug 22, 2017
8326c83
DEPS: set min versions (#17002)
jreback Aug 22, 2017
8fbd8f8
CLN: replace %s syntax with .format in core.tools, algorithms.py, bas…
jschendel Aug 22, 2017
3625190
BUG: Fix strange behaviour of Series.iloc on MultiIndex Series (#1714…
Aug 22, 2017
7364711
DOC: Add module doc-string to tseries/api.py
gfyoung Aug 23, 2017
e5797fa
MAINT: Clean up docs in pandas/errors/__init__.py
gfyoung Aug 23, 2017
9be531a
CLN: replace %s syntax with .format in missing.py, nanops.py, ops.py …
jschendel Aug 24, 2017
a9574b0
Make pd.Period immutable (#17239)
jbrockmendel Aug 24, 2017
3e31383
Bug: groupby multiindex levels equals rows (#16859)
Aug 24, 2017
e5030b3
BUG: Cannot use tz-aware origin in to_datetime (#16842)
ivybae Aug 24, 2017
7be53ed
Replace usage of total_seconds compat func with timedelta method (#17…
jbrockmendel Aug 25, 2017
f4adbb9
CLN: replace %s syntax with .format in core/indexing.py (#17357)
cbertinato Aug 28, 2017
b1b3325
DOC: Point to dev-docs in issue template (#17353)
gfyoung Aug 28, 2017
76cc924
CLN: remove total_seconds compat from json (#17341)
chris-b1 Aug 29, 2017
0309dae
CLN: Move test_intersect_str_dates (#17366)
jschendel Aug 29, 2017
c523bfc
BUG: Respect dups in reindexing CategoricalIndex (#17355)
gfyoung Aug 29, 2017
5a6f2ac
Unify Index._dir_* with Series implementation (#17117)
jbrockmendel Aug 29, 2017
ce8ccba
BUG: make order of index from pd.concat deterministic (#17364)
toobaz Aug 29, 2017
a585e09
Fix typo that causes several NaT methods to have incorrect docstrings…
jbrockmendel Aug 29, 2017
8199559
CLN: replace %s syntax with .format in io/formats/format.py (#17358)
cbertinato Aug 30, 2017
6ec1044
PKG: Added pyproject.toml for PEP 518 (#16745)
TomAugspurger Aug 30, 2017
c33af56
DOC: Update Overview page in documentation (#17368)
iuliakhomenko Aug 30, 2017
0f8205c
API: Have MultiIndex consturctors always return a MI (#17236)
TomAugspurger Aug 30, 2017
54f68b4
CLN: replace %s syntax with .format in io/formats/css.py, excel.py, p…
cbertinato Aug 31, 2017
b717ebc
BUG: not correctly using OrderedDict in test_series_apply (#17384)
sylviawhoa Aug 31, 2017
b61af0e
Remove boxplot from _dataframe_apply_whitelist (#17381)
jbrockmendel Aug 31, 2017
c80e8d0
API: Localize Series when calling to_datetime with utc=True (#6415) (…
mroeschke Sep 1, 2017
3a0dc92
TST: Enable tests in test_tools.py (#17405)
jschendel Sep 1, 2017
365f2fe
TST: remove tests and docs for legacy (pre 0.12) hdf5 support (#17404)
topper-123 Sep 1, 2017
d994323
Tslib unused (#17402)
jbrockmendel Sep 1, 2017
e94e572
DOC: Cleaned references to pandas <v0.12 in docs (#17375)
topper-123 Sep 2, 2017
6a02ffa
Remove unused _day and _month attrs (#17431)
jbrockmendel Sep 4, 2017
519c57f
DOC: Clean-up references to v12 to v14 (both included) (#17420)
topper-123 Sep 5, 2017
f22b895
BUG: Plotting Timedelta on y-axis #16953 (#17430)
s-weigand Sep 6, 2017
8edd85a
COMPAT: handle pyarrow deprecation of timestamps_to_ms in .from_panda…
jreback Sep 6, 2017
047727a
DOC/TST: Add examples to MultiIndex.get_level_values + related change…
topper-123 Sep 6, 2017
91a2300
documentation changes only
jowens Sep 6, 2017
41058ab
documentation changes only
jowens Sep 7, 2017
4926913
documentation changes only, limited to 80 cols
jowens Sep 7, 2017
14235ec
more documentation edits
jowens Sep 8, 2017
196c835
minor documentation edits
jowens Sep 9, 2017
fed4b03
better return type explanation in code, added issue number to tests
jowens Sep 9, 2017
c2d9cc6
cleaning up legacy documentation issues
jowens Sep 18, 2017
d4b213b
remove 'if'
jowens Sep 18, 2017
b16f6d5
newlines for clarity
jowens Sep 18, 2017
092889a
Merge branch 'read_html_with_colspan_rowspan' of github.com:jowens/pa…
jowens Sep 20, 2017
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.21.0.txt
@@ -125,6 +125,7 @@ Other Enhancements
- :func:`DataFrame.select_dtypes` now accepts scalar values for include/exclude as well as list-like. (:issue:`16855`)
- :func:`date_range` now accepts 'YS' in addition to 'AS' as an alias for start of year (:issue:`9313`)
- :func:`date_range` now accepts 'Y' in addition to 'A' as an alias for end of year (:issue:`9313`)
- :func:`read_html` handles colspan and rowspan arguments and attempts to infer a header if the header is not explicitly specified (:issue:`17054`)
- Integration with `Apache Parquet <https://parquet.apache.org/>`__, including a new top-level :func:`read_parquet` and :func:`DataFrame.to_parquet` method, see :ref:`here <io.parquet>`. (:issue:`15838`, :issue:`17438`)
- :func:`DataFrame.add_prefix` and :func:`DataFrame.add_suffix` now accept strings containing the '%' character. (:issue:`17151`)
- `read_*` methods can now infer compression from non-string paths, such as ``pathlib.Path`` objects (:issue:`17206`).
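
For readers unfamiliar with what handling rowspan/colspan means for the read_html entry in the hunk above, here is a rough sketch of the general idea: a cell with colspan=N is repeated across N columns, and a cell with rowspan=M is carried down into the next M-1 rows, so every row ends up with the same number of columns. This is not the code from this PR. The names `expand_spans` and `_CellCollector` are hypothetical, it uses only the standard library, and it assumes a single, non-nested, well-formed table.

```python
from html.parser import HTMLParser


class _CellCollector(HTMLParser):
    """Collect (text, rowspan, colspan) for every cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cell triples per <tr>
        self._cell = None  # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            # A missing or empty span attribute is treated as 1.
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None


def expand_spans(html):
    """Return the table as a rectangular list of rows with spans expanded."""
    parser = _CellCollector()
    parser.feed(html)
    parser.close()

    grid = []
    carried = {}  # column index -> (text, rows still to fill)
    for row in parser.rows:
        out, next_carried, col = [], {}, 0

        def place(text, rows_left):
            # Put text in the current column; remember it if it spans further rows.
            nonlocal col
            out.append(text)
            if rows_left > 0:
                next_carried[col] = (text, rows_left)
            col += 1

        def fill_from_above():
            # Copy down cells from earlier rows whose rowspan covers this column.
            while col in carried:
                text, rows_left = carried.pop(col)
                place(text, rows_left - 1)

        for text, rowspan, colspan in row:
            fill_from_above()
            for _ in range(colspan):
                place(text, rowspan - 1)
        fill_from_above()

        grid.append(out)
        carried = next_carried
    return grid


SAMPLE = """
<table>
  <tr><th rowspan="2">Name</th><th colspan="2">Score</th></tr>
  <tr><th>Math</th><th>Art</th></tr>
  <tr><td>Ann</td><td>90</td><td>85</td></tr>
</table>
"""

for line in expand_spans(SAMPLE):
    print(line)
# ['Name', 'Score', 'Score']
# ['Name', 'Math', 'Art']
# ['Ann', '90', '85']
```

The first two rows of the sample consist only of th cells, which is the kind of signal a parser could use to guess a header when none is given explicitly. How this PR actually infers headers is not shown on this page, so the sketch leaves that step out.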