ENH: read_html to handle rowspan, colspan

#### Code Sample, a copy-pastable example if possible

```python
import pandas as pd
pd.read_html('https://www.ssa.gov/policy/docs/statcomps/supplement/2015/5h.html')[0]
```

This has complex table headings:

<img width="967" alt="annual_statistical_supplement__2015_-_beneficiary_families_with_oasdi_benefits_in_current-payment_status__5_h_" src="https://user-images.githubusercontent.com/976970/28495218-7db85530-6efb-11e7-9f64-8b56a61509d4.png">

`read_html` output begins with:

```
                                                 Year                            Retired-worker families        Survivor families             Disabled-worker families                                Unnamed: 4_level_0                              Unnamed: 5_level_0 Unnamed: 6_level_0 Unnamed: 7_level_0 Unnamed: 8_level_0 Unnamed: 9_level_0 Unnamed: 10_level_0 Unnamed: 11_level_0  \
                                          Worker only                                  Worker and wife?a  Non-disabled widow only        Widowed mother or father and?                                       Worker only                            Worker, wife,?b and?  Worker and spouse Unnamed: 7_level_1 Unnamed: 8_level_1 Unnamed: 9_level_1 Unnamed: 10_level_1 Unnamed: 11_level_1
                                                  All                                                Men                    Women                              1?child                                        2?children                              3 or more children                All                Men              Women            1?child  2 or more children Unnamed: 11_level_2
0                                                 NaN                                 Number?(thousands)                      NaN                                  NaN                                               NaN                                             NaN                NaN                NaN                NaN                NaN                 NaN                 NaN
1                                                1945                                                416                      338                                   78                                               181                                              95              86.00              48.00              24.00              .?.?.               .?.?.               .?.?.
```

(row 0 of the output is probably something one would have to manually eliminate)
#### Problem description

For HTML headings with rowspan and colspan elements, `read_html` has undesirable behavior. Basically `read_html` packs all heading `<th>` elements in any particular row to the left, so any particular column no longer has any association with the `<th>` elements that are actually above it in the HTML table.

Ample discussion here about the analogous pandas+Excel test case: https://github.com/pandas-dev/pandas/issues/4679

Relevant web discussions:
- https://stackoverflow.com/questions/23703638/bug-in-pandas-read-html-method
- https://stackoverflow.com/questions/28763891/what-should-i-do-when-tr-has-rowspan
- https://johnricco.github.io/2017/04/04/python-html/   (using his example)

This may be an issue with the underlying parsers and cannot be solved well in pandas. This appears to be the behavior with both lxml and bs4/html5lib.

#### Expected Output

Each column should be associated with the `<th>` elements above it in the table. This might be a multi-row column name (as it is now) (a `MultiIndex`?) or a tuple (presumably if the argument `tupleize_cols` is set to `True`). Instead, currently, column _n_ is associated with the _n_ th `<th>` entry in the table row regardless of the settings of rowspan/colspan.

It may be this is possible to do properly in current pandas in which case I apologize for filing the issue (but I'd be happy to know how to do it).

#### Output of ``pd.show_versions()``

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.US-ASCII
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.2.0
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.3.0
sphinx: 1.6.3
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.5.3
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
</details>


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

ENH: read_html to handle rowspan, colspan #17054

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of `pd.show_versions()`

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

ENH: read_html to handle rowspan, colspan #17054

Description

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Output of `pd.show_versions()`