Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read html with colspan rowspan #17084

Closed
wants to merge 3 commits into from
Closed

Read html with colspan rowspan #17084

wants to merge 3 commits into from

Conversation

jowens
Copy link

@jowens jowens commented Jul 26, 2017

There's a parsererror include in the test file that doesn't need to be there any more since I removed the test.

whatsnew entry:

read_html now:

  • handles colspan/rowspan in table elements
  • attempts to infer table headers in the absence of <thead>

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

@jowens : Yikes! That's a pretty massive diff. Let's rebase onto master first so we can have a better idea of what you're actually changing.

@gfyoung gfyoung added Enhancement IO HTML read_html, to_html, Styler.apply, Styler.applymap labels Jul 26, 2017
@jowens
Copy link
Author

jowens commented Jul 26, 2017

Please help me with the rebase. I thought I'd done it right ...

 9847  git commit -m"read_html handles rowspan and colspan in tables, infers headers if <thead> is not specified" pandas/io/html.py
 9848  git commit -m"added rowspan/colspan/infer-header tests. removed test_computer_sales_page, which now appears to parse correctly." pandas/tests/io/test_html.py
 9849  git rebase upstream/master
 9851  git rebase -i HEAD~2
 9852  git rebase -i master
 9853  git push origin html_read-handles-colspan-rowspan-and-infers-headers -f
 9854  git push origin read_html_with_colspan_rowspan -f
 9855  git push origin read_html_with_colspan_rowspan

@jowens
Copy link
Author

jowens commented Jul 26, 2017

FWIW, I changed exactly two files.

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

@jowens : Save the commit hashes of your two PR commits and then do the following:

  1. Update your origin master branch to match upstream.
  2. Once they are even, on this PR branch, run git reset --hard origin/master.
  3. Then for each commit hash, perform git cherry-pick <hash>
  4. Push to your PR branch

Let me know if you have any questions!

@jowens
Copy link
Author

jowens commented Jul 26, 2017

@gfyoung you have the patience of a saint. I don't know what went so wrong that I'm having all this trouble merging; I'm tempted to just start over with a clean branch. Anyway, for your step 4, can you help?

$ git status
On branch read_html_with_colspan_rowspan
nothing added to commit but untracked files present (use "git add" to track)
...
$ git push
fatal: The current branch read_html_with_colspan_rowspan has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin read_html_with_colspan_rowspan

$     git push --set-upstream origin read_html_with_colspan_rowspan
To github.com:jowens/pandas.git
 ! [rejected]            read_html_with_colspan_rowspan -> read_html_with_colspan_rowspan (non-fast-forward)
error: failed to push some refs to 'git@github.com:jowens/pandas.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

@jowens
Copy link
Author

jowens commented Jul 26, 2017

My history, for posterity. I tried to follow your directions, but my git experience doesn't involve pull requests in this way.

 8954  git pull upstream
 8956  git checkout upstream
 8957  git checkout master
 8958  git fetch upstream
 8959  git checkout read_html_with_colspan_rowspan
 8960  git reset --hard origin/master
 8961  git cherry-pick b922af71bc606aa207cda0725990204b47f4e24f
 8970  python setup.py build_ext --inplace
 8971  py.test-2.7 pandas/io/tests/test_html.py
 8974  git cherry-pick --continue
 8975  git cherry-pick 5818f804b0ce99cecbb7ae079764a634e69ab8e4
 8976  git pull
 8977  git pull origin/master
 8978  git pull master
 8979  git pull origin
 8980  git cherry-pick --continue
 8981  git status
 8982  git push
 8983      git push --set-upstream origin read_html_with_colspan_rowspan

@jowens
Copy link
Author

jowens commented Jul 26, 2017

Should I just start over? It'd surely save you a lot of effort.

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

Should I just start over? It'd surely save you a lot of effort.

Whatever is easiest for you. I'm here to help you through the process so that we can get beneficial contributions like yours merged.

Also, at your last step, you should run:

git push -f --set-upstream origin read_html_with_colspan_rowspan

@pep8speaks
Copy link

pep8speaks commented Jul 26, 2017

Hello @jowens! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 26, 2017 at 23:16 Hours UTC

@jowens
Copy link
Author

jowens commented Jul 26, 2017

Ah, the mysterious -f.

This has now been pushed. What next? You're awesome, @gfyoung.

@jowens
Copy link
Author

jowens commented Jul 26, 2017

OK, I'm gonna start over. Stay tuned!

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

OK, I'm gonna start over. Stay tuned!

I'm confused...why are you starting over?

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

Ah, the mysterious -f.

This means push your version of the branch to origin, overwriting whatever was there previously. Thus, it doesn't care if your branch was behind its remote counterpart. (-f = --force)

@jowens
Copy link
Author

jowens commented Jul 26, 2017

I started over because even after doing a second round I had merge conflicts, and so starting over seemed most sensible in terms of my time. This round went better. #17089

@gfyoung
Copy link
Member

gfyoung commented Jul 26, 2017

Okay, closing this then.

@gfyoung gfyoung closed this Jul 26, 2017
@gfyoung gfyoung added this to the No action milestone Jul 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement IO HTML read_html, to_html, Styler.apply, Styler.applymap
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: read_html to handle rowspan, colspan
3 participants