
[skip ci] Fix Broken Link in NYC Taxi Notebook #616

Merged
merged 3 commits into rapidsai:branch-22.08 on Aug 4, 2022

Conversation

@isVoid (Contributor) commented Aug 2, 2022

This PR fixes the broken links for both the NYC 2016 and NYC 2017 datasets and reruns the notebooks under 22.08.

@review-notebook-app (bot)

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@github-actions bot added the Python (Related to Python code) label on Aug 2, 2022
Review comment on the diff at @@ -58,9 +58,9 @@

Member

Why is there a big slowdown? Running on a different GPU?



@isVoid (Contributor, Author), Aug 2, 2022

@taureandyernv, do you know how the notebook was run prior to this change? On what machine, with or without a memory pool, etc.?
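
For context, whether an RMM memory pool is enabled can shift these cell timings noticeably. A minimal sketch of enabling the pool at the top of a RAPIDS notebook; the 8 GiB size below is an assumption for illustration, and the thread does not record what the original run actually used:

```python
# Illustrative sketch: enable an RMM pool allocator before doing any cuDF/cuSpatial work,
# since pooled vs. non-pooled allocation can change cell timings noticeably.
# The 8 GiB initial pool size is an arbitrary example, not the notebook's setting.
import rmm

rmm.reinitialize(
    pool_allocator=True,          # pre-allocate a pool instead of a cudaMalloc per allocation
    initial_pool_size=8 * 2**30,  # 8 GiB; tune to the GPU being used
)
```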

Contributor

you're saving the parquet file as a csv :)
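For context, in cuDF the reader and writer have to match the file's actual format, regardless of the extension in the name. A minimal sketch of the distinction (file names are placeholders, not the notebook's paths):

```python
import cudf

# A Parquet file must go through the Parquet reader, even if it was saved
# under a ".csv" name; cudf.read_csv will not parse Parquet bytes.
df = cudf.read_parquet("taxi2017.parquet")  # placeholder path

# If a CSV copy is genuinely wanted, write one out explicitly rather than renaming the file:
df.to_csv("taxi2017.csv", index=False)
```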

Member

@taureandyernv do you have an answer to the question about what hardware it was run on, memory pool, etc.? We need to make sure this is not a real regression.

Member

In the future we should always store this information in the notebook.
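One lightweight way to record it is an environment cell near the top of the notebook; a minimal sketch (which fields to capture is a suggestion, not something decided in this thread):

```python
# Illustrative "environment info" cell for reproducibility: library versions plus GPU details.
import cudf
import cuspatial

print("cuDF version:     ", cudf.__version__)
print("cuSpatial version:", cuspatial.__version__)

# GPU model, driver version, and memory, via a Jupyter shell escape:
!nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
```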

Contributor

Actually... this is about right. This should take 30-ish seconds on Tesla-based hardware and about 3.5 minutes on a Quadro. @isVoid, what slowdown are you talking about?

Member

If you look at the diff in this PR, you will see the slowdown I am talking about.

The notebook as checked in stores the time taken by the previous run. We should document in the notebook what hardware, RAPIDS and CUDA version, etc. it was run on. Otherwise, when people run it and see a slower time, they are likely to be confused or surprised.

None of this "this is about right" guesswork. Let's be precise.

Member

Here is the slowdown:

old:

      "CPU times: user 11.2 s, sys: 6.27 s, total: 17.5 s\n",
      "Wall time: 17.7 s\n"

new:

      "CPU times: user 22.6 s, sys: 14.6 s, total: 37.2 s\n",
      "Wall time: 37.2 s\n"

Contributor

That is interesting; I honestly don't recall seeing it run that fast. However, we can't yet compare these numbers and say there's a regression, though we can strongly suspect it. When we built this notebook for the 0.14 release, I did my work on RTX 8000s (here is the commit with the ~3.5 minute run), ran some quick tests on the GV100s to see how my time could be so different, and only reran it on GV100s with this update. @thomcom used a different machine that was consistently faster with the PIP algorithm. It was cuSpatial's PIP that helped me realize the depth of the per-algorithm performance discrepancies between classes of cards in the same architecture family. Looking back through the branches, the notebook hasn't actually been rerun since its merge, and the CI Jenkins jobs for each release may have already been deleted. This may end up being a @thomcom question, since those 17-second numbers were run on his machine, per 6c3d853 and this commit (which I could only render). We may need him to rerun the notebook on that system to get any realistic understanding of whether there is a regression and where it could have happened. @thomcom, could you provide your system details for @harrism and, if you still have that system, rerun the notebook on it against the 22.08 nightlies and give us numbers?

A bit about the past:

While I can talk about what I recall observing then (and I could be mistaken), the challenge you're facing comes from the fact that testing and record keeping for notebook regressions stopped right around 0.14, and notebook testing didn't start back up until after 0.18; it wasn't nearly as thorough until very recently. Some libraries reestablished testing and record keeping more wholeheartedly than others, but cuGraph remains a shining example of what might help with system-spec tracking, with their table at the top of each notebook. For now, would it help if current and future cuSpatial notebooks included a version of cuGraph's tables?

Thoughts on the future:

I absolutely agree on precise record keeping with system specs, which is why it's a built-in portion of the outputs of that new beta notebook testing script (although we currently don't test the cuSpatial notebook). We're only collecting data, not yet processing or analyzing it. We're in talks to get it integrated into Ops. Would you like me to start testing cuSpatial notebooks? Happy to share more info.

Contributor

Hey @taureandyernv @harrism I'll try to rerun the notebook this afternoon or tomorrow.

@harrism added the bug (Something isn't working) and non-breaking (Non-breaking change) labels on Aug 2, 2022
@harrism (Member) commented Aug 2, 2022

Could probably skip CI for PRs like this?

@isVoid changed the title from "Fix Broken Link in NYC Taxi Notebook" to "[skip ci] Fix Broken Link in NYC Taxi Notebook" on Aug 2, 2022
@taureandyernv (Contributor) left a comment

Please change taxi2017.csv to taxi2017.parquet in the noted areas. Also, that data needs to be downloaded and it's about 4 GB.
Getting the data into the folder, which may be hard for a CI machine without ops support, may solve the speed issues.
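
A minimal sketch of that pattern, downloading the ~4 GB Parquet file once if it is not already in the folder and reading it with the Parquet reader; the URL and local path are placeholders, not the notebook's actual values:

```python
import os
import urllib.request

import cudf

PARQUET_PATH = "taxi2017.parquet"
PARQUET_URL = "https://example.com/taxi2017.parquet"  # placeholder URL

# Download the ~4 GB file only if it is not already present locally.
if not os.path.exists(PARQUET_PATH):
    urllib.request.urlretrieve(PARQUET_URL, PARQUET_PATH)

taxi2017 = cudf.read_parquet(PARQUET_PATH)
```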

@taureandyernv (Contributor)

I created a PR into your repo, @isVoid, with a fully working notebook that also runs fast (especially when the data is already downloaded :) ). Please merge it: isVoid#1

My wall time is ~40 seconds on average, which admittedly is a touch slower than I remember, but maybe by a second or two.

Commit: "Fix a parquet issues, fix dtype issues"
Review comment on lines 184 to 196:

old:
      "CPU times: user 11.2 s, sys: 6.27 s, total: 17.5 s\n",
      "Wall time: 17.7 s\n"

new:
      "CPU times: user 22.6 s, sys: 14.6 s, total: 37.2 s\n",
      "Wall time: 37.2 s\n"
Member

This is the slowdown I'm talking about. More than 2x slower.

Contributor

oh thanks! I hadn't noticed that at all!

@thomcom self-assigned this on Aug 3, 2022
@harrism (Member) commented Aug 3, 2022

Since there's still longwinded discussion happening, we have to retarget this to 22.10. We are in code freeze.

@taureandyernv (Contributor)

> Since there's still longwinded discussion happening, we have to retarget this to 22.10. We are in code freeze.

Hmmm, @harrism , i thought the discussion was a side topic of the possibility of a regression, but not dealing with the material updates of the notebook itself. All necessary changes are complete. This means that cuspatial won't have a working notebook for 22.08 and there are more eyes on spatial analytics right now...

@harrism (Member) commented Aug 4, 2022

We have rules: PRs must be approved before merging. Also, the labels don't say whether this PR is ready to merge. We should treat opening new PRs during burndown as a "big deal".

I approved this PR, but really @thomcom should approve it since he is more of a Python codeowner than I am (and since @isVoid can't approve his own PR). I will leave merging up to @isVoid since I don't know if it is ready.

@isVoid (Contributor, Author) commented Aug 4, 2022

I ran the code on a DGX with branch-22.08, and I got this timing for the cell:

CPU times: user 7.28 s, sys: 5.22 s, total: 12.5 s
Wall time: 12.5 s

I think the function's performance depends a lot on which machine it runs on. Since code freeze starts tomorrow, let's merge this and follow up on the effect of the machine on performance.

@isVoid (Contributor, Author) commented Aug 4, 2022

@gpucibot merge

@rapids-bot bot merged commit 9beebc3 into rapidsai:branch-22.08 on Aug 4, 2022
@thomcom (Contributor) commented Aug 4, 2022

I reran the benchmark and got 14.5 seconds.

Labels: bug (Something isn't working), non-breaking (Non-breaking change), Python (Related to Python code)
Projects: Status: Done
4 participants