Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise CONTRIBUTING.md #10644

Merged
merged 11 commits into from
Apr 27, 2022
Merged

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Apr 12, 2022

I have revised the CONTRIBUTING.md file to address several pieces that are out of date. I also revised a good portion of the text and updated external references. Finally, I wrapped the lines at 100 characters to align with other Markdown files in the C++ docs. I would prefer to adopt a convention of one sentence per line if reviewers agree, but went with the 100 character wrapping for now to be consistent with other docs.

@bdice bdice added the doc Documentation label Apr 12, 2022
@bdice bdice self-assigned this Apr 12, 2022
@bdice bdice added the non-breaking Non-breaking change label Apr 12, 2022

## Automated Build in Docker Container
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted the sections about using gpuCI locally and building Docker containers, because I think both are out of date. The Docker builds were removed in #10069. It is my understanding that the gpuCI scripts have not been an accurate representation of the CI workflow for some time. @rapidsai/ops-codeowners if you have any guidance or input on this, please feel free to weigh in!

@bdice bdice marked this pull request as ready for review April 12, 2022 19:58
CONTRIBUTING.md Outdated Show resolved Hide resolved
- Create the conda development environment `cudf_dev`:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it seem like conda is required. Many users frequently complain about this and we've gone to great lengths to make it possible to build without conda.

We should have different sections for "Here's how to build with just cmake" and "here's how to build with a conda environment".

Copy link
Contributor Author

@bdice bdice Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I have only built libcudf/cudf outside of rapids-compose once in the last year. I don't think I'm qualified to write/update these docs but I agree with the idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth putting the effort into as it likely the most valuable change we can make to this document. Building without conda is one of the most frequent user questions/requests we get.

See
#2770
#6979

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrhemstad Could we defer this to a later PR? I do think this would be a significant improvement (and may illuminate some challenges with installing from source as mentioned in those issues), but I think it's out of scope for this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimum I'd like to see for this PR:

  • Try building w/o conda and see what happens. If it works, great! Document that this is possible
  • If it doesn't:
    • Add some words to this document that states how a conda environment is currently required to satisfy build dependencies (these could be separated based on building libcudf vs cython/cudf python)
    • List what those dependencies are
    • File an issue to capture what work is needed to avoid requiring a conda environment for building libcudf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I put this information in an awkward place. At first I couldn't decide how to organize this information and then I forgot to change it once I made a decision. I am going to put these dependencies between "Create the build environment" and "Build cuDF from source." The instructions will be the same, calling build.sh, so we just need to list the dependencies above that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to include a special note about --cmake-args=\"\-DCUDF_ENABLE_ARROW_S3=OFF\"? (Or disable that option by default, like the other disabled-by-default I/O features?)

So we have a couple of options:

  • We can enable arrow by also installing libcurl4-openssl-dev and updating the get_arrow.cmake script to pass -DAWSSDK_SOURCE=AUTO
  • Or we can document why we are disabling S3 ( less requirements? )

I am hestiant to change the default value of CUDF_ENABLE_ARROW_S3 as that will be a breaking change to anyone building cudf presuming enabled by default ( users of build.sh or people directly calling cmake ).

As for why we need to pass AWSSDK_SOURCE=AUTO that is required for Arrow's build to work correctly when you don't have arrow already installled. This tells it you are okay with it building the dependency from source when it can't find it locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in f674f7f. @jlowe Does this make more sense now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this makes more sense. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrhemstad Any final comments on this before we merge?

CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved

```bash
make test
make test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful if there was an example of how to run just one single test, for c++ and python. I rarely work in the c++ layer and always have to poke around slack to figure out how to run just one c++ test when needed.

Copy link
Contributor Author

@bdice bdice Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I started this PR specifically to address concerns like:

  • running a single test
  • running a single benchmark
  • running all benchmarks
  • building docs

However, I realized that all of these topics would fit better in the C++ Developer Guide, Benchmarking Guide, Documentation Guide, and Testing Guide. Would it solve the problem if we cross-linked one or more of those documents here and did corresponding updates to those docs? If so, I'd like to focus this PR specifically on the Contributing Guide since it's so outdated, then follow up with revisions to the other documents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #9251 which has similar ideas for cross-linking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as it makes it in somewhere I'm happy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally for leaving it here for better visibility. But doesn't have to include every detail. An example usage followed by a cross link to a detail section might be a a way to go. I also agree it doesn't need to be tackled in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to leave this conversation open since it contains action items for future PRs. However, it would be nice to get a PR approval if you think the rest is good to go.

@jrhemstad
Copy link
Contributor

@bdice see also #9251

CONTRIBUTING.md Outdated Show resolved Hide resolved
@bdice
Copy link
Contributor Author

bdice commented Apr 15, 2022

@jrhemstad @brandon-b-miller I tagged you both for review since you provided the most comments in prior iterations - I'd like to merge this as-is and defer follow-up work to later PRs. This PR gets the formatting to be consistent, which will make the diffs of future changes easier to handle. Thanks!

@bdice bdice requested a review from robertmaynard April 18, 2022 22:20
@bdice
Copy link
Contributor Author

bdice commented Apr 27, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit f0b9117 into rapidsai:branch-22.06 Apr 27, 2022
@bdice bdice mentioned this pull request Aug 4, 2022
3 tasks
rapids-bot bot pushed a commit that referenced this pull request Aug 8, 2022
This PR is a breaking change that disables Arrow S3 support by default. Enabling this feature by default has caused build issues for many downstream consumers, all of whom (to my knowledge) manually disable support for this feature. Most commonly, that build error appears as `fatal error: aws/core/Aws.h: No such file or directory`. In my understanding, several downstream consumers of cudf no longer rely on Arrow S3 support from this library and instead get S3 access via fsspec. I am not aware of any users of libcudf who rely on this being enabled by default (or enabled at all).

See related issues and discussions: #8617, #11333, #8867, #10644 (comment), NVIDIA/spark-rapids#2827. Build errors caused by this default behavior have also been reported internally.

cc: @rjzamora @beckernick @jdye64 @randerzander @robertmaynard @jlowe @quasiben if you have comments following our previous discussion.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #11470
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Documentation non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants