Python: Split Python docs #5727

Merged
merged 2 commits into from
Sep 21, 2022

Conversation

Fokko
Contributor

@Fokko Fokko commented Sep 8, 2022

This PR splits the Python docs into a separate site. The main reason is that the docs are currently part of the Java release, which is not in sync with the Python release cycle, so there is a high probability that the docs do not match the current version of the code.

This will publish the docs to GitHub Pages by pushing them to the gh-pages branch. We can set up an alias from Apache and point pyiceberg.apache.org to the GitHub Pages endpoint.

I also tried readthedocs, but I found it less straightforward, mostly because they run a build process on their end that pulls the code and builds the docs. That means another pipeline that we have to monitor, plus webhooks that we have to set up. I am a simple man, and I like simple things, so I went for mkdocs. It can push the docs to GitHub Pages in a single command: https://www.mkdocs.org/user-guide/deploying-your-docs/#project-pages

Considerations:

  • I decided to keep it to a single page for now; we can break it out into different pages later on. Let me know what you think of this.
  • For now we build the docs when we push to master; we'll probably change this later to trigger on tags.
  • I've removed the Python docs from the other docs to avoid confusion and to make sure that we have a single source of truth.
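
As a rough illustration of the second point, a tag-based trigger in a GitHub Actions workflow could look like the sketch below. The tag pattern is purely hypothetical and not taken from this PR; the project's actual tag naming would need to be substituted.

```yaml
# Sketch only: run the docs workflow when a release tag is pushed.
# The 'pyiceberg-*' pattern is an assumed example, not the project's real tag scheme.
on:
  push:
    tags:
      - 'pyiceberg-*'
```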

An example is shown here: https://fokko.github.io/incubator-iceberg/ (Once this is merged, I'll remove that one)

python/README.md will be shown on PyPI and refers to the documentation.

@Fokko Fokko force-pushed the fd-add-docs branch 2 times, most recently from 1be0b8f to f0f9680 Compare September 8, 2022 11:32
@Fokko Fokko mentioned this pull request Sep 8, 2022
@Fokko
Contributor Author

Fokko commented Sep 8, 2022

The deploy CI is failing, but I'm confident that we can make it work. Arrow has something similar. I'd love to know what other people think 👍🏻

@dhruv-pratap
Contributor

> The deploy CI is failing, but I'm confident that we can make it work. Arrow has something similar. I'd love to know what other people think 👍🏻

I prefer this clean split, especially given the different release cycle for pyiceberg.

Just a minor nit-pick: the two levels of docs folders are a bit confusing here: python/docs/docs/index.md
We should rename one of those folders to something more explicit, maybe python/mkdocs/docs/index.md or python/docs/content/index.md

@Fokko
Contributor Author

Fokko commented Sep 9, 2022

Hey @dhruv-pratap thanks, I didn't think of that, but it makes a lot of sense. I've changed docs/docs to mkdocs/docs because the inner docs directory is part of the standard mkdocs structure.
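
For readers unfamiliar with the mkdocs layout mentioned here, a minimal configuration might look like the sketch below. The file location and the site name are assumptions for illustration; only `site_name` and `docs_dir` are standard mkdocs settings, and `docs` is the default value of `docs_dir`.

```yaml
# Hypothetical python/mkdocs/mkdocs.yml; the contents are illustrative, not taken from this PR.
site_name: PyIceberg  # assumed name; mkdocs requires a site_name
docs_dir: docs        # mkdocs default: Markdown sources live in the inner docs/ directory
```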

Contributor

@dhruv-pratap dhruv-pratap left a comment

Looks good to me now.

Closes #363
Closes apache#3283
- name: Copy
  working-directory: ./python/mkdocs
  run: mv ./site /tmp/site
- name: Push changes to gh-pages branch
Contributor

Why not use mkdocs gh-deploy? That should automatically push to gh-pages.

Contributor Author

I wasn't sure what mkdocs gh-deploy does under the hood, so I made the steps explicit.
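
For comparison, the single-command alternative raised in this thread could be sketched as the workflow steps below. `mkdocs gh-deploy` builds the site and pushes the result to the gh-pages branch. The step names and the install command are assumptions, and the sketch presumes the repository has already been checked out with permission to push.

```yaml
# Sketch only: step names and the pip install line are assumed, not taken from this PR.
- name: Install mkdocs
  run: pip install mkdocs
- name: Build and publish to gh-pages
  working-directory: ./python/mkdocs
  # --force overwrites the remote gh-pages branch with the freshly built site
  run: mkdocs gh-deploy --force
```

The explicit copy-and-push steps in the PR trade this brevity for visibility into what is committed to gh-pages.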

  push:
    branches:
      - 'master'
  pull_request:
Contributor

Do we want to publish for pull requests? Why not just master?

Contributor Author

That was mostly for testing, removed that one 👍🏻

on:
  push:
    branches:
      - 'master'
Contributor

Can we add paths here with python/docs/**?

Contributor Author

Good call 👍🏻
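
A path filter along the lines suggested here might look like the sketch below. The glob is the one proposed in the review comment; if the docs directory is named python/mkdocs, as in the rename discussed earlier in the conversation, the pattern would need to be adjusted to match.

```yaml
# Sketch only: restrict the docs workflow to pushes that touch the docs sources.
on:
  push:
    branches:
      - 'master'
    paths:
      # Glob as suggested in the review; change it if the directory is python/mkdocs
      - 'python/docs/**'
```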

additional_dependencies:
  - mdformat-black
  - mdformat-config
  - mdformat-beautysh
Contributor

What's happening here?

Contributor Author

This is my personal favorite. It detects shell, Python, and config code blocks in the Markdown and formats them automatically, so the code, scripts, and config in the docs stay nicely formatted 👍🏻 If a block isn't formatted properly, the CI fails.

Contributor

Sounds reasonable. In the past, I've just configured those things in mkdocs, but if it's working that's great.
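
To give the snippet under discussion some context, a pre-commit entry of this kind typically sits inside a hook definition such as the sketch below. The repository URL and hook id are those of the mdformat project; the `rev` value is a placeholder, since the version pinned in this PR is not shown in the conversation.

```yaml
# Sketch only: the rev is a placeholder, not the version used in this PR.
repos:
  - repo: https://github.com/executablebooks/mdformat
    rev: 0.7.16  # placeholder; pin whichever release the project uses
    hooks:
      - id: mdformat
        additional_dependencies:
          - mdformat-black      # formats Python code blocks with black
          - mdformat-config     # formats JSON, TOML and YAML code blocks
          - mdformat-beautysh   # formats shell code blocks
```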

Contributor

@rdblue rdblue left a comment

Everything looks good to me. I have a few questions about cleaning up the GitHub Action, but that can be done asynchronously.

@Fokko Fokko merged commit 759e690 into apache:master Sep 21, 2022
@Fokko Fokko deleted the fd-add-docs branch September 21, 2022 19:30