Skip to content

Conversation

@amoeba
Copy link
Member

@amoeba amoeba commented May 21, 2025

Rationale for this change

The docs don't build cleanly for me locally and some warnings and errors that are consequential are being ignored. Some of these issues ended up being as severe as entirely missing tables.

What changes are included in this PR?

  • Fixes for each issue I ran into. Each change is a commit with whatever extra detail I needed included in the commit body.
  • This also includes an auto-update of our Doxyfile which silences a number of deprecation warnings. I took a look at the diff and it seemed pretty safe and the docs look good to me locally. We can do a preview in CI first though.

Are these changes tested?

Yes, each change was tested locally.

Are there any user-facing changes?

No.

@amoeba
Copy link
Member Author

amoeba commented May 21, 2025

@github-actions crossbow submit preview-docs

@amoeba
Copy link
Member Author

amoeba commented May 21, 2025

Note to reviewers: I'd look through this commit-by-commit. Hopefully each commit has enough detail in it but let me know if any need more context.

@github-actions
Copy link

Revision: 896941aaf4cc329b00cf1b7a9b5f54f2a7610a65

Submitted crossbow builds: ursacomputing/crossbow @ actions-11796d6459

Task Status
preview-docs GitHub Actions

@amoeba
Copy link
Member Author

amoeba commented May 21, 2025

Did a partial review of failing CI jobs and I have at least one thing to fix:

amoeba added 15 commits May 21, 2025 08:49
/Users/bryce/src/apache/arrow/docs/source/developers/cpp/emscripten.rst:22: WARNING: Explicit markup ends without a blank line; unexpected unindent.
/Users/bryce/src/apache/arrow/docs/source/developers/python.rst:320: ERROR: Error in "code-block" directive:
Before this, this block throws a warning which causes the entire code block to fail

<ipython-input-18-5050e858dd71>:1: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
  current_year = datetime.datetime.utcnow().year
This is currently broken on the website.
This didn't link to anything, now it links correctly.
Otherwise you get a broken link and a warning
Without this change, the usage of,

:term:`tag <Tag>

later in the document wasn't linking to anything since Tag isn't a term in any glossary Sphinx knows about. You also get this warning when you build the docs,

/Users/bryce/src/apache/arrow/docs/source/format/DissociatedIPC.rst:278: WARNING: term not in glossary: 'Tag'

I didn't think we wanted to define Tag in the site-wide glossary so I use a page-specific glossary here. It renders the same as before but now can be linked to properly
Sphinx thinks +1 is the syntax we want to use and tries to interpret it as such,

/Users/bryce/src/apache/arrow/docs/source/developers/release_verification.rst:143: WARNING: Pygments lexer name '+1' is not known

This fixes that.
These were 0x00a0 and are now just regular spaces. I've verified this renders fine still.
One instance on this pages reports a "failed to lex" warning,

/Users/bryce/src/apache/arrow/docs/source/developers/java/building.rst:193: WARNING: Could not lex literal_block

so I gave that and the rest of the code blocks a specific language name. I used text because `bash` doesn't actually do a good job and so doesn't add anything beyond not highlighting at al.
apache#46476 renamed streaming_execution to acero. These links should now work
Um wow. So I guess this table was malformed

/Users/bryce/src/apache/arrow/docs/source/cpp/compute.rst:20
2: ERROR: Malformed table.

due to a few missing spaces. Due to this, the table is entirely missing from the docs. !!!
@amoeba amoeba force-pushed the fix/GH-46520--fix-various-docs-build-issues-two branch from 896941a to 5099af7 Compare May 21, 2025 15:49
I previous rewrote the code in this block to use the non-deprecated API but we test on older versions (<3.11) of Python so the new code will error and Sphinx will barf.
@amoeba
Copy link
Member Author

amoeba commented May 22, 2025

@AlenkaF @thisisnic would either of you like to review?

@AlenkaF
Copy link
Member

AlenkaF commented May 22, 2025

Thanks so much for working on this, much appreciated!
I have a plan to review but can not today. Will try tomorrow 🤞

@AlenkaF
Copy link
Member

AlenkaF commented May 23, 2025

@github-actions crossbow submit preview-docs

@github-actions
Copy link

Revision: cbf26ee

Submitted crossbow builds: ursacomputing/crossbow @ actions-43ab4cf3eb

Task Status
preview-docs GitHub Actions

@AlenkaF
Copy link
Member

AlenkaF commented May 23, 2025

I think there is one more fix we can add. From the Docs CI build:

/build/python/docs/source/cpp/api/table.rst:44: WARNING: doxygenfunction: Cannot find function "arrow::ConcatenateTables" in doxygen xml output for project "arrow_cpp" from directory: /build/cpp/apidoc/xml
/build/python/docs/source/cpp/api/table.rst:47: WARNING: doxygenfunction: Cannot find function "arrow::PromoteTableToSchema" in doxygen xml output for project "arrow_cpp" from directory: /build/cpp/apidoc/xml

a warning can also be found on the page: https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4N5arrow17ConcatenateTablesERKNSt6vectorINSt10shared_ptrI5TableEEEE24ConcatenateTablesOptionsP10MemoryPool.

There are a number of warnings and errors connected to docstrings for BaseListArray.flatten in the same CI build (https://github.com/apache/arrow/actions/runs/15167885952/job/42650356152?pr=46521#step:6:7465). Not sure what to do with these. Looks OK in the array.pxi code.

The grouping operator for ConcatenateTables was unclosed which caused issues. I just removed the group since we weren't using it. It's possible to fix it without removing the group but I figured it out once and could re-figure-it-out so I just did this.

Once the above was fixed, PromoteTableToSchema started throwing docs warnings (because it could now see the functions) and I fixed this by specifying the variants explicitly.

This now looks right locally.
@amoeba
Copy link
Member Author

amoeba commented May 23, 2025

Thanks for taking a look @AlenkaF. I fixed the C++ issues in 16e4174. I looked at the BaseListArray and couldn't figure it out. How would you feel if we merged this with just the changes already included and left the remaining warnings for another day?

@amoeba
Copy link
Member Author

amoeba commented May 23, 2025

@github-actions crossbow submit preview-docs

Copy link
Member

@AlenkaF AlenkaF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @amoeba!
I am OK with merging this PR as is 👍 it's already a big improvement in the docs build.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels May 24, 2025
@amoeba amoeba merged commit 76c7a68 into apache:main May 28, 2025
32 of 38 checks passed
@amoeba amoeba removed the awaiting committer review Awaiting committer review label May 28, 2025
@amoeba
Copy link
Member Author

amoeba commented May 28, 2025

Thanks for the review @AlenkaF. Merged.

@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 76c7a68.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants