This is a unified documentation build template designed for the Data-Juicer ecosystem. Built on Sphinx and pydata-sphinx-theme, it provides multi-version and multi-language documentation capabilities, ensuring consistent documentation appearance and user experience across all subprojects.
- Unified Appearance: All subprojects share the same documentation theme and styling.
- Multi-Version Support: Automatically builds documentation for multiple Git branches and tags.
- Multi-Language Support: Supports both English and Chinese by default.
- Ecosystem Interconnectivity: Enables seamless navigation between the documentation sites of different projects via external links in the header.
- Markdown-Friendly: Automatically discovers and integrates Markdown documents within the project.
data-juicer-sphinx/
├── docs/
│   └── sphinx_doc/                                 # Sphinx documentation build directory
│       ├── build_versions.py                       # Multi-version build script (main entry point)
│       ├── make.bat / Makefile                     # Build scripts
│       ├── redirect.html                           # Redirect page
│       └── source/                                 # Documentation source files
│           ├── conf.py                             # Sphinx configuration file
│           ├── custom_myst.py                      # Custom MyST extension
│           ├── external_links.yaml                 # External project link configuration
│           ├── index.rst / index_ZH.rst            # Home page (customization recommended)
│           ├── docs_index.rst / docs_index_ZH.rst  # Documentation index page (customization recommended)
│           ├── api.rst                             # API documentation index (customization recommended)
│           ├── _static/                            # Static assets
│           │   ├── custom.css                      # Custom styles
│           │   └── images/                         # Logos and icons
│           └── _templates/                         # Custom templates
│               └── version-language-switcher.html
├── guides/                                         # Usage guides
├── pyproject.toml                                  # Project configuration
├── README.md
└── README_ZH.md
Build the simplest English Data-Juicer Sphinx documentation (without API docs):
git clone https://github.com/datajuicer/data-juicer-sphinx.git
cd data-juicer-sphinx
uv pip install .
cd docs/sphinx_doc
export PROJECT="data-juicer-sphinx"
python build_versions.py -A -l en

- Creates an independent Git worktree for each version (branch/tag) at `.worktrees/<version>`.
- Automatically cleans up after building (unless `KEEP_WORKTREES=True` is set in `docs/sphinx_doc/build_versions.py`) to avoid polluting the main working directory.
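
The per-version worktree lifecycle can be sketched roughly as follows. This is a minimal illustration, not the code in `build_versions.py`: the helper names (`worktree_path`, `add_cmd`, `remove_cmd`, `version_worktree`) are hypothetical, and the use of `--detach` and `--force` is an assumption about how one might call `git worktree`.

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path

# Mirrors the KEEP_WORKTREES flag mentioned above; the default is assumed.
KEEP_WORKTREES = False


def worktree_path(repo_root: Path, version: str) -> Path:
    """Return <repo_root>/.worktrees/<version> for a branch or tag name."""
    return repo_root / ".worktrees" / version


def add_cmd(path: Path, version: str) -> list[str]:
    """Command that checks out `version` into a separate working tree."""
    # --detach avoids creating or moving a branch in the main repository.
    return ["git", "worktree", "add", "--detach", str(path), version]


def remove_cmd(path: Path) -> list[str]:
    """Command that deletes the working tree, even if it holds build output."""
    return ["git", "worktree", "remove", "--force", str(path)]


@contextmanager
def version_worktree(repo_root: Path, version: str, keep: bool = KEEP_WORKTREES):
    """Create the worktree, yield its path, and clean up unless `keep` is set."""
    path = worktree_path(repo_root, version)
    subprocess.run(add_cmd(path, version), cwd=repo_root, check=True)
    try:
        yield path
    finally:
        if not keep:
            subprocess.run(remove_cmd(path), cwd=repo_root, check=True)
```

A build loop would then wrap each version in `with version_worktree(repo_root, "main") as tree:` and run Sphinx against `tree`, so the main working directory is never modified.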
- Automatically scans the entire worktree to collect all `.md` and `.rst` files (excluding directories like `outputs`, `sphinx_doc`, `.github`, etc.).
- Copies these files into a unified Sphinx source directory: `docs/sphinx_doc/source/`.
- (Customized for Data-Juicer operator documentation) For subdirectories under `operators/`, automatically generates corresponding `index.rst` and `index_ZH.rst` files to facilitate categorized operator indexing.
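
The discovery and copy steps above might look something like the sketch below. It assumes only the three excluded directory names given in the text (the real exclusion list is longer but not stated), and the function names `find_docs` and `copy_docs` are hypothetical. Operator index generation is not covered.

```python
import shutil
from pathlib import Path

# Only the directories named in the text; the full exclusion list is not given.
EXCLUDED_DIRS = {"outputs", "sphinx_doc", ".github"}
DOC_SUFFIXES = {".md", ".rst"}


def find_docs(root: Path) -> list[Path]:
    """Return paths (relative to root) of .md/.rst files outside excluded dirs."""
    found = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Skip anything that lives below an excluded directory.
        if any(part in EXCLUDED_DIRS for part in rel.parts[:-1]):
            continue
        if path.is_file() and path.suffix in DOC_SUFFIXES:
            found.append(rel)
    return sorted(found)


def copy_docs(root: Path, dest: Path) -> list[Path]:
    """Copy discovered documents into dest, preserving their relative layout."""
    copied = []
    for rel in find_docs(root):
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(root / rel, target)
        copied.append(target)
    return copied
```

Because `sphinx_doc` is itself excluded, copying into `docs/sphinx_doc/source/` does not cause the copied files to be rediscovered on a later scan.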
A: Ensure all dependencies are installed before building:
uv pip install .

A: Check the following:
- Ensure you didn't use the `--no-api-doc` or `-A` flags
- Verify your project contains importable Python modules
- Confirm the `CODE_ROOT` environment variable is correctly set
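
The three checks above could be automated with a small diagnostic along these lines. This is a sketch under stated assumptions: it treats `CODE_ROOT` as a directory containing at least one Python package (an `__init__.py` somewhere below it), which the text does not confirm, and `api_doc_problems` is a hypothetical helper, not part of the template.

```python
from pathlib import Path


def api_doc_problems(argv: list[str], env: dict[str, str]) -> list[str]:
    """Return human-readable reasons why API docs may be missing."""
    problems = []
    # Either flag disables API documentation according to the checklist above.
    if "-A" in argv or "--no-api-doc" in argv:
        problems.append("API docs were disabled by -A / --no-api-doc")
    code_root = env.get("CODE_ROOT")
    if not code_root:
        problems.append("CODE_ROOT is not set")
    elif not Path(code_root).is_dir():
        problems.append(f"CODE_ROOT is not a directory: {code_root}")
    # Assumption: an importable package shows up as an __init__.py file.
    elif not any(Path(code_root).rglob("__init__.py")):
        problems.append(f"no Python package found under {code_root}")
    return problems
```

Calling `api_doc_problems(sys.argv[1:], dict(os.environ))` before a build would list any of these conditions; an empty list means none of them applies.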

A:
- Verify that `external_links.yaml` is configured correctly
- Ensure the `PROJECT` environment variable is properly set
- Check the browser console for JavaScript errors

A: Ensure:
- Chinese documentation files end with `_ZH.md` or `_ZH.rst`
- `index_ZH.rst` exists and is correctly configured
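
To check quickly which language a file would be treated as, the naming rule above can be expressed as a short helper. This is an illustrative sketch: `doc_language` is a hypothetical name, the match is assumed to be case-sensitive, and the `"zh"` label is a placeholder because the text only states the `en` code explicitly.

```python
from pathlib import Path

DOC_SUFFIXES = {".md", ".rst"}


def doc_language(path: str) -> str:
    """Classify a document as Chinese if its stem ends with _ZH, else English."""
    p = Path(path)
    if p.suffix in DOC_SUFFIXES and p.stem.endswith("_ZH"):
        return "zh"  # placeholder label; the template's actual code is not stated
    return "en"
```

Under this rule `index_ZH.rst` and `README_ZH.md` count as Chinese, while a file such as `guide_zh.md` would not.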

A: Documentation structures may differ between versions:
- Older versions might lack certain new pages
- Version switching attempts to access the same path; if unavailable, it redirects to the homepage
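
The fallback behavior described above amounts to the following decision, shown here in Python for illustration only. The actual switcher presumably runs in the browser; `switch_target`, the `target_pages` set, and the `index.html` homepage name are all assumptions made for this sketch.

```python
def switch_target(current_path: str, target_pages: set[str],
                  homepage: str = "index.html") -> str:
    """Pick the page to open after switching versions.

    Keep the same path when the target version has it; otherwise fall
    back to the homepage of the target version.
    """
    return current_path if current_path in target_pages else homepage
```

For example, switching from a page that exists only in the newest version to an older version would land on that older version's homepage rather than a 404.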
Contributions and improvements to this template are warmly welcomed! ❤