Breaks reproducibility in html builders #4240

Closed as not planned
@osermay

Subject: sphinx/sphinx/builders/html.py html builder

Problem

  • Sphinx is causing reproducibility issues in other packages
  • reproducible builds are a software development practice that ensures a package can be rebuilt into bit-for-bit identical copies of all specified artifacts
  • for more information about reproducible builds, see https://reproducible-builds.org/
  • the unreproducibility comes from the hash generation in sphinx/sphinx/builders/html.py
  • the issue is currently visible in the scikit-learn package

Procedure to reproduce the problem

scikit-learn package

  • build the scikit-learn package from Debian unstable (version 0.19.1-1) twice
  • run diffoscope on the two builds
  • the difference appears in /usr/share/doc/python-sklearn-doc/stable/.buildinfo, where the config hash is recorded
  • the diffoscope output can be seen at https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/scikit-learn.html

sphinx source code in /sphinx/sphinx/builders/html.py

  • the config hash is currently created with get_stable_hash(), and the result differs between builds, which makes other packages that use Sphinx unreproducible
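
One way such a hash can become unstable, sketched under assumptions: if the string form of any config value embeds process-specific data such as a memory address, hashing that string gives a different digest on every build. The `stable_hash` helper below is a simplified stand-in and not Sphinx's actual get_stable_hash(); the `Opaque` class and the config keys are hypothetical. Whether this is the actual cause in the scikit-learn case is not established here.

```python
from hashlib import md5


def stable_hash(obj):
    """Simplified stand-in: hash a nested structure via its string form."""
    if isinstance(obj, dict):
        return stable_hash(list(obj.items()))
    if isinstance(obj, (list, tuple)):
        # Sort the child hashes so container ordering does not matter.
        obj = sorted(stable_hash(o) for o in obj)
    return md5(str(obj).encode("utf8")).hexdigest()


class Opaque:
    """Hypothetical config value that only has the default, address-bearing repr."""


# Two logically identical configs that hold different object instances.
cfg_a = {"project": "demo", "hook": Opaque()}
cfg_b = {"project": "demo", "hook": Opaque()}

# Plain data hashes identically regardless of key order ...
plain_equal = stable_hash({"a": 1, "b": 2}) == stable_hash({"b": 2, "a": 1})
# ... but the address inside the default repr makes the digests diverge.
opaque_equal = stable_hash(cfg_a) == stable_hash(cfg_b)
print(plain_equal, opaque_equal)
```

Here `plain_equal` is true and `opaque_equal` is false: the two `Opaque` instances live at different addresses, so their string forms, and therefore the digests, differ.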

Expected results

The hash created for tags is reproducible: the same hash string is produced on every build. The config hash is expected to be reproducible as well. There should be a similar object that can be passed to get_stable_hash() so that it creates the same hash string for config on every build. The config hash is currently computed as:

self.config_hash = get_stable_hash(cfgdict)
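
A minimal sketch of what such a "similar object" might look like, assuming the instability comes from values whose string form varies between runs: reduce every config value to a canonical, order-independent representation before hashing. `normalize` and `config_digest` are hypothetical names and not part of Sphinx, and replacing functions and arbitrary objects with their qualified names is only one possible policy.

```python
import json
from hashlib import md5


def normalize(value):
    """Reduce a config value to JSON-serializable data with a stable form."""
    if isinstance(value, dict):
        return {str(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (set, frozenset)):
        # Sets have no defined order; sort by their serialized form.
        return sorted((normalize(v) for v in value),
                      key=lambda v: json.dumps(v, sort_keys=True))
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    if callable(value) and hasattr(value, "__qualname__"):
        # Functions and classes: use the name, not the address-bearing repr.
        return "{}.{}".format(value.__module__, value.__qualname__)
    # Any other object: fall back to its qualified type name.
    cls = type(value)
    return "<{}.{}>".format(cls.__module__, cls.__qualname__)


def config_digest(cfgdict):
    """Hash the normalized config; sort_keys fixes the dict ordering."""
    payload = json.dumps(normalize(cfgdict), sort_keys=True)
    return md5(payload.encode("utf8")).hexdigest()


class Opaque:
    """Hypothetical config value with an address-bearing default repr."""


cfg_a = {"project": "demo", "exts": {"b", "a"}, "hook": Opaque()}
cfg_b = {"hook": Opaque(), "exts": {"a", "b"}, "project": "demo"}
same = config_digest(cfg_a) == config_digest(cfg_b)
changed = config_digest(dict(cfg_a, project="other")) != config_digest(cfg_a)
print(same, changed)
```

The trade-off of this policy is that two different objects of the same type hash identically, so a change inside such a value would not alter the digest and would not trigger a rebuild.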

Environment info

  • OS: Debian
  • diffoscope, scikit-learn
