Sphinx breaks transforms that depend on `document.transformer.components`, like the standard `Filter` transform #9632

cpitclaudel · 2021-09-13T14:07:30Z

Describe the bug

Sphinx does not give transforms (or post-transforms) a way to determine which writer will be used. This is because it reads the document with a DummyWriter to generate a doctree, and then it runs post_transforms with an empty self.document.transformer.components.

This breaks transforms that depend on components, including the standard docutils.transforms.components.Filter.

How to Reproduce

The standard Filter transform does this:

    def apply(self):
        pending = self.startnode
        component_type = pending.details['component'] # 'reader' or 'writer'
        format = pending.details['format']
        component = self.document.transformer.components[component_type]
        if component.supports(format):
            pending.replace_self(pending.details['nodes'])
        else:
            pending.parent.remove(pending)

This breaks with Sphinx: if the transform is added as a regular transform by note_pending, components[component_type] will be Sphinx's DummyWriter when component_type is "writer", and the call to supports will return the wrong result. If it is added as a post-transform, it throws an exception because components is empty.

Here is another example simplified from a separate project; it works with plain Docutils, but not with Sphinx:

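    from docutils import nodes
    from docutils.transforms import Transform

    class MyTransform(Transform):
        def apply(self):
            # `supported` lists the formats the active writer can emit.
            formats = set(self.document.transformer.components['writer'].supported)
            # some_pending_node_type stands for the project's own node class.
            for node in self.document.traverse(some_pending_node_type):
                if "html" in formats:
                    # nodes.raw takes (rawsource, text, ...): the markup goes in `text`.
                    node.replace_self(nodes.raw("", "<em>Hello!</em>", format="html"))
                elif {'latex', 'xelatex', 'lualatex'} & formats:
                    node.replace_self(nodes.raw("", r"\emph{Hello!}", format="latex"))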

If added as a regular transform, components['writer'] will be Sphinx's DummyWriter and supported will be only {'html'}. If added as a post_transform, the code will throw an exception because components won't have a 'writer' key.

Expected behavior

Ideally, the Filter transform (and other similar transforms) should just work, which might require running transforms after the caching stage, with the correct set of components (I imagine DummyWriter is for caching purposes?).

If that's not possible, then maybe it's possible for post_transforms? At the moment post_transforms do not see any components at all.

If that's not possible, then it would be nice to have some (Sphinx-specific, unfortunately) way to determine the list of formats supported by the writer from a post-transform.

Python version

Python 3.8.10

Sphinx version

sphinx-build 3.5.4


cpitclaudel · 2021-09-13T17:27:41Z

For future visitors, I should add that the "right" way to do this under Sphinx is to use document.settings.env.app.tags in a post-transform. Still, it would be much nicer if post-transforms were called with an appropriately set-up list of components.

Another note: if reading from cache, Sphinx actually calls post-transforms with document.transformer == None, not just document.transformer.components == [].
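
To make the failure modes above concrete, here is a minimal sketch of a guarded lookup that returns None instead of raising when the transformer or the component is missing. The helper name find_component and the stand-in objects are hypothetical and only for illustration; neither Docutils nor Sphinx provides them.

```python
from types import SimpleNamespace


def find_component(document, component_type="writer"):
    """Return the requested component, or None when it is unavailable."""
    transformer = getattr(document, "transformer", None)
    if transformer is None:
        # Case described above: doctree read back from the cache.
        return None
    components = getattr(transformer, "components", None)
    if not hasattr(components, "get"):
        # Not a mapping (for example an empty list): nothing to look up.
        return None
    # Returns None when the key is absent, as in a post-transform.
    return components.get(component_type)


# Stand-in objects (hypothetical), mimicking the three situations.
cached_doc = SimpleNamespace(transformer=None)
post_transform_doc = SimpleNamespace(transformer=SimpleNamespace(components={}))
docutils_doc = SimpleNamespace(
    transformer=SimpleNamespace(
        components={"writer": SimpleNamespace(supported=("html",))}
    )
)
```

This only avoids the exception; it does not tell a post-transform which writer will eventually be used.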

tk0miya · 2021-09-19T06:35:43Z

You're right. Sphinx produces an intermediate doctree using DummyWriter for several purposes: caching, building cross-references, and so on. Hence pending nodes that depend on the output format will not work as expected. We must admit it's a restriction of Sphinx at this moment.

> If that's not possible, then maybe it's possible for post_transforms? At the moment post_transforms do not see any components at all.
> If that's not possible, then it would be nice to have some (Sphinx-specific, unfortunately) way to determine the list of formats supported by the writer from a post-transform.

Good point. In post_transforms you can check which builder is in use and which format it targets. Please check self.document.settings.env.app.builder (as a shortcut, you can use self.app.builder instead if your transform inherits from sphinx.transforms.SphinxTransform).

I hope this will help your case.

Note: Sphinx wraps the docutils writer component in a Builder. The writer is an internal component of the builder.
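
As a rough sketch of how the suggestion above could be combined with the plain Docutils lookup, the helper below prefers the Sphinx builder when document.settings.env is present and otherwise falls back to the writer's supported formats. The name output_formats and the stand-in objects are hypothetical. It also assumes the builder exposes its output format as a string attribute named format, and that the result is only reliable in a post-transform, since doctrees produced at read time may be cached and reused.

```python
from types import SimpleNamespace


def output_formats(document):
    """Best-effort set of output formats for `document`."""
    settings = getattr(document, "settings", None)
    env = getattr(settings, "env", None)      # only set under Sphinx
    app = getattr(env, "app", None)
    builder = getattr(app, "builder", None)
    if builder is not None:
        # Assumption: the builder's `format` attribute names the output format.
        fmt = getattr(builder, "format", "")
        return {fmt} if fmt else set()
    # Plain Docutils: ask the writer component, if there is one.
    transformer = getattr(document, "transformer", None)
    components = getattr(transformer, "components", None)
    writer = components.get("writer") if hasattr(components, "get") else None
    return set(getattr(writer, "supported", ()))


# Stand-in objects (hypothetical) for a Sphinx and a Docutils document.
sphinx_doc = SimpleNamespace(
    settings=SimpleNamespace(
        env=SimpleNamespace(
            app=SimpleNamespace(builder=SimpleNamespace(format="latex"))
        )
    ),
    transformer=None,
)
docutils_doc = SimpleNamespace(
    settings=SimpleNamespace(),
    transformer=SimpleNamespace(
        components={"writer": SimpleNamespace(supported=("html", "xhtml"))}
    ),
)
```

The Sphinx branch still special-cases Sphinx, so this works around the incompatibility rather than removing it.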

cpitclaudel · 2021-09-19T16:03:14Z

> Please check self.document.settings.env.app.builder on post_transforms (As a shortcut, you can use self.app.builder instead if your transform inherits sphinx.transforms.SphinxTransform).

Thanks, but that's still Sphinx-specific, which still means that there's no way to write code compatible with Sphinx and Docutils without special-casing Sphinx.

> if your transform inherits sphinx.transforms.SphinxTransform

That also breaks compatibility, since SphinxTransform doesn't exist in Docutils.

Is it not possible to set document.transformer.components to the same list as Docutils would? This would greatly improve compatibility.

cpitclaudel added the type:bug label Sep 13, 2021

cpitclaudel mentioned this issue Sep 13, 2021

Feature request: hiding a paragraph cpitclaudel/alectryon#66

Closed

tk0miya added api type:question and removed type:bug labels Sep 19, 2021

cpitclaudel mentioned this issue Sep 23, 2021

Adjust to breaking change in format_href in pybtex mcmtroffaes/pybtex-docutils#16

Merged

AA-Turner added this to the some future version milestone Sep 29, 2022
