Asset catalog does not include inlined images #1166

adrpar · 2023-04-07T11:32:03Z

First of all, thanks for Asciidoctor and its Java bindings!

I am struggling with using the AST, as I would like to obtain all references to images in my document. I have understood that findBy is not able to find any inlined images, however from the issues and discussions around the Asset Catalog I got to understand (probably wrongly) that it would contain also inlined images when processing the source with the appropriate option catalogAssets set to true.

Is it possible using AsciidoctorJ to obtain the inlined images or do I need to process the individual blocks myself?

Thank you for your help.

The text was updated successfully, but these errors were encountered:

abelsromero · 2023-04-07T19:15:56Z

2 things:

AsciidoctorJ does not expose catalog images now, but that can be added. The structures are there, we just need to add the method.
Inline images don't seem to be added to the catalog. The documentation mentions that's possible but I could not get them, from AsciidoctorJ or Asciidoctor Ruby.
Exploring the AST, block images have their own Block node, but inline images are embedded inside the paragraph node (no sub-nodes added for the image itself).

@mojavelinux sorry to tag you. Is the assumption correct or am I doing something wrong? Here's my test with asciidoctor gem v2.0.18.

require 'asciidoctor'

input_file = 'sample.adoc'
doc = Asciidoctor.load_file(input_file, catalog_assets: true)

images = doc.catalog[:images]
puts "Found #{images.size} images:"
puts images

= This is a title
Author Name
:icons: font

This a preamble

== This is a section

These are some replacements {value}

== Some admonitions

NOTE: Asciidoctor supports font-based admonition icons, powered by Font Awesome!

== A few images

A block image

image::images/sunset.jpg[]

An inlined image image:images/AsciiDoc-R-logo-color.png[] in the text.

mojavelinux · 2023-04-07T21:57:47Z

Inline images are cataloged (when catalog assets is enabled), but only while the document is converted to the output format. Thus, they won't be available until after the document is converted. That's because the load method only parses block-level elements. All inline elements are parsed during conversion. (This is a well-known shortcoming of Asciidoctor and one we are working to address in the language project).

adrpar · 2023-04-08T07:19:23Z

Thank you very much for your responses. Indeed, calling getContent on the document before accessing the asset catalog does fill it up with all images.

Two thoughts on this where you could maybe help future users of the bindings:

Document this behaviour clearly in the catalogAssets section of https://docs.asciidoctor.org/maven-tools/latest/plugin/goals/process-asciidoc/#configuration
or maybe improve the behaviour of the bindings to be more intuitive: If catalogAssets is set to true, already convert the document once on load, so that all the information is there as one would at least naively expect.

abelsromero · 2023-04-08T16:03:39Z

Inline images are cataloged (when catalog assets is enabled), but only while the document is converted to the output format

Now I understand why the docs use convert_file in the example 👍 And using convert it works perfectly.
In all fairness, the docs are well explained, I read diagonally, my fault.

Generally speaking, inline elements are not cataloged until conversion. Therefore, they’re only available after the document has been converted, not after it has been loaded.

Two thoughts on this where you could maybe help future users of the bindings:

Everything makes sense from Asciidoctor (the Ruby core) point of view. The issue I see here is that AsciidoctorJ has two methods to do the same and they are not working (for me at least) and we have no tests 😞

var options = Options.builder()
            .safe(SafeMode.SAFE)
            .catalogAssets(true)
            .build();
var doc = asciidoctor.convert(Files.readString(sourceDocument.toPath()), options, Document.class);

Returns an internal Ruby String

And this which I think should be good one returns null.

var document = asciidoctor.convertFile(sourceDocument, options, Document.class);

Once this is fixed, we can start thinking about how to better integrate with the maven plugin. @adrpar I presume you are using an extension? Do you mind opening an issue in github.com/asciidoctor/asciidoctor-maven-plugin/ with the specific use case and an example?

adrpar · 2023-04-08T18:55:30Z

I was not aware of the maven plugin and I need to check if it suites my needs.

My use case is in general to implement a documentation workflow that is similar to what the docToolchain (https://doctoolchain.org) provides but more extensible and tailored to uploading documentation to Confluence. My idea was to create a set of command line tools that can be put together into a processing pipeline that will run within a CI/CD system.

Specifically the use case for getting all the images was with copying the sources of a documentation into a "build" directory, so that all images are in one place. Thus collecting all images in one images directory. The second step would then be to (if needed) update all the image references within the asciidoc source to point to that one directory... Which I know is going quite beyond what AsciidoctorJ provides.

I hope that somewhat explained my use case.

mojavelinux · 2023-04-09T06:06:40Z

or maybe improve the behaviour of the bindings to be more intuitive: If catalogAssets is set to true, already convert the document once on load, so that all the information is there as one would at least naively expect.

This would deviate from the behavior of Asciidoctor core and cause side effects, not something AsciidoctorJ should be doing. Right now, the contract in Asciidoctor is that the inline images become available after conversion. Thus, if you need to access them, then the code that calls Asciidoctor needs to use convert or convertFile.

(This may change in the future major release once we solve the inline parsing issue, which we are working on. But it's not the way it is now).

abelsromero added bug enhancement labels Apr 7, 2023

abelsromero mentioned this issue Apr 8, 2023

Retrieve all images in catalog during conversion #1171

Merged

8 tasks

robertpanzer closed this as completed in #1171 Apr 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Asset catalog does not include inlined images #1166

Asset catalog does not include inlined images #1166

adrpar commented Apr 7, 2023

abelsromero commented Apr 7, 2023 •

edited

Loading

mojavelinux commented Apr 7, 2023

adrpar commented Apr 8, 2023

abelsromero commented Apr 8, 2023

adrpar commented Apr 8, 2023

mojavelinux commented Apr 9, 2023

Asset catalog does not include inlined images #1166

Asset catalog does not include inlined images #1166

Comments

adrpar commented Apr 7, 2023

abelsromero commented Apr 7, 2023 • edited Loading

mojavelinux commented Apr 7, 2023

adrpar commented Apr 8, 2023

abelsromero commented Apr 8, 2023

adrpar commented Apr 8, 2023

mojavelinux commented Apr 9, 2023

abelsromero commented Apr 7, 2023 •

edited

Loading