-
Notifications
You must be signed in to change notification settings - Fork 493
added SearchRecord custom functions for at-blocks #2672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I can add the changelog and will fix the linting issues once we decide what is the most optimal solution :) |
Just thinking out loud here. The methods you're adding, they are getting called here: Documenter.jl/src/html/HTMLWriter.jl Lines 1683 to 1684 in 4328aec
It feels like what we want is sometimes not to write any {
"location": "",
"page": "Example Site",
"title": "Example Site",
"text": "Main.DocString",
"category": "page"
},
{
"location": "#DocString",
"page": "Example Site",
"title": "DocString",
"text": "Docstring\n\n\n\n\n\n",
"category": "constant"
} The second one (actual docstring content) gets pushed here: Documenter.jl/src/html/HTMLWriter.jl Lines 1775 to 1783 in 4328aec
So what I think we might be able to do is change these methods to functions ( |
src/html/HTMLWriter.jl
Outdated
output_text = "" | ||
for child in children_array[2:end] | ||
if isa(child.element, Documenter.MultiOutputElement) && isa(child.element.element, Dict) | ||
for mime in [MIME"text/plain"(), MIME"text/markdown"(), MIME"text/html"()] | ||
if haskey(child.element.element, mime) | ||
output_text *= string(child.element.element[mime]) * " " | ||
break | ||
end | ||
end | ||
else | ||
output_text *= mdflatten(child) * " " | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, for at-example blocks, we render both the input and the output. We probably want to put the source code into the index as well?
In general, though, I wonder if it's better to just add the search index entries in the actual domify
methods, like here:
Documenter.jl/src/html/HTMLWriter.jl
Lines 2394 to 2527 in 4328aec
# Select the "best" representation for HTML output. | |
domify(dctx::DCtx, node::Node, ::Documenter.MultiOutput) = domify(dctx, node.children) | |
domify(dctx::DCtx, node::Node, moe::Documenter.MultiOutputElement) = Base.invokelatest(domify, dctx, node, moe.element) | |
function domify(dctx::DCtx, node::Node, d::Dict{MIME, Any}) | |
rawhtml(code) = DOM.Tag(Symbol("#RAW#"))(code) | |
# Our first preference for the MIME type is 'text/html', which we can natively include | |
# in the HTML. But it might happen that it's too large (above example_size_threshold), | |
# in which case we move on to trying to write it as an image. | |
has_text_html = false | |
if haskey(d, MIME"text/html"()) | |
if length(d[MIME"text/html"()]) < dctx.ctx.settings.example_size_threshold | |
# If the size threshold is met, we can just return right away | |
return rawhtml(d[MIME"text/html"()]) | |
end | |
# We'll set has_text_html to distinguish it from the case where the 'text/html' MIME | |
# show() method was simply missing. | |
has_text_html = true | |
end | |
# If text/html failed, we try to write the output as an image, possibly to a file. | |
image = if haskey(d, MIME"image/svg+xml"()) | |
@tags img | |
svg = d[MIME"image/svg+xml"()] | |
svg_tag_match = match(r"<svg[^>]*>", svg) | |
dom = if length(svg) >= dctx.ctx.settings.example_size_threshold | |
filename = write_data_file(dctx, svg; suffix = ".svg") | |
@assert !isnothing(filename) | |
img[:src => filename, :alt => "Example block output"] | |
elseif svg_tag_match === nothing | |
# There is no svg tag so we don't do any more advanced | |
# processing and just return the svg as HTML. | |
# The svg string should be invalid but that's not our concern here. | |
rawhtml(svg) | |
else | |
# The xmlns attribute has to be present for data:image/svg+xml | |
# to work (https://stackoverflow.com/questions/18467982). | |
# If it doesn't exist, we splice it into the first svg tag. | |
# This should never invalidate otherwise valid svg. | |
svg_tag = svg_tag_match.match | |
xmlns_present = occursin("xmlns", svg_tag) | |
if !xmlns_present | |
svg = replace(svg, "<svg" => "<svg xmlns=\"http://www.w3.org/2000/svg\"", count = 1) | |
end | |
# We can leave the svg as utf8, but the minimum safety precaution we need | |
# is to ensure the src string separator is not in the svg. | |
# That can be either " or ', and the svg will most likely use only one of them | |
# so we check which one occurs more often and use the other as the separator. | |
# This should leave most svg basically intact. | |
# Replace % with %25 and # with %23 https://github.com/jakubpawlowicz/clean-css/issues/763#issuecomment-215283553 | |
svg = replace(svg, "%" => "%25") | |
svg = replace(svg, "#" => "%23") | |
singles = count(==('\''), svg) | |
doubles = count(==('"'), svg) | |
if singles > doubles | |
# Replace every " with %22 because it terminates the src=" string otherwise | |
svg = replace(svg, "\"" => "%22") | |
sep = "\"" | |
else | |
# Replace every ' with %27 because it terminates the src=' string otherwise | |
svg = replace(svg, "\'" => "%27") | |
sep = "'" | |
end | |
rawhtml(string("<img src=", sep, "data:image/svg+xml;utf-8,", svg, sep, "/>")) | |
end | |
(; dom = dom, mime = "image/svg+xml") | |
elseif haskey(d, MIME"image/png"()) | |
domify_show_image_binary(dctx, "png", d) | |
elseif haskey(d, MIME"image/webp"()) | |
domify_show_image_binary(dctx, "webp", d) | |
elseif haskey(d, MIME"image/gif"()) | |
domify_show_image_binary(dctx, "gif", d) | |
elseif haskey(d, MIME"image/jpeg"()) | |
domify_show_image_binary(dctx, "jpeg", d) | |
end | |
# If image is nothing, then the object did not have any of the supported image | |
# representations. In that case, if the 'text/html' representation exists, we use that, | |
# but with a warning because it goes over the size limit. If 'text/html' was missing too, | |
# we carry on to additional MIME types. | |
if has_text_html && isnothing(image) | |
# The 'text/html' representation of an @example block is above the threshold, but no | |
# supported image representation is present as an alternative. | |
push!( | |
dctx.ctx.atexample_warnings, | |
AtExampleFallbackWarning( | |
page = dctx.navnode.page, | |
size_bytes = length(d[MIME"text/html"()]), | |
fallback = nothing, | |
) | |
) | |
return rawhtml(d[MIME"text/html"()]) | |
elseif has_text_html && !isnothing(image) | |
# The 'text/html' representation of an @example block is above the threshold, | |
# falling back to '$(image.mime)' representation. | |
push!( | |
dctx.ctx.atexample_warnings, | |
AtExampleFallbackWarning( | |
page = dctx.navnode.page, | |
size_bytes = length(d[MIME"text/html"()]), | |
fallback = image.mime, | |
) | |
) | |
return image.dom | |
elseif !has_text_html && !isnothing(image) | |
return image.dom | |
end | |
# Check for some 'fallback' MIMEs, defaulting to 'text/plain' if we can't find any of them. | |
return if haskey(d, MIME"text/latex"()) | |
latex = d[MIME"text/latex"()] | |
# If the show(io, ::MIME"text/latex", x) output is already wrapped in | |
# \[ ... \] or $$ ... $$, we unwrap it first, since when we output | |
# Markdown.LaTeX objects we put the correct delimiters around it anyway. | |
has_math, latex = _strip_latex_math_delimiters(latex) | |
out = if !has_math | |
Documenter.mdparse(latex; mode = :single) | |
else | |
[MarkdownAST.@ast MarkdownAST.DisplayMath(latex)] | |
end | |
domify(dctx, out) | |
elseif haskey(d, MIME"text/markdown"()) | |
out = Markdown.parse(d[MIME"text/markdown"()]) | |
out = convert(MarkdownAST.Node, out) | |
domify(dctx, out) | |
elseif haskey(d, MIME"text/plain"()) | |
@tags pre | |
text = d[MIME"text/plain"()] | |
pre[".documenter-example-output"](domify_ansicoloredtext(text, "nohighlight hljs")) | |
else | |
error("this should never happen.") | |
end | |
end |
The bonus is that the "rendered" output of a block is already present, so it would reduce duplication I think.
X-ref #2672 (comment) about how to remove the other entries.
Just one more note: as I was playing around with this PR, and to actually see what ends up in the search index, I put together a small toy build that also prettyfies the index, in case that might be useful: becf65c In general (but probably not in this PR, to not scope creep this), I think we want some explicit tests for the search index generation, like with reference output etc. |
this is a great idea, I will make a new PR for tests after this PR's completion. |
this time I made a should_index_for_search function checking for at-blocks and only added the one which were not at blocks but still I don't think this issue is solved |
""" | ||
Filter function to prevent duplicate search entries. | ||
""" | ||
function should_index_for_search(navnode, node, element) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be sufficient to just check the type, no need to introspect into the objects? So something as simple as this should be enough probably:
exclude_from_search_index(element) = typeof(element) in (Documenter.MultiOutputElement, Documenter.MultiCodeBlock, Documenter.EvalNode)
this is an attempt to solve and potentially close #1929
my thought process was that right now the SearchRecord objects are created and added to the search index which are very general so I made custom function for adding SearchRecords and add special handling for MultiOutput which is @example block, MultiCodeBlock which is the @repl block and EvalNode which is the @eval block