Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add workaround for TableOfContents h1/h2 hierarchy issue #34

Merged
merged 2 commits into from
Nov 30, 2019

Conversation

hakusaro
Copy link
Contributor

Add workaround for TableOfContents h1/h2 hierarchy issue

First, some background is necessary. As noted at the hugo documentation site,
https://gohugo.io/content-management/toc/, the table of contents module does
not currently let you change which heading levels should be rendered. This
creates a slight issue as follows.

Let's say you have a markdown document and you use this template. The <h1>
tag is the title for each page, as set in the frontmatter. In markdown, that's
also the single hash/octothorpe/number sign marker for a top level heading.

By convention, it's typical to have a single <h1> element of the same font size
on each page. Naturally, the ideal behavior is to then make all elements <h2>
or lower. As noted by Mozilla, this is even a loosely defined convention:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements

Now, this is all well and good, until you read more into the issue above.
Essentially, Hugo generates the table of contents starting with <h1> elements
from the markdown output. If you don't have an <h1> element, Hugo will render
the level associated with <h1> anyway and then indent <h1> all subsequent levels
based on this non-existent hierarchy. You can make the table of contents look
better by adding an <h1> element to the page, but this creates the conflict as
described above. You end up with two <h1> elements, which is antithetical to the
spec, only to create a correct looking table of contents. And one again, those
two <h1> elements will make your page look bad because they have the same font
size.

This is easily shown graphically. Screenshots can be found at the bottom of this PR.

A discussion exists on gohugoio/hugo talking about the heading level issue
in the table of contents system, here:

gohugoio/hugo#1778

This is a slightly maddening issue, because it's closed, but in it many people
have identified this issue as a problem and patched it with one way or another.

My commit introduces the patch from @HenrySkup, which is located here:

gohugoio/hugo#1778 (comment)

This commit specifically will parse and remove empty headings in the generated
output from the TableOfContents module, thus creating the proper indentation
level as expected, and not requiring breaking the html spec, nor making pages
look nasty. It's backwards compatible with existing markdown written under the
broken assumption, by way of rendering correctly with existing <h1> elements,
while simultaneously fixing any table of contents generated with only <h2> and
subsequent heading levels by indenting them the correct amount each time.

This was tested against Hugo v0.58.0.

Screenshots of behavior without and with the patch, respectively

The first two images following this sentence are before
and after the patch with only <h2> elements in the markdown body and a title in the
frontmatter at the top. Note the spacing between the leftmost box border and the elements
generated by the {{ .TableOfContents }} directive in both images, and you'll spot the 1/2
indentation difference.

Before patch

After patch

The next two images depict the same before/after, this time with <h1> elements in
the body. This is used to demonstrate the backwards compatibility in this patch.

Before patch with more detail

After patch with more detail

Finally, what it looks like with the patch using only <h2> elements with
more detail. Again, you can compare how indented the first level headings are between
the <h2> elements used in the markdown body with the above images, showing that
the correct indentation level is present without a parent <h1> tag in the markdown body.

After patch

As you can see, this is something that's very hard to describe the nuanced behavior of.
Naturally, the table of contents as created before this will look different than after, but
that's just because most of them will be made with only <h2> elements in the markdown
body, but they at least look correct and space evenly from then on.

This is trivially tested with markdown documents like so run through the toc generator:

# this is an h1 level heading
## this is an h2 level heading
## this is an h2 level but should be at the same level as the h1 above
## this is again an h2 level heading

hakusaro and others added 2 commits October 14, 2019 19:57
First, some background is necessary. As noted at the hugo documentation site,
https://gohugo.io/content-management/toc/, the table of contents module does
not currently let you change which heading levels should be rendered. This
creates a slight issue as follows.

Let's say you have a markdown document and you use this template. The <h1>
tag is the title for each page, as set in the frontmatter. In markdown, that's
also the single hash/octothorpe/number sign marker for a top level heading.

By convention, it's typical to have a single h1> element of the same font size
on each page. Naturally, the ideal behavior is to then make all elements <h2>
or lower. As noted by Mozilla, this is even a loosely defined convention:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements

Now, this is all well and good, until you read more into the issue above.
Essentially, Hugo generates the table of contents starting with <h1> elements
from the markdown output. If you don't have an <h1> element, Hugo will render
the level associated with <h1> anyway and then indent all subsequent levels
based on this non-existent hierarchy. You can make the table of contents look
better by adding an <h1> element to the page, but this creates the conflict as
described above. You end up with two <h1> elements, which is antithetical to the
spec, only to create a correct looking table of contents. And one again, those
two <h1> elements will make your page look bad because they have the same font
size.

This is easily shown graphically, as is documented in the pull request.

A discussion exists on gohugoio/hugo talking about the heading level issue
in the table of contents system, here:

gohugoio/hugo#1778

This is a slightly maddening issue, because it's closed, but in it many people
have identified this issue as a problem and patched it with one way or another.

My commit introduces the patch from @HenrySkup, which is located here:

gohugoio/hugo#1778 (comment)

This commit specifically will parse and remove empty headings in the generated
output from the `TableOfContents` module, thus creating the proper indentation
level as expected, and not requiring breaking the html spec, nor making pages
look nasty. It's backwards compatible with existing markdown written under the
broken assumption, by way of rendering correctly with existing <h1> elements,
while simultaneously fixing any table of contents generated with only <h2> and
subsequent heading levels by indenting them the correct amount each time.

This was tested against Hugo v0.58.0.
@qqpann
Copy link
Owner

qqpann commented Nov 30, 2019

Thanks for your PR, hakusaro!

Your detailed report is pretty awesome. It helped me understand the issue.

However, there seems to be issue in this solution.
If you have multiple <h2> and <h3> etc, the second <h2> will come left than the first.
SS 2019-11-30 21 52 32

This happens because replacing <ul><li><ul> with <ul> will leave extra </li> and </ul>.
So I pushed a more detailed regex replacement pattern, with result:
SS 2019-12-01 0 00 57

@qqpann
Copy link
Owner

qqpann commented Nov 30, 2019

I may also consider fixing it with css.
In that case, indent will be all relative.

@qqpann qqpann merged commit 216c469 into qqpann:master Nov 30, 2019
@hakusaro
Copy link
Contributor Author

Your detailed report is pretty awesome. It helped me understand the issue.

Thank you so much! I really appreciate it! I was trying to be super detailed so you would understand it easily & be able to see the problem, and also understand possible solutions.

If you have multiple <h2> and <h3> etc, the second <h2> will come left than the first.

Interesting. Maybe I have bad eyes -- I haven't seen that yet. O.o


Thank you so much for taking the time to read my issue & look over my PR! I really appreciate your work, and I'm super happy you merged it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants