Ergonomic way to enhance ToC within Markdown: insert TOC slice, exclude headings #6201
Comments
Hi, since this feature is already in place, is it essentially a documentation request? The TOC shape is most likely stable, and it's kind of documented here: https://docusaurus.io/docs/next/markdown-features/inline-toc#custom-table-of-contents However, up till now we haven't figured out an ergonomic way to let you use it. For example, #3915 is very likely to be solved by letting you code a TOC yourself. |
This is first a question about whether modifying the ToC like in the example is intended to be part of the API, or is just exposed internals. If the answer is yes, then this is a documentation request. If it's no, this is a request for an alternative. This is not intended to supersede #3915, as is explained in the comment that I originally linked to, though it can be used as a stopgap in the meantime. |
This is internal implementation detail that is IMHO stable enough to use externally. We could document it and make it officially a public API, but it probably requires some upfront thinking to be sure that the current API is the best way to solve your use-case |
It fits exactly my use case, since this document has the entirety of its content auto-generated. However, if only a section were to be as such, then I'd want to be able to splice the ToC accordingly, more similarly to #3915. I think this could be arranged by having a file— Hopefully I didn't sound too confused? |
If we add this to our doc, we should probably dogfood this on our own doc, explaining the constraints (like elements requiring a unique id for linking to work). We could showcase manual construction of a toc for generated HTML (like your case). We should probably add this to this page (to be renamed as just "TOC"?): https://docusaurus.io/docs/markdown-features/inline-toc Do you want to submit a PR? I'm not sure what you mean by |
@slorber What about we invent a "TOC enhancement" syntax ourselves and parse that in the TOC generating remark plugin? I don't have a clear plan yet, but it would be something like: import Content, {partialToc} from './_partial.md';
# A Markdown page
## Actual Heading 1
[[insert toc]]
- Inserted Heading 1
- Inserted Heading 2
- Subheading 1
## Actual Heading 2
<Content />
[[import toc: partialToc]] Which will generate a TOC: export const toc = [
{title: 'Actual Heading 1'},
{title: 'Inserted Heading 1'},
{title: 'Inserted Heading 2', children: [{title: 'Subheading 1'}]},
{title: 'Actual Heading 2'},
...partialToc,
]; However, importing partials is very difficult. We either have to actually read the imported file and parse that as well, or we just let the user import that and we spread it into the final TOC. In the former case, we need to actually keep track of the imported component => MD file path mapping and read & parse more files, which sacrifices performance; in the latter case, it means we can't be sure how the partial TOC should be spread. For example, for the example above, if ### Imported Subheading 1
### Imported Subheading 2 Then the final TOC should have the two "Imported Subheading"'s as children of "Actual Heading 2" instead of spreading it to the root of TOC. In hindsight it may be better if we have started off designing our server-side TOC structure as a flat list instead of a recursive tree. It's going to be rendered on client-side as a list anyways and we hardly take advantage of the tree structure. For now we recursively render each TOC level, but we could well use |
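To illustrate the flat-list idea, here is a minimal sketch of how a nested tree could be rebuilt from a flat, ordered list of headings whenever a renderer needs one. The helper name `treeifyToc` and the sample ids are assumptions made for this example; the `value`/`id`/`level` fields follow the `toc` examples given later in this thread. It is not taken from the Docusaurus source.

```javascript
// Hypothetical helper: rebuild a nested tree from a flat, ordered list of
// headings, using only each entry's `level`.
function treeifyToc(flatToc) {
  const root = [];
  // Ancestors of the most recently added node, outermost first.
  const stack = [];
  for (const item of flatToc) {
    const node = { ...item, children: [] };
    // Drop ancestors that are not strictly shallower than this heading.
    while (stack.length > 0 && stack[stack.length - 1].level >= node.level) {
      stack.pop();
    }
    const parent = stack[stack.length - 1];
    (parent ? parent.children : root).push(node);
    stack.push(node);
  }
  return root;
}

// Sample data with made-up ids: the imported subheadings follow
// "Actual Heading 2" in the flat list, so they become its children.
const tree = treeifyToc([
  { value: 'Actual Heading 1', id: 'actual-heading-1', level: 2 },
  { value: 'Actual Heading 2', id: 'actual-heading-2', level: 2 },
  { value: 'Imported Subheading 1', id: 'imported-subheading-1', level: 3 },
  { value: 'Imported Subheading 2', id: 'imported-subheading-2', level: 3 },
]);
```

With a flat list, appending a partial's headings right after the importing heading is enough for them to nest correctly, which is the spreading problem described above.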
@Josh-Cena I'm not sure it's really related to this issue, as here the content is in HTML and not a mdx partial import.
That's something I thought about but do we really want to invent a syntax that will only serve temporarily? I'm already not a fan of inventing a non-std md syntax 😅 The end goal is that the TOC works for imported files automatically, without asking the user to use any new fancy syntax.
I don't think we should do that: it duplicates work in each doc importing a partial and also adds more weight to each page, as the shared toc ends up being duplicated in each doc. We'd rather favor composition and have a remark plugin that composes the toc from the current doc and its partials without inlining the partial tocs into the document, like your spread example, but handled automatically by the remark plugin, not hand-written by the user.
Good point, we will probably need to flatten that structure. So we are maybe not ready to make this "manual toc" feature an official public API 😅 |
I'm proposing this custom TOC syntax in place of handwriting the entire TOC structure which contains a lot of boilerplate. If we have that, we don't need to document the
As I said, it means we have to "understand" that Then there's the question of (a) user wanting to hide some headings from the TOC and (b) user wanting to insert extra headings that would otherwise not be visible to Remark. We would need a way to let users handwrite & insert part of the TOC, hence the proposal for the Glad I made the flattened list point through :P Going to see what we can do about it |
This will be triaged as a feature request and we will figure out an ergonomic way to tweak the TOC structure. Just for reference, VuePress has this |
Fine by me, then. I'll keep relying on |
We should look for the import statement extension, for sure it's a bit more complex but it should be achievable.
It looks more appropriate to me to have a syntax on the heading itself, similar to anchor links? I'm not sure how your proposal solves this use-case?
For example, user using React components with some headings inside md? That's a quite specific use-case, but still seems like a reasonable thing to solve without requiring users to write the full toc manually In practice, it's the use-case @ISSOtm exposed, but @ISSOtm may be satisfied by just having documentation explaining what we recommend, so it may not be so useful to implement something immediately, and doc might be good enough until it becomes more painful for a few users? This gives us time to think more deeply about this problem. I'd also be happy to have a way to enhance/customize the automatically generated toc object, and not sure adding proprietary markdown syntax tags is the most flexible option. I'd rather use a real function: # Title
blabla
export function toc(originalToc) {
return [...originalToc,myExtraDocEntry];
} Does it make sense?
Yes @ISSOtm, as you see this is subject to potential breaking changes as we may flatten the toc structure 🤪 so it's not yet a good time to document but it's a good enough workaround for now. |
A bit of a tricky UX problem is splicing some headings in the middle of the ToC. Figuring out the correct index is less trivial than how often it'd be desirable, imo. |
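One way to avoid computing indices by hand would be to splice relative to a heading id. The sketch below is only an illustration: `insertAfterId` is a hypothetical helper, not an existing Docusaurus API, and it assumes a flat `toc` array whose entries carry an `id`.

```javascript
// Hypothetical helper: return a copy of `toc` with `entries` inserted
// directly after the entry whose id is `anchorId`.
function insertAfterId(toc, anchorId, entries) {
  const index = toc.findIndex((item) => item.id === anchorId);
  if (index === -1) {
    throw new Error(`No TOC entry with id "${anchorId}"`);
  }
  // Build a new array so the original toc is left untouched.
  return [...toc.slice(0, index + 1), ...entries, ...toc.slice(index + 1)];
}

const originalToc = [
  { value: 'Actual Heading 1', id: 'actual-heading-1', level: 2 },
  { value: 'Actual Heading 2', id: 'actual-heading-2', level: 2 },
];
const splicedToc = insertAfterId(originalToc, 'actual-heading-1', [
  { value: 'Inserted Heading 1', id: 'inserted-heading-1', level: 3 },
]);
```

Note that this inserts immediately after the anchor entry itself; inserting after the anchor's whole subsection would need an extra scan over deeper levels.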
Yes, the problem with |
Looked a bit into this. Several random thoughts: On the point of MDX heading transclusion. The solution would be like this: This solution is because we don't know what's actually in
On the point of inserting extra anchors. This has three use-cases:
The actual syntax is open to discussion, but it would still basically be some artificial headings that will be recognized and removed by our remark plugin. <!-- this admonition-like syntax encapsulates some artificial headings
that will be present in the TOC but removed from the content -->
:::toc
## Explanation {#explanation}
::: Something like that... This part doesn't require any refactors, because the remark plugin will see artificial headings the same as normal ones. On the point of hiding headings away from the TOC. This is the tricky part. Note that ## Hidden heading {!}
## Hidden heading 2 {!#heading} This will still allow us to set the anchor ID, but the In conclusion, we will:
|
The artificially inserted TOC will likely be useful for our API doc: https://docusaurus.io/docs/next/api/plugins/@docusaurus/plugin-content-docs I envision something similar to Yarn's doc: https://yarnpkg.com/cli/workspaces/foreach#options where every line in the table can have its toc link |
Yarn table is using h3 Couldn't we also use this? |
I would like to use a component for my headings while still including them in the TOC. My current approach is to change <ChangelogHeading date="2022-12-20">
## 0.1.8
</ChangelogHeading> so that Docusaurus picks up the MDX heading to put in the TOC while still rendering as <hgroup style={{ display: 'flex', flexWrap: 'wrap', alignItems: 'baseline', gap: '2em' }}>
<h2>0.1.8</h2>
<time dateTime="2022-12-20">2022-12-20</time>
</hgroup> |
@jasikpark if you only want to customize the rendering of h2 headings of a specific doc, that looks better to me to simply use mdx components to provide a custom rendering logic. We allow you to configure such docs globally here: https://docusaurus.io/docs/markdown-features/react#mdx-component-scope You could create your own h2 component that renders the way you want, and is able to parse the h2 string on changelog docs. IE you could just write this ## 0.1.8 - 2022-12-20 We don't allow (yet) to assign components on a per-doc basis. Until then you can add if/else in a global h2 component to detect when it makes sense to apply such custom rendering logic. Or you can try using the MDX Provider in your doc or theme directly? import {MDXProvider} from '@mdx-js/react';
import H2Custom from '@site/src/components/H2Custom';
<MDXProvider components={{h2: H2Custom}}>
## 0.1.8 - 2022-12-20
text
## 0.1.9 - 2022-12-21
text
</MDXProvider> (if you don't want the date to appear in the TOC you can just put the version and create a mapping from version to date outside of the markdown) |
interesting! thanks for the suggestions, i think i'll just stick with my current solution in that case 👍 |
I was trying to add headings that show up in the TOC, but I've encountered an issue. <div class="nobullet">
# Inputs
* ## A
* details
* ## B
* details
* ## C
* details
* ## D
* details
</div> I'd like to have the ability to set any word to be or not be in the TOC no matter where it is. word{is-in-TOC}(custom id)
* word{is-in-TOC}(custom id 2)
or
## word{is-not-in-TOC} |
Trying to automate the // Use cheerio as docusaurus uses it
import * as cheerio from "cheerio";
// Read in raw html string and generate docusaurus toc format
export const genToc = function (apiRaw) {
const $ = cheerio.load(apiRaw);
return $("h2, h3, h4, h5, h6")
.toArray()
.map((header) => {
const $header = $(header);
const level = parseInt($header[0].name[1], 10);
const value = $header.text();
const id = $header.attr("id");
return {
level,
value,
id,
};
});
}; So one can then import GeneratedAPI from './_api.mdx';
This is the generated api
<GeneratedAPI />
import { genToc } from '../_components/api-toc.js';
import apiRaw from "!!raw-loader!./_api.mdx";
export const toc = genToc(apiRaw); However this approach seems to add 300kb to the page, which I would guess is mostly cheerio coming down the line. (_api.mdx is only 20kb) I'm unfamiliar with SSR but could this be forced to be done at "SSG" time during a |
@foot yes the TOC has to be computed server-side, otherwise it would be first invisible and then "pop" once React hydrates. Here you include cheerio but also the markdown file as a source string (so each doc is now both a React component + a string). This really should be done at build time, ideally as a remark plugin. |
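As a rough illustration of moving this work out of the page bundle, the heading extraction could be a dependency-free function that runs in a script before the build, with its result written to a file the doc imports. The sketch below covers only the extraction step. `extractHeadings` is a hypothetical name, and parsing HTML with a regular expression is fragile: it assumes the generated HTML is regular, with headings that are never nested and always carry an `id` attribute.

```javascript
// Hypothetical helper: collect h2-h6 headings that have an id from a raw HTML
// string, in the { value, id, level } shape used by the toc export.
function extractHeadings(html) {
  const headingPattern = /<h([2-6])\b([^>]*)>([\s\S]*?)<\/h\1>/gi;
  const toc = [];
  for (const match of html.matchAll(headingPattern)) {
    const [, level, attributes, inner] = match;
    const idMatch = /(?:^|\s)id\s*=\s*["']([^"']+)["']/i.exec(attributes);
    // A heading without an id cannot be linked to, so it is skipped.
    if (!idMatch) continue;
    toc.push({
      // Strip nested tags such as <code> to keep only the visible text.
      value: inner.replace(/<[^>]+>/g, '').trim(),
      id: idMatch[1],
      level: Number(level),
    });
  }
  return toc;
}

const generatedToc = extractHeadings(
  '<h2 id="a">A</h2><p>x</p><h3 id="b"><code>B</code> opt</h3><h2>no id</h2>',
);
```

Because this would run outside the browser, neither the parser nor the raw source string would need to ship with the page.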
Sorry if this is repeated information; it's a little hard to tell the progress here. I have a use-case where I need my page to use I am using Docusaurus to generate multiple documentation bundles that are product-specific. 90-95% of the documentation for these products is the same, so I use a single Docusaurus project to keep things as centralized and maintainable as possible. I set an environment variable depending on which version of the docs I want to build, and I access the variable through the There are some cases where I want to change the text of a heading based on the environment variable. For example "Getting Started with <Product A>" vs. "Getting Started with <Product B>". For this, I use a JSX component that grabs the environment variable and renders the correct text or content according to the variable. The only way to render a heading like this (as far as I know) is to put it inside of an My problem is that the TOC never picks up any
Unfortunately for me, the consequence of this "fix" is exactly what I don't want. Is there any solution to this problem or workaround I can use? If not, do we know the status of any fixes for this problem? The only option I have now is to copy-paste all my markdown files and do away with the environment variable, which is a major maintainability headache. |
Adding some more detail to my previous comment now that I learned a little more about Remark/Rehype plugins. By dumping out the {
"type": "element",
"tagName": "h2",
"properties": { "id": "asdf" },
"children": [
{
"type": "text",
"value": "ASDF",
"position": {
"start": { "line": 111, "column": 4, "offset": 2968 },
"end": { "line": 111, "column": 12, "offset": 2976 }
}
}
],
"position": {
"start": { "line": 111, "column": 1, "offset": 2965 },
"end": { "line": 111, "column": 12, "offset": 2976 }
}
}, but the component ones look like some kind of embedded JSX instead, so the ToC plugin is unable to detect the header elements within them {
"type": "jsx",
"value": "<Paragraph\n aContent={\n <>\n <hr />\n <h2>ASDF 2</h2>\n <EndpointTemplate\n description= ........", // truncated for brevity
"position": {
"start": { "line": 124, "column": 1, "offset": 3280 },
"end": { "line": 197, "column": 3, "offset": 5500 },
"indent": [
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
]
}
}, As you can see, once we get to a JSX component, it stops being a tree and just flattens everything into 1 node. Somehow somewhere this JSX is expanded into an HTML tree, but I don't know how or where. Is there any way to run some kind of plugin or transformation after the JSX is expanded? |
@nullromo we are not using remark-toc: we have our own custom Docusaurus toc remark plugin. If the remark-toc package does not behave the way you want we can't do anything for you, and you should report it to that plugin author directly. In any case, you can provide an explicit toc structure yourself on any doc that overrides the one we compute: export const toc = [
{
value: "Label1",
id: "anchor1",
level: 2,
children: [],
},
{
value: "Label2",
id: "anchor2",
level: 2,
children: [
{
value: "Label3",
id: "anchor3",
level: 3,
children: [],
},
],
},
]; You can also take over how this structure is rendered with swizzle. I agree that's probably not ideal and requires maintenance, but that works. If your goal is to only have placeholders in headings (and not dynamic headings that are present/absent conditionally), that can probably be implemented with a remark plugin: ## Getting Started with %%PRODUCT_NAME%% We don't provide such plugins but you'll find useful information here and code examples here: This code example is probably the best to get started: #395 (comment) You could even imagine a plugin that adds/removes sections conditionally: <ProductAOnly>
## Getting Started with Product A
blabla
</ProductAOnly> If your remark plugin runs before our custom toc plugin (tip: use So if your "before" remark plugin removes everything that's inside a JSX Our TOC plugin is pretty simple: it mostly collects all headings found in the MD AST and adds them to the function tocPlugin(): Transformer {
return async (ast) => {
visit(ast, 'heading', (child: Heading) => {
addHeadingToTocExport(child);
});
};
}; |
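The placeholder idea above could look roughly like the following. This is a sketch under assumptions, not a plugin that Docusaurus provides: `replaceHeadingPlaceholders` and its `values` option are hypothetical names, and the tree is walked by hand (instead of with `unist-util-visit`) only to keep the example self-contained. It relies on the standard mdast shape, where `heading` nodes contain `text` nodes with a `value` string.

```javascript
// Hypothetical remark plugin: replace %%NAME%% placeholders inside headings.
// `values` maps placeholder names to their replacement text.
function replaceHeadingPlaceholders({ values }) {
  // Replace placeholders in every text node below the given node.
  const replaceIn = (node) => {
    if (node.type === 'text') {
      node.value = node.value.replace(/%%([A-Z_]+)%%/g, (whole, name) =>
        Object.prototype.hasOwnProperty.call(values, name) ? values[name] : whole,
      );
    }
    (node.children || []).forEach(replaceIn);
  };
  // Find headings anywhere in the tree and process only their content.
  const walk = (node) => {
    if (node.type === 'heading') {
      replaceIn(node);
    } else {
      (node.children || []).forEach(walk);
    }
  };
  return (ast) => {
    walk(ast);
  };
}

// Hand-written AST standing in for "## Getting Started with %%PRODUCT_NAME%%".
const sampleAst = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 2,
      children: [{ type: 'text', value: 'Getting Started with %%PRODUCT_NAME%%' }],
    },
    { type: 'paragraph', children: [{ type: 'text', value: '%%PRODUCT_NAME%% body' }] },
  ],
};
replaceHeadingPlaceholders({ values: { PRODUCT_NAME: 'Product A' } })(sampleAst);
```

For the TOC to pick up the replaced text, such a plugin would have to run before the TOC plugin, as described above. Unknown placeholders are left untouched rather than removed.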
@slorber Thanks for the advice. I was able to come up with a plugin just like how you described. Unfortunately, I still didn't find a way to actually parse through the JSX nodes within the AST, so I just have to use a regex to determine which nodes I want to remove. I'll post the code here (with names changed) for reference.
Plugin
const visit = require('unist-util-visit');
// plugin that removes certain components
const myPlugin = () => {
// determine which nodes to filter out based on the product type
const nodeRegex = (() => {
switch (process.env.PRODUCT_TYPE) {
case 'PRODUCT_A':
return /(ProductBParagraph|ProductCParagraph)/;
case 'PRODUCT_B':
return /(ProductAParagraph|ProductCParagraph)/;
case 'PRODUCT_C':
return /(ProductAParagraph|ProductBParagraph)/;
default:
return /^$/;
}
})();
return async (
/** @type {import("unist").Node<import("unist").Data>} */ ast,
) => {
// this variable will become true when we hit the opening tag for the
// node to be removed, and it will become false when we hit the next
// tag. This means that it will be true for all the nodes between the
// opening and closing tags
let removing = false;
// traverse the tree
visit(ast, {}, (child, index, parent) => {
// remember if we removed the current node or not
let removed = false;
// removes the current node if we are in removing mode
const removeNodeIfNeeded = () => {
// check if in removing mode
if (removing) {
// remove the node
//console.log('removing', child);
parent.children.splice(index, 1);
// remember that we removed the node
removed = true;
}
};
// remove the current node if necessary
removeNodeIfNeeded();
// if the node is a JSX node, see if we hit the opening/closing tag
if (child.type === 'jsx') {
// if the value matches, then it's the right tag
if (
// @ts-ignore
nodeRegex.test(child.value)
) {
// toggle removing mode
removing = !removing;
// if we just toggled on, remove this node
removeNodeIfNeeded();
}
}
// if we removed the node, return SKIP, otherwise just return
if (removed) {
return [visit.SKIP, index];
}
return;
});
};
};
Components
import React from 'react';
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
export enum ProductType {
PRODUCT_A = 'PRODUCT_A',
PRODUCT_B = 'PRODUCT_B',
PRODUCT_C = 'PRODUCT_C',
}
const useProductType = () => {
return useDocusaurusContext().siteConfig.customFields
.productType as ProductType;
};
const makeProductTypeParagraph = (
productFilter: ProductType,
) => {
return (props: React.PropsWithChildren) => {
const productType = useProductType();
if (productType === productFilter) {
return <>{props.children}</>;
}
return null;
};
};
export const ProductAParagraph = makeProductTypeParagraph(ProductType.PRODUCT_A);
export const ProductBParagraph = makeProductTypeParagraph(ProductType.PRODUCT_B);
export const ProductCParagraph = makeProductTypeParagraph(ProductType.PRODUCT_C);
Markdown
<ProductAParagraph>
---
## Product A Heading 1
<Thing
cool='yeah'
/>
---
## Product A Heading 2
<>
<MyComponent
nice='wow'
/>
</>
</ProductAParagraph>
<ProductBParagraph>
---
## Product B Heading 1
awesome
</ProductBParagraph>
I don't necessarily like the regex matching technique here because it kind of blindly removes stuff without really knowing what's going on. For example, if my approach is to remove everything between the opening and closing tags ( So if there's any way that you know of to actually process JSX from inside a remark plugin, I'd love to hear about it. In any case, thanks a lot for the help! Glad I now have a suitable workaround 🎉 |
First: I'd recommend implementing this in Docusaurus v3 (MDX is now at v2), currently in alpha. That was just to give you a direction, I have not implemented this myself. If you want to build this properly you have to learn MDX / Unified and how all things work together: you can't skip reading the doc and investing some time. You should inspect the produced AST tree and see which nodes you want to remove. Most likely the nodes will look like this: {
"type": "mdxJsxFlowElement",
"name": "div",
"attributes": [],
"children": [/* content */]
} What I would do is use a visitor to visit all Again this is just an idea and direction: you'll have to figure out the details yourself and learn how these things work. |
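A minimal sketch of that direction, assuming the MDX v2 AST shape shown above: instead of matching raw JSX strings with a regex, whole `mdxJsxFlowElement` nodes are dropped by component name, together with everything nested inside them. The plugin name `removeJsxElements` is hypothetical, and the traversal is hand-written so that the example has no dependencies.

```javascript
// Hypothetical remark plugin: remove every mdxJsxFlowElement whose name is in
// `namesToRemove`, including all of its children.
function removeJsxElements(namesToRemove) {
  const names = new Set(namesToRemove);
  const prune = (node) => {
    if (!Array.isArray(node.children)) return;
    node.children = node.children.filter(
      (child) => !(child.type === 'mdxJsxFlowElement' && names.has(child.name)),
    );
    // Recurse into the children that were kept.
    node.children.forEach(prune);
  };
  return (ast) => {
    prune(ast);
  };
}

// Hand-written AST: two product-specific wrappers, each holding a heading.
const productAst = {
  type: 'root',
  children: [
    {
      type: 'mdxJsxFlowElement',
      name: 'ProductAParagraph',
      attributes: [],
      children: [
        { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Product A Heading 1' }] },
      ],
    },
    {
      type: 'mdxJsxFlowElement',
      name: 'ProductBParagraph',
      attributes: [],
      children: [
        { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Product B Heading 1' }] },
      ],
    },
  ],
};
removeJsxElements(['ProductBParagraph', 'ProductCParagraph'])(productAst);
```

Since the headings are real child nodes here, removing the wrapper removes them from the tree before a later heading-collecting plugin sees them.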
Exporting the global |
@jeluard while improving the TOC to support imported docs, I noticed a strange behavior. Related discussion: Apparently, exporting But as soon as you have headings (>= level 2), they get used in priority over your exported TOC. This didn't look good to me, so I fixed this behavior for Docusaurus v3.2/canary to always give you the ability to override the generated toc: https://stackblitz.com/edit/github-zjz2fr?file=docs%2Fintro.mdx,package.json |
@slorber I can confirm that it works as expected with canary docusaurus. Thanks! |
Just in case someone else reads this: Some of the custom ToCs in this issue set their children like this: export const toc = [{
value: "Label1",
id: "anchor1",
level: 2,
children: [],
},{
value: "Label2",
id: "anchor2",
level: 2,
children: [{
value: "Label3",
id: "anchor3",
level: 3,
children: [],
}],
}
] to create the same ToC as ## Label1 {#anchor1}
## Label2 {#anchor2}
### Label3 {#anchor3} However (probably since #6729 ) setting children is not necessary anymore and the child entries don't show up in the ToC. This works: export const toc = [{
value: "Label1",
id: "anchor1",
level: 2,
},{
value: "Label2",
id: "anchor2",
level: 2,
},{
value: "Label3",
id: "anchor3",
level: 3,
}
] |
Yes I confirm in 2021 we had a nested structure, and now there's no |
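For anyone with hand-written nested ToCs from earlier in this thread, the conversion to the flat form can be done mechanically. The helper below is a small sketch; `flattenToc` is a hypothetical name, not something Docusaurus exports.

```javascript
// Hypothetical helper: turn a nested toc (entries with `children`) into the
// flat form, keeping document order and dropping the `children` field.
function flattenToc(nestedToc) {
  return nestedToc.flatMap(({ children = [], ...entry }) => [
    entry,
    ...flattenToc(children),
  ]);
}

const flatToc = flattenToc([
  { value: 'Label1', id: 'anchor1', level: 2, children: [] },
  {
    value: 'Label2',
    id: 'anchor2',
    level: 2,
    children: [{ value: 'Label3', id: 'anchor3', level: 3, children: [] }],
  },
]);
```

Each entry keeps its `level`, so the hierarchy is still recoverable from the flat list.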
Thank you for the notice! I updated upstream, and edited the OP accordingly. This works much better :) |
I am wondering if docusaurus has supported customisation of ToC generation, as it appears on the docs website that it did not, but I have discovered a potential use case for us here: lf-lang/lf-lang.github.io#238 |
@axmmisaka if you want to put headings inside tabs and expect the TOC to update according to the selected tab, there's another issue for that: #5343 |
Thanks for the reply. |
Have you read the Contributing Guidelines on issues?
Description
Docusaurus currently allows manually altering (and even outright replacing) the ToC in Docs, as per #3915 (comment), but this is not documented.
This issue is as much a question on whether this is something we can expect to rely on, as a request to document it if the answer is "yes".
Has this been requested on Canny?
No response
Motivation
This is useful for documentation that is generated from other sources (in this case, a man page): while the HTML can be injected, the ToC does not follow suit (and I wouldn't expect it to. Or is that preferable?).
API design
Here is a doc which largely consists of externally-generated HTML, for which we additionally generate the ToC via a script.
Have you tried building it?
If the existing behavior is to be accepted as official, then nothing needs to be built; otherwise, what replacement API is deemed better will need to be discussed first.