HTML and Markdown both have a concept of nested, hierarchical headings, and I think it would be helpful to track that hierarchy for TitleText.
Does Unstructured currently support any of the following?
- Identifying the hierarchical level of a title during parsing.
- Storing this hierarchy level (H1, H2, etc.) in the metadata.
- Exposing the hierarchy of relevant headers on sub-elements, such as in the metadata of a NarrativeText element.
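
To make the three points above concrete, here is a rough sketch of the behavior I have in mind. It does not use Unstructured's API at all: the `Element` class, the `parse_markdown` function, and the `heading_level` and `header_path` metadata keys are hypothetical names I made up for illustration, and it only handles Markdown `#`-style headings.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Matches Markdown ATX headings such as "## Setup" (1 to 6 '#' characters).
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Element:
    """Hypothetical stand-in for a parsed element; not an Unstructured class."""
    category: str
    text: str
    metadata: dict = field(default_factory=dict)


def parse_markdown(source: str) -> List[Element]:
    """Split Markdown into elements and record the heading hierarchy on each."""
    elements: List[Element] = []
    stack = []  # (level, title) pairs for the headings currently "open"
    for raw_line in source.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            # Close any headings at the same or a deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            elements.append(Element(
                category="Title",
                text=title,
                metadata={
                    "heading_level": level,  # hypothetical key: 1 for H1, 2 for H2, ...
                    "header_path": [t for _, t in stack],  # ancestor headings only
                },
            ))
            stack.append((level, title))
        else:
            elements.append(Element(
                category="NarrativeText",
                text=line,
                metadata={"header_path": [t for _, t in stack]},
            ))
    return elements
```

With something like this, a paragraph under `## Setup` inside `# Guide` would carry `["Guide", "Setup"]` as its header path, which is the kind of information I'd like to be able to read off a NarrativeText element.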
Totally understandable if this is out of scope or not relevant to mainstream use cases. I mainly wanted to understand whether Unstructured can already do this or, if not, whether there are plans to add something like it in the future.
Thanks!