tech-debt: Markdown Parser - Additional Features from Spec #28

@raifdmueller

Summary

The Markdown Parser (PR #27) implements all 6 Acceptance Criteria from 04_markdown_parser.adoc. However, the specification mentions additional features that are not yet implemented. This issue tracks these as technical debt for future enhancement.

Open Items

1. Reserved Frontmatter Fields (Spec lines 193-211)

The spec defines reserved frontmatter fields that are not yet processed:

| Field | Description | Current Status |
|---|---|---|
| order | Explicit sort order (overrides filename) | Not implemented |
| draft | Mark document as draft (should be ignored) | Not implemented |
| exclude | Exclude document from index | Not implemented |

Impact: Low - relevant for Structure Index integration
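A minimal sketch of how the three reserved fields might be applied once frontmatter has been parsed into a dict. The `ParsedDoc` type and the `apply_reserved_fields` helper are hypothetical, and the value types (booleans for `draft` and `exclude`, an integer for `order`) as well as the rule that explicitly ordered documents sort before unordered ones are assumptions, not taken from the spec or from PR #27.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ParsedDoc:
    # Hypothetical stand-in for the parser's document model.
    path: str
    frontmatter: dict[str, Any] = field(default_factory=dict)


def apply_reserved_fields(docs: list[ParsedDoc]) -> list[ParsedDoc]:
    """Drop draft/excluded docs, then sort by `order`, falling back to path."""
    visible = [
        d for d in docs
        if not d.frontmatter.get("draft", False)
        and not d.frontmatter.get("exclude", False)
    ]

    def sort_key(d: ParsedDoc) -> tuple[bool, int, str]:
        order = d.frontmatter.get("order")
        # Assumption: docs with an explicit order come first;
        # the remaining docs are sorted by path.
        return (order is None, order if order is not None else 0, d.path)

    return sorted(visible, key=sort_key)
```

Whether `draft` and `exclude` should be filtered in the parser or later in the Structure Index is a design choice this sketch leaves open.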

2. End Line Tracking (Spec lines 307-308, 234-236)

The spec shows end_line for sections and code blocks:

# Spec shows:
class MarkdownSection:
    start_line: int
    end_line: int  # <-- Missing

class MarkdownElement:
    start_line: int
    end_line: int  # <-- Missing (only have SourceLocation.line)

Current: The parser uses the shared Section model with SourceLocation, which records only line (the start line).

Impact: Medium - needed for precise content extraction
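One possible way to derive end_line for sections without changing the parsing pass: a section ends on the line before the next heading of the same or a higher level, or on the last line of the file. The `Heading` record and `compute_end_lines` function below are hypothetical, and 1-based line numbers are assumed.

```python
from dataclasses import dataclass


@dataclass
class Heading:
    # Hypothetical minimal record; the real Section model has more fields.
    level: int
    start_line: int  # 1-based
    end_line: int = 0


def compute_end_lines(headings: list[Heading], total_lines: int) -> None:
    """Fill in end_line for headings given in document order."""
    for i, h in enumerate(headings):
        h.end_line = total_lines  # default: section runs to end of file
        for nxt in headings[i + 1:]:
            if nxt.level <= h.level:
                # The next sibling or higher-level heading closes this section.
                h.end_line = nxt.start_line - 1
                break
```

For example, headings at lines 1 (H1), 3 (H2), 7 (H2) and 10 (H1) in a 12-line file would get end lines 9, 6, 9 and 12 under this rule.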

3. Code Block Content Extraction (Spec line 236)

The spec says code blocks should have a content attribute with raw content (without fence markers).

Current: Only metadata (language, location) is stored, not the actual code content.

Impact: Medium - needed for Content Access API
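A rough sketch of capturing the raw content between fence markers. It is deliberately simplified: it handles only triple-backtick fences, ignores tilde fences and longer fence runs, and drops unterminated blocks. The function name and the returned dict keys are illustrative, not the parser's actual model.

```python
FENCE = "`" * 3  # triple backtick, built this way to keep the example readable


def extract_code_blocks(text: str) -> list[dict]:
    """Return language, start/end line (1-based, fence lines) and raw content."""
    blocks = []
    current = None
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if current is None:
            if stripped.startswith(FENCE):
                # Opening fence: the rest of the line is the language hint.
                current = {
                    "language": stripped[3:].strip() or None,
                    "start_line": lineno,
                    "lines": [],
                }
        elif stripped == FENCE:
            # Closing fence: emit the block without either fence line.
            blocks.append({
                "language": current["language"],
                "start_line": current["start_line"],
                "end_line": lineno,
                "content": "\n".join(current["lines"]),
            })
            current = None
        else:
            current["lines"].append(line)
    return blocks
```

Because the closing fence line is recorded as well, the same pass would also supply end_line for code blocks.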

4. Combined Folder Structure (Spec lines 137-158)

When parsing a folder, the spec describes:

  • Heading levels should be adjusted in context (H1 in subfile → H2 in combined doc)
  • FolderDocument.structure should contain the combined hierarchy

Current: structure: [] is a placeholder, not populated.

Impact: Low - integration feature for document composition
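The heading-level adjustment could be sketched as a recursive shift over each file's section tree. The `Node` type, the `shift_levels` and `combine` helpers, the folder-level H1 root, and the cap at level 6 are all assumptions made for illustration; the spec may define the combined hierarchy differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical section node; not the project's Section model.
    title: str
    level: int
    children: list["Node"] = field(default_factory=list)


def shift_levels(node: Node, offset: int) -> Node:
    """Return a copy of the subtree with every level raised by offset (max 6)."""
    return Node(
        title=node.title,
        level=min(node.level + offset, 6),
        children=[shift_levels(c, offset) for c in node.children],
    )


def combine(folder_title: str, file_roots: list[Node]) -> Node:
    """Nest each file's top-level sections one level below a folder-level H1."""
    return Node(folder_title, 1, [shift_levels(r, 1) for r in file_roots])
```

Returning copies rather than mutating in place keeps the per-file structure intact, so a single file can still be served with its original levels.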

5. Task Lists & Blockquotes Recognition (Spec lines 72-77)

The spec lists these as "recognized but not detailed parsed":

  • Task Lists: - [ ] and - [x]
  • Blockquotes: > quoted text

Current: Not recognized at all.

Impact: Low - spec says "not detailed parsed" anyway
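Since the spec asks only for recognition, a line-level check may be enough. The sketch below uses two regular expressions. The names `TASK_RE`, `QUOTE_RE` and `classify_line` are hypothetical, and the patterns are a simplification that does not account for lines inside fenced code blocks.

```python
import re

# "- [ ] text" or "- [x] text" (also accepts * and + as list markers).
TASK_RE = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$")
# "> text", allowing up to three leading spaces.
QUOTE_RE = re.compile(r"^\s{0,3}>\s?(.*)$")


def classify_line(line: str) -> tuple[str, dict]:
    """Classify a single line as task item, blockquote, or other."""
    m = TASK_RE.match(line)
    if m:
        return "task", {"checked": m.group(1).lower() == "x", "text": m.group(2)}
    m = QUOTE_RE.match(line)
    if m:
        return "blockquote", {"text": m.group(1)}
    return "other", {}
```

A caller would need to skip lines inside code blocks before applying this, otherwise a `>` in a shell snippet would be misread as a blockquote.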

Recommendation

These items can be addressed when:

  1. Structure Index (Issue #5, "Structure Index: In-Memory Document Index") needs the reserved frontmatter fields
  2. Content Access API needs end_line and code block content
  3. Document composition use cases require combined folder structure
