Nested A tags parsed improperly #144

@nonara

Synopsis

Given:

<a href="#">link <a href="#">nested link</a> end</a>

This is invalid HTML, and it should be parsed as:

<A>
 |-- <TextNode text="link ">
<A>
 |-- <TextNode text="nested link">
<TextNode text=" end">

However, it's parsed as:

<A>
 |-- <TextNode text="link ">
 |-- <A>
      |-- <TextNode text="nested link">
 |-- <TextNode text=" end">

This is causing issues for the markdown converter.

Spec

The HTML spec dictates that an A element cannot be a descendant of another A element.

Upon encountering a nested A tag, the parser should consider the currently open A tag terminated and begin a new one. Any further text that occurs after the nested tag is closed (i.e. " end") should be treated as a sibling TextNode, and the trailing </a> is ignored.

This behaviour can be demonstrated via: https://astexplorer.net
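
To make the expected tree concrete, here is a minimal sketch of the rule in Python. It is not this library's parser and not the full algorithm from the HTML spec (which also handles other formatting elements that may be open between the two A tags). It is a toy stack-based parser, with hypothetical names (`Node`, `parse`, `shape`), that only illustrates "a new A start tag implicitly closes the A that is already open":

```python
import re
from dataclasses import dataclass, field

# Start tag, end tag, or a run of text. Attributes are skipped, not parsed.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


@dataclass
class Node:
    tag: str                      # "#root", "#text", or an element name
    text: str = ""
    children: list = field(default_factory=list)


def parse(html: str) -> Node:
    root = Node("#root")
    stack = [root]                # open elements, innermost last
    for m in TOKEN.finditer(html):
        closing, name, text = m.group(1), m.group(2), m.group(3)
        if text is not None:
            stack[-1].children.append(Node("#text", text))
            continue
        name = name.lower()
        open_names = [n.tag for n in stack]
        if closing:
            # Pop up to and including the matching open element;
            # an end tag with no matching open element is ignored.
            if name in open_names:
                while stack.pop().tag != name:
                    pass
            continue
        if name == "a" and "a" in open_names:
            # Nested <a>: implicitly close the <a> that is already open.
            while stack.pop().tag != "a":
                pass
        node = Node(name)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def shape(node: Node):
    """Reduce a tree to nested tuples/strings for easy comparison."""
    if node.tag == "#text":
        return node.text
    return (node.tag, [shape(c) for c in node.children])
```

Running `shape(parse('<a href="#">link <a href="#">nested link</a> end</a>'))` on the input from the synopsis yields two sibling A nodes followed by a top-level text node, matching the expected tree above. The check is a single lookup in the open-element stack per A start tag, which is in line with the expectation that a fix should not have a noticeable performance impact.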

Solution

I believe this should be easily solvable, without a performance impact. I will investigate this ASAP and submit a fix. I hope to get to it this weekend.

Related
