Yeeeeaaaaaah, this is a weird one. The TL;DR is that HTML::SAX::Parser is using XML::ParserContext instead of HTML::ParserContext for parsing IO. This bug has been present since ... the dawn of time.
SOOOOoooo ... how does it feel to be the first person in the universe to SAX-parse an HTML file through an IO object? Must feel good, right? Right?
Sigh.
I'll have a fix as soon as I work through the inconsistencies in API between the XML and HTML SAX parsers.
I noticed that the HTML SAX parser fails when reading the same HTML from a file vs. a string. It chokes on any badly formed tags such as `<br >` from an IO source.
I've added a test to highlight the issue in #1576.
If anyone could point me in a general direction, I would be more than happy to try to fix this.