@@ -165,8 +165,9 @@ void read(Tokeniser t, CharacterReader r) {
case nullChar: // replacement
t.tagPending.appendTagName(replacementStr);
break;
- case eof: // should emit pending tag?
+ case eof:
  t.eofError(this);
+ t.emitTagPending();

security-high high

The fix for the missing tag emission on EOF is incomplete. While this PR correctly adds t.emitTagPending() to the TagName state, several other states in the same file that are involved in building a tag also handle EOF without emitting the pending tag. This inconsistency can lead to tags being dropped if the input ends prematurely, which can be exploited to bypass HTML sanitizers that rely on this parser for security filtering.

The following states are also missing the t.emitTagPending() call in their eof case:

  • BeforeAttributeName (line 574)
  • AttributeName (line 623)
  • AfterAttributeName (line 664)
  • AttributeValue_doubleQuoted (line 756)
  • AttributeValue_singleQuoted (line 789)
  • AttributeValue_unquoted (line 828)
  • AfterAttributeValue_quoted (line 865)
  • SelfClosingStartTag (line 886)

Interestingly, BeforeAttributeValue (line 709) already correctly handles this. To ensure full protection against this vulnerability, the same fix should be applied to all the states listed above.
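
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It uses a hypothetical toy tokeniser, not the parser's real classes: `EofTagSketch`, `tokenise`, and the `emitOnEof` flag are names invented for this example. It only models the idea that a tag still being built when the input ends is silently dropped unless the EOF branch emits it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the EOF handling issue; an assumption-laden sketch, not the real tokeniser. */
public class EofTagSketch {

    /**
     * Splits input into tag tokens (the text between '<' and '>').
     * When emitOnEof is false, a tag cut off by end of input is dropped,
     * which mirrors an eof case that lacks an emit call.
     */
    static List<String> tokenise(String input, boolean emitOnEof) {
        List<String> tokens = new ArrayList<>();
        StringBuilder pending = null; // tag being built, or null while in plain data

        for (char c : input.toCharArray()) {
            if (pending == null) {
                if (c == '<') {
                    pending = new StringBuilder(); // start building a tag
                }
            } else if (c == '>') {
                tokens.add(pending.toString());    // tag closed normally: emit it
                pending = null;
            } else {
                pending.append(c);                 // tag name or attribute characters
            }
        }

        // End of input reached while a tag is still pending.
        if (pending != null && emitOnEof) {
            tokens.add(pending.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String truncated = "<b>text<script src=x"; // input ends inside a tag
        System.out.println(tokenise(truncated, false)); // prints [b]
        System.out.println(tokenise(truncated, true));  // prints [b, script src=x]
    }
}
```

With `emitOnEof` set to `false`, the truncated `script` tag never reaches any downstream consumer, so a filter that inspects emitted tokens would not see it. That is the inconsistency the listed states would share if their `eof` cases skip the emit step.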

t.transition(Data);
break;
default: // buffer underrun