Skip to content

Fixes for recently reported issues#28

Open
frenzymadness wants to merge 6 commits intomainfrom
fixes
Open

Fixes for recently reported issues#28
frenzymadness wants to merge 6 commits intomainfrom
fixes

Conversation

@frenzymadness
Copy link
Member

This is kinda urgent, so I'm requesting review from multiple teammates at the same time.

As members of fedora-python org, you should have access to the privately reported vulnerabilities that this is going to solve. We can also discuss the issues and the solution proposed here in person, if you want to.

Cc: @uug4na

Unicode escapes in CSS were not properly decoded before security
checks. This prevents attackers from bypassing filters using
escape sequences.
<base> tags are now automatically removed whenever <head> is removed to
prevent URL hijacking attacks. According to HTML spec, <base> must be in
<head>, but browsers may interpret misplaced <base> tags, allowing
attackers to redirect all relative URLs to malicious servers.
Copy link

@uug4na uug4na left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix - decoding hex escapes covers the @\69mport / \000069 case nicely.

One small concern: CSS escapes aren’t only hex-based. A backslash can also escape a single character (e.g. \i -> i), so variants like @\impor\t or @\i mport may still be interpreted by browsers as @import. Since backslashes are no longer stripped, those might slip past _has_sneaky_javascript().

Maybe worth either fully normalizing CSS escapes or handling remaining backslashes after hex decoding. I can add test cases like <style>@\impor\t url(evil.css)</style> if helpful.

@frenzymadness
Copy link
Member Author

I thought about it and tested cases like:

<div style="background:url(ja\vascript:alert(1));">INLINE TEST 1</div>

or

<div style="@\import url(https://evil.invalid/x.css);">INLINE TEST 2</div>

but they were neither rendered without the backslash nor executed by my browser. But sure, I can put the removal of all backslash characters back and run it after the escape sequences are decoded.

@uug4na
Copy link

uug4na commented Feb 26, 2026

Appreciate you testing it. +1 to re-adding backslash removal after decoding, to cover non-hex escapes too.

Copy link

@befeleme befeleme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Maint commits: ✔️
  • sanitation - code-wise, functionality-wise: ✔️
  • CSS @import sanitizer: code-wise, functionality-wise: ✔️

I'm not familiar with the topic well enough to figure out if there are missing use cases, but for the ones listed in the report and covered in tests, I verified that the vulnerability existed and is remedied with the code from this PR.
If you're going to add more robust backslash handling, I'll be happy to re-review.

@frenzymadness
Copy link
Member Author

I've restored the old behavior and added more tests in two new commits for easier review.

test_cases = [
# Tab after escape
('<div style="@\\69\tmport url(evil.css)">test</div>', '<div>test</div>'),
# Newline after escape (note: actual newline, not \n)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(note: actual newline, not \n) - I see \n just a line below, I don't understand the comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand it like it's neither r"\n" nor "\\n".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants