Skip to content

Fixed escaping of alt text in ContentFormat.img() #2036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bmispelon
Copy link
Member

@bmispelon bmispelon commented Apr 17, 2025

This should fix #2035 and also address escaping issues in the other formats (markdown and restructured text).

@bmispelon bmispelon changed the title Fixed escaping of alt text in ContentFormat.HTML Fixed escaping of alt text in ContentFormat.img() Apr 17, 2025
@bmispelon bmispelon marked this pull request as ready for review April 17, 2025 07:43
@bmispelon bmispelon force-pushed the fix/2035-img-alt-text-escape branch from 6cdcbef to 161b8fd Compare April 17, 2025 07:44
@bmispelon bmispelon requested a review from thibaudcolas April 17, 2025 07:44
@thibaudcolas thibaudcolas removed their request for review May 14, 2025 04:41
Copy link
Member

@thibaudcolas thibaudcolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable but I don’t think I’ll have time to actually test this / review in more depth, sorry

Copy link
Member

@adamzap adamzap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the final ReST test that includes a newline (see my other comment), I'm able to get the tests to pass with this small change:

diff --git a/blog/models.py b/blog/models.py
index 0083dd22..9ca310b0 100644
--- a/blog/models.py
+++ b/blog/models.py
@@ -1,3 +1,4 @@
+import html
 from urllib.parse import urlparse
 
 from django.conf import settings
@@ -73,7 +74,7 @@ class ContentFormat(models.TextChoices):
         CF = type(self)
         return {
             CF.REST: f".. image:: {url}\n   :alt: {alt_text}",
-            CF.HTML: f'<img src="{url}" alt="{alt_text}">',
+            CF.HTML: f'<img src="{url}" alt="{html.escape(alt_text)}">',
             CF.MARKDOWN: f"![{alt_text}]({url})",
         }[self]

I'm wondering if _md_escape is not needed since markdown escapes alt text automatically:

>>> import markdown
>>> markdown.markdown('![Al"t t"ext](/path/to/img.jpg)')
'<p><img alt="Al&quot;t t&quot;ext" src="/path/to/img.jpg" /></p>'

If the intent behind _md_escape is something beyond fixing alt text quoting, could we put it in a different PR?

@bmispelon bmispelon force-pushed the fix/2035-img-alt-text-escape branch from 161b8fd to cef9de9 Compare May 20, 2025 20:53
@bmispelon
Copy link
Member Author

I got carried away not realizing that escaping was not really needed in markdown, I agree that keeping it simple is better here.

I went for the format_html route instead of manually using html.escape() as you suggested, but removed the _md_escape() and _rst_escape() which simplifies the PR (I agree that supporting newlines in alt texts for rst is not worth the complications).

Let me know what you think (I've kept the changes in a separate commit for ease of review but the two should get squashed when merging).

Copy link
Member

@adamzap adamzap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks great! No worries on the initial complexity. Thanks for considering my suggestions.

Copy link
Member

@pauloxnet pauloxnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ContentFormat image HTML source needs to escape attributes
4 participants