Commit 25f8bc3

Leg0shii and ceberam committed
fix(HTML): skip loading SVG files in image data handling
Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
Signed-off-by: Leg0shii <dragonsaremyfavourite@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
1 parent 8960f4a commit 25f8bc3

10 files changed: +12888 -2141 lines changed

docling/backend/html_backend.py

Lines changed: 5 additions & 0 deletions
@@ -54,6 +54,7 @@
     "details",
     "figure",
     "footer",
+    "img",
     "h1",
     "h2",
     "h3",
@@ -1242,6 +1243,10 @@ def _create_image_ref(
         return None
 
     def _load_image_data(self, src_url: str) -> Optional[bytes]:
+        if src_url.lower().endswith(".svg"):
+            _log.debug(f"Skipping SVG file: {src_url}")
+            return None
+
         try:
             if src_url.startswith(("http://", "https://")):
                 response = requests.get(src_url, stream=True)
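
The guard added to `_load_image_data` can be tried in isolation. The following is a minimal sketch, not the backend's actual code: `load_image_data_sketch` is a hypothetical stand-in that reproduces only the suffix check from the diff, and the empty `bytes` return value is a placeholder for the real loading logic.

```python
from typing import Optional


def load_image_data_sketch(src_url: str) -> Optional[bytes]:
    """Hypothetical stand-in for _load_image_data showing only the new guard."""
    # Same case-insensitive suffix check as the lines added in the diff.
    if src_url.lower().endswith(".svg"):
        return None
    # Placeholder: the real method fetches or reads the image here.
    return b""


print(load_image_data_sketch("images/Logo.SVG"))                   # None
print(load_image_data_sketch("images/photo.png"))                  # b''
print(load_image_data_sketch("https://example.org/icon.svg?v=2"))  # b''
```

Because the check is a plain suffix match on the whole string, a URL with a query string after `.svg` is not skipped by this guard.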

tests/data/groundtruth/docling_v2/embedded_wiki_duck.html.itxt

Lines changed: 260 additions & 256 deletions
Large diffs are not rendered by default.
