Commit b95645b

Prefer MIME type when determining extensions for MediaBag items (#10557)
Currently, remote images added to the MediaBag are stored at paths with extensions determined based on the external URI. For instance, an image from https://example.com/image.png is stored as <hash>.png. If the URI does not contain an extension (e.g., https://example.com/image), then the content-type of the downloaded image is used to determine the extension.

This change switches the precedence such that content-type is preferred over extensions contained in the URI. This is necessary because some images are located at URIs with misleading extensions: shields.io, for instance, serves SVGs from URIs with .yml extensions. With this change, the image/svg+xml content-type is now preferred over the .yml URI extension.

This fixes a bug in the PDF writer in which such an image would be mishandled due to not being identified as an SVG.
1 parent ba04a99 commit b95645b

2 files changed: +6 -4 lines changed
src/Text/Pandoc/MediaBag.hs

+5-3
@@ -107,9 +107,11 @@ insertMedia fp mbMime contents (MediaBag mediamap)
                         _ -> getMimeTypeDef fp''
         mt = fromMaybe fallback mbMime
         path = maybe fp'' (unEscapeString . uriPath) uri
-        ext = case takeExtension path of
-                '.':e | '%' `notElem` e -> '.':e
-                _ -> maybe "" (\x -> '.':T.unpack x) $ extensionFromMimeType mt
+        ext = case extensionFromMimeType mt of
+                Just e -> '.':T.unpack e
+                Nothing -> case takeExtension path of
+                             '.':e | '%' `notElem` e -> '.':e
+                             _ -> ""

 -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
 lookupMedia :: FilePath
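
The precedence switch in the hunk above can be tried in isolation with the following minimal sketch. It is not pandoc's code: `extensionFromMime` is a hypothetical stand-in for `extensionFromMimeType` that knows only three MIME types and works on `String` rather than `Text`, and `oldExtension` and `newExtension` are illustrative names for the two versions of the `ext` binding. It assumes the `filepath` package for `takeExtension`.

```haskell
module ExtensionPrecedence where

import System.FilePath (takeExtension)

-- Hypothetical stand-in for pandoc's extensionFromMimeType.
-- Only a handful of MIME types are covered, purely for illustration.
extensionFromMime :: String -> Maybe String
extensionFromMime mime = lookup mime
  [ ("image/svg+xml", "svg")
  , ("image/png",     "png")
  , ("image/jpeg",    "jpg")
  ]

-- Old precedence: the path's extension wins; the MIME type is a fallback.
oldExtension :: String -> FilePath -> String
oldExtension mime path =
  case takeExtension path of
    '.':e | '%' `notElem` e -> '.' : e
    _                       -> maybe "" ('.' :) (extensionFromMime mime)

-- New precedence: the MIME type wins; the path's extension is a fallback.
newExtension :: String -> FilePath -> String
newExtension mime path =
  case extensionFromMime mime of
    Just e  -> '.' : e
    Nothing -> case takeExtension path of
      '.':e | '%' `notElem` e -> '.' : e
      _                       -> ""
```

Under these assumptions, an SVG served from a path ending in `.yml` gets `.yml` from `oldExtension "image/svg+xml" "/badge.yml"` and `.svg` from `newExtension "image/svg+xml" "/badge.yml"`. When the MIME type is not in the table, `newExtension` still falls back to the path's extension.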

test/Tests/MediaBag.hs

+1-1
@@ -29,7 +29,7 @@ tests = [
         assertBool "file in directory is not extracted with original name" exists1
         exists2 <- doesFileExist ("foo" </> "f9d88c3dbe18f6a7f5670e994a947d51216cdf0e.jpg")
         assertBool "file above directory is not extracted with hashed name" exists2
-        exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua")
+        exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.png")
         exists4 <- doesFileExist "a.lua"
         assertBool "data uri with malicious payload gets written outside of destination dir"
           (exists3 && not exists4)