UFAL/Download instructions for command line don't work #1210

@milanmajchrak

Original issue: ufal/clarin-dspace#1130 (comment)

Why curl {} brace expansion cannot be used for download commands

Background

The original requirement was to use curl's URL globbing ({}) for compact download commands:

curl -o "#1" "https://server/api/.../handle/123/42{/file1.txt,/file2.txt}"

After thorough investigation, this approach is not viable. Here's why:

Problem 1: Raw (non-encoded) filenames in {} produce invalid HTTP requests

Filenames with spaces, diacritics, or special characters (for example Médiá (3).jfif) placed directly in {} cause curl to send raw bytes in the URL path. This violates the HTTP specification (RFC 7230), and the server returns 400 Bad Request.
If you run curl on Windows, your terminal must also use the right encoding (code page).
You can check the current code page with the chcp command.
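
To illustrate which parts of such a filename are the problem, here is a minimal Python sketch. It only inspects the example name from this issue and does not reproduce what curl does internally. The variable names are made up for illustration.

```python
# The example filename from this issue, written with escapes so the
# precomposed characters (U+00E9, U+00E1) are unambiguous: "Médiá (3).jfif"
name = "M\u00e9di\u00e1 (3).jfif"

# Characters that cannot appear raw in a URL path: spaces and anything non-ASCII.
problem_chars = [ch for ch in name if ch == " " or ord(ch) > 127]

# On the wire a non-ASCII character becomes several raw bytes in UTF-8;
# percent-encoding spells those same bytes as %C3%A9.
e_acute_hex = "\u00e9".encode("utf-8").hex().upper()

print(problem_chars)
print(e_acute_hex)
```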

Problem 2: Percent-encoded filenames in {} produce wrong output filenames

curl -o "#1" "url/{M%C3%A9di%C3%A1.jfif,simple.txt}"

The #1 variable is substituted with the literal text from {} — including percent sequences. The file is saved as M%C3%A9di%C3%A1.jfif instead of Médiá.jfif. curl does not URL-decode #1. From the official documentation: "curl does not attempt to decode %-sequences (yet) in the provided file name". There is no --url-decode or similar option.
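
The mismatch can be shown with a short Python sketch. It assumes, as described above, that #1 expands to the literal text from {}. The decoding step is what curl would have to do and does not.

```python
from urllib.parse import unquote

# Literal text inside {} as it appears in the URL.
glob_text = "M%C3%A9di%C3%A1.jfif"

# What "#1" expands to: the same literal text, percent sequences included.
saved_as = glob_text

# What the user expects on disk: the percent-decoded name.
wanted = unquote(glob_text)

print(saved_as)
print(wanted)
```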

Problem 3: -J (Content-Disposition) fails on Windows

curl -OJ "url/{M%C3%A9di%C3%A1.jfif,simple.txt}"

The -J flag reads the filename from the server's Content-Disposition header. However, curl on Windows fails with "Invalid argument" when the header contains non-ASCII characters (diacritics, CJK, etc.).

Solution

Use separate -o "filename" "url" pairs for each file:

curl -o "Médiá (3).jfif" "https://server/.../M%C3%A9di%C3%A1%20%283%29.jfif" -o "data.csv" "https://server/.../data.csv"
  • -o receives the real filename (with diacritics) directly from the shell — works on all OS
  • The URL is properly percent-encoded — valid HTTP
  • Works on Windows, Linux, and macOS for any filename
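
As a rough sketch of how such a command could be generated, the Python snippet below builds one -o "filename" "url" pair per file. The function name build_curl_args and the base URL https://server.example/files are hypothetical placeholders, not the actual server path or the project's implementation.

```python
from urllib.parse import quote


def build_curl_args(base_url, filenames):
    """Return a curl argument list with one -o <name> <url> pair per file.

    base_url is a hypothetical placeholder for the real download endpoint.
    """
    args = ["curl"]
    for name in filenames:
        # safe="" also encodes "/" so the name stays a single path segment.
        encoded = quote(name, safe="")
        # -o gets the real filename; the URL gets the percent-encoded one.
        args += ["-o", name, f"{base_url.rstrip('/')}/{encoded}"]
    return args


args = build_curl_args(
    "https://server.example/files",
    ["M\u00e9di\u00e1 (3).jfif", "data.csv"],
)
print(args)
```

The result is an argument list rather than a single string, so it can be passed to a process launcher without shell quoting. Rendering it as a copy-paste command still requires quoting suited to the target shell.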
