IPFS path components (that is, IPFS merkledag node link names) can contain undesirable things:
- C string terminators (`0x00`, ASCII `NUL`)
- UNIX newlines (`0x0A`, `"\n"`, ASCII `LF`)
- Tabs (`0x09`, `"\t"`)
- Escape sequences (`0x1B`, ASCII `ESC`)
- Every other control character (`0x00`…`0x1F`, `0x7F`)
- Slashes (`0x2F`, `"/"`)
- Bytes which are illegal in UTF-8 (`0xFE`, `0xFF`)
- Byte sequences which are invalid UTF-8 (malformed sequences, like `0x80` in isolation)
- UTF-8 sequences that encode invalid code points (e.g. the surrogates `U+D800`…`U+DFFF`)
- UTF-8 sequences that are overlong (e.g. `0xC0 80`, `0xE0 80 80`, and `0xF0 80 80 80` all decode as `U+0000`, which is `NUL`)
Path components can also be strings that are commonly understood as path components with a special meaning, namely `"."` and `".."`.
I propose that path components (link names) be restricted both by specification and implementation, preferably by defining "valid" path components in a way that excludes the above. `go-ipfs` should refuse to create invalid links. `go-ipfs` should also refuse to process any invalid links it sees, either by discarding the link or discarding the node, and that behavior should be specified as well.
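
As a starting point for discussion, here is a minimal sketch of what such a validity check might look like in Go. `validatePathComponent` is a hypothetical name, not an existing go-ipfs function, and the exact rule set is an assumption that mirrors the list above: reject `"."` and `".."`, reject anything that is not well-formed UTF-8, and reject slashes and control characters.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// validatePathComponent is a hypothetical helper (not part of go-ipfs) that
// returns nil only if name is acceptable as a link name under the rules
// proposed in this issue.
func validatePathComponent(name string) error {
	// Names with a special meaning in most path syntaxes.
	if name == "." || name == ".." {
		return errors.New("reserved path component")
	}
	// utf8.ValidString rejects illegal bytes, malformed sequences,
	// surrogate code points, and overlong encodings.
	if !utf8.ValidString(name) {
		return errors.New("not valid UTF-8")
	}
	for _, r := range name {
		switch {
		case r == '/':
			return errors.New("contains a slash")
		case r < 0x20 || r == 0x7F:
			// Covers NUL, tab, LF, ESC, and the other ASCII control characters.
			return fmt.Errorf("contains control character %U", r)
		}
	}
	return nil
}

func main() {
	for _, name := range []string{"hello.txt", "..", "a/b", "a\nb", "\xc0\x80", "\xed\xa0\x80"} {
		fmt.Printf("%q: %v\n", name, validatePathComponent(name))
	}
}
```

Whether an invalid name should cause the single link or the whole node to be discarded is a separate decision that this check does not make.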