Handling of nulls (and other escape sequences) in docstrings #50992

Open

Description

It is relatively easy to mess up and include non-printable characters in docstrings, e.g.:

"""
This function works on the NULL character `\0`.
"""
function foo end

The \0 ends up as an actual 0x00 character in the string, and will not be printed, e.g. in the REPL. There are about 10 cases of this in the ecosystem, with two random examples:

This can also cause problems downstream in tooling. For example, Documenter currently just emits the \0 characters into the HTML (JuliaDocs/Documenter.jl#2226), which in turn can cause other problems (e.g. Gumbo does not seem to like \0 in HTML).

Opening this as a discussion / tracking issue to figure out if there is anything we can do. The point is that having the 0x00 in the string is unlikely to ever be the intent of the docstring author; more likely they wanted 0x5c 0x30 (a backslash followed by the digit 0).
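
To make the difference between the two byte sequences concrete, here is a minimal sketch comparing an ordinary string literal with an escaped and a raw one. The variable names and the function `foo_raw` are illustrative only. The last line attaches a raw string as a docstring via `@doc`, which is one way to keep the backslash without doubling it.

```julia
# In an ordinary string literal, `\0` is processed as an escape sequence (NUL).
unescaped = "`\0`"

# Escaping the backslash, or using a raw string literal, keeps `\` and `0` as two characters.
escaped = "`\\0`"
rawstr = raw"`\0`"

println(collect(codeunits(unescaped)))  # bytes: backtick, 0x00, backtick
println(collect(codeunits(escaped)))    # bytes: backtick, 0x5c, 0x30, backtick
println(escaped == rawstr)

# Illustrative only: attaching a raw string as the docstring avoids escape processing.
@doc raw"This function works on the NULL character `\0`." function foo_raw end
```

In the first case the string has three code units, in the other two it has four.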

  1. We could warn about or disallow certain characters in docstrings? Disallowing would be breaking since, as seen above, such docstrings already exist in the wild.
  2. We could somehow handle it in the Markdown standard library? Or maybe when you pull it up in the REPL? In the HTML spec, U+0000 sometimes gets replaced with U+FFFD. This is also how the CommonMark spec handles it (and therefore CommonMark.jl). So an option here would be to fix it in the Markdown parser.
  3. We could try to rely on external tooling, like semgrep or docstring linting (@tecosaur), to catch these issues.
  4. Or we don't do anything, accept that docstrings can also contain those characters, and just handle it in the tooling?
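
As a rough illustration of options 2 and 3, the sketch below shows what a U+0000 to U+FFFD replacement and a simple lint-style check could look like on a plain string. Both `sanitize_nul` and `suspicious_chars` are hypothetical helpers written for this example. They are not part of the Markdown standard library, Documenter, or any existing linter.

```julia
# Hypothetical helper (option 2): replace NUL with the Unicode replacement character.
sanitize_nul(s::AbstractString) = replace(s, '\0' => '\ufffd')

# Hypothetical helper (option 3): collect control characters that are
# unlikely to be intentional in a docstring (ordinary whitespace is allowed).
function suspicious_chars(s::AbstractString)
    return [c for c in s if iscntrl(c) && !(c in ('\n', '\t', '\r'))]
end

doc = "This function works on the NULL character `\0`.\n"

println(suspicious_chars(doc))                # contains the NUL character
println(suspicious_chars(sanitize_nul(doc)))  # empty after replacement
```

A linter would only report the finding, whereas the replacement approach changes what the reader sees without pointing the author at the underlying mistake.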

cc @pfitzseb @pankgeorg

Labels: docsystem (The documentation building system)