
untokenize() does not round-trip for tab characters #128031

Closed as not planned

Description

@tomasr8

Bug report

Bug description:

Tab characters used as whitespace do not round-trip through tokenize.untokenize(). Here's an example:

import tokenize, io

source = "a +\tb"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
x = tokenize.untokenize(tokens)
print(x)
# a + b
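The whitespace itself never appears as a token: it survives only in each token's `line` attribute and in the start/end column offsets, which is why untokenize has to synthesize it. A quick way to see this (a diagnostic snippet, not part of the report's proposed fix):

```python
import io
import tokenize

source = "a +\tb"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# No token string contains the tab; it only shows up in `line` and
# in the gap between one token's end column and the next one's start.
for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)
```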

Here, the tab is replaced with a regular space. Given that untokenize tries to match the source string exactly, I think we should fix this. We can do that by copying the whitespace characters from the source line rather than always emitting a normal space; the relevant line in Untokenizer.add_whitespace currently does:

self.tokens.append(" " * col_offset)
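The idea can be sketched as a standalone helper that rebuilds the source from the token stream, slicing inter-token whitespace out of each token's own `line` instead of padding with spaces. (`untokenize_exact` is a hypothetical illustration of the approach, not the actual patch sent upstream; it only handles straightforward sources.)

```python
import io
import tokenize

def untokenize_exact(source):
    """Rebuild `source` from its tokens, copying inter-token whitespace
    from each token's source line so tabs survive.
    Hypothetical sketch of the proposed approach, not the stdlib code."""
    out = []
    prev_row, prev_col = 1, 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        _, string, (srow, scol), (erow, ecol), line = tok
        if srow > prev_row:
            # A NEWLINE/NL token already emitted the line break;
            # restart column tracking at the new line's start.
            prev_col = 0
        if scol > prev_col:
            out.append(line[prev_col:scol])  # spaces *or* tabs, verbatim
        out.append(string)
        prev_row, prev_col = erow, ecol
    return "".join(out)

print(repr(untokenize_exact("a +\tb\n")))  # the tab is preserved
```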

I'll send a fix in a moment :)

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Linked PRs

Metadata

Labels

stdlib (Python modules in the Lib dir), topic-parser, type-feature (A feature request or enhancement)
