Bug: formatting deeply indented if is not idempotent and breaks code #288

Open
3 tasks done
black-puppydog opened this issue Oct 16, 2022 · 1 comment

@black-puppydog

Checklist prior to opening an issue

  • I have followed fully the installation steps laid out in the documentation site.
  • I have restarted jupyterlab.
  • I have read the FAQ section in the documentation site.

First off, despite what I am writing below, I want to thank Ryan and all the other contributors for making this! The extension has saved me a ton of trouble and nerves and has just been an all-round quality of life improvement! 😊

Describe the bug

I have a cell in one of my notebooks that breaks on formatting. Not sure what is going wrong yet, but I managed to get a minimal breaking example:

lm_tokenizer = object()
def f():
    while True:
        while True:
            while True:
                if (
                    lm_tokenizer.convert_tokens_to_ids(name) != lm_tokenizer.unk_token_id
                ):
                    pass

The key here is the if statement in combination with the deep nesting. In real life, this is of course a combination of class, function, and if/else nesting.

The bug happens upon formatting twice. Here's what the first save produces, which matches the output of the black online formatter:

lm_tokenizer = object()


def f():
    while True:
        while True:
            while True:
                if (
                    lm_tokenizer.convert_tokens_to_ids(name)
                    != lm_tokenizer.unk_token_id
                ):
                    pass

Note that the second part of the if statement is now the start of the line, beginning with a ! character. Now, if I paste this version into the black online formatter, it correctly produces the exact same output. However, both on my local JupyterLab and on saturncloud.io, I get this output on the second pass:

lm_tokenizer = object()


def f():
    while True:
        while True:
            while True:
                if (
                    lm_tokenizer.convert_tokens_to_ids(name)
# �                     != lm_tokenizer.unk_token_id
                ):
                    pass

The second part of the if has been commented out, and I don't know where that extra unicode symbol comes from either. In Jupyter it shows up as a red dot.
In any case, this obviously changes the code significantly, so something is going wrong... Since black itself seems to be doing just fine, I'd expect this to be a downstream issue, hence I'm filing this here.
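
The format-twice check described above can be scripted so the problem is easier to reproduce outside the notebook. This is only a sketch: `check_idempotent` and `non_ascii_chars` are hypothetical helper names, and the formatter is passed in as a plain callable. Which function the extension actually calls is not known here, so the toy `strip_trailing` formatter simply stands in for a real one.

```python
import difflib
from typing import Callable, List, Tuple


def check_idempotent(formatter: Callable[[str], str], source: str) -> Tuple[bool, str]:
    """Format twice; return (is_idempotent, unified diff between the two passes)."""
    first = formatter(source)
    second = formatter(first)
    diff = "".join(
        difflib.unified_diff(
            first.splitlines(keepends=True),
            second.splitlines(keepends=True),
            fromfile="first pass",
            tofile="second pass",
        )
    )
    return first == second, diff


def non_ascii_chars(text: str) -> List[Tuple[int, int, str]]:
    """Return (line number, column, code point) for each non-ASCII character."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                found.append((lineno, col, f"U+{ord(ch):04X}"))
    return found


def strip_trailing(src: str) -> str:
    # Toy stand-in formatter (hypothetical): strips trailing whitespace per line.
    return "\n".join(line.rstrip() for line in src.splitlines()) + "\n"


if __name__ == "__main__":
    ok, diff = check_idempotent(strip_trailing, "a = 1   \nb = 2\n")
    print("idempotent:", ok)
    print(diff)
```

Running `non_ascii_chars` on the second-pass output would show which code point the red dot actually is, which might help narrow down where it gets introduced.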

Workaround

For the time being, I just disabled formatting for the code inside the if with # fmt: off. So I'm not blocked by this, but it took me a while to spot, since my code didn't break; it "just" produced incorrect results.
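
For anyone hitting the same problem, the workaround looks roughly like this. Wrapping the whole `if` statement in the markers is one possible placement, not necessarily the exact one used in the original notebook, and `StubTokenizer` and `is_known` are hypothetical stand-ins so the snippet runs on its own.

```python
class StubTokenizer:
    """Hypothetical stand-in for the real tokenizer, just for this sketch."""

    unk_token_id = 0

    def convert_tokens_to_ids(self, name: str) -> int:
        # Pretend only "hello" is in the vocabulary.
        return 1 if name == "hello" else self.unk_token_id


lm_tokenizer = StubTokenizer()


def is_known(name: str) -> bool:
    # fmt: off
    if (
        lm_tokenizer.convert_tokens_to_ids(name) != lm_tokenizer.unk_token_id
    ):
        return True
    # fmt: on
    return False
```

The `# fmt: off` and `# fmt: on` comments sit at the same indentation level, so black leaves everything between them untouched.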

Diagnostic commands

Most relevant here, both black and jupyterlab_code_formatter are up to date:

black==22.10.0
jupyterlab-code-formatter==1.5.3
Full diagnostics
$ pip freeze
absl-py==1.2.0
aiohttp==3.8.3
aioitertools==0.11.0
aiosignal==1.2.0
anyio==3.6.1
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.0.8
astunparse==1.6.3
async-timeout==4.0.2
attrs==22.1.0
Babel==2.10.3
backcall==0.2.0
beautifulsoup4==4.11.1
bencoder.pyx==3.0.0
black==22.10.0
bleach==5.0.1
boto3==1.24.87
botocore==1.27.59
cachetools==5.2.0
certifi @ file:///croot/certifi_1665076670883/work/certifi
cffi==1.15.1
charset-normalizer==2.1.1
click==8.1.3
commonmark==0.9.1
ConfigArgParse==1.5.3
contourpy==1.0.5
croniter==1.3.7
cycler==0.11.0
datasets==2.6.0
debugpy==1.6.3
decorator==5.1.1
deepdiff==5.8.1
defusedxml==0.7.1
dill==0.3.5.1
dnspython==2.2.1
dottorrent==1.10.1
dottorrent-gui==1.3.11
email-validator==1.3.0
entrypoints==0.4
execnb==0.1.4
executing==1.1.0
fastcore==1.5.27
fastjsonschema==2.16.2
filelock==3.8.0
fire==0.4.0
fonttools==4.37.4
frozenlist==1.3.1
fsspec==2022.8.2
ghapi==1.0.3
google-auth==2.12.0
google-auth-oauthlib==0.4.6
grpcio==1.49.1
h11==0.14.0
httptools==0.5.0
huggingface-hub==0.10.0
humanfriendly==10.0
idna==3.4
ipykernel==6.16.0
ipython==8.5.0
ipython-genutils==0.2.0
ipywidgets==8.0.2
isort==5.10.1
itsdangerous==2.1.2
jedi==0.18.1
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.2.0
json5==0.9.10
jsonschema==4.16.0
jupyter==1.0.0
jupyter-console==6.4.4
jupyter-core==4.11.1
jupyter-server==1.19.1
jupyter_client==7.3.5
jupyterlab==3.4.8
jupyterlab-code-formatter==1.5.3
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.3
jupyterlab_server==2.15.2
kiwisolver==1.4.4
lxml==4.9.1
Markdown==3.4.1
MarkupSafe==2.1.1
matplotlib==3.6.0
matplotlib-inline==0.1.6
mistune==2.0.4
multidict==6.0.2
multiprocess==0.70.13
mypy-extensions==0.4.3
nbclassic==0.4.4
nbclient==0.7.0
nbconvert==7.1.0
nbdev==2.3.7
nbformat==5.6.1
nest-asyncio==1.5.6
nltk==3.7
notebook==6.4.12
notebook-shim==0.1.0
numpy==1.23.3
oauthlib==3.2.1
ordered-set==4.1.0
orjson==3.8.0
packaging==21.3
pandas==1.5.0
pandocfilters==1.5.0
parso==0.8.3
pathspec==0.10.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.2.0
platformdirs==2.5.2
prometheus-client==0.14.1
prompt-toolkit==3.0.31
protobuf==3.19.6
psutil==5.9.2
ptyprocess==0.7.0
pudb==2022.1.2
pure-eval==0.2.2
pyarrow==9.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
pydantic==1.10.2
pyDeprecate==0.3.2
Pygments==2.13.0
PyJWT==2.5.0
pyparsing==3.0.9
pyrsistent==0.18.1
python-dateutil==2.8.2
python-dotenv==0.21.0
python-multipart==0.0.5
pytorch-lightning==1.7.7
pytorch-pretrained-biggan==0.1.1
pytz==2022.4
PyYAML==6.0
pyzmq==24.0.1
qtconsole==5.3.2
QtPy==2.2.1
regex==2022.9.13
requests==2.28.1
requests-oauthlib==1.3.1
responses==0.18.0
rich==12.6.0
rofimoji==5.6.0
rsa==4.9
s3transfer==0.6.0
scikit-learn==1.1.2
scipy==1.9.1
Send2Trash==1.8.0
six==1.16.0
sniffio==1.3.0
soupsieve==2.3.2.post1
speedtest-cli==2.1.3
stack-data==0.5.1
starlette==0.20.4
tensorboard==2.10.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
termcolor==2.0.1
terminado==0.16.0
threadpoolctl==3.1.0
tinycss2==1.1.1
tokenizers==0.12.1
tomli==2.0.1
torch==1.12.1
torchmetrics==0.10.0
torchvision==0.13.1+cpu
tornado==6.2
tqdm==4.64.1
traitlets==5.4.0
transformers==4.22.2
typing_extensions==4.3.0
ujson==5.5.0
urllib3==1.26.12
urwid==2.1.2
urwid-readline==0.13
uvicorn==0.18.3
uvloop==0.17.0
watchdog==2.1.9
watchfiles==0.17.0
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.4.1
websockets==10.3
Werkzeug==2.2.2
widgetsnbextension==4.0.3
wrapt==1.14.1
xxhash==3.0.0
yarl==1.8.1
youtube-dl==2021.12.17
yt-dlp==2022.8.19
$ jupyter labextension list
JupyterLab v3.4.8
/home/daan/miniconda3/envs/hfc/share/jupyter/labextensions
        jupyterlab_pygments v0.2.2 enabled OK (python, jupyterlab_pygments)
        @jupyter-widgets/jupyterlab-manager v5.0.3 enabled OK (python, jupyterlab_widgets)
        @ryantam626/jupyterlab_code_formatter v1.5.3 enabled OK (python, jupyterlab-code-formatter)
$ jupyter serverextension list
config dir: /home/daan/miniconda3/envs/hfc/etc/jupyter
    jupyterlab  enabled
    - Validating...
      jupyterlab 3.4.8 OK
    jupyterlab_code_formatter  enabled
    - Validating...
      jupyterlab_code_formatter 1.5.3 OK
@ryantam626
Collaborator

Sorry for the late reply. I haven't given this project much love; I kinda lost steam over the years as I got increasingly burnt out, but lately I have gotten a second wind (perhaps only for a brief period...)

Oh boy this is not great.

FWIW I am in the middle of a big refactor for the project (mostly due to the evolution of JupyterLab's plugin tooling). I can look into this bug after that.

Without the refactor, the development environment for this plugin is just nightmare-ish to use, so that is currently trumping all other tasks.
