Description
Goal
Simplify our pylance/pyright config in our monorepo setup, and improve pylance startup performance.
Summary
With go-to-definition (editor.action.revealDefinition) and friends, if the definition file is found via a symlink, I would like Pylance to resolve the symlink when opening the file.
(cc @rchiodo and @debonte for pylance and @cwebster-99 @luabud for monorepo support)
Motivation
In our monorepo, we have a mix of languages, and teams with varying preferences for file layout.
Python packages are found at various levels within the repo, and these files may live alongside scripts and other things that ought not to be import targets.
To support static analysis tools in this kind of setup, we have a directory (e.g. pkgs) with symlinks to the individual python packages.
When linting and at runtime, we put this path on PYTHONPATH, and then it's as if we had a well laid-out python environment.
Currently, Pylance opens the definition file with its symlinked path. This results in duplicate tabs opened to the same file, and means certain things such as git gutters aren't rendered in the symlinked file.
Therefore, to avoid this, we maintain a pyrightconfig.json with extraPaths that includes the parent of every python package. This is roughly 200 directories right now…
The sole benefit of this config is that Go to Definition opens the canonical path of the file, within the team's code.
This comes at a heavy price though:
- the config is hard to maintain; if teams add a package but forget to add it to the list then pylance reports import errors (red squiggle)
- if there are scripts adjacent to packages, Pylance sometimes accidentally picks them up as valid import targets
- having a large number of extraPaths is extremely slow. I did some rough benchmarking and found that Pylance took about 60s after a 'reload window' to start showing syntax highlighting. If I instead just added the pkgs symlink directory, then it started in 20s.
If only Pylance resolved symlinks, we would be very happy with our monorepo setup.
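For reference, "the parent of every python package" can be computed mechanically. The following is a minimal sketch, not our actual tooling: the function name package_parents is hypothetical, and it assumes a package is any directory containing an __init__.py and that only top-level packages matter. It relies on os.walk not following symlinked directories by default, so the pkgs symlink directory itself is not picked up.

```python
import json
import os


def package_parents(repo_root: str) -> list[str]:
    """Return repo-relative parent directories of every top-level package.

    Hypothetical helper: a "package" here is any directory with __init__.py.
    """
    parents = set()
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Skip hidden directories such as .git.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        if dirpath != repo_root and "__init__.py" in filenames:
            parent = os.path.dirname(dirpath)
            parents.add(os.path.relpath(parent, repo_root))
            dirnames.clear()  # do not descend into subpackages
    return sorted(parents)


def render_config(repo_root: str) -> str:
    """Render a pyrightconfig.json body listing every package parent."""
    return json.dumps({"extraPaths": package_parents(repo_root)}, indent=2)


if __name__ == "__main__":
    print(render_config(os.getcwd()))
```

Even when generated, a list like this still has the startup cost and the stray-script problem described above, which is why a single pkgs entry is preferable.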
Partial Workaround
Install https://github.com/zaucy/vscode-symlink-follow/ and configure
"symlink-follow.autoFollow": true,
"symlink-follow.onlyFollowWithinWorkspace": true,
This is subpar though, because when resolving symlinks, the symlinked file briefly opens and then closes, causing a visual flash and breaking reopen closed tab (workbench.action.reopenClosedEditor, cmd/ctrl + shift + t): zaucy/vscode-symlink-follow#7
Reproduction
Shell script to reproduce
#!/bin/sh
# make a pretend repo
mkdir test; cd test
mkdir myrepo
# make two packages where one imports the other
mkdir -p myrepo/team1/pkg1/ myrepo/team2/pkg2/
echo "import pkg2" > myrepo/team1/pkg1/__init__.py
touch myrepo/team2/pkg2/__init__.py
# symlink the packages into a consistent import location
mkdir myrepo/pkgs
ln -sr myrepo/team1/pkg1/ myrepo/pkgs/pkg1
ln -sr myrepo/team2/pkg2/ myrepo/pkgs/pkg2
# create a pyrightconfig.json file that includes the pkgs directory
echo '{ "extraPaths": [ "pkgs" ] }' > myrepo/pyrightconfig.json
# symlink it
ln -s myrepo my-symlinked-repo
# open the myrepo in vscode
code myrepo
# expected:
# myrepo/team1/pkg1/__init__.py 'import pkg2' go to definition should go to `myrepo/team2/pkg2/__init__.py`
# actual:
# myrepo/team1/pkg1/__init__.py 'import pkg2' go to definition goes to `myrepo/pkgs/pkg2/__init__.py`
# then open the symlinked repo
code my-symlinked-repo
# expected & actual:
# my-symlinked-repo/team1/pkg1/__init__.py 'import pkg2' go to definition goes to my-symlinked-repo/pkg2/__init__.py
├── myrepo
│   ├── pkgs
│   │   ├── pkg1 -> ../team1/pkg1
│   │   └── pkg2 -> ../team2/pkg2
│   ├── team1
│   │   └── pkg1
│   │       └── __init__.py # this imports pkg2
│   └── team2
│       └── pkg2
│           └── __init__.py
└── my-symlinked-repo -> myrepo
expected:
In myrepo/team1/pkg1/__init__.py
'import pkg2' go to definition should go to myrepo/team2/pkg2/__init__.py
actual:
In myrepo/team1/pkg1/__init__.py
'import pkg2' go to definition instead goes to myrepo/pkgs/pkg2/__init__.py
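The same path difference can be observed outside the editor. This is a minimal sketch, assuming a POSIX system where symlinks can be created: it rebuilds the pkg2 part of the layout in a temporary directory, imports pkg2 through the pkgs symlink directory, and returns both the path the module was found through and its fully resolved path. The helper names (build_demo_repo, import_via_symlink) are made up for illustration, and this says nothing about how Pylance locates files internally.

```python
import importlib
import os
import sys
import tempfile


def build_demo_repo(root: str) -> str:
    """Create team2/pkg2 and a pkgs/pkg2 symlink under root; return the pkgs dir."""
    real_pkg = os.path.join(root, "team2", "pkg2")
    os.makedirs(real_pkg)
    with open(os.path.join(real_pkg, "__init__.py"), "w"):
        pass
    pkgs = os.path.join(root, "pkgs")
    os.makedirs(pkgs)
    # Relative link target, like `ln -sr` in the reproduction script.
    os.symlink(os.path.join("..", "team2", "pkg2"), os.path.join(pkgs, "pkg2"))
    return pkgs


def import_via_symlink(root: str) -> tuple[str, str]:
    """Import pkg2 via the symlink dir; return (path as found, resolved path)."""
    pkgs = build_demo_repo(root)
    sys.modules.pop("pkg2", None)  # drop any cached copy from an earlier run
    sys.path.insert(0, pkgs)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("pkg2")
    finally:
        sys.path.remove(pkgs)
    return module.__file__, os.path.realpath(module.__file__)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        found, resolved = import_via_symlink(tmp)
        print("found via:", found)
        print("resolved: ", resolved)
```

The two printed paths name the same file; the request in this ticket is for go to definition to open the second one.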
Background
- In July 2023: A user files a ticket to support a workflow where they wish pylance to resolve symlinked directories within a repo, that are a component of a definition file path: Paths to files are not resolved with realpath before opening #4588
- In Nov 2023: A different user files a ticket to avoid having pylance 'go to definition' resolve symlink when the entire repo is a symlink: [Linux] GoTo definition performs symlink resolution #5136
- In Dec 2023: Pylance 2023.11.102, the above was closed.
- In Jan 2024: Pyright 1.1.346, presumably, this is the diff pulled into Pyright which altered the symlink resolution: https://github.com/microsoft/pyright/pull/6967/files#r1450820504
Proposed Acceptance criteria
- Given a symlinked repo, when the user runs go to definition, it should open up a filepath within the symlinked repo. (This is the current behavior as of [Linux] GoTo definition performs symlink resolution #5136)
- Given a repo with symlinks that resolve to files within the repo, when the user runs go to definition, pylance should open up the resolved path within the repo. (this would address this ticket's goal, and also the previous ask in Paths to files are not resolved with realpath before opening #4588)
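To make the two criteria concrete, here is a minimal sketch of the semantics in Python. It is not Pyright's implementation and not the diff linked below; definition_path and build_layout are hypothetical names, it assumes POSIX paths, and leaving the path untouched when the target resolves outside the workspace is my own assumption, since the criteria above do not cover that case.

```python
import os
import tempfile


def definition_path(workspace_root: str, found_path: str) -> str:
    """Hypothetical helper: choose the path to open for a definition.

    workspace_root is the folder as the user opened it (possibly a symlink).
    found_path is where the import was located (possibly through symlinks).
    """
    real_root = os.path.realpath(workspace_root)
    real_found = os.path.realpath(found_path)
    # Assumption: if the resolved file is outside the repo, keep the path as found.
    if os.path.commonpath([real_root, real_found]) != real_root:
        return found_path
    # Re-root the resolved file under the path the user opened, so a symlinked
    # repo stays symlinked while symlinks inside the repo are resolved.
    relative = os.path.relpath(real_found, real_root)
    return os.path.normpath(os.path.join(workspace_root, relative))


def build_layout(tmp: str) -> tuple[str, str]:
    """Recreate the pkg2 part of the reproduction; return (repo, symlinked repo)."""
    repo = os.path.join(tmp, "myrepo")
    real_pkg = os.path.join(repo, "team2", "pkg2")
    os.makedirs(real_pkg)
    open(os.path.join(real_pkg, "__init__.py"), "w").close()
    os.makedirs(os.path.join(repo, "pkgs"))
    os.symlink(os.path.join("..", "team2", "pkg2"),
               os.path.join(repo, "pkgs", "pkg2"))
    linked = os.path.join(tmp, "my-symlinked-repo")
    os.symlink("myrepo", linked)
    return repo, linked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for root in build_layout(tmp):
            found = os.path.join(root, "pkgs", "pkg2", "__init__.py")
            print(os.path.relpath(definition_path(root, found), tmp))
```

The design choice is to compare fully resolved paths but return a path rooted at the workspace as opened, which is what lets both #5136 and #4588 be satisfied at once.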
Sample fix
I verified that this small diff properly resolves symlinks.
Hnasar/pyright@45c609e
If needed, I can continue trying to implement the acceptance criteria in this, but I wanted to get buy-in and see what kind of semantics were amenable to the Pylance & Pyright devs.
(The best-case scenario would be to get something like this fixed upstream rather than in a downstream fork.)
Elaboration
In terms of the monorepo setup guide, our monorepo fits into "Scenario 1":
Using one shared virtual environment
In microsoft/vscode-python#21204, @luabud writes
If a mono repo is set up in a way that one can use a shared virtual environment (i.e. there's no dependency conflicts between the projects in the mono repo), my understanding is that the current experience is good enough as one can simply open the root/base folder in VS Code, create a virtual environment on the project root and install all the dependencies on that same venv. The extension can automatically activate that venv and all actions can be performed inside it.
Similar to #4588, the difference with our monorepo is that we don't actually install the packages into a venv. We have a shared venv globally installed to reduce space on all machines, and then we dynamically add the pkgs symlink dir to sys.path. If we were to follow the open source way, and introduce a lot of overhead to our build process (similar to this user), we could theoretically:
- define package metadata for all our python packages
- reorganize the entire monorepo to move all of our python packages into src folders to avoid accidentally importing scripts
- require that users create a venv with editable pkg installs
- figure out how to combine our shared venv with the local in-repo venv
I support hundreds of developers, so such a large rework of our entire repo and build setup will be quite arduous.
A little bit more cleverness with resolving the symlinks would go a long way and is the single blocker to us being happy with our monorepo set up.