Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate (sandbox) language servers #12358

Open
1 task done
jansol opened this issue May 27, 2024 · 13 comments
Open
1 task done

Isolate (sandbox) language servers #12358

jansol opened this issue May 27, 2024 · 13 comments
Labels
enhancement [core label] extension infrastructure Feedback for extensions APIs, creation, management, etc language An umbrella label for all programming languages syntax behaviors network Network connectivity issues, protocols and services support security & privacy Data privacy issue, security vulnerabilities, etc

Comments

@jansol
Copy link
Contributor

jansol commented May 27, 2024

Check for existing issues

  • Completed

Describe the feature

Related to #12354. Language servers downloaded by Zed have full access to everything on the machine. This is problematic from a security perspective (e.g. as seen recently the github release could be tampered with, causing people to download and execute malicious code). To mitigate the possible impact from this, it would be useful to sandbox language servers somehow, e.g. restrict them to only access files that belong to the work tree (and the language's respective system/standard library modules).

If applicable, add mockups / screenshots to help present your vision of the feature

No response

@jansol jansol added admin read Pending admin review enhancement [core label] triage Maintainer needs to classify the issue labels May 27, 2024
@Moshyfawn Moshyfawn added language An umbrella label for all programming languages syntax behaviors network Network connectivity issues, protocols and services support extension infrastructure Feedback for extensions APIs, creation, management, etc security & privacy Data privacy issue, security vulnerabilities, etc and removed triage Maintainer needs to classify the issue labels May 27, 2024
@maxdeviant
Copy link
Member

Is there any prior art in this space?

@jansol
Copy link
Contributor Author

jansol commented May 28, 2024

The extreme solution would basically be containers (docker, lxc, etc). I don't know how they work on other platforms but on Linux it AFAIK boils down to some cgroup2 trickery and chroot'ing into a virtual directory hierarchy set up with bind-mounts that looks like a linux system with only the specified things present.

@maxdeviant
Copy link
Member

I was thinking more of other editors that have sandboxed execution of language servers.

@bjorn3
Copy link

bjorn3 commented May 28, 2024

Implementing good sandboxes from scratch can be surprisingly tricky. For example if you forget to create a new network namespace (for example because the language server needs internet access to download dependencies from the internet and you don't want to use a bridge between the host and container network namespace like netavark), you just got yourself a way to escape the sandbox by sending arbitrary keypresses to the X server over a unix socket in the "abstract namespace" which is attached to the network namespace rather than the mount namespace like regular unix sockets.

Also the exact extent to which you can sandbox a language server depends on the language server in question. For example rust-analyzer needs to be able to run builds with cargo, which in turn requires read-only access to the system toolchain and all source code as well as full write access to the target dir, Cargo.lock and the caches in ~/.cargo/{git,registry}. While other language servers may be perfectly fine with not being able to access any files at all other than through the editor presenting it the source code for open files through the lsp protocol.

@JosephTLyons JosephTLyons removed the admin read Pending admin review label May 28, 2024
@skewballfox
Copy link
Contributor

skewballfox commented May 28, 2024

For example if you forget to create a new network namespace (for example because the language server needs internet access to download dependencies from the internet and you don't want to use a bridge between the host and container network namespace like netavark)

I'd argue that that isn't a feature that should be built directly into a language server, and instead handled by the whatever is (planned for) managing the extension.

For example rust-analyzer needs to be able to run builds with cargo, which in turn requires read-only access to the system toolchain and all source code as well as full write access to the target dir, Cargo.lock and the caches in ~/.cargo/{git,registry}.

some language servers also check ~/.config for global configurations, along with project scoped configurations in the workspace directory.

I checked the docs for extensions but didn't see anything relevant. If extension sandboxing is something zed is interest in supporting, one way to support this would be to have some kind of allowlist in the extension toml:

[permission.files]
workspace=["pyproject.toml", "mypy.init"]

@SyntheticBird45
Copy link

SyntheticBird45 commented May 29, 2024

Regarding linux:

  • Namespaces are extreme and might cause issues on hardened kernel/distributions as user namespaces are disabled. Moreover, it is tricky to provide good security with them by respecting the least privilege principle, as it'll be easy for things to break. For reference, VSCode (being based on electron) make use of a SUID helper binary to setup a sandbox when user namespaces are unavailable.
  • Stricly speaking of FS restrictions, the Landlock API can be useful
  • SELinux and Apparmor are out of scopes as you can't dynamically manage them/load your own without risky permissions and/or performance cost.
  • Flatpak make use of Bubblewrap software to sandbox its applications. Bubblewrap is available as a user namespaces and SUID variants.

From my POV, Landlock and Bubblewrap seems to be the best available options at sandboxing launched extensions processes

@bjorn3
Copy link

bjorn3 commented May 29, 2024

Stricly speaking of FS restrictions, the Landlock API can be useful

Landlock currently doesn't support restricting unix sockets in the abstract namespace, so without separately creating a new network namespace using eg Bubblewrap or restricting all network access sycalls using seccomp, it isn't really useful for sandboxing on desktop systems.

@jansol
Copy link
Contributor Author

jansol commented May 29, 2024

Looks like there's a "rust-native" implementation based on seccomp+namespaces+cgroup resource limits with a fairly nice API: https://crates.io/crates/hakoniwa

Bubblewrap would be the more "mainstream" option. AFAIK it is also used by steam, among others, but I did not find existing rust bindings for it with a quick search.

@SyntheticBird45
Copy link

SyntheticBird45 commented May 29, 2024

Landlock currently doesn't support restricting unix sockets in the abstract namespace, so without separately creating a new network namespace using eg Bubblewrap or restricting all network access sycalls using seccomp, it isn't really useful for sandboxing on desktop systems.

Looks like there's a "rust-native" implementation based on seccomp+namespaces+cgroup resource limits with a fairly nice API: https://crates.io/crates/hakoniwa

As long as the final solution do not make Zed rely on user namespaces, it's fine.

@jansol
Copy link
Contributor Author

jansol commented May 29, 2024

is this what you mean by user namespaces? Looks like they are optional.

@bjorn3
Copy link

bjorn3 commented May 29, 2024

Without user namespaces or a setuid helper it is not possible to create new mount or network namespaces without root permission. This means that the only sandboxing you can do is with seccomp. This is rather fragile for two reasons however: You can't selectively allow specific paths to be read or written. Only allow or deny all filesystem access. This is fine when the program to be sandboxed applies the seccomp rules after dynamic linking and other kinds of initialization that need filesystem access and the program genuinely doesn't need any filesystem access at runtime or all files it needs access to are opened by the program before applying the seccomp rules. This doesn't work when you want to sandbox an untrusted application rather than a trusted program that may happen to have a bug reachable only after the initialization phase. The second reason is that seccomp works at the level of individual syscalls. You either have to allow all syscalls except for specific ones (which will allow a sandbox escape when updating the kernel) or deny all syscalls except for specific ones. When denying all except specified ones any change to the to be sandboxed program may cause the sandboxed process to crash. Any libc update can start using new syscalls and the same applies for other libraries and the program itself.

@SyntheticBird45
Copy link

Since VSCode is based on electron it can take advantage of the SUID helper chrome-sandbox. I hardly see Zed needing to ship a root/4755 binary. Unless the sandboxing is a user namespace only feature, there is no practical way for Zed to sandbox language servers on Linux

@bjorn3
Copy link

bjorn3 commented May 29, 2024

Relying on bwrap would still be an option. It can take advantage of user namespaces, but if those are not available it can run as SUID helper. A lot of people have it installed already as it is required by flatpak. And for the remaining users you can ship a copy and try to use user namespaces. For the small subset of users who have neither bwrap not user namespaces opting for no sandbox would be better than opting for no sandbox for anyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement [core label] extension infrastructure Feedback for extensions APIs, creation, management, etc language An umbrella label for all programming languages syntax behaviors network Network connectivity issues, protocols and services support security & privacy Data privacy issue, security vulnerabilities, etc
Projects
None yet
Development

No branches or pull requests

7 participants