Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand environment variables in user-provided index URLs #5734

Open
charliermarsh opened this issue Aug 2, 2024 · 23 comments
Open

Expand environment variables in user-provided index URLs #5734

charliermarsh opened this issue Aug 2, 2024 · 23 comments
Labels
compatibility Compatibility with a specification or another tool needs-decision Undecided if this should be done

Comments

@charliermarsh
Copy link
Member

Right now, we only support expansion (1) by the shell (of course) and (2) in requirements files. But the following don't expand:

  • cargo run pip install flask --index-url 'https://${PYPI_USERNAME}.com/repository/pypi-proxy/simple/flask/'
  • Anything in a uv.toml or pyproject.toml file
@charliermarsh charliermarsh added the compatibility Compatibility with a specification or another tool label Aug 2, 2024
@charliermarsh
Copy link
Member Author

It looks like pip does not support either of these.

@charliermarsh
Copy link
Member Author

charliermarsh commented Aug 2, 2024

Poetry does not support the latter: python-poetry/poetry#208

@charliermarsh
Copy link
Member Author

Rust does not support this but the issue is not closed: rust-lang/cargo#10789

@charliermarsh
Copy link
Member Author

I am not strictly opposed to it but we should get some more opinions (\cc @zanieb @BurntSushi @konstin).

@chrisrodrigue
Copy link

PDM supports this: Store credentials with the index, Local dependencies

@chrisrodrigue
Copy link

Hatch also supports this: Context formatting

@eth3lbert
Copy link
Contributor

I looked around Poetry's document and found this https://python-poetry.org/docs/configuration/#http-basicnameusernamepassword .

@CoolCat467
Copy link

I know I'm not a maintainer of uv, but I strongly suggest that if environment lookups were added that they be opt-in. I've heard of things going very bad security wise from environment variables.

@zanieb
Copy link
Member

zanieb commented Aug 3, 2024

Can you share more details or examples please @CoolCat467?

@zanieb
Copy link
Member

zanieb commented Aug 3, 2024

Something like --allow-variable <NAME> might make sense for opt-in per-variable. We probably want to consider how we want the UX to work for lockfiles and credentials in general. I think expanding environment variables is probably not the best solution, it seems like we'd be better off finding a matching URL in our configuration file and pulling authentication from there (we need and want to support declaring indexes anyway).

@eth3lbert
Copy link
Contributor

eth3lbert commented Aug 3, 2024

@chrisrodrigue
Copy link

Related: #5119

@chrisrodrigue
Copy link

chrisrodrigue commented Aug 3, 2024

@CoolCat467

I know I'm not a maintainer of uv, but I strongly suggest that if environment lookups were added that they be opt-in. I've heard of things going very bad security wise from environment variables.

Currently uv.lock exposes API keys in source URLs which means a lot of users can’t check it into version control.

One of the strengths of environment variables is avoiding version control. Most VCSs support masking environment variables from job/log output when expanded. See GitLab and GitHub docs.

They aren’t 100% foolproof. I think there’s a tradeoff between flexibility and security.

@BurntSushi
Copy link
Member

cargo run pip install flask --index-url 'https://${PYPI_USERNAME}.com/repository/pypi-proxy/simple/flask/'

I think I would personally find it very surprising if an environment variable got expanded here. Basically, if I'm running a Unix shell command and I use single quotes, I have a very strong prior that the string is interpreted literally with no interpolation.

Something like --allow-variable <NAME> might make sense for opt-in per-variable. We probably want to consider how we want the UX to work for lockfiles and credentials in general. I think expanding environment variables is probably not the best solution, it seems like we'd be better off finding a matching URL in our configuration file and pulling authentication from there (we need and want to support declaring indexes anyway).

👍 on this (including that env vars are maybe not the best solution to this specific problem), but having an explicit --allow-variable I think would help mitigate some concerns here.

@charliermarsh charliermarsh added the needs-decision Undecided if this should be done label Aug 6, 2024
@jfgordon2
Copy link

I'm currently looking at using uv instead of pipenv for our projects, and env var expansion is something we use with pipenv and our AWS CodeArtifact repo. Not having this makes things a bit more complicated to use uv (i.e. we can't easily specify in the pyproject.toml file where the packages are located without constant updates since the index url contains the auth and it's only good for a limited time).

I'm not sure I understand the concern around env var expansion within a url for this use-case, though.

@woutervh
Copy link

woutervh commented Oct 1, 2024

"opt-in variables"? because you don't trust your own environment-variables? That makes zero sense to me.

Currently uv.lock exposes API keys in source URLs which means a lot of users can’t check it into version control.

The source-urls in the lockfile could just use the verbatim non-resolved strings "https://${PYPI_USERNAME}:${PYPI_USERNAME}..."

@paveldikov
Copy link
Contributor

paveldikov commented Oct 15, 2024

Coming from a very strongly-regulated background, hard agree with @CoolCat467. IMHO security-by-default is the way. I am not strictly opposed to the feature, but if it is to be implemented I believe it should not be on by default.

(also, TBH, I struggle to understand the use case... if you want to interpolate secrets from env vars, why not just pass UV_EXTRA_INDEX_URL? Are we trying to over-compensate for UX gaps in CI/CD platforms?)


People have been bitten countless times by tools doing insecure-by-default behaviours (NGINX and Docker are my two favourite examples). I'd really rather not be in the position of having to bolt-on extra security on top of uv.

because you don't trust your own environment-variables?

As ridiculous/paranoid as it may sound to those not in heavily-regulated fields -- nope, I do not trust my own env vars not to get leaked. This happens all the time, and if you are especially unfortunate, can lead to multi-billion $$$ costs.

And although solutions such as in-line log filters may do a decent job at scrubbing secrets out of stdout/stderr... those are still best-effort. And there are countless other places I could leak my secrets -- e.g. I could inadvertently publish them as part of my wheel/container image/rpm/deb. Obviously no one will really want to do the latter, but as uv's functionality and complexity grows, it is not impossible to think of an evolutionary path where this sort of vulnerability takes shape.

@woutervh
Copy link

@paveldikov You argue against putting secrets in environment variables, because they might be leaked (see uv.lock). OK, Then don't. I worked for years in the pharma-industry, and currently work in another highly regulated industry.

I fail to see a structural difference between putting your secrets in UV_INDEX_URL or in a extrapolated variable in pyproject.toml.
I fail to see why the former would be more safe than the latter.

@paveldikov
Copy link
Contributor

@woutervh

You argue against putting secrets in environment variables, because they might be leaked (see uv.lock). OK, Then don't.

This is actually not what I am saying. I do happen to dislike secrets-in-environment-variables, but I am not actually arguing against them. I am well aware that they are a fact of life in a number of commonly-used CI/CD platforms.

What I am rather saying is that this feature should not be on by default. I'd be perfectly happy with an allow-env-substitution setting exposed in {uv,pyproject}.toml. So you'd have:

index-url = "https://${PYPI_USERNAME}.com/repository/pypi-proxy/simple/flask/"
allow-env-substitution = true

Simple as that. I think it's a no-brainer: if you want the expressive-but-less-secure behaviour, add an extra line in your config, and be done with it.

@DanCardin
Copy link

it feels like that setting would have to be itself an env var (or uv global setting) to make any sense to me...

Like if you're explicitly describing your index-url as having this special interpolation syntax, then obviously you'll always be setting that config value as true in conjunction (having it as false would never be correct).

To me it feels like the attack you're trying guard against is some package which sets index-url = "https://attacking-index.com/${PYPI_USERNAME}/${PYPI_PASSWORD}/${OTHER_UNRELATED_ENV_VARS_TO_EXFILTRATE}", and naively uv installing in that package might expose your env vars in an uncontrollable way?

The pyproject.toml author needn't fear their own project settings/values, afaict (again, because if you can control the url, you can control that setting. But perhaps you as the consumer of some unknown pyproject.toml maybe ought to fear using it if this feature exists. Whereas today where you're explicitly mapping a given domain to the credentials (with something like netrc or keychain), or supplying them directly through UV_* env vars.

@paveldikov
Copy link
Contributor

But perhaps you as the consumer of some unknown pyproject.toml maybe ought to fear using it if this feature exists

Exactly. Two scenarios that come to mind immediately:

  • installing 3rd party package from sdist
  • CI/CD platform ran by an individual/team that is not directly responsible for the code that is being deployed.
    • 'trust' is funny one in this instance. The platform team generally trusts in the developer/code owner's good intentions, but not necessarily their competence. As such CI/CD owners go to quite some effort to implement guard rails to mitigate the risk of self-sabotage, in this instance, credential leakage/exfiltration.

(FWIW the idea of consuming an untrusted pyproject.toml which specifies an arbitrary/untrusted index -- even without interpolation -- terrifies me too. But that's a separate problem, and a very thorny one too)

@DanCardin
Copy link

I'm not sure exactly what mitigation makes the most sense, but basically my point was that the operator of the environment ought to be the one controlling access (default or not) to this feature. e.g. environment variable or uv setting. Whereas it being enableable by the person controlling the pyproject.toml would be where I could see there being issues.

The other problem, if you take this issue seriously, would be that just enabling it or disabling it seems likely to not be particularly useful. If you're operating in a project that uses the feature, likely you'd just plop the enabling env var into a bashrc and go about your day, worse if it defaults on.

Akin to index credentials themselves (by keychain) seems like it'd need to be something that maps index domain (not name, seems like that would also be too easy to attack) to allowed env vars. It just starts to get very un-ergonomic very quick; unless perhaps it was a JIT approval the first time for a given index.

@Kristina-Pianykh
Copy link

Kristina-Pianykh commented Oct 22, 2024

The problem with the first issue is the use of single quotes which tell bash to interpret the string literally. Use double quotes for string interpolation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility Compatibility with a specification or another tool needs-decision Undecided if this should be done
Projects
None yet
Development

No branches or pull requests