Make uid/gid configurable & change group of files #849

Merged
merged 22 commits into from
Oct 4, 2024

@lfrancke lfrancke commented Sep 11, 2024

Description

This PR contains three related changes for UID/GID handling. I decided to lump them together because they are related, but if the reviewer would feel more comfortable, I can also split this into multiple PRs.

  • Make the user ID and group configurable
  • Switch to numeric UIDs everywhere instead of hardcoding a name (e.g. stackable)
  • Change ownership of everything belonging to stackable so that it is owned by the root group (gid = 0)

Configurable user name, uid and gid

Using the new support for global arguments, this extracts the user ID, user name and GID into arguments that can be changed easily.
They stay at 1000, the value we have used so far, even though that is not optimal and needs to be changed as well.

But because I don't know whether any operators make assumptions about the uid/gid (and fsGroup, which is not handled here), I decided to split this into two steps.

This PR is step 1: make things more configurable. Step 2 will follow later.

Detailed reasoning

Using a hardcoded uid for our stackable user is a good idea in theory, but in practice the ID 1000 should be avoided.

This is because users in Docker containers are mapped to users on the underlying host OS. Some OSes start "real" user IDs at 1000 (or 500) and reserve everything below that for "system" users. User 1000 therefore has a good chance of being mapped to a real user that exists on the underlying system, which should be avoided.

The easiest way to avoid this is to pick a more or less arbitrary large number and use it statically in our Dockerfiles.
This is exactly what OpenShift does by default. It picks a "random" UID from a range of UIDs (in reality it picks the first one from the range). The UID is larger than 1,000,000,000 by default.
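
To make the "first one from a range" behaviour concrete, here is a minimal sketch. It assumes the range is given in the `<start>/<size>` form that OpenShift stores in the namespace annotation `openshift.io/sa.scc.uid-range`. The function name `first_uid_in_range` and the sample value are made up for illustration and are not read from a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the first UID of a "<start>/<size>" range string.
first_uid_in_range() {
    range="$1"
    # Remove everything from the first "/" onward, leaving only the start value.
    printf '%s\n' "${range%%/*}"
}

# Sample value (an assumption for illustration only).
first_uid_in_range "1000740000/10000"
```

The sketch only does string handling. It does not query a cluster or validate the range.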

Note

There is a bug/problem and the number cannot be too large. The linked issue does include a workaround which did apply to our base images.

Note

Kubernetes 1.30 contains user namespaces as a beta feature. The feature is expected to move to GA at a later point, but at the moment this is not planned for 1.32, so the earliest would be 1.33 (around April/May 2025), and it would be another 1.5 to 2 years before we could use it.

Numeric UIDs

Note

This is to support securityContext.runAsNonRoot for users who want to use it, and in preparation for a future where we might want to enable it ourselves.

The USER statement in a Dockerfile ends up in an image's metadata:

[Image: screenshot of the image metadata showing the USER value]

This user is used as the default user when an image is started using plain Docker:

docker run -it --entrypoint bash docker.stackable.tech/stackable/druid:30.0.0-stackable0.0.0-dev

It is also the default when used as a plain Pod in Kubernetes:

kubectl run test --image=docker.stackable.tech/stackable/druid:30.0.0-stackable0.0.0-dev --rm=true --restart=Never --tty=true --stdin=true -- bash

In OpenShift this is what it looks like as an admin user (they are exempt from SCCs):

kubectl run test --image=docker.stackable.tech/stackable/hbase:2.6.0-stackable0.0.0-dev --rm=true --restart=Never --tty=true --stdin=true --namespace test -- id  

Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "test" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "test" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "test" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "test" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
uid=1000(stackable) gid=1000(stackable) groups=1000(stackable)

Here is the same command run as a non-admin user (note that the use of a non-1000 ID means that we bypass the SCC warning):

oc run test --as developer --image=docker.stackable.tech/stackable/hbase:2.6.0-stackable0.0.0-dev --rm=true --restart=Never --tty=true --stdin=true --namespace test -- id
uid=1000740000(1000740000) gid=0(root) groups=0(root),1000740000
pod "test" deleted

If we, or someone else, want to enforce that a user is non-root using the securityContext.runAsNonRoot field, it will not work, because Kubernetes has no way of mapping the string stackable to a UID (it is not aware of the implementation details inside the container; the container could call out to LDAP for all it knows). Therefore the combination of a non-numeric user and runAsNonRoot is forbidden and results in an error:

[Image: screenshot of the resulting error message]

This PR, therefore, switches all Dockerfiles to use the numeric UID instead of the username.
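
The distinction between a named and a numeric user can be sketched as a small shell check. This is only an illustration of the idea, not how Kubernetes implements it, and the function name `is_numeric_user` is hypothetical. Note that a numeric value of 0 would pass this check even though it is root, so the sketch covers only the "is it numeric" part.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the user part of a USER value
# ("user" or "user:group") consists solely of digits.
is_numeric_user() {
    user="${1%%:*}"                  # keep only the part before an optional ":group"
    case "$user" in
        ''|*[!0-9]*) return 1 ;;     # empty, or contains a non-digit character
        *) return 0 ;;
    esac
}

for u in stackable 1000 1000:0; do
    if is_numeric_user "$u"; then
        echo "$u: numeric, can be verified as non-root"
    else
        echo "$u: not numeric, cannot be verified as non-root"
    fi
done
```

The value to feed into such a check can be read from an image with `docker inspect --format '{{.Config.User}}' <image>`.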

Group of all files

To allow our images to run as an arbitrary user, we need to make sure that arbitrary users can read, write and execute all files and commands that the stackable user can.

The container user is always a member of the root group and we're applying the suggested steps.
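
A minimal sketch of what such steps usually look like, assuming they follow the common OpenShift pattern of changing the group recursively and giving the group the same permissions as the owner. The helper name `make_group_accessible` and the demo directory are hypothetical; the actual Dockerfiles in this PR may do this differently.

```shell
#!/bin/sh
# Hypothetical helper: hand a directory tree to a group and give that group
# the same permissions as the owning user.
make_group_accessible() {
    dir="$1"
    gid="$2"
    chgrp -R "$gid" "$dir"   # change group ownership recursively
    chmod -R g=u "$dir"      # copy the user permission bits to the group
}

# Demonstration on a throwaway directory. The current primary group is used
# here so that the sketch also runs without root; in an image build the gid
# would be 0.
demo_dir="$(mktemp -d)"
touch "$demo_dir/app.conf"
chmod 600 "$demo_dir/app.conf"
make_group_accessible "$demo_dir" "$(id -g)"
ls -l "$demo_dir/app.conf"
```

Using `g=u` rather than a fixed mode keeps files that were not executable for the owner non-executable for the group as well.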

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes
- [ ] Changes are OpenShift compatible
- [ ] All added packages (via microdnf or otherwise) have a comment on why they are added
- [ ] Things not downloaded from Red Hat repositories should be mirrored in the Stackable repository and downloaded from there
- [ ] All packages should have (if available) signatures/hashes verified
- [ ] Add an entry to the CHANGELOG.md file
- [ ] Integration tests ran successfully
TIP: Running integration tests with a new product image

The image can be built and uploaded to the kind cluster with the following commands:

bake --product <product> --image-version <stackable-image-version>
kind load docker-image <image-tagged-with-the-major-version> --name=<name-of-your-test-cluster>

See the output of bake to retrieve the image tag for <image-tagged-with-the-major-version>.

@razvan razvan left a comment

so far, so good :)

@Techassi Techassi left a comment

I know this is currently still a draft, but I wanted to make sure that the correct image-tools version is used here: https://github.com/stackabletech/docker-images/blob/main/.github/actions/build-product-image/action.yml#L11-L13

# Conflicts:
#	stackable-base/Dockerfile
@lfrancke

Thanks @Techassi. Good catch. I'll update it now.

@lfrancke lfrancke self-assigned this Sep 13, 2024
# Conflicts:
#	.github/workflows/release.yml
#	airflow/Dockerfile
#	hello-world/Dockerfile
#	hive/Dockerfile
@lfrancke lfrancke marked this pull request as ready for review October 1, 2024 13:39

lfrancke commented Oct 1, 2024

This is ready for review.
I have not run any tests yet, but I'll try to do that this week.

# Conflicts:
#	hbase/Dockerfile
#	hive/Dockerfile
NickLarsenNZ previously approved these changes Oct 1, 2024

@NickLarsenNZ NickLarsenNZ left a comment

LGTM

lfrancke commented Oct 1, 2024

Thank you! I'll let the tests for all operators run before I merge.

lfrancke commented Oct 1, 2024

Tests:

lfrancke commented Oct 2, 2024

This is no longer ready to merge: stackabletech/actions#2 needs to be merged first, followed by an update to the actions.

lfrancke commented Oct 3, 2024

All tests pass. I now just need to update the action.

lfrancke commented Oct 4, 2024

Action has been updated. This is ready for review again.

@lfrancke lfrancke requested a review from NickLarsenNZ October 4, 2024 09:55

@NickLarsenNZ NickLarsenNZ left a comment

LGTM

@lfrancke lfrancke added this pull request to the merge queue Oct 4, 2024
Merged via the queue into main with commit cb993ee Oct 4, 2024
1 of 2 checks passed
@lfrancke lfrancke deleted the feature/image-tools-revamp branch October 4, 2024 10:42
lfrancke added a commit that referenced this pull request Oct 8, 2024
This is a follow-up for #849 and includes:

- The missing bits for Hive
- Kafka
@lfrancke

Follow-up PR with more products: #890

@lfrancke

Release Notes

  • Our Docker images now exclusively use numeric user IDs in USER statements, which allows the use of securityContext.runAsNonRoot.
  • The group ID of all files relevant to our products is now set to 0. This allows the images to be used with any arbitrary user, as every container user will always belong to the root group (0). This is especially useful on OpenShift when trying to move to the restricted-v2 SecurityContextConstraint (SCC). Stackable currently defaults to the nonroot-v2 SCC, but we plan on migrating to restricted-v2 in the future.

@lfrancke lfrancke added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Oct 15, 2024
github-merge-queue bot pushed a commit that referenced this pull request Oct 16, 2024
* Make uid/gid configurable & change group of files

This is a follow-up for #849 and includes:

- The missing bits for Hive
- Kafka

* More tools now migrated but not tested yet:

- Kafka Testing Tools
- KCat
- NiFi
- Omid

* - OPA
- Spark (WIP)

* Adds Spark and a changelog entry

* Update CHANGELOG.md

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update comment

---------

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>
Techassi pushed a commit that referenced this pull request Oct 21, 2024
github-merge-queue bot pushed a commit that referenced this pull request Oct 23, 2024
* Make uid/gid configurable & change group of files

This is a follow-up for #849 and includes:

- The missing bits for Hive
- Kafka

* More tools now migrated but not tested yet:

- Kafka Testing Tools
- KCat
- NiFi
- Omid

* - OPA
- Spark (WIP)

* Adds Spark and a changelog entry

* - statsd_exporter
- superset

* - superset
- tools

* Adds Trino

* Update CHANGELOG

* Add Trino CLI

* Add Vector

* Add note

* Update tools/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update superset/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update tools/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update trino-cli/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update trino-cli/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Update superset/Dockerfile

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>

* Fix CHANGELOG

---------

Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>
Labels
release/24.11.0 release-note Denotes a PR that will be considered when it comes time to generate release notes.
4 participants