
Fix quickstart doc with docker compose #1610

Merged — 9 commits, May 23, 2025
**getting-started/assets/postgres (Postgres compose file)**

```diff
@@ -32,7 +32,7 @@ services:
     volumes:
       # Bind local conf file to a convenient location in the container
       - type: bind
-        source: ../assets/postgres/postgresql.conf
+        source: ${ASSETS_PATH}/postgres/postgresql.conf
         target: /etc/postgresql/postgresql.conf
     command:
       - "postgres"
```
**getting-started/eclipselink/README.md** (6 additions, 1 deletion)

````diff
@@ -37,7 +37,12 @@ This example requires `jq` to be installed on your machine.
 2. Start the docker compose group by running the following command from the root of the repository:
 
    ```shell
-   docker compose -f getting-started/eclipselink/docker-compose-bootstrap-db.yml -f getting-started/assets/postgres/docker-compose-postgres.yml -f getting-started/eclipselink/docker-compose.yml up
+   export ASSETS_PATH=$(pwd)/getting-started/assets/
+   export CLIENT_ID=root
+   export CLIENT_SECRET=s3cr3t
+   docker compose -p polaris -f getting-started/assets/postgres/docker-compose-postgres.yml \
+     -f getting-started/eclipselink/docker-compose-bootstrap-db.yml \
+     -f getting-started/eclipselink/docker-compose.yml up
    ```
 
 3. Using spark-sql: attach to the running spark-sql container:
````
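One pitfall with the new `${ASSETS_PATH}` substitution: if the variable is not exported, Compose substitutes an empty string and the bind mounts point at the wrong paths. A defaulting guard avoids that — this is a sketch only, and the fallback path assumes the command is run from the repository root:

```shell
# ASSETS_PATH must point at getting-started/assets/; if it is unset,
# compose would substitute an empty string and the bind mounts would break.
# Default it to the conventional location when it is not already exported.
export ASSETS_PATH="${ASSETS_PATH:-$(pwd)/getting-started/assets/}"
echo "Using assets from: ${ASSETS_PATH}"
```

Compose also supports the same `${VAR:-default}` syntax directly inside the YAML, which would remove the need for the export altogether.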
**getting-started/eclipselink/docker-compose-bootstrap-db.yml** (2 additions, 2 deletions)

```diff
@@ -25,11 +25,11 @@ services:
       polaris.persistence.type: eclipse-link
       polaris.persistence.eclipselink.configuration-file: /deployments/config/eclipselink/persistence.xml
     volumes:
-      - ../assets/eclipselink/:/deployments/config/eclipselink
+      - ${ASSETS_PATH}/eclipselink/:/deployments/config/eclipselink
     command:
       - "bootstrap"
       - "--realm=POLARIS"
-      - "--credential=POLARIS,root,s3cr3t"
+      - "--credential=POLARIS,${CLIENT_ID},${CLIENT_SECRET}"
   polaris:
     depends_on:
       polaris-bootstrap:
```
**getting-started/eclipselink/docker-compose.yml** (5 additions, 5 deletions)

```diff
@@ -41,7 +41,7 @@ services:
       polaris.features."SUPPORTED_CATALOG_STORAGE_TYPES": "[\"FILE\",\"S3\",\"GCS\",\"AZURE\"]"
       polaris.readiness.ignore-severe-issues: "true"
     volumes:
-      - ../assets/eclipselink/:/deployments/config/eclipselink
+      - ${ASSETS_PATH}/eclipselink/:/deployments/config/eclipselink
     healthcheck:
       test: ["CMD", "curl", "http://localhost:8182/q/health"]
       interval: 2s
@@ -61,7 +61,7 @@ services:
       - CLIENT_ID=${CLIENT_ID}
       - CLIENT_SECRET=${CLIENT_SECRET}
     volumes:
-      - ../assets/polaris/:/polaris
+      - ${ASSETS_PATH}/polaris/:/polaris
     entrypoint: '/bin/sh -c "chmod +x /polaris/create-catalog.sh && /polaris/create-catalog.sh"'
 
   spark-sql:
@@ -98,11 +98,11 @@ services:
       polaris-setup:
         condition: service_completed_successfully
     environment:
-      - CLIENT_ID=${CLIENT_ID}
-      - CLIENT_SECRET=${CLIENT_SECRET}
+      - CLIENT_ID=${USER_CLIENT_ID}
+      - CLIENT_SECRET=${USER_CLIENT_SECRET}
     stdin_open: true
     tty: true
     ports:
      - "8080:8080"
    volumes:
-      - ../assets/trino-config/catalog:/etc/trino/catalog
+      - ${ASSETS_PATH}/trino-config/catalog:/etc/trino/catalog
```
**getting-started/jdbc/README.md** (1 addition)

```diff
@@ -40,6 +40,7 @@ This example requires `jq` to be installed on your machine.
 export QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://postgres:5432/POLARIS
 export QUARKUS_DATASOURCE_USERNAME=postgres
 export QUARKUS_DATASOURCE_PASSWORD=postgres
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 export CLIENT_ID=root
 export CLIENT_SECRET=s3cr3t
 docker compose -f getting-started/jdbc/docker-compose-bootstrap-db.yml -f getting-started/assets/postgres/docker-compose-postgres.yml -f getting-started/jdbc/docker-compose.yml up
```
**getting-started/jdbc/docker-compose.yml** (2 additions, 2 deletions)

```diff
@@ -64,7 +64,7 @@ services:
       - CLIENT_ID=${CLIENT_ID}
       - CLIENT_SECRET=${CLIENT_SECRET}
     volumes:
-      - ../assets/polaris/:/polaris
+      - ${ASSETS_PATH}/polaris/:/polaris
     entrypoint: '/bin/sh -c "chmod +x /polaris/create-catalog.sh && /polaris/create-catalog.sh"'
 
   spark-sql:
@@ -108,4 +108,4 @@ services:
       - CLIENT_ID=${CLIENT_ID}
       - CLIENT_SECRET=${CLIENT_SECRET}
     volumes:
-      - ../assets/trino-config/catalog:/etc/trino/catalog
+      - ${ASSETS_PATH}/trino-config/catalog:/etc/trino/catalog
```
**AWS deployment guide**

````diff
@@ -37,6 +37,7 @@ The requirements to run the script below are:
 
 ```shell
 chmod +x getting-started/assets/cloud_providers/deploy-aws.sh
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 ./getting-started/assets/cloud_providers/deploy-aws.sh
 ```
 
@@ -50,6 +51,7 @@ export CLIENT_SECRET=s3cr3t
 To shut down the Polaris server, run the following commands:
 
 ```shell
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 docker compose -f getting-started/eclipselink/docker-compose.yml down
 ```
````
**Azure deployment guide**

````diff
@@ -32,6 +32,7 @@ The requirements to run the script below are:
 
 ```shell
 chmod +x getting-started/assets/cloud_providers/deploy-azure.sh
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 ./getting-started/assets/cloud_providers/deploy-azure.sh
 ```
 
@@ -45,6 +46,7 @@ export CLIENT_SECRET=s3cr3t
 To shut down the Polaris server, run the following commands:
 
 ```shell
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 docker compose -f getting-started/eclipselink/docker-compose.yml down
 ```
````
**GCP deployment guide**

````diff
@@ -32,6 +32,7 @@ The requirements to run the script below are:
 
 ```shell
 chmod +x getting-started/assets/cloud_providers/deploy-gcp.sh
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 ./getting-started/assets/cloud_providers/deploy-gcp.sh
 ```
 
@@ -45,6 +46,7 @@ export CLIENT_SECRET=s3cr3t
 To shut down the Polaris server, run the following commands:
 
 ```shell
+export ASSETS_PATH=$(pwd)/getting-started/assets/
 docker compose -f getting-started/eclipselink/docker-compose.yml down
 ```
````
**site/content/in-dev/unreleased/getting-started/quickstart.md** (21 additions, 20 deletions)

````diff
@@ -24,10 +24,10 @@ weight: 200
 
 Polaris can be deployed via a docker image or as a standalone process. Before starting, be sure that you've satisfied the relevant prerequisites detailed in the previous page.
 
-## Docker Image
-
-To start using Polaris in Docker, build and launch Polaris, which is packaged with a Postgres instance, Apache Spark, and Trino.
+## Common Setup
+Before running Polaris, ensure you have completed the following setup steps:
 
+1. **Build Polaris**
 ```shell
 cd ~/polaris
 ./gradlew \
@@ -36,7 +36,20 @@ cd ~/polaris
   :polaris-quarkus-admin:assemble --rerun \
   -Dquarkus.container-image.tag=postgres-latest \
   -Dquarkus.container-image.build=true
-docker compose -f getting-started/eclipselink/docker-compose-postgres.yml -f getting-started/eclipselink/docker-compose-bootstrap-db.yml -f getting-started/eclipselink/docker-compose.yml up
 ```
+- **For standalone**: Omit the `-Dquarkus.container-image.tag` and `-Dquarkus.container-image.build` options if you do not need to build a Docker image.
+
+## Running Polaris with Docker
+
+To start using Polaris in Docker and launch Polaris, which is packaged with a Postgres instance, Apache Spark, and Trino.
+
+```shell
+export ASSETS_PATH=$(pwd)/getting-started/assets/
+export CLIENT_ID=root
+export CLIENT_SECRET=s3cr3t
+docker compose -p polaris -f getting-started/assets/postgres/docker-compose-postgres.yml \
+  -f getting-started/eclipselink/docker-compose-bootstrap-db.yml \
+  -f getting-started/eclipselink/docker-compose.yml up
+```
````

Review thread on the `export ASSETS_PATH` line:

> **adnanhemani (Collaborator):** Can we move this line to the top of the file with the exporting of the CLIENT_ID/SECRET, so that it only needs to be run once?
>
> **Author (Contributor):** Not sure if I followed — this is the beginning of the quickstart page (https://polaris.apache.org/in-dev/unreleased/getting-started/quickstart/#docker-image). This is only needed in the context of docker compose. The export of CLIENT_ID/CLIENT_SECRET is invalid, I think, in the current state (without the env file) — this won't even be able to start. If I understand correctly, we should consider moving the export of CLIENT_ID/CLIENT_SECRET to this section, as the current docker compose file has no credential, so it will try to set an empty string for the root credential (as well as the username), which is invalid. The export is only needed if the user doesn't want to use an env file (the env file will load the credentials in the updated command). Let me know what you think.
>
> **adnanhemani (Collaborator, May 20, 2025):** Sorry, this comment is in conjunction with the suggestion from the overall review's comment. We should do the following:
> 1. Move the export of CLIENT_ID/CLIENT_SECRET to the top of the file — the Docker Compose files will be able to intake the environment variables set in bash (i.e. keep one reference at the top of the Quickstart page and one at the top of each of the cloud deployment pages).
> 2. Remove all references to setting the CLIENT_ID/CLIENT_SECRET elsewhere.
> 3. Add the export ASSETS_PATH to these references to setting CLIENT_ID/CLIENT_SECRET.
>
> Does this make sense?
>
> **Author (Contributor):** I think I got what you mean. I had made some changes to this doc for refactor. Please review.
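The review thread above weighs exporting these variables against using an env file. Docker Compose does load a `.env` file automatically from the directory it is invoked in, so the three exports can be written down once. A minimal sketch — the file contents are illustrative, and note that `.env` files are not shell scripts, so `$(pwd)` would not be expanded there (use an absolute path):

```shell
# Capture the quickstart variables once; docker compose reads .env
# automatically from the directory it is run in.
cat > .env <<'EOF'
ASSETS_PATH=/absolute/path/to/polaris/getting-started/assets/
CLIENT_ID=root
CLIENT_SECRET=s3cr3t
EOF
cat .env
```

With this file in place, the `docker compose -p polaris … up` command above can be run without any prior `export` lines.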

````diff
 You should see output for some time as Polaris, Spark, and Trino build and start up. Eventually, you won't see any more logs and see some logs relating to Spark, resembling the following:
 
@@ -48,24 +61,17 @@
 ```
 spark-sql-1  | 25/04/04 05:39:38 WARN SparkSQLCLIDriver: WARNING: Direct
 spark-sql-1  | 25/04/04 05:39:39 WARN RESTSessionCatalog: Iceberg REST client is missing the OAuth2 server URI configuration and defaults to http://polaris:8181/api/catalog/v1/oauth/tokens. This automatic fallback will be removed in a future Iceberg release. It is recommended to configure the OAuth2 endpoint using the 'oauth2-server-uri' property to be prepared. This warning will disappear if the OAuth2 endpoint is explicitly configured. See https://github.com/apache/iceberg/issues/10537
 ```
 
-Finally, set the following static credentials for interacting with the Polaris server in the following exercises:
-
-```shell
-export CLIENT_ID=root
-export CLIENT_SECRET=s3cr3t
-```
-
 The Docker image pre-configures a sample catalog called `quickstart_catalog` that uses a local file system.
````
````diff
 ## Running Polaris as a Standalone Process
 
 You can also start Polaris through Gradle (packaged within the Polaris repository):
 
 1. **Start the Server**
 
    Run the following command to start Polaris:
 
 ```shell
 cd ~/polaris
+# Build the server
 ./gradlew clean :polaris-quarkus-server:assemble :polaris-quarkus-server:quarkusAppPartsBuild --rerun
+# Start the server
 ./gradlew run
 ```
 
@@ -83,11 +89,6 @@
 When using a Gradle-launched Polaris instance in this tutorial, we'll launch an […]
 For more information on how to configure Polaris for production usage, see the [docs]({{% relref "../configuring-polaris-for-production" %}}).
 
 When Polaris is run using the `./gradlew run` command, the root principal credentials are `root` and `secret` for the `CLIENT_ID` and `CLIENT_SECRET`, respectively.
-You can also set these credentials as environment variables for use with the Polaris CLI:
-```shell
-export CLIENT_ID=root
-export CLIENT_SECRET=secret
-```
 
 ### Installing Apache Spark and Trino Locally for Testing
````

Review comments:

> **Reviewer (Contributor)**, on the root-credentials line: Not a blocker: We could actually remove this line.
>
> **Reviewer (Contributor)**, on the removed lines -86 to -90: I think we still need this to enable polaris cli.
**site/content/in-dev/unreleased/getting-started/using-polaris.md** (11 additions, 5 deletions)

````diff
@@ -21,12 +21,16 @@ Title: Using Polaris
 type: docs
 weight: 400
 ---
+
+## Setup
+
+Define your `CLIENT_ID` & `CLIENT_SECRET` and export them for future use.
+
+```shell
+export CLIENT_ID=YOUR_CLIENT_ID
+export CLIENT_SECRET=YOUR_CLIENT_SECRET
+```
````
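Every command in the rest of this page depends on these two variables, so it can help to fail fast when they are missing rather than send empty credentials to Polaris. A minimal sketch using POSIX `${VAR:?}` parameter expansion; the placeholder values mirror the Setup block above:

```shell
# From the Setup section above (placeholder values):
export CLIENT_ID=YOUR_CLIENT_ID
export CLIENT_SECRET=YOUR_CLIENT_SECRET

# Fail fast with a clear error if either variable is unset or empty.
: "${CLIENT_ID:?export CLIENT_ID before continuing}"
: "${CLIENT_SECRET:?export CLIENT_SECRET before continuing}"
echo "Using client: ${CLIENT_ID}"
# prints: Using client: YOUR_CLIENT_ID
```

If either variable is unset, the `:?` expansion aborts the script with the given message instead of proceeding with blank credentials.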

````diff
 
 ## Defining a Catalog
 
 In Polaris, the [catalog]({{% relref "../entities#catalog" %}}) is the top-level entity that objects like [tables]({{% relref "../entities#table" %}}) and [views]({{% relref "../entities#view" %}}) are organized under. With a Polaris service running, you can create a catalog like so:
 
@@ -167,7 +171,6 @@ bin/spark-sql \
   --conf spark.sql.catalog.quickstart_catalog.client.region=us-west-2
 ```
 
-
 Similar to the CLI commands above, this configures Spark to use the Polaris running at `localhost:8181`. If your Polaris server is running elsewhere, but sure to update the configuration appropriately.
````
Expand All @@ -176,7 +179,9 @@ Finally, note that we include the `iceberg-aws-bundle` package here. If your tab

Refresh the Docker container with the user's credentials:
```shell
docker compose -f getting-started/eclipselink/docker-compose.yml up -d
docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml stop spark-sql
docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml rm -f spark-sql
docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml up -d --no-deps spark-sql
Comment on lines +182 to +184
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, prefer to have a JDBC example.

```

````diff
 Attach to the running spark-sql container:
 
@@ -237,14 +242,15 @@ org.apache.iceberg.exceptions.ForbiddenException: Forbidden: Principal 'quicksta
 Refresh the Docker container with the user's credentials:
 
 ```shell
-docker compose -f getting-started/eclipselink/docker-compose.yml down trino
-docker compose -f getting-started/eclipselink/docker-compose.yml up -d
+docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml stop trino
+docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml rm -f trino
+docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml up -d --no-deps trino
 ```
 
 Attach to the running Trino container:
 
 ```shell
-docker exec -it eclipselink-trino-1 trino
+docker exec -it $(docker ps -q --filter name=trino) trino
 ```
 
 You may not see Trino's prompt immediately, type ENTER to see it. A few commands that you can try:
````

> **Reviewer (Contributor)**, on the added lines 245–247: I'd prefer to provide jdbc examples instead of eclipseLink ones here, we could do that in a follow-up PR though.