Add REST catalog support in docs (#1) #4031
base: main
Conversation
@somratdutta is attempting to deploy a commit to the ClickHouse Team on Vercel. A member of the Team first needs to authorize it.
@somratdutta would you mind merging main into your branch? The error that the Vercel deployment is failing on is unrelated to your changes and was fixed yesterday.
Sure @Blargian.
@somratdutta firstly, thank you so much for this contribution. I am really excited to see it! I've left a few comments. I was unable to get this working by following these steps, so let's put together an example that can be easily reproduced.
:::note
As this feature is experimental, you will need to enable it using:
`SET allow_experimental_database_rest_catalog = 1;`
Suggested change: replace `SET allow_experimental_database_rest_catalog = 1;` with
`SET allow_experimental_database_iceberg = 1;`
clickhouse:
  image: clickhouse/clickhouse-server:main
  container_name: clickhouse
  user: '0:0' # Ensures root permissions
  networks:
    iceberg_net:
  ports:
    - "8123:8123"
    - "9002:9000"
  volumes:
    - ./clickhouse:/var/lib/clickhouse
    - ./clickhouse/data_import:/var/lib/clickhouse/data_import # Mount dataset folder
  networks:
    - iceberg_net
  environment:
    - CLICKHOUSE_DB=default
    - CLICKHOUSE_USER=default
    - CLICKHOUSE_DO_NOT_CHOWN=1
    - CLICKHOUSE_PASSWORD=
Suggested change (image pinned to `25.5.6` instead of `main`):
clickhouse:
  image: clickhouse/clickhouse-server:25.5.6
  container_name: clickhouse
  user: '0:0' # Ensures root permissions
  networks:
    iceberg_net:
  ports:
    - "8123:8123"
    - "9002:9000"
  volumes:
    - ./clickhouse:/var/lib/clickhouse
    - ./clickhouse/data_import:/var/lib/clickhouse/data_import # Mount dataset folder
  networks:
    - iceberg_net
  environment:
    - CLICKHOUSE_DB=default
    - CLICKHOUSE_USER=default
    - CLICKHOUSE_DO_NOT_CHOWN=1
    - CLICKHOUSE_PASSWORD=
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://rest:8181/v1', 'admin', 'password')
SETTINGS
    catalog_type = 'rest',
    storage_endpoint = 'http://minio:9000/lakehouse',
    warehouse = 'demo'
Suggested change (enable the experimental setting before creating the database):
SET allow_experimental_database_iceberg = 1;
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://rest:8181/v1', 'admin', 'password')
SETTINGS
    catalog_type = 'rest',
    storage_endpoint = 'http://minio:9000/lakehouse',
    warehouse = 'demo'
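For anyone reproducing this outside `clickhouse-client`, here is a minimal sketch of sending the same statements through the ClickHouse HTTP interface. It assumes the container's port 8123 is published on `localhost` (as in the compose snippet above) and that the `default` user accepts requests without a password; `build_request` and `run` are hypothetical helper names. Because each HTTP request is stateless unless a session is used, the experimental flag is passed as a URL parameter on every request instead of a separate `SET` statement.

```python
import urllib.parse
import urllib.request

# Assumption: ClickHouse HTTP port 8123 is published on the host, per the compose snippet.
CLICKHOUSE_URL = "http://localhost:8123/"

# The statement from the suggested change, unchanged.
CREATE_DEMO = """
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://rest:8181/v1', 'admin', 'password')
SETTINGS
    catalog_type = 'rest',
    storage_endpoint = 'http://minio:9000/lakehouse',
    warehouse = 'demo'
"""


def build_request(sql, settings=None, base_url=CLICKHOUSE_URL):
    """Return a POST request whose body is `sql`, with settings as URL parameters."""
    query_string = urllib.parse.urlencode(settings or {})
    url = f"{base_url}?{query_string}" if query_string else base_url
    return urllib.request.Request(url, data=sql.strip().encode("utf-8"), method="POST")


def run(sql, settings=None):
    """Send one statement to ClickHouse and return the response body as text."""
    with urllib.request.urlopen(build_request(sql, settings), timeout=10) as response:
        return response.read().decode("utf-8")
```

With the stack running, `run(CREATE_DEMO, {"allow_experimental_database_iceberg": 1})` followed by `run("SHOW TABLES IN demo", {"allow_experimental_database_iceberg": 1})` would exercise the same two steps as the guide.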
┌─name──────────┐
│ default.taxis │
└───────────────┘
Unfortunately, I don't get this result when I run the steps. I get back:
SHOW TABLES IN demo
Query id: 4411372a-a71c-44e9-b27b-146af2048670
Ok.
0 rows in set. Elapsed: 0.047 sec.
The `demo` database is, however, created:
SHOW DATABASES
Query id: 70f26176-08cd-4e5e-b788-44ce1adf10eb
   ┌─name───────────────┐
1. │ INFORMATION_SCHEMA │
2. │ default            │
3. │ demo               │
4. │ information_schema │
5. │ system             │
   └────────────────────┘
Can you confirm you were able to get this working locally?
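One way to narrow this down is to ask the REST catalog directly whether it holds any tables, independent of ClickHouse. The sketch below uses the namespace and table listing endpoints of the Iceberg REST catalog API. It assumes the catalog's port 8181 is published on `localhost` and that the catalog uses no path prefix; the helper names are hypothetical, and only top-level namespaces are listed.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the REST catalog from the compose setup is reachable on the host at port 8181.
CATALOG_URL = "http://localhost:8181/v1"


def namespaces_url(base_url=CATALOG_URL):
    """URL of the namespace listing endpoint."""
    return f"{base_url}/namespaces"


def tables_url(namespace, base_url=CATALOG_URL):
    """URL of the table listing endpoint for one namespace (a sequence of levels)."""
    # Multi-level namespaces are joined with the 0x1F unit separator, then URL-encoded.
    encoded = urllib.parse.quote("\x1f".join(namespace), safe="")
    return f"{base_url}/namespaces/{encoded}/tables"


def parse_namespaces(payload):
    """Extract namespaces from a listing response as tuples of levels."""
    return [tuple(levels) for levels in payload.get("namespaces", [])]


def parse_tables(payload):
    """Extract dotted table names from a table listing response."""
    return [
        ".".join(list(ident["namespace"]) + [ident["name"]])
        for ident in payload.get("identifiers", [])
    ]


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def list_all_tables(base_url=CATALOG_URL):
    """Return every table found in the catalog's top-level namespaces."""
    tables = []
    for namespace in parse_namespaces(fetch_json(namespaces_url(base_url))):
        tables += parse_tables(fetch_json(tables_url(namespace, base_url)))
    return tables
```

If `list_all_tables()` comes back empty, the catalog itself has no tables yet, which would point at the data-loading step of the guide and not at the `DataLakeCatalog` database.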
You can use various containerized REST catalog implementations such as **[Databricks docker-spark-iceberg](https://github.com/databricks/docker-spark-iceberg/blob/main/docker-compose.yml?ref=blog.min.io)** which provides a complete Spark + Iceberg + REST catalog environment with docker-compose, making it ideal for testing Iceberg integrations.

You'll need to add ClickHouse as a dependency in your docker-compose setup:
Let's write some short set-up steps here, something like:
Create a new folder in which to run the example, then create a file `docker-compose.yml` with the configuration from Databricks docker-spark-iceberg.
Next, create a file `docker-compose.override.yml` and place the following ClickHouse container configuration into it.
(After the code block, we can say to run `docker compose up`.)
networks:
  iceberg_net:
Suggested change: remove the `networks:` / `iceberg_net:` lines.
With these lines I get an error. It works without them.
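The comment does not say what the error was. One plausible cause, assumed here and not confirmed in the thread, is that `networks` appears twice under the `clickhouse` service in the snippet. A rough sketch of a check for repeated mapping keys follows; `find_duplicate_keys` is a hypothetical helper that only understands plain block-style YAML (it skips list items and does not handle flow syntax or multi-line scalars).

```python
# Trimmed-down version of the service definition quoted in this review.
SAMPLE = (
    "services:\n"
    "  clickhouse:\n"
    "    image: clickhouse/clickhouse-server:25.5.6\n"
    "    networks:\n"
    "      iceberg_net:\n"
    "    ports:\n"
    "      - \"8123:8123\"\n"
    "    networks:\n"
    "      - iceberg_net\n"
)


def find_duplicate_keys(yaml_text):
    """Return (line_number, key) for keys repeated within the same mapping block."""
    duplicates = []
    stack = []  # entries are (indent, keys seen so far in the block at that indent)
    for number, raw in enumerate(yaml_text.splitlines(), start=1):
        line = raw.split(" #", 1)[0].rstrip()  # drop trailing comments
        stripped = line.lstrip()
        # Skip blanks, full-line comments, list items, and lines without a key.
        if not stripped or stripped.startswith("#") or stripped.startswith("- "):
            continue
        if ":" not in stripped:
            continue
        key = stripped.split(":", 1)[0].strip()
        indent = len(line) - len(stripped)
        # Close any blocks that are indented deeper than the current line.
        while stack and stack[-1][0] > indent:
            stack.pop()
        # Open a fresh block when this line is indented deeper than its parent.
        if not stack or stack[-1][0] < indent:
            stack.append((indent, set()))
        seen = stack[-1][1]
        if key in seen:
            duplicates.append((number, key))
        seen.add(key)
    return duplicates
```

On `SAMPLE`, `find_duplicate_keys(SAMPLE)` reports the second `networks` key on line 8, which matches the lines the suggestion removes.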
description: 'In this guide, we will walk you through the steps to query
 your data in S3 buckets using ClickHouse and the REST Catalog.'
Let's update this description; no S3 buckets are involved in this guide.
Summary
This PR adds comprehensive documentation for ClickHouse's REST Catalog integration.
Checklist