Add REST catalog support in docs (#1) #4031

Open. somratdutta wants to merge 3 commits into main.

@somratdutta somratdutta commented Jul 7, 2025

Summary

This PR adds comprehensive documentation for ClickHouse's REST Catalog integration.

Checklist

@somratdutta somratdutta requested review from a team as code owners July 7, 2025 03:01
@somratdutta somratdutta requested a review from BentsiLeviav July 7, 2025 03:01

vercel bot commented Jul 7, 2025

@somratdutta is attempting to deploy a commit to the ClickHouse Team on Vercel.

A member of the Team first needs to authorize it.

vercel bot commented Jul 8, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Updated (UTC) |
| clickhouse-docs | ✅ Ready (Inspect) | Visit Preview | Jul 9, 2025 8:43am |

Blargian commented Jul 8, 2025

@somratdutta would you mind merging main into your branch? The Vercel deployment is failing on an error that is unrelated to your changes and was fixed yesterday.

somratdutta commented

Sure @Blargian, I'll do it. Thanks for reviewing my PR.

@Blargian Blargian left a comment

@somratdutta firstly, thank you so much for this contribution, I am really excited to see it! I've left a few comments. I was unable to get this working following these steps. Let's put together an example which can be easily reproduced.


:::note
As this feature is experimental, you will need to enable it using:
`SET allow_experimental_database_rest_catalog = 1;`

Suggested change
- `SET allow_experimental_database_rest_catalog = 1;`
+ `SET allow_experimental_database_iceberg = 1;`

Comment on lines +50 to +68
  clickhouse:
    image: clickhouse/clickhouse-server:main
    container_name: clickhouse
    user: '0:0'  # Ensures root permissions
    networks:
      iceberg_net:
    ports:
      - "8123:8123"
      - "9002:9000"
    volumes:
      - ./clickhouse:/var/lib/clickhouse
      - ./clickhouse/data_import:/var/lib/clickhouse/data_import  # Mount dataset folder
    networks:
      - iceberg_net
    environment:
      - CLICKHOUSE_DB=default
      - CLICKHOUSE_USER=default
      - CLICKHOUSE_DO_NOT_CHOWN=1
      - CLICKHOUSE_PASSWORD=

Suggested change
    clickhouse:
-     image: clickhouse/clickhouse-server:main
+     image: clickhouse/clickhouse-server:25.5.6
      container_name: clickhouse
      user: '0:0'  # Ensures root permissions
      networks:
        iceberg_net:
      ports:
        - "8123:8123"
        - "9002:9000"
      volumes:
        - ./clickhouse:/var/lib/clickhouse
        - ./clickhouse/data_import:/var/lib/clickhouse/data_import  # Mount dataset folder
      networks:
        - iceberg_net
      environment:
        - CLICKHOUSE_DB=default
        - CLICKHOUSE_USER=default
        - CLICKHOUSE_DO_NOT_CHOWN=1
        - CLICKHOUSE_PASSWORD=

Comment on lines +82 to +87
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://rest:8181/v1', 'admin', 'password')
SETTINGS
    catalog_type = 'rest',
    storage_endpoint = 'http://minio:9000/lakehouse',
    warehouse = 'demo'

Suggested change
+ SET allow_experimental_database_iceberg = 1;
  CREATE DATABASE demo
  ENGINE = DataLakeCatalog('http://rest:8181/v1', 'admin', 'password')
  SETTINGS
      catalog_type = 'rest',
      storage_endpoint = 'http://minio:9000/lakehouse',
      warehouse = 'demo'

Comment on lines +101 to +103
┌─name──────────┐
│ default.taxis │
└───────────────┘

Unfortunately, I don't get this output when I run the steps. I get back:

SHOW TABLES IN demo

Query id: 4411372a-a71c-44e9-b27b-146af2048670

Ok.

0 rows in set. Elapsed: 0.047 sec.

demo is however created:

SHOW DATABASES

Query id: 70f26176-08cd-4e5e-b788-44ce1adf10eb

   ┌─name───────────────┐
1. │ INFORMATION_SCHEMA │
2. │ default            │
3. │ demo               │
4. │ information_schema │
5. │ system             │
   └────────────────────┘

Can you confirm you were able to get this working locally?


You can use various containerized REST catalog implementations such as **[Databricks docker-spark-iceberg](https://github.com/databricks/docker-spark-iceberg/blob/main/docker-compose.yml?ref=blog.min.io)** which provides a complete Spark + Iceberg + REST catalog environment with docker-compose, making it ideal for testing Iceberg integrations.

You'll need to add ClickHouse as a dependency in your docker-compose setup:

Let's add some short setup steps here, something like:

Create a new folder in which to run the example, then create a file docker-compose.yml with the configuration from Databricks docker-spark-iceberg.

Next, create a file docker-compose.override.yml and place the following ClickHouse container configuration into it

(After the code block we can say to run docker compose up)
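
The steps above might translate into an override file along these lines. This is only a sketch: it folds in the other suggestions from this review (image pinned to 25.5.6, the duplicate `networks:` mapping dropped), it assumes the base docker-compose.yml already defines the `iceberg_net` network, and it has not been verified against the guide.

```yaml
# docker-compose.override.yml
# Sketch only. Assumes the base docker-compose.yml in the same folder
# defines the iceberg_net network that the other services use.
services:
  clickhouse:
    image: clickhouse/clickhouse-server:25.5.6
    container_name: clickhouse
    user: '0:0'  # Ensures root permissions
    ports:
      - "8123:8123"
      - "9002:9000"
    volumes:
      - ./clickhouse:/var/lib/clickhouse
      - ./clickhouse/data_import:/var/lib/clickhouse/data_import  # Mount dataset folder
    networks:
      - iceberg_net
    environment:
      - CLICKHOUSE_DB=default
      - CLICKHOUSE_USER=default
      - CLICKHOUSE_DO_NOT_CHOWN=1
      - CLICKHOUSE_PASSWORD=
```

Docker Compose reads docker-compose.override.yml automatically when it sits next to docker-compose.yml, so running docker compose up from that folder should start ClickHouse alongside the other services.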

Comment on lines +54 to +55
    networks:
      iceberg_net:

Suggested change
-     networks:
-       iceberg_net:

With this line I get an error. Works without it.

Comment on lines +7 to +8
description: 'In this guide, we will walk you through the steps to query
your data in S3 buckets using ClickHouse and the REST Catalog.'

Let's update this description; no S3 buckets are involved in this guide.

@Blargian Blargian self-assigned this Jul 9, 2025