dev: add make notebook
#2528
base: main
Changes from all commits
ab8e6d5
a142274
375085b
d69f359
451a58c
c4ad0a3
b55d7bd
5bc770b
f9f5f02
057af7a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,3 +5,4 @@ build | |
| .gitignore | ||
| uv.lock | ||
| mkdocs/* | ||
| notebooks/* | ||
|
Contributor
We can do this, but then we have to make sure that they are not bundled in the release. The notebooks do contain code.
Contributor
Author
Yeah, agreed. I double-checked the artifacts, the new … Feels like this can be a potential footgun where a folder is included in the artifact but the RAT check is ignored in …
|
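The concern in this thread (a folder excluded from the license check could still end up in a release artifact) can be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes the artifact is a `.tar.gz` source distribution whose contents sit under a single top-level `<name>-<version>/` directory, and the `pkg-0.0.0` archive and its file names are hypothetical stand-ins built in memory so the example runs without a real build.

```python
import io
import tarfile


def bundled_paths(sdist: tarfile.TarFile, folder: str) -> list[str]:
    """Return archive members that live under `folder`.

    Assumes every member is wrapped in one top-level directory
    (for example "pkg-0.0.0/"), so `folder` is the second path component.
    """
    hits = []
    for member in sdist.getmembers():
        parts = member.name.split("/")
        if len(parts) > 1 and parts[1] == folder:
            hits.append(member.name)
    return hits


def make_demo_sdist(names: list[str]) -> bytes:
    """Build a tiny in-memory .tar.gz with placeholder file contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            data = b"placeholder"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical archive that accidentally ships a notebook.
demo = make_demo_sdist([
    "pkg-0.0.0/pyproject.toml",
    "pkg-0.0.0/pkg/__init__.py",
    "pkg-0.0.0/notebooks/example.ipynb",
])

with tarfile.open(fileobj=io.BytesIO(demo), mode="r:gz") as tar:
    leaked = bundled_paths(tar, "notebooks")

print(leaked)  # ['pkg-0.0.0/notebooks/example.ipynb']
```

Against a real build, the same function could be pointed at the archive on disk with `tarfile.open(path)`; an empty result would mean nothing under `notebooks/` was bundled.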
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -211,6 +211,41 @@ export PYICEBERG_CATALOG__TEST_CATALOG__ACCESS_KEY_ID=username | |
| export PYICEBERG_CATALOG__TEST_CATALOG__SECRET_ACCESS_KEY=password | ||
| ``` | ||
|
|
||
| ## Notebooks for Experimentation | ||
|
Contributor
Author
Added a new section to https://py.iceberg.apache.org/contributing/ |
||
|
|
||
| PyIceberg provides Jupyter notebooks for quick experimentation and learning. Two Make commands are available depending on your needs: | ||
|
|
||
| ### PyIceberg Examples (`make notebook`) | ||
|
|
||
| For basic PyIceberg experimentation without additional infrastructure: | ||
|
|
||
| ```bash | ||
| make notebook | ||
| ``` | ||
|
|
||
| This will install notebook dependencies and launch Jupyter Lab in the `notebooks/` directory. | ||
|
|
||
| **PyIceberg Example Notebook** (`notebooks/pyiceberg_example.ipynb`) is based on the [Getting Started with PyIceberg](https://py.iceberg.apache.org/#getting-started-with-pyiceberg) page. It demonstrates basic PyIceberg operations like creating catalogs, schemas, and querying tables without requiring any external services. | ||
|
|
||
| ### Spark Integration Examples (`make notebook-infra`) | ||
|
|
||
| For working with PyIceberg alongside Spark, use the infrastructure-enabled notebook environment: | ||
|
|
||
| ```bash | ||
| make notebook-infra | ||
| ``` | ||
|
|
||
| This command spins up the full integration test infrastructure via Docker Compose, including: | ||
|
|
||
| - **Spark** (with Spark Connect) | ||
| - **Iceberg REST Catalog** (using the [`apache/iceberg-rest-fixture`](https://hub.docker.com/r/apache/iceberg-rest-fixture) image) | ||
| - **Hive Metastore** | ||
| - **S3-compatible object storage** (Minio) | ||
|
|
||
| **Spark Example Notebook** (`notebooks/spark_integration_example.ipynb`) is based on the [Spark Getting Started](https://iceberg.apache.org/docs/nightly/spark-getting-started/) guide. This notebook demonstrates how to work with PyIceberg alongside Spark, leveraging the Docker-based testing setup for a complete local development environment. | ||
|
|
||
| After running `make notebook-infra`, open `spark_integration_example.ipynb` in the Jupyter Lab interface to explore Spark integration capabilities. | ||
|
|
||
| ## Code standards | ||
|
|
||
| Below are the formalized conventions that we adhere to in the PyIceberg project. The goal of this is to have a common agreement on how to evolve the codebase, but also using it as guidelines for newcomers to the project. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -198,6 +198,10 @@ Since the catalog was configured to use the local filesystem, we can explore how | |
| find /tmp/warehouse/ | ||
| ``` | ||
|
|
||
| ## Try it yourself with Jupyter Notebooks | ||
|
Contributor
Author
Added to the front page, https://py.iceberg.apache.org/ |
||
|
|
||
| PyIceberg provides Jupyter notebooks for hands-on experimentation with the examples above and more. Check out the [Notebooks for Experimentation](contributing.md#notebooks-for-experimentation) guide. | ||
|
|
||
| ## More details | ||
|
|
||
| For the details, please check the [CLI](cli.md) or [Python API](api.md) page. | ||


Adding the `make install` pre-req here because otherwise `dev/provision.py` will fail.