docs: update cover image in readme #17
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,299 +1,180 @@ | ||
| # Deepnote Toolkit | ||
|
|
||
| [](https://github.com/deepnote/deepnote-toolkit/actions/workflows/ci.yml) | ||
| [](https://codecov.io/gh/deepnote/deepnote-toolkit) | ||
|
|
||
| > [!WARNING] | ||
| > This code is distributed to user context, so we are treating it as a public repository - ensure no secrets are included in the codebase. | ||
| Welcome to the Deepnote Toolkit, our homegrown Python package managed by Poetry. It is an essential Python package that needs to be installed in the user's environment, which is operated by Deepnote. This package encapsulates all the code that needs to run in the user's space environment. | ||
|
|
||
| ### Key features | ||
|
|
||
| - Deepnote component library | ||
| - Python kernel with scientific computing libraries | ||
| - SQL support with query caching | ||
| - Data visualization (Altair, Plotly) | ||
| - Streamlit apps support with auto-reload | ||
| - Language Server Protocol integration | ||
| - Git integration with SSH/HTTPS authentication | ||
| - Prometheus metrics collection | ||
| - Integration environment variables management | ||
|
|
||
| ### Bundle types | ||
| <div align="center"> | ||
|
|
||
| The toolkit consists of two main bundle types: | ||
|  | ||
|
|
||
| 1. **Kernel Bundle**: Libraries available to user code execution (pandas, numpy, etc.) | ||
| 2. **Server Bundle**: Dependencies for running infrastructure services (Jupyter, Streamlit, LSP) | ||
|
|
||
| ### How to setup? | ||
| [](https://github.com/deepnote/deepnote-toolkit/actions/workflows/ci.yml) | ||
| [](https://codecov.io/gh/deepnote/deepnote-toolkit) | ||
|
|
||
| #### Option 1: Using mise (Recommended) | ||
|
|
||
| [mise](https://mise.jdx.dev/) automatically manages Python, Java, and other tool versions: | ||
| [Website](https://deepnote.com/?utm_source=github&utm_medium=github&utm_campaign=github&utm_content=readme_main) • [Docs](https://deepnote.com/docs?utm_source=github&utm_medium=github&utm_campaign=github&utm_content=readme_main) • [Blog](https://deepnote.com/blog?utm_source=github&utm_medium=github&utm_campaign=github&utm_content=readme_main) • [X](https://x.com/DeepnoteHQ) • [Examples](https://deepnote.com/explore?utm_source=github&utm_medium=github&utm_campaign=github&utm_content=readme_main) • [Community](https://github.com/deepnote/deepnote/discussions) | ||
|
|
||
| 1. Install mise: [Getting started](https://mise.jdx.dev/getting-started.html) | ||
| 2. Run setup: | ||
| </div> | ||
|
|
||
| ```bash | ||
| mise install # Installs Python 3.12 and Java 11 | ||
| mise run setup # Installs dependencies and pre-commit hooks | ||
| ``` | ||
| # Deepnote Toolkit | ||
|
|
||
| #### Option 2: Manual setup | ||
| The Deepnote Toolkit is a Python package that powers the [Deepnote notebook environment](https://github.com/deepnote/deepnote/). It provides the essential functionality that runs in user workspaces, enabling interactive data science workflows with SQL, visualizations, and integrations. | ||
|
|
||
| 1. Install poetry: [Installation](https://python-poetry.org/docs/#installation) | ||
| 2. Install Java 11 (required for PySpark tests): | ||
| - macOS: `brew install openjdk@11` | ||
| - Ubuntu/Debian: `sudo apt-get install openjdk-11-jdk` | ||
| - RHEL/Fedora: `sudo dnf install java-11-openjdk-devel` | ||
| 3. Set up venv for development package: | ||
|
|
||
| ```bash | ||
| # If Python 3.10 is installed, this should use it | ||
| $ poetry env use 3.10 | ||
| ``` | ||
| ## Installation | ||
|
|
||
| 4. Verify the virtual environment location: | ||
| The toolkit is automatically installed in Deepnote workspaces. For local development or testing: | ||
|
|
||
| ```bash | ||
| $ poetry env info | ||
| ``` | ||
| ```bash | ||
| pip install deepnote-toolkit | ||
| ``` | ||
|
|
||
| 5. Install dependencies: | ||
| For server components (Jupyter, Streamlit, LSP): | ||
|
|
||
| ```bash | ||
| $ poetry install | ||
| ``` | ||
| ```bash | ||
| pip install deepnote-toolkit[server] | ||
| ``` | ||
|
|
||
| 6. Install Poe Poetry addon: | ||
| ## Features | ||
|
|
||
| ```bash | ||
| $ poetry self add 'poethepoet[poetry_plugin]' | ||
| ``` | ||
| ### Core capabilities | ||
|
Contributor
Add blank lines before subheadings. Subheadings require blank lines above them per markdown style (MD022).
## Features
+
### Core capabilities
- **SQL execution engine**: Multi-database SQL support with connection management, query templating via Jinja2, intelligent caching, and query chaining with CTE generation
- **Interactive visualizations**: Vega-Lite charts with VegaFusion optimization, multi-layer support, and interactive selections
- **Data processing**: Enhanced DataFrame utilities, data sanitization, and DuckDB in-memory analytics
- **Jupyter integration**: Custom IPython kernel with scientific computing libraries (pandas, numpy, etc.)
+
### Developer tools
- **CLI interface**: Command-line tools for server management and configuration
- **Streamlit support**: Auto-reload development workflow for Streamlit applications
- **Language server protocol**: Code intelligence and autocompletion support
- **Runtime initialization**: Session persistence, environment variable management, and post-start hooks
+
### Infrastructure
- **Git integration**: SSH/HTTPS authentication for repository access
- **SSH tunneling**: Secure database connections through SSH tunnels
- **Metrics collection**: Prometheus metrics for monitoring and observability
- **Feature flags**: Dynamic feature toggling support
Also applies to: 41-41, 47-47
markdownlint-cli2 (0.18.1) 35-35: Headings should be surrounded by blank lines (MD022, blanks-around-headings) |
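The MD022 check raised in this comment can be approximated with a short script. This is a minimal sketch under stated assumptions, not markdownlint's actual implementation: the function name `headings_missing_blank_lines` is hypothetical, only ATX headings (lines starting with `#`) are considered, and fenced code blocks are skipped with a simple toggle.

```python
import re

# ATX heading: one to six '#' characters followed by a space.
HEADING = re.compile(r"^#{1,6} ")


def headings_missing_blank_lines(text):
    """Return 1-based line numbers of headings not surrounded by blank lines.

    Hypothetical helper for illustration; it is not part of markdownlint.
    """
    lines = text.splitlines()
    flagged = []
    in_fence = False
    for i, line in enumerate(lines):
        # Toggle on fence markers so '#' comments inside code are ignored.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence or not HEADING.match(line):
            continue
        # The first and last lines of the file count as properly bounded.
        before_ok = i == 0 or lines[i - 1].strip() == ""
        after_ok = i == len(lines) - 1 or lines[i + 1].strip() == ""
        if not (before_ok and after_ok):
            flagged.append(i + 1)
    return flagged


sample = "## Features\n### Core capabilities\n- **SQL execution engine**\n"
fixed = "## Features\n\n### Core capabilities\n\n- **SQL execution engine**\n"
print(headings_missing_blank_lines(sample))
print(headings_missing_blank_lines(fixed))
```

Running it on the README before pushing would surface the same headings the linter reports, although the real rule has additional options that this sketch ignores.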
||
| - **SQL execution engine**: Multi-database SQL support with connection management, query templating via Jinja2, intelligent caching, and query chaining with CTE generation | ||
| - **Interactive visualizations**: Vega-Lite charts with VegaFusion optimization, multi-layer support, and interactive selections | ||
| - **Data processing**: Enhanced DataFrame utilities, data sanitization, and DuckDB in-memory analytics | ||
| - **Jupyter integration**: Custom IPython kernel with scientific computing libraries (pandas, numpy, etc.) | ||
|
|
||
| 7. Install pre-commit hooks: | ||
| ### Developer tools | ||
| - **CLI interface**: Command-line tools for server management and configuration | ||
| - **Streamlit support**: Auto-reload development workflow for Streamlit applications | ||
| - **Language server protocol**: Code intelligence and autocompletion support | ||
| - **Runtime initialization**: Session persistence, environment variable management, and post-start hooks | ||
|
|
||
| ```bash | ||
| $ poetry poe setup-hooks | ||
| ``` | ||
| ### Infrastructure | ||
| - **Git integration**: SSH/HTTPS authentication for repository access | ||
| - **SSH tunneling**: Secure database connections through SSH tunnels | ||
| - **Metrics collection**: Prometheus metrics for monitoring and observability | ||
| - **Feature flags**: Dynamic feature toggling support | ||
|
|
||
| 8. Verify installation: | ||
| ## Architecture | ||
|
|
||
| ```bash | ||
| $ poetry poe lint | ||
| $ poetry poe format | ||
| ``` | ||
| The toolkit is organized into two deployment bundles: | ||
|
|
||
| ### Setup troubleshooting | ||
| 1. **Kernel bundle**: Core libraries available to user code (pandas, numpy, SQL drivers, visualization libraries) | ||
| 2. **Server bundle**: Infrastructure services (Jupyter Server, Streamlit, Python LSP Server) | ||
|
|
||
| 1. If `poetry install` fails with error `library 'ssl' not found`: | ||
| ### Main modules | ||
|
|
||
| ```bash | ||
| env LDFLAGS="-I/opt/homebrew/opt/openssl/include -L/opt/homebrew/opt/openssl/lib" poetry install | ||
| ``` | ||
| - **`deepnote_toolkit.sql`**: SQL execution, templating, caching, and query chaining | ||
| - **`deepnote_toolkit.chart`**: Vega-Lite chart rendering with VegaFusion optimization | ||
| - **`deepnote_toolkit.cli`**: Command-line interface for toolkit management | ||
| - **`deepnote_toolkit.ocelots`**: Deepnote component library for interactive UI elements | ||
| - **`deepnote_toolkit.runtime`**: Runtime initialization and session management | ||
| - **`deepnote_core`**: Core utilities shared across the toolkit | ||
|
|
||
| 2. If `poetry install` fails installing `pymssql`, install `freetds` via homebrew. | ||
| ## Usage | ||
|
|
||
| ## CLI Quick Start | ||
| ### CLI commands | ||
|
|
||
| The toolkit includes a pip-native CLI: | ||
| The toolkit provides a command-line interface for managing servers and configuration: | ||
|
|
||
| ```bash | ||
| # Install the package with server components | ||
| poetry install --with server | ||
| # Run the CLI to see available commands | ||
| poetry run deepnote-toolkit --help | ||
| # Start Jupyter server on default port (8888) | ||
| poetry run deepnote-toolkit server | ||
| deepnote-toolkit server | ||
|
|
||
| # Start servers with custom configuration | ||
| poetry run deepnote-toolkit server --jupyter-port 9000 | ||
| deepnote-toolkit server --jupyter-port 9000 | ||
|
|
||
| # View/modify configuration | ||
| poetry run deepnote-toolkit config show | ||
| poetry run deepnote-toolkit config set server.jupyter_port 9000 | ||
| deepnote-toolkit config show | ||
| deepnote-toolkit config set server.jupyter_port 9000 | ||
| ``` | ||
|
|
||
| **Security Note**: The CLI will warn if Jupyter runs without authentication. For local development only. Set `DEEPNOTE_JUPYTER_TOKEN` for shared environments. | ||
|
|
||
| ## Testing | ||
| **Security note**: The CLI will warn if Jupyter runs without authentication. For local development only. Set `DEEPNOTE_JUPYTER_TOKEN` for shared environments. | ||
|
|
||
| Tests run against all supported Python versions using nox in Docker for reproducible environments. | ||
|
|
||
| ### Local Testing | ||
| ## Development | ||
|
|
||
| #### Using mise (Recommended) | ||
| ### Testing | ||
|
|
||
| ```bash | ||
| # Run unit tests (no coverage by default) | ||
| mise run test | ||
| # Run unit tests with coverage | ||
| mise run test:coverage | ||
| The project uses nox for testing across multiple Python versions (3.9-3.12) in Docker containers. | ||
|
|
||
| # Run tests quickly without nox/coverage overhead | ||
| mise run test:quick tests/unit/test_file.py | ||
| mise run test:quick tests/unit/test_file.py::TestClass::test_method -v | ||
| **Quick testing with mise:** | ||
|
|
||
| # Pass custom arguments (including --coverage) | ||
| mise run test -- --coverage tests/unit/test_file.py | ||
| ```bash | ||
| mise run test # Run unit tests | ||
| mise run test:coverage # Run with coverage | ||
| mise run test:quick tests/unit/ # Fast testing without nox overhead | ||
| ``` | ||
|
|
||
| #### Using nox directly | ||
| **Using nox directly:** | ||
|
|
||
| ```bash | ||
| # Run unit tests without coverage | ||
| poetry run nox -s unit | ||
| # Run unit tests with coverage | ||
| poetry run nox -s unit -- --coverage | ||
| # Run specific test file | ||
| poetry run nox -s unit -- tests/unit/test_file.py | ||
| poetry run nox -s unit # Run unit tests | ||
| poetry run nox -s unit -- --coverage # With coverage | ||
| poetry run nox -s unit -- tests/unit/test_file.py # Specific file | ||
| ``` | ||
|
|
||
| #### Using Docker | ||
| ```bash | ||
| # Run unit tests | ||
| TEST_TYPE="unit" TOOLKIT_VERSION="local-build" ./bin/test | ||
| # Run integration tests | ||
| TEST_TYPE="integration" TOOLKIT_VERSION="local-build" TOOLKIT_INDEX_URL="http://localhost:8000" ./bin/test | ||
| # Or use the test-local script for both unit tests and integration tests | ||
| ./bin/test-local | ||
| **Using Docker:** | ||
|
|
||
| # Run a specific file with test-local | ||
| ./bin/test-local tests/unit/test_file.py | ||
| # ... or specific test | ||
| ./bin/test-local tests/unit/test_file.py::TestClass::test_method | ||
| ```bash | ||
| ./bin/test-local # Run all tests | ||
| ./bin/test-local tests/unit/test_file.py # Specific file | ||
| ``` | ||
|
|
||
| ### Test Coverage | ||
|
|
||
| - Unit tests for core functionality | ||
| - Integration tests for bundle installation | ||
| - Server startup tests | ||
| - Environment variable handling | ||
| ### Test coverage | ||
|
|
||
| ## Development Workflow | ||
| - Unit tests for SQL execution, charting, and utilities | ||
| - Integration tests for bundle installation and server startup | ||
| - Python 3.9-3.12 compatibility testing | ||
| - Coverage threshold: 55% | ||
|
|
||
| ### Using in Deepnote Projects | ||
| ### Local development with Docker | ||
|
|
||
| When you push a commit, a new version of `deepnote/jupyter-for-local` is built with your commit hash (shortened!). Use it in projects by updating `common.yml`: | ||
|
|
||
| ```yaml | ||
| jupyter: | ||
| image: "deepnote/jupyter-for-local:SHORTENED_COMMIT_SHA" | ||
| ``` | ||
|
|
||
| Alternatively, to develop against a local copy of the toolkit, first run this command to build the image: | ||
| For local development with hot-reload: | ||
|
|
||
| ```bash | ||
| # Build the development image | ||
| docker build \ | ||
| --build-arg "FROM_PYTHON_TAG=3.11" \ | ||
| -t deepnote/deepnote-toolkit-local-hotreload \ | ||
| -f ./dockerfiles/jupyter-for-local-hotreload/Dockerfile . | ||
| ``` | ||
|
|
||
| And start container: | ||
|
|
||
| ```bash | ||
| # To include server logs in the output add this argument | ||
| # -e WITH_SERVER_LOGS=1 \ | ||
| # Some toolkit features (e.g. feature flags support) require | ||
| # DEEPNOTE_PROJECT_ID to be set to work correctly. Add this | ||
| # argument with your project id | ||
| # -e DEEPNOTE_PROJECT_ID=981af2c1-fe8b-41b7-94bf-006b74cf0641 \ | ||
|
|
||
| # Start the container | ||
| docker run \ | ||
| -v "$(pwd)":/deepnote-toolkit \ | ||
| -v /tmp/deepnote-mounts:/deepnote-mounts:shared \ | ||
| -p 8888:8888 \ | ||
| -p 2087:2087 \ | ||
| -p 8051:8051 \ | ||
| -p 8888:8888 -p 2087:2087 -p 8051:8051 \ | ||
| -w /deepnote-toolkit \ | ||
| --add-host=localstack.dev.deepnote.org:host-gateway \ | ||
| --rm \ | ||
| --name deepnote-toolkit-local-hotreload-container \ | ||
| deepnote/deepnote-toolkit-local-hotreload | ||
| ``` | ||
|
|
||
| This will start a container with the Deepnote Toolkit mounted inside and expose all required ports. If you change code that is executed in the kernel (e.g. you updated the DataFrame formatter), you only need to restart the kernel from Deepnote's UI. If you updated code that starts Jupyter itself, you need to restart the container. And if you add or modify dependencies, you need to rebuild the image. | ||
| Now, you need to modify `common.yml` in the Deepnote app. First, replace `jupyter` service with noop image: | ||
| ```yml | ||
| jupyter: | ||
| image: 'screwdrivercd/noop-container' | ||
| ``` | ||
| And change `JUPYTER_HOST` variable of executor to point to host machine: | ||
| ```yml | ||
| executor: | ||
| environment: | ||
| JUPYTER_HOST: host.docker.internal | ||
| deepnote/deepnote-toolkit-local-hotreload | ||
| ``` | ||
|
|
||
| ### Review Apps | ||
| **Hot-reload behavior:** | ||
| - Kernel code changes: Restart kernel from Jupyter UI | ||
| - Server code changes: Restart container | ||
| - Dependency changes: Rebuild image | ||
|
|
||
| Each PR creates a review app for testing. Access it via GitHub checks. Monitor logs in Grafana: | ||
| ### Docker images | ||
|
|
||
| ``` | ||
| {pod="p-PROJECT_ID", container="notebook"} | ||
| ``` | ||
| ### Adding Dependencies | ||
| - Kernel dependencies: Add to `[tool.poetry.dependencies]` in pyproject.toml | ||
| ```bash | ||
| # Add a package to kernel bundle (available in notebooks) | ||
| $ poetry add pandas | ||
| # Add a package with specific version | ||
| $ poetry add "pandas>=2.0.0" | ||
| ``` | ||
| ```bash | ||
| # Add a development dependency | ||
| $ poetry add --group dev pytest | ||
| ``` | ||
| After adding dependencies, run tests to verify compatibility: | ||
| ```bash | ||
| $ ./bin/test-local | ||
| ``` | ||
| The repository includes three main Dockerfiles: | ||
|
|
||
| ### Docker Usage | ||
| - **`builder.Dockerfile`**: Builds Python packages for versions 3.9-3.12, generates kernel and server bundles | ||
| - **`test.Dockerfile`**: Provides reproducible test environment for nox across Python versions | ||
| - **`jupyter-for-local.Dockerfile`**: Development environment with Jupyter integration for local testing | ||
|
|
||
| We use Docker to ensure reproducible environments due to Jupyter libraries' binary dependencies: | ||
| ## Contributing | ||
|
|
||
| - `builder.Dockerfile`: Creates Python package bundles for different versions (3.9-3.12), generates kernel and server bundles, and packages the toolkit for distribution using Poetry. | ||
| See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, coding standards, and contribution guidelines. | ||
|
Contributor
Verify referenced documentation files exist. The README links to
Also applies to: 167-167, 173-173
Script executed:
#!/bin/bash
# Verify supporting documentation files
for file in CONTRIBUTING.md LICENSE SECURITY.md; do
  if [ -f "$file" ]; then
    echo "✓ $file exists"
  else
    echo "✗ $file missing"
  fi
done
Length of output: 63
Create missing CONTRIBUTING.md file. README.md references |
||
|
|
||
| - `test.Dockerfile`: Provides consistent test environment for running unit and integration tests across Python versions using nox. Used both locally and in CI/CD pipeline. | ||
| ## License | ||
|
|
||
| - `jupyter-for-local.Dockerfile`: Creates a development environment with Jupyter integration, used for local development via the docker-compose setup in the main monorepo. | ||
| Apache License 2.0 - See [LICENSE](LICENSE) for details. | ||
|
|
||
| ### Production Releases | ||
| ## Support | ||
|
|
||
| To release a new version to production: | ||
| - **Documentation**: [docs.deepnote.com](https://docs.deepnote.com) | ||
| - **Issues**: [GitHub Issues](https://github.com/deepnote/deepnote-toolkit/issues) | ||
| - **Security**: See [SECURITY.md](SECURITY.md) for reporting vulnerabilities | ||
|
|
||
| 1. Merge your changes to main. This will automatically trigger a GitHub Actions workflow that runs the test suite and a staging deployment. | ||
| 2. Trigger a new [GitHub Release](https://github.com/deepnote/deepnote-toolkit/releases) in the GitHub UI. | ||
| 3. Monitor [the GitHub Actions workflows](https://github.com/deepnote/deepnote-toolkit/actions) and ensure a successful production deployment. | ||
|
|
||
| Note: The production release pipeline automatically creates two PRs in the ops and app-config repositories: | ||
| <div align="center"> | ||
|
|
||
| - A staging PR that updates staging values and is auto-merged | ||
| - A production PR that updates production values and requires manual approval and merge | ||
| Built with 💙 by the data-driven team | ||
|
|
||
| Important: Always test the changes in the staging environment before approving and merging the production PR to ensure everything works as expected. | ||
| </div> | ||
Remove trailing whitespace.
Line 16 has trailing whitespace.
markdownlint-cli2 (0.18.1)
16-16: Trailing spaces. Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
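The "Expected: 0 or 2; Actual: 1" message can be reproduced with a small check. This is a hedged sketch rather than markdownlint's own logic: the function name `trailing_space_violations` is hypothetical, the default of 2 allowed spaces is taken from the message above (two trailing spaces form a Markdown hard line break), and only space characters are counted.

```python
def trailing_space_violations(text, br_spaces=2):
    """Return (line_number, trailing_space_count) pairs for offending lines.

    Hypothetical helper for illustration. A line passes when it ends with
    no spaces or with exactly `br_spaces` spaces.
    """
    violations = []
    for number, line in enumerate(text.splitlines(), start=1):
        trailing = len(line) - len(line.rstrip(" "))
        if trailing not in (0, br_spaces):
            violations.append((number, trailing))
    return violations


# Illustrative input: one stray space, one clean line, one hard line break.
sample = "stray space \nclean line\nhard break  \n"
print(trailing_space_violations(sample))
```

Only the first line of the sample is reported, because a single trailing space is neither empty nor a two-space hard break.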