2 changes: 1 addition & 1 deletion Makefile
@@ -72,7 +72,7 @@ lint-snippets:
uv pip install docstring_parser_fork --reinstall
uv run mypy --config-file mypy.ini docs/website docs/tools --exclude docs/tools/lint_setup --exclude docs/website/docs_processed --exclude docs/website/versioned_docs/
uv run ruff check
uv run flake8 --max-line-length=200 docs/website docs/tools --exclude docs/website/.dlt-repo
uv run flake8 --max-line-length=200 docs/website docs/tools --exclude docs/website/.dlt-repo,docs/website/node_modules

lint-and-test-snippets: lint-snippets
cd docs/website/docs && uv run pytest --ignore=node_modules --ignore hub/features/transformations/transformation-snippets.py
15 changes: 14 additions & 1 deletion docs/tools/check_embedded_snippets.py
@@ -21,7 +21,20 @@


SNIPPET_MARKER = "```"
ALLOWED_LANGUAGES = ["py", "toml", "json", "yaml", "text", "sh", "bat", "sql", "hcl", "dbml", "dot"]
ALLOWED_LANGUAGES = [
"py",
"toml",
"json",
"yaml",
"text",
"sh",
"bat",
"sql",
"hcl",
"dbml",
"dot",
"mermaid",
]

LINT_TEMPLATE = "./lint_setup/template.py"
LINT_FILE = "./lint_setup/lint_me.py"
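For illustration, a minimal sketch (a hypothetical helper, not the actual checker logic) of how fence languages can be validated against this allow list; with "mermaid" added, mermaid diagrams in the docs no longer trip the check:

```py
import re

ALLOWED_LANGUAGES = [
    "py", "toml", "json", "yaml", "text", "sh", "bat",
    "sql", "hcl", "dbml", "dot", "mermaid",
]

def unknown_fence_languages(markdown: str) -> list[str]:
    # Grab the language tag of every opening code fence.
    fences = re.findall(r"^```(\w+)", markdown, flags=re.MULTILINE)
    # Anything not on the allow list would be reported as an error.
    return [lang for lang in fences if lang not in ALLOWED_LANGUAGES]

doc = "```mermaid\nflowchart LR\n  A --> B\n```"
assert unknown_fence_languages(doc) == []  # "mermaid" now passes
```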
4 changes: 2 additions & 2 deletions docs/website/docs/hub/core-concepts/profiles-dlthub.md
@@ -31,7 +31,7 @@ They are hidden behind a feature flag, which means you need to manually enable t
To activate these features, create the `.dlt/.workspace` file in your project directory; this tells `dlt` to switch from the classic project mode to the new Workspace mode.
:::
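A minimal sketch of creating the marker file from Python (equivalent to `touch .dlt/.workspace` in a shell), assuming an empty file suffices:

```py
from pathlib import Path

# Create the marker that switches dlt from classic project mode to Workspace mode.
Path(".dlt").mkdir(exist_ok=True)
Path(".dlt/.workspace").touch()
```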

Profiles are part of the [dltHub Workspace](../workspace/intro) feature.
Profiles are part of the [dltHub Workspace](../workspace/overview.md) feature.
To use them, first install `dlt` with Workspace support:

```sh
@@ -234,6 +234,6 @@ You’ll see your pipeline connected to the remote MotherDuck dataset and ready

## Next steps

* [Configure the workspace.](../workspace/intro)
* [Configure the workspace.](../workspace/overview.md)
* [Deploy your pipeline.](../../walkthroughs/deploy-a-pipeline)
* [Monitor and debug pipelines.](../../general-usage/pipeline#monitor-the-loading-progress)
24 changes: 2 additions & 22 deletions docs/website/docs/hub/features/quality/data-quality.md
@@ -8,33 +8,13 @@ keywords: ["dlthub", "data quality", "contracts"]
🚧 This feature is under development. Interested in becoming an early tester? [Join dltHub early access](https://info.dlthub.com/waiting-list).
:::

dltHub will allow you to define data validation rules at the YAML level or using Pydantic models. This ensures your data meets expected quality standards at the ingestion step.

## Example: Defining a quality contract in YAML

You can specify quality contracts to enforce constraints on your data, such as expected value ranges and nullability.

```yaml
engine_version: 10
name: scd_type_3
tables:
customers:
columns:
category:
data_type: bigint
nullable: false
quality_contracts:
expect_column_max_to_be_between:
min_value: 1
max_value: 100
```
dltHub will allow you to define data validation rules in Python. This ensures your data meets expected quality standards at the ingestion step.
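The dltHub validation API is not yet published, but as a hypothetical illustration of what row-level checks can look like in Python today, open-source `dlt` already accepts a Pydantic model as a resource's `columns` schema and validates rows against it. The model below mirrors the 1-100 range from the YAML contract removed above; the resource name and sample data are made up:

```py
import dlt
from pydantic import BaseModel, Field

class Customer(BaseModel):
    id: int
    # Row-level check: category must fall in the expected 1-100 range.
    category: int = Field(ge=1, le=100)

# Passing the Pydantic model as `columns` makes dlt validate each row
# against it during extraction (hypothetical resource and data).
@dlt.resource(name="customers", columns=Customer)
def customers():
    yield {"id": 1, "category": 42}
```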

## Key features
With dltHub, you will be able to:

* Define data tests and quality contracts using YAML configuration or Pydantic models.
* Define data tests and quality contracts in Python.
* Apply both row-level and batch-level validation.
* Enforce constraints on distributions, boundaries, and expected values.

Stay tuned for updates as we expand these capabilities! 🚀

132 changes: 106 additions & 26 deletions docs/website/docs/hub/intro.md
@@ -3,41 +3,121 @@ title: Introduction
description: Introduction to dltHub
---

# What is dltHub?
## What is dltHub?

![dltHub](/img/slot-machine-gif.gif)
dltHub is an LLM-native data engineering platform that lets any Python developer build, run, and operate production-grade data pipelines, and deliver end-user-ready insights without managing infrastructure.

dltHub is a commercial extension to the open-source data load tool (dlt). It augments it with features like transformations, data validations, and Iceberg with full catalog support, and provides a YAML interface to define data platforms. dltHub features include:
dltHub is built around the open-source library [dlt](../intro.md). It uses the same core concepts (sources, destinations, pipelines) and extends the extract-and-load focus of `dlt` with:
> **Contributor comment:** What about the diagram you've built for the LLM-native data platform book? It was very useful.
- [@dlt.hub.transformation](features/transformations/index.md): a powerful Python decorator to build transformation pipelines and notebooks
- [dbt transformations](features/transformations/dbt-transformations.md): a staging layer for data transformations, combining a local cache with schema enforcement, debugging tools, and integration with existing data workflows.
- [Iceberg support](ecosystem/iceberg.md)
- [Secure data access and sharing](features/data-access.md)
- [AI workflows](features/ai.md): agents to augment your data engineering team.
* Enhanced developer experience
* Transformations
> **Collaborator comment:** Should mention workspace? I.e., "Extended developer experience" or something like that.
* Data quality
* AI-assisted (“agentic”) workflows
* Managed runtime

To get started with dltHub, install the library using pip (Python 3.9-3.12):
dltHub supports both local and managed cloud development. A single developer can deploy and operate pipelines, transformations, and notebooks directly from a dltHub Workspace, using a single command.
The dltHub Runtime, customizable pipeline dashboard, and validation tools make it straightforward to monitor, troubleshoot, and keep data reliable throughout the end-to-end data workflow:

```sh
pip install dlthub
```mermaid
flowchart LR
A[Create a pipeline] --> B[Ensure data quality]
B --> C[Create reports & transformations]
C --> D[Deploy Workspace]
D --> E[Maintain data quality]
E --> F[Share]
```

You can try out any feature by self-issuing a trial license. You can use such a license for evaluation, development, and testing.
Trial licenses are issued offline using the `dlt license` command:
In practice, this means any Python developer can:

1. Display a list of available features
```sh
dlt license scopes
```
* Build and customize data pipelines quickly (with LLM help when desired).
* Derisk data insights by keeping data quality high with checks, tests, and alerts.
* Ship fresh dashboards, reports, and data apps.
* Scale data workflows without babysitting infrastructure or fighting schema drift and silent failures.

2. Issue a license for the feature you want to test.

```sh
dlt license issue dlthub.transformation
```

The command above will enable access to the new `@dlt.hub.transformation` decorator. Note that you may
self-issue licenses several times; the command above will carry over features from previously issued licenses.
:::tip
Want to see it end-to-end? Watch the dltHub [Workspace demo](https://youtu.be/rmpiFSCV8aA).
:::

To get started quickly, follow the [installation instructions](getting-started/installation.md).

## Overview

### Key capabilities

1. **[LLM-native workflow](../dlt-ecosystem/llm-tooling/llm-native-workflow)**: accelerate pipeline authoring and maintenance with guided prompts and copilot experiences.

2. **[Transformations](features/transformations/index.md)**: write Python or SQL transformations with `@dlt.hub.transformation`, orchestrated within your pipeline (see the sketch after this list).

3. **[Data quality](features/quality/data-quality.md)**: define correctness rules, run checks, and fail fast with actionable messages.

4. **[Data apps & sharing](../general-usage/dataset-access/marimo)**: build lightweight, shareable data apps and notebooks for consumers.

5. **[AI agentic support](features/mcp-server.md)**: use MCP servers to analyze pipelines and datasets.
6. **Managed runtime**: deploy and run with a single command; no infra to provision or patch.
7. **[Storage choice](ecosystem/iceberg.md)**: pick managed Iceberg-based lakehouse, DuckLake, or bring your own storage.
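To give a flavor of the transformation capability referenced above: the `@dlt.hub.transformation` decorator name comes from these docs, while the function signature, `dataset.table()` access, and table names below are assumptions for illustration only:

```py
import dlt

# A sketch only: the decorator is documented, the body is assumed.
@dlt.hub.transformation
def enriched_orders(dataset):
    orders = dataset.table("orders")
    customers = dataset.table("customers")
    # Join raw tables into a reporting-ready relation.
    return orders.join(customers, orders.customer_id == customers.id)
```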

### How dltHub fits with dlt (OSS)

dltHub embraces the dlt library rather than replacing it:
* dlt (OSS): Python library focused on extract & load with strong typing and schema handling.
* dltHub: Adds transformations, quality, agentic tooling, managed runtime, and storage choices, so you can move from local dev to production seamlessly.

If you like the dlt developer experience, dltHub gives you everything around it to run in production with less toil.

## dltHub products
dltHub consists of three main products. You can use them together or compose them based on your needs.

### Workspace

**[Workspace](workspace/overview.md) [Public preview]** - the unified environment for building, running, and maintaining data workflows end-to-end.

* Scaffolding and LLM helpers for faster pipeline creation.
* Integrated transformations (the `@dlt.hub.transformation` decorator).
* Data quality rules, test runs, and result surfacing.
* Notebook and data apps (e.g., Marimo) for sharing insights.
* Visual dashboards for pipeline health and run history.

### Runtime [Private preview]

**Runtime** - a managed cloud runtime operated by dltHub:

* Scalable execution for pipelines and transformations.
* APIs, web interfaces, and auxiliary services.
* Secure, multi-tenant infrastructure with upgrades and patching handled for you.

:::tip
Prefer full control? See [Enterprise](#tiers--licensing) below for self-managed options.
:::

### Storage

**[Storage](ecosystem/iceberg.md) [In development]** - choose where your data lives:

* Managed lakehouse: Iceberg open table format (or DuckLake) managed by dltHub.
* Bring your own storage: connect to your own lake/warehouse when needed.

## Tiers & licensing

Some of the features described in this documentation are free to use; others require a paid plan. The latest pricing and full feature matrix can be found on our website.
Most features support a self-guided trial right after install; check the [installation instructions](getting-started/installation.md) for more information.

| Tier | Best for | Runtime | Typical use case | Notes | Availability |
| --------------------- | ------------------------------------------------------------------------------------------ | ------------------------------ | ---------------------------------------------------------------------------- | ---------------------------------------------- |-----------------|
| **dltHub Basic** | Solo developers or small teams owning a **single pipeline + dataset + reports** end-to-end | Managed dltHub Runtime | Set up a pipeline quickly, add tests and transformations, share a simple app | Optimized for velocity with minimal setup | Private preview |
| **dltHub Scale** | Data teams building **composable data platforms** with governance and collaboration | Managed dltHub Runtime | Multiple pipelines, shared assets, team workflows, observability | Team features and extended governance | Alpha |
| **dltHub Enterprise** | Organizations needing **enterprise controls** or **self-managed runtime** | Managed or self-hosted Runtime | On-prem/VPC deployments, custom licensing, advanced security | Enterprise features and deployment flexibility | In development |


### Who is dltHub for?

* Python developers who want production outcomes without becoming infra experts.
* Lean data teams standardizing on dlt and wanting integrated quality, transforms, and sharing.
* Organizations that prefer managed operations but need open formats and portability.

3. Do not forget to read our [EULA](EULA.md) and [Special Terms](EULA.md#specific-terms-for-the-self-issued-trial-license-self-issued-trial-terms)
for self-issued licenses.
:::note
* You can start on Basic and upgrade to Scale or Enterprise later with no code rewrites.
> **Collaborator comment:** Not sure about that, tbh. I think Scale will be more opinionated about how to write code; it will be more declarative. But what we can say for sure: anything that works in OSS will work in dltHub without code changes.

* We favor open formats and portable storage (e.g., Iceberg), whether you choose our managed lakehouse or bring your own.
* For exact features and pricing, check the site; this section is meant to help you choose a sensible starting point.
:::
2 changes: 1 addition & 1 deletion docs/website/docs/hub/workspace/init.md
@@ -32,7 +32,7 @@ This adds support for AI-assisted workflows and the `dlt ai` command.

**dlt Workspace** is a unified environment for developing, running, and maintaining data pipelines — from local development to production.

[More about dlt Workspace ->](../workspace/intro)
[More about dlt Workspace ->](../workspace/overview.md)


## Step 1: Initialize a custom pipeline