|  | 
| 1 | 1 | # Notion | 
| 2 | 2 | 
 | 
| 3 |  | -:::info | 
| 4 |  | -Need help deploying these sources, or figuring out how to run them in your data stack? | 
|  | 3 | +:::info Need help deploying these sources, or figuring out how to run them in your data stack? | 
| 5 | 4 | 
 | 
| 6 |  | -[Join our slack community](https://dlthub-community.slack.com/join/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g) or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support engineer Adrian. | 
|  | 5 | +[Join our Slack community](https://dlthub-community.slack.com/join/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g) | 
|  | 6 | +or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support engineer Adrian. | 
| 7 | 7 | ::: | 
| 8 | 8 | 
 | 
|  | 9 | +[Notion](https://www.notion.so/) is a flexible workspace tool for organizing personal and | 
|  | 10 | +professional tasks, offering customizable notes, documents, databases, and more. | 
| 9 | 11 | 
 | 
| 10 |  | -Notion is a tool that allows users to organize and manage their personal and professional lives. | 
| 11 |  | -It provides a flexible workspace where you can create and customize various types of digital content, | 
| 12 |  | -such as notes, documents, databases, task lists, and more. | 
|  | 12 | +This Notion `dlt` verified source and | 
|  | 13 | +[pipeline example](https://github.com/dlt-hub/verified-sources/blob/master/sources/notion_pipeline.py) | 
|  | 14 | +loads data using the Notion API to the destination of your choice. | 
| 13 | 15 | 
 | 
| 14 |  | -Using this Notion `dlt` verified source and pipeline example, you can load the ***databases*** from Notion to a [destination](../destinations/duckdb) of your choice. | 
| 15 |  | -In Notion, [databases](https://www.notion.so/help/intro-to-databases) are a powerful feature that allows you to create structured collections of information. | 
| 16 |  | -They are similar to spreadsheets or tables but with added flexibility and functionality. | 
|  | 16 | +Sources that can be loaded using this verified source are: | 
| 17 | 17 | 
 | 
| 18 |  | -## Grab API credentials | 
|  | 18 | +| Name             | Description                           | | 
|  | 19 | +|------------------|---------------------------------------| | 
|  | 20 | +| notion_databases | Retrieves data from Notion databases. | | 
| 19 | 21 | 
 | 
| 20 |  | -1. If you don't already have a Notion account, please create one. | 
| 21 |  | -2. Access your Notion account and navigate to [My Integrations](https://www.notion.so/my-integrations). | 
| 22 |  | -3. On the left-hand side, click on "New Integration" and provide a suitable name for the integration. | 
| 23 |  | -4. Finally, click on "Submit" located at the bottom of the page. | 
| 24 |  | - | 
| 25 |  | -## Add a connection to the database | 
| 26 |  | - | 
| 27 |  | -1. Open the database that you want to load to the destination. | 
| 28 |  | -2. Click on the three dots located at the top right corner and choose "Add connections". | 
|  | 22 | +## Setup Guide | 
| 29 | 23 | 
 | 
| 30 |  | -     | 
|  | 24 | +### Grab credentials | 
| 31 | 25 | 
 | 
|  | 26 | +1. If you don't already have a Notion account, please create one. | 
|  | 27 | +1. Access your Notion account and navigate to | 
|  | 28 | +   [My Integrations](https://www.notion.so/my-integrations). | 
|  | 29 | +1. Click "New Integration" on the left and name it appropriately. | 
|  | 30 | +1. Finally, click on "Submit" located at the bottom of the page. | 
| 32 | 31 | 
 | 
| 33 |  | -3. From the list of options, select the integration you previously created and click on "Confirm". | 
| 34 |  | - | 
| 35 |  | -## Initialize the verified source and pipeline example | 
| 36 |  | - | 
| 37 |  | -To get started with your verified source and pipeline example follow these steps: | 
|  | 32 | +### Add a connection to the database | 
| 38 | 33 | 
 | 
| 39 |  | -1. Open up your terminal or command prompt and navigate to the directory where you'd like to create your project. | 
| 40 |  | -2. Enter the following command: | 
|  | 34 | +1. Open the database that you want to load to the destination. | 
| 41 | 35 | 
 | 
| 42 |  | -    ```bash | 
| 43 |  | -    dlt init notion duckdb | 
| 44 |  | -    ``` | 
|  | 36 | +1. Click on the three dots located in the top right corner and choose "Add connections". | 
| 45 | 37 | 
 | 
| 46 |  | -    This command will initialize your verified source with Notion and creates a pipeline with duckdb as the destination. | 
| 47 |  | -    If you'd like to use a different destination, simply replace `duckdb` with the name of your preferred destination. | 
| 48 |  | -    You can find supported destinations and their configuration options in our [documentation](../destinations/duckdb) | 
|  | 38 | +    | 
| 49 | 39 | 
 | 
| 50 |  | -3. After running this command, a new directory will be created with the necessary files and configuration settings to get started. | 
|  | 40 | +1. From the list of options, select the integration you previously created and click on "Confirm". | 
| 51 | 41 | 
 | 
| 52 |  | -    ``` | 
| 53 |  | -    notion_source | 
| 54 |  | -    ├── .dlt | 
| 55 |  | -    │   ├── config.toml | 
| 56 |  | -    │   └── secrets.toml | 
| 57 |  | -    ├── notion | 
| 58 |  | -    │   ├── helpers | 
| 59 |  | -    │   │  ├── __init__.py | 
| 60 |  | -    │   │  ├── client.py | 
| 61 |  | -    │   │  └── database.py | 
| 62 |  | -    │   ├── __init__.py | 
| 63 |  | -    │   ├── README.md | 
| 64 |  | -    │   └── settings.py | 
| 65 |  | -    ├── .gitignore | 
| 66 |  | -    ├── requirements.txt | 
| 67 |  | -    └── notion_pipeline.py | 
| 68 |  | -    ``` | 
|  | 42 | +### Initialize the verified source | 
| 69 | 43 | 
 | 
|  | 44 | +To get started with your data pipeline, follow these steps: | 
| 70 | 45 | 
 | 
| 71 |  | -## Add credentials | 
|  | 46 | +1. Enter the following command: | 
| 72 | 47 | 
 | 
| 73 |  | -1. Inside the `.dlt` folder, you'll find a file called “*secrets.toml*”, which is where you can securely store your access tokens and other sensitive information. It's important to handle this file with care and keep it safe. | 
|  | 48 | +   ```bash | 
|  | 49 | +   dlt init notion duckdb | 
|  | 50 | +   ``` | 
| 74 | 51 | 
 | 
| 75 |  | -Here's what the file looks like: | 
|  | 52 | +   [This command](../../reference/command-line-interface) will initialize | 
|  | 53 | +   [the pipeline example](https://github.com/dlt-hub/verified-sources/blob/master/sources/notion_pipeline.py) | 
|  | 54 | +   with Notion as the [source](../../general-usage/source) and [duckdb](../destinations/duckdb.md) | 
|  | 55 | +   as the [destination](../destinations). | 
| 76 | 56 | 
 | 
| 77 |  | -```toml | 
| 78 |  | -# Put your secret values and credentials here | 
| 79 |  | -# Note: Do not share this file and do not push it to GitHub! | 
| 80 |  | -[source.notion] | 
| 81 |  | -api_key = "set me up!" # Notion API token (e.g. secret_XXX...) | 
| 82 |  | -``` | 
|  | 57 | +1. If you'd like to use a different destination, simply replace `duckdb` with the name of your | 
|  | 58 | +   preferred [destination](../destinations). | 
| 83 | 59 | 
 | 
| 84 |  | -2. Replace the value of `api_key` with the one that [you copied above](notion.md#grab-api-credentials). This will ensure that your data-verified source can access your notion resources securely. | 
| 85 |  | -3. Next, follow the instructions in [Destinations](../destinations/duckdb) to add credentials for your chosen destination. This will ensure that your data is properly routed to its final destination. | 
|  | 60 | +1. After running this command, a new directory will be created with the necessary files and | 
|  | 61 | +   configuration settings to get started. | 
| 86 | 62 | 
 | 
| 87 |  | -## Run the pipeline example | 
|  | 63 | +For more information, read the | 
|  | 64 | +[Walkthrough: Add a verified source.](../../walkthroughs/add-a-verified-source) | 
| 88 | 65 | 
 | 
| 89 |  | -1. Install the necessary dependencies by running the following command: | 
|  | 66 | +### Add credentials | 
| 90 | 67 | 
 | 
| 91 |  | -    ```bash | 
| 92 |  | -    pip install -r requirements.txt | 
| 93 |  | -    ``` | 
|  | 68 | +1. In the `.dlt` folder, there's a file called `secrets.toml`. It's where you store sensitive | 
|  | 69 | +   information securely, like access tokens. Keep this file safe. Here's its format for the | 
|  | 70 | +   Notion API token: | 
| 94 | 71 | 
 | 
| 95 |  | -2. Now the pipeline can be run by using the command: | 
|  | 72 | +   ```toml | 
|  | 73 | +   # Put your secret values and credentials here | 
|  | 74 | +   # Note: Do not share this file and do not push it to GitHub! | 
|  | 75 | +   [source.notion] | 
|  | 76 | +   api_key = "set me up!" # Notion API token (e.g. secret_XXX...) | 
|  | 77 | +   ``` | 
| 96 | 78 | 
 | 
| 97 |  | -    ```bash | 
| 98 |  | -    python3 notion_pipeline.py | 
| 99 |  | -    ``` | 
|  | 79 | +1. Replace the value of `api_key` with the one that [you copied above](notion.md#grab-credentials). | 
|  | 80 | +   This ensures that your verified source can access your Notion resources securely. | 
| 100 | 81 | 
 | 
| 101 |  | -3. To make sure that everything is loaded as expected, use the command: | 
|  | 82 | +1. Next, follow the instructions in [Destinations](../destinations/duckdb) to add credentials for | 
|  | 83 | +   your chosen destination. This will ensure that your data is properly routed to its final | 
|  | 84 | +   destination. | 
| 102 | 85 | 
 | 
| 103 |  | -    ```bash | 
| 104 |  | -    dlt pipeline <pipeline_name> show | 
| 105 |  | -    ``` | 
|  | 86 | +## Run the pipeline | 
| 106 | 87 | 
 | 
| 107 |  | -    For example, the pipeline_name for the above pipeline example is `notion`, you may also use any custom name instead. | 
|  | 88 | +1. Before running the pipeline, ensure that you have installed all the necessary dependencies by | 
|  | 89 | +   running the command: | 
|  | 90 | +   ```bash | 
|  | 91 | +   pip install -r requirements.txt | 
|  | 92 | +   ``` | 
|  | 93 | +1. You're now ready to run the pipeline! To get started, run the following command: | 
|  | 94 | +   ```bash | 
|  | 95 | +   python3 notion_pipeline.py | 
|  | 96 | +   ``` | 
|  | 97 | +1. Once the pipeline has finished running, you can verify that everything loaded correctly by using | 
|  | 98 | +   the following command: | 
|  | 99 | +   ```bash | 
|  | 100 | +   dlt pipeline <pipeline_name> show | 
|  | 101 | +   ``` | 
|  | 102 | +   For example, the `pipeline_name` for the above pipeline example is `notion`; you may also use | 
|  | 103 | +   any custom name instead. | 
| 108 | 104 | 
 | 
|  | 105 | +For more information, read the [Walkthrough: Run a pipeline.](../../walkthroughs/run-a-pipeline) | 
| 109 | 106 | 
 | 
| 110 |  | -## Customizations | 
|  | 107 | +## Sources and resources | 
| 111 | 108 | 
 | 
| 112 |  | -To load data to the destination using `dlt`, you have the option to write your own methods. | 
|  | 109 | +`dlt` works on the principle of [sources](../../general-usage/source) and | 
|  | 110 | +[resources](../../general-usage/resource). | 
| 113 | 111 | 
 | 
| 114 |  | -### Source and resource methods | 
|  | 112 | +### Source `notion_databases` | 
| 115 | 113 | 
 | 
| 116 |  | -`dlt` works on the principle of [sources](https://dlthub.com/docs/general-usage/source) | 
| 117 |  | -and [resources](https://dlthub.com/docs/general-usage/resource) that for this verified | 
| 118 |  | -source are found in the `__init__.py` file within the *notion* directory. | 
| 119 |  | -This verified source has one default method: | 
|  | 114 | +This function loads Notion databases into the destination. | 
| 120 | 115 | 
 | 
| 121 | 116 | ```python | 
| 122 | 117 | @dlt.source | 
| 123 | 118 | def notion_databases( | 
| 124 | 119 |     database_ids: Optional[List[Dict[str, str]]] = None, | 
| 125 | 120 |     api_key: str = dlt.secrets.value, | 
| 126 | 121 | ) -> Iterator[DltResource]: | 
| 127 |  | -
 | 
| 128 | 122 | ``` | 
| 129 | 123 | 
 | 
| 130 |  | -- **`database_ids`**: A list of dictionaries each containing a database id and a name. | 
| 131 |  | -                      If `database_ids` is None, then the source retrieves data from all existed databases in your Notion account. | 
| 132 |  | -- **`api_key`**: The Notion API secret key. | 
|  | 124 | +`database_ids`: A list of dictionaries, each containing a database ID and a name. | 
|  | 125 | + | 
|  | 126 | +`api_key`: The Notion API secret key. | 
|  | 127 | + | 
|  | 128 | +> If `database_ids` is None, the source fetches data from all integrated databases in your Notion | 
|  | 129 | +> account. | 
|  | 130 | +
 | 
|  | 131 | +It is important to note that the data is loaded in “replace” mode where the existing data is | 
|  | 132 | +completely replaced. | 
|  | 133 | + | 
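Schematically, "replace" mode means each run rewrites the destination table from scratch rather than appending to it (a plain-Python illustration, not `dlt` code):

```python
# Pretend destination table before and after a run in "replace" mode
table = ["old_row_1", "old_row_2"]  # data from a previous load
new_load = ["row_a", "row_b"]       # rows extracted in this run

# replace: previous contents are dropped, then the new rows are written
table = list(new_load)
print(table)  # -> ['row_a', 'row_b']
```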
|  | 134 | +### Create your own pipeline | 
|  | 135 | + | 
|  | 136 | +If you wish to create your own pipelines, you can leverage source and resource methods from this | 
|  | 137 | +verified source. | 
|  | 138 | + | 
|  | 139 | +1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows: | 
|  | 140 | + | 
|  | 141 | +   ```python | 
|  | 142 | +   pipeline = dlt.pipeline( | 
|  | 143 | +      pipeline_name="notion",  # Use a custom name if desired | 
|  | 144 | +      destination="duckdb",  # Choose the appropriate destination (e.g., duckdb, redshift, postgres) | 
|  | 145 | +      dataset_name="notion_database"  # Use a custom name if desired | 
|  | 146 | +   ) | 
|  | 147 | +   ``` | 
|  | 148 | + | 
|  | 149 | +   To read more about pipeline configuration, please refer to our | 
|  | 150 | +   [documentation](../../general-usage/pipeline). | 
|  | 151 | + | 
|  | 152 | +1. To load all the integrated databases: | 
|  | 153 | + | 
|  | 154 | +   ```python | 
|  | 155 | +   load_data = notion_databases() | 
|  | 156 | +   load_info = pipeline.run(load_data) | 
|  | 157 | +   print(load_info) | 
|  | 158 | +   ``` | 
|  | 159 | + | 
|  | 160 | +1. To load selected databases: | 
|  | 161 | + | 
|  | 162 | +   ```python | 
|  | 163 | +   selected_database_ids = [{"id": "0517dae9409845cba7d", "use_name": "db_one"}, {"id": "d8ee2d159ac34cfc"}] | 
|  | 164 | +   load_data = notion_databases(database_ids=selected_database_ids) | 
|  | 165 | +   load_info = pipeline.run(load_data) | 
|  | 166 | +   print(load_info) | 
|  | 167 | +   ``` | 
|  | 168 | + | 
|  | 169 | +   The database ID can be retrieved from the URL. For example, if the URL is: | 
|  | 170 | + | 
|  | 171 | +   ```shell | 
|  | 172 | +   https://www.notion.so/d8ee2d159ac34cfc85827ba5a0a8ae71?v=c714dec3742440cc91a8c38914f83b6b | 
|  | 173 | +   ``` | 
|  | 174 | + | 
|  | 175 | +   > The database ID in the given Notion URL is: "d8ee2d159ac34cfc85827ba5a0a8ae71". | 
| 133 | 176 | 
 | 
| 134 |  | -The above function yields data resources from the Notion databases. | 
| 135 |  | -It is important to note that the data is loaded in “replace” mode where the existing data is completely replaced. | 
|  | 177 | +The database ID in a Notion URL is the string right after `notion.so/` and before any question | 
|  | 178 | +mark. It uniquely identifies a specific page or database. | 
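That extraction rule can be written as a small helper (illustrative only; this function is not part of the verified source):

```python
from urllib.parse import urlparse

def extract_database_id(url: str) -> str:
    """Return the segment right after notion.so/, with any query string dropped."""
    path = urlparse(url).path  # anything after "?" is excluded from .path
    return path.rstrip("/").split("/")[-1]

url = "https://www.notion.so/d8ee2d159ac34cfc85827ba5a0a8ae71?v=c714dec3742440cc91a8c38914f83b6b"
print(extract_database_id(url))  # -> d8ee2d159ac34cfc85827ba5a0a8ae71
```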
| 136 | 179 | 
 | 
| 137 |  | -That’s it! Enjoy running your Notion DLT verified source! | 
|  | 180 | +The database name (`use_name`) is optional; if skipped, the pipeline will fetch it from Notion | 
|  | 181 | +automatically. | 