
Commit f58e72a

authored
Docs/notion docs updated (dlt-hub#557)
* Updated Notion
* Updated Notion documentation
1 parent 753dc11 commit f58e72a

File tree

1 file changed

+134
-90
lines changed
  • docs/website/docs/dlt-ecosystem/verified-sources

@@ -1,137 +1,181 @@
11
# Notion
22

3-
:::info
4-
Need help deploying these sources, or figuring out how to run them in your data stack?
3+
:::info Need help deploying these sources, or figuring out how to run them in your data stack?
54

6-
[Join our slack community](https://dlthub-community.slack.com/join/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g) or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support engineer Adrian.
5+
[Join our Slack community](https://dlthub-community.slack.com/join/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g)
6+
or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support engineer Adrian.
77
:::
88

9+
[Notion](https://www.notion.so/) is a flexible workspace tool for organizing personal and
10+
professional tasks, offering customizable notes, documents, databases, and more.
911

10-
Notion is a tool that allows users to organize and manage their personal and professional lives.
11-
It provides a flexible workspace where you can create and customize various types of digital content,
12-
such as notes, documents, databases, task lists, and more.
12+
This Notion `dlt` verified source and
13+
[pipeline example](https://github.com/dlt-hub/verified-sources/blob/master/sources/notion_pipeline.py)
14+
loads data using the Notion API to the destination of your choice.
1315

14-
Using this Notion `dlt` verified source and pipeline example, you can load the ***databases*** from Notion to a [destination](../destinations/duckdb) of your choice.
15-
In Notion, [databases](https://www.notion.so/help/intro-to-databases) are a powerful feature that allows you to create structured collections of information.
16-
They are similar to spreadsheets or tables but with added flexibility and functionality.
16+
Sources that can be loaded using this verified source are:
1717

18-
## Grab API credentials
18+
| Name | Description |
19+
|------------------|---------------------------------------|
20+
| notion_databases | Retrieves data from Notion databases. |
1921

20-
1. If you don't already have a Notion account, please create one.
21-
2. Access your Notion account and navigate to [My Integrations](https://www.notion.so/my-integrations).
22-
3. On the left-hand side, click on "New Integration" and provide a suitable name for the integration.
23-
4. Finally, click on "Submit" located at the bottom of the page.
24-
25-
## Add a connection to the database
26-
27-
1. Open the database that you want to load to the destination.
28-
2. Click on the three dots located at the top right corner and choose "Add connections".
22+
## Setup Guide
2923

30-
![Notion Database](./docs_images/Notion_Database_2.jpeg)
24+
### Grab credentials
3125

26+
1. If you don't already have a Notion account, please create one.
27+
1. Access your Notion account and navigate to
28+
[My Integrations](https://www.notion.so/my-integrations).
29+
1. Click "New Integration" on the left and name it appropriately.
30+
1. Finally, click on "Submit" located at the bottom of the page.
3231

33-
3. From the list of options, select the integration you previously created and click on "Confirm".
34-
35-
## Initialize the verified source and pipeline example
36-
37-
To get started with your verified source and pipeline example follow these steps:
32+
### Add a connection to the database
3833

39-
1. Open up your terminal or command prompt and navigate to the directory where you'd like to create your project.
40-
2. Enter the following command:
34+
1. Open the database that you want to load to the destination.
4135

42-
```bash
43-
dlt init notion duckdb
44-
```
36+
1. Click on the three dots located in the top right corner and choose "Add connections".
4537

46-
This command will initialize your verified source with Notion and creates a pipeline with duckdb as the destination.
47-
If you'd like to use a different destination, simply replace `duckdb` with the name of your preferred destination.
48-
You can find supported destinations and their configuration options in our [documentation](../destinations/duckdb)
38+
![Notion Database](./docs_images/Notion_Database_2.jpeg)
4939

50-
3. After running this command, a new directory will be created with the necessary files and configuration settings to get started.
40+
1. From the list of options, select the integration you previously created and click on "Confirm".
5141

52-
```
53-
notion_source
54-
├── .dlt
55-
│ ├── config.toml
56-
│ └── secrets.toml
57-
├── notion
58-
│ ├── helpers
59-
│ │ ├── __init__.py
60-
│ │ ├── client.py
61-
│ │ └── database.py
62-
│ ├── __init__.py
63-
│ ├── README.md
64-
│ └── settings.py
65-
├── .gitignore
66-
├── requirements.txt
67-
└── notion_pipeline.py
68-
```
42+
### Initialize the verified source
6943

44+
To get started with your data pipeline, follow these steps:
7045

71-
## Add credentials
46+
1. Enter the following command:
7247

73-
1. Inside the `.dlt` folder, you'll find a file called “*secrets.toml*”, which is where you can securely store your access tokens and other sensitive information. It's important to handle this file with care and keep it safe.
48+
```bash
49+
dlt init notion duckdb
50+
```
7451

75-
Here's what the file looks like:
52+
[This command](../../reference/command-line-interface) will initialize
53+
[the pipeline example](https://github.com/dlt-hub/verified-sources/blob/master/sources/notion_pipeline.py)
54+
with Notion as the [source](../../general-usage/source) and [duckdb](../destinations/duckdb.md)
55+
as the [destination](../destinations).
7656

77-
```toml
78-
# Put your secret values and credentials here
79-
# Note: Do not share this file and do not push it to GitHub!
80-
[source.notion]
81-
api_key = "set me up!" # Notion API token (e.g. secret_XXX...)
82-
```
57+
1. If you'd like to use a different destination, simply replace `duckdb` with the name of your
58+
preferred [destination](../destinations).
8359

84-
2. Replace the value of `api_key` with the one that [you copied above](notion.md#grab-api-credentials). This will ensure that your data-verified source can access your notion resources securely.
85-
3. Next, follow the instructions in [Destinations](../destinations/duckdb) to add credentials for your chosen destination. This will ensure that your data is properly routed to its final destination.
60+
1. After running this command, a new directory will be created with the necessary files and
61+
configuration settings to get started.
8662

87-
## Run the pipeline example
63+
For more information, read the
64+
[Walkthrough: Add a verified source.](../../walkthroughs/add-a-verified-source)
8865

89-
1. Install the necessary dependencies by running the following command:
66+
### Add credentials
9067

91-
```bash
92-
pip install -r requirements.txt
93-
```
68+
1. In the `.dlt` folder, there's a file called `secrets.toml`. It's where you store sensitive
69+
information securely, like access tokens. Keep this file safe. Here's its format for API key
70+
authentication:
9471

95-
2. Now the pipeline can be run by using the command:
72+
```toml
73+
# Put your secret values and credentials here
74+
# Note: Do not share this file and do not push it to GitHub!
75+
[sources.notion]
76+
api_key = "set me up!" # Notion API token (e.g. secret_XXX...)
77+
```
9678

97-
```bash
98-
python3 notion_pipeline.py
99-
```
79+
1. Replace the value of `api_key` with the one that [you copied above](notion.md#grab-credentials).
80+
This will ensure that your verified source can access your Notion resources securely.
10081

101-
3. To make sure that everything is loaded as expected, use the command:
82+
1. Next, follow the instructions in [Destinations](../destinations/duckdb) to add credentials for
83+
your chosen destination. This will ensure that your data is properly routed to its final
84+
destination.
10285

103-
```bash
104-
dlt pipeline <pipeline_name> show
105-
```
86+
## Run the pipeline
10687

107-
For example, the pipeline_name for the above pipeline example is `notion`, you may also use any custom name instead.
88+
1. Before running the pipeline, ensure that you have installed all the necessary dependencies by
89+
running the command:
90+
```bash
91+
pip install -r requirements.txt
92+
```
93+
1. You're now ready to run the pipeline! To get started, run the following command:
94+
```bash
95+
python3 notion_pipeline.py
96+
```
97+
1. Once the pipeline has finished running, you can verify that everything loaded correctly by using
98+
the following command:
99+
```bash
100+
dlt pipeline <pipeline_name> show
101+
```
102+
For example, the `pipeline_name` for the above pipeline example is `notion`; you may also use any
103+
custom name instead.
108104

105+
For more information, read the [Walkthrough: Run a pipeline.](../../walkthroughs/run-a-pipeline)
109106

110-
## Customizations
107+
## Sources and resources
111108

112-
To load data to the destination using `dlt`, you have the option to write your own methods.
109+
`dlt` works on the principle of [sources](../../general-usage/source) and
110+
[resources](../../general-usage/resource).
113111

114-
### Source and resource methods
112+
### Source `notion_databases`
115113

116-
`dlt` works on the principle of [sources](https://dlthub.com/docs/general-usage/source)
117-
and [resources](https://dlthub.com/docs/general-usage/resource) that for this verified
118-
source are found in the `__init__.py` file within the *notion* directory.
119-
This verified source has one default method:
114+
This function loads databases from Notion into the destination.
120115

121116
```python
122117
@dlt.source
123118
def notion_databases(
124119
database_ids: Optional[List[Dict[str, str]]] = None,
125120
api_key: str = dlt.secrets.value,
126121
) -> Iterator[DltResource]:
127-
128122
```
129123

130-
- **`database_ids`**: A list of dictionaries each containing a database id and a name.
131-
If `database_ids` is None, then the source retrieves data from all existed databases in your Notion account.
132-
- **`api_key`**: The Notion API secret key.
124+
`database_ids`: A list of dictionaries each containing a database id and a name.
125+
126+
`api_key`: The Notion API secret key.
127+
128+
> If "database_ids" is None, the source fetches data from all integrated databases in your Notion
129+
> account.
130+
131+
It is important to note that the data is loaded in “replace” mode where the existing data is
132+
completely replaced.
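
To make the "replace" behaviour concrete, here is a toy sketch in plain Python. It does not use `dlt` and is not how `dlt` implements loading; the `load_replace` function and the in-memory `destination` dict are hypothetical names introduced only for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]


def load_replace(destination: Dict[str, List[Row]], table: str, rows: List[Row]) -> None:
    """Toy model of "replace": discard whatever the table held and keep only the new rows."""
    destination[table] = list(rows)


# A dict standing in for the destination (illustration only).
destination: Dict[str, List[Row]] = {}

load_replace(destination, "db_one", [{"id": "a"}, {"id": "b"}])  # first run
load_replace(destination, "db_one", [{"id": "b"}, {"id": "c"}])  # second run

# Only the rows from the second run remain; row "a" is gone.
print(destination["db_one"])
```

In other words, rows deleted in Notion between two runs also disappear from the destination table on the next load.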
133+
134+
### Create your own pipeline
135+
136+
If you wish to create your own pipelines, you can leverage source and resource methods from this
137+
verified source.
138+
139+
1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows:
140+
141+
```python
142+
pipeline = dlt.pipeline(
143+
pipeline_name="notion", # Use a custom name if desired
144+
destination="duckdb", # Choose the appropriate destination (e.g., duckdb, redshift, post)
145+
dataset_name="notion_database" # Use a custom name if desired
146+
)
147+
```
148+
149+
To read more about pipeline configuration, please refer to our
150+
[documentation](../../general-usage/pipeline).
151+
152+
1. To load all the integrated databases:
153+
154+
```python
155+
load_data = notion_databases()
156+
load_info = pipeline.run(load_data)
157+
print(load_info)
158+
```
159+
160+
1. To load the custom databases:
161+
162+
```python
163+
selected_database_ids = [{"id": "0517dae9409845cba7d", "use_name": "db_one"}, {"id": "d8ee2d159ac34cfc"}]
164+
load_data = notion_databases(database_ids=selected_database_ids)
165+
load_info = pipeline.run(load_data)
166+
print(load_info)
167+
```
168+
169+
The database ID can be retrieved from the URL. For example, if the URL is:
170+
171+
```shell
172+
https://www.notion.so/d8ee2d159ac34cfc85827ba5a0a8ae71?v=c714dec3742440cc91a8c38914f83b6b
173+
```
174+
175+
> The database ID in the given Notion URL is: "d8ee2d159ac34cfc85827ba5a0a8ae71".
133176
134-
The above function yields data resources from the Notion databases.
135-
It is important to note that the data is loaded in “replace” mode where the existing data is completely replaced.
177+
The database ID in a Notion URL is the string right after notion.so/, before any question marks. It
178+
uniquely identifies a specific page or database.
136179

137-
That’s it! Enjoy running your Notion DLT verified source!
180+
The database name ("use_name") is optional; if skipped, the pipeline will fetch it from Notion
181+
automatically.
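
The ID lookup described above can be sketched with the standard library. This is a minimal sketch that assumes the URL has exactly the form shown (the ID is the first path segment after `notion.so/`); URLs that include a workspace name or a page-title prefix are not handled. The helper names `database_id_from_url` and `database_entry` are hypothetical and not part of the verified source.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


def database_id_from_url(url: str) -> str:
    """Return the first path segment after the host; the query string is ignored."""
    path = urlparse(url).path  # "?v=..." is parsed as the query, not the path
    database_id = path.strip("/").split("/")[0]
    if not database_id:
        raise ValueError(f"No database ID found in URL: {url!r}")
    return database_id


def database_entry(url: str, use_name: Optional[str] = None) -> Dict[str, str]:
    """Build one `database_ids` entry; `use_name` is left out when not given."""
    entry = {"id": database_id_from_url(url)}
    if use_name is not None:
        entry["use_name"] = use_name
    return entry


url = "https://www.notion.so/d8ee2d159ac34cfc85827ba5a0a8ae71?v=c714dec3742440cc91a8c38914f83b6b"
selected_database_ids: List[Dict[str, str]] = [database_entry(url, use_name="db_one")]
print(selected_database_ids)
# [{'id': 'd8ee2d159ac34cfc85827ba5a0a8ae71', 'use_name': 'db_one'}]
```

The resulting list has the same shape as the `selected_database_ids` value passed to `notion_databases(database_ids=...)` in the step above.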
