@molkazhani2001 molkazhani2001 commented Aug 21, 2025

This PR is a twin PR to the one in verified sources.

  • Mentions the necessity to own the tables to be replicated if the role is not a superuser.
  • Removes some unnecessary content about sql_database (I think it dates from the time sql_database was in verified sources, not in dlt OSS)
  • Adds an explicit note on why init_replication with persist_snapshots is important to avoid data loss. This is addressed by a separate subsection about init_replication

Relates to 640 in verified sources.

@molkazhani2001 molkazhani2001 requested a review from VioletM August 21, 2025 11:26
netlify bot commented Aug 21, 2025

Deploy Preview for dlt-hub-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 9372083 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/dlt-hub-docs/deploys/68adb919696eae00086d5bf5 |
| 😎 Deploy Preview | https://deploy-preview-3020--dlt-hub-docs.netlify.app |

@sh-rp sh-rp added the documentation Improvements or additions to documentation label Aug 26, 2025
- Ownership of the publication and tables to add them
- Superuser privileges if replicating an entire schema
- `SELECT` on source tables and `CREATE` on the target schema for snapshots
- Superuser or the `REPLICATION` attribute for replication slot operations

As item 3: users also need ownership of the tables.

@molkazhani2001 molkazhani2001 force-pushed the docs/pg_replication_improvement branch from b81688d to 9372083 Compare August 26, 2025 13:39
@sh-rp sh-rp requested a review from VioletM September 29, 2025 07:55
@anuunchin anuunchin self-assigned this Sep 29, 2025
@anuunchin anuunchin force-pushed the docs/pg_replication_improvement branch from 9372083 to 37f1380 Compare September 30, 2025 14:46
cloudflare-workers-and-pages bot commented Sep 30, 2025

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | docs | 6097a65 | Commit Preview URL, Branch Preview URL | Oct 02 2025, 07:59 AM |

@anuunchin anuunchin force-pushed the docs/pg_replication_improvement branch 2 times, most recently from f50f87b to 027b3c3 Compare October 1, 2025 10:06
| Name | Description |
| -------------------- | ----------------------------------------------- |
| replication_resource | Load published messages from a replication slot |
| init_replication | Initialize replication and optionally return snapshot resources for initial data load |

@VioletM thought it makes sense to add it here, because it does serve as a resource 👀

@anuunchin anuunchin force-pushed the docs/pg_replication_improvement branch from 027b3c3 to 6097a65 Compare October 2, 2025 07:55
@anuunchin anuunchin moved this to In Progress in dlt core library Oct 6, 2025
@VioletM VioletM left a comment

Left a couple of comments

...
```

`slot_name`: Name of the replication slot to create if it does not exist yet.

This section is just a copy-paste from the verified-source repo. Do you think it makes sense to leave it here? I think it will only create one more place that has to be edited whenever the code changes :)


### Snapshot resources from `init_replication`

The `init_replication` function serves two main purposes:

The usage of init_replication is fairly complex, partly because of all the permissions involved. I would also add:

   The example below makes use of the `init_replication` helper from the `pg_replication` source.  
   When you run `init_replication`, Postgres is prepared for logical replication: a publication is created and tables (or the whole schema) are added, a replication slot is created, and, if `persist_snapshots=True`, snapshot tables are generated to capture the initial state.

   To perform these steps your Postgres user needs the following permissions: 
   - `CREATE` on the database (or superuser) to create publications  
   - Ownership of the publication and tables to add them  
   - Superuser privileges if replicating an entire schema  
   - `SELECT` on source tables and `CREATE` on the target schema for snapshots  
   - Superuser or the `REPLICATION` attribute for replication slot operations
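
For illustration, the permission list above could be translated into SQL roughly as follows. This is only a sketch: the helper `replication_grants`, its parameters, and the example role, database, and schema names are hypothetical and not part of dlt or the `pg_replication` source. It only builds standard Postgres statements as strings for an admin to review, and it assumes snapshot tables are written to a schema you name explicitly.

```python
from typing import List, Optional


def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def replication_grants(
    role: str,
    database: str,
    source_schema: str,
    tables: Optional[List[str]] = None,
    snapshot_schema: Optional[str] = None,
) -> List[str]:
    """Return SQL statements that give `role` the permissions listed above.

    Hypothetical helper: `tables=None` stands for replicating the whole
    schema, which per the list above requires superuser privileges.
    """
    r = quote_ident(role)
    if tables is None:
        # Whole-schema replication: superuser already covers every other item.
        return [f"ALTER ROLE {r} WITH SUPERUSER;"]

    stmts = [
        # CREATE on the database allows the role to create publications.
        f"GRANT CREATE ON DATABASE {quote_ident(database)} TO {r};",
        # The REPLICATION attribute allows replication slot operations.
        f"ALTER ROLE {r} WITH REPLICATION;",
    ]
    for table in tables:
        qualified = f"{quote_ident(source_schema)}.{quote_ident(table)}"
        # Table ownership is needed to add the table to a publication;
        # an owner can also SELECT from the table by default.
        stmts.append(f"ALTER TABLE {qualified} OWNER TO {r};")
    if snapshot_schema is not None:
        # CREATE on the schema where snapshot tables will be written.
        stmts.append(f"GRANT CREATE ON SCHEMA {quote_ident(snapshot_schema)} TO {r};")
    return stmts


if __name__ == "__main__":
    # Example values are placeholders, not taken from the docs.
    for stmt in replication_grants("loader", "shop", "public", ["orders"], "public"):
        print(stmt)
```

Generating the statements instead of executing them keeps the privileged step in the hands of a database admin, which fits the case where the loading role is not a superuser.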

Thank you so much for taking care of it @anuunchin !
