stream: docs and tutorial for xata clone stream #37
Merged
10 commits
76fa2f7
stream: docs for xata clone stream
divyenduz 1f0f2ff
stream: docs for xata clone stream
divyenduz 3eedce6
add streaming replication tutorial
divyenduz c2cc027
docs feedback
divyenduz 533e3b7
add streaming replication tutorial
divyenduz 8f5e73e
feedback
divyenduz 8c5d338
feedback
divyenduz 33eb733
feedback
divyenduz 2ccc958
feedback
divyenduz cb77a57
feedback
divyenduz
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| --- | ||
| title: Stream Command | ||
| description: Commands for managing logical streaming replication operations | ||
| --- | ||
|
||
|
|
||
| The stream command helps you manage database streaming operations with `pgstream`, using logical replication. | ||
|
|
||
|
||
| ## Subcommands | ||
|
|
||
| ### destroy | ||
|
|
||
| Destroy any pgstream setup, removing the replication slot and all the relevant tables/functions/triggers, along with the internal pgstream schema. | ||
|
||
|
|
||
| ```bash | ||
| xata stream destroy --source-url <url> [--config <file>] [--log-level <level>] [--postgres-url <url>] [--replication-slot <name>] [-h|--help] | ||
| ``` | ||
|
|
||
| - `--source-url`: The source URL of the database to clone (required) | ||
| - `--config`: `.env` or `.yaml` config file to use with pgstream, if any | ||
| - `--log-level`: Log level for pgstream (trace|debug|info|warn|error|fatal|panic, default: info) | ||
| - `--postgres-url`: Source Postgres URL where `pgstream destroy` will be run | ||
| - `--replication-slot`: Name of the Postgres replication slot that pgstream deletes from the source URL | ||
| - `-h, --help`: Print help information and exit | ||
|
|
||
| ## Global Flags | ||
|
|
||
| - `-h, --help` - Print help information and exit | ||
| - `--json` - Output in JSON format | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,162 @@ | ||
| --- | ||
| title: Set up a logical streaming replica | ||
| description: Use Xata's streaming replication to keep your database continuously synchronized with real-time changes. | ||
|
||
| --- | ||
|
|
||
| This guide shows you how to set up continuous logical streaming replication from your production PostgreSQL database to Xata, enabling real-time data synchronization with optional anonymization. | ||
|
|
||
|
||
|  | ||
|
|
||
| ## 1. Prerequisites | ||
|
|
||
| - A Xata account ([sign up here](https://console.xata.io)) | ||
| - The [Xata CLI](/cli) installed: | ||
| ```bash | ||
| curl -fsSL https://xata.io/install.sh | bash | ||
| ``` | ||
| - A PostgreSQL database with: | ||
|   - Logical replication enabled | ||
|   - A role with permission to create a replication slot (the `xata clone stream` command creates the slot automatically) | ||
|   - Network connectivity from Xata to your database | ||
|
|
||
| ## 2. Enable logical replication on source database | ||
|
|
||
| First, ensure your source PostgreSQL database has logical replication enabled. You'll need to set these parameters: | ||
|
|
||
| ```sql | ||
| -- Check current settings | ||
| SHOW wal_level; | ||
| SHOW max_replication_slots; | ||
| SHOW max_wal_senders; | ||
| ``` | ||
|
|
||
| If not already configured, update your PostgreSQL configuration: | ||
|
|
||
| ```sql | ||
| ALTER SYSTEM SET wal_level = logical; | ||
| ALTER SYSTEM SET max_replication_slots = 10; | ||
| ALTER SYSTEM SET max_wal_senders = 10; | ||
| ``` | ||
|
|
||
| Restart your PostgreSQL instance for the changes to take effect. | ||
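A minimal sketch of how the check above could be scripted, assuming the three values have already been read with `SHOW`. The function name `check_replication_settings` and its messages are illustrative and not part of the Xata CLI. The guide only says the parameters must be set, so the "at least 1" thresholds for slots and senders are an assumption.

```shell
# Hypothetical helper: report which settings still block logical replication.
# Arguments: current wal_level, max_replication_slots, max_wal_senders.
check_replication_settings() {
  local wal_level="$1" slots="$2" senders="$3" status=0

  if [ "$wal_level" != "logical" ]; then
    echo "wal_level is '$wal_level', expected 'logical'"
    status=1
  fi
  # Assumption: at least one slot and one WAL sender must be available.
  if [ "$slots" -lt 1 ]; then
    echo "max_replication_slots is $slots, expected at least 1"
    status=1
  fi
  if [ "$senders" -lt 1 ]; then
    echo "max_wal_senders is $senders, expected at least 1"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "ok"
  fi
  return "$status"
}
```

Against a live server, the arguments could come from `psql`, for example `check_replication_settings "$(psql "$CONN_STRING" -Atc 'SHOW wal_level')" "$(psql "$CONN_STRING" -Atc 'SHOW max_replication_slots')" "$(psql "$CONN_STRING" -Atc 'SHOW max_wal_senders')"`.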
|
|
||
| ## 3. Create a Xata project and branch | ||
|
|
||
| In the Console, create a new project and then click the **Create main branch** button to create the PostgreSQL instance. | ||
|
|
||
| For streaming replication, consider using at least 1 replica to ensure high availability during continuous synchronization. Select an instance size that can handle your expected write throughput. | ||
|
|
||
| > **Note:** Streaming replication maintains a persistent connection to your source database. Ensure your network allows stable, long-lived connections between Xata and your PostgreSQL instance. | ||
|
|
||
| ## 4. Configure the Xata CLI | ||
|
|
||
| Authenticate the CLI by running: | ||
|
|
||
| ```sh | ||
| xata auth login | ||
| ``` | ||
|
|
||
| Initialize the project by running: | ||
|
|
||
| ```sh | ||
| xata init | ||
| ``` | ||
|
|
||
| ## 5. Configure streaming replication | ||
|
|
||
| Generate a configuration for the streaming process: | ||
|
|
||
| ```bash | ||
| xata clone config --source-url "$CONN_STRING" | ||
| ``` | ||
|
|
||
| Where `CONN_STRING` is your PostgreSQL connection string with replication permissions. | ||
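As a small illustration, the connection string can be exported once and sanity-checked before it is passed to the CLI. The helper name `is_postgres_url` is hypothetical, and the example URL is a placeholder. The check only covers the `postgres://` and `postgresql://` URI schemes; it does not verify credentials or replication permissions.

```shell
# Hypothetical helper: succeed only if the argument looks like a PostgreSQL URI.
is_postgres_url() {
  case "$1" in
    postgres://?*|postgresql://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholder value for illustration; replace with your own connection string.
CONN_STRING="${CONN_STRING:-postgresql://user:password@db.example.com:5432/app}"

if is_postgres_url "$CONN_STRING"; then
  echo "connection string looks valid"
else
  echo "CONN_STRING must start with postgres:// or postgresql://" >&2
fi
```

Quoting the variable as `"$CONN_STRING"` when passing it to `xata clone config` keeps the shell from word-splitting or glob-expanding the URL.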
|
|
||
| The configuration prompt will ask you to: | ||
|
|
||
| - Select tables to replicate | ||
| - Set up transformation pipelines, such as anonymization rules | ||
|
|
||
| This creates a configuration file at `.xata/clone.yaml` that you can further customize. | ||
|
|
||
| ## 6. Initialize and start streaming | ||
|
|
||
| ```bash | ||
| xata clone stream --source-url "$CONN_STRING" | ||
| ``` | ||
|
|
||
| This command will: | ||
|
|
||
| - Create an initial snapshot of your specified tables | ||
| - Set up the streaming pipeline | ||
| - Begin continuous replication | ||
|
|
||
| ## 7. Advanced configuration | ||
|
|
||
| ### Filtering specific tables | ||
|
|
||
| To stream only specific tables, use the `--filter-tables` flag: | ||
|
||
|
|
||
| ```bash | ||
| xata clone stream --source-url "$CONN_STRING" \ | ||
|   --filter-tables "users.*,orders.*,products.*" | ||
| ``` | ||
|
|
||
| If this option is not specified, it defaults to `*.*`. | ||
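When the table list is long, the comma-separated value can be assembled by a small helper instead of being typed by hand. This is only a sketch: `join_filters` is a hypothetical function, and the `schema.table` shape of each pattern is inferred from the `*.*` default rather than from a documented grammar.

```shell
# Hypothetical helper: join its arguments with commas for --filter-tables.
join_filters() {
  local IFS=","
  # "$*" joins all arguments using the first character of IFS.
  printf '%s\n' "$*"
}

# Patterns are quoted so the shell does not expand the asterisk.
filters="$(join_filters "public.users" "public.orders" "audit.*")"
echo "$filters"   # prints: public.users,public.orders,audit.*
```

The result can then be passed as `--filter-tables "$filters"`.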
|
|
||
| ### Custom transformations | ||
|
|
||
| Edit your `.xata/clone.yaml` file to add custom transformations: | ||
|
|
||
| ```yaml | ||
| transforms: | ||
|   - table: users | ||
|     columns: | ||
|       - name: email | ||
|         transformer: mask_email | ||
|       - name: phone | ||
|         transformer: redact | ||
|   - table: orders | ||
|     columns: | ||
|       - name: credit_card | ||
|         transformer: mask_credit_card | ||
| ``` | ||
|
|
||
| ### Running with Docker | ||
|
|
||
| For production deployments, consider running the streaming process in a containerized environment: | ||
|
|
||
| ```bash | ||
| docker run -d \ | ||
|   --name xata-stream \ | ||
|   --restart unless-stopped \ | ||
|   -v "$(pwd)/.xata:/config" \ | ||
|   xata/cli clone stream \ | ||
|   --source-url "$CONN_STRING" | ||
| ``` | ||
|
|
||
| ## 8. Handling failures and recovery | ||
|
|
||
| If the streaming connection is interrupted, the replication slot ensures no data is lost. Simply restart the streaming command: | ||
|
|
||
|
||
| ```bash | ||
| xata clone stream --source-url "$CONN_STRING" | ||
| ``` | ||
|
|
||
| The process will resume from where it left off, catching up with any changes that occurred during the downtime. | ||
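One way to automate the restart is a small supervising loop. The sketch below retries any command with a fixed delay. `run_with_retry` and its arguments are hypothetical, and it assumes `xata clone stream` exits with a non-zero status when the connection drops, which the guide does not state.

```shell
# Hypothetical helper: run_with_retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
run_with_retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

A call such as `run_with_retry 5 30 xata clone stream --source-url "$CONN_STRING"` would try up to five times with 30 seconds between attempts. A process supervisor, or the Docker `--restart unless-stopped` policy from the previous section, achieves a similar effect.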
|
||
| However, if too much lag accumulates, the Postgres server might slow down because it has to catch up on the backlog while also handling its normal workload. | ||
|
|
||
| If you terminate the `xata clone stream` process and do not plan to run streaming replication again, clean up the replication slot and | ||
| other `pgstream` objects using the `xata stream destroy` command. | ||
|
|
||
| If the replication slot is not cleaned up, PostgreSQL keeps retaining WAL for it, which can eventually fill the disk. Use settings such as `max_slot_wal_keep_size` | ||
| to cap how much WAL a slot can retain. | ||
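To keep an eye on how much WAL a slot is holding back, the `pg_replication_slots` view can be queried on the source database. The sketch below only prints the query so it can be handed to any client. The function name `slot_lag_sql` is hypothetical, and the query assumes PostgreSQL 10 or later, where `pg_current_wal_lsn()` and `pg_wal_lsn_diff()` are available.

```shell
# Hypothetical helper: print a query showing retained WAL per replication slot.
slot_lag_sql() {
  cat <<'SQL'
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY slot_name;
SQL
}
```

It could be run with `psql "$CONN_STRING" -c "$(slot_lag_sql)"`. An inactive slot whose `retained_wal` keeps growing is a candidate for cleanup.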
|
|
||
| ## Summary | ||
|
|
||
| - You now have real-time streaming replication (Postgres's logical replication) from your PostgreSQL database to Xata | ||
| - Changes in your source database are automatically synchronized | ||
| - Your data can be anonymized in transit using configurable transformers | ||
| - The replication slot ensures no data loss during network interruptions | ||
|
|
||
| For more details on advanced streaming configurations and monitoring, see the [clone command documentation](/cli/clone). | ||