add how to dump to amazon s3 for dumpling #4193

Merged 3 commits on Nov 10, 2020
33 changes: 32 additions & 1 deletion dumpling-overview.md
@@ -13,7 +13,8 @@ For backups of SST files (key-value pairs) or backups of incremental data that a

1. Support exporting data in multiple formats, including SQL and CSV
2. Support the [table-filter](https://github.com/pingcap/tidb-tools/blob/master/pkg/table-filter/README.md) feature, which makes it easier to filter data
-3. More optimizations are made for TiDB:
+3. Support exporting data to Amazon S3 cloud storage.
+4. More optimizations are made for TiDB:
- Support configuring the memory limit of a single TiDB SQL statement
- Support automatic adjustment of TiDB GC time for TiDB v4.0.0 and above
- Use TiDB's hidden column `_tidb_rowid` to optimize the performance of concurrent data export from a single table
@@ -81,6 +82,36 @@ For example, you can export all records that match `id < 100` in `test.sbtest1`
>
> - Here you need to execute the `select * from <table-name> where id < 100` statement on all tables to be exported. If any table does not have the specified field, the export fails.

### Export data to Amazon S3 cloud storage

Starting from v4.0.8, Dumpling supports exporting data to cloud storage. To back up data to Amazon S3, specify the S3 storage path in the `-o` parameter.

Before exporting, create an S3 bucket in the region you plan to use (see the [Amazon documentation - How do I create an S3 Bucket](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-bucket.html)). If you also need to create a folder in the bucket, see the [Amazon documentation - Creating a folder](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-folder.html).

Pass the `AccessKey` and `SecretKey` of an account that has permission to access the S3 storage to the Dumpling node as environment variables:

{{< copyable "shell-regular" >}}

```shell
export AWS_ACCESS_KEY_ID=${AccessKey}
export AWS_SECRET_ACCESS_KEY=${SecretKey}
```

Dumpling can also read credentials from the `~/.aws/credentials` file. The storage configuration of Dumpling is consistent with that of BR. For more options, see [BR storages](/br/backup-and-restore-storages.md).
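
As an illustration of the credentials file mentioned above, the following is a minimal sketch that writes an INI-style file with a single `default` profile, which is the layout the AWS tooling conventionally uses. The `write_aws_credentials` function is a hypothetical helper and is not part of Dumpling. Whether Dumpling reads a profile other than `default` is not covered here.

{{< copyable "shell-regular" >}}

```shell
# Hypothetical helper (not part of Dumpling): write a minimal AWS shared
# credentials file that contains a single "default" profile.
# Usage: write_aws_credentials <file> <access-key> <secret-key>
write_aws_credentials() {
  cred_file="$1"
  mkdir -p "$(dirname "$cred_file")"
  (
    # If the file does not exist yet, create it with owner-only permissions.
    umask 077
    printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
      "$2" "$3" > "$cred_file"
  )
}

# Example (overwrites any existing file at that path):
# write_aws_credentials "$HOME/.aws/credentials" "${AccessKey}" "${SecretKey}"
```

With such a file in place, the two `export` commands above should not be needed on that node.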

When you back up data using Dumpling, explicitly specify the `--s3.region` parameter, which sets the region of the S3 storage:

{{< copyable "shell-regular" >}}

```shell
./dumpling \
-u root \
-P 4000 \
-h 127.0.0.1 \
-o "s3://${Bucket}/${Folder}" \
--s3.region "${region}"
```
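
The command above fails late if a variable is empty or the credentials are missing. The following is a minimal sketch of pre-flight checks, assuming the environment-variable approach to credentials. `build_s3_uri` and `check_aws_env` are hypothetical helpers and are not part of Dumpling. Whether a bucket path without a folder suits your setup is also an assumption.

{{< copyable "shell-regular" >}}

```shell
# Hypothetical helper: build the value for -o from a bucket and an optional folder.
build_s3_uri() {
  bucket="$1"
  folder="${2:-}"
  if [ -z "$bucket" ]; then
    echo "bucket name is required" >&2
    return 1
  fi
  # Strip a single leading and a single trailing slash from the folder.
  folder="${folder#/}"
  folder="${folder%/}"
  if [ -n "$folder" ]; then
    printf 's3://%s/%s\n' "$bucket" "$folder"
  else
    printf 's3://%s\n' "$bucket"
  fi
}

# Hypothetical helper: fail early if the credential variables are unset or empty.
check_aws_env() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
    return 1
  fi
}

# Example:
# check_aws_env && ./dumpling -u root -P 4000 -h 127.0.0.1 \
#   -o "$(build_s3_uri "${Bucket}" "${Folder}")" --s3.region "${region}"
```

Skip `check_aws_env` if the credentials come from `~/.aws/credentials` instead of environment variables.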

### Filter the exported data

#### Use the `--where` option to filter data