Closed
925 changes: 925 additions & 0 deletions GAUSSDB_CDC_E2E_TEST_MANUAL.md

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions README.md
@@ -112,10 +112,16 @@ SELECT * FROM test_table;
```
5. View job execution status through Flink WebUI or downstream database.

- **GaussDB**: Supports full snapshot and incremental CDC for Huawei GaussDB using `mppdb_decoding`.
- **MySQL**: Supports full snapshot and incremental CDC for MySQL.
- ... (others)

Try it out yourself with our more detailed [tutorial](docs/content/docs/get-started/quickstart/mysql-to-doris.md).
You can also see [connector overview](docs/content/docs/connectors/pipeline-connectors/overview.md) to view a comprehensive catalog of the
connectors currently provided and understand more detailed configurations.

For GaussDB-specific integration details, see the [GaussDB Connector Guide](flink-cdc-connect/flink-cdc-source-connectors/flink-connector-gaussdb-cdc/README.md).

### Join the Community

There are many ways to participate in the Apache Flink CDC community. The
189 changes: 189 additions & 0 deletions README_TEST.md
@@ -0,0 +1,189 @@
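# GaussDB CDC One-Click Test

## 🚀 Quick Start

```bash
./test_gaussdb_cdc.sh
```

That is all it takes. The script automatically:
1. ✅ Deploys the latest code
2. ✅ Runs the full test suite
3. ✅ Verifies data consistency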

---

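## 📋 Test Flow

The script runs the following steps in order:

### Step 1: Deploy the latest code
- Force-clear the Maven cache
- Build the latest code
- Deploy to the Flink cluster
- Restart the cluster and submit the job

### Step 2: Run the incremental sync tests
- Initialize the test environment
- Test INSERT operations
- Test UPDATE operations
- Test DELETE operations
- Verify the sync result of each operation

### Step 3: Verify data consistency
- Compare record counts between GaussDB and MySQL
- Compare the data content for completeness
- Verify the sync status of the test data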
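
The three steps above run strictly in order, and a failure in any step should end the run with a non-zero status. The implementation of `test_gaussdb_cdc.sh` is not shown here, so the following is only a minimal sketch of that control flow. The helper name `run_steps` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not taken from test_gaussdb_cdc.sh):
# run each command in order and stop at the first failure.
# Each argument is one command line, for example "./run_gaussdb_test.sh test".
run_steps() {
    local step=0 cmd
    for cmd in "$@"; do
        step=$((step + 1))
        echo "Step ${step}: ${cmd}"
        # The command line is word-split on purpose so "script arg" works.
        if ! $cmd; then
            echo "Step ${step} failed: ${cmd}" >&2
            return 1
        fi
    done
    echo "All ${step} steps passed"
}

# Possible usage with the scripts named in this README
# (commented out so the sketch has no side effects):
# run_steps "./deploy_gaussdb.sh" "./run_gaussdb_test.sh test" "./check_sync_result.sh"
```

Because the function returns non-zero as soon as one step fails, later steps never run against a broken deployment.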

---

## 📊 Sample Output

### On success:
```
╔══════════════════════════════════════════════════════════════════╗
║                      Test Completion Report                      ║
╚══════════════════════════════════════════════════════════════════╝

Test results:
✅ Step 1: Deployment succeeded
✅ Step 2: Tests passed (INSERT/UPDATE/DELETE)
✅ Step 3: Data consistency check passed

Test statistics:
⏱️ Total time: 2 min 30 s
📅 Completed at: 2025-12-23 18:15:00

╔══════════════════════════════════════════════════════════════════╗
║              🎉 Congratulations! All tests passed!               ║
║        GaussDB CDC incremental sync is working correctly         ║
╚══════════════════════════════════════════════════════════════════╝
```
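
The report above includes a total elapsed time. How the script measures it is not shown, so the following is only a sketch of one common approach, using bash's built-in `SECONDS` counter. The helper name `format_duration` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: render a duration given in whole seconds
# as minutes and seconds.
format_duration() {
    local total=$1
    echo "$((total / 60)) min $((total % 60)) s"
}

# SECONDS is a bash built-in that counts seconds since it was last assigned.
SECONDS=0
sleep 1   # stands in for the real deploy/test/verify work
echo "Total time: $(format_duration "$SECONDS")"
```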

### On failure:
```
╔══════════════════════════════════════════════════════════════════╗
║                          ❌ Test failed                          ║
╚══════════════════════════════════════════════════════════════════╝

💡 Troubleshooting suggestions:
1. Check the Flink logs: docker logs flink-taskmanager
2. Check the job status: docker exec flink-jobmanager ./bin/flink list
3. Verify the database connection: ./run_gaussdb_test.sh init
4. Read the detailed guide: cat TEST_SCRIPTS_GUIDE.md
```

---

## 🔧 Running Individual Steps

To run a single step on its own:

```bash
# Deploy only
./deploy_gaussdb.sh

# Test only
./run_gaussdb_test.sh test

# Verify only
./check_sync_result.sh
```

---

## 📖 Detailed Documentation

See the full usage guide:
```bash
cat TEST_SCRIPTS_GUIDE.md
```

---

## 🎯 Exit Codes

- `0` - All tests passed
- `1` - A test failed

The exit code can gate a CI/CD pipeline:
```bash
if ./test_gaussdb_cdc.sh; then
    echo "Deploying to production"
else
    echo "Tests failed, aborting deployment"
    exit 1
fi
```

---

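## 📞 Troubleshooting

### Common Issues

**Q: What if deployment fails?**
```bash
# Check the Maven build log
mvn clean install -DskipTests

# Check the Docker container status
docker ps
```

**Q: What if a test fails?**
```bash
# View the last 100 lines of the Flink logs (covers both stdout and stderr)
docker logs --tail 100 flink-taskmanager

# Check the database connection
./run_gaussdb_test.sh init
```

**Q: What if verification fails?**
```bash
# Inspect the data manually
./check_sync_result.sh

# Clean up the test data and retry
./run_gaussdb_test.sh cleanup
./test_gaussdb_cdc.sh
```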
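
When a check fails only because the job has not caught up yet, retrying the check a few times can separate transient failures from real ones. A minimal sketch follows. The `retry` helper is hypothetical and is not part of the scripts in this PR.

```bash
#!/bin/bash
# Hypothetical helper: retry a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Giving up after ${n} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${n} failed, retrying in ${delay}s..." >&2
        n=$((n + 1))
        sleep "$delay"
    done
}
```

For example, `retry 3 10 ./check_sync_result.sh` would run the verification up to three times, 10 seconds apart, and return non-zero only if every attempt fails.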

---

## 🔄 Continuous Integration Example

```yaml
# .github/workflows/test.yml
name: GaussDB CDC Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run CDC Test
        run: ./test_gaussdb_cdc.sh
```

---

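## 📝 Version Information

- **Version**: 2.0
- **Last updated**: 2025-12-23
- **Maintainer**: GaussDB CDC Team

---

## 🎉 Quick Verification

After changing the code, just run:
```bash
./test_gaussdb_cdc.sh
```

That single command fully verifies that CDC is working correctly.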
128 changes: 128 additions & 0 deletions check_distributed_sync_result.sh
@@ -0,0 +1,128 @@
#!/bin/bash

# ==============================================================================
# GaussDB Distributed CDC Sync Result Verification Script
# Verifies data consistency between multiple GaussDB DNs and MySQL
# ==============================================================================

set -e

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m' # No Color

# DN Connection Details
DN_HOSTS=("10.250.0.30" "10.250.0.181" "10.250.0.157")
DN_PORTS=("40000" "40020" "40040")
DB_USER="tom"
DB_PASS="Gauss_235"
DB_NAME="db1"

# CN Connection Details
CN_HOST="10.250.0.51"
CN_PORT="8000"

# Base ID for test data
TEST_ID_BASE=900

echo -e "${BLUE}╔════════════════════════════════════════════════════════════╗${NC}"
echo -e "${BLUE}║      GaussDB Distributed CDC Sync Result Verification      ║${NC}"
echo -e "${BLUE}╚════════════════════════════════════════════════════════════╝${NC}"
echo ""

# ========== 1. Fetch and aggregate data from all GaussDB DNs ==========
echo -e "${CYAN}📊 Fetching data from all GaussDB DNs...${NC}"
GAUSSDB_TOTAL_DATA=""
GAUSSDB_TOTAL_COUNT=0

for i in "${!DN_HOSTS[@]}"; do
    host=${DN_HOSTS[$i]}
    port=${DN_PORTS[$i]}
    echo -ne " -> DN$((i+1)) ($host:$port)... "
    # A failed query is tolerated here and counts as zero records for that DN.
    DN_DATA=$(PGPASSWORD="$DB_PASS" psql -h "$host" -p "$port" -U "$DB_USER" -d "$DB_NAME" -t -A -F'|' -c "SELECT product_id, product_name, category, price, stock FROM products;" 2>/dev/null || true)
    DN_COUNT=$(echo "$DN_DATA" | grep -v '^$' | wc -l | tr -d ' ')
    echo -e "${GREEN}$DN_COUNT records${NC}"

    if [ -n "$DN_DATA" ]; then
        if [ -n "$GAUSSDB_TOTAL_DATA" ]; then
            # Joined with a literal \n, which `echo -e` expands in the comparison step.
            GAUSSDB_TOTAL_DATA="${GAUSSDB_TOTAL_DATA}\n${DN_DATA}"
        else
            GAUSSDB_TOTAL_DATA="$DN_DATA"
        fi
    fi
    GAUSSDB_TOTAL_COUNT=$((GAUSSDB_TOTAL_COUNT + DN_COUNT))
done

echo -e " Total records in GaussDB (all DNs): ${GREEN}$GAUSSDB_TOTAL_COUNT${NC}"

# ========== 2. Fetch data from MySQL ==========
echo -e "${CYAN}📊 Fetching data from MySQL...${NC}"
MYSQL_DATA=$(docker exec mysql-sink mysql -uflinkuser -pflinkpw inventory --default-character-set=utf8mb4 -se "SELECT product_id, product_name, category, price, stock FROM products_sink ORDER BY product_id;" 2>/dev/null | tr '\t' '|')
MYSQL_COUNT=$(echo "$MYSQL_DATA" | grep -v '^$' | wc -l | tr -d ' ')

echo -e " Total records in MySQL: ${GREEN}$MYSQL_COUNT${NC}"
echo ""

# ========== 3. Compare record counts ==========
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}📈 Record Count Comparison${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"

if [ "$GAUSSDB_TOTAL_COUNT" -eq "$MYSQL_COUNT" ]; then
    echo -e "${GREEN}✅ Record count matches: $GAUSSDB_TOTAL_COUNT records${NC}"
    COUNT_MATCH=true
else
    echo -e "${RED}❌ Record count mismatch!${NC}"
    echo -e " GaussDB (Total): $GAUSSDB_TOTAL_COUNT records"
    echo -e " MySQL: $MYSQL_COUNT records"
    echo -e " Diff: $((GAUSSDB_TOTAL_COUNT - MYSQL_COUNT)) records"
    COUNT_MATCH=false
fi
echo ""

# ========== 4. Compare data content ==========
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}🔍 Data Content Comparison${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"

# Sort both sides, then compare
GAUSSDB_SORTED=$(echo -e "$GAUSSDB_TOTAL_DATA" | grep -v '^$' | sort)
MYSQL_SORTED=$(echo "$MYSQL_DATA" | grep -v '^$' | sort)

if [ "$GAUSSDB_SORTED" == "$MYSQL_SORTED" ]; then
    echo -e "${GREEN}✅ Data content matches perfectly!${NC}"
    CONTENT_MATCH=true
else
    echo -e "${RED}❌ Data content mismatch detected!${NC}"
    echo ""
    echo -e "${YELLOW}Differences:${NC}"

    # Show the differences (first 20 lines only)
    diff <(echo "$GAUSSDB_SORTED") <(echo "$MYSQL_SORTED") | head -20 || true

    CONTENT_MATCH=false
fi
echo ""

# ========== 5. Final result ==========
echo -e "${BLUE}╔════════════════════════════════════════════════════════════╗${NC}"
echo -e "${BLUE}║                    Verification Summary                    ║${NC}"
echo -e "${BLUE}╚════════════════════════════════════════════════════════════╝${NC}"

if [ "$COUNT_MATCH" = true ] && [ "$CONTENT_MATCH" = true ]; then
    echo -e "${GREEN}✅ Record Count: PASSED${NC}"
    echo -e "${GREEN}✅ Data Content: PASSED${NC}"
    echo ""
    echo -e "${GREEN}🎉 All checks PASSED! Distributed CDC sync is working correctly.${NC}"
    exit 0
else
    [ "$COUNT_MATCH" = true ] && echo -e "${GREEN}✅ Record Count: PASSED${NC}" || echo -e "${RED}❌ Record Count: FAILED${NC}"
    [ "$CONTENT_MATCH" = true ] && echo -e "${GREEN}✅ Data Content: PASSED${NC}" || echo -e "${RED}❌ Data Content: FAILED${NC}"
    echo ""
    echo -e "${RED}❌ Some checks FAILED. Distributed CDC sync may have issues.${NC}"
    exit 1
fi
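
The content check in this script prints a raw `diff` of the two sorted row sets. To see which side is missing rows, the same comparison can be split by direction with `comm`. This is only a sketch: the helper names `normalize_rows`, `rows_only_in_first`, and `rows_only_in_second` are hypothetical, and the sample rows are made up for illustration in the `|`-separated format the script uses.

```bash
#!/bin/bash
# Hypothetical helpers: compare two newline-separated row dumps.
# LC_ALL=C keeps the ordering identical for sort and comm.
normalize_rows() {
    # Drop empty lines and sort; an empty dump yields no output.
    printf '%s\n' "$1" | grep -v '^$' | LC_ALL=C sort || true
}

# Rows present in the first dump but missing from the second.
rows_only_in_first() {
    LC_ALL=C comm -23 <(normalize_rows "$1") <(normalize_rows "$2")
}

# Rows present in the second dump but missing from the first.
rows_only_in_second() {
    LC_ALL=C comm -13 <(normalize_rows "$1") <(normalize_rows "$2")
}

# Made-up sample rows, for illustration only.
source_rows=$'1|pen|office|1.50|10\n2|cup|kitchen|3.00|5'
sink_rows=$'2|cup|kitchen|3.00|5\n3|mat|home|9.99|2'

echo "Missing from sink:"
rows_only_in_first "$source_rows" "$sink_rows"
echo "Unexpected in sink:"
rows_only_in_second "$source_rows" "$sink_rows"
```

Unlike the `"\n"` plus `echo -e` join used in the script, `printf '%s\n'` does not interpret backslashes, so a row containing a backslash is compared as-is.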