populate DB instance with dummy data for benchmarks #8048

Closed
9 tasks done
Tracked by #7035
alexandr-shegeda opened this issue Nov 17, 2021 · 1 comment
Assignees
Labels
area/connectors Connector related issues area/databases

Comments

@alexandr-shegeda
Contributor

alexandr-shegeda commented Nov 17, 2021

Tell us about the problem you're trying to solve

We need to implement an SQL script that generates dummy data for benchmarking.
The desired row and table sizes are described in this doc:

| Row | Size (B) | Size (KB) |
|---|---|---|
| Regular | 10,000 | 10 |
| Small | 500 | 0.50 |

| Table | Row Count | Regular Row (%) | Small Row (%) | Large Row (%) | Table Size (B) | Table Size (GB) |
|---|---|---|---|---|---|---|
| Regular | 50,000 | 99% | 0% | 1% | 995,000,000 | 1 |
| Small | 10,000 | 0% | 100% | 0% | 5,000,000 | 0.01 |

| Database | Table Count | Regular Table | Small Table | Large Table | Database Size (B) | Database Size (GB) |
|---|---|---|---|---|---|---|
| Regular | 25 | 25 | 0 | 0 | 24,875,000,000 | 25 |
| Half-regular | 10 | 10 | 0 | 0 | 10,737,000,000 | 10 |
| Many small tables | 1,000 | 0 | 1,000 | 0 | 5,000,000,000 | 5 |
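The table and database sizes above follow from the row sizes and the row mix, which a quick calculation can check. This is only a sketch: the large-row size is not listed in the issue, so the 1,000,000 B used here is an assumption, picked because it reproduces the stated 995,000,000 B for a regular table. The function and variable names are made up for illustration.

```python
# Row sizes in bytes. "regular" and "small" come from the row table above;
# "large" is NOT given in the issue and is an assumption.
ROW_SIZE_B = {"regular": 10_000, "small": 500, "large": 1_000_000}


def table_size_bytes(row_count, mix_percent):
    """Payload size of a table, given its row count and row mix in percent."""
    total = 0
    for kind, percent in mix_percent.items():
        rows = row_count * percent // 100
        total += rows * ROW_SIZE_B[kind]
    return total


regular_table = table_size_bytes(50_000, {"regular": 99, "small": 0, "large": 1})
small_table = table_size_bytes(10_000, {"regular": 0, "small": 100, "large": 0})

# Database sizes are just table count times table size.
regular_db = 25 * regular_table
many_small_db = 1_000 * small_table

print(regular_table, small_table, regular_db, many_small_db)
```

Under the same assumption, ten regular tables come to 9,950,000,000 B. The 10,737,000,000 B listed for the half-regular database is closer to 10 GiB (10,737,418,240 B), so that row may have been computed differently.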

Describe the solution you’d like

  • prepare the SQL procedure for populating the table with dummy data
  • create/populate the database with a small data set (1000 streams with total size 5GB)
  • create/populate the database with a half-regular data set (10 streams with total size 10GB)
  • create/populate the database with a regular data set (25 streams with total size 25GB)
  • modify the script and create databases for Postgres
  • modify the script and create databases for MsSQL
  • add all scripts and the documentation for running them to the repository
  • test with different options for CPU and Memory values
  • make it possible for performance tests to pass CPU and Memory values as parameters
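As a rough idea of what the populate step could look like, here is a minimal sketch that fills one "small" table (10,000 rows of 500 B) with fixed-size dummy payloads. It uses Python's built-in `sqlite3` purely as a stand-in so the example runs anywhere; the actual scripts for this issue would be SQL procedures for the target databases. The table name, column names, and `populate_table` helper are hypothetical.

```python
import sqlite3


def populate_table(conn, table, row_count, row_size_bytes, batch_size=1_000):
    """Create `table` and fill it with `row_count` rows of fixed-size dummy payloads."""
    # The table name is interpolated into the SQL, so it must come from trusted code.
    conn.execute(
        f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)'
    )
    payload = "x" * row_size_bytes  # ASCII only, so one character is one byte
    inserted = 0
    while inserted < row_count:
        n = min(batch_size, row_count - inserted)
        # Insert in batches to keep memory use flat for large row counts.
        conn.executemany(
            f'INSERT INTO "{table}" (payload) VALUES (?)',
            [(payload,)] * n,
        )
        inserted += n
    conn.commit()


conn = sqlite3.connect(":memory:")
populate_table(conn, "small_table_1", row_count=10_000, row_size_bytes=500)

# Check the row count and the total payload size (the id column is extra overhead).
row_count, payload_bytes = conn.execute(
    'SELECT COUNT(*), SUM(LENGTH(payload)) FROM "small_table_1"'
).fetchone()
print(row_count, payload_bytes)
```

Only the payload column is counted toward the target size here; the on-disk size will be somewhat larger because of the key column and per-row storage overhead, which differs per database.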

Additional context

To reduce computing time, we can generate the data for one table and then clone that table as many times as we need.
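The cloning idea could be sketched roughly like this: populate one template table, then copy it with `CREATE TABLE ... AS SELECT`. Again this uses `sqlite3` only as a runnable stand-in; the `template` table, the `<source>_<n>` naming scheme, and the `clone_table` helper are assumptions for illustration, not part of any existing script.

```python
import sqlite3


def clone_table(conn, source, copies):
    """Create `copies` tables named <source>_1, <source>_2, ... as copies of `source`."""
    names = []
    for i in range(1, copies + 1):
        target = f"{source}_{i}"
        # Copies all rows in one statement instead of regenerating the data.
        conn.execute(f'CREATE TABLE "{target}" AS SELECT * FROM "{source}"')
        names.append(target)
    conn.commit()
    return names


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE template (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
conn.executemany("INSERT INTO template (payload) VALUES (?)", [("x" * 500,)] * 100)

clones = clone_table(conn, "template", copies=3)
counts = [
    conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0] for name in clones
]
print(clones, counts)
```

Note that in SQLite a table created with `CREATE TABLE ... AS SELECT` gets the rows but not the primary key or other constraints, so those would have to be added separately if the benchmark depends on them.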

alexandr-shegeda changed the title from "populate DB instance with dummy data" to "populate DB instance with dummy data for benchmarks" on Nov 17, 2021
@andriikorotkov
Contributor


2 participants