
[META] Generate large data corpora (1 to 10 TB) for the big5 workload #490

Open
@gkamat

Description

The big5 workload is available with two sizes of data corpora: 60 GB and 100 GB. The latter features a more representative timestamp sequence. Larger data corpora would be appropriate for performance testing at scale; this issue tracks the generation of such corpora.

Initially, a 1 TB corpus will be generated and tested out; OSB scaling and stability will also be relevant in this context. Once a corpus of this size can be used effectively, larger corpora, up to 10 TB in size and perhaps spanning multiple indices, will be tackled.
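
One plausible way to build the 1 TB corpus is to replay the existing 100 GB corpus several times, shifting timestamps on each pass so the sequence keeps extending rather than repeating the same time range. Below is a minimal sketch of that approach, assuming the corpus is stored as newline-delimited JSON with an `@timestamp` field; the file names, field name, and per-pass offset here are illustrative assumptions, not part of the workload definition.

```python
import json
from datetime import datetime, timedelta

SOURCE = "documents.json"         # hypothetical name for the existing corpus
TARGET = "documents-1tb.json"     # hypothetical name for the expanded corpus
TARGET_BYTES = 10**12             # roughly 1 TB of raw JSON
TS_FIELD = "@timestamp"           # assumed timestamp field in big5 documents
PASS_OFFSET = timedelta(days=30)  # assumed shift applied on each replay pass

written = 0
passes = 0
with open(TARGET, "w") as out:
    while written < TARGET_BYTES:
        offset = PASS_OFFSET * passes
        with open(SOURCE) as src:
            for line in src:
                doc = json.loads(line)
                # Shift the timestamp so each replay pass extends the
                # sequence instead of duplicating the same time range.
                ts = datetime.fromisoformat(doc[TS_FIELD].replace("Z", "+00:00"))
                doc[TS_FIELD] = (ts + offset).isoformat()
                record = json.dumps(doc) + "\n"
                out.write(record)
                written += len(record)
                if written >= TARGET_BYTES:
                    break
        passes += 1
```

Shifting by a fixed offset per pass would preserve the kind of representative, monotonic timestamp sequence that distinguishes the 100 GB corpus from the 60 GB one.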

Task Breakdown

