feat: add more details about the sst in sst-metadata tool #1019

Merged 5 commits on Jun 25, 2023

zouxiang1993
Contributor

@zouxiang1993 zouxiang1993 commented Jun 23, 2023

Rationale

More details about the SST files are needed for troubleshooting problems.

Detailed Changes

  • Output some statistics about the file;
  • Output compression information;

Test Plan

Check the output of the sst-metadata tool.

@zouxiang1993
Contributor Author

zouxiang1993 commented Jun 23, 2023

Output example:

FileStatistics {
	file_count: 36,
	size: 214.65,
	metadata_size: 25.34, 
	kv_size: 24.14,
	filter_size: 17.71,
	row_num: 2175886,
}
FieldStatistics: 
serviceVersion,	 compressed_size: 3.32mb,	 uncompressed_size: 7.75mb,	 compress_ratio: 2.33
extra6,	 compressed_size: 0.02mb,	 uncompressed_size: 0.02mb,	 compress_ratio: 0.75
subtag,	 compressed_size: 8.89mb,	 uncompressed_size: 21.37mb,	 compress_ratio: 2.41
tsid,	 compressed_size: 19.65mb,	 uncompressed_size: 19.98mb,	 compress_ratio: 1.02
extra4,	 compressed_size: 0.17mb,	 uncompressed_size: 0.37mb,	 compress_ratio: 2.19
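
For readers checking the numbers: the compress_ratio column looks consistent with uncompressed_size divided by compressed_size (for serviceVersion, 7.75 / 3.32 ≈ 2.33). Ratios recomputed from the rounded sizes can differ in the last digit, which suggests the tool divides raw byte counts. The following is a minimal sketch of that calculation under stated assumptions, not the tool's actual code: the `FieldSize` struct, the function names, and the choice of 1024 * 1024 bytes per "mb" are all hypothetical.

```rust
/// Hypothetical per-column size record; not the tool's actual type.
struct FieldSize {
    name: String,
    compressed_bytes: u64,
    uncompressed_bytes: u64,
}

/// Assumption: "mb" in the output means 1024 * 1024 bytes.
const MB: f64 = 1024.0 * 1024.0;

/// Uncompressed size divided by compressed size.
/// Returns None when the compressed size is zero, to avoid dividing by zero.
fn compress_ratio(f: &FieldSize) -> Option<f64> {
    if f.compressed_bytes == 0 {
        None
    } else {
        Some(f.uncompressed_bytes as f64 / f.compressed_bytes as f64)
    }
}

/// Formats one line in the same shape as the FieldStatistics output above.
fn format_field(f: &FieldSize) -> String {
    let ratio = match compress_ratio(f) {
        Some(r) => format!("{:.2}", r),
        None => "n/a".to_string(),
    };
    format!(
        "{},\t compressed_size: {:.2}mb,\t uncompressed_size: {:.2}mb,\t compress_ratio: {}",
        f.name,
        f.compressed_bytes as f64 / MB,
        f.uncompressed_bytes as f64 / MB,
        ratio
    )
}

fn main() {
    // Made-up sizes for illustration only.
    let field = FieldSize {
        name: "tsid".to_string(),
        compressed_bytes: 4 * 1024 * 1024,
        uncompressed_bytes: 6 * 1024 * 1024,
    };
    println!("{}", format_field(&field));
}
```

A ratio below 1, as with extra6, would mean the encoded column is larger than the raw data, which can happen for very small columns where encoding overhead dominates.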

@ShiKaiWi
Member

@zouxiang1993 I'm OK with this PR, but @jiacai2050 is the owner of this module, so I guess we should wait for his review.

@jiacai2050
Contributor

jiacai2050 commented Jun 25, 2023

It seems this info duplicates what the tool already printed before?

FileStatistics {
	file_count: 36,
	size: 214.65,
	metadata_size: 25.34, 
	kv_size: 24.14,
	filter_size: 17.71,
	row_num: 2175886,
}

Original output

Location:2097.sst, time_range:[2023-05-09 06:00:00, 2023-05-09 08:00:00), max_seq:132734153, size:348.675M, metadata:40.142M, kv:38.175M, filter:28.525M, row_num:7038949

@zouxiang1993
Contributor Author

Yes, but these are aggregate statistics across all the SST files.
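
To illustrate the difference between the original per-file line and the new summary, here is a minimal sketch of rolling per-file metadata up into one `FileStatistics` value. It assumes the summary fields are plain sums over the files; the thread does not say how the tool computes them. The `SstMeta` struct and the `aggregate` function are hypothetical names, and sizes are kept in bytes for simplicity.

```rust
/// Hypothetical per-file metadata, mirroring the fields in the per-file output line.
#[derive(Debug, Clone, Copy)]
struct SstMeta {
    size: u64,
    metadata_size: u64,
    kv_size: u64,
    filter_size: u64,
    row_num: u64,
}

/// Summary over all files; field names follow the example output in this thread.
#[derive(Debug, Default, PartialEq)]
struct FileStatistics {
    file_count: u64,
    size: u64,
    metadata_size: u64,
    kv_size: u64,
    filter_size: u64,
    row_num: u64,
}

/// Assumption: every summary field is a sum of the per-file values.
fn aggregate(files: &[SstMeta]) -> FileStatistics {
    files.iter().fold(FileStatistics::default(), |mut acc, f| {
        acc.file_count += 1;
        acc.size += f.size;
        acc.metadata_size += f.metadata_size;
        acc.kv_size += f.kv_size;
        acc.filter_size += f.filter_size;
        acc.row_num += f.row_num;
        acc
    })
}

fn main() {
    // Made-up values for illustration only.
    let files = [
        SstMeta { size: 100, metadata_size: 10, kv_size: 8, filter_size: 5, row_num: 1_000 },
        SstMeta { size: 200, metadata_size: 20, kv_size: 16, filter_size: 10, row_num: 2_000 },
    ];
    println!("{:#?}", aggregate(&files));
}
```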

@zouxiang1993 changed the title from "feat: add --stats in sst-metadat tool to query sst file & field stati…" to "feat: add more details about the sst in sst-metadata tool" on Jun 25, 2023
@jiacai2050 merged commit 03d9aa4 into apache:main on Jun 25, 2023
@zouxiang1993 deleted the sst-metadata-stats branch on June 26, 2023 02:18
dust1 pushed a commit to dust1/ceresdb that referenced this pull request Aug 9, 2023
Co-authored-by: Ruixiang Tan <tanruixiang0104@gmail.com>