MDEV-31669: use XXH3 as a digest instead of md5#4573

Open
MohamedM216 wants to merge 1 commit into MariaDB:main from MohamedM216:feat-use-sha2-256/MDEV-31669

Contributor

@MohamedM216 MohamedM216 commented Jan 21, 2026

Jira Issue number for this PR: MDEV-31669

I followed the MySQL implementation, as suggested in Jira.

PERFORMANCE SCHEMA MD5 DIGEST NEEDS TO CHANGE DIGEST FOR FIPS
COMPLIANCE
Before this fix, DIGEST hashes are computed using the MD5 hash.
MD5 is not FIPS compliant.
This fix replaces the MD5 hash with SHA256 (which is compliant).
As a result, DIGEST columns are changed from VARCHAR(32) to VARCHAR(64)

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jan 22, 2026
Member

@gkodinov gkodinov left a comment

This is a preliminary review. Right now there's no strong opposition against removing MD5 from this. But the choice of a replacement hash function is yet to be determined. I can see the logic in Sergei's argument that using a crypto hash for a non-crypto purpose is kind of a mismatch. But I will leave that discussion for the final review now.

First of all: please make sure the change compiles and passes all buildbot tests in a satisfactory way.

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 899f3b3 to 3e7b460 Compare January 22, 2026 22:33
@MohamedM216 MohamedM216 requested a review from gkodinov January 23, 2026 04:39
Member

@gkodinov gkodinov left a comment

Please switch this to XXH3, as suggested by Sergei. I will then do a full review of the result once this is done.

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 3e7b460 to 241cb0c Compare February 18, 2026 15:32
@MohamedM216 MohamedM216 changed the title MDEV-31669: use sha2-256 as a digest instead of md5 MDEV-31669: use XXH3 as a digest instead of md5 Feb 18, 2026
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from 241cb0c to ea765b3 Compare February 19, 2026 12:43
@MohamedM216
Contributor Author

Apologies, I accidentally closed this PR.
I will reopen it shortly.

@MohamedM216 MohamedM216 reopened this Feb 19, 2026
Member

@gkodinov gkodinov left a comment

Thanks for switching to xxHash.

OK to push from me, given that you add a commit message to the commit that follows CODING_STANDARDS.md.

Please wait for the final review.

@gkodinov gkodinov requested a review from vuvova February 19, 2026 14:47
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from ea765b3 to 233436e Compare February 19, 2026 23:28
@MohamedM216
Contributor Author

Hi @gkodinov , I've updated the commit message. Thanks!

Member

@vuvova vuvova left a comment

The main comment here: I've looked at the xxhash.h that you're using and noticed that it has a 128-bit hash. This is the same width as MD5, so fewer changes and also the same chance of collision; a 64-bit hash has a much higher one.

Could you try with XXH3_128bits, please?
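
To put the collision remark in numbers, here is a minimal sketch of the standard birthday-bound approximation. It assumes hash outputs are uniformly distributed, and the function name `collision_probability` and the sample count used in the note below are illustrative, not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// Birthday-bound approximation: probability that at least two of n
// uniformly distributed `bits`-bit hashes collide.
//   p ~= 1 - exp(-n(n-1) / 2^(bits+1))
// Hypothetical helper for illustration; not part of the PR.
double collision_probability(double n, int bits)
{
  assert(n >= 0 && bits > 0);
  double pairs = n * (n - 1.0) / 2.0;   // number of distinct pairs
  double x = std::ldexp(pairs, -bits);  // pairs / 2^bits
  return -std::expm1(-x);               // 1 - exp(-x), accurate for tiny x
}
```

Under these assumptions, one billion distinct digests give a collision probability of roughly 2.7% for a 64-bit hash, and a value on the order of 10^-21 for a 128-bit hash.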


 --disable_query_log
-create table test._digests(d varchar(32) primary key);
+create table test._digests(d varchar(16) primary key);
Member

Let's keep it at 32 here. It'd be good to know that existing applications (that query and, perhaps, temporarily store these digests) will continue to work.

-compute_md5_hash(md5,
-(const char *) digest_storage->m_token_array,
-digest_storage->m_byte_count);
+XXH64_hash_t res = XXH3_64bits(digest_storage->m_token_array,
Member

I wonder, perhaps we should use XXH3_128bits? It exists in the same file, so it is just as easy to use, but it has the same width as MD5, so fewer changes and, more importantly, the same chance of collisions as before.

@MohamedM216
Contributor Author

OK, I'll work on that. Thanks for the feedback!

@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch 2 times, most recently from 8b3c053 to bc14604 Compare February 26, 2026 23:23
- Following MySQL implementation as mentioned in the PR description
- XXH is significantly faster than both MD5 and SHA-256
- Here we use the 128-bit XXH3
- Keep DIGEST_HASH_SIZE at 16 bytes (same as MD5)
- Reduce COL_DIGEST_SIZE to 32 chars
- Keep DIGEST columns as VARCHAR(32)
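
The size relationship in the list above (a 16-byte hash rendered as 32 hex characters) can be sketched as follows. This is only an illustration under stated assumptions: `Hash128`, `HASH_BYTES`, `HEX_CHARS`, and `to_hex` are hypothetical names, the struct is a stand-in for a 128-bit result made of two 64-bit halves, and the byte order (most significant first) is a choice made for the example, not necessarily what the patch does.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a 128-bit hash result (two 64-bit halves).
struct Hash128
{
  std::uint64_t low64;
  std::uint64_t high64;
};

constexpr std::size_t HASH_BYTES = 16;             // 128 bits, same width as MD5
constexpr std::size_t HEX_CHARS = 2 * HASH_BYTES;  // 32, fits a VARCHAR(32)

// Render the hash as lowercase hex, most significant nibble first
// (byte order is an assumption of this sketch).
std::string to_hex(const Hash128 &h)
{
  static const char digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(HEX_CHARS);
  const std::uint64_t halves[2] = {h.high64, h.low64};
  for (std::uint64_t half : halves)
    for (int shift = 60; shift >= 0; shift -= 4)
      out.push_back(digits[(half >> shift) & 0xF]);
  return out;
}
```

Because each byte becomes two hex characters, any 128-bit hash keeps the 32-character text form that MD5 digests already had, which is why the column width does not need to change.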
@MohamedM216 MohamedM216 force-pushed the feat-use-sha2-256/MDEV-31669 branch from bc14604 to 8ad7bd6 Compare February 26, 2026 23:42
@MohamedM216 MohamedM216 requested a review from vuvova February 26, 2026 23:55

3 participants