Skip to content

Conversation

@dAdAbird
Copy link
Member

A new GUC option is applied to all newly created keys - both principal and internal.
We had to change the _keys file format to accommodate 32-byte internal keys. So the keys file migration routine was added at server start.

@codecov-commenter
Copy link

codecov-commenter commented Nov 12, 2025

Codecov Report

❌ Patch coverage is 38.62661% with 143 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.07%. Comparing base (44684d3) to head (2c38527).
⚠️ Report is 15 commits behind head on main.

❌ Your project status has failed because the head coverage (59.07%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #465      +/-   ##
==========================================
- Coverage   59.69%   59.07%   -0.63%     
==========================================
  Files          67       68       +1     
  Lines       10471    10653     +182     
  Branches     1813     1836      +23     
==========================================
+ Hits         6251     6293      +42     
- Misses       3524     3659     +135     
- Partials      696      701       +5     
Components Coverage Δ
access 72.96% <19.49%> (-11.93%) ⬇️
catalog 87.93% <100.00%> (ø)
common 77.77% <ø> (ø)
encryption 73.63% <85.71%> (+2.06%) ⬆️
keyring 73.98% <100.00%> (+0.44%) ⬆️
src 92.11% <65.38%> (-2.10%) ⬇️
smgr 94.06% <100.00%> (ø)
transam ∅ <ø> (∅)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@AndersAstrand AndersAstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit messages needs work since they current reveal very little of what the commits actually do.

I also think we need to discuss the technical debt we take on here by pushing generalization of the file format versions to the future. Adding "old" and "new" versions of all the functions works for now ofc, but means that it's something we need to clean up later.

Maybe it's worth taking on that debt now, I'm not sure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably log a WARNING or NOTICE if the key loaded from the KMS doesn't match the current setting for TdeKeyLength, since they obviously want some other cipher than what they're getting.

Maybe also add a error_hint() about how to rotate the principal key?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also make sense to check that the Principal Key's length matches that of the Internal Key (and issue a WARNING if it doesn't) whenever we decrypt the latter. What do you think? Or is it an overkill?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... yes. that makes sense. I wonder if we want WARNING on them or just LOG or INFO? I think we could spam the warning log a lot otherwise.

errmsg("TDE map file \"%s\" is corrupted: %m", tde_filename));
}

return fheader->file_version;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what we gain by returning this value instead of just using the field from the header structure after the call?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed like a cleaner interface to me. But looking at it now, you're right, it doesn't worth changes

typedef struct WalKeyFileEntry
{
uint32 key_len;
CipherType cipher; /* Cipher type. We support only AES for now. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use uint32 here as we do with the other enum that's written directly to file?

This probably doesn't matter in practice for this enum as it's unlikely to hold more than a handful of values, but I would prefer to get an error if we add so many entries the size of the enum grows rather than the file format randomly changing because of it.

}
pg_tde_save_server_key(principalKey, false);
TDEXLogSmgrInitWrite(true);
TDEXLogSmgrInitWrite(true, 16);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like seeing a random 16 in the source, but will need to think what the right solution is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as an intermediate step before the longer keys are supported in the later commits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix it in the upcoming PR(s) when taking care of fetools

@dAdAbird
Copy link
Member Author

I also think we need to discuss the technical debt we take on here by pushing generalization of the file format versions to the future. Adding "old" and "new" versions of all the functions works for now ofc, but means that it's something we need to clean up later.

Maybe it's worth taking on that debt now, I'm not sure.

@AndersAstrand what other option do you see?
We could clean up the migration and old/new stuff in the next next release if there is no posibillity to skip releases while upgrading

@AndersAstrand
Copy link
Collaborator

AndersAstrand commented Nov 25, 2025

I also think we need to discuss the technical debt we take on here by pushing generalization of the file format versions to the future. Adding "old" and "new" versions of all the functions works for now ofc, but means that it's something we need to clean up later.
Maybe it's worth taking on that debt now, I'm not sure.

@AndersAstrand what other option do you see? We could clean up the migration and old/new stuff in the next next release if there is no posibillity to skip releases while upgrading

The other option would be to not be just "old/new" but instead more explicit about versions. But I think it's ok to leave this in for now and fix it when we need to.

I think I'm also just a little allergic to using the words "old/new" in code as it tends to not say much about what it actually is.

@dAdAbird
Copy link
Member Author

I also think we need to discuss the technical debt we take on here by pushing generalization of the file format versions to the future. Adding "old" and "new" versions of all the functions works for now ofc, but means that it's something we need to clean up later.
Maybe it's worth taking on that debt now, I'm not sure.

@AndersAstrand what other option do you see? We could clean up the migration and old/new stuff in the next next release if there is no posibillity to skip releases while upgrading

The other option would be to not be just "old/new" but instead more explicit about versions. But I think it's ok to leave this in for now and fix it when we need to.

I think I'm also just a little allergic to using the words "old/new" in code as it tends to not say much about what it actually is.

Tbh, I hacked that for the POC and focused on other things after that. But I agree with your disdain for old/new. I have some Idea and will try to do an additional commit (or a refactoring PR if we merge this one by then) later this week.

*ctxPtr = EVP_CIPHER_CTX_new();

if (EVP_CipherInit_ex(*ctxPtr, cipher_ctr_ecb, NULL, key, NULL, 1) == 0)
if (EVP_CipherInit_ex(*ctxPtr, cipher, NULL, key, NULL, 1) == 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not completely related to the PR, but I realized that now we only cache the context for WAL and nothing else.

For everything else, ctx is always created internally every time.

This is just something we should investigate, as the mixed implementation most likely isn't practical.

Maybe we should also have an assert about the cipher size in this function, when we are reusing an older context?

@AndersAstrand AndersAstrand changed the title Add support for AES 256 encryption PG-1968 Add support for AES 256 encryption Nov 28, 2025
Adds a respective GUC and enables TDE AES lib to handle 128 and 256-bit keys.
Also adds the support for 256 bit encryption of primary keys
It adds `pg_tde/keys_version`, which contains the latest magic number
(version) of the wal and smgr keys. If those versions don't match the
current ones at the server start, it will rewrite _keys to the new
format if needed and update keys_version.
If the start crashed along the way updating _keys, it will retry again
on the next start since keys_version wasn't updated.
Instead of _old/_new we can extract the version from FILE_MAGIC. And
define a function that extracts an entry from this particular file
version and transforms it to the current entry format. That way, there
is no need to define new "FILE_MAGIC" constants; instead, simply bump
the version of the current ones, keeping the naming format. That way,
we are restricted to 255 file versions, but let's solve it when we get
there.

Also, all named size constants were replaced with the numeric values in
the previous format instances to avoid breaking old formats if we
change any of those sizes for the current one.

If it's ok, I'll squash it into the prev commit
WAL and replication tested with the modified pg_rewind test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants