Skip to content

fix: Also migrate messages in PGP-contacts migration #6921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: link2xt/pgp-contacts
Choose a base branch
from

Conversation

Hocuri
Copy link
Collaborator

@Hocuri Hocuri commented Jun 16, 2025

Previously, messages were not rewritten. This meant that all messages stayed with the old email-identified contact.
#6916 made it very obvious that all messages sent into a group before the PGP-contacts migration got the email avatar.

With this PR, all encrypted messages are rewritten to the PGP-contact identified by the current autocrypt key. It is not possible to find out which key was actually used to sign the message.

For me, this prints:

06-16 16:08:32.444 19307 19345 I DeltaChat: [accId=1] src/sql/migrations.rs:1820: Rewriting msgs for PGP contacts took 3.768890467s
06-16 16:08:41.825 19307 19345 I DeltaChat: [accId=1] src/sql/migrations.rs:1242: PGP contacts migration took 13.516607026s in total

which is slow, but the messages aren't actually the slowest part of it. We may want to try and show a spinner in the UI while the migration is being done, but let's first hear what others report as to how long this takes.

@Hocuri Hocuri requested review from iequidoo, r10s and link2xt June 16, 2025 14:21
@Hocuri
Copy link
Collaborator Author

Hocuri commented Jun 16, 2025

Actually, the messages migration probably is the largest part of it - what took long is committing the SQL transaction. I need to do this in the background.

@Hocuri Hocuri removed request for r10s, link2xt and iequidoo June 16, 2025 14:27
@Hocuri Hocuri changed the title fix: Also migrate messages in PGP-contacts migration [WIP] fix: Also migrate messages in PGP-contacts migration Jun 16, 2025
@Hocuri
Copy link
Collaborator Author

Hocuri commented Jun 16, 2025

I'm not exactly sure how to do this migration in the background. One way would be:

  • In the pgp-contacts migration, write a Json of autocrypt_pgp_contacts and autocrypt_pgp_contacts_with_reset_peerstate somewhere into the database (alternatively, create a SQL database with two columns old_contact and pgp_contact out of it). And set a config Config::MigrateMessagesToPgpContacts.
  • In run_migrations(), if Config::MigrateMessagesToPgpContacts is set, spawn a tokio task:
    • start a SQLite transaction
    • check again that Config::MigrateMessagesToPgpContacts is set
    • load the pgp-contacts-map from the database
    • execute the migration which I already implemented in the PR here.
    • clear Config::MigrateMessagesToPgpContacts
    • end of SQLite transaction

@Hocuri
Copy link
Collaborator Author

Hocuri commented Jun 16, 2025

With only rewriting the last 10.000 messages, it still took almost 5s, I will try 5.000:

06-16 22:05:31.616  6767  6988 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 45.399062ms
06-16 22:05:35.808  6767  6988 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 4.772880362s in total

Edit: This measurement is not reliable, I lowered to 5000 msgs, and the migration took longer this time:

06-16 22:33:48.914 10194 10225 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 21.90323ms
06-16 22:33:54.657 10194 10225 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 6.132597029s in total

Edit 2: Another run with 5000 msgs:

06-16 23:22:04.710 14394 14561 I DeltaChat: [accId=1] src/sql/migrations.rs:1831: Rewriting msgs for PGP contacts took 21.105573ms
06-16 23:22:09.819 14394 14561 I DeltaChat: [accId=1] src/sql/migrations.rs:1243: PGP contacts migration took 5.619887341s in total

Edit3:

Actually, even without rewriting message-ids, the migration takes multiple seconds:

06-17 13:08:26.312 17893 18094 I DeltaChat: [accId=1] src/sql/migrations.rs:1242: PGP contacts migration took 5.363832446s

So, probably the PR here is fine, the problem is that the PGP-contacts migration is slow in general.

Edit 4: Note that all of these were debug builds, and while we do optimize Sqlite in debg builds, maybe the LTO of release builds improves things a bit

@iequidoo
Copy link
Collaborator

iequidoo commented Jun 16, 2025

With this PR, all encrypted messages are rewritten to the PGP-contact identified by the current autocrypt key. It is not possible to find out which key was actually used to sign the message.

For 1:1 protected chats this can be solved by assigning to the pgp contact all recent messages until a ChatProtectionEnabled system message is encountered. Earlier messages it's better to assing to the email contact to be on the safe side.

For unprotected (even encrypted) chats this doesn't matter that much, preserving their encryption status (padlock) would be fine i think.

@Hocuri Hocuri changed the title [WIP] fix: Also migrate messages in PGP-contacts migration fix: Also migrate messages in PGP-contacts migration Jun 17, 2025
@Hocuri Hocuri requested review from iequidoo, link2xt and r10s June 17, 2025 11:44
@Hocuri
Copy link
Collaborator Author

Hocuri commented Jun 18, 2025

Nothing bad can happen if a few old messages are assigned to a slightly-wrong PGP contact, so I think it's fine. And, keeping the complexity low and moving forward.

@Hocuri Hocuri force-pushed the hoc/pgp-contacts-message-migration branch from c091d55 to acc754a Compare June 19, 2025 12:45
@iequidoo
Copy link
Collaborator

iequidoo commented Jun 20, 2025

Nothing bad can happen if a few old messages are assigned to a slightly-wrong PGP contact, so I think it's fine. And, keeping the complexity low and moving forward.

It's fine if ChatProtectionEnabled and ChatProtectionDisabled system messages are also moved to the PGP chat and preserve their position so that the user can be sure which messages were signed with the actual contact's key and which weren't.

@@ -1239,7 +1240,11 @@ CREATE INDEX gossip_timestamp_index ON gossip_timestamp (chat_id, fingerprint);
let start = Instant::now();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, this is the only place where we use Instant, maybe it's better to measure time with tools::Time as everywhere for uniformity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instant is the easiest way to measure how long some operation takes, so, I think it's the best tool for the job here. We do use it in test_utils.rs, too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, Instant should be the right way to measure time in the ideal world. And we don't run tests on Android. But e.g. Context::background_fetch() uses tools::Time to measure and log elapsed time. It should be at least documented somewhere, which clock to use and when.

"SELECT id, from_id, to_id
FROM msgs
WHERE id>?
AND (param LIKE '%\nc=1%' OR param LIKE 'c=1%')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not GLOB \"*c=*\"? GLOB is case-sensitive (though LIKE is also fine in this particular case)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about GLOB, good to know! But since case-sensitivity doesn't matter here, I think it's fine to leave it as-is, since I tested it the way I wrote it, and would have to re-do this testing if I change it now.

FTR, SQL generally recommends to use ' rather than "

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine, but leaving this unresolved to maybe switch to GLOB if the PR will need retesting for some other reason

@Hocuri
Copy link
Collaborator Author

Hocuri commented Jun 20, 2025

It's fine if ChatProtectionEnabled and ChatProtectionDisabled system messages are also moved to the PGP chat and preserve their position so that the user can be sure which messages were signed with the actual contact's key and which weren't.

Yes, all messages that were in a chat together will still be in a chat together, in the order they had before. Only contact ids will be changed.

@Hocuri Hocuri force-pushed the hoc/pgp-contacts-message-migration branch from 02d55ac to c5ce09e Compare June 20, 2025 18:42
Hocuri and others added 4 commits June 20, 2025 20:48
Previously, messages were not rewritten. This meant that all messages
stayed with the old email-identified contact.
github.com//pull/6916 made it very obvious that all
messages sent into a group before the PGP-contacts migration got the
email avatar.

With this PR, all encrypted messages are rewritten to the PGP-contact
identified by the current autocrypt key. It is not possible to find out
which key was actually used to sign the message.
Co-authored-by: iequidoo <117991069+iequidoo@users.noreply.github.com>
Co-authored-by: iequidoo <117991069+iequidoo@users.noreply.github.com>
@Hocuri Hocuri force-pushed the hoc/pgp-contacts-message-migration branch from c5ce09e to 4ad3333 Compare June 20, 2025 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants