
Conversation


@erskingardner erskingardner commented Aug 11, 2025

Summary by CodeRabbit

  • New Features

    • Link users to relays with explicit relay types and created/updated timestamps for relationship history.
  • Bug Fixes

    • Enforce unique relay URLs to prevent duplicate relays.
    • Preserve follower lists and user profiles during migration to avoid inconsistencies.
  • Chores

    • Consolidated and migrated users, follows, accounts, and relays into a unified schema and backfilled data from contacts/accounts.
    • Updated tests and local relay endpoints to reflect the migrated schema.


coderabbitai bot commented Aug 11, 2025

Walkthrough

Renamed schema tables (users_new→users, account_followers→account_follows), added user_relays, made relays.url UNIQUE, updated/added updated_at triggers, and added a six-step data migration (0006) to populate users, relays, user_relays, and accounts_new. Rust SQL and tests updated to use the new table names.

Changes

Cohort / File(s) — Summary

  • Schema & triggers (db_migrations/0005_new_tables.sql): Renamed users_new→users and account_followers→account_follows; added account_follows.created_at; created the user_relays table; added a UNIQUE constraint on relays.url; renamed/added updated_at triggers for users, account_follows, and user_relays.
  • Data migration (db_migrations/0006_data_migration.sql): New six-step migration: add a unique index on user_relays(user_id, relay_id, relay_type); migrate contacts→users; extract relay URLs from contacts and accounts JSON arrays into relays; populate user_relays for nip65/inbox/key_package types from contacts and accounts; migrate accounts→accounts_new with timestamp normalization and deduplication.
  • Rust DB — accounts (src/whitenoise/database/accounts_new.rs): Updated SQL queries and test scaffolding to reference account_follows and users instead of account_followers and users_new; no public API changes.
  • Rust DB — users (src/whitenoise/database/users.rs): Switched queries and tests from users_new→users; preserved behavior, decoding, and API surfaces.
  • Tests — relays (src/whitenoise/relays.rs): Test relay URLs updated to local non-TLS WebSocket endpoints (ws://localhost:7777, ws://localhost:8080); no production logic changed.

Sequence Diagram(s)

sequenceDiagram
  actor MigrationRunner
  participant SQLite
  MigrationRunner->>SQLite: INSERT INTO users (pubkey,metadata) SELECT DISTINCT pubkey, json(...) FROM contacts WHERE NOT EXISTS(users.pubkey)
  MigrationRunner->>SQLite: INSERT OR IGNORE INTO relays(url) SELECT DISTINCT url FROM json_each(contacts.*)
  MigrationRunner->>SQLite: INSERT OR IGNORE INTO user_relays(user_id,relay_id,relay_type) SELECT users.id, relays.id, 'nip65' FROM contacts↔users↔relays
  MigrationRunner->>SQLite: INSERT OR IGNORE INTO relays(url) SELECT DISTINCT url FROM json_each(accounts.*)
  MigrationRunner->>SQLite: INSERT OR IGNORE INTO user_relays(...) SELECT users.id, relays.id, relay_type FROM accounts↔users↔relays
  MigrationRunner->>SQLite: INSERT INTO accounts_new(pubkey,user_id,settings,last_synced_at) SELECT ..., datetime(last_synced,'unixepoch') FROM accounts WHERE NOT EXISTS(...)
sequenceDiagram
  actor App
  participant DB as SQLite
  App->>DB: SELECT u.*, f.* FROM users u JOIN account_follows f ON f.user_id = u.id WHERE f.account_id = ?
  DB-->>App: rows -> mapped User objects

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Suggested reviewers

  • jgmontoya
  • delcin-raj

Poem

I nibble roots and hop through rows,
New tables sprout where quiet code grows.
Follows link and relays hum,
Timestamps tick — migrations done.
I thump a paw, the schema glows. 🐇✨


Contributor

@jgmontoya jgmontoya left a comment


Looks good, just the account relays missing


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (2)
db_migrations/0006_data_migration.sql (2)

14-22: Normalize relay URLs to avoid false duplicates and empty/whitespace entries.

Optional but recommended: trim whitespace during extraction and filter on TRIM(url) != ''. Consider lowercasing scheme/host if appropriate for your domain.

Example diff for one block (apply similarly to others):

-SELECT DISTINCT
-    json_extract(relay_value.value, '$') as url
+SELECT DISTINCT
+    TRIM(json_extract(relay_value.value, '$')) AS url
 ...
-  AND json_extract(relay_value.value, '$') IS NOT NULL
-  AND json_extract(relay_value.value, '$') != '';
+  AND json_extract(relay_value.value, '$') IS NOT NULL
+  AND TRIM(json_extract(relay_value.value, '$')) != '';

If you adopt trimming here, make sure the JOINs to relays in Steps 3 and 5 also use the same TRIM() so lookups match the normalized relays.

Also applies to: 24-31, 34-41, 88-96, 98-105, 108-115
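The normalized extraction can be sanity-checked against a scratch in-memory SQLite database. A minimal sketch with made-up relay values (the contacts table here is a stand-in, not the real schema; element values are read directly from json_each rather than via json_extract):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (nip65_relays TEXT);
-- One padded duplicate, one whitespace-only entry, two real URLs
INSERT INTO contacts VALUES ('[" ws://a ", "ws://a", "   ", "ws://b"]');
""")

# TRIM during extraction collapses padded duplicates and drops blank entries
urls = [u for (u,) in conn.execute("""
SELECT DISTINCT TRIM(relay_value.value) AS url
FROM contacts, json_each(contacts.nip65_relays) AS relay_value
WHERE json_valid(contacts.nip65_relays)
  AND relay_value.value IS NOT NULL
  AND TRIM(relay_value.value) != ''
ORDER BY url
""")]
```

Without the TRIM, " ws://a " and "ws://a" would insert as two distinct relay rows.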


1-1: Consider wrapping the migration in a transaction.

To ensure atomicity and speed, wrap with BEGIN IMMEDIATE; ... COMMIT; unless your migration runner already does.

BEGIN IMMEDIATE;
-- migration statements...
COMMIT;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 421bc08 and 0cc6c40.

📒 Files selected for processing (1)
  • db_migrations/0006_data_migration.sql (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: check (ubuntu-latest, native)
🔇 Additional comments (1)
db_migrations/0006_data_migration.sql (1)

167-169: Verify last_synced units (seconds vs milliseconds) before using datetime(..., 'unixepoch').

If last_synced is milliseconds, datetime(..., 'unixepoch') will read it as seconds and produce far-future or NULL (out-of-range) dates. If it’s seconds, you’re good. Confirm the units and divide (e.g., last_synced/1000) if needed.

Run these quick checks in your DB to infer typical ranges:

-- Inspect a sample of values
SELECT MIN(last_synced), MAX(last_synced)
FROM accounts
WHERE last_synced IS NOT NULL;

-- Compare converted dates
SELECT last_synced,
       datetime(last_synced, 'unixepoch') AS as_seconds,
       datetime(last_synced/1000, 'unixepoch') AS as_millis
FROM accounts
WHERE last_synced IS NOT NULL
LIMIT 20;

Comment on lines +5 to +8
CASE
WHEN metadata IS NOT NULL AND metadata != '' THEN json(metadata)
ELSE NULL
END as metadata

⚠️ Potential issue

Avoid breaking metadata during migration: only call json() on valid JSON.

json(metadata) raises a "malformed JSON" error on invalid input, which would abort the migration for rows whose previously populated metadata isn't valid JSON. Check json_valid first.

-    CASE
-        WHEN metadata IS NOT NULL AND metadata != '' THEN json(metadata)
-        ELSE NULL
-    END as metadata
+    CASE
+        WHEN json_valid(metadata) THEN json(metadata)
+        ELSE NULL
+    END AS metadata
🤖 Prompt for AI Agents
In db_migrations/0006_data_migration.sql around lines 5 to 8, the current CASE
calls json(metadata) unconditionally, which fails on invalid JSON and puts
previously populated non-JSON metadata at risk; update the condition to only
call json(metadata) when json_valid(metadata) is true and otherwise preserve the
original metadata (do not return NULL) so invalid/non-JSON strings remain
unchanged.
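A minimal sketch of the guarded CASE against an in-memory database (hypothetical table and rows; the real schema may differ): valid JSON is minified via json(), anything else is preserved instead of being passed to json().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (pubkey TEXT, metadata TEXT);
INSERT INTO contacts VALUES
  ('pk1', '{"name": "alice"}'),  -- valid JSON: minified by json()
  ('pk2', 'plain text note'),    -- not JSON: preserved as-is
  ('pk3', NULL);                 -- NULL stays NULL
""")

rows = conn.execute("""
SELECT pubkey,
       CASE
           WHEN json_valid(metadata) THEN json(metadata)
           WHEN metadata IS NOT NULL AND metadata != '' THEN metadata
           ELSE NULL
       END AS metadata
FROM contacts
ORDER BY pubkey
""").fetchall()
```

Calling json('plain text note') directly would instead raise a malformed-JSON error.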

Comment on lines +52 to +54
JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
json_each(c.nip65_relays) as relay_value
WHERE json_valid(c.nip65_relays)

⚠️ Potential issue

Fix JOIN order: ON clause references alias not yet in scope (will error in SQLite).

In each block, the JOIN relays r ... ON ... references relay_value before json_each(...) relay_value is introduced (appears after a comma). In SQLite, the ON clause can only reference tables to its left. Reorder to introduce json_each(...) first (via CROSS JOIN), then join relays.

Apply these diffs:

--- Step 3: nip65_relays
-FROM contacts c
-JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.nip65_relays) as relay_value
+FROM contacts c
+JOIN users u ON u.pubkey = c.pubkey
+CROSS JOIN json_each(c.nip65_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.nip65_relays)
--- Step 3: inbox_relays
-FROM contacts c
-JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.inbox_relays) as relay_value
+FROM contacts c
+JOIN users u ON u.pubkey = c.pubkey
+CROSS JOIN json_each(c.inbox_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.inbox_relays)
--- Step 3: key_package_relays
-FROM contacts c
-JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.key_package_relays) as relay_value
+FROM contacts c
+JOIN users u ON u.pubkey = c.pubkey
+CROSS JOIN json_each(c.key_package_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.key_package_relays)

Also applies to: 66-68, 80-82

🤖 Prompt for AI Agents
In db_migrations/0006_data_migration.sql around lines 52-54 (also apply same
change at 66-68 and 80-82), the JOIN ordering is incorrect because the ON clause
references the alias relay_value before json_each(... ) AS relay_value is
introduced; reorder the joins so json_each(c.nip65_relays) AS relay_value (as a
CROSS JOIN or comma-separated table placed to the left) is introduced first,
then JOIN relays r ON r.url = json_extract(relay_value.value, '$'); ensure
json_valid(c.nip65_relays) remains in the WHERE clause and update each affected
block accordingly.
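A minimal reproduction of the scoping issue and the reordered fix, using a toy schema and data (element values are compared directly here rather than via json_extract):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (pubkey TEXT, nip65_relays TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY, pubkey TEXT);
CREATE TABLE relays (id INTEGER PRIMARY KEY, url TEXT UNIQUE);
INSERT INTO contacts VALUES ('pk1', '["ws://localhost:7777"]');
INSERT INTO users (pubkey) VALUES ('pk1');
INSERT INTO relays (url) VALUES ('ws://localhost:7777');
""")

# Original ordering: the ON clause names relay_value before the comma
# introduces json_each(...) AS relay_value; recent SQLite versions reject
# this with "ON clause references tables to its right".
broken = """
SELECT u.id, r.id
FROM contacts c
JOIN users u ON u.pubkey = c.pubkey
JOIN relays r ON r.url = relay_value.value,
     json_each(c.nip65_relays) AS relay_value
WHERE json_valid(c.nip65_relays)
"""
try:
    conn.execute(broken).fetchall()
except sqlite3.OperationalError as e:
    print("rejected:", e)

# Reordered: json_each comes into scope before the relays join.
fixed = """
SELECT u.id, r.id
FROM contacts c
JOIN users u ON u.pubkey = c.pubkey
CROSS JOIN json_each(c.nip65_relays) AS relay_value
JOIN relays r ON r.url = relay_value.value
WHERE json_valid(c.nip65_relays)
"""
rows = conn.execute(fixed).fetchall()
```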

Comment on lines +126 to +128
JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
json_each(a.nip65_relays) as relay_value
WHERE json_valid(a.nip65_relays)

⚠️ Potential issue

Same JOIN scoping bug in Step 5; reorder to bring json_each into scope before joining relays.

These three blocks have the same issue; reorder as in Step 3.

Apply these diffs:

--- Step 5: accounts.nip65_relays
-FROM accounts a
-JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.nip65_relays) as relay_value
+FROM accounts a
+JOIN users u ON u.pubkey = a.pubkey
+CROSS JOIN json_each(a.nip65_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.nip65_relays)
--- Step 5: accounts.inbox_relays
-FROM accounts a
-JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.inbox_relays) as relay_value
+FROM accounts a
+JOIN users u ON u.pubkey = a.pubkey
+CROSS JOIN json_each(a.inbox_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.inbox_relays)
--- Step 5: accounts.key_package_relays
-FROM accounts a
-JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.key_package_relays) as relay_value
+FROM accounts a
+JOIN users u ON u.pubkey = a.pubkey
+CROSS JOIN json_each(a.key_package_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.key_package_relays)

Also applies to: 140-142, 154-156

Comment on lines +160 to +172
-- Step 6: Migrate accounts to accounts_new table
INSERT INTO accounts_new (pubkey, user_id, settings, last_synced_at)
SELECT
a.pubkey,
u.id as user_id,
a.settings,
CASE
WHEN a.last_synced IS NOT NULL THEN datetime(a.last_synced, 'unixepoch')
ELSE NULL
END as last_synced_at
FROM accounts a
JOIN users u ON u.pubkey = a.pubkey
WHERE NOT EXISTS (SELECT 1 FROM accounts_new WHERE accounts_new.pubkey = a.pubkey);

🛠️ Refactor suggestion

⚠️ Potential issue

Accounts with no matching user are dropped; insert missing users from accounts first.

Currently, accounts_new is populated by joining to users. If a pubkey exists in accounts but not in contacts (Step 1), its account will be skipped. Insert any missing users from accounts before Step 6.

Add this block right after Step 1 (before Step 2):

-- Step 1b: Ensure all account pubkeys exist in users
INSERT INTO users (pubkey, metadata)
SELECT DISTINCT a.pubkey, NULL
FROM accounts a
WHERE a.pubkey IS NOT NULL
  AND a.pubkey != ''
  AND NOT EXISTS (SELECT 1 FROM users u WHERE u.pubkey = a.pubkey);
🤖 Prompt for AI Agents
In db_migrations/0006_data_migration.sql around lines 160 to 172, accounts rows
with pubkeys that have no matching users are being skipped by the JOIN, so add a
pre-migration step right after Step 1 (before Step 2) that inserts any distinct,
non-null, non-empty pubkeys from the accounts table into users with metadata set
to NULL if they don't already exist; this ensures the subsequent INSERT INTO
accounts_new ... JOIN users will find a user row for every account pubkey and no
accounts are dropped.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/whitenoise/relays.rs (1)

964-967: Nit: reuse relay1 instead of re-parsing the same URL

Avoid duplicating parsing for the same string; clone relay1 for target_relay.

-        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
-        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
-        let target_relay = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
+        let target_relay = relay1.clone();
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0cc6c40 and d6c85b9.

📒 Files selected for processing (1)
  • src/whitenoise/relays.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: check (ubuntu-latest, native)
  • GitHub Check: check (macos-14, native)

Comment on lines +1043 to 1047
// Test relay URLs - use localhost relays that are available in test environment
let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
let target_relay = RelayUrl::parse("ws://localhost:7777").unwrap();


💡 Verification agent

🧩 Analysis chain

Nit: reuse relay1 instead of re-parsing the same URL + quick verification script

Same dedupe applies here. Also, consider verifying there are no remaining TLS endpoints in tests.

-        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
-        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
-        let target_relay = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
+        let target_relay = relay1.clone();

🏁 Script executed:

#!/bin/bash
# Find remaining TLS endpoints in code/tests
rg -n -S 'wss://|https://' || true

Length of output: 60277


Refactor duplicate parsing in relays.rs

Instead of parsing the same URL twice, clone relay1 for target_relay:

• File: src/whitenoise/relays.rs (around lines 1043–1047)

-        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
-        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
-        let target_relay = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
+        let target_relay = relay1.clone();

I ran a global search for wss:// and https:// and found many TLS URIs in parser, integration tests, examples, etc.—those are intentionally testing URL parsing and can remain as-is.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        // Test relay URLs - use localhost relays that are available in test environment
-        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
-        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
-        let target_relay = RelayUrl::parse("ws://localhost:7777").unwrap();
+        // Test relay URLs - use localhost relays that are available in test environment
+        let relay1 = RelayUrl::parse("ws://localhost:7777").unwrap();
+        let relay2 = RelayUrl::parse("ws://localhost:8080").unwrap();
+        let target_relay = relay1.clone();
🤖 Prompt for AI Agents
In src/whitenoise/relays.rs around lines 1043 to 1047, the test parses the same
URL twice — create relay1 then set target_relay by cloning relay1 instead of
calling RelayUrl::parse again; update target_relay = relay1.clone() (or
.to_owned() if preferred) to remove the duplicate parse and keep behavior
identical.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (8)
db_migrations/0006_data_migration.sql (8)

6-14: Do not pass non-JSON metadata to json(); only call it on valid JSON.

json(metadata) raises a "malformed JSON" error for invalid JSON; the current CASE would abort the migration on previously populated non-JSON metadata. Guard with json_valid and preserve the original text when it is not JSON.

Apply:

 INSERT INTO users (pubkey, metadata)
 SELECT
     pubkey,
-    CASE
-        WHEN metadata IS NOT NULL AND metadata != '' THEN json(metadata)
-        ELSE NULL
-    END as metadata
+    CASE
+        WHEN json_valid(metadata) THEN json(metadata)
+        WHEN metadata IS NOT NULL AND metadata != '' THEN metadata
+        ELSE NULL
+    END AS metadata
 FROM contacts
 WHERE NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);

165-176: Accounts without a contacts row will be dropped; seed missing users from accounts first.

If a.pubkey doesn’t exist in users (because it wasn’t in contacts Step 1), this JOIN drops the account. Insert any missing users from accounts before Step 2.

Add immediately after Step 1:

-- Step 1b: Ensure all account pubkeys exist in users
INSERT INTO users (pubkey, metadata)
SELECT DISTINCT a.pubkey, NULL
FROM accounts a
WHERE a.pubkey IS NOT NULL
  AND a.pubkey != ''
  AND NOT EXISTS (SELECT 1 FROM users u WHERE u.pubkey = a.pubkey);

63-74: Same join scoping fix needed for inbox_relays (contacts).

Apply:

 FROM contacts c
 JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.inbox_relays) as relay_value
+CROSS JOIN json_each(c.inbox_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.inbox_relays)

77-88: Same join scoping fix needed for key_package_relays (contacts).

Apply:

 FROM contacts c
 JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.key_package_relays) as relay_value
+CROSS JOIN json_each(c.key_package_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.key_package_relays)

137-148: Same join scoping fix needed for accounts.inbox_relays.

Apply:

 FROM accounts a
 JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.inbox_relays) as relay_value
+CROSS JOIN json_each(a.inbox_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.inbox_relays)

151-162: Same join scoping fix needed for accounts.key_package_relays.

Apply:

 FROM accounts a
 JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.key_package_relays) as relay_value
+CROSS JOIN json_each(a.key_package_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.key_package_relays)

49-60: SQLite join scoping bug: bring json_each into scope before joining relays.

ON clause references relay_value before it exists; reorder to CROSS JOIN json_each first, then JOIN relays.

Apply:

 FROM contacts c
 JOIN users u ON u.pubkey = c.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(c.nip65_relays) as relay_value
+CROSS JOIN json_each(c.nip65_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(c.nip65_relays)
   AND json_extract(relay_value.value, '$') IS NOT NULL
   AND json_extract(relay_value.value, '$') != '';

123-134: Same join scoping fix needed for accounts.nip65_relays.

Apply:

 FROM accounts a
 JOIN users u ON u.pubkey = a.pubkey
-JOIN relays r ON r.url = json_extract(relay_value.value, '$'),
-     json_each(a.nip65_relays) as relay_value
+CROSS JOIN json_each(a.nip65_relays) AS relay_value
+JOIN relays r ON r.url = json_extract(relay_value.value, '$')
 WHERE json_valid(a.nip65_relays)
🧹 Nitpick comments (2)
db_migrations/0006_data_migration.sql (2)

18-26: OK: deduped relay URL extraction from contacts.nip65_relays. Optional: filter to strings only.

Current checks avoid NULL/empty and rely on relays.url UNIQUE. Consider ensuring only string values are inserted to avoid numeric/boolean/object artifacts.

Apply:

 WHERE json_valid(contacts.nip65_relays)
   AND json_extract(relay_value.value, '$') IS NOT NULL
-  AND json_extract(relay_value.value, '$') != '';
+  AND json_extract(relay_value.value, '$') != ''
+  AND json_type(relay_value.value) = 'text';

Same optional filter applies to lines 33-35 and 43-45.


92-100: OK: accounts.nip65_relays URL extraction with OR IGNORE. Optional: ensure text values.

Same optional string-only filter as in contacts:

 WHERE json_valid(accounts.nip65_relays)
   AND json_extract(relay_value.value, '$') IS NOT NULL
-  AND json_extract(relay_value.value, '$') != '';
+  AND json_extract(relay_value.value, '$') != ''
+  AND json_type(relay_value.value) = 'text';

Apply similarly at 107-109 and 117-119.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6c85b9 and 8edbf68.

📒 Files selected for processing (1)
  • db_migrations/0006_data_migration.sql (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: check (macos-14, native)
  • GitHub Check: check (ubuntu-latest, native)
🔇 Additional comments (1)
db_migrations/0006_data_migration.sql (1)

1-3: Good: unique constraint up front to enforce idempotency and prevent dupes.

Creating idx_user_relays_unique before backfilling relationships is the right move, combined with OR IGNORE on inserts.
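The idempotency combination (unique index + INSERT OR IGNORE) can be illustrated with a scratch table (column types simplified):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_relays (user_id INTEGER, relay_id INTEGER, relay_type TEXT);
CREATE UNIQUE INDEX idx_user_relays_unique
    ON user_relays (user_id, relay_id, relay_type);
""")

stmt = "INSERT OR IGNORE INTO user_relays VALUES (1, 1, 'nip65')"
conn.execute(stmt)
conn.execute(stmt)  # re-running the migration is a no-op, not an error

count = conn.execute("SELECT COUNT(*) FROM user_relays").fetchone()[0]
```

Without OR IGNORE the second run would raise a UNIQUE constraint violation; without the index it would silently insert a duplicate row.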

Comment on lines +13 to +14
FROM contacts
WHERE NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);

⚠️ Potential issue

Avoid inserting users with NULL/empty pubkeys.

Rows with NULL/empty pubkey will pass the NOT EXISTS check and attempt to insert, likely violating NOT NULL/PK constraints.

Apply:

-FROM contacts
-WHERE NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);
+FROM contacts
+WHERE pubkey IS NOT NULL
+  AND pubkey != ''
+  AND NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-FROM contacts
-WHERE NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);
+FROM contacts
+WHERE pubkey IS NOT NULL
+  AND pubkey != ''
+  AND NOT EXISTS (SELECT 1 FROM users WHERE users.pubkey = contacts.pubkey);
🤖 Prompt for AI Agents
In db_migrations/0006_data_migration.sql around lines 13 to 14, the INSERT
selects from contacts and will attempt to insert rows where contacts.pubkey is
NULL or empty because the NOT EXISTS check alone doesn't exclude them; update
the WHERE clause to exclude NULL and empty (and optionally whitespace-only)
pubkeys before the NOT EXISTS check (e.g., add conditions like contacts.pubkey
IS NOT NULL AND contacts.pubkey <> '' or use TRIM(contacts.pubkey) <> '') so
only valid non-empty pubkeys are considered for insertion.

@erskingardner erskingardner merged commit a06e572 into master Aug 11, 2025
4 checks passed
@erskingardner erskingardner deleted the data-migration branch August 11, 2025 19:53