-
Notifications
You must be signed in to change notification settings - Fork 36
Fetch user metadata in background with TTL #401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughImplements a background-fetch model for user metadata and relay list management. Introduces TTL-based staleness checks for metadata, triggering background synchronization for new or stale users instead of immediate blocking updates. Integration tests are updated to use retry loops for polling updated user data before assertions. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Coverage: 83.95% → 84.02% (+0.07%) |
✅ Coverage: 83.95% → 84.04% (+0.09%) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/whitenoise/users.rs (2)
47-53: Avoid hammering relays for empty metadata profiles.Because
needs_metadata_refresh()returnstruewhenever metadata is still the default, every call tofind_or_create_user_by_pubkeyfor users who have never published metadata will kick off a fullbackground_fetch_user_data(relay sync + metadata fetch). That reintroduces the old “fetch on every lookup” behaviour and can quickly pile up network load for common empty profiles. Please consider respecting the TTL even for default metadata (e.g., only retry once the 24 h window has elapsed, or track the last attempted refresh) so we don’t spin up background jobs on every invocation. New users are already covered by thecreatedbranch.
600-603: Docstring no longer matches the async flow.The comment still promises that this method updates relays and metadata inline, but the work now happens in a detached background task. Could you update the docs to explain that the sync is deferred, so callers aren’t misled into assuming fresh metadata is immediately available?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs(2 hunks)src/whitenoise/users.rs(5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.cursor/rules/rust-code-style.mdc)
**/*.rs: Write all trait bounds in a where clause for functions and impl blocks
Prefer using Self when referring to the implementing type (including return types and enum variant matches)
For public types in libraries, derive Debug, Clone, Copy, PartialEq, Eq, and Hash (in this order) when possible
Derive Default only when a reasonable default value exists
Use full paths for tracing macros (tracing::!(...)) instead of importing macro names
Prefer .to_string() or .to_owned() for &str -> String conversion over .into() or String::from
Order imports: (1) core/alloc/std, (2) external crates, (3) current sub-modules (mod x;), (4) internal crate modules (use crate::, super::, self::)
When implementing fmt traits, import core::fmt and use fmt::Display, fmt::Formatter, etc.
After declaring mod x;, import sub-items via self::x::Y instead of x::Y
Avoid if let ... { } else { } in favor of match when both branches do work
Use if let ... { } when the alternative match arm is intentionally empty
Avoid inline sub-modules (mod x { ... }); prefer a separate file x.rs with mod x; (except tests and benches)
For tests, prefer inline #[cfg(test)] mod tests { ... } and avoid mod tests;
For benches, prefer inline #[cfg(bench)] mod benches { ... } and avoid mod benches;
Files:
src/whitenoise/users.rssrc/integration_tests/test_cases/user_discovery/find_or_create_user.rs
src/integration_tests/**
📄 CodeRabbit inference engine (.cursor/rules/mls-protocol-docs.mdc)
Add and maintain MLS integration tests under src/integration_tests
Files:
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs
🧠 Learnings (5)
📓 Common learnings
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 340
File: src/whitenoise/event_processor/mod.rs:121-158
Timestamp: 2025-09-04T16:58:00.736Z
Learning: Repository parres-hq/whitenoise: For retry scheduling hardening (schedule_retry in src/whitenoise/event_processor/mod.rs and backoff logic in RetryInfo in src/types.rs), the team prefers to land the improvements in a separate PR, tracked via a GitHub issue, as requested by jgmontoya in PR #340.
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 365
File: src/whitenoise/accounts.rs:296-298
Timestamp: 2025-09-14T19:36:21.619Z
Learning: Repository parres-hq/whitenoise: For the process_user_event_streams method in src/whitenoise/accounts.rs, the team prefers to address the concurrent stream processing improvement (processing streams concurrently with per-event error logging instead of sequential processing with early returns) in a separate PR, as requested by jgmontoya in PR #365.
📚 Learning: 2025-08-23T16:27:41.511Z
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 324
File: src/integration_tests/test_cases/subscription_processing/publish_metadata_update.rs:42-45
Timestamp: 2025-08-23T16:27:41.511Z
Learning: In Whitenoise integration tests, using `unwrap()` on `test_client.add_relay()` for dev relay connections is acceptable since these are controlled test environments where failing fast is preferred over complex error handling.
Applied to files:
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs
📚 Learning: 2025-08-23T16:23:47.066Z
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 324
File: src/integration_tests/test_cases/subscription_processing/publish_metadata_update.rs:35-39
Timestamp: 2025-08-23T16:23:47.066Z
Learning: In the Whitenoise integration test framework, using `unwrap()` on `RelayUrl::parse()` for the hardcoded dev_relays URLs (ws://localhost:8080 and ws://localhost:7777) is acceptable since these are known, controlled addresses in the test environment where failing fast is preferred.
Applied to files:
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs
📚 Learning: 2025-08-11T10:23:59.406Z
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 310
File: src/whitenoise/accounts/relays.rs:260-278
Timestamp: 2025-08-11T10:23:59.406Z
Learning: In `src/whitenoise/accounts/relays.rs`, the `publish_relay_list_for_account` method is only called after the account setup is complete, which guarantees that `account.nip65_relays` is already populated and will never be empty. Therefore, no fallback logic is needed when `target_relays` is None.
Applied to files:
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs
📚 Learning: 2025-08-18T07:22:47.108Z
Learnt from: jgmontoya
Repo: parres-hq/whitenoise PR: 318
File: src/whitenoise/accounts/messages.rs:83-91
Timestamp: 2025-08-18T07:22:47.108Z
Learning: In the Whitenoise codebase, `nostr_mls.get_relays()` returns a `std::collections::BTreeSet<RelayUrl>`, which means relay URLs are already deduplicated and don't need additional deduplication logic.
Applied to files:
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs
🧬 Code graph analysis (2)
src/whitenoise/users.rs (1)
src/whitenoise/database/users.rs (1)
find_or_create_by_pubkey(101-120)
src/integration_tests/test_cases/user_discovery/find_or_create_user.rs (3)
src/integration_tests/core/retry.rs (2)
retry_default(81-87)default(12-17)src/whitenoise/accounts.rs (1)
relays(98-106)src/whitenoise/database/users.rs (1)
relays(166-198)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (ubuntu-latest)
- GitHub Check: Security Audit
- GitHub Check: Test (macos-14)
dannym-arx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me
Note
Fetch user metadata and relays in the background, refreshing only when metadata is stale per a 24h TTL, and update tests to await background completion.
METADATA_TTL_HOURS(24h) andUser::needs_metadata_refresh()to determine staleness ofmetadata.Whitenoise::find_or_create_user_by_pubkey()to startbackground_fetch_user_data()for new users and for existing users with stale metadata; skip sync when fresh.background_fetch_user_data()spawns a task to update relay lists first, then sync metadata, and refresh global subscription.user_discovery/find_or_create_user.rs: wait/retry for background metadata and NIP-65 relay fetch completion before assertions.needs_metadata_refresh()(default, fresh, stale, boundary).Written by Cursor Bugbot for commit 7896259. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
New Features
Improvements