New integration tests framework #324

jgmontoya · 2025-08-23T15:42:54Z

Refactored integration tests from a single monolithic file into a modular framework using scenario-based testing
Created a ScenarioRegistry system that organizes tests into separate modules for better maintainability.

Sample output:

Summary by CodeRabbit

Tests
- Added a comprehensive, scenario-based integration test framework with reusable test cases and summary reporting (account management, app settings, messaging, follows, groups, subscriptions). Updated the integration test runner to execute all scenarios. Feature-gated to avoid production impact.
Documentation
- New integration tests README. Corrected a typo in documentation.
Chores
- Enabled “all” features for editor tooling. Added a new Cargo feature flag for integration tests and updated tasks to use it.
Refactor
- Simplified the integration test entrypoint to a centralized runner.

…nitial tests

… with initial tests

…ging scenario

…cenario Also add cleanup step after scenario execution that calls logout to remove the stored secret and reset the db

…ription based metadata and relay updates Also refactors account_management login to be cleaner Adjust sleeps throughout the integrationt tests

coderabbitai · 2025-08-23T15:43:03Z

Warning

Rate limit exceeded

@jgmontoya has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 8 minutes and 47 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 2f4844a and b61e792.

📒 Files selected for processing (14)

.cursor/rules/integration-test.mdc (1 hunks)
.github/workflows/ci.yml (1 hunks)
Cargo.toml (1 hunks)
src/integration_tests/README.md (1 hunks)
src/integration_tests/scenarios/basic_messaging.rs (1 hunks)
src/integration_tests/scenarios/group_membership.rs (1 hunks)
src/integration_tests/scenarios/mod.rs (1 hunks)
src/integration_tests/test_cases/account_management/login.rs (1 hunks)
src/integration_tests/test_cases/metadata_management/fetch_metadata.rs (1 hunks)
src/integration_tests/test_cases/metadata_management/update_metadata.rs (1 hunks)
src/integration_tests/test_cases/shared/accept_group_invite.rs (1 hunks)
src/integration_tests/test_cases/shared/send_message.rs (1 hunks)
src/integration_tests/test_cases/subscription_processing/publish_metadata_update.rs (1 hunks)
src/integration_tests/test_cases/subscription_processing/publish_relay_list_update.rs (1 hunks)

Walkthrough

Introduces a feature-gated integration testing framework and refactors the integration test binary to delegate to a ScenarioRegistry. Adds core testing traits/utilities, multiple scenarios and test cases, and documentation. Adjusts configs (Cargo feature, VS Code), justfile recipe, and minor Whitenoise core changes (conditional DB visibility, unfollow behavior). Includes a small doc typo fix.

Changes

Cohort / File(s)	Summary
Tooling & Config `.cursor/rules/integration-test.mdc`, `.vscode/settings.json`, `Cargo.toml`, `justfile`	Fixes a doc typo; enables rust-analyzer features=all; adds Cargo feature `integration-tests` with empty default; updates `int-test` to run with `--features integration-tests`.
Integration Test Entrypoint & Wiring `src/bin/integration_test.rs`, `src/lib.rs`, `src/integration_tests/mod.rs`, `src/integration_tests/README.md`, `src/integration_tests/registry.rs`	Replaces inline flow with `ScenarioRegistry::run_all_scenarios`; gates `integration_tests` module behind `integration-tests` feature; adds integration tests module exports, documentation, and a registry to run and summarize scenarios.
Integration Test Core Harness `src/integration_tests/core/*`	Adds ScenarioContext (state, counters), ScenarioResult, Nostr test client helpers, and core traits (TestCase, Scenario) with standardized execution, timing, logging, and cleanup.
Scenarios `src/integration_tests/scenarios/*`	Adds scenarios for account management, app settings, basic and advanced messaging, follow management, group membership, metadata management, and subscription processing; exposes via module re-exports.
Test Cases — Shared `src/integration_tests/test_cases/shared/*`	Adds shared test cases: create accounts, create group, accept group invite, send message (with reactions/replies support).
Test Cases — Account Management `src/integration_tests/test_cases/account_management/*`	Adds login and logout account test cases; module re-exports.
Test Cases — App Settings `src/integration_tests/test_cases/app_settings/*`	Adds fetch app settings and update theme mode test cases; module re-exports.
Test Cases — Follow Management `src/integration_tests/test_cases/follow_management/*`	Adds follow, unfollow, and bulk follow test cases; module re-exports.
Test Cases — Group Membership `src/integration_tests/test_cases/group_membership/*`	Adds add/remove group members test cases with success/failure expectation handling; module re-exports.
Test Cases — Metadata Management `src/integration_tests/test_cases/metadata_management/*`	Adds fetch/update metadata test cases; module re-exports.
Test Cases — Subscription Processing `src/integration_tests/test_cases/subscription_processing/*`	Adds publish metadata update and relay list update test cases; module re-exports.
Whitenoise Core Adjustments `src/whitenoise/mod.rs`, `src/whitenoise/follows.rs`	Makes `Whitenoise.database` public only under `integration-tests` feature; changes unfollow to use `find_or_create_by_pubkey` instead of `find_user_by_pubkey`.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer
  participant Bin as integration_test (bin)
  participant Reg as ScenarioRegistry
  participant Sc as Scenario (N)
  participant TC as TestCase (M)
  participant WN as Whitenoise
  participant NR as Nostr Relays (dev)

  Dev->>Bin: cargo run --features integration-tests
  Bin->>Reg: run_all_scenarios(&Whitenoise)
  loop For each Scenario
    Reg->>Sc: new(&Whitenoise)
    Sc->>Sc: execute() [timed]
    activate Sc
    loop For each TestCase
      Sc->>TC: execute(context)
      activate TC
      TC->>WN: API calls (e.g., create/login/group/send)
      alt External publish needed
        TC->>NR: connect/publish (metadata/relays/messages)
        NR-->>TC: ack
      end
      WN-->>TC: results
      TC-->>Sc: Ok/Err (record_test)
      deactivate TC
    end
    Sc->>WN: cleanup (logout, clear DB)
    Sc-->>Reg: ScenarioResult (+optional error)
    deactivate Sc
    Reg->>Reg: brief delay
  end
  Reg-->>Bin: First error or Ok
  Bin-->>Dev: Summary logs (per-scenario + totals)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Clean up for flutter #321 — Both touch relay/user handling and Whitenoise internals; aligns with unfollow path and relay interactions used in tests.
Add common save and load methods for all tables #314 — Also adjusts Whitenoise/database modules and exports; related to feature-gated DB visibility and test access.
Remove whitenoise methods #322 — Refactors integration test binary interactions; closely related to delegating to ScenarioRegistry.

Suggested reviewers

erskingardner
delcin-raj

Poem

In burrows deep, I queue each test,
With whiskered logs and fluffy zest.
Scenarios hop, results accrue,
Carrots for green, few for the blue.
Feature flags twinkle, relays chime—
All paws report: it’s hop-time! 🥕🐇

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch refactor/integration-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 21

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/whitenoise/follows.rs (1)
39-43: Unfollow must not create users; revert to “find-only” semantics and short-circuit if missing.

Creating a user in the unfollow path contradicts the method docs (Lines 28–33) and can bloat the DB with phantom users. If the user doesn’t exist, the operation should be a no-op.

Apply this fix to avoid creating users during unfollow:
-        let (user, _) = User::find_or_create_by_pubkey(pubkey, &self.database).await?;
-        account.unfollow_user(&user, &self.database).await?;
-        self.background_publish_account_follow_list(account).await?;
-        Ok(())
+        // Do not create a user on unfollow; if the user doesn't exist, there's nothing to remove.
+        let user = match self.find_user_by_pubkey(pubkey).await {
+            Ok(u) => u,
+            Err(crate::whitenoise::error::WhitenoiseError::UserNotFound) => return Ok(()),
+            Err(e) => return Err(e),
+        };
+        account.unfollow_user(&user, &self.database).await?;
+        self.background_publish_account_follow_list(account).await?;
+        Ok(())
src/bin/integration_test.rs (1)
1-39: Add feature gating for the integration test binary and define the Justfile int-test target

The repository currently does not define an integration-tests feature in Cargo.toml, so the integration_test binary is always built. Likewise, no Justfile rule was found that exports WHITENOISE_JUST_INT_TEST or wraps the integration_test invocation. Please make the following changes:

• In Cargo.toml, add an integration-tests feature and gate the bin entry:
[features]
integration-tests = []

[[bin]]
name = "integration_test"
path = "src/bin/integration_test.rs"
required-features = ["integration-tests"]
• Ensure that any workspace or root Cargo.toml enables this feature in CI/instructions only when running integration tests (e.g. via --features integration-tests).

• In your Justfile, add (or update) an int-test target that:
Exports a guard variable (e.g. export WHITENOISE_JUST_INT_TEST=1)
Invokes the binary with the required --data-dir and --logs-dir arguments, e.g.:
int-test DATA_DIR=tests/data LOGS_DIR=tests/logs:
    export WHITENOISE_JUST_INT_TEST=1
    cargo run --features integration-tests --bin integration_test -- \
      --data-dir ${DATA_DIR} --logs-dir ${LOGS_DIR}
These changes will ensure the integration test suite only builds and runs when explicitly requested.

🧹 Nitpick comments (80)

src/integration_tests/README.md (3)
20-25: Align design rule wording with the leakage caveat.

The “No Cross-Contamination” rule is absolute, but earlier we acknowledge potential leakage. Adjust wording to be consistent.
-- **No Cross-Contamination**: No data sharing between scenarios
+- **No Cross-Contamination**: Avoid data sharing between scenarios; design for isolation and idempotency.
175-181: Explain why .execute should be used instead of .run inside scenarios.

Readers won’t know what .execute adds. Add a short note (e.g., metrics, error handling, context mutations).
-           // Note: Use .execute instead of .run inside a scenario.
+           // Note: Use .execute instead of .run inside a scenario:
+           // it wraps run() with standardized timing/metrics, error mapping,
+           // and consistent context/test counters updates.
61-73: Add a short “How to run” section and prerequisites (dev relays).

The framework README should tell newcomers how to execute scenarios and what local services are required. The default dev relay endpoints appear to be ws://localhost:8080 and ws://localhost:7777 (see core/context.rs).

Proposed addition (place after “Directory Structure”):
+## Running the Integration Tests
+
+- Use the Just recipe (recommended): `just int-test`  
+  This runs the `integration_test` binary with the `integration-tests` feature
+  and standardized data/log directories.
+- Prerequisites:
+  - Local dev relays available at `ws://localhost:8080` and `ws://localhost:7777`.
+    These defaults are encoded in `core/context.rs` and are required for scenarios to pass.
+- For CI or alternate setups, override `--data-dir` and `--logs-dir` as needed,
+  but keep paths isolated per run to avoid residue between scenarios.
If you’d like, I can open a small follow-up PR to add this section.
src/integration_tests/core/test_clients.rs (2)
5-16: Avoid flaky sleeps; prefer a readiness check (or delegate connectivity to NostrManager).

A fixed 300ms sleep after connect is brittle and can cause intermittent failures on slower networks or CI. Either:

Wait until at least one relay is connected (with a bounded timeout), or

Delegate publishing to NostrManager APIs which already ensure relay connectivity per prior learnings.

As a small, low-risk improvement, set the signer before connecting and slightly increase the buffer to reduce early publish races:
-    client.connect().await;
-    client.set_signer(keys).await;
-    tokio::time::sleep(Duration::from_millis(300)).await;
+    client.set_signer(keys).await;
+    client.connect().await;
+    tokio::time::sleep(Duration::from_millis(500)).await;
If you want, I can wire a helper like wait_for_any_connected(client, timeout) that polls relay statuses and returns early once ready.

36-68: Validate tag kinds against the SDK; consider dedicated constructors to reduce breakage.

The manual construction of tags for NIP-65 and the MLS relay kinds looks fine, but it couples to internal TagKind variants. If the SDK exposes helpers (e.g., Tag::relay(url) or specific constructors for NIP-65), prefer them to future-proof against TagKind changes.

Also, confirm that:

Single-letter tag uses lowercase 'r' (you did via SingleLetterTag::lowercase(Alphabet::R)).

Kind::InboxRelays and Kind::MlsKeyPackageRelays accept relay tags per the crate you depend on.

If helpers exist, I can propose a concrete refactor patch.
src/integration_tests/test_cases/account_management/mod.rs (1)
1-5: Avoid wildcard re-exports to keep the test API stable and explicit

Explicit re-exports reduce accidental symbol leakage and name collisions across test modules while maintaining ergonomics.

Apply this diff:
 pub mod login;
 pub mod logout_account;

-pub use login::*;
-pub use logout_account::*;
+pub use login::LoginTestCase;
+pub use logout_account::LogoutAccountTestCase;
src/integration_tests/test_cases/subscription_processing/mod.rs (1)
1-5: Use explicit re-exports instead of * for clarity and to avoid symbol drift

Keeps your public test surface intentional and easier to grep.
 pub mod publish_metadata_update;
 pub mod publish_relay_list_update;

-pub use publish_metadata_update::*;
-pub use publish_relay_list_update::*;
+pub use publish_metadata_update::PublishMetadataUpdateTestCase;
+pub use publish_relay_list_update::PublishRelayListUpdateTestCase;
src/integration_tests/test_cases/metadata_management/mod.rs (1)
1-5: Prefer explicit over wildcard re-exports for maintainability

Prevents accidental exposure if new items are added to submodules later.
 pub mod fetch_metadata;
 pub mod update_metadata;

-pub use fetch_metadata::*;
-pub use update_metadata::*;
+pub use fetch_metadata::FetchMetadataTestCase;
+pub use update_metadata::UpdateMetadataTestCase;
src/integration_tests/test_cases/advanced_messaging/mod.rs (2)
4-5: Prefer explicit re-exports to avoid namespace bleed.

Re-exporting only the intended test case types keeps the surface area stable and avoids accidental symbol leakage if submodules add helpers later.

Apply this diff:
-pub use aggregate_messages::*;
-pub use delete_message::*;
+pub use self::aggregate_messages::AggregateMessagesTestCase;
+pub use self::delete_message::DeleteMessageTestCase;
1-5: Add brief module docs for discoverability.

A short summary helps scan the registry and scenario wiring.

Apply this diff:
+/// Advanced messaging integration test cases (aggregate, delete, …).
 pub mod aggregate_messages;
 pub mod delete_message;

-pub use aggregate_messages::*;
-pub use delete_message::*;
+pub use self::aggregate_messages::AggregateMessagesTestCase;
+pub use self::delete_message::DeleteMessageTestCase;
src/integration_tests/test_cases/app_settings/mod.rs (2)
4-5: Export concrete types instead of wildcard.

Keeps the module API intentional and resilient to future additions.

Apply this diff:
-pub use fetch_app_settings::*;
-pub use update_theme_mode::*;
+pub use self::fetch_app_settings::FetchAppSettingsTestCase;
+pub use self::update_theme_mode::UpdateThemeModeTestCase;
1-5: Add a concise module-level doc.

Clarifies the purpose when navigating tests.

Apply this diff:
+/// App settings integration test cases (fetch and theme updates).
 pub mod fetch_app_settings;
 pub mod update_theme_mode;

-pub use fetch_app_settings::*;
-pub use update_theme_mode::*;
+pub use self::fetch_app_settings::FetchAppSettingsTestCase;
+pub use self::update_theme_mode::UpdateThemeModeTestCase;
src/integration_tests/test_cases/metadata_management/fetch_metadata.rs (2)
1-1: Avoid glob import from core.

Import only what’s used to reduce chances of name collisions and speed up IDE tooling.

Apply this diff:
-use crate::integration_tests::core::*;
+use crate::integration_tests::core::{ScenarioContext, TestCase};
25-31: Guard against empty names, not just missing Option.

If name is present but empty/whitespace, the current assertion passes. Slightly tighten it to catch empty values too.

Apply this diff:
-        assert!(
-            metadata.name.is_some(),
-            "Metadata name is missing for account {}",
-            self.account_name
-        );
-        tracing::info!("✓ Metadata name is present: {:?}", metadata.name);
+        let non_empty_name = metadata
+            .name
+            .as_deref()
+            .map(str::trim)
+            .filter(|s| !s.is_empty());
+        assert!(
+            non_empty_name.is_some(),
+            "Metadata name is missing or empty for account {}",
+            self.account_name
+        );
+        tracing::info!("✓ Metadata name is present: {:?}", non_empty_name);
src/integration_tests/test_cases/follow_management/mod.rs (2)
5-7: Use explicit re-exports for clarity.

Prevents accidental exposure of helpers added later in submodules.

Apply this diff:
-pub use bulk_follow_users::*;
-pub use follow_user::*;
-pub use unfollow_user::*;
+pub use self::bulk_follow_users::BulkFollowUsersTestCase;
+pub use self::follow_user::FollowUserTestCase;
+pub use self::unfollow_user::UnfollowUserTestCase;
1-7: Add a small doc comment.

Improves scan-ability in the registry.

Apply this diff:
+/// Follow-management integration test cases (follow, bulk follow, unfollow).
 pub mod bulk_follow_users;
 pub mod follow_user;
 pub mod unfollow_user;

-pub use bulk_follow_users::*;
-pub use follow_user::*;
-pub use unfollow_user::*;
+pub use self::bulk_follow_users::BulkFollowUsersTestCase;
+pub use self::follow_user::FollowUserTestCase;
+pub use self::unfollow_user::UnfollowUserTestCase;
src/integration_tests/core/mod.rs (1)
6-9: Consider a dedicated prelude to scope imports explicitly.

If downstream modules routinely use crate::integration_tests::core::*, offering core::prelude::* can keep star-imports restricted while leaving core itself explicit. Optional; the current approach is fine for a test-only surface.

Example addition outside the selected lines:
// in this file, after existing re-exports
pub mod prelude {
    pub use super::{context::*, traits::*, test_clients::*, scenario_result::*};
}
src/integration_tests/test_cases/shared/mod.rs (1)
6-9: Prefer explicit re-exports to prevent future name collisions

Wildcard re-exports from multiple child modules can introduce accidental symbol collisions as the suite grows. Export only the intended test-case types.

Apply this diff to re-export only the known test case types (adjust if the type names differ):
-pub use accept_group_invite::*;
-pub use create_accounts::*;
-pub use create_group::*;
-pub use send_message::*;
+pub use accept_group_invite::AcceptGroupInviteTestCase;
+pub use create_accounts::CreateAccountsTestCase;
+pub use create_group::CreateGroupTestCase;
+pub use send_message::SendMessageTestCase;
src/integration_tests/scenarios/app_settings.rs (2)
23-29: Default expectation may be brittle without a clean data dir

You assert ThemeMode::System on first fetch. That’s fine if the DB is empty/new, but will fail if a prior run persisted a different theme. Ensure the scenario always runs against a clean data directory, or make the test resilient by explicitly setting System first.

If you want to make this scenario self-sufficient, consider setting the theme to System before asserting:
     // Test fetching default settings
-    FetchAppSettingsTestCase::basic()
-        .expect_theme(ThemeMode::System)
-        .execute(&mut self.context)
-        .await?;
+    UpdateThemeModeTestCase::to_system()
+        .execute(&mut self.context)
+        .await?;
+    FetchAppSettingsTestCase::basic()
+        .expect_theme(ThemeMode::System)
+        .execute(&mut self.context)
+        .await?;
31-43: Assert post-conditions after each update for tighter feedback

If UpdateThemeModeTestCase internally asserts persistence, this is fine. If not, add a follow-up fetch to verify state, which makes failures easier to locate in logs.

Example follow-up checks after each update:
         // Test updating to dark mode
         UpdateThemeModeTestCase::to_dark()
             .execute(&mut self.context)
             .await?;
+        FetchAppSettingsTestCase::basic()
+            .expect_theme(ThemeMode::Dark)
+            .execute(&mut self.context)
+            .await?;

         // Test updating to light mode
         UpdateThemeModeTestCase::to_light()
             .execute(&mut self.context)
             .await?;
+        FetchAppSettingsTestCase::basic()
+            .expect_theme(ThemeMode::Light)
+            .execute(&mut self.context)
+            .await?;

         // Test updating back to system mode
         UpdateThemeModeTestCase::to_system()
             .execute(&mut self.context)
             .await?;
+        FetchAppSettingsTestCase::basic()
+            .expect_theme(ThemeMode::System)
+            .execute(&mut self.context)
+            .await?;
src/bin/integration_test.rs (2)
7-8: Conditionally import ScenarioRegistry with the same feature gate

Prevents unused-import/compile issues when the feature is off.
-use integration_tests::registry::ScenarioRegistry;
+#[cfg(feature = "integration-tests")]
+use integration_tests::registry::ScenarioRegistry;
34-36: Log the failure reason before propagating to process exit

You currently bubble the error with ?. Logging it explicitly yields clearer CI output, while preserving non-zero exit.
-    ScenarioRegistry::run_all_scenarios(whitenoise).await?;
+    if let Err(err) = ScenarioRegistry::run_all_scenarios(whitenoise).await {
+        tracing::error!("Integration scenarios failed: {err}");
+        return Err(err);
+    }
src/integration_tests/test_cases/follow_management/bulk_follow_users.rs (2)
36-50: Reduce flakiness: replace fixed sleep with a bounded wait loop

A fixed 100ms sleep may be insufficient under load and can cause occasional false negatives. Poll with a timeout to await eventual consistency.
-        // Add small delay for async operations
-        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
-
-        for pubkey in &self.target_pubkeys {
-            let is_following = context
-                .whitenoise
-                .is_following_user(account, pubkey)
-                .await?;
-            assert!(
-                is_following,
-                "Account {} should be following user {} after bulk follow",
-                self.follower_account_name,
-                &pubkey.to_hex()[..8]
-            );
-        }
+        for pubkey in &self.target_pubkeys {
+            let res = tokio::time::timeout(
+                tokio::time::Duration::from_secs(5),
+                async {
+                    loop {
+                        let is_following = context
+                            .whitenoise
+                            .is_following_user(account, pubkey)
+                            .await?;
+                        if is_following {
+                            return Ok::<(), WhitenoiseError>(());
+                        }
+                        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+                    }
+                },
+            )
+            .await;
+
+            match res {
+                Ok(Ok(())) => {}
+                Ok(Err(e)) => return Err(e),
+                Err(_) => panic!(
+                    "Account {} should be following user {} after bulk follow (timed out)",
+                    self.follower_account_name,
+                    &pubkey.to_hex()[..8]
+                ),
+            }
+        }
44-49: Safer short-hex formatting (avoid potential slice panics)

Hex strings should be long enough, but string slicing by byte index is brittle. Prefer a helper that takes the first 8 chars.
-                "Account {} should be following user {} after bulk follow",
-                self.follower_account_name,
-                &pubkey.to_hex()[..8]
+                "Account {} should be following user {} after bulk follow",
+                self.follower_account_name,
+                short_hex(pubkey)
Add this helper anywhere in the module:
fn short_hex(pk: &PublicKey) -> String {
    pk.to_hex().chars().take(8).collect()
}
src/integration_tests/scenarios/basic_messaging.rs (1)
24-33: Reduce stringly-typed drift by centralizing names

Optional: define constants for the creator, member, and group names to reduce typo risk and make reuse consistent across steps.
@@
 impl BasicMessagingScenario {
     pub fn new(whitenoise: &'static Whitenoise) -> Self {
         Self {
             context: ScenarioContext::new(whitenoise),
         }
     }
 }
 
+const CREATOR: &str = "basic_msg_creator";
+const MEMBER: &str = "basic_msg_member";
+const GROUP: &str = "basic_messaging_test_group";
@@
-        CreateAccountsTestCase::with_names(vec!["basic_msg_creator", "basic_msg_member"])
+        CreateAccountsTestCase::with_names(vec![CREATOR, MEMBER])
             .execute(&mut self.context)
             .await?;
@@
-        CreateGroupTestCase::basic()
-            .with_name("basic_messaging_test_group")
-            .with_members("basic_msg_creator", vec!["basic_msg_member"])
+        CreateGroupTestCase::basic()
+            .with_name(GROUP)
+            .with_members(CREATOR, vec![MEMBER])
             .execute(&mut self.context)
             .await?;
@@
-        AcceptGroupInviteTestCase::new("basic_msg_member")
+        AcceptGroupInviteTestCase::new(MEMBER)
             .execute(&mut self.context)
             .await?;
@@
-        SendMessageTestCase::basic()
-            .with_sender("basic_msg_creator")
-            .with_group("basic_messaging_test_group")
+        SendMessageTestCase::basic()
+            .with_sender(CREATOR)
+            .with_group(GROUP)
             .with_message_id_key("basic_msg_initial")
             .execute(&mut self.context)
             .await?;
@@
-        SendMessageTestCase::basic()
-            .with_sender("basic_msg_creator")
+        SendMessageTestCase::basic()
+            .with_sender(CREATOR)
             .into_reaction("👍", &basic_message_id)
-            .with_group("basic_messaging_test_group")
+            .with_group(GROUP)
             .execute(&mut self.context)
             .await?;
@@
-        SendMessageTestCase::basic()
-            .with_sender("basic_msg_member")
+        SendMessageTestCase::basic()
+            .with_sender(MEMBER)
             .into_reply("Great message!", &basic_message_id)
-            .with_group("basic_messaging_test_group")
+            .with_group(GROUP)
             .execute(&mut self.context)
             .await?;
Also applies to: 39-63
src/integration_tests/scenarios/subscription_processing.rs (1)
7-8: Prefer explicit imports over prelude::*

Only Metadata appears required here. Explicit imports reduce namespace surprises in tests and speed up IDE xref.
-use nostr_sdk::prelude::*;
+use nostr_sdk::Metadata;
src/integration_tests/test_cases/app_settings/fetch_app_settings.rs (1)
24-35: Optionally persist fetched settings in context for later steps

If other test cases need the same settings, consider recording them in ScenarioContext to avoid refetching and to aid debugging.

Example (if ScenarioContext has a key-value store):
context.record_test("app_settings.theme_mode", format!("{:?}", settings.theme_mode));
src/integration_tests/test_cases/subscription_processing/publish_metadata_update.rs (1)

63-65: Make verification resilient: replace fixed sleep with poll-until or timeout

A fixed 500ms may flake under load. Poll for the expected change with a sensible timeout.

For example, loop with a short sleep (200–300ms) until account.metadata(...).await?.name == self.updated_metadata.name or a 5–10s timeout, then return a descriptive error.
src/integration_tests/scenarios/metadata_management.rs (1)
28-45: Optional: avoid magic strings for the account name

Define a constant for "metadata_user" to reduce duplication and typo risk.
@@
 impl MetadataManagementScenario {
     pub fn new(whitenoise: &'static Whitenoise) -> Self {
         Self {
             context: ScenarioContext::new(whitenoise),
         }
     }
 }
 
+const ACCOUNT: &str = "metadata_user";
@@
-        CreateAccountsTestCase::with_names(vec!["metadata_user"])
+        CreateAccountsTestCase::with_names(vec![ACCOUNT])
             .execute(&mut self.context)
             .await?;
@@
-        FetchMetadataTestCase::for_account("metadata_user")
+        FetchMetadataTestCase::for_account(ACCOUNT)
             .execute(&mut self.context)
             .await?;
@@
-        UpdateMetadataTestCase::for_account("metadata_user")
+        UpdateMetadataTestCase::for_account(ACCOUNT)
             .with_name("Test User")
             .with_about("A test user for metadata testing")
             .with_website("https://example.com")
             .with_nip05("test@example.com")
             .with_picture("https://example.com/avatar.jpg")
             .execute(&mut self.context)
             .await?;
src/integration_tests/scenarios/follow_management.rs (4)
14-18: Avoid over-constraining the lifetime of Whitenoise in Scenario constructors

Requiring a 'static reference makes composition and test harnessing harder (e.g., sharing a runner-scoped instance or using DI/mocks). If not mandated by ScenarioContext, prefer a non-'static borrow.

Apply if feasible:
-    pub fn new(whitenoise: &'static Whitenoise) -> Self {
+    pub fn new(whitenoise: &Whitenoise) -> Self {
         Self {
             context: ScenarioContext::new(whitenoise),
         }
     }
33-37: Seed/derive test keys for determinism

Keys::generate() is non-deterministic. For reproducible CI and easier debugging, consider deriving keys from fixed seeds or a scenario-level RNG (if supported), so reruns produce identical logs and states.

Would you like me to propose a small helper (e.g., test_key(seed: u64)) that constructs Keys from a fixed secret?

59-69: Comment says “error handling”, but test asserts graceful no-op; align intent

The current steps assume both repeat-follow and unfollow-nonexistent should succeed without error. The comments imply error handling checks but do not assert specific error types or messages.

If the API contract is “idempotent and no-op”, consider clarifying the comments:
-        // Test error handling: try to follow the same user again (should succeed without error)
+        // Idempotence: following the same user again should be a no-op and not error

-        // Test error handling: try to unfollow a non-existent follow relationship
+        // Idempotence: unfollowing a non-existent relationship should be a no-op and not error
If the API is expected to error instead, add explicit negative assertions.

54-56: Optional: Assert idempotence of bulk follow

You include a key already followed (test_contact2) in bulk_contacts. To verify true idempotence, consider checking no duplicates are introduced (e.g., via a “following list” or count API), not only boolean following.

If Whitenoise exposes following_count or a list getter, I can draft the assertion here.
src/integration_tests/test_cases/follow_management/follow_user.rs (2)
36-48: Replace fixed sleeps with an “eventually” check to reduce flakiness

A hardcoded 100ms delay risks CI flakes on slower machines. Poll with a short interval and a reasonable timeout.

Apply:
-        // Add small delay for async operations
-        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
-
-        let is_following = context
-            .whitenoise
-            .is_following_user(account, &self.target_pubkey)
-            .await?;
-        assert!(
-            is_following,
-            "Account {} should be following user {}",
-            self.follower_account_name,
-            &self.target_pubkey.to_hex()[..8]
-        );
+        // Eventually-consistent check with timeout
+        let deadline = tokio::time::Instant::now() + tokio::time::Duration::from_secs(3);
+        let mut is_following = false;
+        while tokio::time::Instant::now() < deadline {
+            is_following = context
+                .whitenoise
+                .is_following_user(account, &self.target_pubkey)
+                .await?;
+            if is_following {
+                break;
+            }
+            tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        }
+        assert!(
+            is_following,
+            "Account {} should be following user {} (timed out waiting for propagation)",
+            self.follower_account_name,
+            &self.target_pubkey.to_hex()[..8]
+        );
I can also factor this into a small “eventually” helper in core for reuse.

22-26: Minor: avoid indexing panics on short hex strings

&self.target_pubkey.to_hex()[..8] assumes length ≥ 8. It likely holds, but a defensive pattern avoids potential panics and repeated conversions.

Option: compute once and clamp length.
-        tracing::info!(
-            "Following user {} from account: {}",
-            &self.target_pubkey.to_hex()[..8],
-            self.follower_account_name
-        );
+        let hex = self.target_pubkey.to_hex();
+        let hex8 = &hex[..hex.len().min(8)];
+        tracing::info!("Following user {} from account: {}", hex8, self.follower_account_name);
...
-        tracing::info!(
-            "✓ Account {} is now following user {}",
-            self.follower_account_name,
-            &self.target_pubkey.to_hex()[..8]
-        );
+        tracing::info!("✓ Account {} is now following user {}", self.follower_account_name, hex8);
Alternatively, extract a shared short_key(&PublicKey) -> String utility.

Also applies to: 50-54
src/integration_tests/test_cases/app_settings/update_theme_mode.rs (2)
9-27: Deduplicate constructors; provide a single new(ThemeMode) and delegate

Three near-identical ctors add maintenance overhead. Introduce new(mode) and have helpers call it. No behavior change.

Apply:
 impl UpdateThemeModeTestCase {
-    pub fn to_dark() -> Self {
-        Self {
-            theme_mode: ThemeMode::Dark,
-        }
-    }
+    pub fn new(theme_mode: ThemeMode) -> Self {
+        Self { theme_mode }
+    }
 
-    pub fn to_light() -> Self {
-        Self {
-            theme_mode: ThemeMode::Light,
-        }
-    }
+    pub fn to_dark() -> Self { Self::new(ThemeMode::Dark) }
 
-    pub fn to_system() -> Self {
-        Self {
-            theme_mode: ThemeMode::System,
-        }
-    }
+    pub fn to_light() -> Self { Self::new(ThemeMode::Light) }
+    pub fn to_system() -> Self { Self::new(ThemeMode::System) }
 }
33-36: Potentially unnecessary clone of ThemeMode

If ThemeMode is a small Copy enum, cloning is unnecessary; pass by value. If it’s not Copy, consider changing update_theme_mode to accept &ThemeMode.

Possible change:
-            .update_theme_mode(self.theme_mode.clone())
+            .update_theme_mode(self.theme_mode)
If this fails to compile, we can either derive Copy for ThemeMode or adjust the API to take a reference.
src/integration_tests/test_cases/follow_management/unfollow_user.rs (2)
36-48: Replace fixed sleeps with an “eventually” check to reduce flakiness

Same rationale as follow_user: poll until the unfollow state is observed or time out.

Apply:
-        // Add small delay for async operations
-        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
-
-        let is_following = context
-            .whitenoise
-            .is_following_user(account, &self.target_pubkey)
-            .await?;
-        assert!(
-            !is_following,
-            "Account {} should not be following user {} after unfollow",
-            self.follower_account_name,
-            &self.target_pubkey.to_hex()[..8]
-        );
+        // Eventually-consistent check with timeout
+        let deadline = tokio::time::Instant::now() + tokio::time::Duration::from_secs(3);
+        let mut is_still_following = true;
+        while tokio::time::Instant::now() < deadline {
+            is_still_following = context
+                .whitenoise
+                .is_following_user(account, &self.target_pubkey)
+                .await?;
+            if !is_still_following {
+                break;
+            }
+            tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        }
+        assert!(
+            !is_still_following,
+            "Account {} should not be following user {} after unfollow (timed out waiting for propagation)",
+            self.follower_account_name,
+            &self.target_pubkey.to_hex()[..8]
+        );
I can centralize this pattern in core if you prefer a reusable helper.

22-26: Minor: avoid indexing panics on short hex strings

Same nit as in follow_user; consider clamping or a helper for 8-char key formatting.

Want me to add a short_key util in integration_tests::core and update all call sites?

Also applies to: 50-54
src/integration_tests/test_cases/shared/create_accounts.rs (3)
10-14: Make with_names more ergonomic with IntoIterator

Allow callers to pass any iterable of string-likes, not only Vec<&str>.

Apply:
-    pub fn with_names(account_names: Vec<&str>) -> Self {
-        Self {
-            account_names: account_names.iter().map(|s| s.to_string()).collect(),
-        }
-    }
+    pub fn with_names<I, S>(account_names: I) -> Self
+    where
+        I: IntoIterator<Item = S>,
+        S: Into<String>,
+    {
+        Self {
+            account_names: account_names.into_iter().map(Into::into).collect(),
+        }
+    }
This is source-compatible with existing Vec<&str> call sites.

30-32: Account count assertion may be brittle under parallel execution

If scenarios ever run concurrently or other cases create accounts in between, final_count == initial + N can flake. If the runner guarantees serial execution, this is fine.

Alternative (safer) assertion:
-        assert_eq!(final_count, initial_count + self.account_names.len());
+        assert!(
+            final_count >= initial_count + self.account_names.len(),
+            "expected at least {} accounts after creation, got {}",
+            initial_count + self.account_names.len(),
+            final_count
+        );
Or, capture the created account pubkeys and assert their presence in the post-state, if an accounts listing API exists.

24-28: Optional: set human-readable labels if supported

You store the name only in ScenarioContext. If Whitenoise supports labeling/naming identities, consider setting it here to improve downstream UX/logging.

I can wire a set_account_label(account, name) call if such API exists.
src/integration_tests/test_cases/advanced_messaging/aggregate_messages.rs (3)
33-41: Reduce flakiness: poll with timeout instead of fixed sleep before fetching aggregation.

A fixed 1s sleep can be flaky under load or on CI. Polling with a bounded timeout makes the test more deterministic and faster on green paths.

Apply this diff to replace the sleep + single fetch with a short polling loop:
-        // Wait for message processing
-        tokio::time::sleep(tokio::time::Duration::from_millis(1000)).await;
-
-        // Fetch aggregated messages
-        let aggregated_messages = context
-            .whitenoise
-            .fetch_aggregated_messages_for_group(&account.pubkey, &group.mls_group_id)
-            .await?;
+        // Wait for processing with bounded polling to reduce flakiness
+        let mut aggregated_messages = Vec::new();
+        let deadline = tokio::time::Instant::now() + tokio::time::Duration::from_secs(3);
+        loop {
+            aggregated_messages = context
+                .whitenoise
+                .fetch_aggregated_messages_for_group(&account.pubkey, &group.mls_group_id)
+                .await?;
+            if aggregated_messages.len() >= self.expected_min_messages
+                || tokio::time::Instant::now() >= deadline
+            {
+                break;
+            }
+            tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
+        }
64-71: Avoid potential panic from string slicing; prefer safe substring for hex prefix.

Using [&str][..8] is safe for ASCII hex, but a safe slice avoids surprises if underlying formatting changes. This is a nit.
-            tracing::debug!(
-                "Message [{}]: '{}' from {} (deleted: {}, reply: {}, reactions: {})",
-                message.id,
-                message.content,
-                &message.author.to_hex()[..8],
-                message.is_deleted,
-                message.is_reply,
-                message.reactions.user_reactions.len()
-            );
+            let author_hex = message.author.to_hex();
+            let author_short = author_hex.get(0..8).unwrap_or(&author_hex);
+            tracing::debug!(
+                "Message [{}]: '{}' from {} (deleted: {}, reply: {}, reactions: {})",
+                message.id,
+                message.content,
+                author_short,
+                message.is_deleted,
+                message.is_reply,
+                message.reactions.user_reactions.len()
+            );
48-55: Assertion looks good; consider enriching failure output if this flakes.

The assert is fine. If you see intermittent failures, you could print the IDs or last few messages to aid triage. No action needed now.
src/integration_tests/test_cases/advanced_messaging/delete_message.rs (2)
37-39: Prefer typed tag construction over parse for the e-tag.

Using Tag::custom keeps this consistent with your other test case (relay list update) and avoids parse-time errors.
-        let delete_tags = vec![Tag::parse(vec!["e", target_message_id]).map_err(|e| {
-            WhitenoiseError::Other(anyhow::anyhow!("Failed to create e-tag: {}", e))
-        })?];
+        let delete_tags = vec![Tag::custom(
+            TagKind::SingleLetter(SingleLetterTag::lowercase(Alphabet::E)),
+            [target_message_id.to_string()],
+        )];
47-49: Minor: use String::new() for empty content.

Purely stylistic; both are fine.
-                "".to_string(), // Empty content for deletion event
+                String::new(), // Empty content for deletion event
src/integration_tests/test_cases/shared/create_group.rs (2)
53-60: Avoid unwrap on RelayUrl parsing to keep failures descriptive.

Unwrap is acceptable in tests, but mapping to WhitenoiseError will give cleaner errors if the URL is misconfigured in CI.

Possible adjustment:
-                NostrGroupConfigData::new(
+                NostrGroupConfigData::new(
                     self.group_name.clone(),
                     self.group_description.clone(),
                     None,
                     None,
-                    vec![RelayUrl::parse("ws://localhost:8080").unwrap()],
+                    vec![
+                        RelayUrl::parse("ws://localhost:8080")
+                            .map_err(|e| WhitenoiseError::Other(anyhow::anyhow!("Invalid dev relay URL: {}", e)))?
+                    ],
                 ),
64-66: Reduce flakiness: replace fixed 100ms sleep with a short readiness poll.

Group creation + MLS sync can take longer on loaded CI runners. Polling for group visibility (with a 3–5s cap) is more robust.

I can propose a small poll that checks whitenoise.groups(creator, true) until the created mls_group_id appears. Want me to draft it?
src/integration_tests/test_cases/account_management/logout_account.rs (3)
35-41: Also assert the specific account is gone from storage.

You already check the count. Verifying that the exact pubkey was removed tightens the guarantee and catches edge cases where an unintended account was removed.
         // Verify the account count decreased
         let final_count = context.whitenoise.get_accounts_count().await?;
         assert_eq!(
             final_count,
             initial_count - 1,
             "Account count should decrease by 1 after logout"
         );
+
+        // Ensure the logged-out account is no longer present in the database
+        let final_accounts_db = context.whitenoise.all_accounts().await?;
+        assert!(
+            final_accounts_db.iter().all(|a| a.pubkey != account.pubkey),
+            "Logged-out account should not be present in database"
+        );
47-54: Avoid reaching into context.accounts directly.

Accessing the field couples tests to internal representation. Prefer a helper like context.has_account(name) or reuse get_account + is_ok() if available.

If you want, I can add a tiny helper on ScenarioContext and update call sites.

64-76: Helpful logs; consider including names of remaining accounts always.

Both branches log well. Even when no expectations are provided, logging the remaining account names (not just the count) can aid debugging.
src/integration_tests/test_cases/subscription_processing/publish_relay_list_update.rs (4)
43-45: Propagate add_relay errors instead of panicking.

Unwraps in integration tests can obscure the root cause on CI. Propagate as WhitenoiseError for clearer failures.
-        for relay in &dev_relay_urls {
-            test_client.add_relay(relay.clone()).await.unwrap();
-        }
+        for relay in &dev_relay_urls {
+            test_client
+                .add_relay(relay.clone())
+                .await
+                .map_err(|e| WhitenoiseError::Other(anyhow::anyhow!("Failed to add dev relay: {}", e)))?;
+        }
58-61: Avoid unwrap on event publish; surface SDK error details.
-        test_client
-            .send_event_builder(EventBuilder::new(Kind::RelayList, "").tags(nip65_update_tags))
-            .await
-            .unwrap();
+        test_client
+            .send_event_builder(EventBuilder::new(Kind::RelayList, "").tags(nip65_update_tags))
+            .await
+            .map_err(|e| WhitenoiseError::Other(anyhow::anyhow!("Failed to publish relay list update: {}", e)))?;
81-83: Parse the expected relay URL without unwrap to keep diagnostics crisp.
-        let expected_relay = RelayUrl::parse(&self.new_relay_url).unwrap();
+        let expected_relay = RelayUrl::parse(&self.new_relay_url)
+            .map_err(|e| WhitenoiseError::Other(anyhow::anyhow!("Invalid relay URL '{}': {}", self.new_relay_url, e)))?;
50-51: Optional: wait for connection readiness instead of fixed 200ms sleep.

If the SDK exposes relay connection status, polling readiness reduces flakiness. Otherwise, consider a slightly longer sleep or retry on verification before failing.

Would you like me to draft a small readiness poll (e.g., retry verification for up to ~2s) to replace the fixed 200ms sleep?
src/integration_tests/scenarios/group_membership.rs (2)
41-44: Minor readability nit: avoid an extra block just to clone group_id.

Inline to reduce scope noise.
-        let group_id = {
-            let test_group = self.context.get_group("group_membership_test_group")?;
-            test_group.mls_group_id.clone()
-        };
+        let group_id = self
+            .context
+            .get_group("group_membership_test_group")?
+            .mls_group_id
+            .clone();
50-75: Add explicit membership assertions after add/remove to catch regressions early.

Right now we rely on the test case internals. Adding post-conditions here (e.g., verifying the group composition from the admin’s perspective) will make failures more localized and easier to diagnose.

If there is a helper like whitenoise.group_members(group_id) or an equivalent accessor in the context/test utilities, consider asserting that:

after Line 50 add: member2 is present in the group

after Line 72 add: member2 is not present in the group

If you want, I can draft the concrete calls once you confirm the available API for reading group membership.
src/integration_tests/scenarios/account_management.rs (1)

35-38: Optional: assert metadata propagation to strengthen the login path.

Since metadata is published (Line 36), consider asserting the account’s fetched metadata matches what was published to guard against relay filtering or propagation flakiness.

If there’s an accessor like account.metadata(&whitenoise).await?, add an equality check for name/about after login.
src/integration_tests/test_cases/account_management/login.rs (1)
23-28: Add a builder to customize relays per test.

Useful when running against different dev environments or when you want to test multiple relay sets.
     pub fn with_metadata(mut self, name: &str, about: &str) -> Self {
         self.metadata_name = Some(name.to_string());
         self.metadata_about = Some(about.to_string());
         self
     }
+
+    pub fn with_relays(mut self, relays: Vec<&'static str>) -> Self {
+        self.relays = relays;
+        self
+    }
src/integration_tests/registry.rs (1)
44-82: Nice summary output; consider including an overall pass/fail exit status in logs.

You already return the first error; adding a final “All scenarios passed”/“Failures detected” line helps CI log scanning.
         tracing::info!(
             "Test Cases: {} passed, {} failed",
             total_passed,
             total_failed
         );
 
         // Give async logging time to flush before program exits
         tokio::time::sleep(std::time::Duration::from_millis(100)).await;
+        if scenarios_failed == 0 {
+            tracing::info!("✅ All scenarios passed");
+        } else {
+            tracing::error!("❌ Failures detected in {} scenario(s)", scenarios_failed);
+        }
src/integration_tests/test_cases/shared/accept_group_invite.rs (1)
44-61: Optionally switch to a bounded retry when reading groups.

Same rationale as above; makes the test resilient to timing variance.
-        // Verify that the account now has access to groups
-        let groups = context.whitenoise.groups(account, true).await?;
+        // Verify that the account now has access to groups (with retry)
+        let mut groups = Vec::new();
+        let mut attempts = 0u8;
+        loop {
+            groups = context.whitenoise.groups(account, true).await?;
+            if welcome_count == 0 || !groups.is_empty() || attempts >= 10 {
+                break;
+            }
+            attempts += 1;
+            tokio::time::sleep(tokio::time::Duration::from_millis(200)).await;
+        }
src/integration_tests/scenarios/mod.rs (1)
1-17: Prefer restricting visibility and avoid glob re-exports to reduce surface area

Exposing all scenario internals via pub modules and pub use * can leak test-only APIs broadly across the crate and risks name collisions. Consider scoping to crate-level and avoiding * re-exports.

Suggested change (safe default without affecting internal users):
-pub mod account_management;
-pub mod advanced_messaging;
-pub mod app_settings;
-pub mod follow_management;
-pub mod group_membership;
-pub mod basic_messaging;
-pub mod metadata_management;
-pub mod subscription_processing;
+pub(crate) mod account_management;
+pub(crate) mod advanced_messaging;
+pub(crate) mod app_settings;
+pub(crate) mod follow_management;
+pub(crate) mod group_membership;
+pub(crate) mod basic_messaging;
+pub(crate) mod metadata_management;
+pub(crate) mod subscription_processing;

-pub use account_management::*;
-pub use advanced_messaging::*;
-pub use app_settings::*;
-pub use follow_management::*;
-pub use group_membership::*;
-pub use basic_messaging::*;
-pub use metadata_management::*;
-pub use subscription_processing::*;
+pub(crate) use account_management::*;
+pub(crate) use advanced_messaging::*;
+pub(crate) use app_settings::*;
+pub(crate) use follow_management::*;
+pub(crate) use group_membership::*;
+pub(crate) use basic_messaging::*;
+pub(crate) use metadata_management::*;
+pub(crate) use subscription_processing::*;
src/integration_tests/test_cases/group_membership/add_group_members.rs (1)

60-77: Align “expect_failure” semantics across add/remove cases or make outcome explicit

Here, a successful add in an “expect_failure” case hard-fails. In the removal test (sister file), a successful remove under “expect_failure” is treated as OK. Consider standardizing behavior or making the expected outcome explicit (e.g., ExpectOutcome::{MustFail, MaySucceedOrFail}) to avoid confusion.

If the differing behavior is intentional (e.g., removing non-existent members may be a no-op), add a short comment to document the rationale.

src/integration_tests/scenarios/advanced_messaging.rs (2)

26-141: Extract common names and keys into constants to avoid stringly-typed mistakes

Repeated literals like "advanced_messaging_group", "adv_msg_sender", keys like "reply_target_message" are error-prone. Centralize as constants or a small typed wrapper.

84-86: Consider a short poll before the final aggregation check

Given multiple async operations (reactions, replies, deletes), a bounded poll before AggregateMessagesTestCase can reduce flakiness under slow backends.

Also applies to: 128-132

src/integration_tests/test_cases/group_membership/remove_group_members.rs (1)

60-78: Clarify “expect_failure” contract vs. add-members

Here, both Ok and Err are accepted in “expect_failure” mode (with a note about non-existent members). That’s different from the add-members case. If intentional, explicitly document that remove can legally succeed as a no-op on non-members.
src/integration_tests/core/context.rs (3)
10-10: Rename messages_ids to message_ids for clarity and consistency

Minor naming nit that can cause confusion and typos across the suite.

Indicative change:
-    pub messages_ids: HashMap<String, String>,
+    pub message_ids: HashMap<String, String>,
@@
-    pub fn add_message_id(&mut self, name: &str, message_id: String) {
-        self.messages_ids.insert(name.to_string(), message_id);
+    pub fn add_message_id(&mut self, name: &str, message_id: String) {
+        self.message_ids.insert(name.to_string(), message_id);
     }
@@
-    pub fn get_message_id(&self, message_id: &str) -> Result<&String, WhitenoiseError> {
-        self.messages_ids.get(message_id).ok_or_else(|| {
+    pub fn get_message_id(&self, message_id: &str) -> Result<&String, WhitenoiseError> {
+        self.message_ids.get(message_id).ok_or_else(|| {
             WhitenoiseError::Configuration(format!(
                 "Message ID '{}' not found in context",
                 message_id
             ))
         })
     }
Note: This requires updating all call sites; I can generate a follow-up patch across the repo if useful.

Also applies to: 46-57

32-36: Include the missing key in “not found” errors for accounts and groups

AccountNotFound/GroupNotFound without the missing name hinders diagnosis. get_message_id already includes the key; mirror that behavior.

Example update:
-    pub fn get_account(&self, name: &str) -> Result<&Account, WhitenoiseError> {
-        self.accounts
-            .get(name)
-            .ok_or(WhitenoiseError::AccountNotFound)
-    }
+    pub fn get_account(&self, name: &str) -> Result<&Account, WhitenoiseError> {
+        self.accounts.get(name).ok_or_else(|| {
+            WhitenoiseError::Configuration(format!("Account '{}' not found in context", name))
+        })
+    }

-    pub fn get_group(&self, name: &str) -> Result<&Group, WhitenoiseError> {
-        self.groups.get(name).ok_or(WhitenoiseError::GroupNotFound)
-    }
+    pub fn get_group(&self, name: &str) -> Result<&Group, WhitenoiseError> {
+        self.groups.get(name).ok_or_else(|| {
+            WhitenoiseError::Configuration(format!("Group '{}' not found in context", name))
+        })
+    }
If the WhitenoiseError variants accept payloads, pass the names there instead.

Also applies to: 42-44, 50-57

16-25: Allow overriding dev relays via environment to support CI and remote setups

Hard-coded localhost relays are fine for dev, but reading from env (with these defaults) improves portability.

Sketch:
let relays = std::env::var("WN_DEV_RELAYS")
    .ok()
    .map(|s| s.split(',').map(|s| s.trim().to_string()).collect::<Vec<_>>())
    .unwrap_or_else(|| vec!["ws://localhost:8080".into(), "ws://localhost:7777".into()]);
// store as Vec<String> instead of Vec<&'static str>
This change would alter the field type; do only if acceptable within the feature-gated test harness.
src/integration_tests/test_cases/metadata_management/update_metadata.rs (3)
18-22: Decouple name and display_name setters to avoid accidental coupling

Setting both fields in with_name removes the ability to set a distinct display name later without calling another builder. Recommend adding a dedicated with_display_name and keeping with_name scoped to name only.

Apply this minimal change:
-    pub fn with_name(mut self, name: &str) -> Self {
-        self.metadata.name = Some(name.to_string());
-        self.metadata.display_name = Some(name.to_string());
-        self
-    }
+    pub fn with_name(mut self, name: &str) -> Self {
+        self.metadata.name = Some(name.to_string());
+        self
+    }
+
+    pub fn with_display_name(mut self, display_name: &str) -> Self {
+        self.metadata.display_name = Some(display_name.to_string());
+        self
+    }
55-81: Make verification resilient to eventual consistency with a short retry loop

If metadata propagation isn’t strictly synchronous (e.g., write + read-through caches), a single immediate read may flake. Consider a bounded retry with small backoff.
-        // Verify the update worked
-        let updated_metadata = account.metadata(&context.whitenoise).await?;
+        // Verify the update worked (tolerate brief eventual consistency)
+        let mut attempts = 0u8;
+        let updated_metadata = loop {
+            let md = account.metadata(&context.whitenoise).await?;
+            let matches =
+                md.name == self.metadata.name
+                && md.display_name == self.metadata.display_name
+                && md.about == self.metadata.about
+                && md.picture == self.metadata.picture
+                && md.website == self.metadata.website
+                && md.nip05 == self.metadata.nip05;
+            if matches || attempts >= 5 {
+                break md;
+            }
+            attempts += 1;
+            tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
+        };
77-80: Nit: use standard “NIP-05” casing in assertion message

Minor polish for consistency with the spec/name.
-            "Nip05 was not updated correctly"
+            "NIP-05 was not updated correctly"
src/integration_tests/test_cases/shared/send_message.rs (3)
84-88: Strengthen assertions to catch kind/content mismatches and wrong target group

Optionally verify more of the returned message to catch regressions.
-        assert_eq!(result.message.content, self.message_content);
+        assert_eq!(result.message.content, self.message_content, "Message content mismatch");
+        assert_eq!(result.message.kind, self.message_kind, "Message kind mismatch");
+        assert_eq!(result.message.group_id, group.mls_group_id, "Message delivered to unexpected group");
15-25: Ergonomics: expose with_kind and with_tags for flexibility

Nice-to-have: additional builders avoid ad-hoc variants later.

Add alongside existing builders:
pub fn with_kind(mut self, kind: u16) -> Self {
    self.message_kind = kind;
    self
}

pub fn with_tags(mut self, tags: Vec<Tag>) -> Self {
    self.tags = Some(tags);
    self
}
70-82: Validate and restrict allowed Nostr kinds in send_message_to_group

The send_message_to_group API currently accepts any u16 for the event kind and relies on nostr_mls.create_message to error at runtime. To avoid flaky failures when unsupported kinds are passed, please:
Inspect and update the function signature in src/whitenoise/messages.rs:31–38
pub async fn send_message_to_group(
    &self,
    account: &Account,
    group_id: &GroupId,
    message: String,
    kind: u16,        // ← consider tightening this
    tags: Option<Vec<Tag>>,
) -> Result<MessageWithTokens> { … }
– Replace the raw u16 with the nostr::Kind enum or a custom MessageKind newtype that only exposes supported variants (e.g., TextNote (1), EventDeletion (5), Reaction (7), Custom(9) for MLS chat).
– Add an early validation check to return a clear error if an unsupported kind is passed.
Add tests for invalid kind values in both unit tests (e.g., new send_message_to_group cases) and integration tests (src/integration_tests/test_cases/shared/send_message.rs) to assert failure rather than rely on downstream errors.
Callsite example in shared/send_message.rs:
let result = context
    .whitenoise
    .send_message_to_group(
        sender,
        &group.mls_group_id,
        self.message_content.clone(),
        self.message_kind, // ensure this is one of the allowed kinds
        self.tags.clone(),
    )
    .await?;
src/integration_tests/core/traits.rs (4)
10-14: Record test outcome is good; consider guarding against panics so one test doesn’t abort the scenario

Right now, a panic inside run will bypass record_test and unwind the whole scenario. Optional: wrap run with panic catching to convert panics into WhitenoiseError and mark the test failed.

If acceptable to add a lightweight dependency, one approach is to use futures::FutureExt::catch_unwind():
use futures::FutureExt;
use std::panic::AssertUnwindSafe;

async fn execute(&self, context: &mut ScenarioContext) -> Result<(), WhitenoiseError> {
    let result = AssertUnwindSafe(self.run(context)).catch_unwind().await
        .map_err(|_| WhitenoiseError::Other(anyhow::anyhow!("Test panicked"))))?;
    context.record_test(result.is_ok());
    result
}
If you prefer not to add futures, you can tokio::spawn the future and inspect JoinError::is_panic().

61-68: Intentional: cleanup failures don’t flip scenario success — verify this matches reporting requirements

Current behavior logs cleanup errors but still returns a successful ScenarioResult. If CI should fail on cleanup issues (e.g., DB not wiped), consider surfacing this as a failed scenario or a separate “post-run” status.

Option (keep success, attach note):
-                if cleanup_result.is_err() {
-                    tracing::error!(
-                        "✗ {} Scenario cleanup failed: {}",
-                        scenario_name,
-                        cleanup_result.err().unwrap()
-                    );
-                }
+                if let Err(e) = cleanup_result {
+                    tracing::error!("✗ {} Scenario cleanup failed: {}", scenario_name, e);
+                    // Optionally: return a failed result instead
+                    // return (ScenarioResult::failed(scenario_name, tests_run, tests_passed, duration), Some(e));
+                }
6-8: Confirm async_trait Send bounds are appropriate; relax if you need !Send scenarios

async_trait imposes Send by default on returned futures. If any Scenario/TestCase holds non-Send state, annotate with #[async_trait(?Send)] to avoid compile-time friction in integration builds.
-#[async_trait]
+#[async_trait(?Send)]
 pub trait TestCase {
And similarly for Scenario.

92-111: Cleanup policy: full DB wipe per scenario is heavy; consider a configurable toggle

Deleting all data is robust but can be slow for large datasets or multi-scenario runs. Consider a configuration flag/env to skip wipes locally while keeping it on in CI.

I can wire a simple INTEGRATION_CLEANUP=full|logout|none switch in the framework if you’d like.

.cursor/rules/integration-test.mdc

.vscode/settings.json

Cargo.toml

src/bin/integration_test.rs

src/integration_tests/core/scenario_result.rs

src/integration_tests/test_cases/subscription_processing/publish_metadata_update.rs

src/whitenoise/mod.rs

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore(.cursor/rules/integration-test): fix typo * refactor(integration_tests): create integration test framework with initial tests * test(integration): add app_settings and metadata_management scenarios * test(integration/account_management): add logout test * test(integration/follow_management): add scenario * fixup! refactor(integration_tests): create integration test framework with initial tests * test(integration/group_membership): add Group Membership Scenario * refactor(integration_tests): rename messaging scenario to basic messaging scenario * test(integration/advanced_messaging): add AdvancedMessagingScenario scenario Also add cleanup step after scenario execution that calls logout to remove the stored secret and reset the db * test(integration/subscription_processing): add scenario to test subscription based metadata and relay updates Also refactors account_management login to be cleaner Adjust sleeps throughout the integrationt tests * test(integration): fix tests by using updated methods * docs(integration_tests): update readme * refactor(integration_tests): registry: reorder summary info * fixup! test(integration): fix tests by using updated methods * chore(ci): use --features integration-tests to run integration tests * Update .cursor/rules/integration-test.mdc Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * Update src/integration_tests/README.md Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * fix: CodeRabbit comments --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

jgmontoya added 13 commits August 23, 2025 18:07

chore(.cursor/rules/integration-test): fix typo

7283ec4

refactor(integration_tests): create integration test framework with i…

6187f3c

…nitial tests

test(integration): add app_settings and metadata_management scenarios

be1a223

test(integration/account_management): add logout test

d273c23

test(integration/follow_management): add scenario

01b5e79

fixup! refactor(integration_tests): create integration test framework…

6307f9f

… with initial tests

test(integration/group_membership): add Group Membership Scenario

3c8b482

refactor(integration_tests): rename messaging scenario to basic messa…

2f3e55b

…ging scenario

test(integration/advanced_messaging): add AdvancedMessagingScenario s…

61f2964

…cenario Also add cleanup step after scenario execution that calls logout to remove the stored secret and reset the db

test(integration/subscription_processing): add scenario to test subsc…

555ee28

…ription based metadata and relay updates Also refactors account_management login to be cleaner Adjust sleeps throughout the integrationt tests

test(integration): fix tests by using updated methods

d745bbc

docs(integration_tests): update readme

776e95f

refactor(integration_tests): registry: reorder summary info

2f4844a

jgmontoya marked this pull request as ready for review August 23, 2025 15:44

fixup! test(integration): fix tests by using updated methods

cac9b05

coderabbitai bot reviewed Aug 23, 2025

View reviewed changes

jgmontoya and others added 3 commits August 23, 2025 18:56

chore(ci): use --features integration-tests to run integration tests

1951bc9

Update .cursor/rules/integration-test.mdc

d4a7add

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update src/integration_tests/README.md

6b136dc

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot mentioned this pull request Aug 23, 2025

Replace fixed sleeps with polling loops and exponential backoff in integration tests #325

Closed

jgmontoya force-pushed the refactor/integration-tests branch from 404c2a0 to 54688a8 Compare August 23, 2025 16:30

fix: CodeRabbit comments

b61e792

jgmontoya force-pushed the refactor/integration-tests branch from 54688a8 to b61e792 Compare August 23, 2025 16:35

jgmontoya requested review from delcin-raj and erskingardner August 23, 2025 16:42

erskingardner approved these changes Aug 24, 2025

View reviewed changes

erskingardner merged commit 14fe58c into master Aug 24, 2025
4 checks passed

erskingardner deleted the refactor/integration-tests branch August 24, 2025 05:57

This was referenced Aug 25, 2025

Remove nostr dependency and fix re-exports #328

Merged

Throw away invalid JSON in data migration #330

Merged

Create event publish and processor tracking #329

Merged

coderabbitai bot mentioned this pull request Oct 2, 2025

Update rust edition to 2024 #379

Merged

coderabbitai bot mentioned this pull request Oct 23, 2025

Feat/chat media in aggregator #392

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

New integration tests framework #324

New integration tests framework #324

Uh oh!

jgmontoya commented Aug 23, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 23, 2025 •

edited

Loading

Rate limit exceeded

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

New integration tests framework #324

New integration tests framework #324

Uh oh!

Conversation

jgmontoya commented Aug 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jgmontoya commented Aug 23, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 23, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)