-
Notifications
You must be signed in to change notification settings - Fork 36
Feat/chat media in aggregator #392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ds from process_message
WalkthroughThis PR threads MediaFile support through tests and the aggregation pipeline: ScenarioContext now stores uploaded media, MediaFile is re-exported and can be queried by group, the message aggregator and processor accept media_files and attach media to ChatMessage, and retry-based unresolved handling plus max_retry_attempts were removed. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Integration Test
participant Ctx as ScenarioContext
participant MsgAPI as messages.rs
participant Agg as MessageAggregator
participant Proc as processor.rs
participant DB as Database
Test->>Ctx: add_media_file(name, file)
Ctx->>Ctx: store in media_files HashMap
Test->>MsgAPI: fetch_aggregated_messages_for_group(group)
MsgAPI->>DB: MediaFile::find_by_group(group)
DB-->>MsgAPI: Vec<MediaFile>
MsgAPI->>Agg: aggregate_messages_for_group(..., media_files)
Agg->>Proc: process_messages(..., media_files)
rect rgb(200,220,255)
Note over Proc: Pass 1 — process base messages, reactions, deletions\nbuild media_files_map, extract media hashes and attachments
Proc->>Proc: extract_media_hashes(imeta_tags)
Proc->>Proc: extract_media_attachments(from hashes via map)
Proc->>Proc: attach media_attachments to ChatMessage
end
rect rgb(220,200,255)
Note over Proc: Pass 2 — reprocess orphaned messages
end
Proc-->>Agg: Vec<ChatMessage with media_attachments>
Agg-->>MsgAPI: Vec<ChatMessage>
MsgAPI-->>Test: aggregated messages
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (6)
🚧 Files skipped from review as they are similar to previous changes (4)
🧰 Additional context used📓 Path-based instructions (2)**/*.rs📄 CodeRabbit inference engine (.cursor/rules/rust-code-style.mdc)
Files:
src/integration_tests/**📄 CodeRabbit inference engine (.cursor/rules/mls-protocol-docs.mdc)
Files:
🧬 Code graph analysis (2)src/integration_tests/test_cases/chat_media_upload/upload_chat_media.rs (1)
src/whitenoise/message_aggregator/processor.rs (4)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (9)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (21)
src/whitenoise/message_aggregator/types.rs (3)
6-6: Import ordering per guidelinesMove std first, then external crates, then internal crate modules. Currently external comes before std.
-use nostr_sdk::prelude::*; -use serde::{Deserialize, Serialize}; -use std::collections::HashMap; - -use crate::nostr_manager::parser::SerializableToken; +use std::collections::HashMap; +use nostr_sdk::prelude::*; +use serde::{Deserialize, Serialize}; +use crate::nostr_manager::parser::SerializableToken; use crate::whitenoise::media_files::MediaFile;As per coding guidelines.
49-56: Derive Eq where possible on reaction typesYou can likely add Eq to EmojiReaction, UserReaction, and ReactionSummary (Hash not advisable due to HashMap). This improves comparisons.
-#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)] pub struct EmojiReaction { ... } -#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)] +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)] pub struct UserReaction { ... } -#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)] +#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Default)] pub struct ReactionSummary { ... }Also applies to: 59-69, 72-82
85-86: Broaden derives for AggregatorConfig and order themBooleans are Copy + Hash; derive them and order as per guideline.
-#[derive(Debug, Clone, Eq, PartialEq)] +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] pub struct AggregatorConfig {As per coding guidelines.
src/integration_tests/test_cases/chat_media_upload/upload_chat_media.rs (1)
1-6: Reorder imports to follow project conventionPlace std first (none here), then external crates, then crate-local modules.
-use crate::WhitenoiseError; -use crate::integration_tests::core::*; -use async_trait::async_trait; +use async_trait::async_trait; use mdk_core::media_processing::MediaProcessingOptions; use nostr_sdk::Url; +use crate::WhitenoiseError; +use crate::integration_tests::core::*;As per coding guidelines.
src/integration_tests/scenarios/chat_media_upload.rs (1)
1-7: Import orderingExternal crates before crate-internal modules.
-use crate::integration_tests::{ - core::*, - test_cases::{chat_media_upload::*, shared::*}, -}; -use crate::{Whitenoise, WhitenoiseError}; use async_trait::async_trait; +use crate::integration_tests::{ + core::*, + test_cases::{chat_media_upload::*, shared::*}, +}; +use crate::{Whitenoise, WhitenoiseError};As per coding guidelines.
src/whitenoise/media_files.rs (1)
6-6: Public re-export LGTM; consider grouping internal usesRe-exporting MediaFile from database is convenient. Optionally group this pub use after external imports and alongside other crate uses for consistency.
src/integration_tests/test_cases/chat_media_upload/mod.rs (1)
1-1: Use self:: when re-exporting submodule itemsKeeps paths explicit post-mod declaration.
-pub mod send_message_with_media; +pub mod send_message_with_media; pub mod upload_chat_media; -pub use send_message_with_media::*; +pub use self::send_message_with_media::*; pub use upload_chat_media::*;As per coding guidelines.
Also applies to: 4-4
src/whitenoise/messages.rs (4)
1-13: Import orderingPlace external crates before internal crate modules.
-use crate::{ - types::MessageWithTokens, - whitenoise::{ - Whitenoise, - accounts::Account, - error::{Result, WhitenoiseError}, - media_files::MediaFile, - message_aggregator::ChatMessage, - }, -}; -use mdk_core::prelude::*; -use nostr_sdk::prelude::*; +use mdk_core::prelude::*; +use nostr_sdk::prelude::*; +use crate::{ + types::MessageWithTokens, + whitenoise::{ + Whitenoise, + accounts::Account, + error::{Result, WhitenoiseError}, + media_files::MediaFile, + message_aggregator::ChatMessage, + }, +};As per coding guidelines.
109-111: Fetching all media per group: watch scalabilityThis is fine initially. If groups accumulate many files, consider filtering by recent time window or by hashes referenced in messages to reduce payload.
119-125: Error mapping could preserve original typeWrapping aggregator errors in anyhow -> WhitenoiseError loses granularity. Consider adding a WhitenoiseError::Aggregation(ProcessingError) variant and using .map_err(WhitenoiseError::Aggregation).
128-134: Prefer &str over &String in parametersIdiomatic and more flexible.
- fn create_unsigned_nostr_event( + fn create_unsigned_nostr_event( &self, pubkey: &PublicKey, - message: &String, + message: &str, kind: u16, tags: Option<Vec<Tag>>, ) -> Result<(UnsignedEvent, EventId)> {src/whitenoise/database/media_files.rs (2)
143-156: Derive list: add Hash and keep guideline order.Per guidelines for public types, prefer derives in order: Debug, Clone, PartialEq, Eq, Hash, then serde traits. MediaFile can derive Hash (fields appear hashable). Recommend reordering and adding Hash.
Apply:
-#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)] +#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize)] pub struct MediaFile {As per coding guidelines.
296-321: Add deterministic ordering to group query results.find_by_group() returns rows without ORDER BY. Deterministic order simplifies callers and tests and avoids SQLite planner variance.
Apply:
- FROM media_files - WHERE mls_group_id = ?", + FROM media_files + WHERE mls_group_id = ? + ORDER BY created_at ASC, id ASC",Optionally document the order in the fn docs.
src/integration_tests/core/context.rs (1)
59-67: Helpful getters; consider mut accessor if needed later.add_media_file/get_media_file are fine. If future tests need to tweak stored metadata, add get_media_file_mut().
src/integration_tests/test_cases/chat_media_upload/send_message_with_media.rs (1)
80-87: Avoid fixed sleep; rely solely on retry/backoff.The initial 200ms sleep can be flaky across environments. Prefer starting retry immediately with slightly higher attempts or longer delay.
Example:
- tokio::time::sleep(tokio::time::Duration::from_millis(200)).await; - let aggregated_messages = retry( - 15, - std::time::Duration::from_millis(100), + let aggregated_messages = retry( + 20, + std::time::Duration::from_millis(150), || async { /* unchanged */ }, "fetch aggregated messages with media", )src/whitenoise/message_aggregator/reaction_handler.rs (3)
6-12: Reorder imports per project guidelines.Place std first, then external crates, then internal (super/crate). Keeps consistency across modules. As per coding guidelines.
Apply:
-use nostr_sdk::prelude::*; -use std::collections::HashMap; - -use super::emoji_utils; -use super::types::{AggregatorConfig, ChatMessage, EmojiReaction, ProcessingError, UserReaction}; -use mdk_core::prelude::message_types::Message; +use std::collections::HashMap; +use nostr_sdk::prelude::*; +use mdk_core::prelude::message_types::Message; + +use super::emoji_utils; +use super::types::{AggregatorConfig, ChatMessage, EmojiReaction, ProcessingError, UserReaction};
24-53: Use match instead of if-let with else doing work.Both branches perform work; match reads clearer and aligns with our style guide. As per coding guidelines.
- if let Some(target_message) = processed_messages.get_mut(&target_id) { + match processed_messages.get_mut(&target_id) { - add_reaction_to_message( + Some(target_message) => { + add_reaction_to_message( target_message, &message.pubkey, &reaction_emoji, message.created_at, - ); + ); - if config.enable_debug_logging { - tracing::debug!( - "Added reaction '{}' from {} to message {}", - reaction_emoji, - message.pubkey.to_hex(), - target_id - ); - } - Ok(()) - } else { - if config.enable_debug_logging { - tracing::warn!( - "Reaction {} references non-existent message {}", - message.id, - target_id - ); - } - Err(ProcessingError::Internal(format!( - "Reaction target {} not found", - target_id - ))) - } + if config.enable_debug_logging { + tracing::debug!( + "Added reaction '{}' from {} to message {}", + reaction_emoji, + message.pubkey.to_hex(), + target_id + ); + } + Ok(()) + } + None => { + if config.enable_debug_logging { + tracing::warn!( + "Reaction {} references non-existent message {}", + message.id, + target_id + ); + } + Err(ProcessingError::Internal(format!( + "Reaction target {} not found", + target_id + ))) + } + }
49-52: Prefer a dedicated error for missing target (not Internal).Internal suggests a system fault; missing target is a normal, retriable condition in Pass 2. Consider a specific variant (e.g., TargetNotFound/MissingTarget) to improve metrics and control flow.
src/whitenoise/message_aggregator/processor.rs (3)
6-13: Reorder imports per project guidelines.std → external crates → internal (super/crate). Keeps consistency.
-use nostr_sdk::prelude::*; -use std::collections::HashMap; - -use super::reaction_handler; -use super::types::{AggregatorConfig, ChatMessage, ProcessingError}; -use crate::nostr_manager::parser::Parser; -use crate::whitenoise::media_files::MediaFile; +use std::collections::HashMap; +use nostr_sdk::prelude::*; +use mdk_core::prelude::message_types::Message; + +use super::reaction_handler; +use super::types::{AggregatorConfig, ChatMessage, ProcessingError}; +use crate::nostr_manager::parser::Parser; +use crate::whitenoise::media_files::MediaFile; - use mdk_core::prelude::message_types::Message;
236-254: Optional: early-capacity and (secondary) dedup in attachments.If you keep dedup only in extract_media_hashes, this is fine. If you want defense-in-depth, add a seen set here too and pre-allocate capacity.
fn extract_media_attachments( tags: &Tags, media_files_map: &HashMap<String, MediaFile>, ) -> Vec<MediaFile> { - let media_hashes = extract_media_hashes(tags); - let mut media_attachments = Vec::new(); + let media_hashes = extract_media_hashes(tags); + let mut media_attachments = Vec::with_capacity(media_hashes.len()); + let mut seen = std::collections::HashSet::new(); for hash in media_hashes { - if let Some(media_file) = media_files_map.get(&hash) { - media_attachments.push(media_file.clone()); - } + if seen.insert(hash.as_str()) { + if let Some(media_file) = media_files_map.get(&hash) { + media_attachments.push(media_file.clone()); + } + } } media_attachments }
311-313: Tests adjusted to new signature.Good update to include media_files arg.
I can add unit tests for extract_media_hashes (upper-case and duplicates) and for try_process_deletion scrubbing. Want a follow-up patch?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
src/integration_tests/core/context.rs(4 hunks)src/integration_tests/scenarios/chat_media_upload.rs(1 hunks)src/integration_tests/test_cases/chat_media_upload/mod.rs(1 hunks)src/integration_tests/test_cases/chat_media_upload/send_message_with_media.rs(1 hunks)src/integration_tests/test_cases/chat_media_upload/upload_chat_media.rs(2 hunks)src/whitenoise/database/media_files.rs(3 hunks)src/whitenoise/media_files.rs(1 hunks)src/whitenoise/message_aggregator/mod.rs(3 hunks)src/whitenoise/message_aggregator/processor.rs(7 hunks)src/whitenoise/message_aggregator/reaction_handler.rs(3 hunks)src/whitenoise/message_aggregator/tests.rs(5 hunks)src/whitenoise/message_aggregator/types.rs(2 hunks)src/whitenoise/messages.rs(2 hunks)src/whitenoise/mod.rs(0 hunks)
💤 Files with no reviewable changes (1)
- src/whitenoise/mod.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.cursor/rules/rust-code-style.mdc)
**/*.rs: Write all trait bounds in a where clause for functions and impl blocks
Prefer using Self when referring to the implementing type (including return types and enum variant matches)
For public types in libraries, derive Debug, Clone, Copy, PartialEq, Eq, and Hash (in this order) when possible
Derive Default only when a reasonable default value exists
Use full paths for tracing macros (tracing::!(...)) instead of importing macro names
Prefer .to_string() or .to_owned() for &str -> String conversion over .into() or String::from
Order imports: (1) core/alloc/std, (2) external crates, (3) current sub-modules (mod x;), (4) internal crate modules (use crate::, super::, self::)
When implementing fmt traits, import core::fmt and use fmt::Display, fmt::Formatter, etc.
After declaring mod x;, import sub-items via self::x::Y instead of x::Y
Avoid if let ... { } else { } in favor of match when both branches do work
Use if let ... { } when the alternative match arm is intentionally empty
Avoid inline sub-modules (mod x { ... }); prefer a separate file x.rs with mod x; (except tests and benches)
For tests, prefer inline #[cfg(test)] mod tests { ... } and avoid mod tests;
For benches, prefer inline #[cfg(bench)] mod benches { ... } and avoid mod benches;
Files:
src/integration_tests/scenarios/chat_media_upload.rssrc/integration_tests/test_cases/chat_media_upload/mod.rssrc/integration_tests/test_cases/chat_media_upload/send_message_with_media.rssrc/integration_tests/test_cases/chat_media_upload/upload_chat_media.rssrc/whitenoise/message_aggregator/mod.rssrc/whitenoise/messages.rssrc/whitenoise/media_files.rssrc/whitenoise/message_aggregator/tests.rssrc/integration_tests/core/context.rssrc/whitenoise/database/media_files.rssrc/whitenoise/message_aggregator/processor.rssrc/whitenoise/message_aggregator/reaction_handler.rssrc/whitenoise/message_aggregator/types.rs
src/integration_tests/**
📄 CodeRabbit inference engine (.cursor/rules/mls-protocol-docs.mdc)
Add and maintain MLS integration tests under src/integration_tests
Files:
src/integration_tests/scenarios/chat_media_upload.rssrc/integration_tests/test_cases/chat_media_upload/mod.rssrc/integration_tests/test_cases/chat_media_upload/send_message_with_media.rssrc/integration_tests/test_cases/chat_media_upload/upload_chat_media.rssrc/integration_tests/core/context.rs
🧠 Learnings (1)
📚 Learning: 2025-09-29T10:45:56.426Z
Learnt from: CR
PR: parres-hq/whitenoise#0
File: .cursor/rules/mls-protocol-docs.mdc:0-0
Timestamp: 2025-09-29T10:45:56.426Z
Learning: Applies to src/whitenoise/event_processor/event_handlers/handle_mls_message.rs : Follow the Marmot protocol spec when integrating MLS messages with Nostr
Applied to files:
src/whitenoise/message_aggregator/types.rs
🧬 Code graph analysis (8)
src/integration_tests/scenarios/chat_media_upload.rs (2)
src/integration_tests/core/context.rs (1)
new(19-39)src/integration_tests/test_cases/chat_media_upload/send_message_with_media.rs (1)
new(14-20)
src/integration_tests/test_cases/chat_media_upload/send_message_with_media.rs (4)
src/integration_tests/core/context.rs (1)
new(19-39)src/types.rs (1)
mime_type(104-111)src/integration_tests/test_cases/chat_media_upload/upload_chat_media.rs (1)
run(49-145)src/integration_tests/core/retry.rs (1)
retry(90-102)
src/whitenoise/message_aggregator/mod.rs (2)
src/whitenoise/mod.rs (1)
media_files(430-432)src/whitenoise/message_aggregator/processor.rs (1)
process_messages(16-117)
src/whitenoise/messages.rs (4)
src/whitenoise/mod.rs (1)
media_files(430-432)src/whitenoise/database/accounts.rs (1)
find_by_pubkey(143-162)src/whitenoise/accounts.rs (2)
create_mdk(263-270)create_mdk(909-911)src/whitenoise/database/media_files.rs (1)
find_by_group(304-321)
src/integration_tests/core/context.rs (1)
src/whitenoise/mod.rs (2)
media_files(430-432)new(58-72)
src/whitenoise/database/media_files.rs (2)
src/whitenoise/database/group_information.rs (1)
rows(173-175)src/whitenoise/database/mod.rs (1)
new(53-97)
src/whitenoise/message_aggregator/processor.rs (3)
src/whitenoise/mod.rs (2)
media_files(430-432)new(58-72)src/whitenoise/media_files.rs (1)
new(60-62)src/whitenoise/message_aggregator/reaction_handler.rs (1)
process_reaction(14-54)
src/whitenoise/message_aggregator/types.rs (1)
src/whitenoise/mod.rs (1)
media_files(430-432)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (macos-14)
- GitHub Check: Security Audit
- GitHub Check: Test (ubuntu-latest)
🔇 Additional comments (14)
src/whitenoise/message_aggregator/types.rs (1)
44-46: Field addition LGTM; confirm serde compatibilityAdding media_attachments: Vec makes sense. Ensure MediaFile: Serialize + Deserialize, or ChatMessage serde will fail.
src/integration_tests/test_cases/chat_media_upload/upload_chat_media.rs (1)
141-143: Persisting uploaded media in context is correctStoring MediaFile for subsequent steps is appropriate; no issues.
src/integration_tests/scenarios/chat_media_upload.rs (1)
46-50: Good end-to-end linkage stepSending a message that references the uploaded media and awaiting verification closes the loop. LGTM.
src/whitenoise/messages.rs (1)
102-103: API return type change looks goodReturning Vec (public type) simplifies consumers. No issues.
src/integration_tests/core/context.rs (1)
13-16: Context field for media looks good.Storing MediaFile in ScenarioContext enables cross-step sharing in scenarios. LGTM.
Also applies to: 35-36
src/integration_tests/test_cases/chat_media_upload/send_message_with_media.rs (2)
22-37: imeta tag builder is correct and defensive.Format matches MIP-04 expectations; error mapping is clean. LGTM.
120-142: Attachment assertions are precise and sufficient.Validates count, hash, and MIME type against uploaded media. LGTM.
src/whitenoise/message_aggregator/mod.rs (1)
64-72: API extension to accept media_files: good surface and docs.Signature and docs updated; pass-through to processor is clear. Ensure all call sites provide group-scoped media to avoid cross-group leakage.
Also applies to: 82-84
src/whitenoise/message_aggregator/tests.rs (2)
21-22: Tests updated for new parameter.Passing empty media_files preserves prior behavior. LGTM.
Also applies to: 184-185
229-230: ChatMessage initialization with media_attachments is consistent.Defaulting to empty vec keeps equality/serde tests straightforward. LGTM.
Also applies to: 258-259
src/whitenoise/message_aggregator/reaction_handler.rs (2)
56-75: Verify “a” tag fallback will ever match processed_messages keys.processed_messages keys are message.id strings. “a” tags are addresses (kind:pubkey:d). Without an address→event-id map, this branch likely never hits. Confirm intent or remove to avoid misleading behavior.
Would you like me to add address→id indexing or drop the fallback?
165-166: Test scaffold updated correctly.Adding media_attachments: vec![] keeps the helper aligned with the new ChatMessage shape.
src/whitenoise/message_aggregator/processor.rs (2)
45-71: Orphan reprocessing flow looks good.Single chronological pass + orphan Pass 2 is simpler than multi-retry and fits reactions/deletions well.
Please add a test where a reaction precedes its target (earlier created_at) and is picked up in Pass 2.
Also applies to: 81-107
138-154: Media attachment extraction integration looks good.Passing media_attachments into ChatMessage is cohesive and side-effect free.
0c8584c to
410c2d2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary by CodeRabbit
New Features
Improvements
Tests