Skip to content

feat(send_queue): report progress for media uploads #5008

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

Johennes
Copy link
Contributor

@Johennes Johennes commented May 6, 2025

This makes it possible to listen to the media upload progress when using the send queue. The progress is communicated via EventSendState::NotSentYet.

I haven't yet fixed / enhanced the tests or added a changelog because I'm not 100% sure about this approach and would like to get some preliminary feedback first.

  • Public API changes documented in changelogs (optional)

Signed-off-by: Johannes Marbach <n0-0ne+github@mailbox.org>
@Johennes Johennes marked this pull request as ready for review May 26, 2025 18:40
@Johennes Johennes requested a review from a team as a code owner May 26, 2025 18:40
@Johennes Johennes requested review from Hywan and removed request for a team May 26, 2025 18:40
Copy link
Member

@Hywan Hywan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A really nice PR! I've a bit of feedback, but otherwise we are on the right direction.

We (the Matrix Rust team) were discussing the fact the timeline item is going to be recreated for each transmission progress update. Maybe we want a bit of throttle when broadcasting the transmission progress so that we reduce the amount of data sent to the apps. People at Element will start playing with it inside Element X, let's see how it goes.

} {
let mut subscriber = progress.subscribe();
let our_updates = updates.clone();
spawn(async move {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who controls this task? If the upload is cancelled, this task should be cancelled too right? Or maybe the subscriber.next().await may return a None at some point? If that's the case, please test that behaviour specifically, and comment that out, otherwise handle this 🙂.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed next() will yield None once the observable (progress) and its clones are dropped. This happens once the final RoomSendQueueUpdate::MediaUpload is emited further down after the request has completed.

I've added a comment about this but could also add an explicit abort if that would be better.

Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, and nice that we could in theory have this! I've got a few questions about the approach below. We'd like to try it out before validating the overall approach, because of the remark related to memory usage across the FFI, but other than that, it would be a great addition!

Comment on lines +100 to +101
/// Is the media a thumbnail?
is_thumbnail: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: do you think there's lot of value in exposing this? In which cases would a consumer of the API care that an upload progress is related to a thumbnail or not? I'd imagine the interesting information is "what's the overall percentage of progress for the upload", and that this kind of details doesn't bring much value? (Also that would avoid the data migration 😈)

is_thumbnail,
progress,
} => {
self.update_event_send_state(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first sight, I was slightly worried about this approach. This will clone the item every time there's a new progress update, which will cause lots of unnecessary overhead at the FFI layer (especially on Android where the JVM may finalize the items much later after they've been dropped on the Rust side, causing spurious spikes in memory usage). Have you tried it out on a low to medium range Android device? How does it feel, in terms of latency/speed?

Overall, I imagined another API where we could have a side-channel progress stream, exposed independently of the RoomSendQueueUpdates. This way, we could channel those progress stream updates directly to the FFI consumers, in a new delegate — and not have to clone the timeline item each time.

But! Turns out we've discussed this in our team meeting, and we're curious to evaluate the performance of the current approach, so…

TLDR: no changes required here, but would be great if you had some performance feedback for this approach!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the memory usage. I have actually only used this on iOS so far and don't even have a functional Android setup yet in the project I'm prototyping this. Overall, I'm finding it slightly odd to shove this into send queue updates, too, to be honest. It just seemed like the most straightforward way at the time. But I wouldn't necessarily be opposed to trying out a different approach if you already have concerns.

let mut req =
room.client().media().upload(&mime, data, Some(request_config));
if let Some(watcher) = progress_watcher {
req = req.with_send_progress_observable(watcher);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this will emit the following updates, if I'm sending a file with a thumbnail:

  • first progress updates for the thumbnail, going from 0 to 100%
  • then progress updates for the file itself, going from 0 to 100%

Is that right? I think observers would likely be more observed in the overall progress from 0 to 100%, so it would be nice if we could aggregate those into a single stream, for observers (especially for galleries which may contain many medias and thumbnails).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes together with your other comment. I actually cannot think of a reason why a client would be interested in displaying the thumbnail and file upload progress separately. So aggregating them would probably be best.

I think a difficulty with this is determining the overall upload size ahead of time. The way the code is currently structured, we only receive the size of the file upload once the file has been encrypted and actually started uploading. To aggregate it, we would have to know the size already when starting the thumbnail upload, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could update UploadEncryptedFile so that it can either take a reader for the plain data (like today) or a vector of the encrypted data? Then we could encrypt both thumbnail and media ahead of time, to compute the overall upload size, store it on the send queue requests and use it inside RoomSendQueue::sending_task to aggregate the progress per thumbnail / media pair. That would mingle the concerns a little of course.

The only other option I can think of is to do some heuristics and translate the bytes sent for the encrypted attachments into bytes of the unencrypted attachments. That would mean the SDK wouldn't report the actually uploaded bytes though which doesn't feel great either.

🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I thought about these two alternatives as well :)

  • The first one would work for a thumbnail and a media, but could lead to device memory exhaustion (and getting OOM killed) if you want the same behavior for a gallery with many items.
  • The second one could work nicely with a rule of 3: fake_number_bytes_sent = number_encrypted_bytes_sent * total_bytes / total_encrypted_bytes. As you're pointing it, it wouldn't be the "perfect" information when forwarded to end users, and it would be wrong if the end user decided to show the live bandwidth usage (MB per second).

An alternative would be to restrict ourselves to "relative" progress (i.e. get a percentage, not raw values). This would make it impossible to display the live bandwidth usage, but might be sufficient to display an inline spinner that fills up over time. That could be a nice 80/20 solution, what do you think @Johennes @Hywan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes good point that about the memory limitations when encrypting everything upfront.

I think the percentage should probably suffice in most cases. Maybe we could just map the internal TransmissionProgress to a new struct that contains only the current percentage then.

Johennes and others added 8 commits May 27, 2025 12:54
Co-authored-by: Ivan Enderlin <ivan@mnt.io>
Signed-off-by: Johannes Marbach <n0-0ne+github@mailbox.org>
Co-authored-by: Ivan Enderlin <ivan@mnt.io>
Signed-off-by: Johannes Marbach <n0-0ne+github@mailbox.org>
…ploadedMedia into EventSendProgress::MediaUpload
…iaUploadProgress into RoomSendQueueUpdate::MediaUpload
@jmartinesp
Copy link
Contributor

jmartinesp commented May 28, 2025

I just tested this on EXA and it seems like there's a memory peak when receiving the upload progress:

Grabacion.de.pantalla.2025-05-28.a.las.9.54.21.mov

Keep in mind though that this may also be caused by the upload process itself.


I tried measuring the allocations now and I got this:

image

EventSendProgress$MediaUpload is the binding version of the SDK record, MediaUploadProgress is what we map it to. I think we have 2x the SDK items because we don't map them in the pinned events timeline, so they're discarded.

The range in the image shows an spike in general allocations, specially in the Java Heap. It's a shame we can't filter out items for the graph and just get the values for these allocations, we have no easy way of knowing whether these allocations come mainly from the EventSendProgress objects and their mapping process or there's something else happening at the same time.

@Hywan
Copy link
Member

Hywan commented May 28, 2025

@jmartinesp Thanks! Can you measure if the memory peak maps to the size of the media, or is it something else?

@jmartinesp
Copy link
Contributor

image

I'd say it looks quite similar without these changes, the image above being a bit more stretched.

@Johennes
Copy link
Contributor Author

@jmartinesp thanks a bunch for taking the time to test the impact of this on Android! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants