Skip to content

Conversation

@printminion-co
Copy link
Contributor

@printminion-co printminion-co commented Apr 9, 2025

Summary

  • On NC configured with remote storage like S3
    • User uploads video file of 3.7Gb
    • While file uploads it also stored to S3
    • After the upload is complete, when the user navigates to the folder containing the newly uploaded file, the thumbnail generation logic starts
  • The current logic downloads the entire 3.7 GB file from S3 to create the thumbnail (as observed in the network traffic screen).

Selection_20250408-002

Error produced by ffmpeg

Movie preview generation failed Output: ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.1 (Alpine 13.2.1_git20240309) 20240309
  configuration: --prefix=/usr --disable-librtmp --disable-lzma --disable-static --disable-stripping --enable-avfilter --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libmp3lame --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librist --enable-libsoxr --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-lto=auto --enable-lv2 --enable-openssl --enable-pic --enable-postproc --enable-pthreads --enable-shared --enable-vaapi --enable-vdpau --enable-version3 --enable-vulkan --optflags=-O3 --enable-libjxl --enable-libsvtav1 --enable-libvpl
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
  [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f53f2e9c600] moov atom not found
[in#0 @ 0x7f53f2f928c0] Error opening input: Invalid data found when processing input
Error opening input file /tmp/oc_tmp_POfcbD.
Error opening input files: Invalid data found when processing input

Proposed Solution

Prevent downloading entire movie files from remote storage (e.g., S3) when the 'moov atom' is located at the end of the file.

Test

Can be tested (locally with Minio) with video files larger than 5Mb with "moov atom" at the end of the file

Download sample video with missing "moov atom" at the beginning of the file
wget https://www.sample-videos.com/video321/mp4/720/big_buck_bunny_720p_10mb.mp4

  • Upload file before proposed change. Observe - thumbnail exist. I happens because after the error in log "Movie preview generation failed Output" the whole file is downloaded from S3.
  • Upload file with proposed change.
    • Observe - no thumbnail is generated

Checklist

Copy link
Contributor

@come-nc come-nc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to get only the end of the file?

Copy link
Contributor

@artonge artonge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@printminion-co I was not able to reproduce the full download locally. Is there anything specific about your setup? Maybe it is the S3 bucket?

Could it be that your S3 bucket does not support partially seeking the file? Not sure exactly what would be missing. Maybe @juliusknorr or @icewind1991 know better?

Also, I assume that in your testing, the full download was triggered by the null option, meaning that the 5242880 was failing. Do you have any error message related to that?

My goal would be to try to fix the partial download instead of dropping the functionality.

@juliusknorr
Copy link
Member

Also, I assume that in your testing, the full download was triggered by the null option, meaning that the 5242880 was failing. Do you have any error message related to that?

I guess that is what @printminion-co mentions that with this specific video file the relevant bits are at the end so it always requires the full read.

I'm not sure if it also may also depend on other factors like the ffmpeg version used if a preview can be generated without the moov atom.

From my POV it would be acceptable to not have previews in those cases, though from a user perspective it may be unexpected to see some previews of videos, but not all.

@skjnldsv
Copy link
Member

Ha, we had a ticket on that topic literally a few days ago.
yeah, that's the issue with videos, not much we can do tbh.
We could stop after X bytes, that would work sure.

Some files are also more error prone, like MOV files are terribly optimized for streaming for that reason, moov atom section is almost never in the early strt of the file 😢

Copy link
Contributor

@artonge artonge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make it easier to read.

@printminion-co printminion-co force-pushed the fix/s3_traffic_on_video_thumbnails branch 2 times, most recently from 9fdd39c to 42b86e1 Compare April 22, 2025 11:31
@printminion-co printminion-co requested a review from artonge April 22, 2025 11:36
@printminion-co printminion-co force-pushed the fix/s3_traffic_on_video_thumbnails branch from 42b86e1 to 68b8d8b Compare April 22, 2025 14:59
@printminion-co printminion-co requested a review from artonge April 22, 2025 15:00
@printminion-co
Copy link
Contributor Author

@printminion-co I was not able to reproduce the full download locally. Is there anything specific about your setup? Maybe it is the S3 bucket?

Could it be that your S3 bucket does not support partially seeking the file? Not sure exactly what would be missing. Maybe @juliusknorr or @icewind1991 know better?

Also, I assume that in your testing, the full download was triggered by the null option, meaning that the 5242880 was failing. Do you have any error message related to that?

My goal would be to try to fix the partial download instead of dropping the functionality.

The functionality will be dropped on files larger than 5Mb on which the ffmpeg thumbnail generator will get the
Following error (added to description).

Movie preview generation failed Output: ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.1 (Alpine 13.2.1_git20240309) 20240309
  configuration: --prefix=/usr --disable-librtmp --disable-lzma --disable-static --disable-stripping --enable-avfilter --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libmp3lame --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librist --enable-libsoxr --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-lto=auto --enable-lv2 --enable-openssl --enable-pic --enable-postproc --enable-pthreads --enable-shared --enable-vaapi --enable-vdpau --enable-version3 --enable-vulkan --optflags=-O3 --enable-libjxl --enable-libsvtav1 --enable-libvpl
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
  [mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f53f2e9c600] moov atom not found
[in#0 @ 0x7f53f2f928c0] Error opening input: Invalid data found when processing input
Error opening input file /tmp/oc_tmp_POfcbD.
Error opening input files: Invalid data found when processing input"

Probably NC admins with S3 backend could later via config define their own upper re-download limit they are ok with (e.g. 50mb)

Prevent downloading entire movie files from remote storage (e.g., S3)
when the 'moov atom' is located at the end of the file.

Signed-off-by: Misha M.-Kupriyanov <kupriyanov@strato.de>
@printminion-co printminion-co force-pushed the fix/s3_traffic_on_video_thumbnails branch from 68b8d8b to 4a924bf Compare April 23, 2025 08:04
@github-actions
Copy link
Contributor

Hello there,
Thank you so much for taking the time and effort to create a pull request to our Nextcloud project.

We hope that the review process is going smooth and is helpful for you. We want to ensure your pull request is reviewed to your satisfaction. If you have a moment, our community management team would very much appreciate your feedback on your experience with this PR review process.

Your feedback is valuable to us as we continuously strive to improve our community developer experience. Please take a moment to complete our short survey by clicking on the following link: https://cloud.nextcloud.com/apps/forms/s/i9Ago4EQRZ7TWxjfmeEpPkf6

Thank you for contributing to Nextcloud and we hope to hear from you soon!

(If you believe you should not receive this message, you can add yourself to the blocklist.)

@AndyScherzinger AndyScherzinger added this to the Nextcloud 32 milestone Apr 24, 2025
@AndyScherzinger AndyScherzinger merged commit 34949e4 into nextcloud:master Apr 24, 2025
182 of 198 checks passed
@artonge
Copy link
Contributor

artonge commented Apr 24, 2025

/backport to stable31

@invario
Copy link
Contributor

invario commented Apr 26, 2025

Apologies for being late to the discussion, but to my untrained eye, it seems this would affect all non-local shares, including SMB? If that is the case, that is unfortunate, as I have a library of videos larger than 5MB that is mounted as external storage in my NC instance. Previews have already been pre-generated for all of them so I won't notice any difference personally, but although these files are technically stored "remote", they are actually on the same server (for reasons I won't get into) so the connection speed is very high. Even if these were mounted remote but on a LAN, larger files than 5MB could still be accessed fairly quickly.

I've been racking my brain all night to see if there's a better way to deal with the problem of having to download the entire file but thus far my research is indicating it's not possible.

ffprobe/ffmpeg supposedly have methods of accessing the S3 to allow seeking to the end to fetch the moov atom without requesting the entire file, but the problem is NC's temp file mechanism added in the middle that wouldn't allow this to work.

For testing purposes, I've made efforts to brute force trim the front of a video file off down to the last 500KB or so (which should be plenty to cover the moov atom) but this results in ffmpeg/ffprobe failing with a moov atom not found error.

@invario
Copy link
Contributor

invario commented Apr 26, 2025

For what it's worth, I came across the following two threads during my research:
https://stackoverflow.com/questions/12620631/ffmpegphp-get-thumbnail-from-external-url

And:
PHP-FFMpeg/PHP-FFMpeg#167

Is there a reason why NC preview generation is written to call the ffmpeg binary directly as opposed to using PHP-FFMpeg?

Not that this would solve the issue... But supposedly passing a signed URI directly would work

@kesselb
Copy link
Collaborator

kesselb commented May 2, 2025

opposed to using PHP-FFMpeg?

Is also calling the binary directly (through binarydriver and symfony process)

@invario
Copy link
Contributor

invario commented Jun 19, 2025

Still digging around and thinking about how to implement this but buried in common.php is this:

	/**
	 * A custom storage implementation can return an url for direct download of a give file.
	 *
	 * For now the returned array can hold the parameter url - in future more attributes might follow.
	 */
	public function getDirectDownload(string $path): array|false {
		return [];
	}

...which would be exactly what's needed to pass directly to ffmpeg for an S3 bucket so that it could immediately seek to the end of the file to retrieve the moov atom. Unfortunately, it's just a placeholder and not implemented yet.

@invario
Copy link
Contributor

invario commented Jun 21, 2025

@printminion-co I would love it if you could give my PR a test and let me know how it works for you.

@printminion-co
Copy link
Contributor Author

@printminion-co I would love it if you could give my PR a test and let me know how it works for you.

You mean probably the #53634

@invario
Copy link
Contributor

invario commented Jun 24, 2025

Yes, that's the one.

@invario
Copy link
Contributor

invario commented Jul 8, 2025

@printminion-co I would love it if you could give my PR a test and let me know how it works for you.

You mean probably the #53634

Did you have an opportunity to test my PR? Any feedback?

@invario
Copy link
Contributor

invario commented Jul 26, 2025

I have #53634 which is specific to S3 and passes the URL direct to ffmpeg for processing. But I also have #53952 which works for all external shares (tested on S3 and SMB).

Would love to get some testing. Both are tested and work well on my own setup on AWS S3 (a 1.9GB remote file) and also SMB shares.

@nextcloud-bot nextcloud-bot mentioned this pull request Aug 19, 2025
@Vislike
Copy link

Vislike commented Sep 3, 2025

I think this commit breaks preview generation when using encryption also, i do that on my server, and preview for videos used to work fine, but currently on latest version of nextcloud only some videos generates previews. And i think it is this change that broke it since if i understands things correctly Encryption.php always returns false for isLocal() if encryption is enabled so it will not attempt to decrypt the entire file.

@artonge
Copy link
Contributor

artonge commented Sep 4, 2025

I think this commit breaks preview generation when using encryption also, i do that on my server, and preview for videos used to work fine, but currently on latest version of nextcloud only some videos generates previews.

That sounds plausible.

Given that nowadays pictures are more than 5Mb, we could find a better middle ground for the size limit. 20Mb sounds reasonable. We could couple that with the file's mime type. @Vislike any chance you could open a PR and test it on your instance?

@invario
Copy link
Contributor

invario commented Sep 4, 2025

I think this commit breaks preview generation when using encryption also, i do that on my server, and preview for videos used to work fine, but currently on latest version of nextcloud only some videos generates previews.

That sounds plausible.

Given that nowadays pictures are more than 5Mb, we could find a better middle ground for the size limit. 20Mb sounds reasonable. We could couple that with the file's mime type. @Vislike any chance you could open a PR and test it on your instance?

I don't think 20 MB would help all that much when it comes to videos, depending on the video bitrate and length of video, of course. But as an example: On my phone, 5 seconds of 4K video is already 36 MB. And 7 seconds of 1080p is 12 MB. So there's great variation.

The issue still remains that for some videos, the moov atom is at the end of the file and it gets cut off and so preview generation will fail.

PR #53952 resolves this and is pending merge.

I'm not sure about the Encryption.php aspect though and why it's written to return false for isLocal() but #53952 should still function even if something is improperly returning isLocal()=false

@Vislike
Copy link

Vislike commented Sep 5, 2025

I did confirm that it was this PR that broke preview generation, by reverting the changes to Movie.php my server generated preview for all 4k videos, i opened a bug report about it #54860 , and since i don't understand the entire isLocal() thing or why it is false, and i don't know php well enough to setup an environment that indexes symbols so i can try to figure out to resolve it both for remote files and local encrypted.

My hope is someone who knows how the encryption works can chime in, or that the PR form invario solves the issue, i might have time to test that during the weekend on my server.

@come-nc
Copy link
Contributor

come-nc commented Sep 8, 2025

Encryption wrapper returns false for isLocal because the local file is encrypted and cannot be used as-is from outside of Nextcloud.
For example, giving the path to the file to ffmpeg would fail to achieve anything useful.

So the file is as good as not local, as we cannot operate on it apart from through the encryption wrapper.

@invario
Copy link
Contributor

invario commented Sep 8, 2025

Encryption wrapper returns false for isLocal because the local file is encrypted and cannot be used as-is from outside of Nextcloud. For example, giving the path to the file to ffmpeg would fail to achieve anything useful.

So the file is as good as not local, as we cannot operate on it apart from through the encryption wrapper.

@come-nc That makes sense, except this is what's confusing me in ProviderV2.php

protected function useTempFile(File $file): bool {
		return $file->isEncrypted() || !$file->getStorage()->isLocal();
}

Which makes it a bit redundant, or else it would indicate that they're separate from each other.

And now that I'm diving into it a bit more also, useTempFile isn't used anywhere other than in Movie.php and ProviderV2.php.

I worked on Movie.php with the understanding that a file can be encrypted and local and therefore relied on isLocal to determine if it needed to cap the download at 10 MB but if that's not the case, this part of the code is unnecessary. And I think useTempFile can just be dropped from ProviderV2.php

$sizeAttempts = [1024 * 1024 * 10];

if ($this->useTempFile($file)) {
			if ($file->getStorage()->isLocal()) {
				// Temp file required but file is local, so retrieve $sizeAttempt bytes first,
				// and if it doesn't work, retrieve the entire file.
				$sizeAttempts[] = null;
			}
		} else {
			// Temp file is not required and file is local so retrieve entire file.
			$sizeAttempts = [null];
		}

@skjnldsv skjnldsv modified the milestones: Nextcloud 32, Nextcloud 33 Sep 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants