[Bug Report] Slow scanning with huge amounts of videos #2824

Open
suppaberg opened this issue Aug 13, 2022 · 14 comments
Labels
bug report Bug reports that are not yet verified

Comments

@suppaberg

suppaberg commented Aug 13, 2022

With huge amounts of videos, scanning local video content gets slower and slower.

General info:
Windows 10
Stash v0.16.1
stash-go.sqlite size: ~110MB
DB contents: ~25k scenes, ~200k images
stash folder location: SSD attached via SMB

According to the log, the slowdown occurs in the "Creating new item..." stage for movies (36 s in the example below):

time="2022-08-13 09:32:47" level=info msg="Calculating oshash for t:\\dl\\1\\something1.mp4 ..."
time="2022-08-13 09:32:47" level=info msg="t:\\dl\\1\\something1.mp4 doesn't exist. Creating new item..."
time="2022-08-13 09:33:23" level=info msg="[generator] generating phash sprite for t:\\dl\\1\\something1.mp4"
time="2022-08-13 09:33:27" level=info msg="[generator] generating sprite image for t:\\dl\\1\\something1.mp4"
time="2022-08-13 09:33:46" level=info msg="[generator] generating sprite vtt for t:\\dl\\1\\something1.mp4"
time="2022-08-13 09:33:46" level=info msg="Calculating oshash for t:\\dl\\1\\something2.mp4 ..."

Interestingly, for images it works instantly:

time="2022-08-13 09:32:07" level=info msg="Calculating checksum for t:\\dl\\1\\something1.png..."
time="2022-08-13 09:32:07" level=info msg="t:\\dl\\1\\something1.png doesn't exist. Creating new item..."
time="2022-08-13 09:32:07" level=info msg="Associating image t:\\dl\\1\\something1.png with folder gallery"

To rule out my system as the cause, I created another, empty Stash instance. No such problems there.

Unrelated to the above bug, just a thought:
Perhaps it would be better to have nested folders in generated\screenshots and generated\vtt, so that neither holds such a huge number of files in a single folder (~50k+ files right now). Listing such folders, for whatever reason, gets very slow.

@suppaberg suppaberg added the bug report Bug reports that are not yet verified label Aug 13, 2022
@scruffynerf

Curious whether the problem continues with the file refactor...

The generated directories getting huge could be solved with subdirectories. Usually that's done by using the first or last 1-2 digits as a reference, like putting Axxxxx into an A subdirectory, Bxxxxx into a B subdirectory, etc.

@suppaberg
Author

suppaberg commented Aug 13, 2022

Yes, it seems that is already in use for generated\thumbnails (nesting folders two levels deep of 00 through ff) so could be reused here as well.
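
The nesting idea can be sketched as follows. This is only an illustration of the layout, not Stash's code: the function name `nested_path` and its `levels` and `width` parameters are hypothetical, and the thread does not show how Stash actually builds its thumbnail paths. The checksum used is one of the file names from the logs in this issue.

```python
from pathlib import PureWindowsPath


def nested_path(root, checksum, suffix, levels=2, width=2):
    """Place a file under `levels` subfolders named after the checksum's leading hex digits.

    Hypothetical helper: with levels=2 and width=2 this gives folders 00..ff
    nested two deep, i.e. 256 * 256 possible leaf folders.
    """
    if len(checksum) < levels * width:
        raise ValueError("checksum too short to shard")
    parts = [checksum[i * width:(i + 1) * width] for i in range(levels)]
    return PureWindowsPath(root, *parts, checksum + suffix)


print(nested_path(r"generated\screenshots", "0c4fd73aae6f73cd", ".jpg"))
# generated\screenshots\0c\4f\0c4fd73aae6f73cd.jpg
```

With roughly 65k files spread over up to 65,536 leaf folders, each folder would hold only a handful of entries, which is the property being asked for here.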

Another issue with such big folders is any kind of backup operation. It takes ages to sync those two folders, and an added downside is that any disk operation on that network share is very slow while a backup is in progress. It could be just my way of using it, but I am sure similar problems will show up anywhere folders are filled with this many files.

@bnkai
Collaborator

bnkai commented Aug 13, 2022

While browsing a big directory is not ideal, it shouldn't be an issue for generation, as we only stat the files while looking for existing ones.
Just to be clear: you are aware that the log you pasted shows you have a few generate options ticked when scanning? If you want a faster scan, just don't tick any generate options (you can queue a separate generate task, for example).
30+ seconds per scene seems OK to me, considering that you calculate the phash and generate the sprite images (both need ffmpeg to seek through the video file and take a number of screenshots).
The image scanning is instant because the only thing calculated is the md5.
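
For reference, an MD5 checksum of a file needs only one sequential read. The sketch below is a generic illustration, not Stash's implementation; `md5_of_file` and `chunk_size` are names chosen here, and the demo hashes a throwaway file rather than a real image.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a temporary file standing in for an image.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.png")
    with open(sample, "wb") as fh:
        fh.write(b"hello")
    demo_digest = md5_of_file(sample)

print(demo_digest)
```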

@suppaberg
Author

suppaberg commented Aug 13, 2022

True, big folders may not affect speed inside Stash, but any other operation on such a folder gets slower the more files it contains.

Yes, I know a few additional operations are done on movies. That is deliberate and expected to take some time (I am looking forward to the feature request explained in #2206, if it ever gets added, in combination with sprites). I wanted to show the full log.
What I wanted to point out is that on an empty DB the "doesn't exist. Creating new item..." step was instant (within the same second), but on the bigger DB that same step took 30+ s per entry and rising.

@Stephan972
Contributor

Same problem here...

General info:
Windows 7
Stash v0.16
stash-go.sqlite size: ~90MB
DB contents: ~15k scenes, ~150k images
stash folder location: 4 TB USB drive

@cj12312021
Collaborator

Figured I’d chime in. I’m currently sitting at 115,727 scenes, 161,253 galleries, and 19,546,134 images. I have noticed a similar problem, but it is not due to the quantity of content; the issue is prominent with specific content. In my case, the issue is with Bangbros galleries. Scanning my Bangbros directory, which only contains around 2k scenes with galleries, can take 6-10 hours to complete, even when no new content was added to that directory. Again, in my case the issue is with the galleries. I haven’t dug deeper yet to find out why the Bangbros galleries are so problematic. For now, I just avoid doing full scans on that directory. When not scanning that directory, Stash breezes through everything else.

@gitgiggety
Contributor

gitgiggety commented Aug 18, 2022

I've looked into this by adding many more log statements and enabling microsecond timestamps in the log. I'm mostly seeing a slowdown between "Creating new item" and "generating phash sprite", caused by screenshot generation. This is always done and isn't conditional (contrary to what @bnkai mentioned above), and for me it takes just over half a second.

For example:

INFO[2022-08-18 18:42:15.055659345] Calculating oshash for /data/Videos/Producers/Sapphic Erotica/sapphicerotica 15.08.31 Kari, Lila.Silky Sex (1428) [1080p].mp4 ...
INFO[2022-08-18 18:42:15.063403953] .mp4 doesn't exist. Creating new item...
INFO[2022-08-18 18:42:15.131824068] Created NewVideoFile
INFO[2022-08-18 18:42:15.13186517] Setted title
INFO[2022-08-18 18:42:15.131872543] newScene populated
INFO[2022-08-18 18:42:15.131879074] converted video file to scene
INFO[2022-08-18 18:42:15.131885092] read metadata
INFO[2022-08-18 18:42:15.141304614] scene created & saved
INFO[2022-08-18 18:42:16.85286601] screenshots made
INFO[2022-08-18 18:42:16.92751906] plugin executed
INFO[2022-08-18 18:42:16.927559893] Done scanning scene
INFO[2022-08-18 18:42:16.927568788] Kinda start sprite
INFO[2022-08-18 18:42:16.927591332] Kinda start phash
INFO[2022-08-18 18:42:16.927597304] Kinda start preview
INFO[2022-08-18 18:42:16.92771084] Actual start sprite
INFO[2022-08-18 18:42:16.927717572] Actual start phash
INFO[2022-08-18 18:42:17.002289481] [generator] generating phash sprite for

So it is my belief that the slowdown is due to that, also because the slowdown doesn't happen if I revert to an older database and rerun the scan. That is because Stash quickly determines that the screenshot files already exist, so it doesn't have to run ffmpeg to generate them. (This might explain why it's fast when you start with a fresh database, if you didn't clear the generated screenshots directory as well.)

One thing I'm wondering about is whether it would be better to not generate both the screenshot and the thumbnail of the video, but only generate the screenshot of the video, and just resize that image into the thumbnail. Then the video doesn't have to be read twice.
And for me personally, I don't care about the screenshot anyway. 99% of the videos can be scraped and I overwrite the generated screenshot anyway. So personally I would even disable thumbnail generation (but in that case a dummy image must be shown obviously, to not break the UI).
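
To make the idea concrete, here is a rough sketch that only builds the two ffmpeg argument lists. It is an assumption-laden illustration, not the code from the commit referenced in this issue: the function names, the 320-pixel width and the use of ffmpeg's `scale` filter for the resize are all choices made here. The screenshot flags mirror the ffmpeg command quoted later in this thread.

```python
def screenshot_args(video, screenshot, seek_s=10):
    """ffmpeg arguments that grab a single frame from the video (the slow read)."""
    return ["ffmpeg", "-v", "error", "-y", "-ss", str(seek_s), "-i", video,
            "-frames:v", "1", "-q:v", "2", "-f", "image2", screenshot]


def thumbnail_args(screenshot, thumbnail, width=320):
    """ffmpeg arguments that scale the existing screenshot; the video is not opened again.

    The width is a hypothetical value; -2 lets ffmpeg pick an even height
    that preserves the aspect ratio.
    """
    return ["ffmpeg", "-v", "error", "-y", "-i", screenshot,
            "-vf", f"scale={width}:-2", "-frames:v", "1", "-f", "image2", thumbnail]


print(" ".join(screenshot_args("video.mp4", "shot.jpg")))
print(" ".join(thumbnail_args("shot.jpg", "shot.thumb.jpg")))
```

Only the first command touches the video, so a video on network storage is read once instead of twice.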

Edit:
Screenshot (and thumbnail) generation actually is already logged, but not as info but as debug. If you go to the Log tab and select Debug for Log Level you should see them passing by during scanning.

gitgiggety added a commit to gitgiggety/stash that referenced this issue Aug 18, 2022
Currently the thumbnail is generated based on the video, but this is
slow, especially when the video is on network storage. So instead only
generate the screenshot based on the video, and resize that image for
the thumbnail.

Refs: stashapp#2824
@suppaberg
Author

By "trying with a new DB" I meant that I just copied config.yml from the old location to a new folder (along with stash-win.exe and the ffmpeg files, so it uses the same ones). The "generated" folder and the DB were not carried over. I reran a quick test on this new setup (with debug logging):

~1MB file:

time="2022-08-19 20:30:36" level=info msg="Calculating oshash for t:\dl\1\something2.mp4 ..."
time="2022-08-19 20:30:36" level=info msg="t:\dl\1\something2.mp4 doesn't exist. Creating new item..."
time="2022-08-19 20:30:36" level=debug msg="Creating thumbnail for t:\dl\1\something2.mp4"
time="2022-08-19 20:30:36" level=debug msg="created thumbnail: generated\screenshots\0c4fd73aae6f73cd.thumb.jpg"
time="2022-08-19 20:30:36" level=debug msg="Creating screenshot for t:\dl\1\something2.mp4"
time="2022-08-19 20:30:36" level=debug msg="created screenshot: generated\screenshots\0c4fd73aae6f73cd.jpg"
time="2022-08-19 20:30:37" level=info msg="[generator] generating phash sprite for t:\dl\1\something2.mp4"
time="2022-08-19 20:30:37" level=info msg="[generator] generating sprite image for t:\dl\1\something2.mp4"
time="2022-08-19 20:30:50" level=info msg="[generator] generating sprite vtt for t:\dl\1\something2.mp4"
time="2022-08-19 20:30:50" level=info msg="Calculating oshash for t:\dl\1\something3.mp4 ..."

~2GB file:

time="2022-08-19 20:26:23" level=info msg="Calculating oshash for t:\dl\1\something11.mp4 ..."
time="2022-08-19 20:26:23" level=info msg="t:\dl\1\something11.mp4 doesn't exist. Creating new item..."
time="2022-08-19 20:26:23" level=debug msg="Creating thumbnail for t:\dl\1\something11.mp4"
time="2022-08-19 20:26:23" level=debug msg="created thumbnail: generated\screenshots\fedc8562ab2e1b1d.thumb.jpg"
time="2022-08-19 20:26:23" level=debug msg="Creating screenshot for t:\dl\1\something11.mp4"
time="2022-08-19 20:26:24" level=debug msg="created screenshot: generated\screenshots\fedc8562ab2e1b1d.jpg"
time="2022-08-19 20:26:24" level=info msg="[generator] generating phash sprite for t:\dl\1\something11.mp4"
time="2022-08-19 20:26:24" level=info msg="[generator] generating sprite image for t:\dl\1\something11.mp4"
time="2022-08-19 20:26:52" level=info msg="[generator] generating sprite vtt for t:\dl\1\something11.mp4"
time="2022-08-19 20:26:52" level=info msg="Calculating oshash for t:\dl\1\something12.mp4 ..."

Again "Creating new item" is just a fraction of a second, as is everything else apart from "generating sprite image".

Big (old) DB with accompanying folders again:
~1GB file

time="2022-08-19 20:50:28" level=info msg="Calculating oshash for t:\dl\1\something110.mp4 ..."
time="2022-08-19 20:50:28" level=info msg="t:\dl\1\something110.mp4 doesn't exist. Creating new item..."
time="2022-08-19 20:50:38" level=debug msg="Creating thumbnail for t:\dl\1\something110.mp4"
time="2022-08-19 20:50:42" level=debug msg="created thumbnail: generated\screenshots\d3eadc22fde92845.thumb.jpg"
time="2022-08-19 20:50:42" level=debug msg="Creating screenshot for t:\dl\1\something110.mp4"
time="2022-08-19 20:50:47" level=debug msg="created screenshot: generated\screenshots\d3eadc22fde92845.jpg"
time="2022-08-19 20:50:55" level=info msg="[generator] generating phash sprite for t:\dl\1\something110.mp4"
time="2022-08-19 20:50:59" level=info msg="[generator] generating sprite image for t:\dl\1\something110.mp4"
time="2022-08-19 20:51:22" level=info msg="[generator] generating sprite vtt for t:\dl\1\something110.mp4"
time="2022-08-19 20:51:22" level=info msg="Calculating oshash for t:\dl\1\something111.mp4 ..."

~18MB file

time="2022-08-19 20:53:55" level=info msg="Calculating oshash for t:\dl\1\something120.mp4 ..."
time="2022-08-19 20:53:55" level=info msg="t:\dl\1\something120.mp4 doesn't exist. Creating new item..."
time="2022-08-19 20:54:05" level=debug msg="Creating thumbnail for t:\dl\1\something120.mp4"
time="2022-08-19 20:54:10" level=debug msg="created thumbnail: generated\screenshots\12beedfdae32g3b8.thumb.jpg"
time="2022-08-19 20:54:10" level=debug msg="Creating screenshot for t:\dl\1\something120.mp4"
time="2022-08-19 20:54:15" level=debug msg="created screenshot: generated\screenshots\12beedfdae32g3b8.jpg"
time="2022-08-19 20:54:22" level=info msg="[generator] generating phash sprite for t:\dl\1\something120.mp4"
time="2022-08-19 20:54:26" level=info msg="[generator] generating sprite image for t:\dl\1\something120.mp4"
time="2022-08-19 20:54:49" level=info msg="[generator] generating sprite vtt for t:\dl\1\something120.mp4"
time="2022-08-19 20:54:49" level=info msg="Calculating oshash for t:\dl\1\something121.mp4 ..."

Something seems off here: the new install is fast, while the old one is slow on most of the tasks.

@gitgiggety
Contributor

Thank you for the clarification and log including debug messages.

So it seems like there already is a 10 second delay between "Creating new item" and "Creating thumbnail", and creating both images takes about 5 seconds each.
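
Gaps like these can be read off the log mechanically. The following is a small sketch that assumes the `time="..." level=... msg="..."` line format shown in the logs above; `stage_gaps` is a helper name invented here, and the sample lines are copied from the "big (old) DB" log.

```python
import re
from datetime import datetime

LINE_RE = re.compile(r'time="(?P<ts>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>.*)"$')


def stage_gaps(lines):
    """Return (seconds until the next log line, message) for each consecutive pair of lines."""
    parsed = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            parsed.append((ts, m["msg"]))
    return [((nxt[0] - cur[0]).total_seconds(), cur[1])
            for cur, nxt in zip(parsed, parsed[1:])]


SAMPLE = r'''
time="2022-08-19 20:50:28" level=info msg="t:\dl\1\something110.mp4 doesn't exist. Creating new item..."
time="2022-08-19 20:50:38" level=debug msg="Creating thumbnail for t:\dl\1\something110.mp4"
time="2022-08-19 20:50:42" level=debug msg="created thumbnail: generated\screenshots\d3eadc22fde92845.thumb.jpg"
'''.splitlines()

for gap, msg in stage_gaps(SAMPLE):
    print(f"{gap:4.0f}s  {msg}")
```

For the sample this reports 10 seconds after "Creating new item..." and 4 seconds for the thumbnail. The one-second timestamp resolution means each gap is only accurate to about a second.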

The PR I created, #2839, should avoid a second slow pass over the video, so it should reduce one of those 5-second steps to (well) under a second.

The remaining time is hard for me to explain, but I will try to look into it. For example, a lot happens in the 10 seconds from "Creating new item" to "Creating thumbnail": the video's information is read (duration, video and audio codecs, resolution, etc.), which might or might not be slow, but inserting the item into the database could also be slow.

I guess your empty test installation stays more or less empty, so you're not importing the entire collection? That obviously makes for a much smaller database, in which it shouldn't be that hard or slow to insert new items. And if screenshot and thumbnail generation slows down massively because the directory contains lots of files, you wouldn't notice that either when the directory is empty.

@gitgiggety
Contributor

Forgot to ask, could you try to run the following command and report how long it took? In other words: whether it's instant, or takes a second or longer:

ffmpeg -v error -y -ss 10 -i \path\to\something.mp4 -frames:v 1 -q:v 2 -f image2 \path\to\stash\generated/tmp/test.jpg

This will generate a screenshot at the 10-second mark of the given video and write it to the given path. It is also what Stash executes to generate the screenshot. Note that you might have to look up the location of the ffmpeg executable (and it might be named ffmpeg.exe).

This is to determine whether ffmpeg itself is slow or whether the slowness is in Stash. Something I hadn't noticed before: the generated image isn't written directly to the destination folder; it is put in a temp folder and moved from there. So even if ffmpeg for some reason scanned the output folder, that should be a more or less empty folder anyway.
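
If eyeballing the duration is too coarse, a small wrapper can time the command over a few runs. This is a generic sketch: `time_command` is a name made up here, and the command actually executed below is a harmless placeholder so the snippet runs anywhere. Swap in the ffmpeg invocation above with real paths to measure it.

```python
import subprocess
import sys
import time


def time_command(args, runs=3):
    """Run `args` several times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(args, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Placeholder command. To time the real thing, replace it with something like:
# ["ffmpeg", "-v", "error", "-y", "-ss", "10", "-i", r"\path\to\something.mp4",
#  "-frames:v", "1", "-q:v", "2", "-f", "image2", r"\path\to\stash\generated\tmp\test.jpg"]
command = [sys.executable, "-c", "pass"]

print([f"{d:.2f}s" for d in time_command(command)])
```

Running it several times helps separate a one-off cold-cache delay from a consistently slow operation.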

@gitgiggety
Contributor

So I've just done a test and generated a database with 30,000 scenes (1 performer, 1 tag, etc.). I also copied the contents of the generated/screenshots folder 10 times (the folder contained 1,600 genuine screenshots, thumbs and previews, so copying it 10 times creates over 16,000 files). I then removed the genuine files and started an import of my video collection without enabling any of the extra "generators", so it just imports each item and generates the screenshot and thumbnail. This ran on the branch with the thumbnail fix, and it took 9 minutes and 15 seconds to import 692 scenes, generating 1,384 files. That is only 0.80 seconds on average per scene (and a bit less per video, as some duplicate files were ignored). Even assuming the thumbnail fix in this build halves the screenshot/thumbnail generation time at most, the normal build would still take at most 1.6 seconds on average. And all this on a database that already had 30K scenes in it (although "fake"), with a screenshot folder already containing over 16K files.
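
The arithmetic behind those two figures, spelled out as a quick check (the variable names are just for illustration):

```python
elapsed_s = 9 * 60 + 15          # 9 minutes 15 seconds
scenes = 692

per_scene = elapsed_s / scenes   # average import time per scene on the fixed build
upper_bound = 2 * per_scene      # if the fix at most halves the time, the normal build is at most double

print(f"{per_scene:.2f} s per scene, at most {upper_bound:.2f} s on the normal build")
# 0.80 s per scene, at most 1.60 s on the normal build
```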

So IMO it's safe to say there aren't any real issues with this. Or at least no issues which result in these massive slowdowns of 10 seconds between "Creating new item" and "Generating thumbnail", nor for 5 seconds to generate the two files (per file).

All of this leaves me wondering how you're using Stash. You mention using an SSD over SMB / a network share. But are Stash's own files (the database file, generated folder, etc.) on the network share as well, with Stash running on the local computer? For me that would be the only explanation for why it is so slow, because Stash then constantly has to read and write those files over the share, and I can imagine that being a lot slower than reading and writing its own files locally. Checking for the existence of the screenshot files might then be slow. Likewise, ffmpeg having to read the video over the SMB share while also writing the screenshot to the share might incur some performance penalty. And if the SQLite database is on the SMB share as well, it might be even worse, as every read and write to the database has to go back and forth over the network. That might partially explain the 10-second gap between "Creating new item" and the start of thumbnail generation.

For what it's worth: in this test Stash was running on my computer, with the video files on my NAS, stored on HDDs and mounted using SSHFS. So the video files were also accessed over a network share, but Stash's own files, like the SQLite database and the generated folder, were stored locally on an SSD.

@suppaberg
Author

suppaberg commented Aug 24, 2022

A bit more info on how I used it: initially I had everything on SMB (movies and the whole stash folder). I have now done a few more tests to narrow things down. It looks like huge numbers of files in one folder, in combination with an SMB share, are very much connected to the problem after all. For me the SMB share is fast for single files (over 100 MB/s) but slow at enumerating huge numbers of files.
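
The difference between enumerating a folder and checking one known file can be measured directly. The sketch below is a generic illustration with made-up helper names (`time_listing`, `time_stat`). It runs against a throwaway local folder, so the numbers it prints say nothing about SMB; point `folder` at the real generated\screenshots path to measure the share.

```python
import os
import tempfile
import time


def time_listing(folder):
    """Return (seconds needed to enumerate every entry in `folder`, entry count)."""
    start = time.perf_counter()
    with os.scandir(folder) as entries:
        count = sum(1 for _ in entries)
    return time.perf_counter() - start, count


def time_stat(path):
    """Return the seconds needed to check whether a single known path exists."""
    start = time.perf_counter()
    os.path.exists(path)
    return time.perf_counter() - start


# Demo on a temporary local folder filled with empty placeholder files.
with tempfile.TemporaryDirectory() as folder:
    for i in range(1000):
        open(os.path.join(folder, f"{i:016x}.jpg"), "w").close()
    listing_s, count = time_listing(folder)
    stat_s = time_stat(os.path.join(folder, f"{0:016x}.jpg"))

print(f"listed {count} entries in {listing_s:.4f}s; single stat took {stat_s:.6f}s")
```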

I moved the stash folder to the SSD (c:), keeping the old DB of ~130 MB, stash-win.exe, the ffmpeg files and the "cache" folder in config (just in case), and redirected the "generated" folder to the initial stash location on SMB (s:\stash\generated). Movies are on another SMB share, t:.

Test 1:
I moved the whole s:\stash\generated\screenshots folder somewhere else:
s:\stash\generated\screenshots - empty
s:\stash\generated\thumbnails - 200k+ files, stored in two level deep folders
s:\stash\generated\vtt - ~50k files

time="2022-08-24 16:36:05" level=info msg="Calculating oshash for t:\dl\1\something120.mp4 ..."
time="2022-08-24 16:36:05" level=info msg="t:\dl\1\something120.mp4 doesn't exist. Creating new item..."
time="2022-08-24 16:36:05" level=debug msg="Creating thumbnail for t:\dl\1\something120.mp4"
time="2022-08-24 16:36:05" level=debug msg="created thumbnail: s:\stash\generated\screenshots\34f85dfe4387a676.thumb.jpg"
time="2022-08-24 16:36:05" level=debug msg="Creating screenshot for t:\dl\1\something120.mp4"
time="2022-08-24 16:36:05" level=debug msg="created screenshot: s:\stash\generated\screenshots\34f85dfe4387a676.jpg"
time="2022-08-24 16:36:14" level=info msg="[generator] generating phash sprite for t:\dl\1\something120.mp4"
time="2022-08-24 16:36:18" level=info msg="[generator] generating sprite image for t:\dl\1\something120.mp4"
time="2022-08-24 16:36:36" level=info msg="[generator] generating sprite vtt for t:\dl\1\something120.mp4"
time="2022-08-24 16:36:36" level=info msg="Calculating oshash for t:\dl\1\something120.mp4 ..."

Fast first part since screenshots is empty

Test 2:
I moved the whole vtt folder somewhere else and put the screenshots folder back:
s:\stash\generated\screenshots - ~65k files
s:\stash\generated\thumbnails - 200k+ files, stored in two level deep folders
s:\stash\generated\vtt - empty

time="2022-08-24 16:57:36" level=info msg="Calculating oshash for t:\dl\1\something220.mp4 ..."
time="2022-08-24 16:57:36" level=info msg="t:\dl\1\something220.mp4 doesn't exist. Creating new item..."
time="2022-08-24 16:57:47" level=debug msg="Creating thumbnail for t:\dl\1\something220.mp4"
time="2022-08-24 16:57:53" level=debug msg="created thumbnail: s:\stash\generated\screenshots\84a7340be6934e32.thumb.jpg"
time="2022-08-24 16:57:53" level=debug msg="Creating screenshot for t:\dl\1\something220.mp4"
time="2022-08-24 16:57:59" level=debug msg="created screenshot: s:\stash\\generated\screenshots\84a7340be6934e32.jpg"
time="2022-08-24 16:57:59" level=info msg="[generator] generating phash sprite for t:\dl\1\something220.mp4"
time="2022-08-24 16:57:59" level=info msg="[generator] generating sprite image for t:\dl\1\something220.mp4"
time="2022-08-24 16:58:07" level=info msg="[generator] generating sprite vtt for t:\dl\1\something220.mp4"
time="2022-08-24 16:58:07" level=info msg="Calculating checksum for t:\dl\1\something221.mp4..."

Fast second part since vtt is empty

Test 3:
I moved both the screenshots and vtt folders somewhere else:
s:\stash\generated\screenshots - empty
s:\stash\generated\thumbnails - 200k+ files, stored in two level deep folders
s:\stash\generated\vtt - empty

time="2022-08-24 17:02:44" level=info msg="Calculating oshash for t:\dl\1\something250.mp4 ..."
time="2022-08-24 17:02:44" level=info msg="t:\dl\1\something250.mp4 doesn't exist. Creating new item..."
time="2022-08-24 17:02:44" level=debug msg="Creating thumbnail for t:\dl\1\something250.mp4"
time="2022-08-24 17:02:44" level=debug msg="created thumbnail: s:\stash\generated\screenshots\380b43d21343734c.thumb.jpg"
time="2022-08-24 17:02:44" level=debug msg="Creating screenshot for t:\dl\1\something250.mp4"
time="2022-08-24 17:02:44" level=debug msg="created screenshot: s:\stash\generated\screenshots\380b43d21343734c.jpg"
time="2022-08-24 17:02:44" level=info msg="[generator] generating phash sprite for t:\dl\1\something250.mp4"
time="2022-08-24 17:02:44" level=info msg="[generator] generating sprite image for t:\dl\1\something250.mp4"
time="2022-08-24 17:03:14" level=info msg="[generator] generating sprite vtt for t:\dl\1\something250.mp4"
time="2022-08-24 17:03:14" level=info msg="Calculating oshash for t:\dl\1\something251.mp4 ..."

Fast everything since both are empty

Test 4:
I put both the screenshots and vtt folders back:
s:\stash\generated\screenshots - ~65k files
s:\stash\generated\thumbnails - 200k+ files, stored in two level deep folders
s:\stash\generated\vtt - ~50k files

time="2022-08-24 17:23:43" level=info msg="Calculating oshash for t:\dl\1\something291.mp4 ..."
time="2022-08-24 17:23:43" level=info msg="t:\dl\1\something291.mp4 doesn't exist. Creating new item..."
time="2022-08-24 17:23:55" level=debug msg="Creating thumbnail for t:\dl\1\something291.mp4"
time="2022-08-24 17:24:00" level=debug msg="created thumbnail: s:\stash\generated\screenshots\aeb4b94d5147b534.thumb.jpg"
time="2022-08-24 17:24:00" level=debug msg="Creating screenshot for t:\dl\1\something291.mp4"
time="2022-08-24 17:24:06" level=debug msg="created screenshot: s:\stash\generated\screenshots\aeb4b94d5147b534.jpg"
time="2022-08-24 17:24:16" level=info msg="[generator] generating phash sprite for t:\dl\1\something291.mp4"
time="2022-08-24 17:24:21" level=info msg="[generator] generating sprite image for t:\dl\1\something291.mp4"
time="2022-08-24 17:24:39" level=info msg="[generator] generating sprite vtt for t:\dl\1\something291.mp4"
time="2022-08-24 17:24:39" level=info msg="Calculating oshash for t:\dl\1\something292.mp4 ..."

Slowdown as in the beginning.

Test 5:
I moved the stash working dir (db, config, exe, ffmpeg) to the SMB share (s:\stash), the same as when I started testing Stash:
s:\stash\generated\screenshots - ~65k files
s:\stash\generated\thumbnails - 200k+ files, stored in two level deep folders
s:\stash\generated\vtt - ~50k files

time="2022-08-24 18:13:35" level=info msg="Calculating oshash for t:\dl\1\something305.mp4 ..."
time="2022-08-24 18:13:35" level=info msg="t:\dl\1\something305.mp4 doesn't exist. Creating new item..."
time="2022-08-24 18:13:47" level=debug msg="Creating thumbnail for t:\dl\1\something305.mp4"
time="2022-08-24 18:13:52" level=debug msg="created thumbnail: s:\stash\generated\screenshots\a5258d52dc20b525.thumb.jpg"
time="2022-08-24 18:13:52" level=debug msg="Creating screenshot for t:\dl\1\something305.mp4"
time="2022-08-24 18:13:58" level=debug msg="created screenshot: s:\stash\generated\screenshots\a5258d52dc20b525.jpg"
time="2022-08-24 18:14:08" level=info msg="[generator] generating phash sprite for t:\dl\1\something305.mp4"
time="2022-08-24 18:14:13" level=info msg="[generator] generating sprite image for t:\dl\1\something305.mp4"
time="2022-08-24 18:14:34" level=info msg="[generator] generating sprite vtt for t:\dl\1\something305.mp4"
time="2022-08-24 18:14:34" level=info msg="Calculating oshash for t:\dl\1\something292.mp4 ..."

Roughly the same times as Test 4, where the DB was directly on the SSD.

Your experimental test with ffmpeg (s:\stash\generated\tmp\ folder is empty).

ffmpeg -v error -y -ss 10 -i "t:\dl\1\something305.mp4" -frames:v 1 -q:v 2 -f image2 s:\stash\generated\tmp\test.jpg

Generating took perhaps half a second

I also ran additional ones writing into the folders full of files (~65k and ~50k):

ffmpeg -v error -y -ss 10 -i "t:\dl\1\something305.mp4" -frames:v 1 -q:v 2 -f image2 s:\stash\generated\screenshots\test.jpg

~3s

ffmpeg -v error -y -ss 10 -i "t:\dl\1\something305.mp4" -frames:v 1 -q:v 2 -f image2 s:\stash\generated\vtt\test.jpg

~3s

@don20aba

don20aba commented Oct 7, 2022

I see this bug report is visible again.

To recap findings from the other thread:
The problem is that the screenshots and vtt folders (generated\screenshots, generated\vtt) each hold too many files in a single folder. It gets more noticeable the more files you have, especially when those folders are on an SMB network share. In my setup the Stash app itself is on a local SSD (c:).
A solution would be to restructure those folders the same way thumbnails are stored (two levels deep), so there are never many files in a single folder.

Right now I have stopped using this awesome app, as I cannot scan my collection. I am very hopeful it will be fixed in some future release.

@don20aba

don20aba commented May 2, 2023

Stash is usable again! :)

Thank you for reworking cover images into the "blobs" folder, laid out the same way as thumbnails (two levels deep).

If you do the same for "generated\screenshots" and "generated\vtt", features that use those folders (previews, etc.) would work well in this scenario.
