Migrate preview generation to async job execution #11685

Closed
LukasReschke opened this issue Oct 21, 2014 · 74 comments

Comments

@LukasReschke
Member

LukasReschke commented Oct 21, 2014

The preview generation should happen using cron (preferably only the system cron) instead of happening on demand. Currently we have a lot of users complaining about performance just because preview generation is taking ages.

cc @georgehrke
@karlitschek FYI

@LukasReschke
Member Author

And yes, I know that this is not possible with encryption, but the cronjob could just check whether the instance is encrypted or not.

Just because there is one use-case where it doesn't work shouldn't prevent massive speed gains on other use-cases ;-)

@LukasReschke
Member Author

Actually, we should not remove the on-demand generation but add another cronjob that does the generation already before in some "common" sizes (i.e. retina and non-retina).
If the user visits the folder before the cronjob happened the on-demand generation would happen, otherwise the existing pre-generated thumbnails would be shown.
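A minimal sketch of the fallback described above, not ownCloud's actual implementation: a preview is served from the cache if a cron run already built it, and is built on demand otherwise. The names `COMMON_SIZES`, `cache_path`, `render`, `get_preview` and `pregenerate` are hypothetical, the sizes are placeholders, and `render` only fakes the expensive scaling step.

```python
import os
import tempfile

# Hypothetical "common" sizes; the comment above only says retina and non-retina.
COMMON_SIZES = [(32, 32), (64, 64)]


def cache_path(cache_dir, file_id, size):
    """Location of a cached preview for one file id and size."""
    return os.path.join(cache_dir, str(file_id), "%dx%d.png" % size)


def render(file_id, size):
    """Stand-in for the expensive scaling step; returns fake image bytes."""
    return ("preview:%s:%dx%d" % ((file_id,) + size)).encode("ascii")


def get_preview(cache_dir, file_id, size):
    """Return (data, was_cached): use the stored preview, else build it now."""
    path = cache_path(cache_dir, file_id, size)
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read(), True
    data = render(file_id, size)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(data)
    return data, False


def pregenerate(cache_dir, file_ids, sizes=COMMON_SIZES):
    """What the cron job would do: warm the cache; returns previews built."""
    built = 0
    for file_id in file_ids:
        for size in sizes:
            _, was_cached = get_preview(cache_dir, file_id, size)
            built += 0 if was_cached else 1
    return built


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(pregenerate(demo_dir, [1, 2]), "previews built by the cron run")
    print(get_preview(demo_dir, 1, (32, 32))[1], "<- served from the cache")
```

Because both paths go through `get_preview`, a user who opens the folder before the cron run simply triggers the same code earlier.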

@karlitschek
Contributor

I'm not really sure how this would improve the user experience. A background job does not generate the thumbnails any faster, nor does it necessarily run at the optimal time. I assume people keep suggesting a background job because the old on-demand generation blocked the UI, which is horrible. So perhaps we should wait and see whether people are happier with the behavior of 7.0.3.

@georgehrke
Contributor

I agree with @karlitschek here. Let's wait for user-feedback from 7.0.3.
If people still experience performance issues with 7.0.3 this is definitely something we should consider for ownCloud 8.

a few things we should discuss when considering implementing this:

  • breadth-first or depth-first. (I'd vote for breadth-first)
  • sizes for caching: maybe also 1024x1024 for public sharing
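To illustrate the breadth-first option, here is a small sketch of a level-by-level folder walk, assuming previews would be generated per folder in the order the folders are yielded. `breadth_first_dirs` is a hypothetical helper, not part of ownCloud.

```python
import os
from collections import deque


def breadth_first_dirs(root):
    """Yield directories level by level, starting with root itself."""
    queue = deque([root])
    while queue:
        current = queue.popleft()
        yield current
        try:
            entries = sorted(os.listdir(current))
        except OSError:
            continue  # unreadable folder: skip it rather than abort the run
        for name in entries:
            path = os.path.join(current, name)
            # Do not follow symlinks, to avoid walking in circles.
            if os.path.isdir(path) and not os.path.islink(path):
                queue.append(path)
```

With this order, the top-level folders a user is likely to open first are covered before any deeply nested ones; a depth-first walk would instead finish one whole subtree before touching its siblings.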

@LukasReschke LukasReschke added this to the ownCloud 8 milestone Oct 21, 2014
@LukasReschke
Member Author

Setting to ownCloud 8 and triaging. - Let's close this if nobody complains, if not let's see what we can do.

@ronnicek

@karlitschek it's easy.. set up Raspberry Pi /w Owncloud and load gallery with some photos :)

@budulinek

Hi, first of all thanks a lot for considering the issue.

@karlitschek the problem is not the UI, but the preview generation.

I have just installed 7.0.3 and the UI itself is really quick. Good job. But the preview generation takes ages on slow machines (15 images take about 2 minutes). I have a router running OpenWrt (1200 MHz, 2GB RAM).

Background job is not faster. I agree. But once the process is over, all thumbnails are ready to be shown. And they do not have to be generated on-demand.

With the cron job, the user (server administrator) can decide when thumbnails are generated. Of course, it is difficult to choose optimal time. But it will be up to him to decide which time is optimal for the thumbnail pre-generation.

My suggestion:

  • create new php file, do not use existing cron.php for thumbnail pre-generation
  • the new php file would trigger preview generation for Gallery and for Files app
  • server administrator can schedule the preview generation using crontab
  • server administrator can set the priority of the thumbnail generation with nice
  • If the user visits the folder before the cronjob happened the on-demand generation would happen, otherwise the existing pre-generated thumbnails would be shown
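The scheduling part of this suggestion could look roughly like the sketch below, under the assumption that the pre-generation runs as a standalone script started from crontab. It lowers its own priority (the in-process equivalent of starting it through nice) and stops after a time budget so that a run cannot spill into busy hours. `lower_priority`, `run_with_budget` and the `generate` callback are hypothetical names.

```python
import os
import time


def lower_priority(increment=10):
    """Raise this process's niceness, like starting it through nice(1)."""
    try:
        return os.nice(increment)
    except (AttributeError, OSError):
        return None  # platform without nice(), or not permitted


def run_with_budget(file_ids, generate, budget_seconds, clock=time.monotonic):
    """Call generate(file_id) until the time budget runs out.

    Returns (done, remaining) so the next scheduled run can resume.
    """
    deadline = clock() + budget_seconds
    done = []
    pending = list(file_ids)
    while pending and clock() < deadline:
        file_id = pending.pop(0)
        generate(file_id)
        done.append(file_id)
    return done, pending
```

The budget is only checked between files, so a single slow image can still overrun it slightly.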

@oparoz
Contributor

oparoz commented Nov 24, 2014

  • Define a max size and expose it in the admin area, perhaps (see "Introducing the maximum size preview", #13674)
  • Suggest all apps use that setting for performance reasons (see "Preview providers should take maxX and maxY into consideration", #13607)
  • Add a batch processing button for users so that they can upload all the pics from their camera, press the button and come back later to check the result. Users don't know when the cron job will run and are less likely to get frustrated with the gallery loading pictures slowly if they know they simply have to wait 30 minutes for all previews to get generated

@DeepDiver1975
Member

As soon as we come to a conclusion regarding async file handling, job queuing, etc., thumbnail generation can make use of it as well. There should be no specific implementation only for thumbnail operations. Thanks.

@DeepDiver1975 DeepDiver1975 modified the milestones: 8.1-next, ownCloud 8 Jan 8, 2015
@oparoz
Contributor

oparoz commented Jan 22, 2015

A first step could be to detect uploads via the desktop client and generate thumbnails then. All the heavy lifting would be done at that stage.

@DeepDiver1975 DeepDiver1975 changed the title from "Migrate preview generation to cron" to "Migrate preview generation to async job execution" Jan 22, 2015
@MorrisJobke MorrisJobke added triage and removed triage labels Feb 27, 2015
@cbxk1xg

cbxk1xg commented Mar 12, 2015

Please consider my posting:
owncloud/gallery#88 (comment)

@danimo
Contributor

danimo commented Mar 31, 2015

Any update here? Thumbnail generation is painfully slow, and the way I see it, there are only two ways to improve things:

  • Make the client ask for all visible items only: Bad user experience during scrolling
  • Pre-render thumbnails: Potential storage overhead

Worst case, it doesn't help. I'd say that to judge that we'd need a reference implementation first.

@DeepDiver1975 is there any conclusion yet on the specifics you mentioned above?

@DeepDiver1975
Member

As of today thumbnails are stored after generation on the server.
The initial generation itself is done in a lazy manner - meaning as soon as a thumbnail is requested the first time it will be generated.

This initial lazy generation can cause issues - which might be solvable with the async job system we started to implement for 8.1 - we can use it for thumbnails in the dev cycle of 8.2

@DeepDiver1975 DeepDiver1975 added this to the 8.2-next milestone Mar 31, 2015
@PVince81
Contributor

Backlog for now as it is tricky due to encryption and sharing.

Maybe the issue really is "previews generation can take a long time" and this async job is just one way of improving it.

CC @pmaier1 for considering this in future releases

@ladiko

ladiko commented Feb 9, 2017

I tried to generate thumbnails on the filesystem by getting each JPG's fileid from the database and adding the previews to $username/thumbnails/$fileid/ (1200x896-max.png, 268x200-with-aspect.png, 200x200.png and 32x32.png), then scanning the thumbnails folder via occ files:scan --path ... But when I opened the gallery, it didn't use those thumbnails and regenerated them instead. So I looked for another solution. It took me several hours of trial and error, but I finally managed to create a bash script which is slow but works. It took 80 minutes to generate thumbnails for 2200 JPGs of 1200x896 pixels; the curl calls are very slow...

  • issues:
    • for other database backends than sqlite one has to adjust the query command
    • this sqlite query is restricted to jpgs as you see in the command - adjust the query to your needs
    • as all my pictures are 1200x896, i use these values for the preview generation, no idea if it works for images with different resolutions
#!/bin/bash

OC_PATH='/srv/http/owncloud'
OC_USER='owncloud_username'
OC_PASSWORD='owncloud_password'
OC_FOLDER='' # e.g. OC_FOLDER='path/within/your/owncloud/webinterface'
HTTP_USER='www-data' # http on archlinux

# re-run as the web server user (http / www-data) if started as someone else
[ "$(id -un)" = "${HTTP_USER}" ] || { sudo -u "${HTTP_USER}" "$0" "$@" ; exit ; }

# check if the selected folder exists
[ -e "${OC_PATH}/data/${OC_USER}/files/${OC_FOLDER}" ] || { echo "file not found: ${OC_PATH}/data/${OC_USER}/files/${OC_FOLDER}" ; exit 1; }

START_TIME=$(date +%s)
# collect "fileid|path" rows for all jpgs below the selected folder
FILES=$(sqlite3 "${OC_PATH}/data/owncloud.db" "select fileid,path from oc_filecache where path like 'files/${OC_FOLDER}%.jpg'")
AMOUNT=$(wc -l <<< "${FILES}")
COUNT=1
while IFS='|' read -r OC_FILEID OC_FILE ; do
        CURRENT_TIME=$(date +%s)
        EPOCH_DIFF=$((CURRENT_TIME - START_TIME))
        TIME_RUN=$(date -u -d "@${EPOCH_DIFF}" +%-Hh%-Mm%-Ss)
        ESTIMATED_DURATION=$(date -u -d "@$(( AMOUNT * EPOCH_DIFF / COUNT ))" +%-Hh%-Mm%-Ss)

        # print some stats about the progress
        echo "$((COUNT++))/${AMOUNT} ${TIME_RUN} of ${ESTIMATED_DURATION} - ${OC_PATH}/data/${OC_USER}/${OC_FILE}"
        # request both preview sizes so the server generates and stores them
        curl -sS --user "${OC_USER}:${OC_PASSWORD}" "https://owncloud.cosmoproducts.de/index.php/apps/gallery/api/preview/${OC_FILEID}/1280/896" > /dev/null
        curl -sS --user "${OC_USER}:${OC_PASSWORD}" "https://owncloud.cosmoproducts.de/index.php/apps/gallery/api/preview/${OC_FILEID}/268/200" > /dev/null
done <<< "${FILES}"

@ronnicek

Wohooo.. looks like solution is here https://twitter.com/Nextclouders/status/841246279279755264 :)

@hast0011

Viewing pictures is still veeeeryyyyyyy slow. Even 100 pictures render the system unusable, although it is responsive in files, calendar or contacts. You can use Bitnami, Arch Linux or Debian, and ownCloud from 7.0 to 10.0; it has been the same for years.
You can even try a ready-to-use stack in a virtual machine with 16 GB RAM.
How should this replace Google, Dropbox or OneDrive?
Even the preview in Windows Explorer over a network drive to the same data is 100 times faster.

Any preview generator would be faster than what we have now.

@jankaluza

jankaluza commented Oct 26, 2018

My approach to work around that, while adding all my historical photos to ownCloud, was the following:

  1. Open some gallery which does not have previews generated.
  2. Open Firefox web developer console (ctrl+shift+k).
  3. Wait for the XHR request to /apps/gallery/thumbnails.
  4. Right click on it and copy it as cURL to get the full curl command.
  5. Copy and paste that URL into the script below, reformat it so that "fileid" goes into the "ids" GET parameter, and run it.
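from __future__ import division
import os
import subprocess
import sys
from multiprocessing import Pool

import MySQLdb

# URL copied from the browser; "{0}" marks where each file id goes.
# The request token is elided here and must be filled in from your own session.
URL = ("http://172.16.16.141/owncloud/index.php/apps/gallery/thumbnails"
       "?ids={0}&scale=1&square=0&requesttoken=....")


def run_curl(row):
    """Request the thumbnail for one file id so the server generates it."""
    fileid = row[0]
    with open(os.devnull, "w") as devnull:
        # Passing a list avoids the shell, so the URL needs no quoting.
        subprocess.call(["curl", "-sS", URL.format(fileid)], stdout=devnull)


if __name__ == "__main__":
    db = MySQLdb.connect("localhost", "owncloud", "pass", "owncloud")
    cursor = db.cursor()
    # Restricted to .JPG paths; adjust the pattern for other extensions.
    cursor.execute("select fileid from oc_filecache where path like 'files/%.JPG'")
    results = cursor.fetchall()
    db.close()

    # 40 workers issue requests in parallel; each one just waits on curl.
    p = Pool(40)
    for i, _ in enumerate(p.imap_unordered(run_curl, results), 1):
        sys.stderr.write('\rdone {0:%}'.format(i / len(results)))
    p.close()
    p.join()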
You will need to tweak the script and probably also check other extensions, but for someone who can code that is easy to do, and it might save someone time, so I am adding it here.

@tryan225

Any update on this? Preview/thumbnail generation in large folders of photographs is enough to chew through all 1GB ram and knock over my server which otherwise runs ownCloud without a problem.

@ronnicek

@PVince81
Contributor

In the past, it was a design choice not to provide features if they could not be applied to the whole platform or could not work with some features. Such a preview generator would not work with user-key encryption.

Now design goals have changed a bit, so it should be possible to provide such a command / background job and add a disclaimer for user-key encryption. External storage will also be slow, as it will need to download all the files, so this could be added as a warning as well.

Note: the preview system will be reworked / rewritten in the future as the current way is suboptimal and outdated.

@nero120

nero120 commented Nov 30, 2018

Note: the preview system will be reworked / rewritten in the future as the current way is suboptimal and outdated.

That is a very welcome statement @PVince81! Any idea on when we might be able to expect it?

@jonferreira

Wondering if there's any updates on this subject? Preview is still painfully slow and sucks that thumbnails can't be generated "automagically" with a cronjob :(

@fungs

fungs commented Feb 12, 2019

@jonferreira see my suggestion #17916 from 2015. Basically, I was proposing to integrate another JPEG preview program which takes 3 sec instead of 3 min for my example digital photo on a weak CPU. Nothing has happened since :(

@nero120

nero120 commented Feb 12, 2019

@PVince81 are you able to incorporate @fungs solution into your reworking of the preview system?

@jonferreira

@fungs :(

@PVince81
Contributor

@nero120 the reason this was likely never incorporated back then is that the design goal of ownCloud was to be purely PHP, to make sure that everyone can deploy it on their shared hosters, etc.

Now in 2019 I'm still unsure if / how to incorporate such an external tool in a way that makes it work from OC. Maybe it needs an OC app that is able to call it with shell_exec.

In any case, I don't think that's the right direction and we should stay in PHP-world to keep the ease of deployment for everyone.

Currently we're reworking the frontend and integrating both the video player and slideshow view into a single app, see https://github.com/owncloud/files_mediaviewer (release coming soon).

This app implements the previews with fixed scales instead of expecting previews with every possible pixel size. So the viewer would pick the preview with the closest size instead of requesting a preview with the exact size from the server.

I think this is a good prerequisite for a future background job for preview generation, because said job could then simply build previews for all these fixed supported sizes.
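The "closest size" selection can be sketched as follows. This is an illustration only, not the files_mediaviewer code: the widths in `FIXED_WIDTHS` are made up, and the tie-breaking rule (prefer the larger preview so it is scaled down rather than up) is an assumption.

```python
# Hypothetical set of supported widths; the thread does not list the real ones.
FIXED_WIDTHS = (64, 256, 1024, 2048)


def closest_width(requested, available=FIXED_WIDTHS):
    """Pick the supported width nearest to the requested one.

    Ties go to the larger width so the image is scaled down, not up.
    """
    return min(available, key=lambda width: (abs(width - requested), -width))
```

A background job would then only need to loop over `FIXED_WIDTHS` for each image, instead of guessing which pixel sizes clients might request.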

Now I'm not aware of any timeline to provide such background job.
On server with 100k users it is likely that such background job would take ages to finish and consume a lot of CPU on many servers. So this needs careful consideration.

@pmaier1

@elpraga

elpraga commented Feb 12, 2019

Finally!! Set preview sizes!! Finally!

@nero120

nero120 commented Feb 13, 2019

Thanks for your comments @PVince81.

On server with 100k users it is likely that such background job would take ages to finish and consume a lot of CPU on many servers.

Whilst that may or may not be true (depending on whether the users on the server upload many high quality images), surely it's for the server admin to determine how best to spend the server's resources and that will depend on the hosting scenario. If you are an admin intending to provide users with a place to upload and share their photos then without preview generation this is not really fit for purpose.

I appreciate your point, but the risk can easily be mitigated by providing a preview generation job with parameters to enable admins to control when and how previews are generated. But I have to be honest, a cpu intensive task such as preview generation must be done in the background otherwise the experience is severely degraded whilst the user is forced to wait long moments every time they want to look at an image...

@PVince81
Contributor

so instead of an OC bkg job this could be implemented as an occ command instead, which gives more freedom for admins to decide when to run them.
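As a rough sketch of what the admin-facing options of such a command might look like: occ itself is PHP, so this Python stand-in only illustrates the kind of parameters and warnings discussed in this thread. Every flag name, the `generate-previews` program name, and both helper functions are hypothetical.

```python
import argparse


def build_parser():
    """Options an admin-facing preview command might accept (all hypothetical)."""
    parser = argparse.ArgumentParser(prog="generate-previews")
    parser.add_argument("--user", action="append", default=[],
                        help="limit the run to this user; repeatable")
    parser.add_argument("--path", default="",
                        help="only process files below this folder")
    parser.add_argument("--max-seconds", type=int, default=0,
                        help="stop after this many seconds; 0 means no limit")
    return parser


def warnings_for(user_key_encryption, external_storage):
    """Collect the disclaimers mentioned earlier in the thread."""
    notes = []
    if user_key_encryption:
        notes.append("user-key encryption: previews cannot be pre-generated")
    if external_storage:
        notes.append("external storage: files must be downloaded first, expect slow runs")
    return notes
```

For example, `build_parser().parse_args(["--user", "alice", "--max-seconds", "600"])` would describe a ten-minute run limited to one user, which an admin could schedule whenever the server is idle.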

@tribut
Member

tribut commented Feb 13, 2019

See also https://apps.nextcloud.com/apps/previewgenerator, which adds an occ command for this (Nextcloud).

@TNCS-git

After 5 years, this is still a 'maybe someday' milestone... I sincerely think ownCloud is losing the plot regarding its target audience along the way. I believe ownCloud is best suited to people who choose not to put their data in a public cloud offering, whether it is Dropbox, LiveDrive, SharePoint, etc. The reasons can be, but are not limited to:

  1. Privacy
  2. Deciding whether it should be opex (cloud-based) or capex (self-hosted solution)
  3. People who like more flexibility and control

I have many scenarios of my own in which a simple occ command line, letting admins generate thumbnails in the background whenever they choose, would work out better than on-demand generation. For one, copying the images over manually (without OC); for another, syncing images en masse without OC for performance reasons. I decided to sync from my Android phone using Folder sync; after 8000+ images, I realized it was impossible to use, as the ownCloud app can never finish generating the thumbnails.

Regardless of the situation, this type of limitation (and a few others) really makes me think that the ownCloud target audience is not the one mentioned above, but rather smaller cloud resellers. What puzzles me even more is that Nextcloud implemented this nearly two years ago, while ownCloud is still considering whether to do it or not...

@nagelp

nagelp commented Aug 18, 2020

so instead of an OC bkg job this could be implemented as an occ command instead, which gives more freedom for admins to decide when to run them.

Any update on this? I think an occ command would be perfect. My small home server takes really long to generate the thumbnails, so browsing images through the web interface is really painful.

@hast0011

hast0011 commented Aug 18, 2020 via email

@micbar
Contributor

micbar commented Sep 16, 2021

This will be implemented in the future in ocis. Closing here.

@micbar micbar closed this as completed Sep 16, 2021