Skip to content

Files search is very slow - SQL query very complex esp. when collectives, talk installed #44596

Open
@pjrobertson

Description

⚠️ This issue respects the following points: ⚠️

Bug description

I have a server with 465348 lines in the oc_filecache table. Searching for a file using unified search, the request takes about 1.5-2s to return a response. Upon investigation, the underlying SQL query (I am on postgresql) takes around 1-1.5s. I feel like there is room for optimisation here.

Below are my findings (scroll down for original sql queries and process of getting to a conclusion):

Update: You can view a nice overview of the perpared statement ahd which parts take the longest here: http://tatiyants.com/pev/#/plans/plan_1711947023189

It's clear that the searches used for collectives/talk/deck are the heaviest, taking up ~50% of the execution time (2% execution time for each of the ~25 searches in my case, adding up to ~50%). This is confirmed by removing all the corresponding parts of the SQL query. Just using the first part reduces the time by half or 687ms.

explain analyze SELECT "file"."fileid", "storage", "path", "path_hash", "file"."parent", "file"."name", "mimetype", "mimepart", "size", "mtime", "storage_mtime", "encrypted", "etag", "file"."permissions", "checksum", "unencrypted_size", "meta"."json" AS "meta_json", "meta"."sync_token" AS "meta_sync_token" FROM "oc_filecache" "file" LEFT JOIN "oc_files_metadata" "meta" ON "file"."fileid" = "meta"."file_id" WHERE ("file"."name" ILIKE '%so%') AND ((("storage" = '4') AND (("path" = 'files') OR ("path" LIKE 'files/%'))));
                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=4869.62..60019.75 rows=878 width=738) (actual time=60.518..686.015 rows=11185 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Nested Loop Left Join  (cost=3869.62..58931.95 rows=366 width=738) (actual time=20.113..586.505 rows=3728 loops=3)
         ->  Parallel Bitmap Heap Scan on oc_filecache file  (cost=3869.33..57153.65 rows=366 width=233) (actual time=19.982..562.403 rows=3728 loops=3)
               Recheck Cond: (storage = '4'::bigint)
               Filter: (((name)::text ~~* '%so%'::text) AND (((path)::text = 'files'::text) OR ((path)::text ~~ 'files/%'::text)))
               Rows Removed by Filter: 104122
               Heap Blocks: exact=9659
               ->  Bitmap Index Scan on fs_storage_mimepart  (cost=0.00..3869.11 rows=325958 width=0) (actual time=50.535..50.536 rows=345536 loops=1)
                     Index Cond: (storage = '4'::bigint)
         ->  Index Scan using files_meta_fileid on oc_files_metadata meta  (cost=0.29..4.86 rows=1 width=513) (actual time=0.005..0.005 rows=0 loops=11185)
               Index Cond: (file_id = file.fileid)
 Planning Time: 1.275 ms
 Execution Time: 687.419 ms

Possible solutions:

I've dug a bit deeper, and perhaps (somehow) in lib/private/Files/Node/Folder.php we can filter out the caches that are for apps that have their own provider (or just hard code it to collectives, talk etc). It seems messy, and I don't know how to do it (I don't really understand ICache) but perhaps it's doable. Another way might be to allow getCachesAndMountPointsForSearch to filter out certain mount points / caches. E.g. $searchHelper->getCachesAndMountPointsForSearch($this->root, $this->path, $exclude->exclude, $limitToHome)

[$caches, $mountByMountPoint] = $searchHelper->getCachesAndMountPointsForSearch($this->root, $this->path, $limitToHome);
		
		// filter out collectives and talk caches - they have their own search providers and only slow down the file search if we add them here
		$caches = array_filter($caches, function (ICache $cache) {
                        // !!!PSEUDO CODE!!!
			return !in_array($cache->getStorage()->getId(), ['collectives', 'talk']);
		});

Raw sql queries

Using postgresql's slow query log, here is an example of one query, for which I have subsequently run ANALYZE on it:

LOG:  duration: 1443.320 ms  execute <unnamed>:  SELECT "file"."fileid", "storage", "path", "path_hash", "file"."parent", "file"."name", "mimetype", "mimepart", "size", "mtime", "storage_mtime", "encrypted", "etag", "file"."permissions", "checksum", "unencrypted_size", "meta"."json" AS "meta_json", "meta"."sync_token" AS "meta_sync_token" FROM "oc_filecache" "file" LEFT JOIN "oc_files_metadata" "meta" ON "file"."fileid" = "meta"."file_id" WHERE ("file"."name" ILIKE $1) AND ((("storage" = $2) AND (("path" = $3) OR ("path" LIKE $4))) OR (("storage" = $5) AND (("path" = $6) OR ("path" LIKE $7))) OR (("storage" = $8) AND (("path" = $9) OR ("path" LIKE $10))) OR (("storage" = $11) AND (("path" = $12) OR ("path" LIKE $13))) OR (("storage" = $14) AND (("path" = $15) OR ("path" LIKE $16))) OR (("storage" = $17) AND (("path" = $18) OR ("path" LIKE $19))) OR (("storage" = $20) AND (("path" = $21) OR ("path" LIKE $22))) OR (("storage" = $23) AND (("path" = $24) OR ("path" LIKE $25))) OR (("storage" = $26) AND (("path" = $27) OR ("path" LIKE $28))) OR (("storage" = $29) AND (("path" = $30) OR ("path" LIKE $31))) OR (("storage" = $32) AND (("path" = $33) OR ("path" LIKE $34))) OR (("storage" = $35) AND (("path" = $36) OR ("path" LIKE $37))) OR (("storage" = $38) AND (("path" = $39) OR ("path" LIKE $40))) OR (("storage" = $41) AND (("path" = $42) OR ("path" LIKE $43))) OR (("storage" = $44) AND (("path" = $45) OR ("path" LIKE $46))) OR (("storage" = $47) AND (("path" = $48) OR ("path" LIKE $49))) OR (("storage" = $50) AND (("path" = $51) OR ("path" LIKE $52))) OR (("storage" = $53) AND (("path" = $54) OR ("path" LIKE $55))) OR (("storage" = $56) AND (("path" = $57) OR ("path" LIKE $58))) OR (("storage" = $59) AND (("path" = $60) OR ("path" LIKE $61))) OR (("storage" = $62) AND (("path" = $63) OR ("path" LIKE $64))) OR (("storage" = $65) AND (("path" = $66) OR ("path" LIKE $67))) OR (("storage" = $68) AND (("path" = $69) OR ("path" LIKE $70))) OR (("storage" = $71) AND (("path" = $72) OR ("path" LIKE $73))) OR (("storage" = $74) AND (("path" = $75) OR ("path" LIKE $76))) OR (("storage" = $77) AND (("path" = $78) OR ("path" LIKE $79))) OR (("storage" = $80) AND ("path" = $81)) OR (("storage" = $82) AND ("path" = $83)) OR (("storage" = $84) AND ("path" = $85)) OR (("storage" = $86) AND ("path" = $87)) OR (("storage" = $88) AND ("path" = $89)) OR (("storage" = $90) AND ("path" = $91)) OR (("storage" = $92) AND ("path" = $93)) OR (("storage" = $94) AND ("path" = $95)) OR (("storage" = $96) AND ("path" = $97)) OR (("storage" = $98) AND ("path" = $99)) OR (("storage" = $100) AND ("path" = $101)) OR (("storage" = $102) AND ("path" = $103)) OR (("storage" = $104) AND ("path" = $105)) OR (("storage" = $106) AND ("path" = $107)) OR (("storage" = $108) AND ("path" = $109)) OR (("storage" = $110) AND ("path" = $111)) OR (("storage" = $112) AND ("path" = $113)) OR (("storage" = $114) AND ("path" = $115)) OR (("storage" = $116) AND ("path" = $117)) OR (("storage" = $118) AND ("path" = $119)) OR (("storage" = $120) AND ("path" = $121)) OR (("storage" = $122) AND ("path" = $123)) OR (("storage" = $124) AND ("path" = $125)) OR (("storage" = $126) AND ("path" = $127)) OR (("storage" = $128) AND ("path" = $129)) OR (("storage" = $130) AND ("path" = $131)) OR (("storage" = $132) AND ("path" = $133)) OR (("storage" = $134) AND ("path" = $135)) OR (("storage" = $136) AND ("path" = $137)) OR (("storage" = $138) AND ("path" = $139)) OR (("storage" = $140) AND ("path" = $141)) OR (("storage" = $142) AND ("path" = $143)) OR (("storage" = $144) AND ("path" = $145)) OR (("storage" = $146) AND ("path" = $147)) OR (("storage" = $148) AND ("path" = $149)) OR (("storage" = $150) AND ("path" = $151)) OR (("storage" = $152) AND ("path" = $153)) OR (("storage" = $154) AND ("path" = $155)) OR (("storage" = $156) AND ("path" = $157)) OR (("storage" = $158) AND ("path" = $159)) OR (("storage" = $160) AND ("path" = $161)) OR (("storage" = $162) AND ("path" = $163)) OR (("storage" = $164) AND ("path" = $165)) OR (("storage" = $166) AND ("path" = $167)) OR (("storage" = $168) AND ("path" = $169)) OR (("storage" = $170) AND ("path" = $171)) OR (("storage" = $172) AND ("path" = $173))) ORDER BY "mtime" + $174 desc LIMIT 5

Analyzed (with args partially obfuscated):



Limit  (cost=312527.91..312528.49 rows=5 width=746) (actual time=1078.179..1091.077 rows=5 loops=1)
   ->  Gather Merge  (cost=312527.91..312615.41 rows=750 width=746) (actual time=952.836..965.730 rows=5 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=311527.88..311528.82 rows=375 width=746) (actual time=902.232..902.235 rows=4 loops=3)
               Sort Key: ((file.mtime + '0'::bigint)) DESC
               Sort Method: top-N heapsort  Memory: 33kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 28kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 30kB
               ->  Nested Loop Left Join  (cost=0.29..311521.65 rows=375 width=746) (actual time=49.220..897.684 rows=3726 loops=3)
                     ->  Parallel Seq Scan on oc_filecache file  (cost=0.00..309711.86 rows=375 width=233) (actual time=49.114..874.850 rows=3726 loops=3)
                           Filter: (((name)::text ~~* '%so%'::text) AND (((storage = '4'::bigint) AND (((path)::text = 'files'::text) OR ((path)::text ~~ 'files/%'::text))) OR 
((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/3'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/3/%'::text))) OR ((storage = 
'178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/4'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/4/%'::text))) OR ((storage = '178'::bigin
t) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/5'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/5/%'::text))) OR ((storage = '178'::bigint) AND (((pa
th)::text = 'appdata_ocfhweyb1d2t/collectives/7'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/7/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 
'appdata_ocfhweyb1d2t/collectives/6'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/6/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocf
hweyb1d2t/collectives/8'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/8/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/co
llectives/12'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/12/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/
18'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/18/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/19'::text)
 OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/19/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/16'::text) OR ((path
)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/16/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/22'::text) OR ((path)::text ~~
 'appdata\_ocfhweyb1d2t/collectives/22/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/26'::text) OR ((path)::text ~~ 'appdata\
_ocfhweyb1d2t/collectives/26/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/27'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1
d2t/collectives/27/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/25'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collec
tives/25/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/10'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/10/%
'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/30'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/30/%'::text)))
 OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/36'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/36/%'::text))) OR ((stor
age = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/38'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/38/%'::text))) OR ((storage = '178
'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/39'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/39/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/29'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/29/%'::text))) OR
 ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/43'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/43/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/44'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/44/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 'appdata_ocfhweyb1d2t/collectives/42'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/42/%'::text))) OR ((storage = '178'::bigint) AND (((path)::text = 
'appdata_ocfhweyb1d2t/collectives/45'::text) OR ((path)::text ~~ 'appdata\_ocfhweyb1d2t/collectives/45/%'::text))) OR ((storage = '185'::bigint) AND (((path)::text = 'files/My Share'::text) OR ((path)::text ~~ 'files/My Share/%'::text))) OR ((storage = '206'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '210'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '210'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.PNG'::text)) OR ((storage = '206'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '206'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '211'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.m..png'::text)) OR ((storage = '211'::bigint) AND ((path)::text = 'files/
Talk/my_file_redacted.m..png'::text)) OR ((storage = '211'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.JPG'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.php'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '233'::bigint) 
AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '233'::bigint) AND ((path)::text = 'files/Talk/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.jpg'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) 
OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) 
AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.php'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text)) OR ((storage = '171'::bigint) AND ((path)::text = 'files/Deck/my_file_redacted.png'::text))))
                           Rows Removed by Filter: 465348
                     ->  Index Scan using files_meta_fileid on oc_files_metadata meta  (cost=0.29..4.82 rows=1 width=513) (actual time=0.005..0.005 rows=0 loops=11179)
                           Index Cond: (file_id = file.fileid)
 Planning Time: 4.460 ms
 JIT:
  Functions: 391
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 54.835 ms, Inlining 0.000 ms, Optimization 26.228 ms, Emission 527.024 ms, Total 608.087 ms
 Execution Time: 1758.145 ms

Steps to reproduce

  1. Have a lot of files on your NC install (500k+ ideal)
  2. Enable slow query logging in postgres (or MySQL I guess) - set the slow query time to anything above 500ms
  3. Run a search in the unified search
  4. View slow log file

Expected behavior

Search should be faster

Installation method

Community VM appliance

Nextcloud Server version

28

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.2

Web server

Apache (supported)

Database engine version

PostgreSQL

Is this bug present after an update or on a fresh install?

Fresh Nextcloud Server install

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

  • Default user-backend (database)
  • LDAP/ Active Directory
  • SSO - SAML
  • Other

Configuration report

No response

List of activated Apps

No response

Nextcloud Signing status

No response

Nextcloud Logs

No response

Additional info

No response

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions