[Bug]: File Search poor performance with lots of files because of index workaround #38243
Description
⚠️ This issue respects the following points: ⚠️
- This is a bug, not a question or a configuration/webserver/proxy issue.
- This issue is not already reported on Github (I've searched it).
- Nextcloud Server is up to date. See Maintenance and Release Schedule for supported versions.
- Nextcloud Server is running on 64bit capable CPU, PHP and OS.
- I agree to follow Nextcloud's Code of Conduct.
Bug description
When you search for a file in a NextCloud instance where the user has a lot of them (800k+), performance is atrocious. Might be related to having multiple storages, though not sure of it.
Alas, for some reason, this can be traced back to a deliberate decision that isn't clear to me though it must have a really good reason for it, hence me wanting to raise the question to make sure that the reason behind it still holds.
I'm showing the results of the current query that Nextcloud runs (and the MySQL explanation), and then the same for the changed query I put together. For the sake of intellectual correctness, I ran all queries after restarting the MySQL service, so that there would be no cache at the time.
Fixed Query: 0.428 secs
MariaDB [nextcloud]> SELECT `file`.`fileid`, `storage`, `path`, `path_hash`, `file`.`parent`, `file`.`name`, `mimetype`, `mimepart`, `size`, `mtime`, `storage_mtime`, `encrypted`, `etag`, `permissions`, `checksum`, `unencrypted_size` FROM `oc_filecache` `file` WHERE (`file`.`name` COLLATE utf8mb4_general_ci LIKE '%mario%') AND (((`storage` = 38) AND ((`path` = 'files') OR (`path` LIKE 'files/%'))) OR (`storage` = 30) OR (`storage` = 31) OR (`storage` = 32) OR (`storage` = 35) OR (`storage` = 36)) ORDER BY `mtime` desc LIMIT 6;
<...>
6 rows in set (0.428 sec)
Current Query in Nextcloud: 7.401 secs (it adds a +'0' to the order by condition and MySQL doesn't use the mtime index for it).
MariaDB [nextcloud]> SELECT `file`.`fileid`, `storage`, `path`, `path_hash`, `file`.`parent`, `file`.`name`, `mimetype`, `mimepart`, `size`, `mtime`, `storage_mtime`, `encrypted`, `etag`, `permissions`, `checksum`, `unencrypted_size` FROM `oc_filecache` `file` WHERE (`file`.`name` COLLATE utf8mb4_general_ci LIKE '%mario%') AND (((`storage` = 38) AND ((`path` = 'files') OR (`path` LIKE 'files/%'))) OR (`storage` = 30) OR (`storage` = 31) OR (`storage` = 32) OR (`storage` = 35) OR (`storage` = 36)) ORDER BY `mtime` +'0' desc LIMIT 6;
<...>
6 rows in set (7.401 sec)
Explanation for the fixed query: uses mtime index key
MariaDB [nextcloud]> explain SELECT `file`.`fileid`, `storage`, `path`, `path_hash`, `file`.`parent`, `file`.`name`, `mimetype`, `mimepart`, `size`, `mtime`, `storage_mtime`, `encrypted`, `etag`, `permissions`, `checksum`, `unencrypted_size` FROM `oc_filecache` `file` WHERE (`file`.`name` COLLATE utf8mb4_general_ci LIKE '%mario%') AND (((`storage` = 38) AND ((`path` = 'files') OR (`path` LIKE 'files/%'))) OR (`storage` = 30) OR (`storage` = 31) OR (`storage` = 32) OR (`storage` = 35) OR (`storage` = 36)) ORDER BY `mtime` desc LIMIT 6;
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+----------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+----------+---------+------+------+-------------+
| 1 | SIMPLE | file | index | fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size,fs_storage_path_prefix | fs_mtime | 8 | NULL | 6 | Using where |
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+----------+---------+------+------+-------------+
1 row in set (0.004 sec)
Explanation for the current query: uses fs_storage_path_prefix
MariaDB [nextcloud]> explain SELECT `file`.`fileid`, `storage`, `path`, `path_hash`, `file`.`parent`, `file`.`name`, `mimetype`, `mimepart`, `size`, `mtime`, `storage_mtime`, `encrypted`, `etag`, `permissions`, `checksum`, `unencrypted_size` FROM `oc_filecache` `file` WHERE (`file`.`name` COLLATE utf8mb4_general_ci LIKE '%mario%') AND (((`storage` = 38) AND ((`path` = 'files') OR (`path` LIKE 'files/%'))) OR (`storage` = 30) OR (`storage` = 31) OR (`storage` = 32) OR (`storage` = 35) OR (`storage` = 36)) ORDER BY `mtime` +'0' desc LIMIT 6;
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+------------------------+---------+------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+------------------------+---------+------+--------+----------------------------------------------------+
| 1 | SIMPLE | file | range | fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size,fs_storage_path_prefix | fs_storage_path_prefix | 267 | NULL | 814718 | Using index condition; Using where; Using filesort |
+------+-------------+-------+-------+-----------------------------------------------------------------------------------------------------+------------------------+---------+------+--------+----------------------------------------------------+
1 row in set (0.002 sec)
Also, for context, total number of files in instance (there's only a single user): 862k.
MariaDB [nextcloud]> select count(fileid) from oc_filecache;
+---------------+
| count(fileid) |
+---------------+
| 862541 |
+---------------+
1 row in set (0.477 sec)
Now, the thing is that there seems to be a very deliberate effort to force MySQL to use the index with fs_storage_path_prefix instead of mtime . However, for whatever reason, in this specific context, it just makes the search query very slow. I have multiple storage points (5, I think) and lots of files. The order by clause is what causes the slow-down - removing the order by clause works as well, but I imagine there is an intention (for the sake of usability) to show the most recent files first. The mtime index helps, but I struggle to understand why we wouldn't want to use it and in what scenarios does the other index perform better.
This change was introduced one year ago in #29300 and then #32074 by @icewind1991 . I'd love to learn more so that I can understand whether it makes sense for me to submit a pull request to revert it (as maybe the underlying context might have changed since and that workaround might no longer be necessary?), or should I just revert the change on my end alone (which I have done and seems to work as intended so far).
This issue seems to be similar to #35776 and others related to search performance, but it clearly seems there are many others in the actual team who are more knowledgeable about the topic, so would love to hear their guidance.
Regards, and keep up the great work.
Steps to reproduce
- Search for a file on the file search box, on a large repository.
- Results take at time over 10 seconds to come back.
Expected behavior
File search results to return shortly.
Installation method
Community Manual installation with Archive
Nextcloud Server version
25
Operating system
Debian/Ubuntu
PHP engine version
PHP 8.1
Web server
Apache (supported)
Database engine version
MariaDB
Is this bug present after an update or on a fresh install?
Updated to a major version (ex. 22.2.3 to 23.0.1)
Are you using the Nextcloud Server Encryption module?
None
What user-backends are you using?
- Default user-backend (database)
- LDAP/ Active Directory
- SSO - SAML
- Other
Configuration report
{
"system": {
"instanceid": "***REMOVED SENSITIVE VALUE***",
"passwordsalt": "***REMOVED SENSITIVE VALUE***",
"secret": "***REMOVED SENSITIVE VALUE***",
"trusted_domains": [
"<redacted>"
],
"datadirectory": "***REMOVED SENSITIVE VALUE***",
"dbtype": "mysql",
"version": "25.0.3.2",
"overwrite.cli.url": "<redacted>",
"dbname": "***REMOVED SENSITIVE VALUE***",
"dbhost": "***REMOVED SENSITIVE VALUE***",
"dbport": "",
"dbtableprefix": "oc_",
"default_phone_region": "pt",
"defaultapp": "files",
"mysql.utf8mb4": true,
"dbuser": "***REMOVED SENSITIVE VALUE***",
"dbpassword": "***REMOVED SENSITIVE VALUE***",
"installed": true,
"twofactor_enforced": "true",
"twofactor_enforced_groups": [],
"twofactor_enforced_excluded_groups": [],
"mail_smtpmode": "sendmail",
"mail_smtpauthtype": "LOGIN",
"mail_sendmailmode": "pipe",
"mail_smtpsecure": "ssl",
"mail_from_address": "***REMOVED SENSITIVE VALUE***",
"mail_domain": "***REMOVED SENSITIVE VALUE***",
"mail_smtpauth": 1,
"mail_smtphost": "***REMOVED SENSITIVE VALUE***",
"mail_smtpport": "587",
"mail_smtpname": "***REMOVED SENSITIVE VALUE***",
"mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
"maintenance": false,
"memcache.local": "\\OC\\Memcache\\Redis",
"memcache.locking": "\\OC\\Memcache\\Redis",
"redis": {
"host": "***REMOVED SENSITIVE VALUE***",
"port": 6379
},
"theme": "",
"loglevel": 2,
"updater.release.channel": "stable"
}
}
List of activated Apps
Enabled:
- activity: 2.17.0
- bruteforcesettings: 2.5.0
- calendar: 4.3.3
- circles: 25.0.0
- cloud_federation_api: 1.8.0
- comments: 1.15.0
- contacts: 5.2.0
- contactsinteraction: 1.6.0
- dashboard: 7.5.0
- dav: 1.24.0
- federatedfilesharing: 1.15.0
- federation: 1.15.0
- files: 1.20.1
- files_external: 1.17.0
- files_pdfviewer: 2.6.0
- files_rightclick: 1.4.0
- files_sharing: 1.17.0
- files_trashbin: 1.15.0
- files_versions: 1.18.0
- firstrunwizard: 2.14.0
- logreader: 2.10.0
- lookup_server_connector: 1.13.0
- nextcloud_announcements: 1.14.0
- notifications: 2.13.1
- oauth2: 1.13.0
- onlyoffice: 7.8.0
- password_policy: 1.15.0
- photos: 2.0.1
- previewgenerator: 5.2.4
- privacy: 1.9.0
- provisioning_api: 1.15.0
- related_resources: 1.0.4
- serverinfo: 1.15.0
- settings: 1.7.0
- sharebymail: 1.15.0
- support: 1.8.0
- survey_client: 1.13.0
- systemtags: 1.15.0
- tasks: 0.14.5
- text: 3.6.0
- theming: 2.0.1
- twofactor_backupcodes: 1.14.0
- twofactor_nextcloud_notification: 3.6.0
- twofactor_totp: 7.0.0
- updatenotification: 1.15.0
- user_status: 1.5.0
- viewer: 1.9.0
- weather_status: 1.5.0
- workflow_script: 1.10.0
- workflowengine: 2.7.0
Disabled:
- admin_audit
- encryption
- extract: 1.3.5
- files_accesscontrol: 1.15.1
- music: 1.8.3
- printer: 0.0.3
- recommendations: 0.6.0
- suspicious_login
- twofactor_gateway: 0.20.0
- user_ldap
Nextcloud Signing status
No errors have been found.
Nextcloud Logs
No response
Additional info
No response
Activity