Closed
Description
At first glance, I would consider this harmful if one is not careful.
There may be two reasons why non-duplicates are detected as duplicates:
- sys_file.storage should be checked as well (if more than one storage is used)
- the DB may compare case-insensitively, so files such as /myfile.jpg and /MYFILE.jpg are detected as duplicates (this depends on the DB collation; e.g. utf8mb4_general_ci is case-insensitive)
What collation does TYPO3 use by default?
See v11.5:
public/typo3/sysext/install/Classes/Controller/InstallerController.php: 'collate' => 'utf8mb4_unicode_ci',
Reproduce
Try adding the following examples as files and check whether they are detected as duplicates.
Examples:
sys_file.storage | sys_file.identifier | file |
---|---|---|
1 (fileadmin) | /dir1/abc.jpg | fileadmin/dir1/abc.jpg |
2 (media) | /dir1/abc.jpg | media/dir1/abc.jpg |
1 (fileadmin) | /dir1/ABC.jpg | fileadmin/dir1/ABC.jpg |
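The effect can be sketched without a TYPO3 installation. The following is only an illustration under stated assumptions: SQLite's NOCASE collation stands in for a case-insensitive MySQL collation such as utf8mb4_unicode_ci, the table is reduced to three columns, and the two queries are hypothetical, not the ones actually used by the duplicate check.

```python
import sqlite3

# Stand-in for the sys_file table. SQLite's NOCASE collation plays the role of
# a case-insensitive MySQL collation (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sys_file (uid INTEGER PRIMARY KEY, storage INTEGER, "
    "identifier TEXT COLLATE NOCASE)"
)
conn.executemany(
    "INSERT INTO sys_file (storage, identifier) VALUES (?, ?)",
    [(1, "/dir1/abc.jpg"), (2, "/dir1/abc.jpg"), (1, "/dir1/ABC.jpg")],
)


def duplicate_groups(sql):
    """Return the sizes of all groups that contain more than one row."""
    return sorted(row[0] for row in conn.execute(sql))


# Naive check: group by identifier only, using the column's collation.
naive = duplicate_groups(
    "SELECT COUNT(*) FROM sys_file GROUP BY identifier HAVING COUNT(*) > 1"
)

# Stricter check: include storage and force a byte-wise comparison.
strict = duplicate_groups(
    "SELECT COUNT(*) FROM sys_file "
    "GROUP BY storage, identifier COLLATE BINARY HAVING COUNT(*) > 1"
)

print(naive)   # the three example rows collapse into a single group
print(strict)  # no group has more than one row
```

With the naive query, all three example files land in one group even though none of them is a duplicate of another; the stricter query reports no duplicates.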
Recommendation:
- always check sys_file.storage as well (for a duplicate, identifier and storage must both be identical)
- improve the DB query, e.g. by also checking that identifier_hash or sha1 is identical (identifier_hash should suffice, I think)
- or do a binary comparison of the field
- or additionally compare the strings in PHP
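The last option, an additional exact comparison in application code, could look roughly like this. It is a language-neutral sketch written in Python rather than PHP; the function name `find_duplicates`, the row layout `(uid, storage, identifier)`, and the fourth example row are hypothetical and not taken from TYPO3.

```python
from collections import defaultdict


def find_duplicates(rows):
    """Group rows by (storage, identifier) using exact, case-sensitive matching."""
    groups = defaultdict(list)
    for uid, storage, identifier in rows:
        # Tuple keys compare both fields exactly, so "/dir1/abc.jpg" and
        # "/dir1/ABC.jpg" never share a group, nor do different storages.
        groups[(storage, identifier)].append(uid)
    return {key: uids for key, uids in groups.items() if len(uids) > 1}


rows = [
    (1, 1, "/dir1/abc.jpg"),
    (2, 2, "/dir1/abc.jpg"),
    (3, 1, "/dir1/ABC.jpg"),
    (4, 1, "/dir1/abc.jpg"),  # hypothetical true duplicate of uid 1
]
duplicates = find_duplicates(rows)
print(duplicates)
```

Only uids 1 and 4 are reported, because they match in both storage and the exact identifier string.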