list of disallowed special characters for filenames is not accurate #8926

Open
matthew-a-dunlap opened this issue Aug 19, 2022 · 5 comments
Labels
Feature: File Upload & Handling, Hackathon: Low Hanging Fruit, hacktoberfest (It's Hacktoberfest! https://groups.google.com/g/dataverse-community/c/n_Nn_T2yA-w/m/BcoXO4tEAQAJ), Help Wanted: Code, Help Wanted: Documentation, Mentor: pdurbin, Type: Bug (a defect), User Role: Depositor (Creates datasets, uploads data, etc.)

Comments

@matthew-a-dunlap
Contributor

matthew-a-dunlap commented Aug 19, 2022

While working on CORE2, which uploads to Dataverse, I'm writing code to prevent certain characters in filenames.

Testing Dataverse's restrictions, it reports File Name cannot contain any of the following characters: / : * ? " < > | ; # .

This info is incorrect for at least two reasons:

  1. This filename still errors even though none of those characters are present: !$%&’()+,-=@[\]^_{}~. It seems this is because the \ character is prevented as well.
  2. The . character is not actually prevented by Dataverse as far as I can tell, even if it's used multiple times.

It would be useful for the user to have a more accurate error message. It would also be useful if this were in the documentation (I may have missed it). This happens on 5.3 and 5.10.1.
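As a rough illustration of what a more accurate message could report, here is a minimal client-side sketch. It assumes the disallowed set is the characters from the error message (without the trailing period) plus the backslash observed above; the class and method names are hypothetical and this is not Dataverse's own code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class FileNameReport {
    // Assumption: characters listed in the error message, minus the period, plus backslash.
    static final String DISALLOWED = "/:*?\"<>|;#\\";

    // Returns the distinct disallowed characters found in the name, in order of first appearance.
    static Set<Character> offendingCharacters(String name) {
        Set<Character> found = new LinkedHashSet<>();
        for (char c : name.toCharArray()) {
            if (DISALLOWED.indexOf(c) >= 0) {
                found.add(c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // A backslash is reported even though the current message does not list it.
        System.out.println(offendingCharacters("citations\\files.txt"));
        // Periods are not reported, however many there are.
        System.out.println(offendingCharacters("report.final.v2.txt"));
    }
}
```

Reporting the specific offending characters, rather than a fixed list, would avoid the kind of mismatch described in this issue.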

@pdurbin pdurbin added the labels Hackathon: Low Hanging Fruit, Help Wanted: Code, Help Wanted: Documentation, Mentor: pdurbin, and hacktoberfest (It's Hacktoberfest! https://groups.google.com/g/dataverse-community/c/n_Nn_T2yA-w/m/BcoXO4tEAQAJ) on Oct 1, 2022
@shlake
Contributor

shlake commented Oct 18, 2022

I'm working on this, but have found another problem. See #9080.

As noted above, \ isn't "allowed" in a filename, yet the Dataverse software silently changes a filename containing \ instead of generating an error. For example, the filename citations\files.txt is changed to files.txt without an error.

@poikilotherm
Contributor

poikilotherm commented Oct 18, 2022

Related:

@shlake
Contributor

shlake commented Oct 18, 2022

@matthew-a-dunlap I'm still checking the invalid characters, but I have figured out that the . in the error list is just the end of the sentence. It does not mean that "period" is an invalid character, but it does need to be removed to avoid confusion.

The error message is coming from this file: WEB-INF/classes/ValidationMessages.properties

@ErykKul
Collaborator

ErykKul commented Sep 28, 2023

Pattern for the "label" in the code: https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java#L72
regexp="^[^:<>;#/\"\\*\\|\\?\\\\]*$"
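To see what that pattern accepts and rejects, here is a small sketch that compiles the same regular expression and tries a few names from this thread. The class and method names are hypothetical, and the sketch assumes the whole label must match the pattern (as Bean Validation's `@Pattern` does).

```java
import java.util.regex.Pattern;

class FileLabelCheck {
    // Regular expression copied from the comment above; everything else is illustrative.
    static final Pattern LABEL = Pattern.compile("^[^:<>;#/\"\\*\\|\\?\\\\]*$");

    // True when the entire name consists only of characters the pattern allows.
    static boolean isValidLabel(String name) {
        return LABEL.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidLabel("data.v1.0.csv"));          // periods are accepted
        System.out.println(isValidLabel("citations\\files.txt"));   // backslash is rejected
        System.out.println(isValidLabel("!$%&'()+,-=@[\\]^_{}~"));  // rejected because of the backslash
        System.out.println(isValidLabel("!$%&'()+,-=@[]^_{}~"));    // accepted once the backslash is gone
    }
}
```

Read this way, the pattern excludes : < > ; # / " * | ? and \, which matches both observations in the original report: the backslash is disallowed, and the period is not.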

@ErykKul
Collaborator

ErykKul commented Sep 28, 2023

Directory name validator: https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/FileDirectoryNameValidator.java#L32

String validCharacters = "[\\w\\\\/. -]+";
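For comparison, here is a sketch of how that character class behaves. It assumes the validator requires the entire directory name to match; the class and method names are hypothetical and the real validator may handle additional cases (such as null values) differently.

```java
import java.util.regex.Pattern;

class DirectoryNameCheck {
    // Character class copied from the comment above: word characters, backslash, slash, period, space, hyphen.
    static final Pattern VALID = Pattern.compile("[\\w\\\\/. -]+");

    // Assumption: the whole name must match, so a single other character makes it invalid.
    static boolean isValidDirectoryName(String name) {
        return VALID.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidDirectoryName("data/raw files")); // slash and space are allowed
        System.out.println(isValidDirectoryName("data\\raw"));      // backslash is allowed here
        System.out.println(isValidDirectoryName("data:raw"));       // colon is not allowed
    }
}
```

Note the contrast with the file label pattern: directory names use an allow-list that includes \ and /, while file labels use a deny-list that excludes both.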

@pdurbin pdurbin added the labels Feature: File Upload & Handling, Type: Bug (a defect), and User Role: Depositor (Creates datasets, uploads data, etc.) on Oct 8, 2023
@pdurbin pdurbin changed the title from "File Name list of disallowed special characters it not accurate" to "list of disallowed special characters for filenames is not accurate" on Mar 29, 2024
@DS-INRAE DS-INRAE moved this to 🔍 Interest in Recherche Data Gouv Jul 10, 2024