
Some DOCX files are not properly recognized. #38

Open
@majkel89


Describe the bug
I noticed this issue on one of my production services.
I was able to reproduce it with the issue311docx.testfile file from the https://github.com/file/file repository.

To Reproduce
Steps to reproduce the behavior:

  1. Go to PR #37, "test: added example of docx that does not match current magic numbers".
  2. See that the tests fail on 365-issue311.docx.

Expected behavior
365-issue311.docx is properly recognized as DOCX

Additional context
DOCX is hard to recognize because it is a ZIP archive with a different extension.
A DOCX can be verified by searching for common file names within its contents.
Moreover, a ZIP file is hard to verify because one should start checking it from the very end, where its central directory is stored.
I am not sure whether this bug should be fixed here, or whether you just need to let others know that for DOCX they should use something else, e.g. https://github.com/hey-red/Mime/tree/master
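
To illustrate the idea, here is a minimal sketch in Python (not this library's code) of a content-based check. It relies on the standard `zipfile` module, which locates the archive's end-of-central-directory record by searching from the end of the data, and then looks for member names that DOCX files typically contain. The function name `looks_like_docx` is hypothetical, and the two member names are an assumed heuristic: `[Content_Types].xml` and `word/document.xml` are the usual names, but a valid document could in principle use a different main part name.

```python
import io
import zipfile

# Heuristic assumption: typical DOCX packages contain both of these members.
DOCX_MARKERS = ("[Content_Types].xml", "word/document.xml")


def looks_like_docx(data: bytes) -> bool:
    """Return True if data is a ZIP archive that contains typical DOCX members."""
    buffer = io.BytesIO(data)
    # is_zipfile searches for the end-of-central-directory record,
    # which sits at the end of the archive rather than at the start.
    if not zipfile.is_zipfile(buffer):
        return False
    try:
        with zipfile.ZipFile(buffer) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    return all(marker in names for marker in DOCX_MARKERS)
```

A check like this needs the whole file (or at least its tail), which is probably why a detector that only inspects the leading bytes cannot tell DOCX apart from other ZIP-based formats.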

Labels

enhancement (New feature or request)
