requirements: pin PyYAML and add a test for a trailing tab #314
Description
I tried to import some records finalised last week (ins763822, ins791177, ins919778) that contain trailing tab characters in the YAML data files. These data files gave exceptions like yaml.scanner.ScannerError: while scanning for the next token found character '\t' that cannot start any token. This seems to be a long-standing bug in PyYAML (yaml/pyyaml#306 and yaml/pyyaml#450). The problem is avoided by using LibYAML, for example, yaml.CSafeLoader instead of yaml.SafeLoader. With PyYAML 5.3.1 on my laptop (macOS), the LibYAML extension is not automatically included, but this problem is resolved by upgrading to PyYAML 5.4.1 (yaml/pyyaml#407). It seems that the HEPData Docker images (Linux) use LibYAML with both PyYAML 5.3.1 and PyYAML 5.4.1, so this might just be a macOS problem. It can be solved by pinning PyYAML 5.4.1 in requirements.txt. We should also add a test for a trailing tab to check the PyYAML installation.
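
A possible shape for the trailing-tab test, as a minimal sketch rather than the final implementation. It assumes that checking for yaml.CSafeLoader is an acceptable way to detect the LibYAML extension. The sample document and the function names are hypothetical and not taken from the HEPData code base.

```python
import yaml

# Hypothetical sample: a mapping whose value is followed by a trailing tab.
TRAILING_TAB_YAML = "value: 1\t\n"


def libyaml_available():
    """Return True if PyYAML exposes the LibYAML-based CSafeLoader."""
    return hasattr(yaml, "CSafeLoader")


def load_trailing_tab(loader):
    """Parse the sample document with the given loader class."""
    return yaml.load(TRAILING_TAB_YAML, Loader=loader)


def test_trailing_tab():
    """Fail if LibYAML is missing or cannot parse a trailing tab."""
    assert libyaml_available(), "PyYAML was installed without LibYAML"
    assert load_trailing_tab(yaml.CSafeLoader) == {"value": 1}
```

Running this under pytest would flag an installation where PyYAML falls back to the pure-Python loaders, which is the situation described above for PyYAML 5.3.1 on macOS.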