Avoid nested bags by default #187

Open
wants to merge 1 commit into
base: master

theatischbein

Issue #186 describes the unwanted creation of nested bags. This PR fixes that behavior and closes the issue.
Currently, nothing tells the user that a nested bag is being created.

Because some users may rely on nested bags, I added a flag that still allows creating them. By default, a RuntimeError is raised.

Changes

  • Added a function is_bag(bag_dir), which uses the Bag constructor to test whether a directory is already a bag.
  • Added the flag allow_nested_bag=False to the function make_bag.
  • Added logic to make_bag that raises a RuntimeError if the given bag_dir is already a bag (checked with the new function is_bag) and allow_nested_bag is not set.
  • Added test cases for the functions is_bag and make_bag.
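
The guard described in the list above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: is_bag here only looks for a bagit.txt file as a stand-in for calling the Bag constructor (so the sketch runs without bagit installed), and check_nested_bag is a hypothetical helper name for the check that make_bag performs.

```python
import os
import tempfile


def is_bag(bag_dir):
    """Return True if bag_dir looks like an existing bag.

    Assumption: the PR tests this via the Bag constructor. This stand-in
    only checks for the bagit.txt declaration file so that the sketch
    has no dependency on the bagit module.
    """
    return os.path.isfile(os.path.join(bag_dir, "bagit.txt"))


def check_nested_bag(bag_dir, allow_nested_bag=False):
    """Hypothetical helper mirroring the new guard in make_bag.

    Raises RuntimeError when bag_dir is already a bag and the caller
    has not explicitly opted in to nesting.
    """
    if is_bag(bag_dir) and not allow_nested_bag:
        raise RuntimeError(
            "%s is already a bag; pass allow_nested_bag=True to nest it" % bag_dir
        )


# Usage: a plain directory passes the guard, an existing bag does not.
plain_dir = tempfile.mkdtemp()
check_nested_bag(plain_dir)  # no error: not a bag yet

with open(os.path.join(plain_dir, "bagit.txt"), "w") as handle:
    handle.write("BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")

try:
    check_nested_bag(plain_dir)  # now refused by default
except RuntimeError as exc:
    print("refused:", exc)

check_nested_bag(plain_dir, allow_nested_bag=True)  # explicit opt-in is allowed
```

Defaulting the flag to False makes the refusal the safe default, while callers who rely on nested bags keep the old behavior by opting in explicitly.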

Tests

All tests pass with my changes. See the log for more information.

Output of test.py:

❯ python test.py
/home/thea/git/bagit-python/bagit.py:1451: DeprecationWarning: 'count' is passed as positional argument
  s = re.sub(r"%0D", "\r", s, re.IGNORECASE)
/home/thea/git/bagit-python/bagit.py:1452: DeprecationWarning: 'count' is passed as positional argument
  s = re.sub(r"%0A", "\n", s, re.IGNORECASE)
.........../home/thea/git/bagit-python/bagit.py:165: DeprecationWarning: The `checksum` argument for `make_bag` should be replaced with `checksums`
  warnings.warn(
...Disabling requested hash algorithm not-really-a-name: hashlib does not support it
An error occurred creating a bag in /tmp/tmp8450qsbp
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 260, in make_bag
    total_bytes, total_files = make_manifests(
                               ~~~~~~~~~~~~~~^
        "data", processes, algorithms=checksums, encoding=encoding
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/thea/git/bagit-python/bagit.py", line 1275, in make_manifests
    checksums = [manifest_line_generator(i) for i in _walk(data_dir)]
                 ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/home/thea/git/bagit-python/bagit.py", line 1418, in generate_manifest_lines
    hashers = get_hashers(algorithms)
  File "/home/thea/git/bagit-python/bagit.py", line 1136, in get_hashers
    raise ValueError(
    ...<3 lines>...
    )
ValueError: Unable to continue: hashlib does not support any of the requested algorithms!
.Bag directory /home/thea/git/bagit-python/this-directory-does-not-exist does not exist
.....The following files do not have read permissions:
('/tmp/tmpnompusg1/loc/2478433644_2839c5e8b8_o_d.jpg',)
An error occurred creating a bag in /tmp/tmpnompusg1
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 229, in make_bag
    raise BagError(
        _("Read permissions are required to calculate file fixities")
    )
bagit.BagError: Read permissions are required to calculate file fixities
.Unable to write to the following directories and files:
['/tmp/tmpgl1u4_go']
An error occurred creating a bag in /tmp/tmpgl1u4_go
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 213, in make_bag
    raise BagError(
        _("Missing permissions to move all files and directories"))
bagit.BagError: Missing permissions to move all files and directories
.The following directories do not have read permissions:
('/tmp/tmp4qzlyr7a/loc',)
An error occurred creating a bag in /tmp/tmp4qzlyr7a
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 229, in make_bag
    raise BagError(
        _("Read permissions are required to calculate file fixities")
    )
bagit.BagError: Read permissions are required to calculate file fixities
.Unable to write to the following directories and files:
['/tmp/tmp6t3bs2_m', '/tmp/tmp6t3bs2_m/loc']
An error occurred creating a bag in /tmp/tmp6t3bs2_m
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 213, in make_bag
    raise BagError(
        _("Missing permissions to move all files and directories"))
bagit.BagError: Missing permissions to move all files and directories
..........The following files do not have read permissions:
('/tmp/tmpcutz17p7/bag-info.txt',)
..........Creating bag for directory /tmp/tmp65leir02
Creating data directory
Moving si to /tmp/tmp65leir02/tmpjdlclbxc/si
Moving loc to /tmp/tmp65leir02/tmpjdlclbxc/loc
Moving README to /tmp/tmp65leir02/tmpjdlclbxc/README
Moving /tmp/tmp65leir02/tmpjdlclbxc to data
Using 1 processes to generate manifests: sha256, sha512
Generating manifest lines for file data/README
Generating manifest lines for file data/loc/2478433644_2839c5e8b8_o_d.jpg
Generating manifest lines for file data/loc/3314493806_6f1db86d66_o_d.jpg
Generating manifest lines for file data/si/2584174182_ffd5c24905_b_d.jpg
Generating manifest lines for file data/si/4011399822_65987a4806_b_d.jpg
Creating bagit.txt
Creating bag-info.txt
Creating /tmp/tmp65leir02/tagmanifest-sha256.txt
Creating /tmp/tmp65leir02/tagmanifest-sha512.txt
..............................bag-info.txt defines multiple Payload-Oxum values!
...data/README exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
...data/README sha256 validation failed: expected="9006a02daf291a3ce8eebbb094ed3d17fcb0177b8e8d3421fbb8a080a2be48bf" found="d54d79889e20997c4b265488131fb593580f1885b3a5d75df49fe7f6604b66d0"
data/README sha512 validation failed: expected="06f3dedbd5c7796b75a7d5021aaf54559e0679c27b37d355f65ea64e31fd29a70b6e06e5c0b73fad809c579fb0f6fb7076ceec055c17a173e49007955c9f5820" found="c758e703c015e05a7e0631cb4f15ed5397c318e8ad56e1227ad2ce974d00c33642ec413172414545102708cb326176935e30e41c1f72733c894c2fb031477145"
..tmpk6fiecpp/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmpk6fiecpp/tagfile exists in manifest but was not found on filesystem
.tmp79jtp40e/tagfolder/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp79jtp40e/tagfolder/tagfile exists in manifest but was not found on filesystem
.Unable to calculate file hashes for /tmp/tmprxq331w5
Traceback (most recent call last):
  File "/home/thea/git/bagit-python/bagit.py", line 916, in _validate_entries
    pool = multiprocessing.Pool(
        processes if processes else None, initializer=worker_init
    )
  File "/usr/lib/python3.13/unittest/mock.py", line 1169, in __call__
    return self._mock_call(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/unittest/mock.py", line 1173, in _mock_call
    return self._execute_mock_call(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/unittest/mock.py", line 1228, in _execute_mock_call
    raise effect
RuntimeError
.bag-info.txt exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
.data/loc/2478433644_2839c5e8b8_o_d.jpg md5 validation failed: expected="9a2b89e9940fea6ac3a0cc71b0a933a0" found="Could not read /tmp/tmprxwtxkyt/data/loc/2478433644_2839c5e8b8_o_d.jpg: [Errno 13] Permission denied: '/tmp/tmprxwtxkyt/data/loc/2478433644_2839c5e8b8_o_d.jpg'"
.bag-info.txt exists in manifest but was not found on filesystem
data/README exists in manifest but was not found on filesystem
data/extra exists on filesystem but is not in the manifest
.data/README md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="fd41543285d17e7c29cd953f5cf5b955"
................bag-info.txt defines multiple Payload-Oxum values!
...data/README exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
...data/README sha256 validation failed: expected="9006a02daf291a3ce8eebbb094ed3d17fcb0177b8e8d3421fbb8a080a2be48bf" found="d54d79889e20997c4b265488131fb593580f1885b3a5d75df49fe7f6604b66d0"
data/README sha512 validation failed: expected="06f3dedbd5c7796b75a7d5021aaf54559e0679c27b37d355f65ea64e31fd29a70b6e06e5c0b73fad809c579fb0f6fb7076ceec055c17a173e49007955c9f5820" found="c758e703c015e05a7e0631cb4f15ed5397c318e8ad56e1227ad2ce974d00c33642ec413172414545102708cb326176935e30e41c1f72733c894c2fb031477145"
..tmp9s2ei8kh/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp9s2ei8kh/tagfile exists in manifest but was not found on filesystem
.tmp5na6jn06/tagfolder/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp5na6jn06/tagfolder/tagfile exists in manifest but was not found on filesystem
.bag-info.txt exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
.data/loc/2478433644_2839c5e8b8_o_d.jpg md5 validation failed: expected="9a2b89e9940fea6ac3a0cc71b0a933a0" found="Could not read /tmp/tmpcmz8z7bq/data/loc/2478433644_2839c5e8b8_o_d.jpg: [Errno 13] Permission denied: '/tmp/tmpcmz8z7bq/data/loc/2478433644_2839c5e8b8_o_d.jpg'"
.bag-info.txt exists in manifest but was not found on filesystem
data/README exists in manifest but was not found on filesystem
data/extra exists on filesystem but is not in the manifest
.data/README md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="fd41543285d17e7c29cd953f5cf5b955"
.
----------------------------------------------------------------------
Ran 117 tests in 1.151s

OK
