Reimplementation of Recollect #148

Merged
merged 14 commits into from
Oct 15, 2024
@blakeo2 blakeo2 commented Aug 8, 2024

A previous pull request accidentally removed the changes made to recollect. I have patched this problem and reimplemented all of the original functionality. Here are some of the changes:

  • Legacy zip files (from molli_firstgen) can now be recollected correctly with a new function, recollect_legacy. It accepts either base geometries or conformer ensembles, and it issues a new warning when a requested recollection may behave unexpectedly.
  • Normal zip files can now be recollected into MoleculeLibrary and ConformerLibrary objects.
  • Directories can now be recollected into MoleculeLibrary and ConformerLibrary objects.
  • The encoding and decoding methods for collections now use the ml.dumps and ml.loads functions, so any file type can be written to directories through the openbabel interface.
  • A new iconv option controls whether items from directories or zip files are read in as a Molecule or a ConformerEnsemble. The output conversion is separate: for example, if a ConformerEnsemble is read in and the output is a MoleculeLibrary, the first Conformer of the ConformerEnsemble is written to the MoleculeLibrary.
  • Some backend read/write errors in ZipCollection and DirCollection have been fixed.
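
The input/output conversion described in the iconv bullet can be sketched as follows. This is a minimal illustration, not molli's implementation: the `Molecule` and `ConformerEnsemble` classes and the `convert_for_output` helper below are hypothetical stand-ins defined only for this example, and turning a lone molecule into a one-conformer ensemble is an assumption about how the reverse direction behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Hypothetical stand-in for a single geometry (not molli's class)."""
    name: str
    coords: list


@dataclass
class ConformerEnsemble:
    """Hypothetical stand-in for a set of conformers of one structure."""
    name: str
    conformers: list = field(default_factory=list)


def convert_for_output(obj, output_type: str):
    """Adapt an object read with one input type to the requested output type."""
    if output_type == "molecule" and isinstance(obj, ConformerEnsemble):
        # As described in the PR text: keep only the first conformer.
        return Molecule(obj.name, obj.conformers[0])
    if output_type == "ensemble" and isinstance(obj, Molecule):
        # Assumption: a lone molecule becomes a one-conformer ensemble.
        return ConformerEnsemble(obj.name, [obj.coords])
    return obj  # Types already match; nothing to convert.


ens = ConformerEnsemble("ligand", [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]])
mol = convert_for_output(ens, "molecule")
print(mol.coords)
```

Keeping the conversion in one small function means the reading step does not need to know which library type is being written.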

I have attached a selection of the commands from the tests I ran. On request, I can also include the folders and directories used in these tests. All tests appeared to behave as expected.

test_commands.txt

This pull request addresses issues #107, #123, and #136.

blakeo2 and others added 10 commits July 15, 2024 16:08
-Fixed to use variables `input_type` and `output_type`
-This required the creation of a separate `recollect_legacy` function, which follows the same format as the normal `recollect` function
-This has been updated to use the `ml.loads` function, allowing any file type to be read. For testing, I successfully used zip files containing mol2, xyz, pdb, and sdf files
-If non-native molli file formats are used with the molli parser, the code now errors out with a message indicating that openbabel should be used.
… various objects:

- Added `-iconv` option to indicate whether files in directories and zip files should be read in as a `ConformerEnsemble` or `Molecule`
- Changed `converter` to `output_conv`
- Added extra cases to `output_conv` for input that was read in as a `Molecule` but is written to a `ConformerLibrary`, and for input that was read in as a `ConformerEnsemble` but is written to a `MoleculeLibrary`
- Added `LegacyMolliWarning` to account for unexpected behavior when recollecting an ensemble to a molecule or a molecule to an ensemble with `recollect_legacy`
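
The parser behavior mentioned in the commit notes above (erroring out and pointing to openbabel when the molli parser is given a non-native format) might look roughly like this. It is a sketch under stated assumptions, not the code from this PR: `check_parser` is a hypothetical helper, and the contents of `NATIVE_FORMATS` are illustrative only.

```python
# Assumption: this set of "native" formats is illustrative, not molli's actual list.
NATIVE_FORMATS = {"mol2", "xyz"}


def check_parser(fmt: str, parser: str = "molli") -> str:
    """Return the parser to use, or raise if the native parser cannot read fmt."""
    fmt = fmt.lower().lstrip(".")
    if parser == "molli" and fmt not in NATIVE_FORMATS:
        raise ValueError(
            f"Format {fmt!r} is not supported by the native parser; "
            "use the openbabel parser instead."
        )
    return parser


print(check_parser("xyz"))
try:
    check_parser("pdb")
except ValueError as err:
    print(err)
print(check_parser("pdb", parser="openbabel"))
```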
@blakeo2 blakeo2 requested a review from esalx August 8, 2024 23:52
@esalx esalx left a comment

Approved, great work @blakeo2.

@@ -53,7 +53,7 @@ def __init__(
     ) -> None:
         self._path = Path(path)

         if not self._path.is_file() and readonly:

This may need to be addressed in the future. I can definitely see how extension-less files and directories can conflict here, but let's hope for the best.
Non-issue for now.
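
The concern in this comment can be illustrated with the standard library alone. The sketch below is not molli's code, and the file and directory names are made up. It shows that `Path.is_file()` returns `False` both for a directory and for a path that does not exist, so an `is_file()` check on its own cannot tell those two cases apart, and an extension-less name gives no hint about which kind of path was intended.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive = root / "library"      # extension-less file
    archive.write_bytes(b"")
    folder = root / "collection"    # directory with a similarly bare name
    folder.mkdir()
    missing = root / "absent"       # path that does not exist

    # Record (is_file, is_dir) for each path while the temp directory exists.
    checks = {p.name: (p.is_file(), p.is_dir()) for p in (archive, folder, missing)}

print(checks)
```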

@esalx esalx merged commit 037849d into main Oct 15, 2024
12 checks passed
@esalx esalx deleted the recollect-fix branch October 15, 2024 03:41
@esalx esalx restored the recollect-fix branch October 15, 2024 03:41