Skip to content

Conversation

siddharth7113
Copy link
Collaborator

@siddharth7113 siddharth7113 commented Sep 22, 2025

The PR makes the following changes :

  1. Renames all the files according to PEP8.
  2. Ran ruff changes on all the files.
  3. Refactored Complex.py into complex.py and chain.py .
  4. Added docstrings to chain.py.
    Note : For the purpose of this PR, only chain.py needs to be reviewed, rest all of the files still need to be refactored and docstrings to be addded.

@siddharth7113 siddharth7113 changed the title [WIP] Updating Complex.py into different classes and files. [WIP] Files renaming and refactor of Complex into Complex and Chain Sep 22, 2025
@siddharth7113
Copy link
Collaborator Author

@fkiraly I wanted to just open a PR for getting review for chain/py but with ruff changes and files renaming this PR has already become complicated, I am guessing best would be to review only chain.py since rest of it is basically pre-commit triggered changes?

@siddharth7113 siddharth7113 self-assigned this Sep 22, 2025
@siddharth7113 siddharth7113 marked this pull request as ready for review September 22, 2025 15:22
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR performs a comprehensive refactoring of the MAWS codebase, moving from flat module structure to a proper package layout with modernized Python typing. The changes include:

  • Converting from old-style to modern Python typing (e.g., List[str]list[str])
  • Renaming files from PascalCase to snake_case and moving them into a maws/ package
  • Splitting the monolithic Complex.py into separate complex.py and chain.py modules
  • Refactoring code style and imports

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
pyproject.toml Removes author information temporarily
maws/structure.py Modernizes typing and code style
maws/space.py New file with geometric sampling spaces
maws/routines.py Statistical calculation functions
maws/rna_structure.py RNA residue templates with modern typing
maws/prepare.py AMBER library preparation utilities
maws/maws2023.py Main MAWS algorithm with updated imports
maws/kernels.py Numba-optimized computational kernels
maws/helpers.py Unit conversion and geometry utilities
maws/dna_structure.py DNA residue templates with modern typing
maws/debug.ipynb New Jupyter notebook for debugging
maws/complex.py Complex class for managing molecular systems
maws/chain.py Chain class for polymer chains
maws/tools.py External tool execution utilities

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

maws/complex.py:1

  • The condition checking element[2] is None and element[2] == 0 after the loop that modifies element[i] creates unreachable code. The element[2] is None check should occur before the loop that normalizes negative indices.
# maws/complex.py

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@fkiraly
Copy link
Collaborator

fkiraly commented Sep 24, 2025

note from earlier today: cany ou kindly add a PR description with a request what to review?

@siddharth7113
Copy link
Collaborator Author

Sorry , adding right now, I keep forgetting

note from earlier today: cany ou kindly add a PR description with a request what to review?

@siddharth7113 siddharth7113 changed the title [WIP] Files renaming and refactor of Complex into Complex and Chain Files renaming and refactor of Complex into Complex and Chain and renaming of files. Sep 24, 2025
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

I think this is quite hard for me to review, given that I am not certain where or how to use these classes.

Can you please ensure that you have a full docstring, and usage examples of the new Complex class at least?

@siddharth7113
Copy link
Collaborator Author

@fkiraly I have added docstrings for complex.py, undoubtedly this might be the most important file in maws algorithm, it took me a lot of time to go through it, and I even took AI help to write docstring, I think I understand 75% of what they are doing, to ensure I am not doing something wrong, I have verified the new code with the old version of MAWS, hopefully everything should be alright, I am soon going to add tests after this PR, as it is too many lines.

I would recommend only going through structure.py, chain.py and complex.py in this PR, as these 3 files are intertwined and used in maws2023.py for making aptamers.

OpenMM integrator (defaults to Langevin).
simulation : app.Simulation | None
OpenMM Simulation wrapper with context.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a full review, I am missing an example how this class gets used, or generated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working on adding those, The implementration of Complex class is here , since one of these methods depend on make_lib from prepare.py which depends on ambertools, and I am getting Path errors which I am trying to resolve ,once that happens it would be easier to understand the method.

* each chain's LEaP ``init_string`` (``loadoff``/``loadamberparams``),
* each chain's *canonical* sequence (after alias translation).
- External tools must be discoverable on ``PATH``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguments of __init__ should be described in a Parameters section

LEaP ``source`` line for the aptamer FF (e.g., RNA.OL3/DNA.OL21).
force_field_ligand : str
LEaP ``source`` line for the ligand/protein FF (e.g., ff19SB or gaff2).
structure : None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with what?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understand the question correctly but these leap commands are passed to tleap via prepare.py make_lib functions which further run tools.py commands.

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, I think the code is currently much clearer!
I have no major reservations with respect to the structure - it also looks quite close to the original high-level architecture.

Some questions:

  • do I understand correctly that Complex is using a builder pattern?
  • what are the main differences to the previous file, Complex.py?
  • is it correct that: Complex contains Chains (and gets more through builder pattern), and Structure is what complex is built from?
  • architecturally, where are the interfaces to files in the entire set of classes?
  • how do the rotate_ methods in Complex and Chain relate? If Chains are part of Complex why does rotate_ not also rotate them? Or does it?

Requests:

  • usage examples would help a lot with the review, i.e., intended usage. Is there currently already such code? I assume you must have been using something like that for testing.
  • in the methods, could you at the top always say what attribute they exactly modify in self? E.g., add_chain modifies self.chains.
  • not blocking, but maybe helpful: would you be able to do the rename of Complex to complex etc already in main, and have this PR only change the contents? It would make review easier, as the diffs are easier to obtain.

@siddharth7113
Copy link
Collaborator Author

Excellent, I think the code is currently much clearer! I have no major reservations with respect to the structure - it also looks quite close to the original high-level architecture.

Some questions:

  • do I understand correctly that Complex is using a builder pattern?
  • what are the main differences to the previous file, Complex.py?
  • is it correct that: Complex contains Chains (and gets more through builder pattern), and Structure is what complex is built from?
  • architecturally, where are the interfaces to files in the entire set of classes?
  • how do the rotate_ methods in Complex and Chain relate? If Chains are part of Complex why does rotate_ not also rotate them? Or does it?

Requests:

  • usage examples would help a lot with the review, i.e., intended usage. Is there currently already such code? I assume you must have been using something like that for testing.
  • in the methods, could you at the top always say what attribute they exactly modify in self? E.g., add_chain modifies self.chains.
  • not blocking, but maybe helpful: would you be able to do the rename of Complex to complex etc already in main, and have this PR only change the contents? It would make review easier, as the diffs are easier to obtain.
  • I am not very familiar with builder pattern, I'll read up and try to write up a detailed reply, but after cursory reading it feels like the current code, is indeed using build pattern.

  • The difference between the original Complex.py and current are majorly the following :

    • Breaking Chain and Complex classes into separate scripts to understand better.
    • Caching mechanism (This was something I did as initial experiment, to reduce tleap calls, but I doubt if it helpful in terms of speed, but it does reduce file clutter during final run) , I'll be honest , while originally refactoring I was looking ways to reduce file clutter and I came up with it with the help of AI 😬, hence the scepticism.
  • Yes, it is correct Complex contains the list of Chains and get more through builder pattern but I do not understand the part of " Structure is what complex is built from?", It would be a different class but it uses all the information that would be part of Structure class to perform geometrical operations in Complex.

  • I am not sure if I understand the question, but these classes are being interfaced in maws2023.py (I have a suspicion, the question was about something else, please correct me here).

  • I am little bit confused about that part, I'll look again an get back to you.

About requests

  • I think you are right, let me make a branch with renaming changes first, merge into main and then this would be easier for you to review.

  • For testing purpose, I have a terrible mechanism where I simply run the maws2023.py algorithm and produce the output for a small number of inputs, I have to write test cases but I'll start adding usage examples,.

@siddharth7113
Copy link
Collaborator Author

@fkiraly I hope this would be easier to review better than before, especially the renaming files are sorted, working on rest of requested changes.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 2, 2025

I do not understand the part of " Structure is what complex is built from?",

I meant, is Complex a composite containing Structure? Or is it interacting with it, instead?

architecturally, where are the interfaces to files in the entire set of classes?

I am not sure if I understand the question, but these classes are being interfaced in maws2023.py (I have a suspicion, the question was about something else, please correct me here).

I meant, where are files being interfaced? In which functions, classes, methods precisely?

I have a terrible mechanism where I simply run the maws2023.py algorithm and produce the output for a small number of inputs, I have to write test cases but I'll start adding usage examples,.

Ok, that is your test workflow then and primary vignette, just to confirm?

@siddharth7113 siddharth7113 marked this pull request as draft October 10, 2025 05:53
@siddharth7113 siddharth7113 marked this pull request as ready for review October 16, 2025 07:37
@siddharth7113
Copy link
Collaborator Author

I do not understand the part of " Structure is what complex is built from?",

I meant, is Complex a composite containing Structure? Or is it interacting with it, instead?

architecturally, where are the interfaces to files in the entire set of classes?

I am not sure if I understand the question, but these classes are being interfaced in maws2023.py (I have a suspicion, the question was about something else, please correct me here).

I meant, where are files being interfaced? In which functions, classes, methods precisely?

I have a terrible mechanism where I simply run the maws2023.py algorithm and produce the output for a small number of inputs, I have to write test cases but I'll start adding usage examples,.

Ok, that is your test workflow then and primary vignette, just to confirm?

  1. complex is a composite of chain objects; each chain holds a reference to a structure. So complex doesn’t contain structure directly,it interacts with structure via chain.

2 . maws.prepare.make_lib(...) (called by Complex.add_chain_from_pdb) to create/parameterize a ligand template next to the input PDB, and Complex.build(...) to run LEaP and move/read the resulting AMBER artifacts via a content-addressed cache (.maws_cache/). Everything else (chains, geometry, energies/MD) is in memory.

  1. Precisely here:
  • maws.prepare.make_lib(pdb_path, pdb_name, force_field_aptamer, force_field_ligand, parameterized)

    • Writes: <pdb_name>.lib (and <pdb_name>.frcmod if needed) next to the input PDB.
    • Also produces transient inputs for AmberTools (e.g., .mol2, a small s.in, etc.).
    • Triggered by: Complex.add_chain_from_pdb(...).
  • Complex.build(target_path="", file_name="out")

    • Writes: <target_path>/<file_name>.in (LEaP input).
    • External call: runs tleap (via maws.tools.run(find_exe("tleap"), ...)).
    • Expects then moves: <file_name>.prmtop/.inpcrd.maws_cache/<sha1>.prmtop/.inpcrd (content-addressed key from _build_cache_key).
    • Reads: cached .prmtop/.inpcrd into OpenMM (AmberPrmtopFile, AmberInpcrdFile).
  • Examples / legacy script

  • examples/complex_aptamer.py and earlier maws2023.py write PDB snapshots with openmm.app.PDBFile.writeModel(...).

  • maws2023.py also writes simple log files (*_output.log, *_entropy.log, etc.).

  • Everything else (Structure, Chain, geometry ops like rotate_*/translate_*, energy/minimization/MD) does no file I/O; they operate on Complex.positions in memory.

  1. I have added a new directory called examples/ to help with understanding of classes and algorithm along with acting as a temproary test harness, once I complete refactoring, I would be much clear on how and what tests to write for entire algorithm.

@siddharth7113
Copy link
Collaborator Author

Also, I have added docstrings in helpers.py which would help in understanding of some functions especially related to energy and simulations in complex.py.

Also at this point the architectural design of complex class could be made better such as we could somehow differentiate energy/molecular dynamics functions somewhere else and structural/rotational functions else, but I don't have any clear vision on how to do that, would need your help here.

@fkiraly fkiraly self-requested a review October 17, 2025 07:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants