Parse secondary structure annotations from mmCIF #65
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
##           master     #65    +/-   ##
==========================================
- Coverage   94.99%  94.95%  -0.04%
==========================================
  Files          14      14
  Lines        1919    1944    +25
==========================================
+ Hits         1823    1846    +23
- Misses         96      98     +2
jgreener64
left a comment
Looks good, just a few comments.
I will also give you write access to the package.
# Secondary structure assignment
if haskey(mmcif_dict, "_struct_conf.conf_type_id")
    (run_dssp | run_stride) && @warn "Secondary structure assignment will be overwritten"
    for (i, id) in pairs(mmcif_dict["_struct_conf.conf_type_id"])
Might be worth benchmarking that this doesn't slow down mmCIF reading much.
With the following change (output of git diff ../src/mmcif.jl):
diff --git a/src/mmcif.jl b/src/mmcif.jl
index 4dc94d4..7e14271 100644
--- a/src/mmcif.jl
+++ b/src/mmcif.jl
@@ -347,8 +347,9 @@ function MolecularStructure(mmcif_dict::MMCIFDict;
end
# Secondary structure assignment
- if haskey(mmcif_dict, "_struct_conf.conf_type_id")
- (run_dssp | run_stride) && @warn "Secondary structure assignment will be overwritten"
+ if !(run_dssp | run_stride) && haskey(mmcif_dict, "_struct_conf.conf_type_id")
+ println("parsing secondary structure from mmCIF file")
+
for (i, id) in pairs(mmcif_dict["_struct_conf.conf_type_id"])
chainid = mmcif_dict["_struct_conf.beg_label_asym_id"][i]
mmcif_dict["_struct_conf.end_label_asym_id"][i] == chainid || continue # mismatch in chain id
@@ -369,12 +370,12 @@ function MolecularStructure(mmcif_dict::MMCIFDict;
if run_dssp && run_stride
throw(ArgumentError("run_dssp and run_stride cannot both be true"))
end
- if run_dssp
- rundssp!(struc)
- end
- if run_stride
- runstride!(struc)
- end
+ # if run_dssp
+ # rundssp!(struc)
+ # end
+ # if run_stride
+ # runstride!(struc)
+ # end
return struc
end
I get this:
julia> @time read(cif_path, MMCIFFormat; run_dssp=false)
parsing secondary structure from mmCIF file
0.060777 seconds (780.75 k allocations: 50.053 MiB)
MolecularStructure 1BQ0.cif with 20 models, 1 chains (A), 77 residues, 1244 atoms
julia> @time read(cif_path, MMCIFFormat; run_dssp=true)
0.058849 seconds (780.61 k allocations: 50.030 MiB)
MolecularStructure 1BQ0.cif with 20 models, 1 chains (A), 77 residues, 1244 atoms
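The two timings above differ by roughly 2 ms (0.0608 s vs 0.0588 s), which is small enough that a single `@time` call may not separate it from run-to-run noise. A minimal sketch of a more repeatable comparison follows, assuming a hypothetical helper `min_elapsed` (not part of BioStructures) that warms up once and then reports the fastest of several runs. The `workload` function is only a stand-in so the sketch runs without the package or a structure file.

```julia
# Hypothetical helper: run `f` once so compilation is excluded,
# then return the fastest of `n` timed runs, in seconds.
function min_elapsed(f; n::Integer=5)
    f()  # warm-up call, not timed
    return minimum(@elapsed(f()) for _ in 1:n)
end

# Stand-in workload so the sketch is self-contained.
workload() = sum(sqrt, 1:100_000)

t = min_elapsed(workload)
println("fastest of 5 runs: ", t, " s")
```

With the reader from this thread, the call would presumably look like `min_elapsed(() -> read(cif_path, MMCIFFormat; run_dssp=false))`. A dedicated package such as BenchmarkTools.jl would be another option.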
Looks good, thanks.
While we can get this info from running DSSP, it seems reasonable to extract this information from the file when it is present.
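As a rough illustration of what extracting the annotations involves, here is a minimal sketch that walks the `_struct_conf` columns the same way the loop in this PR does, skipping entries whose start and end chains differ. It assumes a plain `Dict` of string columns as a mock of the parsed file, and it assumes the residue range comes from `_struct_conf.beg_label_seq_id` and `_struct_conf.end_label_seq_id`. The `ConfSegment` type and `conf_segments` function are hypothetical names for illustration, not the PR's code.

```julia
# Hypothetical record for one annotated segment (illustration only).
struct ConfSegment
    chain::String
    start::Int
    stop::Int
    conf_type::String
end

# Mock of a parsed mmCIF dictionary: each key maps to one column of strings.
mock_dict = Dict(
    "_struct_conf.conf_type_id"      => ["HELX_P", "HELX_P", "TURN_P"],
    "_struct_conf.beg_label_asym_id" => ["A", "A", "B"],
    "_struct_conf.end_label_asym_id" => ["A", "B", "B"],
    "_struct_conf.beg_label_seq_id"  => ["3", "20", "5"],
    "_struct_conf.end_label_seq_id"  => ["15", "31", "8"],
)

function conf_segments(d::AbstractDict)
    segments = ConfSegment[]
    # Files without the category simply yield no segments.
    haskey(d, "_struct_conf.conf_type_id") || return segments
    for (i, id) in pairs(d["_struct_conf.conf_type_id"])
        chainid = d["_struct_conf.beg_label_asym_id"][i]
        # Skip annotations that start and end in different chains.
        d["_struct_conf.end_label_asym_id"][i] == chainid || continue
        start = parse(Int, d["_struct_conf.beg_label_seq_id"][i])
        stop = parse(Int, d["_struct_conf.end_label_seq_id"][i])
        push!(segments, ConfSegment(chainid, start, stop, id))
    end
    return segments
end

segments = conf_segments(mock_dict)
```

In the mock data the second row spans chains A and B, so only the first and third rows are kept.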
One issue is that the annotations in the file are more detailed than the one-letter codes. I may not have handled this correctly, and it may be worth considering an alternative approach (e.g., an @enum?). It might also be nice to include the numeric code, e.g., for TM6. (Presumably there will always be a switch in categorization before the next structure of the same type, so this is not as critical.)
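To make the `@enum` idea concrete, here is a minimal sketch of one possible shape. Everything in it is an assumption for discussion: the `SSClass` enum, the `ss_class` and `ss_letter` functions, and the choice to collapse every other `HELX*` type to a plain helix are hypothetical, and the one-letter codes follow the common DSSP convention rather than anything confirmed in this PR.

```julia
# Hypothetical coarse secondary structure classes (not part of BioStructures).
@enum SSClass ss_coil ss_helix ss_helix_310 ss_helix_pi ss_strand ss_turn ss_bend

# Map an mmCIF conformation type ID to a coarse class.
# Assumption: specific helix types are checked first, and any other
# "HELX..." ID falls back to a generic helix, which loses detail.
function ss_class(conf_type_id::AbstractString)
    conf_type_id == "HELX_RH_3T_P" && return ss_helix_310
    conf_type_id == "HELX_RH_PI_P" && return ss_helix_pi
    startswith(conf_type_id, "HELX") && return ss_helix
    startswith(conf_type_id, "TURN") && return ss_turn
    startswith(conf_type_id, "STRN") && return ss_strand
    conf_type_id == "BEND" && return ss_bend
    return ss_coil
end

# DSSP-style one-letter codes; using '-' for coil is an assumption.
const SS_LETTERS = Dict(
    ss_coil => '-', ss_helix => 'H', ss_helix_310 => 'G', ss_helix_pi => 'I',
    ss_strand => 'E', ss_turn => 'T', ss_bend => 'S',
)

ss_letter(c::SSClass) = SS_LETTERS[c]

println(ss_letter(ss_class("HELX_RH_3T_P")))
```

Keeping the enum separate from the letter lookup would let the detailed class be stored while the existing one-letter interface stays available.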