Algorithms for Identifying the Undesignability of RNA Secondary Structures
This project explores the (un)designability of RNA secondary structures in the context of RNA design, building on the research presented in the papers by T. Zhou et al. [1][2][3].
[1] Zhou, T., Tang, W.Y., Mathews, D.H. and Huang, L.
"Undesignable RNA Structure Identification via Rival Structure Generation and Structure Decomposition."
(RECOMB 2024, arXiv preprint)
[2] Zhou, T., Tang, W.Y., Apoorv, M., Mathews, D.H. and Huang, L.
"Scalable and Interpretable Identification of Minimal Undesignable RNA Structure Motifs with Rotational Invariance."
(RECOMB 2025, arXiv preprint)
[3] Zhou, T., Mathews, D.H. and Huang, L.
"Probabilistic RNA Designability via Interpretable Ensemble Approximation and Dynamic Decomposition."
Eterna100
ArchiveII
Motifs of length up to 14 (excluding motifs with external loops for comparison with CountingDesign):
Detailed results for undesignable motifs of length up to 14:
- GCC 4.8.5 or above
# Tested on CentOS
$ make main
$ make main_nosh # turn off special hairpins
$ make lineardecompose # Probabilistic RNA Designability Quantification
# Tested on Apple Silicon
$ make main_mac
$ make main_nosh_mac # turn off special hairpins
$ export OMP_NUM_THREADS=8 # parallel computing enabled by OpenMP
$ export PATH_UNDESIGNABLE_LIB=path/to/lib_undesignable.txt
$ export PATH_DESIGNABLE_LIB=/path/to/motifs/libs/lib_designable.txt
Libraries for designable and undesignable motifs are available at: https://drive.google.com/drive/u/0/folders/1lMBWVEvUAVI0YHV1BvqipHXuGQ11EphO
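Before running the tools, it can help to confirm that the settings above are in place. The following is a minimal sketch, not part of this repository: the variable names come from the export lines above, while the function name `missing_settings` and the checks themselves are illustrative assumptions (in particular, it is an assumption that both library files must exist for every run).

```python
import os

# Variable names are taken from the export lines above.
LIB_VARS = ("PATH_UNDESIGNABLE_LIB", "PATH_DESIGNABLE_LIB")


def missing_settings(env=None):
    """Return a list of human-readable problems with the environment settings.

    Hypothetical helper: checks that each library variable is set and
    points to an existing file, and that OMP_NUM_THREADS (if set) is a
    positive integer.
    """
    env = os.environ if env is None else env
    problems = []
    for name in LIB_VARS:
        path = env.get(name)
        if not path:
            problems.append(f"{name} is not set")
        elif not os.path.isfile(path):
            problems.append(f"{name} points to a missing file: {path}")
    threads = env.get("OMP_NUM_THREADS", "")
    if threads and not (threads.isdigit() and int(threads) > 0):
        problems.append(f"OMP_NUM_THREADS is not a positive integer: {threads}")
    return problems


if __name__ == "__main__":
    for problem in missing_settings():
        print(problem)
```

Running the script prints one line per problem and prints nothing when all settings look consistent.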
To replicate the experiment results in the paper:
1. ArchiveII
$ cat data/archiveii1144.txt | ./bin/lineardecompose minp | grep -E "Minimum" | tee results_archiveii1144.txt # prob. bounds will be saved to results_archiveii1144.txt
2. Eterna100
$ cat data/eterna100.txt | ./bin/lineardecompose minp | grep -E "Minimum" | tee results_eterna100.txt # prob. bounds will be saved to results_eterna100.txt
$ echo ".((......((......))......((......((......))......((......))......))......))....." | ./bin/main --alg fastmotif
$ echo "(.(*)...(..(*)))" | ./bin/main --alg motif # motif as a dot-bracket string where (*) is a boundary pair
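A motif string in this notation can be sanity-checked before it is piped to the tool. The sketch below is an illustration only and does not reflect how `./bin/main` parses its input: the function name `check_motif` is hypothetical, and it assumes that `*` appears only inside `(*)` tokens and can be ignored when checking bracket balance.

```python
def check_motif(motif):
    """Validate bracket balance and count literal (*) boundary-pair tokens.

    Assumption: '*' is a placeholder that does not affect bracket balance.
    Raises ValueError on unbalanced brackets or unexpected characters.
    """
    depth = 0
    for ch in motif:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')'")
        elif ch not in ".*":
            raise ValueError(f"unexpected character {ch!r}")
    if depth != 0:
        raise ValueError("unmatched '('")
    return motif.count("(*)")


if __name__ == "__main__":
    # The motif from the command above contains two (*) tokens.
    print(check_motif("(.(*)...(..(*)))"))
```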
$ ./bin/main --alg 1
......(.........((((.....)))).........)......................
................((((.....))))................................
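When two dot-bracket structures are compared, as in the pair of input lines above, it is often useful to see exactly which base pairs they do not share. The following is a small sketch for inspection only, not the algorithm behind `--alg 1`; the names `pair_set` and `compare_structures` are hypothetical, and the equal-length requirement is an assumption of this sketch.

```python
def pair_set(structure):
    """Return the set of (i, j) base pairs (0-indexed) in a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')'")
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unmatched '('")
    return pairs


def compare_structures(a, b):
    """Return (pairs only in a, pairs only in b), each as a sorted list."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    pa, pb = pair_set(a), pair_set(b)
    return sorted(pa - pb), sorted(pb - pa)


if __name__ == "__main__":
    only_a, only_b = compare_structures("((...))", ".(...).")
    print("only in first:", only_a)
    print("only in second:", only_b)
```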
$ ./bin/main --alg 2
AAAAUGAGCCCCACGAAAGGAGAGUGCUCACAAA
....((((((((.(....)).).).)))))....
....(((((((..(....)..).).)))))....
$ ./bin/main --alg 2c
UUAAGGGAAAAUCUUAGCCGAGAAAUCGGAUCCAAAGCGGCAUAAAAAAGAAAGCGCCGAAAUUCGCAGAAAUGCGAGAAAGGCAAGCAAAGAAUUCGGCAGAAAAAAUGCCGACCGGGCAAUGAAAAUUCGCCCGUGGAGCCAAGCGGG
((((((.....)))))(((((....)))).)((...(((((............(((((....((((((....))))))...)))..)).......((((((.......))))))(((((((((....))).))))).)..)))..)))))
((((((.....))))).((((....))))..((...(((((............(((((....((((((....))))))...)))..)).......((((((.......))))))(((((((((....))).))))).)..)))..)))))
$ ./bin/main --alg 3
ACUAAAUGGUGAGCAGACCCAGUGGAAACACACGCAGCCGAAAGGUACCCAUCCGAGAGGAAGUCAGGCGAAAGCUAACGGAAAGAACGUAGACAGGGAGCGAGGGACAAAGACUGCAAGGGAAAGUACACAAGACAAAGUAAAAAAAGGUGAGGCAGGGGAAACCCCGGGAAACCGGUCGAAAGACGCCAGCAAACCGCAGAAACAGCCACCCAGCGAGACAGACAAAAGCGGAUACGUAGUCGACGGAAACGUAGUCAGGGGAAACCCACGCAAUCGAAAGAUAGGGAGUCGGUGAAAACCAGAGAAAUCUACUCAAAAGAGGACAGGCAGCGGAACCCCUACACCGAAAAAA
.......((((.((((.(((.(((....))).(((.(((....))).(((.(((....))).(((.(((....))).(((.......))).))).))).))).))).(...).)))).((((...((.(((...((...))........))).(((.(((....)))(((....)))(((....)))))).))...((((.(...).(((.(((.(((.(((.(((....(((....))).))).(((....))).))).(((....))).))).(((....))).))).((((((....)))(((....))).(((....)))))).))).))))...)))).)))).......
$ ./bin/main --alg eval
CACACGCACUACAAAAUGUCCAAAGGAAAAGGCACCACCAGCAAAGCACCAAAGGUAAGGGGAAAAG
.....((.((.((...)).((...))...)))).((.((.((...)).((...))...)))).....
(output)total energy: -2.30
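The evaluation input above is a sequence followed by a structure. A quick pre-check that the two lines have the same length, and that every bracketed pair joins canonical bases, can catch copy-paste mistakes early. This sketch does not compute any energy, and whether `--alg eval` itself requires canonical pairs is an assumption; the name `incompatible_pairs` is hypothetical.

```python
# Watson-Crick and wobble pairs, in both orientations.
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def incompatible_pairs(seq, structure):
    """Return the (i, j) pairs (0-indexed) whose bases are not canonical."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    stack, bad = [], []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')'")
            i = stack.pop()
            if (seq[i], seq[j]) not in CANONICAL:
                bad.append((i, j))
    if stack:
        raise ValueError("unmatched '('")
    return bad


if __name__ == "__main__":
    print(incompatible_pairs("GGGAAACCC", "(((...)))"))
```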
$ ./bin/main --alg dp
..((((((((.......(.((((((....)))))).(((((((....))))))).).......))))))))..
..((((((((.........((((((....)))))).(((((((....))))))).........))))))))..
$ ./bin/main --alg ed
GGGAGACCCAAAAAAAAGGGCAACUGCAAAAAGGAGACAGCACCCCGAAAAAAGACUGGAAAAAGGGCGAAAAGCUCGAAAAACACGACCAACGGAAAACAGGACGAAAGAGAACAAGCAAGCCAAAGGGAAACAGACUAAAAACGCGAAAGCGACUGCAAAGGGGGAGAAAAAGCGACCCUGAACGAAAAAGGGGCGAAAAAUUGGAACAAAAAAAGGAGGGGGGAAAGGAAAGUCAAAGACACUCGAAACGAGUGAGCGGGCAAAAAAAAAAACGGGGGAUGAAUAACGGACGGAAACGCGGCGGAAAGCGAAAAAAAGAAAAACGUCGUACGGACUACUGGGGUGCAAAAAAAAGGAGGGGCGCAAAAAGGAAAAAACAGGGUCCACUA
((..(((((........(.((..((.(.....(....).((((((((.........((......((((.....))))......))...((..((.......(..(......)..)..((..((.....(....).((((.....(((.....(..((.(...).))..).....))).((((...(.......(..(.((...)).)..).......)...))))........)))).....((((((...)))))).))..))...........))..)).........(..((....)((((((.....(........).....))))))..)..)...))))))))........).))..)).).....(.......).)))))..)).
((..(((((........(.((..((.(.....(....).((((((((.........((......((((.....))))......))...((..((.......(..(......)..)..((..((............((((.....(((.....(..((.(...).))..).....))).((((...(.......(..(.((...)).)..).......)...))))........)))).....((((((...)))))).))..))...........))..)).........(..((.....((((((.....(........).....)))))).))..)...))))))))........).))..)).).....(.......).)))))..)).
ref1 energy: -68.20
ref2 energy: -72.70
GGGAGACCCAAAAAAAAGGGCAACUGCAAAAAGGAGACAGCACCCCGAAAAAAGACUGGAAAAAGGGCGAAAAGCUCGAAAAACACGACCAACGGAAAACAGGACGAAAGAGAACAAGCAAGCCAAAGGGAAACAGACUAAAAACGCGAAAGCGACUGCAAAGGGGGAGAAAAAGCGACCCUGAACGAAAAAGGGGCGAAAAAUUGGAACAAAAAAAGGAGGGGGGAAAGGAAAGUCAAAGACACUCGAAACGAGUGAGCGGGCAAAAAAAAAAACGGGGGAUGAAUAACGGACGGAAACGCGGCGGAAAGCGAAAAAAAGAAAAACGUCGUACGGACUACUGGGGUGCAAAAAAAAGGAGGGGCGCAAAAAGGAAAAAACAGGGUCCACUA
ref1 energy = 7.20, ref2 energy = 2.70
pass test: true
e1 - e2: 4.50
delta : 4.50
or
$ ./bin/main --alg ed < data/seq_refs.txt # batched input
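In the `--alg ed` output shown above, the reported `e1 - e2` value equals the difference of the two energies on the `ref1 energy = ..., ref2 energy = ...` line (7.20 - 2.70 = 4.50). When processing batched runs, that line can be parsed and the gap recomputed as a consistency check. This is a sketch that assumes the output line has exactly the format shown above; the helper name `energy_gap` is hypothetical.

```python
import re

# Matches a line such as: "ref1 energy = 7.20, ref2 energy = 2.70"
ENERGY_LINE = re.compile(
    r"ref1 energy = (-?\d+(?:\.\d+)?), ref2 energy = (-?\d+(?:\.\d+)?)"
)


def energy_gap(output):
    """Return (e1, e2, e1 - e2) parsed from the first matching output line."""
    match = ENERGY_LINE.search(output)
    if match is None:
        raise ValueError("no 'ref1 energy = ..., ref2 energy = ...' line found")
    e1, e2 = float(match.group(1)), float(match.group(2))
    return e1, e2, e1 - e2


if __name__ == "__main__":
    sample = "ref1 energy = 7.20, ref2 energy = 2.70\npass test: true\n"
    e1, e2, gap = energy_gap(sample)
    print(f"e1 - e2: {gap:.2f}")
```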

