SamplingDesign: RNA Design via Continuous Optimization with Coupled Variables and Monte-Carlo Sampling

This repository contains the source code for the SamplingDesign project. The Python code for generating the figures in the paper is available in ./figures/.

Wei Yu Tang, Ning Dai, Tianshuo Zhou, David H. Mathews, and Liang Huang*

* corresponding author

For questions, please contact the corresponding author at liang.huang.sh@gmail.com.

To Compile

Compiler version: g++ (Spack GCC) 8.3.0

make

Python Dependencies

python (3.8.20), numpy (1.24.4), matplotlib (3.7.5), viennarna (2.6.4)

pip install -r requirements.txt

Example Script

Run SamplingDesign for the shortest five structures (up to 30 nucleotides) in Eterna100 with 200 steps (takes ~30 seconds).

./run.sh example ./data/example.txt

The results will be saved in ./results/example/. The script then parses the result file to generate learning curves in ./graphs/example/ and outputs the best solution (based on each metric) into ./analysis/example/.

Run all structures in a dataset

Reproduce the results in the paper (run both uniform and $\epsilon$-targeted initializations). Note: On our server with 64 cores, the Eterna100 dataset took approximately 10 days to complete.

Command

./run_all.sh ./data/eterna100_v1.txt
python merge.py eterna100_v1 eterna100_uniform eterna100_targeted # summarize results

To run a different dataset, replace ./data/eterna100_v1.txt in the command above with

./data/eterna100_v2.txt to run the 19 modified structures in Eterna100-v2.
./data/rfam27.txt to run all the Rfam-Taneda-27 structures.
./data/rnasolo.txt to run all the RNAsolo-764 structures.

To run SamplingDesign

Command

echo "[target structure]" | ./main [args] > ./result.txt

Example

echo "(((...)))" | ./main --num_steps 50 --verbose > ./result.txt

Arguments

Objective functions: "prob" - Boltzmann probability, "ned" - normalized ensemble defect, "dist" - structural distance, "ddg" - free energy gap. (default: "prob")

--obj [prob/ned/dist/ddg]
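As an illustration of the "dist" objective, the sketch below computes a base-pair distance between two dot-bracket structures. This README only says "structural distance", so the specific definition used here (size of the symmetric difference of the two base-pair sets) is an assumption and may differ from what ./main computes. The helper names parse_pairs and bp_distance are hypothetical.

```python
def parse_pairs(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def bp_distance(a, b):
    """Number of base pairs present in exactly one of the two structures
    (assumed definition of structural distance)."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    return len(parse_pairs(a) ^ parse_pairs(b))


print(bp_distance("(((...)))", "((.....))"))  # 1
```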

Initializations: uniform or targeted (default: targeted)

--init [uniform/targeted]

eps: the value of $\epsilon$ for $\epsilon$-targeted initialization. Set $\epsilon$ = 1.0 to get the pure targeted initialization. (default: 0.75)

--eps [a value between 0 and 1]
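A minimal sketch of how an $\epsilon$-targeted initialization could be built. It assumes, and this README does not confirm, that the initial distribution is a mixture of weight $\epsilon$ on a targeted distribution and weight 1 - $\epsilon$ on the uniform one, that the targeted distribution puts all mass on A at unpaired positions, and that it splits mass evenly between CG and GC at paired positions. The function name and the dictionary layout are hypothetical, and the sketch covers only base pairs and single unpaired positions.

```python
NUCS = ["A", "C", "G", "U"]
PAIRS = ["CG", "GC", "AU", "UA", "GU", "UG"]


def eps_targeted_init(structure, eps=0.75):
    """Map each unpaired position (i,) and each base pair (i, j) to a
    distribution: eps * targeted + (1 - eps) * uniform (assumed form)."""
    dists, stack = {}, []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            i = stack.pop()
            # Paired positions are coupled: one joint distribution per pair.
            dists[(i, j)] = {
                p: eps * (0.5 if p in ("CG", "GC") else 0.0)
                + (1 - eps) / len(PAIRS)
                for p in PAIRS
            }
        else:
            dists[(j,)] = {
                n: eps * (1.0 if n == "A" else 0.0) + (1 - eps) / len(NUCS)
                for n in NUCS
            }
    return dists


init = eps_targeted_init("(((...)))", eps=0.75)
print(init[(3,)])    # distribution at an unpaired position
print(init[(0, 8)])  # joint distribution of the outermost pair
```

Under these assumptions, eps = 0 reduces to the uniform initialization and eps = 1 to the pure targeted one.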

projection: use the direct parameterization (projected gradient descent) instead of the softmax parameterization. (default: false)

--projection
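To make the difference between the two parameterizations concrete, the sketch below contrasts a softmax over unconstrained logits with a Euclidean projection onto the probability simplex, which is the step a projected gradient method applies after each update. The sort-based projection is a standard algorithm swapped in for illustration; whether ./main uses this exact routine is an assumption.

```python
import math


def softmax(logits):
    """Softmax parameterization: any real-valued logits map to a distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project_to_simplex(v):
    """Direct parameterization: Euclidean projection of v onto
    {w : w_i >= 0, sum(w) = 1}, using the sort-based algorithm."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - 1.0) / k
        if uk - t > 0:  # holds for a prefix of k values; keep the last one
            theta = t
    return [max(x - theta, 0.0) for x in v]


print(softmax([0.0, 0.0, 0.0, 0.0]))
print(project_to_simplex([1.2, 0.1, -0.3, 0.0]))
```

Softmax keeps every probability strictly positive, while the projection can set entries exactly to zero.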

no_adam: turn off the Adam optimizer (default: false)

--no_adam

beta_1, beta_2: the first- and second-moment decay rates for the Adam optimizer (default: 0.9, 0.999)

--beta_1 [value] --beta_2 [value]
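For reference, the role of beta_1 and beta_2 can be seen in a textbook Adam update, sketched below in plain Python. This is the standard algorithm written for a minimization problem, not code taken from this repository; the name adam_step and the small constant eps_hat are assumptions.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.01,
              beta_1=0.9, beta_2=0.999, eps_hat=1e-8):
    """One Adam update. m and v are running moment estimates (updated in
    place); t is the 1-based step count used for bias correction."""
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = beta_1 * m[i] + (1 - beta_1) * g        # first moment
        v[i] = beta_2 * v[i] + (1 - beta_2) * g * g    # second moment
        m_hat = m[i] / (1 - beta_1 ** t)               # bias correction
        v_hat = v[i] / (1 - beta_2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps_hat))
    return new_params


# Toy usage: minimize f(x) = x^2 starting from x = 1.
x, m, v = [1.0], [0.0], [0.0]
for t in range(1, 201):
    x = adam_step(x, [2 * x[0]], m, v, t)
print(x)
```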

nesterov: use Nesterov's accelerated gradient descent for projected gradient descent (default: false)

--nesterov

initial_lr: set initial learning rate (default: 0.01)

--initial_lr [value]

num_steps: set max number of steps (default: 2000)

--num_steps [value]

no_early_stop: turn off early stopping (default: false)

--no_early_stop

k_ma: the window size k of the moving average used for early stopping (default: 50)

--k_ma [value]
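This README does not spell out the early-stopping rule, so the following is only one plausible reading: stop once the average objective over the last k steps is no better than the average over the k steps before that. Both the rule and the name should_stop are assumptions, and the objective is assumed to be minimized.

```python
def should_stop(history, k=50):
    """history: objective value per step, lower is better (assumed).
    Returns True when the k-step moving average has stopped improving."""
    if len(history) < 2 * k:
        return False  # not enough steps to compare two windows
    recent = sum(history[-k:]) / k
    previous = sum(history[-2 * k:-k]) / k
    return recent >= previous


improving = [1.0 / (step + 1) for step in range(100)]
flat = [0.5] * 100
print(should_stop(improving), should_stop(flat))
```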

beamsize: set beamsize and sharpturn (default: 250)

--beamsize [value]

is_lazy: use lazyOutside when evaluating NED (default: False)

--is_lazy

sample_size: number of samples used per step (default: 2500)

--sample_size [value]

best_k: print out best k samples (in terms of objective function) at each step (default: 1)

--best_k [value]
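The interaction of sample_size and best_k can be sketched as follows: draw sample_size sequences from the current distribution, score each one, and report the best_k. The distribution layout (one entry per unpaired position or base pair), the helper names, and the toy objective are all hypothetical stand-ins; the real objective would be one of the --obj choices, evaluated by ./main.

```python
import random


def sample_sequence(length, dists, rng):
    """Draw one sequence; paired positions are sampled jointly."""
    seq = [None] * length
    for positions, dist in dists.items():
        choice = rng.choices(list(dist), weights=list(dist.values()))[0]
        for pos, nuc in zip(positions, choice):
            seq[pos] = nuc
    return "".join(seq)


def best_k_samples(dists, length, objective,
                   sample_size=2500, best_k=1, seed=0):
    """Return the best_k distinct samples, lowest objective first."""
    rng = random.Random(seed)
    samples = {sample_sequence(length, dists, rng) for _ in range(sample_size)}
    return sorted(samples, key=lambda s: (objective(s), s))[:best_k]


# Toy distribution for the structure "(...)": one pair plus three unpaired.
unpaired = {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1}
dists = {(0, 4): {"CG": 0.5, "GC": 0.5},
         (1,): unpaired, (2,): unpaired, (3,): unpaired}


def toy_objective(seq):
    """Placeholder score (lower is better): prefers sequences with more A."""
    return -seq.count("A")


print(best_k_samples(dists, 5, toy_objective, sample_size=200, best_k=2))
```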

verbose: print out the logit, the distribution and the gradient at each step (default: False)

--verbose

num_threads: maximum number of threads used by OpenMP; if 0, the default number is used (default: 0)

--num_threads [value]

boxplot: print out the objective of all samples at each step (for generating the boxplot in the learning curve) (default: False)

--boxplot
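The arguments above can also be assembled from Python. The sketch below builds the argument list for ./main and pipes the target structure to it on standard input, mirroring the echo command shown earlier. The wrapper names build_command and run_design are hypothetical, and only flags documented in this section are used.

```python
import subprocess


def build_command(obj="prob", init="targeted", eps=0.75,
                  verbose=False, binary="./main"):
    """Assemble the ./main argument list from documented flags."""
    if obj not in ("prob", "ned", "dist", "ddg"):
        raise ValueError("unknown objective: %s" % obj)
    if init not in ("uniform", "targeted"):
        raise ValueError("unknown initialization: %s" % init)
    cmd = [binary, "--obj", obj, "--init", init, "--eps", str(eps)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_design(structure, out_path="./result.txt", **kwargs):
    """Equivalent of: echo "[target structure]" | ./main [args] > out_path"""
    with open(out_path, "w") as out:
        subprocess.run(build_command(**kwargs), input=structure + "\n",
                       text=True, stdout=out, check=True)


print(" ".join(build_command(obj="ned", verbose=True)))
```

Calling run_design("(((...)))", obj="ned") requires a compiled ./main in the working directory.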

analysis.py

analysis.py performs two main tasks:

Re-evaluates the best solution at each step using the ViennaRNA package (version 2.6.4), and saves the best solution for each metric to ./analysis/{folder}/.
Generates a learning curve and saves it to ./graphs/{folder}/.

Example Command

python analysis.py --folder "example" --file "1.txt"

Input: parses the results from ./results/{folder}/{file}.

Output:

Best solutions are saved to ./analysis/{folder}/{file}.txt
The learning curve is saved as ./graphs/{folder}/{file}.pdf

Arguments

folder, file: parse the file at ./results/{folder}/{file}

--folder "example" --file "8.txt"

max_workers: set the max number of workers (threads) (default: None)

--max_workers [value]

merge.py

merge.py combines multiple result folders from ./analysis/ and summarizes them.

python merge.py <data file> <folder 1> [folder 2] ... [folder n]

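Note that merge.py takes bare names as arguments: the data file name without its directory or extension, followed by the folder names under ./analysis/. E.g.,

python merge.py eterna100_v1 eterna100_uniform eterna100_targeted

after running ./run_all.sh ./data/eterna100_v1.txt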
