This is a Python script I used to batch-process a FASTA file by calling sequence_locator from within Python:
from poplars.sequence_locator import *
from poplars.common import *
import sys
fasta = convert_fasta(open(sys.argv[1]))
virus = 'hiv'
base = 'NA'
configs = handle_args(virus, base)
ref_nt_seq, ref_aa_seq = configs[0][0][1], configs[1]
nt_coords = configs[2]
reference_sequence = configs[3]
nt_coords_handle = open(nt_coords, 'r')
ref_genome = Genome(virus, nt_coords_handle, ref_nt_seq, ref_aa_seq,
                    reference_sequence, base)
for h, s in fasta:
    query_seq = get_query(base, s, False)
    query = Query(base, ref_genome, query_sequence=query_seq)
    left, right = query.qcoords
    sys.stdout.write('{}\t{}\t{}\n'.format(h, left, right))

Some of this is unnecessarily complicated, such as setting up the Genome object. Ideally the workflow would look more like this:
import sys
from poplars import sequence_locator as locator
from poplars.common import convert_fasta

handle = open(sys.argv[1])
for h, s in convert_fasta(handle):
    result = locator(s, base='NT', virus='hiv')
    sys.stdout.write('{}\t{}\t{}\n'.format(h, result.left, result.right))