Working with third party pipelines

Frédéric Mahé edited this page Nov 27, 2022 · 3 revisions

Several pipelines have been created to deal with amplicon data: Mothur, QIIME, UPARSE. We will try to show here how to use swarm clusters with these pipelines.

Produce swarm results compatible with Mothur

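The following code block runs swarm with a range of d values (1 to 20) and converts swarm's output format into Mothur's list format: one line per d value, holding a label, the number of clusters, and then the clusters themselves.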

FASTA="amplicons.fasta"
SWARM=$(readlink -f ./swarm)
CLUSTERS=$(mktemp)

# Unique amplicons
clusters=$(grep -c "^>" "${FASTA}")
(echo -ne "unique\t${clusters}\t"
    grep "^>" "${FASTA}" | tr -d ">" | tr "\n" "\t"
    echo) > test.list

# Test 20 d values
for ((d=1 ; d<=20 ; d++)) ; do
    # First step
    "${SWARM}" -d "${d}" "${FASTA}" > "${CLUSTERS}"
    # Convert to Mothur format
    clusters=$(wc -l < "${CLUSTERS}")
    (echo -ne "d${d}\t${clusters}\t"
        tr "\n" "\t" < "${CLUSTERS}" | tr " " ","
        echo) >> test.list
done
# Remove the temporary file
rm "${CLUSTERS}"
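
The conversion step used in the loop can be checked on a toy file without running swarm. This is a minimal sketch, assuming swarm's default output (one cluster per line, amplicon names separated by spaces). The amplicon names and the variables `toy_clusters` and `mothur_line` are made up for illustration.

```shell
# Toy swarm output: one cluster per line, amplicon names separated by spaces
# (hypothetical amplicon names, for illustration only)
toy_clusters=$(mktemp)
printf 'a_10 b_2 c_1\nd_5\n' > "${toy_clusters}"

# Count clusters (strip the padding some wc implementations add)
n=$(wc -l < "${toy_clusters}" | tr -d ' ')

# Same conversion as in the loop: clusters are joined with tabs,
# amplicon names within a cluster are joined with commas
mothur_line=$(printf 'd1\t%s\t' "${n}"
    tr "\n" "\t" < "${toy_clusters}" | tr " " ",")

rm "${toy_clusters}"
printf '%s\n' "${mothur_line}"
```

The printed line has four tab-separated fields: the label `d1`, the cluster count `2`, then the two clusters `a_10,b_2,c_1` and `d_5`, followed by a trailing tab, as in the lines written to test.list.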