
Clonality

Serghei Mangul edited this page Sep 3, 2017 · 5 revisions

We provide scripts to estimate clonality of the sample based on the assembled CDR3 sequences and their relative abundances.

To estimate the clonality of a sample, run the following command:

python clonality.py toyExample/toyExample.cdr3 clonality

The following informative message appears on the screen:

Read CDR3 assembled by ImreP toyExample/toyExample.cdr3
Total number of IGH CDR3 is 8
Total number of IGK CDR3 is 0
Total number of IGL CDR3 is 0
Total number of TCRA CDR3 is 0
Total number of TCRB CDR3 is 0
Total number of TCRG CDR3 is 0
Total number of TCRD CDR3 is 0

Results are written to the clonality directory (the second argument of the command). This directory contains SUMMARY_clonality.txt, which reports the following measures of clonality per chain:

  • Number of distinct CDR3s
  • Number of reads supporting CDR3s
  • Immune diversity (alpha diversity, Shannon entropy)
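As a rough illustration of the last measure, Shannon entropy can be computed from the read counts of the distinct CDR3s of one chain. This is a minimal sketch, not the code used by clonality.py: the function name `shannon_entropy`, the choice of the natural logarithm, and the example counts are all assumptions made for illustration.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy H = -sum(p_i * ln(p_i)) over CDR3 relative abundances.

    `counts` holds the number of reads supporting each distinct CDR3.
    The natural logarithm is an assumption; clonality.py may use another base.
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # no CDR3s for this chain
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total  # relative abundance of this CDR3
            entropy -= p * math.log(p)
    return entropy


# Hypothetical read counts for four distinct CDR3s (not taken from toyExample).
example_counts = [10, 10, 10, 10]
print(shannon_entropy(example_counts))
```

With equal abundances the entropy reaches its maximum, the logarithm of the number of distinct CDR3s; a sample dominated by a single clone gives a value close to zero.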

The format of SUMMARY_clonality.txt is:

SAMPLE,nIGH,nIGK,nIGL,nTCRA,nTCRB,nTCRD,nTCRG,loadIGH,loadIGK,loadIGL,loadTCRA,loadTCRB,loadTCRD,loadTCRG,alphaIGH,alphaIGK,alphaIGL,alphaTCRA,alphaTCRB,alphaTCRD,alphaTCRG
toyExample,8,0,0,0,0,0,0,92,0,0,0,0,0,0,1.8773144224,0,0,0,0,0,0
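The summary can be loaded for downstream analysis with a few lines of Python. The sketch below assigns columns by position, assuming the layout is SAMPLE followed by the n, load and alpha values for the seven chains in the order IGH, IGK, IGL, TCRA, TCRB, TCRD, TCRG. The names `CHAINS`, `COLUMNS` and `parse_summary` are hypothetical helpers, not part of clonality.py.

```python
import csv
import io

CHAINS = ["IGH", "IGK", "IGL", "TCRA", "TCRB", "TCRD", "TCRG"]
# Assumed layout: SAMPLE, then n*, load* and alpha* for each chain.
COLUMNS = ["SAMPLE"] + [
    prefix + chain for prefix in ("n", "load", "alpha") for chain in CHAINS
]


def parse_summary(handle):
    """Return one dict per sample row, keyed by the assumed column names."""
    reader = csv.reader(handle)
    next(reader)  # skip the header line; columns are assigned by position
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        values = [value.strip() for value in row]
        record = {"SAMPLE": values[0]}
        for name, value in zip(COLUMNS[1:], values[1:]):
            # n* and load* are integer counts; alpha* is a float
            record[name] = float(value) if name.startswith("alpha") else int(value)
        records.append(record)
    return records


# Inline stand-in for SUMMARY_clonality.txt: a placeholder header line
# followed by the toyExample data row.
example = io.StringIO(
    "header line (ignored)\n"
    "toyExample,8,0,0,0,0,0,0,92,0,0,0,0,0,0,1.8773144224,0,0,0,0,0,0\n"
)
for rec in parse_summary(example):
    print(rec["SAMPLE"], rec["nIGH"], rec["loadIGH"], rec["alphaIGH"])
```

To read a real file, pass an open file handle, for example `parse_summary(open("clonality/SUMMARY_clonality.txt"))`.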

The output directory also contains the CDR3s of each chain, saved in separate files:

IGH_cdr3_clonality2.txt  TCRD_cdr3_clonality2.txt
IGK_cdr3_clonality2.txt  TCRA_cdr3_clonality2.txt  TCRG_cdr3_clonality2.txt
IGL_cdr3_clonality2.txt  TCRB_cdr3_clonality2.txt

Across multiple samples

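The following commands can be used to estimate clonality across multiple samples. The first writes the sample names (the .cdr3 file names without the extension) to samples.txt; the second runs clonality.py once per sample, using the sample name as the output directory: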

ls *.cdr3 | awk -F ".cdr3" '{print $1}' > samples.txt
while read -r line; do python ~/code/imrep/clonality.py "${line}.cdr3" "${line}"; done < samples.txt
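The same loop can be written in Python, which avoids the intermediate samples.txt file. This is a sketch under stated assumptions: `build_commands` and `run_all` are hypothetical helpers, the interpreter is invoked as `python` as in the shell command, and the script path is the one used in the loop above.

```python
import os
import subprocess
from pathlib import Path


def build_commands(directory, script):
    """Return one clonality.py argument list per *.cdr3 file in `directory`."""
    commands = []
    for cdr3_file in sorted(Path(directory).glob("*.cdr3")):
        # Strip ".cdr3" so the sample name doubles as the output directory.
        sample = cdr3_file.with_suffix("")
        commands.append(["python", str(script), str(cdr3_file), str(sample)])
    return commands


def run_all(directory, script):
    """Run clonality.py once per sample, stopping at the first failure."""
    for command in build_commands(directory, script):
        subprocess.run(command, check=True)


# Script location copied from the shell loop; adjust it to your checkout.
script_path = os.path.expanduser("~/code/imrep/clonality.py")

# Print the commands that would be run for the current directory.
for command in build_commands(".", script_path):
    print(" ".join(command))
```

Calling `run_all(".", script_path)` executes the printed commands. Printing them first is a cheap way to check the sample names before starting a long run.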
