Contents
- S1: Data sets and preprocessing
- S2: Assembly tools
- S3: Executed assembly commands
- S4: HISAT2 re-mapping rate
- S5: RNAQuast statistics
- S6: TransRate
- S7: ExN50
- S8: BUSCO
- S9: DETONATE
- S10: Selected main metrics
- S11: Runtime and memory consumption
- S12: (0,1)-normalized scores per data set and metric
All big data files (processed read data, assemblies, ...) as well as execution commands and Blast results are also available at the Open Science Framework under accession doi.org/10.17605/OSF.IO/5ZDX4.
S1: Data sets and preprocessing
Overview of all data sets used for assembly. All reads were quality-checked with FastQC and Prinseq prior to assembly. Shown are read statistics before and after trimming. If a data set was sequenced with strand specificity preserved (ss), this is indicated by the orientation of the sequenced read (pair): F (forward), R (reverse).
ID | Species | Domain | Encoding | Type | Stranded | # Reads (raw) | Read length (raw) | %GC (raw) | FastQC (raw) | # Reads (trimmed) | Read length (trimmed) | %GC (trimmed) | FastQC (trimmed) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ECO | Escherichia coli | Bacteria | Illumina 1.9 | single-end | ss F | 7943130 | 94 | 40% | SINGLE | 6643709 | 25-94 | 50% | SINGLE |
CAL3 | Candida albicans | Fungi | Illumina 1.9 | paired-end | not | 11576932 | 51 | 38% | FORWARD / REVERSE | 11168529 | 25-51 | 38% | FORWARD / REVERSE |
ATH | Arabidopsis thaliana | Plant | Illumina 1.9 | single-end | not | 16911774 | 30-101 | 46% | SINGLE | 16828036 | 25-101 | 46% | SINGLE |
MMU | Mus musculus | Mammal | Illumina 1.9 | paired-end | ss FR | 52645238 | 76 | 48% | FORWARD / REVERSE | 43321537 | 25-76 | 48% | FORWARD / REVERSE |
HSA | Homo sapiens | Mammal | Illumina 1.9 | paired-end | ss FR | 97548052 | 101 | 53% | FORWARD / REVERSE | 96093116 | 25-101 | 52% | FORWARD / REVERSE |
EBOV_HSA_3H | Homo sapiens + EBOV 3h | Mammal+Virus | Illumina 1.9 | paired-end | not | 17203785 | 20-100 | 48% | FORWARD / REVERSE | 15649763 | 25-100 | 47% | FORWARD / REVERSE |
EBOV_HSA_7H | Homo sapiens + EBOV 7h | Mammal+Virus | Illumina 1.9 | paired-end | not | 24688523 | 20-100 | 47% | FORWARD / REVERSE | 22268601 | 25-100 | 47% | FORWARD / REVERSE |
EBOV_HSA_23H | Homo sapiens + EBOV 23h | Mammal+Virus | Illumina 1.9 | paired-end | not | 26470155 | 20-100 | 46% | FORWARD / REVERSE | 23930250 | 25-100 | 46% | FORWARD / REVERSE |
HSA_FLUX | Homo sapiens simulated | Mammal-simulated | Illumina 1.9 | paired-end | not | 60002178 | 3-100 | 50% | FORWARD / REVERSE | 56962181 | 25-100 | 49% | FORWARD / REVERSE |
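As a reading aid for the table, the share of reads that survived trimming can be derived from the two read-count columns. The following sketch is not part of the original workflow; the helper name `retained_pct` is hypothetical, and the counts are copied from the ECO and HSA rows.

```shell
# Hypothetical helper (not from the original pipeline): percentage of reads
# that remain after trimming, given the raw and the trimmed read count.
retained_pct() {
    # $1 = number of raw reads, $2 = number of trimmed reads
    awk -v r="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * t / r }'
}

# Read counts taken from the ECO and HSA rows of the table.
echo "ECO: $(retained_pct 7943130 6643709)%"
echo "HSA: $(retained_pct 97548052 96093116)%"
```

The same call works for any other row by substituting the two counts.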
S2: Assembly tools
Overview of the evaluated assembly tools. We obtained the most recent versions in November 2018. We chose Oases and Trans-ABySS as two transcriptome assemblers built on top of the genome assemblers Velvet and ABySS, respectively. To obtain an assembly with Oases or Trans-ABySS, one first has to run the underlying genome assembly tool with a range of different k-mer values. SOAPdenovo-Trans, built on the principles of SOAPdenovo2, can be used as a stand-alone tool for de novo transcriptome assembly based on a single k-mer value. Trinity was also designed specifically for transcriptome assembly and uses a fixed k-mer value of 25. All four of these transcriptome assemblers were designed for RNA-Seq data and are based on de Bruijn graph algorithms. In contrast, the MIRA assembler works on overlap graphs and therefore does not decompose reads into k-mers before assembly. It can be run in EST mode to deal with the special requirements of RNA-Seq data. In addition to these transcriptome assemblers, we also chose one de novo genome assembly tool, SPAdes, originally developed for smaller bacterial-size genomes and single-cell data and also based on de Bruijn graphs and multiple k-mer values.
MK (multiple k-mer mode): yes, if the tool has a built-in multiple k-mer approach and automatically merges the output of the different k-mer runs.
a) Oases was used on top of the de novo genome assembler Velvet (v1.2.10).
b) SPAdes, originally designed as a de novo genome assembler for single-cell data, was used in RNA-Seq mode (--rna) and single-cell mode (--sc), respectively.
c) When running SPAdes in RNA-Seq mode, two k-mer values are used.
Assembler | Version | Multiple k-mer mode (MK) | PMID |
---|---|---|---|
Trinity | 2.8.4 | no | 21572440 |
Trans-ABySS | 2.0.1 | yes | 20935650 |
SOAPdenovo-Trans | 1.03 | no | 24532719 |
Oases (a) | 0.2.08 | yes | 22368243 |
IDBA-Tran | 1.1.1 | yes | 23813001 |
Bridger | v2014-12-01 | no | 25723335 |
BinPacker | 1.0 | no | 26894997 |
Shannon | 0.0.2 | no | bioRxiv |
SPAdes-sc (b) | 3.13.0 | yes | 22506599 |
SPAdes-rna (b) | 3.13.0 | yes (c) | bioRxiv |
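To make the notion of a k-mer concrete: de Bruijn graph assemblers break every read into all of its overlapping substrings of length k, so a read of length L contributes L - k + 1 k-mers. The sketch below only illustrates this decomposition for a single toy sequence; `kmers` is a hypothetical helper and does not reflect how any of the listed tools is implemented.

```shell
# Illustrative only: print all overlapping k-mers of one sequence.
# `kmers` is a hypothetical helper, not part of any assembler.
kmers() {
    # $1 = sequence, $2 = k
    awk -v s="$1" -v k="$2" 'BEGIN {
        for (i = 1; i + k - 1 <= length(s); i++) print substr(s, i, k)
    }'
}

# Toy example: a 6 nt sequence yields 6 - 3 + 1 = 4 k-mers for k = 3.
kmers ACGTAC 3
```

Larger k values produce fewer but more specific k-mers per read, which is one reason why several tools in the table combine the results of multiple k values.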
S3: Executed assembly commands
In the following, we provide all commands that were executed to run the different assemblies. Each assembler listed in Tab. S2 was run on each data set listed in Tab. S1. For details such as the k-mer values used and the strand specificity, download the readme files or expand the view for the corresponding data set. The command files still include the execution command for MIRA; however, its output was not used in the comparison because MIRA performed worse than the other tools on all data sets (very short contigs, almost no homology to known transcripts, low re-mapping rate, ...; see manuscript).
Assembly of Escherichia coli
#######################################
## Eco1 1x 94bp, strand specific F
#######################################
##PARAMETERS SHARED
PROJECT=eco
R1=eco1_formated.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=35
kmer3=45
kmer4=55
kmer5=65
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --single $R1 --CPU $THREADS --output $DIR/trinity/ --SS_lib_type F
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --se $R1 --SS --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer5 --SS --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fa
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/$PROJECT.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=94" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=94" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq single end files" >> $CONFIG
echo "q="$R1 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/$PROJECT.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades
rnaspades.py --s1 $R1 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc --s1 $R1 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=10
type=-short
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 -strand_specific " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
###this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), please convert them yourself.
mkdir $DIR/idba-tran
fq2fa --filter $R1 $DIR/idba-tran/r1.fasta
idba_tran -r $DIR/idba-tran/r1.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer5 --step 10 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fa
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --single $R1 --SS_lib_type F --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --single $R1 --SS_lib_type F --CPU $THREADS -o $OUT # start a second time, because the first run crashes at some point and the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
cd $OUT
BinPacker -s fq -p single -u $R1 -m F -o $OUT
cd ..
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --single $R1 --ss
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
echo "# defining illumina single end reads" >> $DIR/mira/manifest.config
echo "readgroup = se" >> $DIR/mira/manifest.config
echo "data = $R1" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
#echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
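The multi-k tools in the script above all work on the series 25, 35, 45, 55, 65, and the Oases block passes `kmer5 + 1` as its maximum. The "+ 1" suggests that the upper bound of the Oases pipeline is treated as exclusive; this is an assumption here, not something stated in the source. Under that assumption, the k values covered by a given minimum, maximum, and step can be enumerated with the following sketch (`k_series` is a hypothetical helper):

```shell
# Enumerate k values from k_min up to, but not including, k_max.
# Assumption: the upper bound is exclusive, as the "+ 1" in the script suggests.
k_series() {
    # $1 = k_min, $2 = k_max (exclusive), $3 = step
    k=$1
    while [ "$k" -lt "$2" ]; do
        printf '%s ' "$k"
        k=$((k + $3))
    done
    echo
}

# Values used for the E. coli data set: k_min=25, k_max=65+1, k_step=10.
k_series 25 66 10
```

With an exclusive bound, passing 65 instead of 66 would drop the largest k value from the series.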
Assembly of Candida albicans
#######################################
## CAL, 2x 51bp not strand-specific
#######################################
##PARAMETERS SHARED
PROJECT=cal3
R1=cal3_1_formated.fastq
R2=cal3_2_formated.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=21
kmer2=27
kmer3=33
kmer4=39
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer4 --SS --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=51" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=51" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
##ATTENTION: CHANGED KMER BECAUSE DEFAULT 55 IS TOO BIG
mkdir -p $DIR/rnaspades
rnaspades.py -k 25 --pe1-1 $R1 --pe1-2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc --pe1-1 $R1 --pe1-2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
##################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer4 + 1))
k_step=6
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
###this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), please convert them yourself.
mkdir $DIR/idba-tran
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer4 --step 6 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # start a second time, because the first run crashes at some point and the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -o $OUT
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
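Each assembler block above ends by linking its result into `$FINAL_DIR`. Because `ln -s` succeeds even if the target file was never written, a simple check after a run is to count the FASTA headers behind every link. The sketch below is not part of the original scripts; the function name `count_contigs` and the demo files are hypothetical.

```shell
# Hypothetical check: report the number of sequences in each linked assembly,
# or flag links whose target does not exist (e.g. because an assembler failed).
count_contigs() {
    # $1 = directory holding the symlinked assemblies
    for f in "$1"/*.fa "$1"/*.fasta; do
        # Skip unmatched glob patterns, but keep dangling symlinks.
        [ -e "$f" ] || [ -L "$f" ] || continue
        if [ -e "$f" ]; then
            printf '%s\t%s\n' "$(basename "$f")" "$(grep -c '^>' "$f")"
        else
            printf '%s\tMISSING\n' "$(basename "$f")"
        fi
    done
}

# Demo on a throwaway directory with one tiny assembly and one dangling link.
demo=$(mktemp -d)
printf '>c1\nACGT\n>c2\nGGCC\n' > "$demo/trinity.fasta"
ln -s "$demo/does-not-exist.fa" "$demo/oases.fasta"
count_contigs "$demo"
rm -rf "$demo"
```

In a real run, the function would be called as `count_contigs "$FINAL_DIR"` once all assemblers have finished.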
Assembly of Arabidopsis thaliana
#######################################
## Arabidopsis 1x 100bp, not strand specific
#######################################
##PARAMETERS SHARED
PROJECT=ath
R1=ath_formated.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=35
kmer3=45
kmer4=55
kmer5=65
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --single $R1 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --se $R1 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer5 --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/$PROJECT.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq single end files" >> $CONFIG
echo "q="$R1 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/$PROJECT.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades
rnaspades.py --s1 $R1 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc --s1 $R1 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=10
type=-short
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
###this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), please convert them yourself.
mkdir $DIR/idba-tran
fq2fa --filter $R1 $DIR/idba-tran/r1.fasta
idba_tran -r $DIR/idba-tran/r1.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer5 --step 10 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --single $R1 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --single $R1 --CPU $THREADS -o $OUT # start a second time, because the first run crashes at some point and the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
cd $OUT
BinPacker -s fq -p single -u $R1 -o $OUT
cd ..
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --single $R1
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
echo "# defining illumina single end reads" >> $DIR/mira/manifest.config
echo "readgroup = se" >> $DIR/mira/manifest.config
echo "data = $R1" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
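The SOAPdenovo-Trans configuration in the script above is assembled line by line with `echo`. A here-document is an equivalent and arguably easier-to-read way to produce the same file. The sketch below reproduces the single-end A. thaliana settings and writes them to a temporary file so that it can be run without the read data; the function name `write_soap_config` is hypothetical and not part of the original scripts.

```shell
# Hypothetical helper: write a single-end SOAPdenovo-Trans config with the
# same keys and values as the echo-based version in the script above.
write_soap_config() {
    # $1 = output config path, $2 = maximal read length, $3 = single-end FASTQ
    cat > "$1" <<EOF
#maximal read length
max_rd_len=$2
[LIB]
#maximal read length in this lib
rd_len_cutoff=$2
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#fastq single end files
q=$3
EOF
}

# Demo with the A. thaliana values; the FASTQ file does not need to exist.
cfg=$(mktemp)
write_soap_config "$cfg" 100 ath_formated.fastq
cat "$cfg"
rm -f "$cfg"
```

For paired-end data, the last line would be replaced by a `q1=`/`q2=` pair as in the C. albicans and M. musculus scripts.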
Assembly of Mus musculus
#######################################
## MMU TRINITY TEST DATA, 2x 76bp, strand specific RF
#######################################
##PARAMETERS SHARED
PROJECT=mmu
R1=mmu_1_formated.fastq
R2=mmu_2_formated.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=35
kmer3=45
kmer4=55
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/ --SS_lib_type FR
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --SS --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer4 --SS --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=76" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=76" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=1" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer4
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades
rnaspades.py --pe1-1 $R1 --pe1-2 $R2 --pe1-fr -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc --pe1-1 $R1 --pe1-2 $R2 --pe1-fr -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
##################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer4 + 1))
k_step=10
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -strand_specific -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
###this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), please convert them yourself.
mkdir $DIR/idba-tran
R1=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/mmu_1_formated_reversed.fastq
R2=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/mmu_2_formated_reversed.fastq
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer4 --step 10 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
R1=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/mmu_1_formated.fastq
R2=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/mmu_2_formated.fastq
###################################################################################
###################################################################################
## BRIDGER
# Error in strand-specific mode; run in unstranded mode instead
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # start a second time, because the first run crashes at some point and the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -m RF -o $OUT
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2 --ss
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
#echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
echo "segment_placement = <--- --->" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
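The MIRA manifest in the script above is assembled from a series of echo calls. As a minimal sketch, the same file could be written in one step with a here-document. The function name write_mira_manifest, its argument order and the placeholder values in the example call are assumptions made for illustration and are not part of the original scripts; the manifest keys and values are copied from the commands above.

```shell
# Hypothetical helper (not part of the original scripts): writes the MIRA
# manifest for a paired-end library in one step.
# Arguments: manifest path, project name, read file 1, read file 2,
#            segment placement string, MIRA temporary directory.
write_mira_manifest() {
    cat > "$1" <<EOF
project = $2
job = est,denovo,accurate
parameters = -DI:trt=$6 -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1
# defining illumina paired end reads
readgroup = pe
data = $3 $4
technology = solexa
template_size = 50 1000 autorefine
segment_placement = $5
EOF
}

# Example call with placeholder values; writes into a scratch directory.
demo_dir=$(mktemp -d)
write_mira_manifest "$demo_dir/manifest.config" mmu reads_1.fastq reads_2.fastq "<--- --->" /tmp
cat "$demo_dir/manifest.config"
```

Passing the segment placement as an argument keeps the only line that differs between the stranded and unstranded data sets in one visible place.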
Assembly of Homo sapiens (download script)
#######################################
## HSA SRA, 2x 100bp, strand specific RF
#######################################
##PARAMETERS SHARED
PROJECT=hsa
R1=hsa_1.fastq
R2=hsa_2.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=35
kmer3=45
kmer4=55
kmer5=65
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/ --SS_lib_type FR
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --SS --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer5 --SS --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=1" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades
rnaspades.py --pe1-1 $R1 --pe1-2 $R2 --pe1-fr -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc --pe1-1 $R1 --pe1-2 $R2 --pe1-fr -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=10
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -strand_specific -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
### this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), convert them yourself.
mkdir $DIR/idba-tran
R1=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/hsa_1_reversed.fastq
R2=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/hsa_2_reversed.fastq
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer5 --step 10 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
R1=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/hsa_1.fastq
R2=/mnt/dessertlocal/projects/transcriptome_assembly/data/prinseq/hsa_2.fastq
###################################################################################
###################################################################################
## BRIDGER
# The strand-specific run produced an error, so Bridger is run in unstranded mode
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # run a second time, because the first run crashes at some point; the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -m RF -o $OUT
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python /mnt/prostlocal/programs/shannon/0.0.2/shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2 --ss
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
#echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
echo "segment_placement = <--- --->" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
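The Trans-ABySS step in the scripts above uses one variable per k-mer size to pass the prefixes and per-k assemblies to transabyss-merge. The following sketch builds both lists from a single k-mer list instead. It is a dry run: the leading echo only prints the command, and the directory /tmp/demo is a placeholder. The flags are the ones used above; for a strand-specific library, --SS would be added as in the HSA script.

```shell
# Sketch only: builds the transabyss-merge call from a list of k-mer sizes.
# DIR is a placeholder directory; THREADS and the k-mer list follow the HSA script.
DIR=/tmp/demo
THREADS=48
kmers="25 35 45 55 65"

prefixes=""
assemblies=""
mink=""
maxk=""
for kmer in $kmers; do
    if [ -z "$mink" ]; then
        mink=$kmer          # first value in the list
    fi
    maxk=$kmer              # last value in the list
    prefixes="$prefixes k${kmer}."
    assemblies="$assemblies $DIR/transabyss/k${kmer}/k${kmer}-final.fa"
done

# The leading 'echo' makes this a dry run; remove it to run the merge.
# $prefixes and $assemblies are left unquoted on purpose so they split into words.
merge_cmd=$(echo transabyss-merge --threads "$THREADS" --mink "$mink" --maxk "$maxk" --prefixes $prefixes --out "$DIR/transabyss/merged.fa" $assemblies)
echo "$merge_cmd"
```

With this layout, switching to the k-mer series of another data set (for example 25 to 41 in steps of 4) only requires changing the kmers line.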
Assembly of Homo sapiens + EBOV 3h (download script)
#######################################
## EBOV HSA 3h
## paired-end, not strand specific, 100bp
#######################################
##PARAMETERS SHARED
PROJECT=ebov_hsa_3h
R1=ebov_hsa_3h_1.fastq
R2=ebov_hsa_3h_2.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=29
kmer3=33
kmer4=37
kmer5=41
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink 25 --maxk 41 --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades/
rnaspades.py -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=4
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
### this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), convert them yourself.
mkdir $DIR/idba-tran
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink 25 --maxk 41 --step 4
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
cd $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -o $DIR/binpacker
cd ..
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # run a second time, because the first run crashes at some point; the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
###################################################################################
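In the scripts above Bridger is called twice because the first run crashes and the second one finishes. A possible way to express this without duplicating the command line is a small retry wrapper, sketched below. The helper retry and the stand-in command flaky are hypothetical. The sketch also assumes that a crashed run exits with a non-zero status, which the original scripts do not state for Bridger.

```shell
# Hypothetical helper: rerun a command until it exits with status 0,
# giving up after max_tries attempts.
retry() {
    max_tries=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt failed: $1" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Stand-in for an assembler that fails on its first call and succeeds on the second.
state_file=$(mktemp)
flaky() {
    if [ -s "$state_file" ]; then
        return 0
    fi
    echo started > "$state_file"
    return 1
}

retry 2 flaky && echo "finished"
```

Under the stated assumption, the Bridger call would then read: retry 2 Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT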
Assembly of Homo sapiens + EBOV 7h (download script)
#######################################
## EBOV HSA 7h
## paired-end, not strand specific, 100bp
#######################################
##PARAMETERS SHARED
PROJECT=ebov_hsa_7h
R1=ebov_hsa_7h_1.fastq
R2=ebov_hsa_7h_2.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=29
kmer3=33
kmer4=37
kmer5=41
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink 25 --maxk 41 --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades/
rnaspades.py -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=4
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
### this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), convert them yourself.
mkdir $DIR/idba-tran
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink 25 --maxk 41 --step 4
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
cd $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -o $DIR/binpacker
cd ..
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # run a second time, because the first run crashes at some point; the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
###################################################################################
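Each assembler block above ends by linking its output file into $FINAL_DIR. Since ln -s also succeeds when the target does not exist, a failed assembly would leave a dangling link. The sketch below checks the output first. The helper name link_assembly and the scratch files are illustrative assumptions, not part of the original scripts.

```shell
# Hypothetical helper: link an assembler's output into the final directory
# only if the file exists and is not empty.
link_assembly() {
    src=$1
    dest=$2
    if [ -s "$src" ]; then
        ln -sf "$src" "$dest"
    else
        echo "missing or empty assembly: $src" >&2
        return 1
    fi
}

# Demonstration in a scratch directory with a placeholder FASTA file.
scratch=$(mktemp -d)
mkdir -p "$scratch/final"
printf '>contig1\nACGT\n' > "$scratch/contigs.fa"
link_assembly "$scratch/contigs.fa" "$scratch/final/demo.fasta"
link_assembly "$scratch/absent.fa" "$scratch/final/absent.fasta" || echo "skipped"
```

In the scripts, a call such as link_assembly $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta would take the place of the plain ln -s line.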
Assembly of Homo sapiens + EBOV 23h (download script)
#######################################
## EBOV HSA 23h
## paired-end, not strand specific, 100bp
#######################################
##PARAMETERS SHARED
PROJECT=ebov_hsa_23h
R1=ebov_hsa_23h_1.fastq
R2=ebov_hsa_23h_2.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=29
kmer3=33
kmer4=37
kmer5=41
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink 25 --maxk 41 --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/test.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/test.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades/
rnaspades.py -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=4
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
### this tool assumes the paired-end reads are in order (->, <-). If your data are in order (<-, ->), convert them yourself.
mkdir $DIR/idba-tran
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink 25 --maxk 41 --step 4
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
cd $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -o $DIR/binpacker
cd ..
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # run a second time, because the first run crashes at some point; the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
###################################################################################
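The three EBOV scripts (3h, 7h, 23h) share all settings and differ only in the project name and the read files derived from it. As a sketch, the shared parameters could be set in one loop over the time points. The loop variable tp and the echo line are illustrative; the naming scheme is taken from the scripts above, with $HOME used in place of ~.

```shell
# Sketch: derive the shared parameters for each EBOV time point in one loop.
for tp in 3h 7h 23h; do
    PROJECT=ebov_hsa_$tp
    R1=${PROJECT}_1.fastq
    R2=${PROJECT}_2.fastq
    DIR=$HOME/$PROJECT
    FINAL_DIR=$DIR/final
    # The assembler calls for this data set would follow here.
    echo "would assemble $PROJECT from $R1 and $R2 into $FINAL_DIR"
done
```

The k-mer series (25, 29, 33, 37, 41), THREADS and MEM are identical for the three time points and could be defined once before the loop.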
Assembly of Homo sapiens simulated (download script)
#######################################
## HSA FLUX SIMULATED, 2x100bp, not strand specific
#######################################
##PARAMETERS SHARED
PROJECT=hsa_flux
R1=hsa_flux_1.fastq
R2=hsa_flux_2.fastq
THREADS=48
MEM=400
DIR=~/$PROJECT
FINAL_DIR=~/$PROJECT/final
mkdir -p $DIR
mkdir -p $FINAL_DIR
kmer1=25
kmer2=35
kmer3=45
kmer4=55
kmer5=65
###################################################################################
## TRINITY
mkdir -p $DIR/trinity
Trinity --seqType fq --max_memory $MEM"G" --left $R1 --right $R2 --CPU $THREADS --output $DIR/trinity/
ln -s $DIR/trinity/Trinity.fasta $FINAL_DIR/trinity.fasta
###################################################################################
###################################################################################
## TRANS-ABYSS
mkdir -p $DIR/transabyss
for kmer in $kmer1 $kmer2 $kmer3 $kmer4 $kmer5; do
name=k${kmer}
assemblydir=$DIR/transabyss/${name}
mkdir -p $assemblydir
finalassembly=${assemblydir}/${name}-final.fa
transabyss -k ${kmer} --pe $R1 $R2 --outdir ${assemblydir} --name ${name} --threads $THREADS
done
mergedassembly=$DIR/transabyss/merged.fa
finalassembly1=$DIR/transabyss/k"$kmer1"/k"$kmer1"-final.fa
finalassembly2=$DIR/transabyss/k"$kmer2"/k"$kmer2"-final.fa
finalassembly3=$DIR/transabyss/k"$kmer3"/k"$kmer3"-final.fa
finalassembly4=$DIR/transabyss/k"$kmer4"/k"$kmer4"-final.fa
finalassembly5=$DIR/transabyss/k"$kmer5"/k"$kmer5"-final.fa
transabyss-merge --threads $THREADS --mink $kmer1 --maxk $kmer5 --prefixes k${kmer1}. k${kmer2}. k${kmer3}. k${kmer4}. k${kmer5}. --out ${mergedassembly} ${finalassembly1} ${finalassembly2} ${finalassembly3} ${finalassembly4} ${finalassembly5}
ln -s $DIR/transabyss/merged.fa $FINAL_DIR/trans-abyss.fasta
###################################################################################
###################################################################################
## SOAP-TRANS
mkdir -p $DIR/soap-trans
CONFIG=$DIR/soap-trans/$PROJECT.config
touch $CONFIG
echo '#maximal read length' > $CONFIG
echo "max_rd_len=100" >> $CONFIG
echo "[LIB]" >> $CONFIG
echo "#maximal read length in this lib" >> $CONFIG
echo "rd_len_cutoff=100" >> $CONFIG
#echo "#average insert size" >> $CONFIG
#echo "avg_ins=200" >> $CONFIG
echo "#if sequence needs to be reversed" >> $CONFIG
echo "reverse_seq=0" >> $CONFIG
echo "#in which part(s) the reads are used" >> $CONFIG
echo "asm_flags=3" >> $CONFIG
#echo "q=#{@reads}" >> $CONFIG #if @type == :SE
echo "#fastq paired end files" >> $CONFIG
echo "q1="$R1 >> $CONFIG
echo "q2="$R2 >> $CONFIG
kmer_min=$kmer1
kmer_max=$kmer5
# run soap with default settings
mkdir -p $DIR/soap-trans/default
SOAPdenovo-Trans-127mer all -s $DIR/soap-trans/$PROJECT.config -o $DIR/soap-trans/default/k23 -p $THREADS
ln -s $DIR/soap-trans/default/k23.scafSeq $FINAL_DIR/soap-trans-default.fasta
###################################################################################
###################################################################################
## RNA-SPADES
mkdir -p $DIR/rnaspades
rnaspades.py -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/rnaspades/
ln -s $DIR/rnaspades/transcripts.fasta $FINAL_DIR/rna-spades.fasta
###################################################################################
###################################################################################
## SPADES
mkdir -p $DIR/spades
spades.py --sc -1 $R1 -2 $R2 -t $THREADS -m $MEM --cov-cutoff auto -o $DIR/spades/
ln -s $DIR/spades/scaffolds.fasta $FINAL_DIR/spades.fasta
###################################################################################
###################################################################################
## OASES
k_min=$kmer1
k_max=$((kmer5 + 1))
k_step=10
type=-shortPaired
mkdir -p $DIR/oases
oases_pipeline.py -m $k_min -M $k_max -s $k_step -o $DIR/oases/ -d " $type -fastq $R1 $R2 -separate " -c
ln -s $DIR/oases/Merged/transcripts.fa $FINAL_DIR/oases.fasta
###################################################################################
###################################################################################
## IDBA-TRANS
### IDBA-Tran assumes that paired-end reads are in the order (->, <-). If your data are in the order (<-, ->), convert them first.
mkdir $DIR/idba-tran
fq2fa --merge --filter $R1 $R2 $DIR/idba-tran/r12.fasta
idba_tran -r $DIR/idba-tran/r12.fasta -o $DIR/idba-tran/ --mink $kmer1 --maxk $kmer5 --step 10 --num_threads $THREADS
ln -s $DIR/idba-tran/contig.fa $FINAL_DIR/idba-tran.fasta
###################################################################################
###################################################################################
## BRIDGER
OUT=$DIR/bridger/
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT
Bridger.pl --seqType fq --left $R1 --right $R2 --CPU $THREADS -o $OUT # run a second time because the first run crashes at some point; the second run then finishes
ln -s $OUT/Bridger.fasta $FINAL_DIR/bridger.fasta
###################################################################################
###################################################################################
## BINPACKER
OUT=$DIR/binpacker/
mkdir -p $OUT
BinPacker -s fq -p pair -l $R1 -r $R2 -o $OUT
ln -s $OUT/BinPacker_Out_Dir/BinPacker.fa $FINAL_DIR/binpacker.fasta
###################################################################################
###################################################################################
## SHANNON
mkdir $DIR/shannon
python shannon.py -p $THREADS -o $DIR/shannon --left $R1 --right $R2
ln -s $DIR/shannon/shannon.fasta $FINAL_DIR/shannon.fasta
###################################################################################
###################################################################################
## MIRA
mkdir $DIR/mira
touch $DIR/mira/manifest.config
echo "project = $PROJECT" > $DIR/mira/manifest.config
echo "job = est,denovo,accurate" >> $DIR/mira/manifest.config
echo "parameters = -DI:trt=/mnt/dessertlocal/projects/transcriptome_assembly/tmp -NW:cnfs=no -NW:cmrnl=no -SK:mmhr=1" >> $DIR/mira/manifest.config
# if @type == :SE
# config << "# defining illumina single end reads"
# config << "readgroup = se"
# config << "data = #{@reads}"
# config << "technology = #{@technology}"
# end
echo "# defining illumina paired end reads" >> $DIR/mira/manifest.config
echo "readgroup = pe" >> $DIR/mira/manifest.config
echo "data = $R1 $R2" >> $DIR/mira/manifest.config
echo "technology = solexa" >> $DIR/mira/manifest.config
echo "template_size = 50 1000 autorefine" >> $DIR/mira/manifest.config
echo "segment_placement = ---> <---" >> $DIR/mira/manifest.config
mira -c $DIR/mira -t $THREADS $DIR/mira/manifest.config
ln -s $DIR/mira/"$PROJECT"_assembly/"$PROJECT"_d_results/"$PROJECT"_out.unpadded.fasta $FINAL_DIR/mira.fasta
###################################################################################
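Because every assembly is exposed as a symbolic link in `$FINAL_DIR`, a failed assembler run leaves a dangling link rather than a missing file. The following Python sketch is not part of the original pipeline; it only illustrates one way to list assemblies whose link is absent, dangling, or empty. The function name `missing_assemblies` and its arguments are illustrative assumptions, while the file names are those used in the `ln -s` commands above.

```python
import os

# Output names taken from the `ln -s` commands above.
ASSEMBLIES = [
    "trans-abyss", "soap-trans-default", "rna-spades", "spades", "oases",
    "idba-tran", "bridger", "binpacker", "shannon", "mira",
]


def missing_assemblies(final_dir, names=ASSEMBLIES):
    """Return the names whose <name>.fasta is absent, dangling, or empty."""
    missing = []
    for name in names:
        path = os.path.join(final_dir, name + ".fasta")
        # os.path.exists follows symlinks, so a dangling link counts as missing.
        if not os.path.exists(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Calling `missing_assemblies` with the directory stored in `$FINAL_DIR` returns an empty list when all ten links point to non-empty FASTA files.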
S4: HISAT2 re-mapping rate
We mapped the quality-controlled reads back to the contigs assembled from them to determine the proportion of reads that each tool incorporated into its assembly.
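As a minimal sketch of how such a re-mapping rate can be read off, the Python snippet below extracts the overall alignment rate from a HISAT2 run summary. It assumes the default summary format that HISAT2 writes to standard error; the function name `overall_alignment_rate` is illustrative and the read counts in `example` are invented, not taken from this study.

```python
import re


def overall_alignment_rate(summary_text):
    """Extract the overall alignment rate (in percent) from a HISAT2 summary."""
    match = re.search(r"(\d+(?:\.\d+)?)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


# Invented summary for single-end reads, for illustration only.
example = """\
1000 reads; of these:
  1000 (100.00%) were unpaired; of these:
    230 (23.00%) aligned 0 times
    700 (70.00%) aligned exactly 1 time
    70 (7.00%) aligned >1 times
77.00% overall alignment rate
"""

print(overall_alignment_rate(example))
```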
S5: RNAQuast statistics
Overview of the rnaQUAST output. For each data set, the links below lead to a short summary of the assembly comparison or to the full output.
- Escherichia coli: short report (pdf) | short report (tsv) | full report + plots (html)
- Candida albicans: short report (pdf) | short report (tsv) | full report + plots (html)
- Arabidopsis thaliana: short report (pdf) | short report (tsv) | full report + plots (html)
- Mus musculus: short report (pdf) | short report (tsv) | full report + plots (html)
- Homo sapiens: short report (pdf) | short report (tsv) | full report + plots (html)
- Homo sapiens + EBOV 3h: short report (pdf) | short report (tsv) | full report + plots (html)
- Homo sapiens + EBOV 7h: short report (pdf) | short report (tsv) | full report + plots (html)
- Homo sapiens + EBOV 23h: short report (pdf) | short report (tsv) | full report + plots (html)
- Homo sapiens simulated: short report (pdf) | short report (tsv) | full report + plots (html)
S6: TransRate
TransRate: reference-free quality assessment of de novo transcriptome assemblies (PMID: 27252236)
- TransRate results for Escherichia coli: download (CSV) | show (HTML)
- TransRate results for Candida albicans: download (CSV) | show (HTML)
- TransRate results for Arabidopsis thaliana: download (CSV) | show (HTML)
- TransRate results for Mus musculus: download (CSV) | show (HTML)
- TransRate results for Homo sapiens: download (CSV) | show (HTML)
- TransRate results for Homo sapiens + EBOV 3h: download (CSV) | show (HTML)
- TransRate results for Homo sapiens + EBOV 7h: download (CSV) | show (HTML)
- TransRate results for Homo sapiens + EBOV 23h: download (CSV) | show (HTML)
- TransRate results for Homo sapiens simulated: download (CSV) | show (HTML)
S7: ExN50
An alternative to the standard contig Nx statistic (e.g. N50) that can be more appropriate for transcriptome assemblies is the ExN50 statistic. Here, the N50 is computed as usual but restricted to the most highly expressed transcripts that together account for x% of the total normalized expression. We therefore first estimated transcript abundances with the alignment-free quantification tool Salmon and then used the Trinity utilities to calculate ExN50 values. We report the Ex90N50 metric as one of our main metrics.
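The definition can be illustrated with a short Python sketch. This is not the Trinity utility used in the study, and the Trinity script may differ in details such as how expression values are aggregated; the function names and the `(length, expression)` input format are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def exn50(transcripts, x=90):
    """N50 of the most highly expressed transcripts covering x% of expression.

    transcripts: iterable of (length, expression) pairs.
    """
    ranked = sorted(transcripts, key=lambda t: t[1], reverse=True)
    target = sum(expr for _, expr in ranked) * x / 100
    selected, cumulative = [], 0.0
    for length, expr in ranked:
        selected.append(length)
        cumulative += expr
        if cumulative >= target:
            break
    return n50(selected)


# Four hypothetical transcripts: (length in nt, normalized expression).
toy = [(1000, 50), (500, 30), (2000, 15), (300, 5)]
print(exn50(toy, x=90), exn50(toy, x=80))
```

In this toy example the long but weakly expressed 2000 nt transcript is only included once 90% of the expression is covered, so Ex90N50 is 2000 while Ex80N50 is 1000.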
Escherichia coli summed (0,1)-normalized scores
- Trinity: 13.61/18
- Oases: 10.36/18
- Trans_ABySS: 11.53/18
- SOAPdenovo_Trans: 12.59/18
- Bridger: 13.55/18
- BinPacker: 5.55/18
- IDBA_Tran: 11.49/18
- Shannon: 13.08/18
- SPAdes_sc: 13.37/18
- SPAdes_rna: 12.85/18
Candida albicans summed (0,1)-normalized scores
- Trinity: 13.633/20
- Oases: 8.329/20
- Trans_ABySS: 12.486/20
- SOAPdenovo_Trans: 10.277/20
- Bridger: 12.639/20
- BinPacker: 12.243/20
- IDBA_Tran: 8.308/20
- Shannon: 9.632/20
- SPAdes_sc: 14.996/20
- SPAdes_rna: 13.86/20
Arabidopsis thaliana summed (0,1)-normalized scores
- Trinity: 14.15/18
- Oases: 11.53/18
- Trans_ABySS: 12.36/18
- SOAPdenovo_Trans: 11.9/18
- Bridger: 13.31/18
- BinPacker: 5.57/18
- IDBA_Tran: 12.6/18
- Shannon: 11.03/18
- SPAdes_sc: 14.18/18
- SPAdes_rna: 14.3/18
Mus musculus summed (0,1)-normalized scores
- Trinity: 14.209/20
- Oases: 7.358/20
- Trans_ABySS: 14.429/20
- SOAPdenovo_Trans: 15.05/20
- Bridger: 13.501/20
- BinPacker: 5.102/20
- IDBA_Tran: 12.126/20
- Shannon: 11.576/20
- SPAdes_sc: 14.983/20
- SPAdes_rna: 14.46/20
Homo sapiens summed (0,1)-normalized scores
- Trinity: 12.375/20
- Oases: 6.307/20
- Trans_ABySS: 14.235/20
- SOAPdenovo_Trans: 11.92/20
- Bridger: 11.128/20
- BinPacker: 6.587/20
- IDBA_Tran: 8.611/20
- Shannon: 10.299/20
- SPAdes_sc: 11.475/20
- SPAdes_rna: 12.028/20
Homo sapiens + EBOV 3h summed (0,1)-normalized scores
- Trinity: 14.173/20
- Oases: 10.449/20
- Trans_ABySS: 14.35/20
- SOAPdenovo_Trans: 14.63/20
- Bridger: 13.64/20
- BinPacker: 4.808/20
- IDBA_Tran: 12.255/20
- Shannon: 13.418/20
- SPAdes_sc: 15.082/20
- SPAdes_rna: 14.351/20
Homo sapiens + EBOV 7h summed (0,1)-normalized scores
- Trinity: 13.535/20
- Oases: 8.503/20
- Trans_ABySS: 14.111/20
- SOAPdenovo_Trans: 13.243/20
- Bridger: 11.81/20
- BinPacker: 9.171/20
- IDBA_Tran: 8.177/20
- Shannon: 12.505/20
- SPAdes_sc: 12.696/20
- SPAdes_rna: 13.004/20
Homo sapiens + EBOV 23h summed (0,1)-normalized scores
- Trinity: 13.426/20
- Oases: 7.683/20
- Trans_ABySS: 13.865/20
- SOAPdenovo_Trans: 14.527/20
- Bridger: 12.722/20
- BinPacker: 7.552/20
- IDBA_Tran: 9.219/20
- Shannon: 8.989/20
- SPAdes_sc: 12.942/20
- SPAdes_rna: 13.126/20
Homo sapiens simulated summed (0,1)-normalized scores
- Trinity: 14.165/20
- Oases: 9.833/20
- Trans_ABySS: 15.68/20
- SOAPdenovo_Trans: 11.403/20
- Bridger: 12.466/20
- BinPacker: 11.885/20
- IDBA_Tran: 10.306/20
- Shannon: 7.509/20
- SPAdes_sc: 13.219/20
- SPAdes_rna: 12.316/20
- Escherichia coli normalized score heat map (PDF)
- Candida albicans normalized score heat map (PDF)
- Arabidopsis thaliana normalized score heat map (PDF)
- Mus musculus normalized score heat map (PDF)
- Homo sapiens normalized score heat map (PDF)
- Homo sapiens + EBOV 3h normalized score heat map (PDF)
- Homo sapiens + EBOV 7h normalized score heat map (PDF)
- Homo sapiens + EBOV 23h normalized score heat map (PDF)
- Homo sapiens simulated normalized score heat map (PDF)
S8: BUSCO
BUSCO: Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs (PMID: 26059717)
[Figure: BUSCO results, one panel per data set: Escherichia coli, Candida albicans, Arabidopsis thaliana, Homo sapiens + EBOV 3h, Homo sapiens + EBOV 7h, Homo sapiens + EBOV 23h, Mus musculus, Homo sapiens, Homo sapiens simulated]
S9: DETONATE
DETONATE: DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation (PMID: 25608678).
Escherichia coli DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -210712852.43 | 0.817076 | 0.0320276 | 0.639604 |
oases | -245161321.58 | 0.79302 | 0.0264751 | 0.487533 |
trans-abyss | -151118336.50 | 0.822888 | 0.0328714 | 0.611844 |
soap-trans-default | -340812817.07 | 0.861768 | 0.062212 | 0.735111 |
bridger | -174824953.19 | 0.811427 | 0.0303294 | 0.635494 |
binpacker | -262761906.88 | 0.162242 | 0.000152509 | 0.0646008 |
idba-tran | -437779699.97 | 0.828836 | 0.028401 | 0.661802 |
shannon | -197944222.32 | 0.807803 | 0.026662 | 0.62028 |
spades | -194007062.42 | 0.874921 | 0.0582104 | 0.711878 |
rna-spades | -200089085.55 | 0.843184 | 0.0697163 | 0.716887 |
Candida albicans DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -334561156.85 | 0.679298 | 0.0766464 | 0.726441 |
oases | -419101803.95 | 0.617874 | 0.0756063 | 0.505806 |
trans-abyss | -341314448.16 | 0.734363 | 0.0836716 | 0.594869 |
soap-trans-default | -448426551.63 | 0.603828 | 0.0627947 | 0.730318 |
bridger | -354594754.45 | 0.681131 | 0.0647035 | 0.742303 |
binpacker | -355541613.29 | 0.681339 | 0.0650714 | 0.725447 |
idba-tran | -611471453.61 | 0.552593 | 0.0463257 | 0.727045 |
shannon | -356256806.38 | 0.718923 | 0.0686033 | 0.635121 |
spades | -369270117.24 | 0.707372 | 0.0618753 | 0.744748 |
rna-spades | -352586681.38 | 0.691902 | 0.0555932 | 0.738039 |
Arabidopsis thaliana DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -458951076.16 | 0.706259 | 0.0503788 | 0.713888 |
oases | -601376841.14 | 0.657365 | 0.0315914 | 0.421727 |
trans-abyss | -452320279.03 | 0.742054 | 0.0417782 | 0.586087 |
soap-trans-default | -733935298.80 | 0.616916 | 0.0417285 | 0.766783 |
bridger | -540615846.28 | 0.664801 | 0.0413928 | 0.717523 |
binpacker | -963432760.48 | 0.479321 | 0.00483293 | 0.255829 |
idba-tran | -671741034.20 | 0.656141 | 0.0324962 | 0.762677 |
shannon | -1343654128.47 | 0.373612 | 0.0404999 | 0.631825 |
spades | -593763835.97 | 0.713745 | 0.0592855 | 0.792133 |
rna-spades | -573548801.65 | 0.720746 | 0.0674779 | 0.786267 |
Mus musculus DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -2226879012.19 | 0.663224 | 0.00924793 | 0.424159 |
oases | -3338075293.16 | 0.466246 | 0.00191568 | 0.185852 |
trans-abyss | -2152573788.62 | 0.671318 | 0.0245507 | 0.399779 |
soap-trans-default | -2638014207.45 | 0.55072 | 0.0232672 | 0.509524 |
bridger | -2376224869.25 | 0.601723 | 0.00662728 | 0.450365 |
binpacker | -4888140588.38 | 0.284626 | 8.78969e-05 | 0.0681366 |
idba-tran | -5029841952.79 | 0.531343 | 0.0107357 | 0.495647 |
shannon | -2938129166.97 | 0.559955 | 0.00746879 | 0.38078 |
spades | -3008040595.31 | 0.686892 | 0.0051975 | 0.522144 |
rna-spades | -2447339857.69 | 0.714442 | 0.0143569 | 0.498195 |
Homo sapiens DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -6510590491.36 | 0.511762 | 0.0161655 | 0.42599 |
oases | -11817309171.03 | 0.243335 | 0.0189894 | 0.181064 |
trans-abyss | -6255709753.29 | 0.550435 | 0.203218 | 0.509015 |
soap-trans-default | -9033548866.99 | 0.373036 | 0.205052 | 0.566304 |
bridger | -7718007001.60 | 0.398725 | 0.010466 | 0.480306 |
binpacker | -10026565572.34 | 0.366908 | 0.000374367 | 0.1451 |
idba-tran | -16304023086.75 | 0.286957 | 0.0168006 | 0.552302 |
shannon | -8959903268.75 | 0.420737 | 0.0229751 | 0.346262 |
spades | -12123853339.02 | 0.389103 | 0.0140352 | 0.606564 |
rna-spades | -7163901001.16 | 0.427956 | 0.0128943 | 0.619367 |
Homo sapiens + EBOV 3h DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -1235651692.60 | 0.518515 | 0.0149362 | 0.457458 |
oases | -1726915020.53 | 0.400049 | 0.0149624 | 0.213295 |
trans-abyss | -1196653162.51 | 0.541571 | 0.0469628 | 0.458425 |
soap-trans-default | -1348871374.03 | 0.452867 | 0.0387229 | 0.558308 |
bridger | -1245885780.74 | 0.507537 | 0.0134374 | 0.491961 |
binpacker | -2956358196.35 | 0.216398 | 6.61916e-05 | 0.0537161 |
idba-tran | -1840189936.95 | 0.435537 | 0.0135966 | 0.552636 |
shannon | -1264628787.04 | 0.516653 | 0.0279902 | 0.388568 |
spades | -1316600593.27 | 0.493276 | 0.0130479 | 0.571439 |
rna-spades | -1199874337.85 | 0.542847 | 0.022366 | 0.610776 |
Homo sapiens + EBOV 7h DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -1621797552.26 | 0.548502 | 0.0133497 | 0.461589 |
oases | -2354768025.10 | 0.446502 | 0.01377 | 0.204378 |
trans-abyss | -1644765267.08 | 0.565015 | 0.0474134 | 0.456358 |
soap-trans-default | -1853762729.85 | 0.489643 | 0.0382348 | 0.552421 |
bridger | -1803791019.01 | 0.509876 | 0.0115023 | 0.489854 |
binpacker | -1948162277.75 | 0.498755 | 0.000981932 | 0.303781 |
idba-tran | -2796730511.36 | 0.433572 | 0.0122321 | 0.555899 |
shannon | -1739716273.79 | 0.551822 | 0.0303068 | 0.373245 |
spades | -2194102523.92 | 0.481227 | 0.0139344 | 0.570306 |
rna-spades | -1712618020.99 | 0.555596 | 0.0208343 | 0.6074 |
Homo sapiens + EBOV 23h DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -1434423706.95 | 0.767084 | 0.0124217 | 0.44378 |
oases | -3869713310.16 | 0.20915 | 0.0121553 | 0.218292 |
trans-abyss | -1445566961.21 | 0.779996 | 0.044702 | 0.448004 |
soap-trans-default | -1599735160.94 | 0.694346 | 0.0354917 | 0.547701 |
bridger | -1543477872.71 | 0.739796 | 0.0109893 | 0.47975 |
binpacker | -2016262591.54 | 0.706858 | 0.00035399 | 0.203836 |
idba-tran | -4311460879.81 | 0.161691 | 0.0113647 | 0.551236 |
shannon | -4947278470.01 | 0.0955414 | 0.0282574 | 0.379485 |
spades | -3603623299.10 | 0.247863 | 0.0105064 | 0.564203 |
rna-spades | -1673532042.88 | 0.589338 | 0.0196731 | 0.601376 |
Homo sapiens simulated DETONATE results
Assembler | RSEM-EVAL score | KC score | Contig F1 score | Nucleotide F1 score |
---|---|---|---|---|
trinity | -2794698982.84 | 0.883827 | 0.0633606 | 0.566043 |
oases | -4615211068.41 | 0.738406 | 0.0488742 | 0.223824 |
trans-abyss | -2379646547.27 | 0.923633 | 0.0987237 | 0.679067 |
soap-trans-default | -5270177770.68 | 0.595837 | 0.0752428 | 0.733277 |
bridger | -3663922534.65 | 0.81708 | 0.04805 | 0.710854 |
binpacker | -3649658509.59 | 0.821483 | 0.0475004 | 0.601458 |
idba-tran | -7528195849.86 | 0.59977 | 0.0504635 | 0.783647 |
shannon | -11513711720.10 | 0.261483 | 0.0695831 | 0.453332 |
spades | -4779916710.32 | 0.773079 | 0.0528581 | 0.789225 |
rna-spades | -3911011816.48 | 0.785485 | 0.0541619 | 0.695881 |
S10: Selected main metrics
Based on our experience and on comparisons with other studies (see manuscript for details), we selected 20 metrics to give an initial impression of the performance of each assembler on each data set. For each metric and assembly, we calculated a (0,1)-normalized score, displayed in subscript next to the raw value.
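The following Python sketch shows one way such scores can be computed. It assumes plain min-max scaling across the assemblers of one data set, inverted for metrics where lower values are better, which is consistent with the E. coli values reported in this supplement but is not taken from the authors' code. The function names and the choice of score for tied values are illustrative assumptions.

```python
def normalize(values, higher_is_better=True):
    """Min-max scale values to [0, 1]; None (NA) entries stay None."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    low, high = min(present), max(present)
    scores = []
    for v in values:
        if v is None:
            scores.append(None)
        elif high == low:
            scores.append(1.0)  # arbitrary choice when all assemblers tie
        else:
            score = (v - low) / (high - low)
            scores.append(score if higher_is_better else 1.0 - score)
    return scores


def summed_score(scores):
    """Return (sum of available scores, number of available metrics)."""
    present = [s for s in scores if s is not None]
    return sum(present), len(present)


# E. coli values in the order Trinity, Oases, Trans_ABySS, SOAPdenovo_Trans,
# Bridger, BinPacker, IDBA_Tran, Shannon, SPAdes_sc, SPAdes_rna.
mapping_rate = [77.01, 49.16, 95.67, 56.62, 87.35, 71.09, 34.31, 76.69, 88.04, 88.9]
misassemblies = [13, 103, 18, 1, 52, 8, 0, 54, 12, 10]

print([round(s, 2) for s in normalize(mapping_rate)])
print([round(s, 2) for s in normalize(misassemblies, higher_is_better=False)])
```

Rounded to two decimals, both lists agree with the E. coli scores for "Overall mapping rate" and "Misassemblies". Summing the available scores of one assembler gives its overall score; metrics reported as NA are skipped, which is why some data sets are scored out of 18 instead of 20.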
Escherichia coli selected metrics
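Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 77.01<sub>0.7</sub> | 347<sub>0.42</sub> | 13<sub>0.87</sub> | 0.368<sub>0.88</sub> | 467.035<sub>0.2</sub> | 396<sub>0.95</sub> | 1.012<sub>0.98</sub> | 369<sub>0.46</sub> | 407<sub>0.92</sub> | 0.31635<sub>1.0</sub> | 75.0361<sub>1.0</sub> | NA | NA | 2143<sub>0.43</sub> | 0.639604<sub>0.86</sub> | 0.0320276<sub>0.46</sub> | 0.817076<sub>0.92</sub> | -0.21<sub>0.79</sub> | 284<sub>0.83</sub> | 190<sub>0.96</sub> |
Oases | 49.16<sub>0.24</sub> | 743<sub>1.0</sub> | 103<sub>0.0</sub> | 0.706<sub>0.7</sub> | 457.717<sub>0.19</sub> | 414<sub>1.0</sub> | 1.443<sub>0.0</sub> | 229<sub>0.21</sub> | 418<sub>0.94</sub> | 0.30558<sub>0.97</sub> | 65.71726<sub>0.69</sub> | NA | NA | 3602<sub>0.0</sub> | 0.487533<sub>0.63</sub> | 0.0264751<sub>0.38</sub> | 0.79302<sub>0.89</sub> | -0.25<sub>0.66</sub> | 299<sub>0.88</sub> | 172<sub>1.0</sub> |
Trans_ABySS | 95.67<sub>1.0</sub> | 446<sub>0.56</sub> | 18<sub>0.83</sub> | 0.322<sub>0.9</sub> | 310.442<sub>0.0</sub> | 32<sub>0.0</sub> | 1.044<sub>0.9</sub> | 177<sub>0.12</sub> | 436<sub>0.99</sub> | 0.02844<sub>0.07</sub> | 72.73044<sub>0.92</sub> | NA | NA | 3081<sub>0.15</sub> | 0.611844<sub>0.82</sub> | 0.0328714<sub>0.47</sub> | 0.822888<sub>0.93</sub> | -0.15<sub>1.0</sub> | 297<sub>0.88</sub> | 170<sub>1.0</sub> |
SOAPdenovo_Trans | 56.62<sub>0.36</sub> | 439<sub>0.55</sub> | 1<sub>0.99</sub> | 0.134<sub>1.0</sub> | 364.881<sub>0.07</sub> | 261<sub>0.6</sub> | 1.002<sub>1.0</sub> | 173<sub>0.11</sub> | 423<sub>0.96</sub> | 0.19492<sub>0.61</sub> | 71.98807<sub>0.9</sub> | NA | NA | 2579<sub>0.3</sub> | 0.735111<sub>1.0</sub> | 0.062212<sub>0.89</sub> | 0.861768<sub>0.98</sub> | -0.34<sub>0.34</sub> | 316<sub>0.94</sub> | 178<sub>0.99</sub> |
Bridger | 87.35<sub>0.86</sub> | 363<sub>0.44</sub> | 52<sub>0.5</sub> | 0.265<sub>0.93</sub> | 472.806<sub>0.21</sub> | 382<sub>0.92</sub> | 1.014<sub>0.97</sub> | 393<sub>0.5</sub> | 402<sub>0.9</sub> | 0.31082<sub>0.98</sub> | 74.59062<sub>0.99</sub> | NA | NA | 2117<sub>0.44</sub> | 0.635494<sub>0.85</sub> | 0.0303294<sub>0.43</sub> | 0.811427<sub>0.91</sub> | -0.17<sub>0.93</sub> | 285<sub>0.83</sub> | 190<sub>0.96</sub> |
BinPacker | 71.09<sub>0.6</sub> | 66<sub>0.0</sub> | 8<sub>0.92</sub> | 2.051<sub>0.0</sub> | 1100.774<sub>1.0</sub> | 34<sub>0.01</sub> | 1.267<sub>0.4</sub> | 675<sub>1.0</sub> | 37<sub>0.0</sub> | 0.00619<sub>0.0</sub> | 45.13338<sub>0.0</sub> | NA | NA | 191<sub>1.0</sub> | 0.0646008<sub>0.0</sub> | 0.000152509<sub>0.0</sub> | 0.162242<sub>0.0</sub> | -0.26<sub>0.62</sub> | 50<sub>0.0</sub> | 711<sub>0.0</sub> |
IDBA_Tran | 34.31<sub>0.0</sub> | 414<sub>0.51</sub> | 0<sub>1.0</sub> | 0.167<sub>0.98</sub> | 540.828<sub>0.29</sub> | 250<sub>0.57</sub> | 1.001<sub>1.0</sub> | 304<sub>0.34</sub> | 396<sub>0.89</sub> | 0.16029<sub>0.5</sub> | 71.99301<sub>0.9</sub> | NA | NA | 2069<sub>0.45</sub> | 0.661802<sub>0.89</sub> | 0.028401<sub>0.41</sub> | 0.828836<sub>0.94</sub> | -0.44<sub>0.0</sub> | 296<sub>0.87</sub> | 196<sub>0.95</sub> |
Shannon | 76.69<sub>0.69</sub> | 372<sub>0.45</sub> | 54<sub>0.48</sub> | 0.185<sub>0.97</sub> | 482.659<sub>0.22</sub> | 393<sub>0.95</sub> | 1.027<sub>0.94</sub> | 363<sub>0.45</sub> | 399<sub>0.9</sub> | 0.3049<sub>0.96</sub> | 73.3447<sub>0.94</sub> | NA | NA | 2145<sub>0.43</sub> | 0.62028<sub>0.83</sub> | 0.026662<sub>0.38</sub> | 0.807803<sub>0.91</sub> | -0.2<sub>0.83</sub> | 280<sub>0.82</sub> | 198<sub>0.95</sub> |
SPAdes_sc | 88.04<sub>0.88</sub> | 489<sub>0.62</sub> | 12<sub>0.88</sub> | 0.248<sub>0.94</sub> | 558.904<sub>0.31</sub> | 210<sub>0.47</sub> | 1.006<sub>0.99</sub> | 111<sub>0.0</sub> | 441<sub>1.0</sub> | 0.14391<sub>0.44</sub> | 69.46914<sub>0.81</sub> | NA | NA | 2370<sub>0.36</sub> | 0.711878<sub>0.97</sub> | 0.0582104<sub>0.83</sub> | 0.874921<sub>1.0</sub> | -0.19<sub>0.86</sub> | 332<sub>1.0</sub> | 172<sub>1.0</sub> |
SPAdes_rna | 88.9<sub>0.89</sub> | 340<sub>0.4</sub> | 10<sub>0.9</sub> | 0.212<sub>0.96</sub> | 359.195<sub>0.06</sub> | 187<sub>0.41</sub> | 1.009<sub>0.98</sub> | 245<sub>0.24</sub> | 338<sub>0.75</sub> | 0.17587<sub>0.55</sub> | 73.77677<sub>0.96</sub> | NA | NA | 2556<sub>0.31</sub> | 0.716887<sub>0.97</sub> | 0.0697163<sub>1.0</sub> | 0.843184<sub>0.96</sub> | -0.2<sub>0.83</sub> | 255<sub>0.73</sub> | 189<sub>0.96</sub> |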
Escherichia coli summary of (0,1)-normalized scores
Candida albicans selected metrics
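Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 97.32<sub>0.91</sub> | 3417<sub>0.09</sub> | 72<sub>0.89</sub> | 1.819<sub>0.27</sub> | 969.327<sub>0.56</sub> | 4146<sub>0.46</sub> | 1.039<sub>0.97</sub> | 1862<sub>0.73</sub> | 1732<sub>0.76</sub> | 0.18357<sub>0.06</sub> | 81.69261<sub>0.51</sub> | 0.45572<sub>0.83</sub> | 0.13015<sub>0.85</sub> | 9391<sub>0.96</sub> | 0.726441<sub>0.92</sub> | 0.0766464<sub>0.81</sub> | 0.679298<sub>0.7</sub> | -0.33<sub>1.0</sub> | 1355<sub>0.65</sub> | 164<sub>0.7</sub> |
Oases | 93.61<sub>0.6</sub> | 9813<sub>1.0</sub> | 613<sub>0.0</sub> | 2.121<sub>0.09</sub> | 941.6<sub>0.53</sub> | 4808<sub>0.79</sub> | 2.25<sub>0.0</sub> | 1588<sub>0.4</sub> | 1836<sub>0.91</sub> | 0.23593<sub>0.31</sub> | 79.537<sub>0.33</sub> | 0.05827<sub>0.05</sub> | 0.86792<sub>0.0</sub> | 26010<sub>0.0</sub> | 0.505806<sub>0.0</sub> | 0.0756063<sub>0.78</sub> | 0.617874<sub>0.36</sub> | -0.42<sub>0.68</sub> | 1382<sub>0.71</sub> | 140<sub>0.79</sub> |
Trans_ABySS | 98.45<sub>1.0</sub> | 7979<sub>0.74</sub> | 449<sub>0.27</sub> | 1.288<sub>0.58</sub> | 773.584<sub>0.29</sub> | 5214<sub>1.0</sub> | 1.916<sub>0.27</sub> | 1664<sub>0.49</sub> | 1895<sub>1.0</sub> | 0.3824<sub>1.0</sub> | 81.46816<sub>0.49</sub> | 0.03363<sub>0.0</sub> | 0.82194<sub>0.05</sub> | 22581<sub>0.2</sub> | 0.594869<sub>0.37</sub> | 0.0836716<sub>1.0</sub> | 0.734363<sub>1.0</sub> | -0.34<sub>0.96</sub> | 1461<sub>0.87</sub> | 109<sub>0.9</sub> |
SOAPdenovo_Trans | 95.12<sub>0.73</sub> | 2769<sub>0.0</sub> | 9<sub>1.0</sub> | 0.569<sub>1.0</sub> | 567.169<sub>0.0</sub> | 3290<sub>0.03</sub> | 1.007<sub>1.0</sub> | 1374<sub>0.14</sub> | 1274<sub>0.08</sub> | 0.19889<sub>0.13</sub> | 84.14638<sub>0.71</sub> | 0.48033<sub>0.88</sub> | 0.01897<sub>0.98</sub> | 8947<sub>0.98</sub> | 0.730318<sub>0.94</sub> | 0.0627947<sub>0.44</sub> | 0.603828<sub>0.28</sub> | -0.45<sub>0.57</sub> | 1042<sub>0.0</sub> | 248<sub>0.4</sub> |
Bridger | 96.72<sub>0.86</sub> | 4038<sub>0.18</sub> | 236<sub>0.62</sub> | 1.996<sub>0.17</sub> | 1131.326<sub>0.79</sub> | 4005<sub>0.39</sub> | 1.105<sub>0.92</sub> | 1923<sub>0.8</sub> | 1698<sub>0.71</sub> | 0.17196<sub>0.0</sub> | 79.39954<sub>0.32</sub> | 0.36308<sub>0.65</sub> | 0.31231<sub>0.64</sub> | 10651<sub>0.88</sub> | 0.742303<sub>0.99</sub> | 0.0647035<sub>0.49</sub> | 0.681131<sub>0.71</sub> | -0.35<sub>0.93</sub> | 1416<sub>0.78</sub> | 133<sub>0.81</sub> |
BinPacker | 96.66<sub>0.85</sub> | 4267<sub>0.21</sub> | 259<sub>0.59</sub> | 2.283<sub>0.0</sub> | 1189.552<sub>0.87</sub> | 4033<sub>0.4</sub> | 1.136<sub>0.89</sub> | 1876<sub>0.74</sub> | 1701<sub>0.71</sub> | 0.17137<sub>0.0</sub> | 78.64989<sub>0.26</sub> | 0.36772<sub>0.66</sub> | 0.3605<sub>0.59</sub> | 11072<sub>0.86</sub> | 0.725447<sub>0.92</sub> | 0.0650714<sub>0.5</sub> | 0.681339<sub>0.71</sub> | -0.36<sub>0.89</sub> | 1416<sub>0.78</sub> | 135<sub>0.81</sub> |
IDBA_Tran | 86.34<sub>0.0</sub> | 2949<sub>0.03</sub> | 8<sub>1.0</sub> | 0.807<sub>0.86</sub> | 922.096<sub>0.5</sub> | 3238<sub>0.0</sub> | 1.003<sub>1.0</sub> | 1257<sub>0.0</sub> | 1223<sub>0.0</sub> | 0.1961<sub>0.12</sub> | 83.37413<sub>0.65</sub> | 0.41421<sub>0.75</sub> | 0.00289<sub>1.0</sub> | 8652<sub>1.0</sub> | 0.727045<sub>0.93</sub> | 0.0463257<sub>0.0</sub> | 0.552593<sub>0.0</sub> | -0.61<sub>0.0</sub> | 1070<sub>0.06</sub> | 240<sub>0.43</sub> |
Shannon | 96.51<sub>0.84</sub> | 4077<sub>0.19</sub> | 362<sub>0.41</sub> | 0.999<sub>0.75</sub> | 678.947<sub>0.16</sub> | 3736<sub>0.25</sub> | 1.412<sub>0.67</sub> | 1321<sub>0.08</sub> | 1480<sub>0.38</sub> | 0.30264<sub>0.62</sub> | 87.68973<sub>1.0</sub> | 0.05188<sub>0.04</sub> | 0.48455<sub>0.44</sub> | 12793<sub>0.76</sub> | 0.635121<sub>0.54</sub> | 0.0686033<sub>0.6</sub> | 0.718923<sub>0.92</sub> | -0.36<sub>0.89</sub> | 1088<sub>0.1</sub> | 359<sub>0.0</sub> |
SPAdes_sc | 97.51<sub>0.92</sub> | 3504<sub>0.1</sub> | 60<sub>0.91</sub> | 1.799<sub>0.28</sub> | 1279.411<sub>1.0</sub> | 4437<sub>0.61</sub> | 1.001<sub>1.0</sub> | 2090<sub>1.0</sub> | 1847<sub>0.93</sub> | 0.17457<sub>0.02</sub> | 77.02173<sub>0.13</sub> | 0.54236<sub>1.0</sub> | 0.00294<sub>1.0</sub> | 9173<sub>0.97</sub> | 0.744748<sub>1.0</sub> | 0.0618753<sub>0.42</sub> | 0.707372<sub>0.85</sub> | -0.37<sub>0.86</sub> | 1523<sub>1.0</sub> | 81<sub>1.0</sub> |
SPAdes_rna | 97.43<sub>0.92</sub> | 3939<sub>0.17</sub> | 82<sub>0.88</sub> | 1.959<sub>0.19</sub> | 1171.714<sub>0.85</sub> | 4533<sub>0.66</sub> | 1.059<sub>0.95</sub> | 2034<sub>0.93</sub> | 1837<sub>0.91</sub> | 0.19819<sub>0.13</sub> | 75.46275<sub>0.0</sub> | 0.40623<sub>0.73</sub> | 0.16062<sub>0.82</sub> | 10567<sub>0.89</sub> | 0.738039<sub>0.97</sub> | 0.0555932<sub>0.25</sub> | 0.691902<sub>0.77</sub> | -0.35<sub>0.93</sub> | 1499<sub>0.95</sub> | 88<sub>0.97</sub> |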
Candida albicans summary of (0,1)-normalized scores
Arabidopsis thaliana selected metrics
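Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 95.71<sub>0.96</sub> | 13357<sub>0.33</sub> | 1212<sub>0.84</sub> | 1.315<sub>0.79</sub> | 978.213<sub>0.55</sub> | 2302<sub>0.8</sub> | 1.128<sub>0.88</sub> | 1451<sub>0.83</sub> | 7219<sub>0.81</sub> | 0.15512<sub>0.69</sub> | 74.79009<sub>0.57</sub> | NA | NA | 35964<sub>0.61</sub> | 0.713888<sub>0.85</sub> | 0.0503788<sub>0.73</sub> | 0.706259<sub>0.9</sub> | -0.46<sub>0.99</sub> | 1117<sub>1.0</sub> | 219<sub>1.0</sub> |
Oases | 91.93<sub>0.88</sub> | 32647<sub>1.0</sub> | 6804<sub>0.06</sub> | 3.471<sub>0.42</sub> | 1232.688<sub>0.89</sub> | 2753<sub>1.0</sub> | 1.976<sub>0.0</sub> | 1305<sub>0.71</sub> | 8361<sub>1.0</sub> | 0.19207<sub>0.9</sub> | 72.07497<sub>0.38</sub> | NA | NA | 78762<sub>0.0</sub> | 0.421727<sub>0.31</sub> | 0.0315914<sub>0.43</sub> | 0.657365<sub>0.77</sub> | -0.6<sub>0.83</sub> | 1108<sub>0.99</sub> | 248<sub>0.97</sub> |
Trans_ABySS | 97.55<sub>1.0</sub> | 19714<sub>0.55</sub> | 7227<sub>0.0</sub> | 0.612<sub>0.91</sub> | 745.088<sub>0.24</sub> | 1960<sub>0.65</sub> | 1.341<sub>0.66</sub> | 940<sub>0.41</sub> | 8133<sub>0.96</sub> | 0.20992<sub>1.0</sub> | 72.50447<sub>0.41</sub> | NA | NA | 53180<sub>0.37</sub> | 0.586087<sub>0.62</sub> | 0.0417782<sub>0.59</sub> | 0.742054<sub>1.0</sub> | -0.45<sub>1.0</sub> | 1119<sub>1.0</sub> | 224<sub>0.99</sub> |
SOAPdenovo_Trans | 85.31<sub>0.73</sub> | 8709<sub>0.17</sub> | 51<sub>1.0</sub> | 0.099<sub>1.0</sub> | 561.746<sub>0.0</sub> | 1213<sub>0.32</sub> | 1.008<sub>1.0</sub> | 449<sub>0.0</sub> | 5411<sub>0.52</sub> | 0.1479<sub>0.65</sub> | 80.77778<sub>1.0</sub> | NA | NA | 28632<sub>0.72</sub> | 0.766783<sub>0.95</sub> | 0.0417285<sub>0.59</sub> | 0.616916<sub>0.66</sub> | -0.73<sub>0.69</sub> | 1058<sub>0.93</sub> | 248<sub>0.97</sub> |
Bridger | 90.64<sub>0.85</sub> | 12882<sub>0.32</sub> | 1995<sub>0.73</sub> | 2.427<sub>0.6</sub> | 990.349<sub>0.57</sub> | 1909<sub>0.63</sub> | 1.067<sub>0.94</sub> | 1636<sub>0.99</sub> | 7163<sub>0.8</sub> | 0.132<sub>0.57</sub> | 74.66313<sub>0.56</sub> | NA | NA | 33008<sub>0.66</sub> | 0.717523<sub>0.86</sub> | 0.0413928<sub>0.58</sub> | 0.664801<sub>0.79</sub> | -0.54<sub>0.9</sub> | 1103<sub>0.98</sub> | 229<sub>0.99</sub> |
BinPacker | 67.15<sub>0.34</sub> | 3775<sub>0.0</sub> | 1233<sub>0.84</sub> | 5.902<sub>0.0</sub> | 1312.171<sub>1.0</sub> | 489<sub>0.0</sub> | 1.149<sub>0.85</sub> | 1452<sub>0.83</sub> | 2227<sub>0.0</sub> | 0.03029<sub>0.0</sub> | 66.83681<sub>0.0</sub> | NA | NA | 9081<sub>1.0</sub> | 0.255829<sub>0.0</sub> | 0.00483293<sub>0.0</sub> | 0.479321<sub>0.29</sub> | -0.96<sub>0.43</sub> | 262<sub>0.0</sub> | 1162<sub>0.0</sub> |
IDBA_Tran | 89.04<sub>0.82</sub> | 9577<sub>0.2</sub> | 201<sub>0.98</sub> | 0.136<sub>0.99</sub> | 848.356<sub>0.38</sub> | 1175<sub>0.3</sub> | 1.01<sub>1.0</sub> | 805<sub>0.3</sub> | 6087<sub>0.63</sub> | 0.14613<sub>0.64</sub> | 80.70797<sub>0.99</sub> | NA | NA | 27666<sub>0.73</sub> | 0.762677<sub>0.95</sub> | 0.0324962<sub>0.44</sub> | 0.656141<sub>0.77</sub> | -0.67<sub>0.75</sub> | 930<sub>0.78</sub> | 269<sub>0.95</sub> |
Shannon | 51.53<sub>0.0</sub> | 12018<sub>0.29</sub> | 769<sub>0.9</sub> | 0.724<sub>0.89</sub> | 1044.904<sub>0.64</sub> | 2004<sub>0.67</sub> | 1.185<sub>0.82</sub> | 1654<sub>1.0</sub> | 6541<sub>0.7</sub> | 0.14777<sub>0.65</sub> | 76.08368<sub>0.66</sub> | NA | NA | 30178<sub>0.7</sub> | 0.631825<sub>0.7</sub> | 0.0404999<sub>0.57</sub> | 0.373612<sub>0.0</sub> | -1.34<sub>0.0</sub> | 1049<sub>0.92</sub> | 296<sub>0.92</sub> |
SPAdes_sc | 94.29<sub>0.93</sub> | 11171<sub>0.26</sub> | 1189<sub>0.84</sub> | 0.43<sub>0.94</sub> | 944.388<sub>0.51</sub> | 2318<sub>0.81</sub> | 1.007<sub>1.0</sub> | 1030<sub>0.48</sub> | 7800<sub>0.91</sub> | 0.151<sub>0.67</sub> | 74.47182<sub>0.55</sub> | NA | NA | 29569<sub>0.71</sub> | 0.792133<sub>1.0</sub> | 0.0592855<sub>0.87</sub> | 0.713745<sub>0.92</sub> | -0.59<sub>0.84</sub> | 1074<sub>0.95</sub> | 225<sub>0.99</sub> |
SPAdes_rna | 93.67<sub>0.92</sub> | 11160<sub>0.26</sub> | 505<sub>0.94</sub> | 0.341<sub>0.96</sub> | 807.422<sub>0.33</sub> | 2237<sub>0.77</sub> | 1.027<sub>0.98</sub> | 1239<sub>0.66</sub> | 7501<sub>0.86</sub> | 0.15415<sub>0.69</sub> | 75.67217<sub>0.63</sub> | NA | NA | 30621<sub>0.69</sub> | 0.786267<sub>0.99</sub> | 0.0674779<sub>1.0</sub> | 0.720746<sub>0.94</sub> | -0.57<sub>0.87</sub> | 1012<sub>0.88</sub> | 264<sub>0.95</sub> |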
Arabidopsis thaliana summary of (0,1)-normalized scores
Mus musculus selected metrics
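Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 92.14<sub>0.89</sub> | 21804<sub>0.4</sub> | 564<sub>0.99</sub> | 0.84<sub>0.85</sub> | 1216.863<sub>0.52</sub> | 6032<sub>0.96</sub> | 1.547<sub>0.39</sub> | 420<sub>0.0</sub> | 8210<sub>0.97</sub> | 0.21834<sub>0.88</sub> | 52.20767<sub>0.82</sub> | 0.22136<sub>0.53</sub> | 0.51973<sub>0.43</sub> | 78726<sub>0.6</sub> | 0.424159<sub>0.78</sub> | 0.00924793<sub>0.37</sub> | 0.663224<sub>0.88</sub> | -2.23<sub>0.97</sub> | 3895<sub>0.97</sub> | 1959<sub>0.98</sub> |
Oases | 89.29<sub>0.82</sub> | 51832<sub>1.0</sub> | 52668<sub>0.0</sub> | 0.625<sub>0.9</sub> | 406.084<sub>0.0</sub> | 183<sub>0.02</sub> | 1.9<sub>0.0</sub> | 2044<sub>0.48</sub> | 6293<sub>0.71</sub> | 0.02211<sub>0.09</sub> | 45.35795<sub>0.4</sub> | 0.01921<sub>0.0</sub> | 0.90903<sub>0.0</sub> | 188947<sub>0.0</sub> | 0.185852<sub>0.26</sub> | 0.00191568<sub>0.07</sub> | 0.466246<sub>0.42</sub> | -3.34<sub>0.59</sub> | 3066<sub>0.75</sub> | 2476<sub>0.85</sub> |
Trans_ABySS | 93.96<sub>0.93</sub> | 29719<sub>0.56</sub> | 612<sub>0.99</sub> | 0.316<sub>0.97</sub> | 752.811<sub>0.22</sub> | 6281<sub>1.0</sub> | 1.75<sub>0.17</sub> | 530<sub>0.03</sub> | 8401<sub>1.0</sub> | 0.24728<sub>1.0</sub> | 53.7886<sub>0.92</sub> | 0.11552<sub>0.25</sub> | 0.62836<sub>0.31</sub> | 105153<sub>0.46</sub> | 0.399779<sub>0.73</sub> | 0.0245507<sub>1.0</sub> | 0.671318<sub>0.9</sub> | -2.15<sub>1.0</sub> | 3992<sub>1.0</sub> | 1926<sub>0.99</sub> |
SOAPdenovo_Trans | 90.98<sub>0.86</sub> | 12603<sub>0.21</sub> | 61<sub>1.0</sub> | 0.163<sub>1.0</sub> | 519.962<sub>0.07</sub> | 2244<sub>0.35</sub> | 1.031<sub>0.97</sub> | 2796<sub>0.7</sub> | 7375<sub>0.86</sub> | 0.09539<sub>0.38</sub> | 51.45431<sub>0.78</sub> | 0.40138<sub>1.0</sub> | 0.11922<sub>0.88</sub> | 53611<sub>0.74</sub> | 0.509524<sub>0.97</sub> | 0.0232672<sub>0.95</sub> | 0.55072<sub>0.62</sub> | -2.64<sub>0.83</sub> | 3616<sub>0.9</sub> | 1987<sub>0.97</sub> |
Bridger | 91.66<sub>0.88</sub> | 18266<sub>0.33</sub> | 2643<sub>0.95</sub> | 0.913<sub>0.83</sub> | 1073.395<sub>0.43</sub> | 2236<sub>0.35</sub> | 1.348<sub>0.62</sub> | 3100<sub>0.79</sub> | 7639<sub>0.9</sub> | 0.09933<sub>0.4</sub> | 49.56206<sub>0.66</sub> | 0.18076<sub>0.42</sub> | 0.41364<sub>0.55</sub> | 66954<sub>0.67</sub> | 0.450365<sub>0.84</sub> | 0.00662728<sub>0.27</sub> | 0.601723<sub>0.74</sub> | -2.38<sub>0.92</sub> | 3914<sub>0.98</sub> | 1926<sub>0.99</sub> |
BinPacker | 54.31<sub>0.0</sub> | 2037<sub>0.0</sub> | 628<sub>0.99</sub> | 4.609<sub>0.0</sub> | 1972.809<sub>1.0</sub> | 554<sub>0.08</sub> | 1.346<sub>0.62</sub> | 2715<sub>0.68</sub> | 1014<sub>0.0</sub> | 0.0154<sub>0.06</sub> | 38.85579<sub>0.0</sub> | 0.13036<sub>0.29</sub> | 0.60065<sub>0.34</sub> | 6514<sub>1.0</sub> | 0.0681366<sub>0.0</sub> | 8.78969e-05<sub>0.0</sub> | 0.284626<sub>0.0</sub> | -4.89<sub>0.05</sub> | 310<sub>0.0</sub> | 5871<sub>0.0</sub> |
IDBA_Tran | 70.64<sub>0.38</sub> | 12294<sub>0.21</sub> | 41<sub>1.0</sub> | 0.187<sub>0.99</sub> | 811.99<sub>0.26</sub> | 967<sub>0.14</sub> | 1.005<sub>1.0</sub> | 1672<sub>0.37</sub> | 4208<sub>0.43</sub> | 0.09159<sub>0.37</sub> | 54.28147<sub>0.95</sub> | 0.29426<sub>0.72</sub> | 0.01711<sub>1.0</sub> | 45526<sub>0.79</sub> | 0.495647<sub>0.94</sub> | 0.0107357<sub>0.44</sub> | 0.531343<sub>0.57</sub> | -5.03<sub>0.0</sub> | 2679<sub>0.64</sub> | 2188<sub>0.92</sub> |
Shannon | 86.6<sub>0.76</sub> | 16915<sub>0.3</sub> | 1927<sub>0.96</sub> | 0.802<sub>0.86</sub> | 877.706<sub>0.3</sub> | 71<sub>0.0</sub> | 1.23<sub>0.75</sub> | 2067<sub>0.49</sub> | 5248<sub>0.57</sub> | 0.00107<sub>0.0</sub> | 55.04317<sub>1.0</sub> | 0.18394<sub>0.43</sub> | 0.42689<sub>0.54</sub> | 52948<sub>0.75</sub> | 0.38078<sub>0.69</sub> | 0.00746879<sub>0.3</sub> | 0.559955<sub>0.64</sub> | -2.94<sub>0.73</sub> | 2829<sub>0.68</sub> | 2534<sub>0.84</sub> |
SPAdes_sc | 93.83<sub>0.93</sub> | 13099<sub>0.22</sub> | 570<sub>0.99</sub> | 0.706<sub>0.88</sub> | 1089.264<sub>0.44</sub> | 2633<sub>0.41</sub> | 1.003<sub>1.0</sub> | 3814<sub>1.0</sub> | 7736<sub>0.91</sub> | 0.08457<sub>0.34</sub> | 44.85287<sub>0.37</sub> | 0.37953<sub>0.94</sub> | 0.0134<sub>1.0</sub> | 51575<sub>0.75</sub> | 0.522144<sub>1.0</sub> | 0.0051975<sub>0.21</sub> | 0.686892<sub>0.94</sub> | -3.01<sub>0.7</sub> | 3823<sub>0.95</sub> | 1877<sub>1.0</sub> |
SPAdes_rna | 96.86<sub>1.0</sub> | 19367<sub>0.35</sub> | 1080<sub>0.98</sub> | 0.802<sub>0.86</sub> | 910.466<sub>0.32</sub> | 3269<sub>0.51</sub> | 1.135<sub>0.85</sub> | 3418<sub>0.88</sub> | 7859<sub>0.93</sub> | 0.10624<sub>0.43</sub> | 43.57814<sub>0.29</sub> | 0.17718<sub>0.41</sub> | 0.33204<sub>0.64</sub> | 73461<sub>0.63</sub> | 0.498195<sub>0.95</sub> | 0.0143569<sub>0.58</sub> | 0.714442<sub>1.0</sub> | -2.45<sub>0.9</sub> | 3773<sub>0.94</sub> | 1880<sub>1.0</sub> |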
Mus musculus summary of (0,1)-normalized scores
Homo sapiens selected metrics
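Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 91.9<sub>0.81</sub> | 64061<sub>0.22</sub> | 3378<sub>0.99</sub> | 1.382<sub>0.74</sub> | 795.225<sub>0.27</sub> | 6788<sub>0.99</sub> | 2.396<sub>0.0</sub> | 326<sub>0.0</sub> | 8930<sub>0.97</sub> | 0.22744<sub>0.87</sub> | 50.81538<sub>0.64</sub> | 0.1318<sub>0.3</sub> | 0.59035<sub>0.38</sub> | 286479<sub>0.72</sub> | 0.42599<sub>0.59</sub> | 0.0161655<sub>0.08</sub> | 0.511762<sub>0.87</sub> | -6.51<sub>0.98</sub> | 4004<sub>0.96</sub> | 1804<sub>0.99</sub> |
Oases | 88.04<sub>0.69</sub> | 207474<sub>1.0</sub> | 216127<sub>0.0</sub> | 1.253<sub>0.77</sub> | 343.479<sub>0.06</sub> | 868<sub>0.1</sub> | 2.355<sub>0.03</sub> | 666<sub>0.17</sub> | 8024<sub>0.83</sub> | 0.0881<sub>0.33</sub> | 42.08886<sub>0.0</sub> | 0.01793<sub>0.0</sub> | 0.94001<sub>0.0</sub> | 843235<sub>0.0</sub> | 0.181064<sub>0.08</sub> | 0.0189894<sub>0.09</sub> | 0.243335<sub>0.0</sub> | -11.82<sub>0.45</sub> | 3588<sub>0.79</sub> | 1922<sub>0.93</sub> |
Trans_ABySS | 98.34<sub>1.0</sub> | 59779<sub>0.2</sub> | 2743<sub>0.99</sub> | 0.573<sub>0.93</sub> | 246.846<sub>0.01</sub> | 6824<sub>1.0</sub> | 1.743<sub>0.47</sub> | 441<sub>0.06</sub> | 9110<sub>1.0</sub> | 0.26219<sub>1.0</sub> | 51.92248<sub>0.72</sub> | 0.10588<sub>0.23</sub> | 0.63297<sub>0.33</sub> | 437845<sub>0.53</sub> | 0.509015<sub>0.77</sub> | 0.203218<sub>0.99</sub> | 0.550435<sub>1.0</sub> | -6.26<sub>1.0</sub> | 4106<sub>1.0</sub> | 1770<sub>1.0</sub> |
SOAPdenovo_Trans | 89.93<sub>0.75</sub> | 27529<sub>0.03</sub> | 279<sub>1.0</sub> | 0.269<sub>1.0</sub> | 218.0<sub>0.0</sub> | 2264<sub>0.31</sub> | 1.187<sub>0.87</sub> | 711<sub>0.19</sub> | 6806<sub>0.64</sub> | 0.09183<sub>0.34</sub> | 48.01834<sub>0.44</sub> | 0.26752<sub>0.66</sub> | 0.32675<sub>0.67</sub> | 241236<sub>0.78</sub> | 0.566304<sub>0.89</sub> | 0.205052<sub>1.0</sub> | 0.373036<sub>0.42</sub> | -9.03<sub>0.72</sub> | 2625<sub>0.39</sub> | 2164<sub>0.83</sub> |
Bridger | 86.83<sub>0.66</sub> | 43201<sub>0.11</sub> | 7329<sub>0.97</sub> | 1.439<sub>0.73</sub> | 654.413<sub>0.21</sub> | 2105<sub>0.28</sub> | 1.708<sub>0.5</sub> | 1370<sub>0.51</sub> | 8440<sub>0.89</sub> | 0.085<sub>0.31</sub> | 45.10419<sub>0.22</sub> | 0.14093<sub>0.32</sub> | 0.42073<sub>0.57</sub> | 206635<sub>0.83</sub> | 0.480306<sub>0.71</sub> | 0.010466<sub>0.05</sub> | 0.398725<sub>0.51</sub> | -7.72<sub>0.85</sub> | 3909<sub>0.92</sub> | 1812<sub>0.98</sub> |
BinPacker | 72.6<sub>0.24</sub> | 22611<sub>0.0</sub> | 5603<sub>0.98</sub> | 4.629<sub>0.0</sub> | 2335.729<sub>1.0</sub> | 2824<sub>0.39</sub> | 2.389<sub>0.01</sub> | 2381<sub>1.0</sub> | 4456<sub>0.26</sub> | 0.07303<sub>0.27</sub> | 42.56574<sub>0.04</sub> | 0.06956<sub>0.14</sub> | 0.83754<sub>0.11</sub> | 72918<sub>1.0</sub> | 0.1451<sub>0.0</sub> | 0.000374367<sub>0.0</sub> | 0.366908<sub>0.4</sub> | -10.03<sub>0.62</sub> | 2009<sub>0.13</sub> | 4078<sub>0.0</sub> |
IDBA_Tran | 64.61<sub>0.0</sub> | 23516<sub>0.0</sub> | 302<sub>1.0</sub> | 0.666<sub>0.91</sub> | 487.106<sub>0.13</sub> | 709<sub>0.07</sub> | 1.012<sub>1.0</sub> | 708<sub>0.19</sub> | 2783<sub>0.0</sub> | 0.08326<sub>0.31</sub> | 52.45984<sub>0.76</sub> | 0.25039<sub>0.61</sub> | 0.02256<sub>1.0</sub> | 138699<sub>0.91</sub> | 0.552302<sub>0.86</sub> | 0.0168006<sub>0.08</sub> | 0.286957<sub>0.14</sub> | -16.3<sub>0.0</sub> | 1682<sub>0.0</sub> | 2615<sub>0.63</sub> |
Shannon | 84.27<sub>0.58</sub> | 31328<sub>0.05</sub> | 2837<sub>0.99</sub> | 1.262<sub>0.77</sub> | 711.831<sub>0.23</sub> | 242<sub>0.0</sub> | 1.53<sub>0.63</sub> | 1324<sub>0.49</sub> | 6758<sub>0.63</sub> | 0.00373<sub>0.0</sub> | 55.69782<sub>1.0</sub> | 0.06634<sub>0.13</sub> | 0.503<sub>0.48</sub> | 117068<sub>0.94</sub> | 0.346262<sub>0.42</sub> | 0.0229751<sub>0.11</sub> | 0.420737<sub>0.58</sub> | -8.96<sub>0.73</sub> | 3385<sub>0.7</sub> | 2133<sub>0.84</sub> |
SPAdes_sc | 92.04<sub>0.81</sub> | 31039<sub>0.05</sub> | 2022<sub>0.99</sub> | 0.803<sub>0.88</sub> | 410.224<sub>0.09</sub> | 1755<sub>0.23</sub> | 1.015<sub>1.0</sub> | 1186<sub>0.42</sub> | 5676<sub>0.46</sub> | 0.08042<sub>0.3</sub> | 46.12757<sub>0.3</sub> | 0.39665<sub>1.0</sub> | 0.03119<sub>0.99</sub> | 177477<sub>0.86</sub> | 0.606564<sub>0.97</sub> | 0.0140352<sub>0.07</sub> | 0.389103<sub>0.47</sub> | -12.12<sub>0.42</sub> | 2625<sub>0.39</sub> | 2268<sub>0.78</sub> |
SPAdes_rna | 95.95<sub>0.93</sub> | 49860<sub>0.15</sub> | 5126<sub>0.98</sub> | 1.249<sub>0.78</sub> | 412.238<sub>0.09</sub> | 3253<sub>0.46</sub> | 1.192<sub>0.87</sub> | 782<sub>0.22</sub> | 7155<sub>0.69</sub> | 0.11119<sub>0.42</sub> | 46.25425<sub>0.31</sub> | 0.23201<sub>0.57</sub> | 0.21449<sub>0.79</sub> | 294083<sub>0.71</sub> | 0.619367<sub>1.0</sub> | 0.0128943<sub>0.06</sub> | 0.427956<sub>0.6</sub> | -7.16<sub>0.91</sub> | 3089<sub>0.58</sub> | 1949<sub>0.92</sub> |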
Homo sapiens summary of (0,1)-normalized scores
Homo sapiens + EBOV 3h selected metrics
Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL (×10^9) | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 90.730.94 | 271540.26 | 15800.94 | 1.7280.77 | 1083.5550.25 | 38260.97 | 1.6970.76 | 27530.6 | 81010.98 | 0.084340.67 | 54.506380.74 | 0.188690.3 | 0.59660.37 | 1094960.67 | 0.4574580.72 | 0.01493620.32 | 0.5185150.93 | -1.240.98 | 39791.0 | 18291.0 | Oases | 86.920.88 | 981961.0 | 248450.0 | 2.3830.66 | 1308.2160.35 | 39481.0 | 3.910.0 | 24610.45 | 82881.0 | 0.123281.0 | 48.701840.32 | 0.021090.0 | 0.937240.0 | 3189360.0 | 0.2132950.29 | 0.01496240.32 | 0.4000490.56 | -1.730.7 | 38030.95 | 18920.98 | Trans_ABySS | 93.91.0 | 307670.3 | 79560.68 | 0.7040.95 | 517.9160.02 | 31530.78 | 1.5540.81 | 19470.18 | 80720.97 | 0.116450.94 | 55.633720.82 | 0.11490.17 | 0.580880.38 | 1274480.61 | 0.4584250.73 | 0.04696281.0 | 0.5415711.0 | -1.21.0 | 39821.0 | 18380.99 | SOAPdenovo_Trans | 89.80.93 | 131100.12 | 1481.0 | 0.4291.0 | 464.3680.0 | 25530.62 | 1.0420.99 | 23720.4 | 72190.86 | 0.071330.57 | 52.668490.6 | 0.358720.61 | 0.110180.89 | 627530.82 | 0.5583080.91 | 0.03872290.82 | 0.4528670.72 | -1.350.91 | 35280.88 | 19340.97 | Bridger | 89.950.93 | 204340.19 | 18530.93 | 1.430.83 | 925.6190.19 | 27810.68 | 1.4340.85 | 26760.56 | 75190.9 | 0.070580.56 | 51.845730.54 | 0.199280.32 | 0.476790.5 | 831870.76 | 0.4919610.79 | 0.01343740.29 | 0.5075370.89 | -1.250.97 | 38790.97 | 18620.99 | BinPacker | 36.60.0 | 16460.0 | 2790.99 | 6.1840.0 | 2899.3921.0 | 2550.0 | 2.0140.65 | 35301.0 | 4960.0 | 0.003610.0 | 44.282060.0 | 0.047230.05 | 0.831110.11 | 67521.0 | 0.05371610.0 | 6.61916e-050.0 | 0.2163980.0 | -2.960.0 | 2050.0 | 59760.0 | IDBA_Tran | 78.410.73 | 135080.12 | 1781.0 | 0.7990.94 | 718.4020.11 | 5690.09 | 1.0051.0 | 19240.17 | 31210.34 | 0.06920.55 | 54.468860.73 | 0.426770.73 | 0.009061.0 | 544940.85 | 0.5526360.9 | 0.01359660.29 | 0.4355370.67 | -1.840.64 | 21620.52 | 22160.9 | Shannon | 90.230.94 | 282480.28 | 21610.92 | 1.3130.85 | 921.0460.19 | 23030.55 | 1.8080.72 | 22360.33 | 70400.84 | 0.110890.9 | 58.165311.0 | 0.050620.05 | 0.693150.26 | 1041260.69 | 
0.3885680.6 | 0.02799020.6 | 0.5166530.92 | -1.260.97 | 35750.89 | 20960.93 | SPAdes_sc | 91.410.96 | 127750.12 | 12620.95 | 0.9440.91 | 661.3220.08 | 26530.65 | 1.0031.0 | 27350.59 | 75370.9 | 0.077290.62 | 51.507640.52 | 0.574751.0 | 0.0111.0 | 609000.83 | 0.5714390.93 | 0.01304790.28 | 0.4932760.85 | -1.320.93 | 38900.98 | 18131.0 | SPAdes_rna | 93.831.0 | 171950.16 | 9430.97 | 1.0010.9 | 462.0870.0 | 33350.83 | 1.1170.96 | 16030.0 | 74100.89 | 0.084450.68 | 50.270420.43 | 0.352520.6 | 0.161380.84 | 942740.72 | 0.6107761.0 | 0.0223660.48 | 0.5428471.0 | -1.21.0 | 36550.91 | 18510.99 |
Homo sapiens + EBOV 3h summary of (0,1)-normalized scores
Homo sapiens + EBOV 7h selected metrics; the (0,1)-normalized score follows each value in parentheses
Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 92.58 (0.91) | 31319 (0.16) | 1294 (0.94) | 1.746 (0.6) | 1040.875 (0.29) | 4696 (0.9) | 1.757 (0.78) | 2839 (0.56) | 8458 (0.93) | 0.08296 (0.37) | 52.15256 (0.77) | 0.19023 (0.35) | 0.5822 (0.39) | 131284 (0.82) | 0.461589 (0.64) | 0.0133497 (0.27) | 0.548502 (0.87) | -1.62 (1.0) | 4119 (1.0) | 1759 (1.0) |
Oases | 85.78 (0.53) | 115596 (1.0) | 18893 (0.0) | 2.515 (0.36) | 1364.377 (0.44) | 5115 (1.0) | 4.347 (0.0) | 3003 (0.64) | 8861 (1.0) | 0.12149 (0.9) | 43.81502 (0.0) | 0.01572 (0.0) | 0.94707 (0.0) | 413418 (0.0) | 0.204378 (0.0) | 0.01377 (0.28) | 0.446502 (0.1) | -2.35 (0.38) | 4055 (0.97) | 1817 (0.91) |
Trans_ABySS | 94.22 (1.0) | 37410 (0.22) | 4769 (0.75) | 0.753 (0.9) | 534.402 (0.04) | 4004 (0.74) | 1.663 (0.8) | 2039 (0.19) | 8405 (0.92) | 0.12851 (1.0) | 51.96626 (0.75) | 0.07246 (0.12) | 0.61566 (0.35) | 159092 (0.74) | 0.456358 (0.63) | 0.0474134 (1.0) | 0.565015 (1.0) | -1.64 (0.98) | 4121 (1.0) | 1771 (0.98) |
SOAPdenovo_Trans | 90.52 (0.79) | 15251 (0.0) | 125 (1.0) | 0.443 (1.0) | 455.959 (0.0) | 3225 (0.56) | 1.06 (0.98) | 2385 (0.35) | 7569 (0.77) | 0.07571 (0.27) | 49.37891 (0.51) | 0.3314 (0.64) | 0.13479 (0.87) | 77171 (0.97) | 0.552421 (0.86) | 0.0382348 (0.8) | 0.489643 (0.43) | -1.85 (0.81) | 3667 (0.77) | 1855 (0.85) |
Bridger | 89.14 (0.72) | 23487 (0.08) | 2509 (0.87) | 1.487 (0.68) | 866.608 (0.2) | 2965 (0.5) | 1.451 (0.87) | 2745 (0.52) | 7540 (0.77) | 0.07521 (0.26) | 47.63959 (0.35) | 0.17706 (0.33) | 0.46934 (0.51) | 101788 (0.9) | 0.489854 (0.71) | 0.0115023 (0.23) | 0.509876 (0.58) | -1.8 (0.85) | 3979 (0.93) | 1785 (0.96) |
BinPacker | 84.63 (0.46) | 22928 (0.08) | 2649 (0.87) | 3.674 (0.0) | 2505.426 (1.0) | 2851 (0.48) | 1.884 (0.74) | 3764 (1.0) | 6844 (0.64) | 0.05603 (0.0) | 47.6602 (0.35) | 0.11378 (0.2) | 0.75862 (0.2) | 82239 (0.96) | 0.303781 (0.25) | 0.000981932 (0.0) | 0.498755 (0.5) | -1.95 (0.72) | 3590 (0.73) | 2392 (0.0) |
IDBA_Tran | 76.39 (0.0) | 16181 (0.01) | 203 (1.0) | 0.834 (0.88) | 697.748 (0.12) | 780 (0.0) | 1.007 (1.0) | 1967 (0.15) | 3197 (0.0) | 0.07533 (0.27) | 50.41416 (0.61) | 0.38971 (0.76) | 0.01116 (1.0) | 67564 (1.0) | 0.555899 (0.87) | 0.0122321 (0.24) | 0.433572 (0.0) | -2.8 (0.0) | 2134 (0.0) | 2217 (0.28) |
Shannon | 90.99 (0.82) | 35785 (0.2) | 2605 (0.87) | 1.426 (0.7) | 918.854 (0.23) | 2964 (0.5) | 1.954 (0.72) | 2328 (0.32) | 7524 (0.76) | 0.1225 (0.92) | 54.6887 (1.0) | 0.03624 (0.04) | 0.70965 (0.25) | 134108 (0.81) | 0.373245 (0.42) | 0.0303068 (0.63) | 0.551822 (0.9) | -1.74 (0.9) | 3777 (0.83) | 1955 (0.69) |
SPAdes_sc | 91.94 (0.87) | 15598 (0.0) | 2042 (0.9) | 0.72 (0.91) | 489.2 (0.02) | 2795 (0.46) | 1.013 (1.0) | 2766 (0.53) | 7004 (0.67) | 0.07618 (0.28) | 48.01043 (0.39) | 0.50802 (1.0) | 0.02529 (0.98) | 76646 (0.97) | 0.570306 (0.91) | 0.0139344 (0.28) | 0.481227 (0.36) | -2.19 (0.52) | 3644 (0.76) | 1835 (0.88) |
SPAdes_rna | 93.4 (0.95) | 21867 (0.07) | 1219 (0.94) | 1.067 (0.81) | 470.161 (0.01) | 3850 (0.71) | 1.161 (0.95) | 1647 (0.0) | 7226 (0.71) | 0.09082 (0.48) | 45.99342 (0.2) | 0.29863 (0.57) | 0.20012 (0.8) | 120597 (0.85) | 0.6074 (1.0) | 0.0208343 (0.43) | 0.555596 (0.93) | -1.71 (0.92) | 3542 (0.71) | 1780 (0.97) |
Homo sapiens + EBOV 7h summary of (0,1)-normalized scores
Homo sapiens + EBOV 23h selected metrics; the (0,1)-normalized score follows each value in parentheses
Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 94.39 (0.98) | 27785 (0.23) | 1066 (0.91) | 1.684 (0.63) | 1143.492 (0.26) | 4065 (0.93) | 1.806 (0.73) | 2958 (0.55) | 7935 (1.0) | 0.07324 (0.51) | 53.73373 (0.46) | 0.14025 (0.3) | 0.60669 (0.36) | 115578 (0.72) | 0.44378 (0.6) | 0.0124217 (0.27) | 0.767084 (0.98) | -1.43 (1.0) | 3885 (0.99) | 1944 (1.0) |
Oases | 70.05 (0.6) | 83530 (1.0) | 11194 (0.0) | 2.311 (0.44) | 1500.776 (0.39) | 4330 (1.0) | 3.97 (0.0) | 3115 (0.62) | 2868 (0.0) | 0.10789 (0.89) | 47.1938 (0.02) | 0.03191 (0.0) | 0.93811 (0.0) | 301780 (0.0) | 0.218292 (0.04) | 0.0121553 (0.27) | 0.20915 (0.17) | -3.87 (0.31) | 3832 (0.97) | 2014 (0.97) |
Trans_ABySS | 95.49 (1.0) | 32866 (0.3) | 3253 (0.72) | 0.769 (0.91) | 578.881 (0.04) | 3575 (0.79) | 1.644 (0.78) | 2178 (0.21) | 7872 (0.99) | 0.11747 (1.0) | 53.73122 (0.46) | 0.05168 (0.05) | 0.61498 (0.35) | 131712 (0.66) | 0.448004 (0.61) | 0.044702 (1.0) | 0.779996 (1.0) | -1.45 (0.99) | 3907 (1.0) | 1954 (1.0) |
SOAPdenovo_Trans | 93.1 (0.96) | 13418 (0.04) | 94 (1.0) | 0.468 (1.0) | 482.257 (0.0) | 2749 (0.56) | 1.049 (0.98) | 2489 (0.34) | 7122 (0.84) | 0.06869 (0.46) | 51.41692 (0.3) | 0.39598 (1.0) | 0.12649 (0.88) | 64547 (0.92) | 0.547701 (0.86) | 0.0354917 (0.79) | 0.694346 (0.87) | -1.6 (0.95) | 3435 (0.79) | 2042 (0.96) |
Bridger | 92.43 (0.95) | 20560 (0.13) | 1844 (0.84) | 1.43 (0.71) | 929.857 (0.17) | 2736 (0.56) | 1.459 (0.85) | 2841 (0.5) | 7222 (0.86) | 0.06884 (0.47) | 48.88316 (0.13) | 0.19635 (0.45) | 0.48201 (0.49) | 87621 (0.83) | 0.47975 (0.69) | 0.0109893 (0.24) | 0.739796 (0.94) | -1.54 (0.97) | 3784 (0.94) | 1978 (0.99) |
BinPacker | 81.57 (0.78) | 10862 (0.0) | 1436 (0.88) | 3.755 (0.0) | 3072.898 (1.0) | 1523 (0.22) | 2.09 (0.63) | 3965 (1.0) | 3262 (0.08) | 0.02631 (0.0) | 46.96748 (0.0) | 0.0675 (0.1) | 0.81643 (0.13) | 43913 (1.0) | 0.203836 (0.0) | 0.00035399 (0.0) | 0.706858 (0.89) | -2.02 (0.83) | 1695 (0.0) | 4392 (0.0) |
IDBA_Tran | 48.37 (0.27) | 14196 (0.05) | 134 (1.0) | 0.866 (0.88) | 725.58 (0.1) | 714 (0.0) | 1.006 (1.0) | 2048 (0.15) | 3126 (0.05) | 0.06954 (0.47) | 52.33453 (0.37) | 0.24431 (0.58) | 0.01096 (1.0) | 57271 (0.95) | 0.551236 (0.87) | 0.0113647 (0.25) | 0.161691 (0.1) | -4.31 (0.18) | 2042 (0.16) | 2419 (0.81) |
Shannon | 31.29 (0.0) | 15007 (0.06) | 381 (0.97) | 0.583 (0.97) | 703.748 (0.09) | 1631 (0.25) | 1.435 (0.85) | 1714 (0.0) | 5349 (0.49) | 0.07719 (0.56) | 61.57702 (1.0) | 0.04847 (0.05) | 0.46502 (0.51) | 60817 (0.93) | 0.379485 (0.44) | 0.0282574 (0.63) | 0.0955414 (0.0) | -4.95 (0.0) | 2852 (0.52) | 2764 (0.67) |
SPAdes_sc | 91.15 (0.93) | 13245 (0.03) | 1206 (0.9) | 0.951 (0.85) | 649.419 (0.07) | 2654 (0.54) | 1.004 (1.0) | 3071 (0.6) | 7101 (0.84) | 0.06837 (0.46) | 49.53084 (0.18) | 0.38949 (0.98) | 0.01877 (0.99) | 63872 (0.92) | 0.564203 (0.91) | 0.0105064 (0.23) | 0.247863 (0.22) | -3.6 (0.38) | 3710 (0.91) | 1952 (1.0) |
SPAdes_rna | 93.12 (0.96) | 18629 (0.11) | 908 (0.93) | 1.15 (0.79) | 476.176 (0.0) | 3535 (0.78) | 1.147 (0.95) | 2203 (0.22) | 6960 (0.81) | 0.08193 (0.61) | 47.53883 (0.04) | 0.21737 (0.51) | 0.19225 (0.8) | 102095 (0.77) | 0.601376 (1.0) | 0.0196731 (0.44) | 0.589338 (0.72) | -1.67 (0.93) | 3405 (0.77) | 1992 (0.98) |
Homo sapiens + EBOV 23h summary of (0,1)-normalized scores
Homo sapiens simulated selected metrics; the (0,1)-normalized score follows each value in parentheses
Assembler | Overall mapping rate | Transcripts >=1000nt | Misassemblies | Mismatches per transcript | Average alignment length | 95%-assembled isoforms | Duplication ratio | Ex90N50 | # full-length transcripts | Reference coverage | Mean ORF percentage | Optimal score | Percentage bases uncovered | Number of ambiguous bases | Nucleotide F1 | Contig F1 | KC score | RSEM EVAL | Complete BUSCOs | Missing BUSCOs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trinity | 96.39 (0.95) | 8451 (0.24) | 139 (0.97) | 1.231 (0.44) | 2261.332 (1.0) | 2898 (0.96) | 2.154 (0.7) | 3234 (0.96) | 1492 (0.85) | 0.24396 (0.34) | 43.37841 (0.63) | 0.11168 (0.22) | 0.50236 (0.4) | 30990 (0.77) | 0.566043 (0.61) | 0.0633606 (0.31) | 0.883827 (0.94) | -2.79 (0.96) | 588 (0.94) | 26 (0.99) |
Oases | 73.26 (0.62) | 28143 (1.0) | 4094 (0.0) | 2.105 (0.0) | 2090.55 (0.88) | 2995 (1.0) | 4.854 (0.0) | 3300 (0.99) | 1672 (1.0) | 0.37548 (0.74) | 37.42736 (0.1) | 0.0107 (0.0) | 0.82981 (0.0) | 110723 (0.0) | 0.223824 (0.0) | 0.0488742 (0.03) | 0.738406 (0.72) | -4.62 (0.75) | 613 (1.0) | 22 (1.0) |
Trans_ABySS | 99.56 (1.0) | 6703 (0.17) | 117 (0.97) | 0.459 (0.82) | 939.312 (0.06) | 2877 (0.95) | 1.572 (0.85) | 2902 (0.78) | 1323 (0.71) | 0.45944 (1.0) | 45.49152 (0.81) | 0.22431 (0.47) | 0.31668 (0.63) | 24282 (0.84) | 0.679067 (0.81) | 0.0987237 (1.0) | 0.923633 (1.0) | -2.38 (1.0) | 537 (0.8) | 21 (1.0) |
SOAPdenovo_Trans | 91.68 (0.89) | 4263 (0.07) | 49 (0.99) | 0.308 (0.89) | 1009.08 (0.11) | 1679 (0.46) | 1.274 (0.93) | 2836 (0.75) | 1006 (0.45) | 0.18247 (0.16) | 36.27748 (0.0) | 0.20541 (0.43) | 0.16758 (0.81) | 15259 (0.93) | 0.733277 (0.9) | 0.0752428 (0.54) | 0.595837 (0.5) | -5.27 (0.68) | 289 (0.16) | 109 (0.74) |
Bridger | 94.02 (0.92) | 5167 (0.11) | 533 (0.87) | 1.652 (0.23) | 1581.299 (0.51) | 1570 (0.41) | 1.507 (0.87) | 3252 (0.97) | 1210 (0.62) | 0.1877 (0.17) | 42.21987 (0.52) | 0.14218 (0.29) | 0.22125 (0.74) | 18025 (0.9) | 0.710854 (0.86) | 0.04805 (0.01) | 0.81708 (0.84) | -3.66 (0.86) | 525 (0.77) | 28 (0.98) |
BinPacker | 93.03 (0.91) | 7424 (0.2) | 785 (0.81) | 1.849 (0.13) | 1755.155 (0.64) | 1660 (0.45) | 1.949 (0.76) | 2909 (0.79) | 1246 (0.65) | 0.20281 (0.22) | 43.57058 (0.64) | 0.12943 (0.26) | 0.43188 (0.49) | 25851 (0.82) | 0.601458 (0.67) | 0.0475004 (0.0) | 0.821483 (0.85) | -3.65 (0.86) | 527 (0.78) | 29 (0.98) |
IDBA_Tran | 85.34 (0.79) | 2740 (0.02) | 8 (1.0) | 0.096 (1.0) | 859.966 (0.0) | 577 (0.0) | 1.013 (1.0) | 1397 (0.0) | 450 (0.0) | 0.17764 (0.14) | 47.6202 (1.0) | 0.34794 (0.74) | 0.01292 (1.0) | 10489 (0.97) | 0.783647 (0.99) | 0.0504635 (0.06) | 0.59977 (0.51) | -7.53 (0.44) | 226 (0.0) | 142 (0.65) |
Shannon | 30.77 (0.0) | 2341 (0.0) | 66 (0.99) | 0.394 (0.85) | 1061.013 (0.14) | 1056 (0.2) | 1.437 (0.89) | 2368 (0.51) | 666 (0.18) | 0.13061 (0.0) | 47.36729 (0.98) | 0.05617 (0.1) | 0.17449 (0.8) | 7801 (1.0) | 0.453332 (0.41) | 0.0695831 (0.43) | 0.261483 (0.0) | -11.51 (0.0) | 242 (0.04) | 364 (0.0) |
SPAdes_sc | 97.02 (0.96) | 2623 (0.01) | 50 (0.99) | 0.215 (0.94) | 979.981 (0.09) | 1625 (0.43) | 1.012 (1.0) | 3315 (1.0) | 1007 (0.46) | 0.16027 (0.09) | 39.61807 (0.29) | 0.46692 (1.0) | 0.0122 (1.0) | 10674 (0.97) | 0.789225 (1.0) | 0.0528581 (0.1) | 0.773079 (0.77) | -4.78 (0.74) | 407 (0.47) | 55 (0.9) |
SPAdes_rna | 96.72 (0.96) | 6053 (0.14) | 351 (0.92) | 1.82 (0.14) | 1649.147 (0.56) | 2220 (0.68) | 1.535 (0.86) | 2549 (0.6) | 1141 (0.57) | 0.2459 (0.35) | 40.38284 (0.36) | 0.19116 (0.4) | 0.24767 (0.71) | 20302 (0.88) | 0.695881 (0.83) | 0.0541619 (0.13) | 0.785485 (0.79) | -3.91 (0.83) | 462 (0.61) | 26 (0.99) |
Homo sapiens simulated summary of (0,1)-normalized scores
S11: Runtime and memory consumption
Shown are the runtime and maximum memory peak for each assembler and data set, run with 48 threads. For Trinity, we observed high memory peaks in the first few minutes of assembly. We removed these initial peaks from the plots to allow a better comparison of memory usage over time.
S12: (0,1)-normalized scores per data set and metric
For each metric and data set, we calculated normalized scores in the range from 0 to 1 (see manuscript methods) to compare the performance of the different assembly tools with respect to that metric. For example, consider the E. coli data set and the metric Overall mapping rate, which results in the following vector:
(77.01, 49.16, 95.67, 56.62, 87.35, 71.09, 34.31, 76.69, 88.04, 88.9)
The calculated (0,1)-normalized scores are:
(0.7, 0.24, 1.0, 0.36, 0.86, 0.6, 0.0, 0.69, 0.88, 0.89)
With this transformation, we aim to achieve a somewhat fair comparison of the different assembly tools and metrics.
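The example vector above is consistent with plain min-max scaling rounded to two decimals. The following is a minimal sketch under that assumption, not the authors' code; the function name `normalize_01`, the `higher_is_better` flag (for metrics such as Misassemblies, where a lower value is better) and the handling of constant vectors are illustrative choices. The manuscript methods remain the authoritative description.

```python
def normalize_01(values, higher_is_better=True):
    """Min-max scale a metric vector to [0, 1], rounded to two decimals.

    Assumption: for metrics where a lower value is better, the scaled
    score is inverted so that 1.0 always marks the best assembler.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant vector carries no ranking information.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    if not higher_is_better:
        scaled = [1.0 - s for s in scaled]
    return [round(s, 2) for s in scaled]


# E. coli, Overall mapping rate (vector taken from the text above)
mapping_rate = [77.01, 49.16, 95.67, 56.62, 87.35,
                71.09, 34.31, 76.69, 88.04, 88.9]
scores = normalize_01(mapping_rate)
print(scores)
```

For this input the sketch returns the normalized vector listed above.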
For each data set and assembly tool, the scores are summed to obtain a final score for comparison. Here we show a heat map for each data set representing the normalized scores calculated for each metric and assembly tool. We used these heat maps to identify potentially highly correlated metrics.
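The summation step can be sketched as follows. This is an illustration only: the dictionary holds just the first three normalized scores (Overall mapping rate, Transcripts >=1000nt, Misassemblies) of three assemblers from the Homo sapiens simulated table, and the helper names `final_scores` and `rank_tools` are hypothetical. A ranking built from three metrics does not reproduce the final scores of the study.

```python
# Normalized scores per assembler for three metrics
# (Homo sapiens simulated data set, subset for illustration only).
normalized = {
    "Trinity": [0.95, 0.24, 0.97],
    "Trans_ABySS": [1.0, 0.17, 0.97],
    "IDBA_Tran": [0.79, 0.02, 1.0],
}


def final_scores(scores_by_tool):
    """Sum the normalized scores of each assembler over all metrics."""
    return {tool: round(sum(scores), 2)
            for tool, scores in scores_by_tool.items()}


def rank_tools(scores_by_tool):
    """Return assembler names ordered from highest to lowest summed score."""
    totals = final_scores(scores_by_tool)
    return sorted(totals, key=totals.get, reverse=True)


totals = final_scores(normalized)
ranking = rank_tools(normalized)
print(totals)
print(ranking)
```

Summing unweighted scores treats every metric as equally important, which is one reason highly correlated metrics are worth identifying before the totals are interpreted.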