Pipeline execution failed for 0 samples.
## [1] "No samples failed."
For more details, have a look at the log files in /cephfs/abteilung4/Projects_NGS/aquamis_contamination/resources/test_data/aquamis/logs.
The table is searchable and sortable. At the end of each row, clickable links lead to the created FASTA file, the best NCBI reference contig, and the QUAST and Icarus reports.
Note that the best reference is selected from the set of all complete bacterial chromosome assemblies. Plasmids are therefore excluded, so the sample's genome length may exceed that of the reference. Furthermore, for bacteria with more than one chromosome, only the best matching chromosome is reported. Thus gene count, reference length etc. are properties of the best matching chromosome.
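The reference selection described above can be illustrated with a minimal sketch. It is not the mash implementation: mash compares MinHash sketches of canonical k-mers, whereas this example computes an exact Jaccard index on plain k-mer sets. Interpreting "fraction of shared kmers" as a Jaccard index is an assumption, and the function names and sequences are hypothetical.

```python
from typing import Dict, Set


def kmers(seq: str, k: int = 21) -> Set[str]:
    """Return the set of all k-mers in seq (no reverse complements)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(a: str, b: str, k: int = 21) -> float:
    """Jaccard index of the two k-mer sets (0.0 if both sets are empty)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def best_reference(assembly: str, references: Dict[str, str], k: int = 21) -> str:
    """Name of the reference sharing the largest k-mer fraction with the assembly."""
    return max(
        references,
        key=lambda name: shared_kmer_fraction(assembly, references[name], k),
    )


# Toy sequences for illustration only; k is reduced so the example stays short.
toy_refs = {"ref_a": "ACGTACGTGA", "ref_b": "TTTTTTTTTT"}
print(best_reference("ACGTACGTGG", toy_refs, k=4))
```

The default `k=21` mirrors the `mash_kmersize` value listed in the run parameters. Real genomes would be compared through sketches rather than full k-mer sets.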
Three most abundant species according to read-based taxonomic classification with kraken2 and abundance estimation with bracken, based on the kraken2 minikraken database.
Three most abundant species according to contig-based taxonomic classification with kraken2, based on the kraken2 minikraken database.
Three most abundant genera according to read-based taxonomic classification with kraken2 and abundance estimation with bracken, based on the kraken2 minikraken database.
Three most abundant genera according to contig-based taxonomic classification with kraken2, based on the kraken2 minikraken database.
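As an illustration of how such a top-three list can be derived from per-taxon abundance estimates, the sketch below ranks taxa by fraction. The function name `top_taxa` and the example values are hypothetical. They are not taken from AQUAMIS or from this run.

```python
from typing import Dict, List, Tuple


def top_taxa(fractions: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n taxa with the highest fraction, highest first."""
    # Sort by fraction (descending); ties are broken alphabetically by name.
    ranked = sorted(fractions.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Made-up fractions for illustration only.
example = {
    "Salmonella enterica": 0.91,
    "Escherichia coli": 0.04,
    "Citrobacter freundii": 0.02,
    "Klebsiella pneumoniae": 0.01,
}
print(top_taxa(example))
```

The first entry of the result corresponds to the majority taxon, whose fraction is reported in the "Read Fraction Majority Taxon" and "Contig Fraction Majority Taxon" columns.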
Inter- and intra-species contamination is assessed using ConFindr. Sound detection of intra-species contamination relies on a genus-specific cgMLST approach. A sample is marked as contaminated if more than 1 contaminating SNV per 10,000 base pairs examined was found, or if there is cross-contamination between genera. Missing values are reported as not determined (ND).
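The decision rule stated above can be sketched as follows. This is a minimal illustration, not ConFindr's code: the function name, its parameters, and the choice to return `None` for "not determined" are assumptions made for this example.

```python
from typing import Optional


def is_contaminated(
    contaminating_snvs: Optional[int],
    bases_examined: Optional[int],
    cross_genus: bool,
    max_snvs_per_10kb: float = 1.0,
) -> Optional[bool]:
    """Apply the rule from the text; None stands for 'not determined' (ND)."""
    # Cross-contamination between genera is sufficient on its own.
    if cross_genus:
        return True
    # Assumption: without SNV counts or examined bases, no verdict is given.
    if contaminating_snvs is None or not bases_examined:
        return None
    # "More than 1 SNV per 10,000 bp", written without division
    # so that the boundary case is not affected by float rounding.
    return contaminating_snvs * 10_000 > max_snvs_per_10kb * bases_examined


print(is_contaminated(3, 10_000, cross_genus=False))
print(is_contaminated(1, 10_000, cross_genus=False))
```

Note that exactly 1 SNV per 10,000 base pairs does not exceed the threshold, because the rule requires more than 1.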
Figure 1: Average assembly coverage depth of all mapped reads (per sample).
Figure 2: Fraction of reference genes fully or partially found in assembly. Note that if a genome consists of more than one chromosome, only the fraction belonging to the largest chromosome is displayed.
Figure 3: Fraction of reads that map back to the assembly. A lower fraction indicates problems with the assembly and/or contamination in a sample.
Figure 4: Insert size distribution per sample (violin plot). The mean insert sizes are indicated by red diamonds. The insert size is the same as the fragment size without barcodes. Thus, if the insert size is smaller than twice the read length, the reads overlap.
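The overlap condition in the Figure 4 caption can be made concrete with a short calculation. The sketch assumes both mates of a pair have the same length and ignores trimming. The function name and the example values are hypothetical.

```python
def read_overlap(insert_size: float, read_length: int) -> float:
    """Number of insert bases covered by both mates of a read pair (0 if none)."""
    # The mates overlap only if together they are longer than the insert.
    # The overlap can never exceed the insert itself (short-insert case).
    return max(0.0, min(2 * read_length - insert_size, insert_size))


# Illustrative values: 150 bp reads with three different insert sizes.
for insert in (400, 250, 100):
    print(insert, read_overlap(insert, read_length=150))
```

With 150 bp reads, an insert of 400 bp gives no overlap, an insert of 250 bp gives 50 overlapping bases, and an insert of 100 bp is covered completely by both mates.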
Figure 6: Coverage depth distribution of each sample (violin plot), in log scale. Each separate bubble indicates the presence of a DNA molecule with a distinct coverage depth distribution. Hence, more than one bubble may indicate the presence of (high copy number) plasmids.
In the following, the coverage depth distribution is shown for each sample (in log scale):
[Per-sample coverage depth plots: SRR1206159, SRR1609871, SRR2985019, SRR498433]
## Created by AQUAMIS:"1.3.1"
## version :"v1.3.0-6-g7893989"
## workdir :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/resources/test_data/aquamis"
## samples :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/resources/test_data/samples.tsv"
## params :List of 12
## ..$ threads :10
## ..$ docker :""
## ..$ run_name :"aquamis_test_data"
## ..$ remove_temp:FALSE
## ..$ fastp :List of 1
## .. ..$ length_required:15
## ..$ confindr :List of 1
## .. ..$ database:"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/reference_db/confindr"
## ..$ kraken2 :List of 4
## .. ..$ db_kraken :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/reference_db/kraken"
## .. ..$ read_length :150
## .. ..$ taxonomic_qc_level:"G"
## .. ..$ taxonkit_db :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/reference_db/taxonkit"
## ..$ shovill :List of 7
## .. ..$ assembler :"spades"
## .. ..$ depth :100
## .. ..$ tmpdir :"/tmp/shovill"
## .. ..$ ram :16
## .. ..$ output_options:""
## .. ..$ extraopts :""
## .. ..$ modules :"--noreadcorr"
## ..$ mash :List of 3
## .. ..$ mash_refdb :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/reference_db/mash/mashDB.msh"
## .. ..$ mash_kmersize :21
## .. ..$ mash_sketchsize:1000
## ..$ mlst :List of 1
## .. ..$ scheme:""
## ..$ qc :List of 1
## .. ..$ thresholds:"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/resources/AQUAMIS_thresholds.json"
## ..$ json_schema:List of 2
## .. ..$ validation:"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/resources/AQUAMIS_schema_v20210226.json"
## .. ..$ filter :"/cephfs/abteilung4/Projects_NGS/aquamis_contamination/repo/AQUAMIS/resources/AQUAMIS_schema_filter_v20210226.json"
Software | Version |
---|---|
fastp | 0.20.1 |
ConFindr | 0.7.4 |
Kraken | 2.1.1 |
bracken | 2.5 |
taxonkit | 0.7.2 |
shovill | 1.1.0 |
bwa | 0.7.17-r1188 |
flash | 1.2.11 |
java | 11.0.8-internal |
kmc | 3.1.0 |
lighter | 1.1.2 |
megahit | 1.2.9 |
megahit_toolkit | 1.2.9 |
pigz | 2.5 |
pilon | 1.23 |
samclip | 0.4.0 |
samtools | 1.11 |
seqtk | 1.3-r106 |
skesa | 2.4.0 |
spades | 3.14.1 |
trimmomatic | 0.39 |
velvetg | 1.2.10 |
velveth | 1.2.10 |
mash | 2.2.2 |
QUAST | 5.0.2 |
mlst | 2.19.0 |
Column | Details |
---|---|
Sample Name | Name of sample |
QC Vote | Recommended quality assessment based on all criteria (PASS or FAIL) |
QC Fail | Number of fields falling below the fail threshold |
QC Warn | Number of fields falling below the warning threshold |
QC N.D. | Number of fields where no threshold could be applied, either by missing value or missing genus/species-specific threshold |
Reference | Best NCBI complete genome according to mash |
Reference Accession | Accession number of best NCBI complete genome according to mash |
Species | Species of best NCBI complete genome according to mash |
# Reads | Number of reads after trimming |
Megabases | Number of bases from all trimmed reads |
Q30 Base Fraction | Fraction of bases that have Q30 or higher |
Coverage Depth | Average depth over all positions and contigs |
# Contigs | Number of contigs larger than 0 base pairs |
# Contigs >1000 bp | Number of contigs larger than 1000 base pairs |
N50 | N50 value in base pairs: contigs of at least this length contain half of all assembled bases (indicator of contig size and assembly quality) |
Read Fraction Majority Taxon | Fraction of reads assigned to the most abundant taxon at the evaluated rank, e.g. species or genus |
Contig Fraction Majority Taxon | Fraction of contigs assigned to the most abundant taxon at the evaluated rank, e.g. species or genus |
Contam. Status | Contamination Status |
Contam. # SNVs | Number of contaminating Single Nucleotide Variations |
Single-Copy Orthologs | Fraction of universal single-copy orthologs that were found. Values below 1 indicate incompleteness |
Duplicated Orthologs | Fraction of universal single-copy orthologs that were found in duplicate. Non-zero values indicate contamination |
MLST Loci w/ Multiple Alleles | MLST loci with multiple alleles, an indicator of intra-species contamination |
MLST Loci Missing | Missing MLST loci; compare to the ST in the scheme to confirm a true miss |
MLST Schema | MLST schema - determined automatically (default) or chosen by user |
MLST ST | MLST sequence type of associated MLST schema |
# Full Genes | Number of reference genes found |
# Partial Genes | Number of reference genes partially found |
Fraction Genes Recovered | Fraction of genes found compared to all reference genes, includes partial matches |
Reference Coverage | Genome coverage compared to reference |
Duplication Ratio | Total number of aligned bases in the assembly divided by the total number of aligned bases in the reference genome |
GC | Total number of G and C nucleotides in the assembly, divided by the total length of the assembly |
Total Length | Sum of all contigs |
Reference Length | Length of the reference genome |
Reference Similarity | Similarity to reference according to mash. The value describes the fraction of shared kmers |
Fraction Mapped Reads | Fraction of reads that map to contigs. Values below 1 indicate assembly issues |
Insert Size | Calculated insert size, i.e. fragment length without barcodes |
Trimming Details | fastp report with details related to read trimming and read QC |
FASTA | Link to assembly file |
NCBI | Link to reference NCBI entry |
Ikarus | Link to Icarus report on structural comparison to reference |
QUAST | Link to assembly quality report |
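
Two of the assembly metrics in the table above, N50 and GC, follow directly from the contigs and can be sketched in a few lines. This is a minimal illustration based on the usual definitions, not the code used by QUAST or AQUAMIS. The function names and toy inputs are hypothetical.

```python
from typing import List


def n50(contig_lengths: List[int]) -> int:
    """Largest length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def gc_fraction(contigs: List[str]) -> float:
    """Number of G and C nucleotides divided by the total assembly length."""
    total = sum(len(c) for c in contigs)
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return gc / total if total else 0.0


# Toy values for illustration only.
print(n50([100, 80, 50, 30, 20]))
print(gc_fraction(["GGCC", "AATT"]))
```

In the toy example the total length is 280, and the two longest contigs (100 and 80) together pass the halfway mark of 140, so the N50 is 80.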