Data Analysis | Vironomics Core

At the Vironomics Core, we not only perform the assays that generate data, but we also offer analysis services to help transform the raw data into ready-to-use data.

qPCR

Analysis of the raw qPCR data is not included in the assay cost; however, we do offer a data analysis/consultation service that can assist you with analysis or with creating ready-to-publish figures. Please contact Dr. Dirk Dittmer and/or the Vironomics Core Manager to request this service and an estimated turn-around time.

Next Generation Sequencing

For analysis of next generation sequencing data, including 454, PGM, S5, Illumina, and PacBio data, we use both commercial and open-source software. Below is a summary of the software we use most frequently:

CLC Genomics Workbench
3 licenses
This is commercial software for analyzing and visualizing next generation sequencing data; it includes a number of features within the fields of genomics, transcriptomics, and epigenomics. CLC supports all major next generation sequencing platforms, read mapping, and de novo assembly of hybrid data.

Geneious
2 licenses
De novo assembly or reference mapping of Illumina, PacBio or Ion Torrent reads (any length, paired ends, barcodes), using industry-leading algorithms including TopHat and Velvet. Build phylogenetic trees using peer-reviewed algorithms including RAxML and PAUP* and adjust display settings for publication-ready graphics. Batch BLAST against NCBI and directly search GenBank. Centralize and collaborate on data with seamlessly integrated shared repositories. Import and export most industry standard file formats.

SPAdes
2 licenses
SPAdes (St. Petersburg genome assembler) is an assembly toolkit containing various assembly pipelines. SPAdes works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs that will be used as long reads. SPAdes supports paired-end reads, mate-pairs and unpaired reads, and it can take several paired-end and mate-pair libraries as input simultaneously. Note that SPAdes was initially designed for small genomes.

Bowtie 2
2 licenses
This is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters and aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
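
To illustrate the idea behind an FM index, here is a minimal sketch in Python of exact-match lookup by backward search. It is a toy illustration only, not Bowtie 2's implementation: the function names build_fm_index and find_exact are hypothetical, the suffix array is built naively, and the occurrence table is stored in full rather than compressed or sampled as a memory-efficient index would do.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index for text (hypothetical helper, not Bowtie 2 code)."""
    text += "$"  # sentinel character; sorts before A, C, G and T
    # Naive suffix array: fine for a short example, far too slow for a genome.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first[c] = number of characters in text that sort strictly before c.
    counts = Counter(text)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return suffix_array, first, occ


def find_exact(pattern, index):
    """Return sorted start positions of exact matches of pattern."""
    suffix_array, first, occ = index
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for c in reversed(pattern):  # backward search: last character first
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


index = build_fm_index("ACGTACGTTACG")
matches = find_exact("ACG", index)
```

Each character of the query narrows a range of sorted suffixes, so the lookup cost depends on the read length rather than on the size of the reference. A real aligner adds mismatch and gap handling on top of this exact-match step.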

Celera Assembler
This is a de novo whole-genome shotgun (WGS) DNA sequence assembler. It reconstructs long sequences of genomic DNA from fragmentary data produced by whole genome shotgun sequencing. Celera Assembler has enabled many advances in genomics, including the first whole genome shotgun sequence of a multi-cellular organism (Myers 2000) and the first diploid sequence of an individual human (Levy 2007). Celera Assembler was developed at Celera Genomics starting in 1999. It was released to SourceForge in 2004 as the wgs-assembler under the GNU General Public License. The pipeline revised for 454 data was named CABOG (Miller 2008).

Newbler
1 license
This is the assembly/mapping program developed by 454 Life Sciences for 454 data. It is the core of the gsAssembler/GS De Novo Assembler (GUI based), the gsMapper/GS Reference Mapper Software (GUI based), runAssembly (command-line based) and runMapping (command-line based). It uses k-mer based hashing and the ‘overlap-layout-consensus’ approach, and it accepts both shotgun and paired-end reads.
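
As a rough illustration of how k-mer hashing can nominate read pairs for the overlap step of an overlap-layout-consensus assembler, here is a minimal sketch in Python. It is a generic example under our own assumptions, not Newbler's actual algorithm: the function names index_kmers and candidate_overlaps and the example reads are hypothetical.

```python
from collections import defaultdict


def index_kmers(reads, k):
    """Map each k-mer to the (read_id, offset) pairs where it occurs."""
    table = defaultdict(list)
    for read_id, seq in enumerate(reads):
        for offset in range(len(seq) - k + 1):
            table[seq[offset:offset + k]].append((read_id, offset))
    return table


def candidate_overlaps(reads, k):
    """Return pairs of read ids (a, b), with a < b, that share a k-mer."""
    pairs = set()
    for hits in index_kmers(reads, k).values():
        ids = sorted({read_id for read_id, _ in hits})
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                pairs.add((a, b))
    return pairs


# Hypothetical reads: the first two share the 5-mer "TACGA".
example_reads = ["ACGTACGA", "TACGATTG", "GGGGCCCC"]
shared = candidate_overlaps(example_reads, k=5)
```

Only the pairs returned here would need a full alignment to confirm a true overlap, which avoids comparing every read against every other read.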

For more in-depth information on the computers and software available at the Dittmer lab, please see this page.