Proton Sequencing

Ion Torrent Sequencing on S5

The S5, an upgrade of the Ion Torrent Proton, was installed at the DNA Sequencing Facility in early 2017. It uses the same semiconductor technology with a simple sequencing chemistry (http://www.iontorrent.com). Sequencing at 200b and 400b read lengths is done on three different chips (520, 530 and 540), with throughputs ranging from 5M to 100M reads and about 1Gb to 12Gb. The S5 is essentially an amalgamation of the PGM and the Proton; the 318 chip (PGM) and PI chip (Proton) are equivalent to the 520 and 540 chips respectively.

The main applications of the different S5 chips are similar to those on the PGM, with the difference that a lot more multiplexing can be done on the S5 due to its 12-15 fold higher throughput. Human exome and whole RNA-seq can be done on the 540 chip with multiplexing. All targeted gene panels can be run on the S5, pooling several more samples than is possible on the 318 chip in the PGM. Similarly, several large bacterial and viral genomes can be sequenced together on the S5.

The major advantages of the technology are low cost and rapid turnaround. Only one sample or pool is run at a time, so there is no need to wait to fill a plate, a flow cell or a slide, unlike other next generation sequencing platforms. Sequencing runs take only 2.5 to 4 hours, compared to runs of a few days on Illumina.

Major Strengths

Targeted sequencing of amplicons

Uses AmpliSeq technology to design ultrahigh multiplex PCRs to sequence a custom panel of genes
Simple workflow, low cost – multiplex PCR based library preparation from FFPE samples
Low DNA input – 10 ng DNA (minimum) unlike other platforms
Quick turnaround – sequencing takes less than a day, only 1 sample/pool is run at a time
Variant calling & annotation via point-and-click Torrent Suite & Ion Reporter software

Ion AmpliSeq Transcriptome Human Gene Expression

-includes more than 95% of human RefSeq genes with 10 ng RNA from FFPE and other samples

A rapid, inexpensive gene expression analysis kit on FFPE and other samples
10 ng RNA input from fresh or frozen tissue
Differential gene expression for more than 95% of human RefSeq genes
Reproducible detection of high and low abundance transcripts from 20,802 genes
Detection range exceeds microarray sensitivity
Up to 8 multiplexed samples on a 540 chip (8 – 10M reads per sample)

Ion RNA-Seq for whole transcriptome or small RNA using same kit

-for identification of alternative splicing, transcript isoforms and fusion transcripts along with novel and low abundance transcripts with 100 ng – 1 µg total RNA

Needs 100 ng – 1 µg total RNA
Differential gene expression, discovery of novel & low abundance transcripts
Identification of alternative splicing, transcript isoforms and fusion transcripts
Up to 4 samples on a 540 chip (~20M reads per sample) depending on the depth of coverage required
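
The per-sample read counts quoted for multiplexed runs follow from dividing the chip's read yield by the number of pooled samples. The sketch below is only an illustration of that arithmetic: it assumes the 70 - 90M read range listed for the 540 chip in the run results table and assumes reads are split evenly across barcodes, which real pools only approximate. The function name is hypothetical.

```python
def reads_per_sample(chip_reads_millions, n_samples):
    """Split a chip's read yield (low, high), in millions, evenly across samples."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    low, high = chip_reads_millions
    return (low / n_samples, high / n_samples)


# Assumed 540 chip yield, in millions of reads (from the run results table).
CHIP_540_READS = (70, 90)

print(reads_per_sample(CHIP_540_READS, 8))  # AmpliSeq Transcriptome pool: (8.75, 11.25)
print(reads_per_sample(CHIP_540_READS, 4))  # whole transcriptome pool: (17.5, 22.5)
```

These ranges are in line with the 8 to 10M and ~20M reads per sample quoted above.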

Ion AmpliSeq Exome

A simple and flexible workflow that targets ~33 Mb of coding exons
Covers greater than 97% of coding regions as described by CCDS annotation
Low DNA input of 50 ng per sample, up to 3 samples on a 540 chip with 75-100X mean coverage
High percentage of on-target bases (>90%) with excellent uniformity
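
As a rough check on the coverage figure above, mean coverage can be estimated as on-target bases per sample divided by target size. This is a minimal sketch, assuming the ~12Gb yield listed for the 540 chip, an even split across 3 samples, and a 90% on-target fraction; the function name is hypothetical and the result is an idealized upper estimate.

```python
def mean_coverage(chip_yield_gb, n_samples, target_mb, on_target_fraction):
    """Idealized mean coverage: on-target bases per sample / target size."""
    bases_per_sample = chip_yield_gb * 1e9 / n_samples
    on_target_bases = bases_per_sample * on_target_fraction
    return on_target_bases / (target_mb * 1e6)


# Assumed inputs: 12 Gb chip yield, 3 pooled exomes, 33 Mb target, 90% on target.
coverage = mean_coverage(12, 3, 33, 0.90)
print(round(coverage))  # 109
```

The idealized value of about 109X sits slightly above the 75-100X quoted above; a gap of this kind is expected, since real runs typically lose some bases to read filtering, trimming and unaligned reads.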

Ion TargetSeq Exome

Hybridization based capture approach to make exome library
Captures exons identified in several databases (e.g. Gencode, RefSeq, UCSC), functional RNA genes, predicted miRNA binding sites and COSMIC-annotated tumor variants
Targets a total of 46.2 Mb; up to 2 exomes can be multiplexed on a 540 chip

Comprehensive Cancer Panel

Targets over 16,000 amplicons in 400 genes involved in tumor formation
95% of targets at 300X coverage using the 318 chip on PGM
6-8 samples pooled with 400-500X coverage on 540 chip

AmpliSeq Inherited Disease Panel

Uses over 10,000 primer pairs to amplify the coding exons of 328 genes
Over 700 unique inherited diseases included
Major disease groups: neuromuscular, cardiovascular, developmental and metabolic
Multiple samples pooled on 540 chip depending on the coverage depth required.

AmpliSeq Custom Panel (www.ampliseq.com)

Amplifies 24 to 3,072 amplicons in a single tube, targeting from a few genes to hundreds of genes
Ion AmpliSeq Designer v5.6 is an online tool to create and order custom panels
The cumulative target sequence is up to 1 Mb.

Data Analysis

Torrent Suite, the Torrent Server analysis pipeline, is the primary software used to process raw data acquired by Proton and produce read files containing high quality bases. Base calls are provided in both SFF and FASTQ file formats for easy downstream analysis with third party tools. The Torrent Browser provides many metrics, graphs and reporting features derived from the pipeline results.

Filtering and trimming: Low quality or uncertain base calls are removed in two ways: entire reads are filtered out, and low quality 3’ ends of reads are trimmed. The SFF files contain per-base quality scores along with all other read information. The FASTQ files also provide per-base quality scores.
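
The idea of trimming low quality 3’ ends and then filtering out whole reads can be illustrated on FASTQ-style records. The sketch below is not the Torrent Suite algorithm; it uses a simple trailing-base rule, and the thresholds (`min_q`, `min_len`), the Phred+33 quality encoding and the function names are all assumptions chosen for illustration.

```python
def trim_3prime(seq, qual, min_q=15, offset=33):
    """Remove trailing bases whose Phred quality is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_and_trim(records, min_q=15, min_len=25):
    """Trim each (name, seq, qual) record, then drop reads shorter than min_len."""
    kept = []
    for name, seq, qual in records:
        trimmed_seq, trimmed_qual = trim_3prime(seq, qual, min_q)
        if len(trimmed_seq) >= min_len:
            kept.append((name, trimmed_seq, trimmed_qual))
    return kept


# Toy records: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
records = [
    ("r1", "ACGTACGT", "IIIII###"),  # low quality tail is trimmed
    ("r2", "ACGT", "####"),          # entirely low quality, filtered out
]
print(filter_and_trim(records, min_len=4))  # [('r1', 'ACGTA', 'IIIII')]
```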

TMAP: The Torrent Mapping Alignment Program implements a two-stage mapping approach: reads that do not align during the 1st stage are passed to the 2nd stage, which uses a new set of algorithms and/or parameters. Overall, the alignment results serve as an indicator of run and library quality.

Variant Caller: After analysis, the variant caller generates a report of the SNVs and insertions/deletions found in the data set (Torrent Suite User Documentation, Life Technologies).

Ion Reporter: Provides annotated variant calls with various filtering options by using different databases.

Commercial software (NextGene, DNASTAR, PARTEK) and open source software are also available for analysis of Proton data.

We provide a data analysis service for a fee using Torrent Suite, Ion Reporter, NextGene, and a number of open source tools.

Expected Run Results from S5 (200 and 400b sequencing) and Charges (effective Jan 1, 2018)
	520 chip (equivalent to 318 chip on PGM)	530 chip	540 chip (equivalent to PI chip on Proton)
Read Length	200b & 400b	200b & 400b	200b only
No. of Reads	5 - 7M	16 - 22M	70 - 90M
Throughput	1Gb - 2Gb	3.5Gb - 7Gb	~12Gb
§ Sequencing (pooled libraries submitted)	$1,090	$1,230	$1,400
Barcoded library, gDNA or long amplicons, with Pippin Prep size selection: 1 to 3 libraries - $100 each; 4 or more libraries - $75 each
Cancer Hotspot Panel v2 library, minimum of 4 samples - $170 each; the per-sample charge is higher for fewer samples
Comprehensive cancer panel or custom panel library - Contact us
AmpliSeq transcriptome/total RNA-seq/AmpliSeq exome/TargetSeq exome library - Contact us
Volume discount - Contact us

§ The sequencing charges include emPCR, sequencing, and preliminary data analysis to provide FASTQ, BAM and BAI files, as well as sequence alignment and variant calls.

Library prep & QC not included.

Note – Downstream data analysis (provided after consultation with the user): $100/hr
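
For budgeting, the charges above can be combined into a rough estimate. The sketch below assumes a single chip run, uses the sequencing charges and the $100/hr analysis rate listed here, and applies the barcoded library tiers (gDNA or long amplicons) only; other library types are quoted on request and are not modelled. The function names are hypothetical, and the result is an estimate rather than a quote.

```python
# Sequencing charge per chip run, in USD (effective Jan 1, 2018).
SEQUENCING_CHARGE = {"520": 1090, "530": 1230, "540": 1400}
ANALYSIS_RATE_PER_HOUR = 100


def barcoded_library_charge(n_libraries):
    """1 to 3 libraries cost $100 each; 4 or more cost $75 each."""
    if n_libraries <= 0:
        return 0
    per_library = 100 if n_libraries <= 3 else 75
    return per_library * n_libraries


def estimate_total(chip, n_libraries=0, analysis_hours=0):
    """Sequencing plus optional barcoded library prep and downstream analysis."""
    return (
        SEQUENCING_CHARGE[chip]
        + barcoded_library_charge(n_libraries)
        + ANALYSIS_RATE_PER_HOUR * analysis_hours
    )


print(estimate_total("540", n_libraries=6, analysis_hours=2))  # 2050
print(estimate_total("520", n_libraries=2))                    # 1290
```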

Please contact Tapan Ganguly at 215-573-7238, or gangulyt@pennmedicine.upenn.edu for consultation.