National Cancer Institute Cancer Genomics Research Laboratory

The federal Cancer Genomics Research Laboratory automated its sequencing service to deliver data and analyses faster. This government laboratory streamlined its processes and simplified internal collaboration on data analysis by using tools from Seven Bridges and sharing data on the cloud-based Seven Bridges Platform.

Overview

As part of the National Institutes of Health, the National Cancer Institute (NCI) is a U.S. federal agency charged with improving medical research and knowledge. Since the 1930s, the institute has advanced clinical research. It turned to Seven Bridges to improve operations at its Cancer Genomics Research Laboratory (CGR) by streamlining data transfer and analysis processes, saving both time and money.

The CGR performs sequencing and bioinformatics analysis for its own internal projects and for external partner labs, handling a range of experiment design, sequencing, and analysis work. Quicker collaboration on experiment pipelines and more effective data management allowed staff scientists, including those working remotely, to handle more diverse projects more efficiently.

Sequencer Integration

To handle raw reads, the CGR uses a Seven Bridges plug-in that streams data automatically from the sequencer to the Seven Bridges Platform. As a result, uploads are completely unattended: CGR staff start a sequencing run and walk away. Additionally, uploaded data is ready for analysis immediately, eliminating the routine and time-consuming step of manual data transfer.

David Roberson, senior scientist at CGR/NCI, said, “The fact that the plugin will auto-run means that the data will be uploaded before we even get to work the next day.”

The center manages a heavy workload for its own projects and those of client labs and partners nationwide. Writing scripts that dictated which software versions were used, and in what order, ate up valuable time until Seven Bridges helped track all of that work automatically. Opening up more options for designing experiments while cutting preparation time led to significant advances in productivity.

Advanced Pipeline Development

RNA sequencing (RNA-Seq) analysis of data generated by Ion Torrent machines also presented a challenge. Different sequencing technologies produce distinct error profiles, and those of these workhorse systems demanded specific solutions, so Seven Bridges built the first RNA-Seq pipeline optimized for Ion Torrent data. With continuing work and research, we iterated to create best-practice RNA-Seq pipelines. CGR/NCI now uses these Seven Bridges pipelines exclusively.

Working collaboratively with the NCI, we developed a pipeline for coverage analysis and visualization of next-generation amplicon sequencing data. With this tool, researchers can identify regions where coverage is dropping out and that will need improvement at the bench level. A utility produces coverage analysis images and lists of the sites, amplicons, targets, and samples that failed coverage thresholds set by the user.
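To make the idea of a user-set coverage threshold concrete, here is a minimal Python sketch. It is an illustration only, not the Seven Bridges pipeline: the function name, the amplicon names, the depth values, and the choice of mean depth as the pass/fail metric are all assumptions made for this example.

```python
from statistics import mean


def failed_amplicons(depths_by_amplicon, min_mean_depth):
    """Return the sorted names of amplicons whose mean read depth
    falls below a user-chosen threshold.

    Hypothetical helper for illustration; the real pipeline may use
    a different metric and input format.
    """
    failed = []
    for name, depths in depths_by_amplicon.items():
        # An amplicon with no recorded positions is treated as zero coverage.
        avg = mean(depths) if depths else 0.0
        if avg < min_mean_depth:
            failed.append(name)
    return sorted(failed)


# Made-up per-base depths for three hypothetical amplicons.
example = {
    "AMP_001": [120, 135, 110],
    "AMP_002": [12, 8, 15],
    "AMP_003": [],
}

print(failed_amplicons(example, min_mean_depth=50))
```

With these made-up numbers and a threshold of 50, the sketch reports AMP_002 and AMP_003 as failing, which is the kind of list a researcher would take back to the bench.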

We are also teaming with NCI on the analysis of fusion transcripts, another best-practice pipeline. It will call fusion transcripts not only from paired-end reads but also from challenging single-end setups, producing analysis that yields faster lab validation of the most probable candidates.

These advances also benefited many of the labs that send data to the CGR core facility. Faster, clearer and more accurate results for external projects also produced a better work experience for both CGR staff and their user partners. Some of those remote labs also adopted the pipelines and tools created by Seven Bridges.