1. DiGiulio, Sarah


The goal is a world where you can capture a patient's whole genome sequence, analyze the data, and determine a treatment for that patient based on his or her genome by the end of that afternoon, Eric Dishman, General Manager of Health & Life Sciences at Intel, said in a phone interview to discuss the joint effort being spearheaded by Intel and the Oregon Health and Science University Knight Cancer Institute. "We call it the 'All in One Day by 2020' initiative."

ERIC DISHMAN
JOE GRAY, PHD

That goal is the one Brian Druker, MD, Director of the OHSU Knight Cancer Institute, mentioned while speaking as part of a panel of experts from academia, industry, and advocacy during a session on the future of data sharing in medicine at the 2015 Partnering for Cures meeting (sponsored by FasterCures, a center of the Milken Institute) (OT 1/10/16 issue). Ears perked up when Druker said an effort was underway (in collaboration with Intel) to make all cancer data shareable by 2020.


The cancer center and tech company together are developing the Collaborative Cancer Cloud, a precision medicine analytics platform that allows institutions to securely share patient genomic, imaging, and clinical data, according to a description on OHSU's website. In a joint phone interview with OT, Dishman, who focuses on the Intel side of the project, and Joe Gray, PhD, Professor and Associate Director of Translational Research for the OHSU Knight Cancer Institute, who is part of the quantitative oncology program at the Knight Institute working on the collaboration, further explained how the platform will work and some of the big challenges they face.


1 What exactly is the Collaborative Cancer Cloud-and how does it work?

GRAY: "What we're imagining is that each institution around the world would have a computer and a data storage capability to manage their data. And each of us would be able to control the data on our site-but if we all agree that we would be able to share that data to answer a particular question, the computer system would allow for me to send a query to your computer; it would run there and then bring the derivative results back to me.


"And this would be done quickly, securely, and cheaply. And it would be done in a way that doesn't require that I the analyst actually know how your computer system works. This system that we're developing handles all of that."


DISHMAN: "It's a data-sharing and analytics infrastructure that allows for the sharing of access to data, but without [any one institution] giving up control of it-it allows multiple people to do analysis of the data.


"For example, if I am diagnosed and go to see Dr. Druker at OHSU and he wants to find somebody that looks genetically like me, the chances of him having that data in the OHSU data center are one in a million. Instead, he has to have a huge common denominator of lots of patients' data to say, 'Here's what worked for them, and here's how I can customize a treatment for you based on your genetics.'


"This system-the Collaborative Cancer Cloud-would allow him to send a query to say, 'Hey, anybody else have somebody who looks genetically like Eric?' And instead of them having to give up control of that data or letting him copy and paste all of their data (which you can't do because of HIPAA and security reasons), this system allows that query to be securely sent to an authorized database and bring an answer back to the clinician.


"And the more people you have on that network, the larger that common denominator of data is going to be and the better the chance you have of finding somebody that looks like Eric."

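For readers who think in code, the query pattern Gray and Dishman describe can be sketched roughly as follows. This is an illustration only, not the Collaborative Cancer Cloud's actual software: the `Site` class, the `federated_query` function, the variant labels, and the similarity measure (Jaccard overlap of variant sets) are all assumptions made for the example. What it tries to show is the shape of the idea: the query travels to each institution, runs against data that never leaves that institution, and only a derived summary comes back.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two variant sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Site:
    """A hypothetical institution that keeps its records under local control."""
    name: str
    # Hypothetical records: patient id -> set of made-up variant labels.
    records: dict[str, set[str]] = field(default_factory=dict)
    shares_data: bool = True  # the site must agree to answer queries

    def run_query(self, variants: set[str], min_overlap: float) -> dict | None:
        """Run the query locally; return a summary count, never raw records."""
        if not self.shares_data:
            return None
        matches = sum(
            1
            for patient_variants in self.records.values()
            if jaccard(variants, patient_variants) >= min_overlap
        )
        return {"site": self.name, "matches": matches}


def federated_query(sites: list[Site], variants: set[str],
                    min_overlap: float = 0.5) -> list[dict]:
    """Send the same query to every site and collect the derived results."""
    results = []
    for site in sites:
        summary = site.run_query(variants, min_overlap)
        if summary is not None:  # sites that opted out contribute nothing
            results.append(summary)
    return results


# Made-up example data for three sites; one has not agreed to share.
sites = [
    Site("A", {"a1": {"V1", "V2"}, "a2": {"V3"}}),
    Site("B", {"b1": {"V1", "V2", "V4"}}),
    Site("C", {"c1": {"V1", "V2"}}, shares_data=False),
]
results = federated_query(sites, {"V1", "V2"})
total_matches = sum(r["matches"] for r in results)
print(results, total_matches)
```

In this toy version, adding more participating sites increases the pool that a single query can search, which is the point Dishman makes about the size of the network. A real system would also need authentication, auditing, and agreed data standards, none of which are modeled here.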

2 What makes this project so challenging? Why is sharing cancer data different from sharing other files over the Internet?

DISHMAN: "First, it's dealing with the large size of the files and moving them around. Many academic medical centers don't even have access to their own cancer data because those files live in the databases of the different departments and can't be shared.


"To put it in perspective-there are about 1.6 to 1.7 million cancer diagnoses expected in the U.S. this year. If we genetically sequenced all those patients' tumors just once, it would create four exabytes of data, which is about 400,000 times all of the printed content of the Library of Congress.


"And that's just sequencing new cancer patients just once.


"Genome sequencing just for cancer is the largest of the big data challenges that the planet faces."

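Dishman's figures can be checked with some back-of-the-envelope arithmetic. The short sketch below works backward from the numbers he quotes to see what they imply per patient and for the Library of Congress comparison. It assumes decimal units (one exabyte is 10^18 bytes, one terabyte is 10^12 bytes); the per-tumor and Library of Congress sizes are not stated in the interview and are derived here purely from the quoted totals.

```python
# Figures quoted in the interview.
DIAGNOSES_LOW = 1.6e6    # lower estimate of U.S. cancer diagnoses per year
DIAGNOSES_HIGH = 1.7e6   # upper estimate
TOTAL_BYTES = 4e18       # "four exabytes", assuming 1 EB = 10**18 bytes
LOC_MULTIPLE = 400_000   # "about 400,000 times" the Library of Congress

TERABYTE = 1e12          # assuming decimal terabytes


def implied_terabytes_per_tumor(diagnoses: float) -> float:
    """Data volume per sequenced tumor implied by the quoted total."""
    return TOTAL_BYTES / diagnoses / TERABYTE


def implied_loc_print_terabytes() -> float:
    """Size of the Library of Congress print collection implied by the quote."""
    return TOTAL_BYTES / LOC_MULTIPLE / TERABYTE


per_tumor_min = implied_terabytes_per_tumor(DIAGNOSES_HIGH)
per_tumor_max = implied_terabytes_per_tumor(DIAGNOSES_LOW)
loc_tb = implied_loc_print_terabytes()

print(f"Implied data per tumor: {per_tumor_min:.1f} to {per_tumor_max:.1f} TB")
print(f"Implied Library of Congress print content: {loc_tb:.0f} TB")
```

Under those assumptions, the quoted totals work out to roughly 2.4 to 2.5 terabytes per sequenced tumor and imply a print collection of about 10 terabytes.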

GRAY: "And even though from a technical point of view-just the engineering of it-the team has already demonstrated proof of concept; the engineers know how to do it. But, to me, the biggest challenge in all of this is how do we get everyone on the same page?


"Once you tell us what to compute on, we can do that. But it's defining the standards-getting people to agree what constitutes adequate security for all the data we're trying to manage-that is the biggest challenge. To what extent should we share all of the data? What ethical issues need to be dealt with as we move into the world of sharing a lot of sensitive information about the world's populations?


"These are all what I would call societal problems that the system will enable us to deal with-but we the society actually have to figure out exactly how we want to do it."


3 How is this project different from other big data initiatives, like CancerLinQ and ORIEN?

DISHMAN: "These other efforts are taking a slightly different approach; a lot of them are trying to centralize all the data. They try to bring the big players together who have data and put it all in one central database. All of those efforts create public datasets of genome data for cancer. Intel works on a lot of those efforts.


"But this platform [the Collaborative Cancer Cloud] can stitch all of those different efforts together. So the more of those networks that get created and standardized, the better. [The Collaborative Cancer Cloud] will not replace them. It will create an uber-network that everyone can share from."


GRAY: "The fundamental design of this system is scalability."


DISHMAN: "The Collaborative Cancer Cloud is trying to solve the big data problem now that will work for the research of tomorrow. We think we have a lot of big data now, but it's tiny in comparison to what's coming even two, three, or four years out. Eventually, it's going to become too expensive to build a big server farm or database for all of this data-and it can become too difficult for policy and privacy reasons to move all of that data to one central place.


"So how do we connect all of these other efforts-including all of the cancer centers big and small? How do you do that in a safe, secure way where you don't need to be an IT expert to use the system? That's where we're headed with the Collaborative Cancer Cloud."


Access More "3 Questions On..."

Read more answers straight from the experts on the latest news and provocative topics in Sarah DiGiulio's award-winning blog:
