SAS runs an annual one-day conference about healthcare and the life sciences. With merciful brevity written into its DNA, the meeting can pack its program densely with substance. There is no need for dreaded filler speakers enlisted mostly to accommodate some hotel manager's minimum revenue goal for the event.

Steven Labkoff is an example of the sort of presenter that SAS attracted. He’s a cardiologist who is currently head of strategic programs in the R&D information department at AstraZeneca.

Labkoff was on a SAS panel tasked with discussing such small matters as the long-term direction of the life sciences and the use of "big data"—mountains of 1s and 0s in quantities needed by Procter & Gamble or the CIA. Even now, in its infancy, big data has multiple definitions, not all of which may be validly applied to a 50-patient Phase II rhinitis trial.

Personalized Emphasis

At AstraZeneca, Labkoff said, there is a push to find drugs that have companion biomarkers, or lab assays that definitively indicate a specific, significant physiological state. Perhaps 60 percent of the candidate molecules at AZ have associated biomarkers, he said. "We are working very hard to find personalized medicine for all our projects," Labkoff noted.

It's hard to know whether drug and device companies will ever develop personalized medicines for a person or a genotype. Even blockbuster treatments for the masses have well-publicized risks and dubious economic underpinnings in an era of high-cost research.

Labkoff, for his part, went on to opine that personalized medicine will be aided by the analysis of larger and larger quantities of data in all phases of industrial research. "Big data can be used from the beginning of the pipeline until the end," he said.

Recruitment Modeling

Labkoff added that his former employer, Pfizer, is relying upon electronic health record (EHR) data to tweak patient recruitment strategies in clinical trials even before the first patient is screened.

AZ will be employing the same technique, he said. "You try to model out what will the end result of the inclusion/exclusion criteria look like. That helps generate a cleaner way into the trial. It would be nice to run trials in data," he said, offering an immediate disclaimer. "I think that is a long way's off."
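The modeling Labkoff describes—projecting what a protocol's inclusion/exclusion criteria would do to a candidate patient pool before the first patient is screened—can be sketched in miniature. The patient records, field names and thresholds below are invented for illustration and do not come from any real protocol or EHR system.

```python
# Hypothetical sketch: run a trial's inclusion/exclusion criteria
# against de-identified EHR records to estimate the eligible pool
# before screening begins. All fields and cutoffs are illustrative.

patients = [
    {"age": 54, "hba1c": 8.1, "on_insulin": False, "egfr": 72},
    {"age": 71, "hba1c": 7.4, "on_insulin": True,  "egfr": 48},
    {"age": 49, "hba1c": 9.0, "on_insulin": False, "egfr": 90},
    {"age": 63, "hba1c": 6.2, "on_insulin": False, "egfr": 80},
]

inclusion = [
    lambda p: 18 <= p["age"] <= 75,   # adult, under upper age limit
    lambda p: p["hba1c"] >= 7.0,      # poorly controlled diabetes
]
exclusion = [
    lambda p: p["on_insulin"],        # already on insulin therapy
    lambda p: p["egfr"] < 60,         # impaired renal function
]

def eligible(p):
    """A patient must pass every inclusion rule and no exclusion rule."""
    return (all(rule(p) for rule in inclusion)
            and not any(rule(p) for rule in exclusion))

pool = [p for p in patients if eligible(p)]
print(f"{len(pool)} of {len(patients)} patients pass the criteria")
# → 2 of 4 patients pass the criteria
```

Tightening or loosening any one rule and re-running the count is exactly the "cleaner way into the trial" being described: the sponsor sees how many real-world patients a draft protocol would actually admit, and adjusts before recruitment starts.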

Even with such a caveat, Labkoff seemed energized about using nonstandard, large quantities of data that might not be part of any typical clinical trial today. In response to a question from your faithful correspondent, he made clear that he is not contemplating, say, casually tossing clinical trial data and EHR data (each with radically different vocabularies, data structures and numerical units) into a big informatic Cuisinart. Not at all. Instead Labkoff seems to be envisioning a more judicious, selective grabbing of data from industry or healthcare.

Pharma-Insurer Alliances

Any use of physician or hospital data by the R&D trade, needless to say, presumes that there will be no massive protest from politicians or privacy activists. It's easy to imagine headlines about someone's beloved grandma in Vermont who has had her medical history of incontinence “stolen” by a big bad lab, pharmacy, insurance, or drug company.

Assuming the public can be persuaded of the anonymized nature of data swapping, industry will need to get its hands on much larger quantities of prescription, claims and safety data. All companies in the life sciences will clearly need to purchase data, or partner for it, on a massive scale to routinely analyze tens or even hundreds of millions of people at a time. That’s what the FDA is doing, and it's big data by any standard.

Labkoff noted that a handful of recent contracts have been signed to begin to help industry in the coming big data race. His employer, AstraZeneca, just signed a deal with HealthCore, which is part of the WellPoint empire. As of last year, Pfizer had a pact with Humana; Sanofi inked a similar arrangement with Express Scripts. The bandwagon is just getting going.

Clinical Islands

"No one institution has a lock on the entire spectrum," he said. Needless to say, healthcare repositories change so fast and are so voluminous that we are not talking about something that can be burned on a DVD. There will be entirely new storage, transfer and analytic tools needed for big data projects, even if the same familiar IT goliaths dominate the market.

Conceptually, Labkoff said, the claims, prescription, trial and other types of medical information are islands. Each provider has only partial, isolated insights into any medication or device. In some cases, IT folks and statisticians will be doing the heavy lifting to connect the islands. He lamented the shortage of people with the necessary postgraduate training for such work.
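The island-connecting work Labkoff assigns to IT folks and statisticians is, at its simplest, record linkage: joining separately held datasets on a shared, already-anonymized patient key, so that each source's partial view becomes one fuller record. The datasets, keys and fields below are invented for illustration.

```python
# Illustrative sketch of linking two "islands" of healthcare data on a
# shared, de-identified patient key. All records here are invented.

claims = {
    "p001": {"diagnosis": "rhinitis"},
    "p002": {"diagnosis": "hypertension"},
}
prescriptions = {
    "p001": {"drug": "loratadine"},
    "p003": {"drug": "lisinopril"},
}

# Only patients present in both sources yield a linked record; the
# unmatched entries (p002, p003) are each provider's isolated view.
linked = {
    pid: {**claims[pid], **prescriptions[pid]}
    for pid in claims.keys() & prescriptions.keys()
}
print(linked)
# → {'p001': {'diagnosis': 'rhinitis', 'drug': 'loratadine'}}
```

In practice the hard part is everything this toy omits: reconciling vocabularies and units across sources, probabilistic matching when no clean shared key exists, and doing it all at the scale of tens of millions of patients—hence the shortage of trained people Labkoff laments.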

Business Idea

But Labkoff is privately imagining a better approach. He noted that there are only a few repositories of longitudinal, lifelong medical histories, at places like Kaiser Permanente. That forward-looking organization marshals data about treatments, prescriptions, radiology and lab tests from a patient’s birth until death. (Most European or Asian countries with single-payer insurance systems can do likewise.)

To overcome the informatically Balkanized U.S. healthcare landscape, creativity will be required. Labkoff has been ruminating about the possibility of a new, still imaginary non-profit organization that might combine all the commercially available U.S. healthcare data sources, and serve them up (to take a bit of poetic license) as one unified burrito of medical data.

The complexity of combining so much data, he conceded, would be significant. He did not explicitly say it would be impractical for each company in the industry to reinvent the wheel every time such data were needed. But he hinted in that direction.

And Labkoff quickly stated aloud what many in the audience must have been thinking, which is that those who currently control healthcare data are not hillbillies with scant knowledge of the economic value of their assets. "It will probably never come to pass because of how people value their data," he said of his own data-aggregating vision.

Yet the larger value of such compilations of data, he insisted, is not simply for the data in its raw state. Rather, the value emerges after the data has been analyzed. Said Labkoff: "The data is what you do with it. What are the algorithms you use with it?"