Is the data-handling part of the pharmaceutical industry being inexact in its use of some key terms? Is it preparing intelligently for the future? Consultant George Laszlo wonders.

On the terminology front, he says sheer sloppiness is obscuring some of the most urgent priorities. His question: where exactly should the industry clean all that clinical data it’s gathering? What should that bucket be called, and what should it do?

Laszlo, by way of introduction, has worked both for goliath IT companies (IBM, Computer Sciences, and Fujitsu Consulting) and in the life sciences (Ayerst, Hoechst-Marion-Roussel and Schering-Plough). He’s old enough to have survived a few IT fads and young enough to pen a superb blog. Here’s one of his posts on data repositories. “A clinical data repository IS NOT a clinical data warehouse,” he writes.

The issue, for Laszlo, is that the industry is using the word “warehouse” too casually. “Almost everyone in the pharmaceutical industry has misinterpreted that word,” says Laszlo. “A data warehouse traditionally would be a system that is only used to access clean data.” SAS, Insightful and other companies take over from there, he notes.

You Say Toe-Mah-Toe

What most people in the industry really mean when they say “warehouse,” he says, is in fact a repository. “All the talks about clinical warehouses were actually about clinical data repositories,” he says of the DIA annual meeting.

Is this merely a semantic issue? No, Laszlo says. The terminology may now be quite muddled, he concedes. Even so, he says it can be helpful to define one’s terms and adhere to the definitions. “With EDC coming on,” says Laszlo, “there is the question do you need a place to clean the data? From my perspective, the proper place is the clinical data repository.”

Missing Functionality

Hybrid studies of paper and electronic data are creating new data-cleaning issues, even in the EDC era, says Laszlo. Some EDC systems can’t check dependencies between patient visits. “A clinical data repository can be used for that, depending on the capabilities of the EDC vendor,” he notes.
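To make the idea of a cross-visit dependency check concrete, here is a minimal sketch in Python. It is an illustration only, not Laszlo's method or any vendor's actual feature: the record layout (patient_id, visit_no, visit_date, source) and the function name are assumptions made up for this example. It flags any visit dated earlier than the preceding numbered visit for the same patient, the kind of inconsistency that can slip through when paper and EDC records are merged.

```python
from collections import defaultdict
from datetime import date


def find_visit_order_violations(records):
    """Return (patient_id, earlier_visit_no, later_visit_no) tuples where the
    later-numbered visit carries a date before the earlier-numbered one.

    The record fields used here are hypothetical, chosen for illustration.
    """
    # Group all visit records by patient, regardless of source system.
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_id"]].append(rec)

    violations = []
    for patient_id, visits in sorted(by_patient.items()):
        # Compare each visit with the one numbered immediately before it.
        visits.sort(key=lambda r: r["visit_no"])
        for prev, curr in zip(visits, visits[1:]):
            if curr["visit_date"] < prev["visit_date"]:
                violations.append((patient_id, prev["visit_no"], curr["visit_no"]))
    return violations


# Made-up sample data: one visit arrived on paper, the rest via EDC.
records = [
    {"patient_id": "P001", "visit_no": 1, "visit_date": date(2006, 3, 1), "source": "edc"},
    {"patient_id": "P001", "visit_no": 2, "visit_date": date(2006, 2, 15), "source": "paper"},
    {"patient_id": "P002", "visit_no": 1, "visit_date": date(2006, 3, 5), "source": "edc"},
    {"patient_id": "P002", "visit_no": 2, "visit_date": date(2006, 4, 2), "source": "edc"},
]

print(find_visit_order_violations(records))  # [('P001', 1, 2)]
```

A check like this needs every visit for a patient in one place, which is the practical argument for running it in a repository that holds both the paper and the electronic records.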

In delivering that functionality, he says, there’s no reason EDC vendors cannot step up on their own (or be persuaded to do so). But a data repository could address the data-cleaning issues and provide other benefits as well. Laszlo says it’s a question of where each sponsor wants to do the cleaning, and how important ad hoc reporting on the cleaned data is afterward. His view is that some key functions (related to collaboration, traceability and auditability) are not as well supported by clinical data management systems (CDMS), which were designed to be paper-dependent. Everything in a clinical data repository, on the other hand, is electronically traced and audited.

Data Explosion

Laszlo says that the combination of EDC and clinical data repositories will be attractive to companies of all sizes. Biotechs lacking big investments in traditional clinical data management systems are the first to see some of the benefits.

One of the key attractions of a clinical data repository is that the sources of clinical trial data are heterogeneous now. They are likely to become more so, with genomics and metabolomics and biomarkers being used more widely. Ideally, all that data should live in a centralized or federated, company-wide repository, he says. “That becomes your vault for everything that has to do with data,” Laszlo says. Laboratory data, a cheek swab of DNA, the demographic data from a patient: all of it should be stored in the same place.

Is there a trick to picking the right repository for clinical data? For Laszlo, it is not just a matter of functionality, which is fairly similar in the marketing literature for all of the clinical data repositories. Rather, the crux is something far more subjective: how easily ordinary employees can pull data from the repository and do their jobs. That is a lot harder than it sounds. “It is not just the features and functions of the software but the way it’s delivered to the end user so that they can get back to productivity,” he says.

Editor’s note: the second part of this story can be found here and includes a few clinical data repository options and their long-term advantages.