More trials. Bigger trials. More technology and services firms supporting each trial. More types of data. More silos. More entrenched egos defending the silos.
All of the above should be driving adoption of the standards developed by the Clinical Data Interchange Standards Consortium (CDISC). But pockets of resistance to standards remain.
Outside a tight group of CDISC insiders and volunteers, the alphabet soup of data standards can seem like a jumble of bolts and wooden parts from an Ikea store, with no diagram to show how they all go together. Somebody might be able to assemble a table out of them. But to the uninitiated, the pile is just a bunch of parts.
Kit Howard is the sort of person who could make sense of any jumbled pile of parts and construct a coherent whole. A consultant with her own firm, Kestrel Consultants of Ann Arbor, Michigan, she claims no magic wand, peddles no sacred incantations. She does not sell software.
But she can offer a plan. Specifically, a data life cycle plan, or DLP. It's a Microsoft Word document. Written in English, not XML or computer gibberish. It's organized by type of data. (There's a link to one chapter of a DLP at the bottom of this article.) Each chapter might run from 20 to 120 pages. Collectively, all of the DLPs could run to more than a thousand pages. Ideally, they would be stored electronically, not printed and put on a shelf.
The DLP guides an organization and serves as a blueprint for how to create every type of data across all therapeutic areas and functional specialties. Howard describes the DLP as "an overall document that says here are the things you need to think about." In some DLPs, there might be more than 15 chapters, each controlled by a group of domain experts.
Standard operating procedures (SOPs) typically cover process. DLPs, in contrast, are technical specifications of what happens to the data. Both SOPs and DLPs should be subject to similar governance. The DLP creates a framework for discussions that would happen anyway, but forces them to an earlier stage of the process. "We spend so much time at the end of the process trying to fix all these things we didn't anticipate," she notes.
In her view, defining an organization's data standards is not some grim chore of discussing database specifications or variable names. Instead, developing a company's data standards is a scientific or intellectual exercise that helps the organization undertake trial design consistently. Rather than be surprised by some nuance of the Hamilton depression scale late in a project, the DLP creation process would define the important aspects of the scale for every major team on the trial.
Senior management has to support a DLP. But once it's complete, the package of documents can help an organization worry about something important. Like, say, science. And avoid the 17th round of haggling over how to collect vital signs or whether an adverse event form can use Helvetica fonts.
Balance of Power
"In isolation, your variable names and data sets are a tiny piece of the process," Howard notes. "If you haven't consistently defined how you ask the questions and clean the questions, then you end up with data that are not necessarily comparable. Standards have a much bigger meaning than what you are going to call your database names."
The DLP document is created with a well-defined process. Howard's vision is that organizations sponsoring clinical trials need to actively, energetically engage all of the stakeholders and help them understand the impact of every decision. With so many deep reservoirs of domain expertise, facilitating that process can be challenging.
Says Howard of the DLP document: "Everybody owns a section. You are equalizing the power. It's one of the biggest strengths of this to equalize the power around the table, so you don't have solutions that inappropriately penalize one person."
So Many Silos
To be clear, she does not feel everyone needs to be able to do each other's job. She does feel a greater level of understanding is essential. "Cross-functional communication and understanding are our next huge hurdles to overcome," she says.
One byproduct of making DLPs, she reports, is that the clinical types or the regulatory people can't be dictatorial. They can see the consequences of their actions on drug safety colleagues, data managers or medical writers. "This kind of knowledge helps everyone involved in the process to optimize process across the organization, as opposed to just their particular silo," Howard adds.
Howard has been working in the life sciences for a few decades. Her industry stints include time at Parke-Davis in the early 1990s. She reports that the firm had an uncommonly far-sighted approach to data standards. "The greater the transparency and the greater the understanding across functional areas, in this case data and files, the greater the quality will be," she says.
Like apple pie, data quality has many admirers. But Howard's been around the industry long enough to know some of the dynamics that produce its opposite. Piecemeal approaches, approaches dominated by physicians or biostatisticians alone, may lead to trouble. "You can't define data quality unless you know what you're capturing and what you're going to use it for," she says. "Quality is not an inherent quality. To understand if your data are of good quality, you have to understand what questions you're going to ask of your data."
As part of a DLP, Howard delivers a short reference to the applicable federal regulation, if any. Her intent is to minimize pointlessly circular discussion of hypothetical requirements from the regulatory authorities. She aims to provide the underlying legal requirement without getting bogged down in it. "Here's the reg, here's our interpretation, here's why we did what we did," Howard explains. "If we can't provide a clear, unambiguous answer, then at least we can give you the information you need to draw the conclusions yourself."
At times, she muses, such discussions can take on a life of their own, much as people debate politics. The DLP provides a way to have the discussion once and move on to more fruitful conversations. "That is the way to prevent the discussion from happening over and over again," she says. "It builds institutional knowledge."
We've done our best to keep some of the CDISC nomenclature out of this story, but have no choice but to refer to the study data tabulation model (SDTM). The main thing to know about SDTM is that the FDA is on record that it will one day be the way data must be submitted to the agency.
Howard says that some organizations consult her after trying to perform somewhat simplistic mapping exercises of their data into SDTM. "Mapping is always a dangerous business," she says. "SDTM is not trivial. It is a very robust model. You can put almost anything into it." She advises a more gradual and comprehensive approach, but grants that a DLP is only one of many methods to successfully implement data standards.
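Howard's warning about simplistic mapping can be made concrete. The toy sketch below maps a raw demographics record into a few variables of the SDTM DM (demographics) domain. The DM variable names (STUDYID, DOMAIN, USUBJID, SEX, AGE, AGEU) come from the published SDTM model; everything on the raw side, including the field names and the sample record, is hypothetical and not drawn from any real study.

```python
# Toy sketch: a naive one-to-one mapping of a hypothetical raw
# demographics record into a partial SDTM DM row. Real mappings must
# also handle controlled terminology, derivations (e.g., age from a
# reference date), and missing data, which is why Howard calls
# mapping "a dangerous business".

def map_to_dm(raw, studyid):
    """Map one raw demographics record to a partial SDTM DM row."""
    sex_map = {"male": "M", "female": "F"}  # anything else -> "U" (unknown)
    return {
        "STUDYID": studyid,
        "DOMAIN": "DM",
        # USUBJID must be unique across the whole submission, so the
        # study ID is prefixed to the site and subject numbers.
        "USUBJID": f"{studyid}-{raw['site']}-{raw['subject']}",
        "SEX": sex_map.get(raw.get("gender", "").lower(), "U"),
        "AGE": raw.get("age"),
        "AGEU": "YEARS" if raw.get("age") is not None else None,
    }

row = map_to_dm({"site": "001", "subject": "0042",
                 "gender": "Female", "age": 57}, studyid="ABC-123")
```

Even in this tiny example, decisions about subject ID construction and how to code an unexpected gender value have to be made somewhere; a DLP is one place to record them once rather than renegotiate them per study.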
Some organizations, tragically, may wish to go through the motions of data standards without understanding the larger efficiencies and benefits that may accrue to an organization as a whole over several years.
Although most of her work is for the sponsor community, the contract research organization (CRO) industry is also starting to appreciate the value of standards. Howard sees data standards and DLPs, used appropriately, as a way to ensure satisfied partners in a CRO-sponsor pairing. Says Howard: "You're much less likely to have screaming about how the data is of such poor quality that we're never going to use you again."
By the same token, the absence of DLPs and data standards may lead to drama and failed projects. "You have a greater risk that you are going to have a misunderstanding and data that are not competently handled," Howard warns.
Indeed, she is skeptical about the outsourcing of data management, simply because a profound appreciation of the data seems like a peculiar matter to entrust to others. "I don't understand how companies can say we are outsourcing our data management because it's not our core competency," Howard says. "I don't understand what it is that we do that isn't about the data. Without the data, all you have is a pile of white powder on a shelf."
Editor's note: Here's a sample chapter of a DLP for demographic data.