
Over the last two years I have been able to attend the two “Big Data in Healthcare” conferences run by Phacilitate. From these conferences it has become apparent that the potential of Big Data to change the way we conduct and manage clinical trials is disruptive, to say the least. But big change will only occur if we, the clinical trial community and the regulatory agencies, embrace the digital age in which we exist.

Big Data definition: Big Data is not, as many people might think, just datasets that are larger than standard datasets. Big Data is defined by being large on four dimensions, aka the 4 V’s: volume, velocity, variety and veracity. Classic clinical trials are “barely there” by this definition (the exception is perhaps some very large programs in which many different datasets are collected from a large variety of sources, such as those for the anti-nerve growth factor compounds). With Big Data arriving in such volume and at such speed, we lose the ability to clean the data in the classic sense. As such, we end up with significantly more data points, but they are not “clean”, which generally means that they lack structure, are not comparable with one another, or contain irrelevant data points, sometimes called “noise”. The upside is that we can spot trends and evaluate interventions closer to ‘real life’, rather than having to stop and ask the subject every few hours how they are doing on whatever variable we are recording.

Wearable devices are now becoming mainstream in clinical trial data collection. They give us the ability to monitor subjects continuously and for longer and longer periods of time. With this ability come large volumes of data; however, the data that we receive cannot be cleaned in the usual sense, mainly because of the volume, but also because of the veracity and velocity of the incoming data. One of the speakers at this year’s “Big Data in Healthcare” conference commented that Medidata spent 2 weeks at the GSK human laboratory using state-of-the-art wearable and 24-hour monitoring. As a result, the team gained more data in that time than their database had acquired from all the previous studies combined! If we are receiving this level of information in a ‘lab’ in 2 weeks, consider the data we could obtain from a 12-month study.

In the long term, this will change the way we conduct larger Phase III and Phase IV clinical trials. The way we develop hypotheses or objectives will become more sophisticated, but only if the regulatory agencies allow this new input. In much the same way that adaptive clinical trial design is allowing more complexity and providing more insight, the acceptance of Big Data will change the face of clinical trials, likely very much for the better.