The term “big data,” which describes amounts of information too massive to be processed easily, has been around since the nineties and was embraced in the early aughts.1 The term is used across all industries, including health care, but the precise types of data and how they’re collected and used can vary by field. To understand big data’s role in health care, it’s important to start with its definition and its role in technology before exploring how data analysis can be a powerful tool for transforming health care.
What is “big data”?
Big data refers not solely to a large amount of data but also to the complexity of the data and the rate at which it’s generated. Analysts refer to the factors that add up to big data as the three Vs2:
- Volume: Big data frequently arrives in large volumes and in unstructured form. Examples include clickstreams from a web page or mobile app and social media feeds.
- Velocity: Another characteristic of big data is that it is produced and collected at an incredibly fast rate. Some data streams are used in near real time, as with wearable tech.
- Variety: Traditional data was generally structured to fit in a relational database (think Excel tables). Big data encompasses many different unstructured or semistructured data types like video, audio and some forms of text that require additional processing to extract meaning.
Think of all the different things that could potentially create data and how the three Vs might apply: wearable smart devices, artificial intelligence, social media, Internet of Things (IoT), mobile devices, location and motion sensors and more. It’s easy to see how data has gotten so “big.”
In recent years, two more Vs have been considered integral to understanding big data2:
- Value: All data has some intrinsic value, but some is more valuable on its own than others. Consider that a health care record may be valued at up to $250 on the black market, compared with $5.40 for payment card information, the next most valuable type of record.3 A single health record is monetarily valuable in part because it gives a complete profile of a person and includes many data points that can be put to use in a variety of ways.
- Veracity: The level of accuracy of your data will affect any insights drawn from it. Think about how important it is to have accurate measurements when doing a research study. For the findings to be meaningful, data must come from a reliable source (whether that is an institution or a tool).
The two additional Vs bring up one of the most important things to keep in mind when considering big data and its uses: the value and veracity of data are partially dependent on how it is used. This is where analysis comes in. It is the means by which insights (the intrinsic value of data) can be uncovered, and the manner in which it is applied can affect the accuracy and reliability of the findings.4
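To make the veracity point concrete, here is a minimal Python sketch, assuming a made-up stream of heart rate readings from a wearable device. The plausibility limits, the sample values and the function name `split_by_plausibility` are all illustrative assumptions, not taken from any real device or clinical guideline. It shows how a couple of faulty readings can distort a simple average, and how the analyst's choice of filter changes the result.

```python
from statistics import mean

# Hypothetical plausibility limits for a heart rate, in beats per minute.
# These bounds are illustrative assumptions, not clinical guidance.
MIN_PLAUSIBLE_BPM = 25
MAX_PLAUSIBLE_BPM = 250


def split_by_plausibility(readings):
    """Separate readings into plausible values and rejected values."""
    plausible, rejected = [], []
    for value in readings:
        if value is not None and MIN_PLAUSIBLE_BPM <= value <= MAX_PLAUSIBLE_BPM:
            plausible.append(value)
        else:
            rejected.append(value)
    return plausible, rejected


# Made-up sample: 0 and 999 stand in for sensor glitches, None for a dropped reading.
raw_readings = [72, 75, 0, 71, 999, None, 74]

plausible, rejected = split_by_plausibility(raw_readings)

# The naive average only skips missing values; the cleaned one also drops glitches.
naive_average = mean(v for v in raw_readings if v is not None)
cleaned_average = mean(plausible)

print(f"Average including glitches: {naive_average:.1f} bpm")
print(f"Average after filtering:    {cleaned_average:.1f} bpm")
print(f"Rejected readings: {rejected}")
```

The two averages differ widely even though only two of the seven readings are faulty, which is why the reliability of the source and the way the data is handled both matter.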
Historical use of data in health care
Data is at the heart of scientific study and therefore central to health care. Medical records have been around in some form for about four thousand years,5 but permanent patient case files are believed to have come about in the early 1800s at The New York Hospital, where they were used primarily for teaching cases to medical students. At the time, records varied and reflected the opinions and personalities of the people keeping them (in other words, they weren’t very accurate or reliable). At the end of the 19th century, real-time recording of cases and a fixed chart structure through the use of forms were introduced to streamline record keeping for individual patients.6 Public health history is also a good place to look for the start of data science in health care. The American Public Health Association was founded in 1872, at a time when scientific advances were helping to reveal the causes of communicable diseases and to track the spread of disease through a city.7
Typically, when you talk to someone about “data in health care,” the first thing they’ll think of is their medical history and all the associated notes and test results they’ve accumulated over their lives. This health care data is stored in a patient’s medical or health record. These records grew out of the structured notes of those 19th century doctors and medical students and are now primarily stored electronically in what are known as electronic health record (EHR) systems. Early efforts to manage health records and the tremendous amount of data within them began with the advent of computing in the 1960s and 70s. The Institute of Medicine (IOM) recognized the need for serious analysis of paper health records and spent a decade creating a report, released in 1991, that argued the case for EHR systems as a key method of improving patient records.8 Paper health records, however, hung around long after the IOM report, and their fallibility and limitations were exposed repeatedly. (In 2005, Hurricanes Katrina and Rita destroyed hundreds of thousands of paper patient medical records.)9 In 2009, the Health Information Technology for Economic and Clinical Health (HITECH) Act was enacted as part of President Obama’s American Recovery and Reinvestment Act. It encouraged medical providers to adopt technology (including EHRs) for improved quality and coordination of patient care and, as a result, made health care data much more easily accessible and useful.8
Using big data in health care settings
Today, big data is used across health systems in hospitals, clinics, emergency rooms and in related environments like insurance companies and governmental and public health institutions.
All sorts of health care data can be collected, including10,11:
- Health histories
- Medical imaging
- Laboratory results
- Machine-generated or sensor data, such as readings from vital-sign monitors in inpatient settings, wearable fitness trackers, heart monitors and other Internet of Things (IoT) devices
- Information gathered through research trials
- Insurance claims
- Patient surveys
This data can then be combined with demographic, geographic and social data for well-informed, powerful decision-making. A health care data analyst can use this information to accomplish all sorts of tasks, like11:
- Detecting and analyzing patterns of disease
- Determining which interventions are likely to succeed and when an intervention isn’t worth the risk (e.g., surgery or chemotherapy for cancer)
- Identifying patients at risk of hospital-acquired infections like sepsis and MRSA
- Reducing unnecessary health care expenditures
- Identifying and fixing inefficiencies in processes
- Identifying health needs in communities and providing appropriate services (population health)
- Reducing hospital or emergency room readmissions
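As a rough illustration of the last task, the Python sketch below combines a made-up list of hospital admissions with a made-up demographic lookup and estimates how often a discharge is followed by a readmission. Everything in it is an assumption for the sake of the example: the patient IDs, dates, age bands, the 30-day window and the function name `readmission_rates`. It is not a standard readmission measure. In particular, it counts every discharge in the denominator, even when no follow-up period is observed.

```python
from collections import defaultdict
from datetime import date

# Made-up admissions: (patient_id, admit_date, discharge_date).
admissions = [
    ("p1", date(2022, 1, 3), date(2022, 1, 6)),
    ("p1", date(2022, 1, 20), date(2022, 1, 22)),
    ("p2", date(2022, 1, 5), date(2022, 1, 9)),
    ("p3", date(2022, 2, 1), date(2022, 2, 4)),
    ("p3", date(2022, 4, 15), date(2022, 4, 18)),
    ("p4", date(2022, 2, 10), date(2022, 2, 12)),
    ("p4", date(2022, 2, 25), date(2022, 2, 27)),
]

# Made-up demographic lookup: patient_id -> age band.
age_band = {"p1": "65+", "p2": "18-64", "p3": "18-64", "p4": "65+"}

READMISSION_WINDOW_DAYS = 30  # assumed window for this sketch


def readmission_rates(admissions, group_of, window_days=READMISSION_WINDOW_DAYS):
    """Return {group: share of discharges followed by a readmission within the window}."""
    stays_by_patient = defaultdict(list)
    for patient_id, admitted, discharged in admissions:
        stays_by_patient[patient_id].append((admitted, discharged))

    discharges = defaultdict(int)
    readmissions = defaultdict(int)
    for patient_id, stays in stays_by_patient.items():
        stays.sort()  # order each patient's stays by admission date
        group = group_of[patient_id]
        for i, (_, discharged) in enumerate(stays):
            discharges[group] += 1
            # Compare each discharge with the same patient's next admission, if any.
            if i + 1 < len(stays):
                next_admitted = stays[i + 1][0]
                if (next_admitted - discharged).days <= window_days:
                    readmissions[group] += 1

    return {group: readmissions[group] / discharges[group] for group in discharges}


rates = readmission_rates(admissions, age_band)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} of discharges readmitted within {READMISSION_WINDOW_DAYS} days")
```

Grouping by a demographic field is what turns a single system-wide number into something an analyst can act on, such as targeting follow-up care at the group with the highest rate.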
Being a data analyst can be a creative job as you come up with different ways to use the information you have to improve the entire health care system and potentially save lives. Here are just a few examples of how others have used the big data available in health care11:
- Columbia University Medical Center analyzed “complex correlations” in streams of physiological data from patients with brain injuries to give clinicians critical, timely information; the approach has been reported to diagnose complications as much as 48 hours sooner than before in patients who have suffered a bleeding stroke from a ruptured brain aneurysm
- The University of Michigan Health System standardized the administration of blood transfusions using big data analytics research combined with first-hand experience, reducing transfusions by 31% and expenses by $200,000 per month
- Kaiser Permanente associated clinical data with cost data to generate a key data set, the analytics of which led to the discovery of adverse drug effects and subsequent withdrawal of the drug Vioxx from the market
- Researchers at the Johns Hopkins School of Medicine discovered they could use data from Google Flu Trends to predict sudden increases in flu-related emergency room visits at least a week before warnings from the CDC
Harness the power of data to improve health care
If the idea of digging into data to uncover patterns sparks your interest, consider pursuing the online Master of Science in Health Care Data Analytics (HCDA). The HCDA will provide you with both the analytics skills and the health care context to take on one of the most dynamic roles in the field. Become a health care data analyst and take on a career where you can drive meaningful change through the clever use of technology and information. Apply today.
1. Retrieved on February 2, 2022, from bigdataframework.org/short-history-of-big-data/
2. Retrieved on February 2, 2022, from oracle.com/big-data/what-is-big-data/
3. Retrieved on February 2, 2022, from securelink.com/blog/healthcare-data-new-prize-hackers
4. Retrieved on February 2, 2022, from datascience.aero/big-data-veracity-value
5. Retrieved on February 2, 2022, from pubmed.ncbi.nlm.nih.gov/24054954
6. Retrieved on February 2, 2022, from pubmed.ncbi.nlm.nih.gov/21079225
7. Retrieved on February 2, 2022, from apha.org/about-apha/our-history
8. Retrieved on February 4, 2022, from journalofethics.ama-assn.org/article/development-electronic-health-record/2011-03
9. Retrieved on February 4, 2022, from library.ahima.org/doc?oid=104430#.Yf2Lr_XMJ6o
10. Retrieved on February 4, 2022, from healthitanalytics.com/news/which-healthcare-data-is-important-for-population-health-management
11. Retrieved on February 4, 2022, from ncbi.nlm.nih.gov/pmc/articles/PMC4341817/