What Is Health Data Science?

View all blog posts under Articles | View all blog posts under Health Informatics

A health data scientist prepares a chart for a presentation.Data science is changing the world. Its influence can be seen in modern-day conveniences, from Google searches to smartphones, and organizations across the public and private sectors are racing to harness its transformative potential. In the healthcare industry, data science has the power to greatly improve quality of care and resolve many other complex issues.

According to the Centers for Disease Control and Prevention (CDC), approximately 60% of adults in the United States have a chronic disease, such as cancer or heart disease, and 40% have two or more chronic diseases. Chronic disease is the leading cause of death and disability, as well as the main driver of the $3.8 trillion spent each year on healthcare costs.

The human and financial toll represented in those figures—combined with a projected surge in the population of older Americans over the next decade and increased health risks associated with climate change—calls for a rethinking of healthcare delivery. The application of data science to healthcare could aid in early disease detection, help doctors identify at-risk patients, lower costs, and drive innovation and discovery in pharmaceuticals, among other advancements.

Health data science will likely have a tremendous impact on healthcare in the years to come, and employers in the healthcare industry are increasingly seeking data scientists, medical records technicians and other data specialists to drive this change. As a result, degrees in fields such as data science, health informatics and analytics have never been more valuable.

Defining Health Data Science

Broadly speaking, data science involves extracting meaningful information out of large quantities of data and using it to guide decision-making. The insight gleaned from this data is often used in business to predict consumer behavior and assess risk. What is health data science exactly?

Healthcare is characterized by vast amounts of complex and unstructured data, flowing in from all the different sectors of the healthcare industry: hospitals and providers, insurance companies, clinics, medical equipment companies, and researchers. Data science allows those at every level of healthcare—doctors, researchers, drugmakers—to access and leverage that information to find innovative solutions to complex issues. Health data science is closely related to the field of health informatics; both involve the use of data and technology to improve healthcare, but they involve two slightly different concepts. Health informatics largely deals with the management of health data, with a focus on health information technology and sharing data to improve healthcare delivery and patient outcomes. Health data science focuses on analyzing big data and real-world applications designed to improve the entire healthcare experience. The adoption of data analytics and data science in healthcare is still a relatively new concept, spurred on in part by the adoption of electronic medical records (EMRs) over the last decade. Former President Barack Obama’s administration spent more than $30 billion incentivizing hospitals and doctor’s offices to switch from paper to electronic records. As a direct result, between 2008 and 2015, the number of hospitals using EMRs grew from 9% to 83%. This massive shift helped make possible the kinds of advancements derived from health data science and health informatics.

Data science techniques such as artificial intelligence (AI) and machine learning have proved especially valuable for healthcare, allowing providers to analyze and interpret massive amounts of data. These tools can identify patterns and correlations in the data with greater efficiency than human analysis, providing actionable insights to improve care. AI and machine learning have been successfully used to diagnose disease, assist in surgical procedures and develop new drugs, along with many other applications.

How Data Science Is Transforming Healthcare

The potential applications of health data science are widespread, with benefits for patients (improved quality of care), providers (reduced costs), and society at large (pandemic response strategies). Data science and AI can be used to improve diagnosis and treatment, advance the development of lifesaving drugs, streamline operations at hospitals and clinics, and much more.

While the adoption of data science in the healthcare industry is still in the early stages, a number of examples already exist of how data science is transforming healthcare.

Disease Diagnosis

Errors in medical diagnosis affect millions of Americans each year, and according to some estimates, between 40,000 to 80,000 people die annually from complications associated with these errors. Using AI to analyze and interpret patient records for patterns and correlations, data science can help prevent misdiagnosis and aid in early diagnosis.

AI algorithms can also assess other characteristics, such as demographics and clinical risk factors, to identify patients at high risk for certain chronic diseases. According to Forbes, AI has been used to identify metastatic breast cancer and detect cases of cardiac arrest on emergency dispatch calls with a higher success rate than humans.


AI and machine learning can also help doctors gain a better understanding of a disease and tailor treatment to a patient’s needs. Researchers have developed AI algorithms to detect genetic variants of cancer, for example, which allows providers to target the underlying cause of the disease and design a personalized treatment plan.

Additionally, AI-powered virtual nursing assistants give patients 24/7 access to medical support and answers to questions about medication. Virtual assistants can help reduce readmission rates and prevent unnecessary hospital visits, transforming healthcare from reactive to proactive.

Operational Support

AI and data science allow healthcare providers to streamline certain administrative duties, reducing overhead and allowing doctors and nurses to prioritize the most urgent tasks. For example, voice-to-text transcriptions can automate tasks such as writing chart notes, ordering tests and prescribing medications.

Data science can also be used to schedule bed assignments in hospitals and project other operational decisions. These enhancements could save the healthcare industry an estimated $18 billion annually, according to Harvard Business Review.

Pharmaceutical Development

Developing and bringing a new drug to market can cost $2.5 billion and take more than a dozen years, according to pharmaceutical company Novartis. Researchers are hopeful that data science can improve drug development by using AI for especially time-consuming and costly processes, such as creating new combinations of compounds or sifting through huge libraries of potential drugs to assess viable candidates for clinical trials. In this way, health data science can make drug discovery faster and cheaper.


Health data science has also been integral in the fight against COVID-19. The following are just a few examples of how the healthcare industry has relied on data science during the pandemic:

  • CNN reported that healthcare data science firm Cogitativo identified California residents who were most at risk of COVID-19 infection by analyzing health insurance claims and county demographics.
  • An AI platform used to predict outbreaks of dengue fever was employed to aid in COVID-19 tracing and case detection in Brazil, Malaysia and the Philippines.
  • Health data has been integral to a vaccine rollout, according to the World Economic Forum, aiding in the tracking and distribution of billions of doses.

Additionally, researchers are hopeful that the data collected during COVID-19—such as how many ventilators a hospital needed at the peak of an outbreak or the impact of social distancing on mortality rates—can help develop more accurate models and strategies to fight future pandemics.

Advancements in digital healthcare and data science hold particular promise for low- and middle-income countries, where they can be used to improve medical care and innovation for underserved populations. According to the World Economic Forum, many of these countries are outpacing the rest of the world in adopting and implementing digital care models. In addition to the AI platform used for dengue outbreaks, an AI-driven smartphone app in Nigeria has been used to diagnose birth asphyxia, which has historically been a problem in the country’s remote regions.

Health Data Science Jobs

As the influence of data in healthcare continues to grow, the healthcare industry will increasingly seek individuals with knowledge of analytics, informatics and data science. The interest is mutual. A 2020 Novartis survey revealed that technology professionals have an increased desire to work in healthcare, in large part due to the industry’s powerful response to COVID-19. Additionally, 85% of those surveyed said that data science has been crucial to the pandemic response.

Health Data Science Job Growth and Salaries

According to LinkedIn’s 2020 Emerging Jobs Report, AI specialist is the fastest-growing job in the country, and data scientist is the third fastest-growing job. Data science in particular has ranked at or near the top of the Emerging Jobs Report for the last few years, experiencing an average annual growth of about 35% year over year. Meanwhile, the healthcare industry has also experienced tremendous recent growth, in part driven by the onset of the COVID-19 pandemic and the seismic workforce shifts it has brought about. The U.S. Bureau of Labor Statistics (BLS) projects that healthcare industry employment will grow by 15% over the next several years.

The outlook for jobs in data science and computer and information research specialists is especially promising, according to the BLS:

  • Data scientists and mathematical science occupations. The median annual salary for these positions was $94,280 in 2019, with the top 10% earning more than $150,000.
  • Computer and information research scientists. The median annual salary was $122,840 in 2019. Jobs in this field are projected to grow by 15% between 2019 and 2029, much faster than the estimated growth rate for all occupations.

Types of Health Data Science Jobs

Those in health data science roles should have a diverse skill set, with knowledge of statistics, analytics, machine learning and data science programming. An analysis of nearly 200 health data scientist job postings by the Journal of the American Medical Informatics Association (JAMIA) found that employers were largely seeking four different types of data scientists with varying skill sets.

Performance Improvers

Performance improvers accounted for the largest number of job postings, according to JAMIA. Performance improvers examine and leverage data to improve healthcare delivery and patient outcomes, as well as financial performance. Individuals in this position are most often employed at health systems or insurance companies and should be skilled in statistics, analytics and data communication.

Product Developers

Product developers develop and deliver software products to accomplish various goals, including personalizing patient treatment, predicting disease conditions, and providing administrative support at hospitals and clinics. They focus on areas such as population health, digital health, behavioral health and speech language solutions. Product developers should be skilled in data mining and machine learning techniques and have expertise in a data science programming language.


Modelers should have core data science skills and machine learning expertise. Modelers leverage large data sets, such as insurance claims or biometric data, to create AI algorithms for problem-solving purposes. They should be skilled in data science programming and statistics.


Health data science innovators address areas such as precision/personalized medicine, genomics, and biology from a healthcare delivery/informatics perspective. They may develop AI for healthcare delivery or use data analytics to resolve real-world challenges in the healthcare industry.

The JAMIA analysis found that healthcare vendors (33.8%), insurance companies (18.7%) and health systems (16.2%) were the largest employers of health data scientists.

The combination of growth in healthcare along with an emerging need for data scientists, analysts and those with technology backgrounds bodes well for those interested in careers in health data science.

Pursuing a Degree with a Focus in Health Data Science

Because the skill set is so specialized, the right education is essential for a career in health data science. Degrees in health informatics, data analytics and data analytics are common, as well as degrees in quantitative fields, such as computer science and statistics. These degrees provide aspiring health data scientists the knowledge and skills they need to advance their careers.

At a minimum, most health data scientist positions require a bachelor’s degree, though many also require an advanced degree, particularly for senior positions. According to JAMIA analysis, 38.4% of job postings required a master’s degree, and 9.1% required a PhD or an MD.

Many health data science positions also require several years of industry experience. The JAMIA analysis found that roughly half the postings called for five to seven years of experience. In some cases, a master’s or other advanced degree can be substituted for professional experience.

Given the competitive nature of the health data science job market, pursuing a graduate degree, such as a master’s in health informatics with a focus on health data science, can give students an edge when they begin their job search. Health informatics and health data science graduate programs teach students the skills to excel as a health data scientist, health informaticist, health data analyst or similar position.

Health informatics programs offer courses in subjects such as healthcare data analysis, healthcare information systems, AI and applied statistics. A health data science concentration drills down into many of these concepts, often with a focus on developing firsthand experience working with healthcare data.

Using Data to Revolutionize Healthcare

Data science will continue to have a profound impact on the healthcare industry for years to come, driving pharmaceutical innovation, streamlining healthcare delivery and improving the overall quality of care. The University of Illinois Chicago’s online Master of Science in Health Informatics and its concentration in Health Data Science can set students on the path to a rewarding career. Electives such as Artificial Intelligence and Applied Statistics for Healthcare Data Analytics can help aspiring health data scientists achieve their professional goals.
The fields of healthcare and data science are expected to grow considerably, adding millions of new jobs, and individuals with a background in health informatics, data science and analytics will have an advantage in a highly competitive and lucrative industry.

Recommended Readings

Which Types of Data Are Valuable to Healthcare Organizations?

Machine Learning in Healthcare: Examples, Tips & Resources for Implementing into Your Care Practice

Essential Tools for Collecting Disease Data During a Pandemic


American Health Information Management Association, Certifications & Careers

American Health Information Management Association, “Data Analytics and Informatics Are Two Separate Disciplines (And Why This Matters to HIM)”

Centers for Disease Control and Prevention, Chronic Diseases in America

CNN, “Thousands Could Be Saved If Vaccines Are Targeted to Specific Vulnerable Populations, Analysis Suggests”

Dataconomy, “Transforming Patient Health: The Power of Data Science in Pharmaceuticals”

European Pharmaceutical Manufacturer, “How Data Science is Driving Genomics and Pharma”

Forbes, “Moneyball Medicine: Data-Driven Healthcare Transformation”

Harvard Business Review, “10 Promising AI Applications in Health Care”

Health Catalyst, “Data Science for Healthcare: What Today’s Leaders Must Know”

Healthline, “Why Getting Medically Misdiagnosed Is More Common Than You May Think”

JAMIA, “Healthcare Data Scientist Qualifications, Skills, and Job Focus: A Content Analysis of Job Postings”

LinkedIn, “Healthcare Informatics vs. Data Analytics: No, They Aren’t the Same, and That’s About All Everyone Can Agree On.”

LinkedIn, Jobs on the Rise in 2021

LinkedIn, 2020 Emerging Jobs Report

Novartis, A Powerful Pairing

ScienceDirect, “Transforming Healthcare with Big Data Analytics and Artificial Intelligence: A Systematic Mapping Study”

U.S. Bureau of Labor Statistics, Computer and Information Research Scientists

U.S. Bureau of Labor Statistics, Data Scientists and Mathematical Science Occupations, All Other

U.S. Bureau of Labor Statistics, Healthcare Occupations

U.S. Bureau of Labor Statistics, Medical Records and Health Information Technicians

Vox, “The Fax of Life”

World Economic Forum, “Why This Is Digital Healthcare’s Moment”