The Health Analytics team is analyzing 500 billion Medicaid claims stored right at Georgia Tech.
In a nondescript room on Fifth Street, billions of files containing millions of Americans’ personal health information are stored. The few gatekeepers allowed into this highly secured room are the members of Georgia Tech’s Health Analytics team, a group of students and faculty members who use data science methodologies to analyze and interpret all of this information.
So what’s in the files? Over time, the Health Analytics group has bought 500 billion Medicaid claims from the Centers for Medicare & Medicaid Services (CMS), a federal agency. The claims date back nearly 10 years, covering 2005 to 2012. For two of those years, Tech has all of the claims filed in the entire United States; for the rest of the period, it has the claims of 14 states located mostly in the Southeast.
The claims allow the team to study how people use the health care system: how many times they went to the doctor, if and when they filled a prescription, if they went to the specialist they were referred to, or how many times they’ve been to the emergency room or otherwise hospitalized. The data also includes what costs were associated with each doctor, specialist, or hospital visit, and what drives those costs, so the team can study how to increase cost efficiency on a systemic level. The CMS data project is a collaborative effort with Georgia Tech’s Institute for People and Technology (IPaT), the GT Pediatric Technology Center, and the Georgia Department of Public Health. IPaT supports the purchase, curation, and security of this sensitive data, as well as access to it, on behalf of the GT research community.
Significant conclusions have already been reached from the team’s analysis. One recent project focused on dental health problems, what Nicoleta Serban, an associate professor in the School of Industrial and Systems Engineering (ISyE) and co-leader of the team, calls the most chronic condition for children. The team was able to assess how preventive dental care affects the health of children on Medicaid down the road. They found that increasing preventive dental care for even just 10 or 20 percent of Georgia’s children would save the state millions of dollars.
“It’s because we have that amount of data we could do it! Otherwise we could not,” says Serban, stressing the importance of being able to track changes in health outcomes over several years.
Richard Zheng, a Ph.D. student who has been working with this data since 2013, sees the team’s role as “bridging the gap” between the healthcare field and the statistics, operations research, and other analytical methods needed to “solve real healthcare problems.”
Another study, on childhood asthma, revealed that Georgia has the highest percentage of kids who go to the emergency room or are hospitalized for reasons related to asthma. By looking at adherence to guidelines on how best to care for childhood asthma, the most common respiratory condition among kids, the team hopes to find ways to encourage better treatment and access to primary care providers while reducing expensive emergency room visits.
The team recognizes the potential for policy advocacy using this data. Serban says she has started working closely with Georgia’s Department of Public Health, but there are countless ways the claims data could be used to better inform decision-making in public health.
“Health care is a field with a huge amount of data,” Zheng says. “Those findings can really help make better decisions and inferences.”
Serban says that so far, state agencies have been responsive to the hard numbers she and her team have been able to provide in terms of Medicaid patients’ access to care (or lack thereof) and — especially — ways to save costs.
“[The data] allows us to compare the expenditures per patient in different geographical areas,” says Julie Swann, ISyE professor and co-leader of the team. That kind of comparison allows decision makers to see how others are doing on health outcomes and what they’re spending to achieve them.
“You can compare Georgia to North Carolina to Tennessee…and so on,” Swann says.
And the data allows much more granular analysis than that. The files contain information on individuals: identifying information that is highly sensitive and thus highly confidential.
That’s important because studying public health outcomes on a county level, as is often done with census data, leaves much room for error.
“Think about Fulton [County]. There are hundreds of census tracts…” explains Serban. “If you take Edgewood versus Lake Claire, they’re very different communities and they have different behaviors.” When trying to target Fulton County generally, she says, one wouldn’t know which area to target for which needs — they vary too greatly within the county as a whole.
While other academic institutions have some CMS data, Georgia Tech’s team has unique access and the ability to use the data from these billions of claims effectively. That’s in large part due to the school’s top-rated computer science and industrial engineering schools and the top-tier students with that sort of expertise.
“In other places, IT does the queries and the doctors make requests [for data analysis],” says Serban. But at Tech, Serban and Swann take on roles as public health scholars, physicians, policy advocates, and data scientists when working with the claims data. Both professors are trained statisticians.
Though the Health Analytics team bought the data from CMS at the beginning of 2011, it took members almost a year to build the database they needed to actually access it. It’s this infrastructure (the physical storage and property, the security needed to go with it, the skills to build software to query the data, and the manpower to do that work and keep doing it) that makes working with Georgia Tech’s data set so valuable.
So far the team has had some meaningful collaborations with Emory University, the Centers for Disease Control and Prevention, and — most closely — Children’s Healthcare of Atlanta, which has played a huge role supporting the team’s work.
But with billions of claims and the seemingly infinite angles they offer to examine Medicaid and the U.S. health care system at large, there is much, much more to be done. And plenty of opportunities for collaboration.
The Health Analytics team is eager to secure more partnerships that would allow them to use their analytical expertise to better the health care system in some way.
“We want to go beyond Georgia — we want to go national,” says Serban.