Hridesh Rajan, interim chair of the Department of Computer Science and Kingland Professor of Data Analytics, was recently awarded a $1.5 million grant from the National Science Foundation (NSF). The funds will support basic research and enable Rajan and his team to develop and eventually establish the Dependable Data-Driven Discovery Institute (D4) at Iowa State University.
Rajan’s $1.5 million grant was awarded by the NSF’s highly competitive TRIPODS (Transdisciplinary Research in Principles of Data Science) Program.
“The D4 Institute at Iowa State brings a unique focus on all aspects of end-to-end dependability in data science to the program’s portfolio,” said Tracy Kimbrel, program director at the National Science Foundation. “This project has the potential to increase confidence and reduce risk in the data-driven decision-making that increasingly impacts the lives of our citizens on a daily basis.”
Harnessing the talent and expertise of several Iowa State faculty members from computer science, statistics, engineering and mathematics, the D4 Institute will conduct foundational research on data science life cycles. The co-principal investigators are Daniel Nettleton, department chair and Distinguished Professor of Statistics; Eric Weber, professor of mathematics; Pavan Aduri, professor of computer science; and Chinmay Hegde, assistant professor of electrical and computer engineering.
“Data permeates nearly every corner of our lives, and every day millions of data-driven decisions happen across the globe,” Rajan said. “Our D4 researchers will take a comprehensive and holistic approach to studying data-driven decisions. By researching the entire data science life cycle we hope to improve the reliability and trustworthiness of data-driven outcomes.”
Rajan notes that much of the current data science research focuses on narrower issues, such as the credibility of the data or fairness issues involving machine-learning algorithms.
“The research conducted at the D4 Institute significantly broadens the scope of data-science research,” he said. “By studying whole data systems, as well as the software and hardware used in these processes, we aim to improve the entire data science life cycle.”
In addition to supporting D4’s research efforts, the NSF funds will allow Rajan and his team to hire postdocs, graduate students and undergraduates. The added capabilities will allow the D4 Institute to expand its research efforts and gather additional data.
The funds will also be used to develop and run workshops that inform students, faculty and researchers about the importance of studying the data science life cycle. These short courses will be rolled out during the summer and fall.
“We want to train and grow the next generation of data-science researchers,” Rajan said. “Our goal is to establish Iowa State University as the epicenter of research on dependable data-driven discoveries. This generous grant from the NSF will surely help us to reach those goals.”
The collaborative efforts of this research team could boost the trustworthiness of today’s data-driven decisions.
“Our core research team is part of a larger group that has been working together since 2017,” said Rajan. “They’ve achieved major accomplishments, such as spearheading the Midwest Big Data Summer School, establishing a data science major at ISU and developing cutting-edge research teams. We look forward to building on the success of this close-knit team.”
Why is it important to study the data science life cycle?
Data-driven decisions are judgments, conclusions and results that are based on hard data rather than on personal experience, observation or intuition. The data science life cycle is the process by which key decision makers acquire, manage and analyze the data they use to make decisions and recommendations.
Those decisions can have profound and life-altering impacts on people’s lives, careers and futures, and even on whole societies.
“Whether a person is admitted into a college, approved for a mortgage—or serves a shorter or longer prison sentence—can be largely determined by data,” Rajan said. “It’s no longer enough to only ask ‘Is this data correct?’ It’s now critical to fully examine the systems that gather, store and extract the knowledge.”
Finding solutions in a data-driven society
According to Rajan, the explosion of readily available data has ushered in a new generation of concerns and questions about the consequences of data-driven decisions.
“Robust analysis and research are critical in these areas because unreliable discoveries and decisions can have far-reaching and catastrophic consequences on society, national defense and individuals,” Rajan said.
The NSF award will fund the team’s research through September 2022.
“We are honored that the National Science Foundation is supporting and validating our work through this generous funding,” Rajan said.