A career path paved with data
Photos by Kathy F. Atkinson
May 01, 2023
New UD training program prepares data scientists
What’s among the fastest-growing careers in America? Data scientist. It’s a profession that pays well, too — with an average base salary of $130,556 per year, according to Indeed, the employment website.
Data scientists are becoming essential to more and more areas of our daily lives, from human health to precision agriculture.
Personalized medicine has become real, thanks to breakthroughs in data science, high-performance computing, gene sequencing, microbiology, biomedical engineering and drug development. Some cancers and rare diseases are now being treated in this way — where drug therapies are customized to a person’s unique genetic profile versus the traditional “one size fits all” approach — and many other diseases are on the target list.
Data science is revolutionizing agriculture, too. By analyzing information from devices ranging from robots to drones that track parameters such as soil temperature, humidity and light, it helps farmers sustainably produce more food from the same amount of acreage, ushering in a new age of “smart farming.”
Such advances demand a new generation of experts capable of making meaning out of massive data sets using cutting-edge tools, and the University of Delaware has launched a new graduate training program to help prepare them. UD’s Computational Biology, Bioinformatics and Biomedical Data Science Program (CBB) is funded by a $1.5 million grant from the National Institutes of Health.
“We have an unprecedented opportunity here at the University of Delaware for graduate students who want to work at the frontier of biomedical and computational sciences,” said Shawn Polson, associate professor of computer and information sciences, who is director of the new program.
“Bioinformatics data science has exploded in the past few years,” Polson said. “Our training program for doctoral students leverages the expertise of leading faculty and the unique resources we have, from UD’s Data Science Institute and Center for Bioinformatics and Computational Biology to the Biomix and DARWIN computational clusters. Every student will also do an internship valuable to their future careers.”
Jonathan Hicks, a doctoral student in bioinformatics data science from Columbia, Maryland, is in the program’s first cohort, which started this past fall. He’s aiming to become a machine-learning engineer in medical sciences. In machine learning, computers are programmed to recognize certain data patterns, learning as they go, without explicit instructions.
“This program caught my attention in its ability to develop a strong career with a backing from the NIH,” Hicks said.
Currently, he is working with Dr. Robert Akins at Nemours Children’s Health in Wilmington, Delaware, to build a blood test that can diagnose cerebral palsy at approximately six months of age, compared with the current standard of around 19 months.
“There are changes to DNA that don’t affect the actual sequence of DNA, but affect whether a particular piece of DNA gets used. This is called methylation,” Hicks said. “I am finding patterns in this methylation that have diagnostic capability in cerebral palsy, and I am building a tool that uses artificial intelligence to group these patterns and can predict whether or not an individual has cerebral palsy.”
Besides expanding students’ skills in areas like artificial intelligence and machine learning, the program offers professional development courses to prepare students as research leaders.
“We developed the curriculum to not only expand students’ usage of cutting-edge data techniques, but we also want students to learn how to lead, to collaborate and to do team science,” Polson said.
An innovative, inclusive and interdisciplinary program
Polson’s co-investigators include Abhyudai Singh, associate professor of electrical and computer engineering; Karen Hoober, associate director of the Center for Bioinformatics and Computational Biology; and Cathy Wu, Edward G. Jefferson Chair in Engineering and Computer Science. Wu also is director of both the Data Science Institute and the Center for Bioinformatics and Computational Biology.
“The CBB T32 training program is designed with three key attributes — the ‘three i’s’ — for innovative, inclusive and interdisciplinary,” Wu said. “We are aiming to provide innovative training activities to facilitate inclusive learning in an interdisciplinary team science environment.”
Thirty faculty from 10 departments at UD, along with affiliates from Delaware State University, are involved in training and mentoring the students, teaching them how to use mathematical, computational and data science approaches to understand biological networks at multiple scales, from the sequence and structure of molecules to the physiology and function of cells and their interactions with different environments.
In the program’s first year, students take courses in quantitative and computational methods, technology, experimental design and data interpretation; learn about responsible conduct, reproducibility, ethics and diversity; and participate in an experiential learning course keying on teamwork, communication and innovation. Year 2 culminates in a 10-week summer internship designed in collaboration with academic, industry and government partners.
The program aims to train 30 doctoral students in the next five years, Polson said, and it is committed to increasing the participation of underrepresented groups.
“There are significant demographic disparities in graduate training of computational and data scientists, including underrepresentation of women and minoritized groups,” said Erin Sparks, assistant professor of plant and soil sciences and a member of the CBB Executive Committee. “This NIH investment will provide new resources to address these inequities and broaden the participation of underrepresented individuals.”
Among its inclusion efforts, the program is strengthening ties with Delaware State University, where CBB Executive Committee member Hacene Boukari, professor of physics, is working to extend the integration of data science into the undergraduate curriculum and research and to help build a bridge to graduate studies for DSU students.
Bioinformatics careers in sight
Rachel Keown, a doctoral student from Parkesburg, Pennsylvania, under the mentorship of Polson, has her sights set on a career in the biotechnology industry.
“The interdisciplinary nature of this program attracted me to apply,” said Keown, whose research focuses on the genetics and protein chemistry of bacteriophage — viruses that infect bacteria — and how these organisms contribute to processes such as biogeochemical cycling in the ocean.
Working with researchers and students with different skills and perspectives, who often approach research in fundamentally different ways, has been a big plus for her.
“The collaborative nature of the coursework and department seminars have helped to broaden my understanding and perspective of the field of bioinformatics,” she said. “The majority of current molecular biology and microbiology research generates big data that require bioinformatics skills to analyze, and my education in this program will allow me to meet that demand upon graduation.”
Yasmin Moghadamnia has traveled some distance to pursue her doctoral degree. Born and raised in Iran, she moved to the United States in 2017 and became a permanent U.S. resident in 2022. She completed her undergraduate studies in physics and a master’s degree in condensed matter physics in Iran and another master’s degree in biophysics at Johns Hopkins University before enrolling at UD, where she’s now a doctoral candidate in bioinformatics data science under the research mentorship of CBB Executive Committee member Ryan Zurakowski and Jason Gleghorn, both associate professors in the Department of Biomedical Engineering.
She’s currently working on HIV spatial and pharmacokinetic modeling and studying antiretroviral drug distribution in certain tissues such as the lymph node.
“My original plan has always been to become an academic researcher and professor,” she said. “I believe this fellowship can open up opportunities for me to be exposed to more fields and types of research, as well as create a strong network for my career.”
With the addition of the CBB program, UD now offers three NIH-sponsored T32 training programs for predoctoral students. The Chemistry-Biology Interface (CBI) program prepares students to apply the atomistic and mechanistic approaches of chemistry to important biological and biomedical problems, and the Physical Therapy and Rehabilitation Research program provides students with multidisciplinary training for research-intensive positions as physical therapists and research scientists, preparing them to solve problems facing individuals with disabilities.
The CBB program is supported by NIH’s National Institute of General Medical Sciences (T32 GM142603) with additional institutional support provided by UD’s Graduate College, Data Science Institute, College of Engineering, and College of Arts and Sciences. The program also has leveraged resources provided by past investments by NIH and other agencies, including Delaware INBRE (NIH-NIGMS P20 GM103446 and State of Delaware), BioStore (NIH S10 OD028725) and the DARWIN Computational Cluster (National Science Foundation #1919839).