Data analyst training options
Illustration by Cindy Dolan January 18, 2022
Two UD PCS programs teach in-demand skills
While data is easier than ever to collect and store, knowing what to do with it and how to analyze it is a challenge, especially for those without the proper training. From boosting customer acquisition and retention, managing risk, and identifying the source of product performance problems, to predicting competitive bond-buying bids, putting together a winning baseball roster, and countless other functions, skilled professionals possessing the unique combination of computational, analytical and communication proficiencies necessary to discover data-supported solutions to important business questions are invaluable to an organization’s success.
Tasked with examining large quantities of data to identify patterns, make projections and utilize relevant information to guide business decisions, data analysts are some of the most sought-after professionals in the world. With a strong demand and a limited supply of people who can fill the role, the U.S. Bureau of Labor Statistics predicts 20% growth in data analyst jobs from 2018 to 2028, which is much faster than the average for all occupations.
Data analysts are employed in numerous industries, including information technology, healthcare, finance, insurance and professional services, and a variety of educational paths can be taken to hone one’s data analyst skills. Two of these routes are provided by the University of Delaware Division of Professional and Continuing Studies (UD PCS) via its Predictive Analytics and Data Mining Certificate and Foundations of R for Data Analysis Certificate programs, which are both being offered live-online this spring. Depending on a person’s interests and background, either course, or both, could be a good fit.
Predicting outputs as a function of inputs
Taught by a pair of retired DuPont employees, Steven P. Bailey and Aaron J. Owens, the Predictive Analytics and Data Mining Certificate program addresses how to define the goals of a project, identify or collect appropriate data, analyze the data to determine a solution, and communicate the results effectively to others. Though there are no formal prerequisites for Predictive Analytics and Data Mining, a prior college-level statistics course and/or working knowledge of statistics is required, and previous experience with computer-assisted data management is helpful.
“The more data you have, the more you are going to get a lot of statistically significant information that may not be of practical importance or good at predicting the future,” said Bailey. “If suitably organized into a spreadsheet or a worksheet, we can use a number of techniques all focused on coming up with models that predict one or more outputs as a function of our inputs.”
Whereas predictive analytics refers to the use of statistics and modeling techniques to make predictions about future outcomes and performance, data mining is a process used to turn raw data into useful information. Bailey, who worked with DuPont’s Applied Statistics Group, handles the predictive analytics portion of the program before turning the class over to Owens, a former senior research fellow with DuPont’s Decision Analytics Group, for the data-mining segment.
“One difference between typical predictive analytics, which is based on statistics and using a whole data set, and the data-mining segment is that in the latter we use cross-validation all the time by splitting the data and using some to model and some to predict,” said Owens. “The whole purpose of what we’re trying to do is get a model that is exactly the appropriate complexity for the data set that’s sitting there. Steve and I complement each other with his deep statistical knowledge supported by my experience doing data mining.”
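The splitting approach Owens describes can be sketched in a few lines of code. The Python example below is only an illustration under stated assumptions, not material from the course, which uses JMP Pro: the data set is synthetic, the three candidate models (a constant, a straight line and a nearest-neighbor lookup) are hypothetical stand-ins for models of increasing complexity, and the function names are made up for this sketch. Each model is fit on part of the data and scored on the part that was held out.

```python
import random
import statistics


def fit_constant(xs, ys):
    """Simplest candidate: always predict the mean of the training outputs."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares straight line, y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """Most flexible candidate: return the output of the closest training input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cross_validated_mse(fit, xs, ys, k=5, seed=0):
    """Average mean squared error on held-out data over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Use some of the data to model ...
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # ... and the rest to check how well the model predicts.
        errors = [(ys[i] - model(xs[i])) ** 2 for i in held_out]
        fold_errors.append(statistics.fmean(errors))
    return statistics.fmean(fold_errors)


# Synthetic data (an assumption for illustration): a linear trend plus noise.
rng = random.Random(42)
xs = [i / 2 for i in range(40)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1.0) for x in xs]

candidates = {"constant": fit_constant, "line": fit_line, "nearest": fit_nearest}
scores = {name: cross_validated_mse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: held-out MSE = {score:.2f}")
print("lowest held-out error:", best)
```

The candidate with the lowest held-out error is the one whose complexity best suits the data at hand: a model that is too simple misses the trend, while one that is too flexible tends to chase noise in the portion it was fit on.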
Use of statistical analysis program included
JMP Pro is the primary analytics software used throughout the Predictive Analytics and Data Mining Certificate program. The menu-driven commercial software package, which students have free use of for the remainder of the academic year in which they take the class, does not require programming skills and allows users to perform predictive modeling and cross-validation techniques as well as other actions. Bailey and Owens believe that anyone, in any industry, who deals with large amounts of data and would like to use historical data to predict future outcomes can benefit from their course.
“Our goal is for the students to develop an appreciation of the tools we give them, understand what each of them can do, and at least in JMP Pro, be able to execute in there,” said Bailey.
Perform data analysis with R
Rather than using a software tool like JMP Pro — or in addition to it — some data analysts employ programming languages to perform their tasks. Engineered to create a standard form of commands that can be interpreted into code understood by a machine, programming languages are regularly used in numerous sectors, including research and academics, information technology, finance, e-commerce, social media, banking, healthcare, manufacturing, and government. While there are a variety of programming languages to choose from, UD’s Foundations of R for Data Analysis Certificate instructor Ryan Harrington’s vehicle of choice is R, a free and open-source statistical language that enables users to extract, clean, visualize and model data.
Harrington, previously the lead data scientist at CompassRed Data Labs and a high school math teacher, is currently the director of strategy and operations for the Delaware Data Innovation Lab at Tech Impact, where he develops data science projects. He introduced the University of Delaware Foundations of R for Data Analysis Certificate program last fall. The foundational-level class welcomes anyone to enroll, and no prior programming experience is required. The course is designed to help students advance their technical skill set, enable their statistical work, automate their analyses, produce compelling data visualizations, work with larger datasets, create reproducible reports and break into the growing data analysis industry.
“R was developed for statisticians by statisticians and has become an extremely popular language, especially in the last 10 years or so,” said Harrington. “R was my first language for data analytics, and I fell in love with it because it makes hard tasks simpler and solves problems for me every day in doing my job.”
More than a programming course
Harrington uses R to support learning the basics for an aspiring data analyst or data scientist, focusing on the mechanics of programming with it and not on statistical modeling techniques. He said he is teaching a data analytics course supported by computer programming rather than what he would call a true programming course. Though programming is the means to the end, his goal is to train the students to be capable data scientists or data analysts.
“I’m not just teaching the language of R; I’m teaching the mindset for how a data analyst would go about doing their job,” said Harrington. “The Predictive Analytics and Data Mining course teaches the methods, and Foundations of R for Data Analysis teaches about a programming language that can be used to perform the methods.”
Register now for spring programs
The 15-week Predictive Analytics and Data Mining Certificate program begins Feb. 7, and the eight-week Foundations of R for Data Analysis program begins March 4. Both courses are held on Mondays, 6-9:15 p.m. Discounts are available, and a payment plan is offered. Prospective students looking for guidance on which course is best for them or any other information are encouraged to visit pcs.udel.edu/data and pcs.udel.edu/foundations-of-r, email firstname.lastname@example.org or call 302-831-7600.