
The name of the game in computers
Two scientists at UD are analyzing the way computer programmers select names for software components, with an eye to reducing the time spent on system maintenance.
Lori L. Pollock and Vijay K. Shanker, both professors in the Department of Computer and Information Sciences, have received a $400,000 National Science Foundation grant for the research project. They hope to help programmers maintain large and complex systems by mining software for important data that often goes overlooked—the simple words used in naming components.
The project aims to improve analytical tools by analyzing how programmers have named the various components of their software, thereby freeing programmers to spend less time troubleshooting and more time writing new software.
The research team also includes both graduate and undergraduate UD students.
“Today’s software is so huge and complex that programmers spend most of their time maintaining existing software as opposed to writing new software,” Pollock says. “We hope to provide tools for software evaluation and maintenance that will make their job easier. That, in turn, will make software less expensive and more reliable for the consumer.”
Shanker says the researchers are processing the natural language the programmers have used to name the software components and “have found a lot of useful information in the way programmers select names.”
The project takes on added importance, he says, given the movement to open-source programs, which must be more readable so that other users can easily modify the programs.
Pollock says the researchers have had much positive reaction to their work during recent workshop presentations, adding that the future is unlimited. “This just keeps exploding as we constantly find new applications for the extracted information,” she says.
The research centers on improving the quality of various tools that developers use throughout the lifetime of a large software system. The team is particularly interested in helping programmers maintain software that already exists because a considerable amount of time is spent maintaining existing systems, Pollock says.
In fact, it has been estimated that because of the size and complexity of modern software and increased code reuse, between 60 percent and 90 percent of programming resources are devoted to modifying applications to meet new requirements or to fix discovered bugs.
To make modifications or fix problems, programmers first must identify the concept that must be changed. Then they must locate and comprehend the code that implements it before carefully making the change.
Software engineers increasingly rely on available software tools to automate maintenance tasks as much as possible. However, Pollock says that despite all the available automated support, recent studies have shown that more development time is still spent reading, locating and comprehending code than actually writing code.
She and Shanker believe that software maintenance tools can be significantly improved by adapting natural language processing to source code analysis.
The researchers say the approach is novel in that they are analyzing how programmers have named various components of their software. For example, the appearance of words such as “store” and “write” in component names indicates “saving.” While these words are not synonymous in normal English, they are used interchangeably in programs.
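As a rough illustration of the idea, and not the researchers' actual tools, the short Python sketch below splits component names into words and then finds every name that expresses the "saving" concept, whether the programmer wrote "save," "store" or "write." The synonym table, the function names and the sample component names are all hypothetical, invented here for the example.

```python
import re

# Hypothetical synonym table: words programmers use interchangeably
# for one concept, even though they differ in everyday English.
CONCEPT_SYNONYMS = {
    "save": {"save", "store", "write"},
}


def split_identifier(name):
    """Split a camelCase or snake_case name into lowercase words."""
    words = []
    # First break on underscores and other non-alphanumeric characters.
    for part in re.split(r"[_\W]+", name):
        # Then break on case changes: acronyms, capitalized words, digits.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [word.lower() for word in words]


def search_by_concept(concept, identifiers):
    """Return the names containing any word that stands for the concept."""
    synonyms = CONCEPT_SYNONYMS.get(concept, {concept})
    return [
        name for name in identifiers
        if synonyms & set(split_identifier(name))
    ]


if __name__ == "__main__":
    # Made-up component names, for illustration only.
    names = ["writeToDisk", "store_user_record", "renderPage", "saveSettings"]
    print(search_by_concept("save", names))
```

A plain text search for "save" would find only the last of these names; looking up the concept also finds the components named with "write" and "store."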
By adapting techniques for analyzing natural language, such as English, and integrating them into software tools, the researchers are able to improve search and program navigation tools.
Pollock and Shanker evaluate their newly developed strategies by designing and conducting experimental studies of how software developers use the tools. One evaluation involves Quantum Leap Innovations, a firm that was founded by UD graduates and is headquartered in the Delaware Technology Park near the University campus.
The professors say that although they have been working in the same department for 15 years, this is the first time they have collaborated on research.
The project was spurred by a graduate student, David Shepherd, who has since earned his doctorate and accepted a postdoctoral position at the University of British Columbia. Shepherd was studying software engineering tools and, after taking a course with Shanker, realized Shanker’s work in natural language processing could be coupled with Pollock’s in optimization and automatic program analysis.
—Neil Thomas, AS ’76