Chengmo Yang, assistant professor of electrical and computer engineering, has earned a National Science Foundation Career Award to study fault rates and boost reliability in computer systems.

NSF award

UD's Yang receives prestigious National Science Foundation Career Award

10:52 a.m., March 25, 2013--The unending quest for new electronics applications and greater computational power is pushing researchers to produce computer chips that perform better and consume less power. 

However, as computer chips shrink — and more devices are placed on each chip — they become increasingly unpredictable.

These reliability issues can be “show stoppers” for today’s computer systems, limiting or hampering a system’s ability to run lengthy applications. 

The University of Delaware’s Chengmo Yang, assistant professor of electrical and computer engineering, recently received a prestigious five-year, $449,541 Faculty Early Career Development Award from the National Science Foundation (NSF) to develop resiliency solutions that can help computer systems overcome progressively diverse types of hardware failures.

Awarded through the Division of Computer and Network Systems, the new funding will enable Yang to design and evaluate new architectural and system-level solutions to boost resiliency in computer systems and to develop new algorithms aimed at simultaneously optimizing a computer’s performance, energy and reliability.

According to Yang, there are three types of hardware faults that typically occur in computers: permanent (where a device breaks or can no longer be programmed); transient (which are random faults or errors); and intermittent (problems related to execution conditions like voltage and temperature).

“Future computer systems are expected to experience continuous faults, across all levels from hardware to software applications, raising critical concerns about the impact of intermittent faults that occur frequently and irregularly over nanosecond to second time scales,” explains Yang.

Previous approaches to address these problems have included adding system redundancies, such as having the computer perform a computation twice and comparing results to ensure accuracy. 

“Doing the computation twice means double the energy expenditure,” explains Yang, who instead proposes adapting the execution conditions to improve efficiency while also controlling costs.
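The redundancy approach described above can be pictured with a small sketch. This is only an illustration of "compute twice and compare," not Yang's method; the names `run_with_duplication`, `add_numbers` and `calls` are hypothetical, and the call counter simply stands in for the doubled energy cost.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_duplication(compute: Callable[[], T]) -> T:
    """Run `compute` twice and accept the result only if both runs agree."""
    first = compute()
    second = compute()  # the second run doubles the work (and energy) spent
    if first != second:
        raise RuntimeError("results disagree: a fault occurred in one run")
    return first


calls = 0  # hypothetical counter standing in for energy spent


def add_numbers() -> int:
    """A toy computation that records how many times it was executed."""
    global calls
    calls += 1
    return sum(range(1, 101))


result = run_with_duplication(add_numbers)
```

Here a single logical computation costs two executions, which is the overhead Yang's adaptive approach aims to avoid.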

Her approach includes creating a feedback loop within the system to improve the devices’ reliability over time through adaptive “work-arounds” for three tightly connected components:

  • Detection and checkpointing: enabling computers to repeatedly adjust approaches to tasks based on a system’s reliability;
  • Error recovery: enabling computers to re-execute commands following failures in a way that minimizes the chance of further problems; and
  • Resource management: enabling systems to monitor application and hardware reliability and quickly adapt scheduling decisions as needed.
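The first two components in the list above can be pictured with a basic checkpoint-and-retry loop. The sketch below is a simplified assumption, not the adaptive scheme Yang proposes: it saves a copy of the state before each step and, if the step fails, restores that copy and re-executes. All names (`run_with_checkpoints`, `flaky_double`, `max_retries`) are hypothetical, and a `RuntimeError` stands in for a detected hardware fault.

```python
import copy
from typing import Callable, Dict, List

State = Dict[str, int]


def run_with_checkpoints(
    steps: List[Callable[[State], None]], state: State, max_retries: int = 3
) -> State:
    """Run each step; on failure, roll back to the last checkpoint and retry."""
    for step in steps:
        checkpoint = copy.deepcopy(state)  # save state before the step
        for _attempt in range(max_retries + 1):
            try:
                step(state)
                break  # step succeeded, move on to the next one
            except RuntimeError:
                # Discard any partial update and restore the saved state.
                state.clear()
                state.update(copy.deepcopy(checkpoint))
        else:
            raise RuntimeError("step kept failing after all retries")
    return state


failures_left = 1  # the flaky step fails exactly once in this demo


def increment(state: State) -> None:
    state["total"] += 1


def flaky_double(state: State) -> None:
    """Doubles the total, but simulates one intermittent fault first."""
    global failures_left
    if failures_left > 0:
        failures_left -= 1
        state["total"] = -1  # simulate a corrupted partial update
        raise RuntimeError("simulated intermittent fault")
    state["total"] *= 2


final = run_with_checkpoints([increment, flaky_double], {"total": 0})
```

An adaptive version, as the article describes, would also vary how often it checkpoints depending on how reliable the system currently appears; this sketch checkpoints before every step for simplicity.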

By setting up systems that assign the most critical and vulnerable tasks to the computer’s most reliable cores, Yang believes she can help create computer systems that can quickly recover from unplanned or intermittent problems. 
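The idea of matching critical tasks to reliable cores can be sketched as a simple ranking. This is a hypothetical illustration rather than Yang's algorithm: the function `assign_tasks`, the task and core names, and the criticality and reliability scores are all made up for the example.

```python
from typing import Dict, List, Tuple


def assign_tasks(
    tasks: Dict[str, float], cores: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Pair tasks with cores: the most critical task gets the most reliable core.

    `tasks` maps task name -> criticality score (higher = more critical).
    `cores` maps core name -> estimated reliability (higher = more reliable).
    `cores` must not be empty.
    """
    ranked_tasks = sorted(tasks, key=lambda name: tasks[name], reverse=True)
    ranked_cores = sorted(cores, key=lambda name: cores[name], reverse=True)
    # If there are more tasks than cores, wrap around and reuse cores in order.
    return [
        (task, ranked_cores[i % len(ranked_cores)])
        for i, task in enumerate(ranked_tasks)
    ]


plan = assign_tasks(
    {"logging": 0.2, "flight_control": 0.9, "ui_refresh": 0.5},
    {"core0": 0.99, "core1": 0.90, "core2": 0.95},
)
```

In a real system the reliability estimates would come from runtime monitoring and would change over time, so the assignment would be recomputed as conditions shift.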

“Our approach reduces the need for devices and interconnects to be 100 percent correct in order to work, which will dramatically reduce associated manufacturing, verification and testing costs,” she says. 

Yang credits her NSF award selection in part to supportive colleagues such as Guang Gao, Distinguished Professor of Electrical and Computer Engineering, and her department chair, Kenneth Barner.

“Professors Gao and Barner, and others within the department, really take junior faculty under their wing and support them. My successful proposal is one example of this,” she says.

Yang joined UD in 2010 as an assistant professor of electrical and computer engineering. She earned her bachelor's degree in microelectronics from Peking University in Beijing, China, and her master's and doctoral degrees from the University of California, San Diego, in computer science and computer engineering, respectively.

Article by Karen B. Roberts

Photo by Ambre Alexander

News Media Contact

University of Delaware
Communications and Public Affairs
302-831-NEWS
publicaffairs@udel.edu

UDaily is produced by
Communications and Public Affairs

The Academy Building
105 East Main Street
University of Delaware
Newark, DE 19716 | USA
Phone: (302) 831-2792
email: publicaffairs@udel.edu
www.udel.edu/cpa