Energy-Efficient Fault Tolerance in Chip Multiprocessors Using Critical Value Forwarding
Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip multiprocessors (CMPs). In this paper we introduce a new energy-efficient fault-tolerant CMP architecture known as Redundant Execution using Critical Value Forwarding (RECVF). RECVF is based on two observations: (i) forwarding critical instruction results from the leading to the trailing core enables the latter to execute faster, and (ii) this speedup can be exploited to reduce energy consumption by operating the trailing core at a lower voltage-frequency level. Our evaluation shows that RECVF consumes 37% less energy than conventional dual modular redundant (DMR) execution of a program. It consumes only 1.26 times the energy of a non-fault-tolerant baseline and has a performance overhead of just 1.2%.
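The two energy figures above can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes both figures refer to total energy over the same workloads, and the variable names are hypothetical rather than taken from the paper. Under that assumption, the figures imply that conventional DMR costs about twice the energy of the non-fault-tolerant baseline, as expected when a program is executed twice at full voltage and frequency.

```python
# Figures quoted in the abstract (names are hypothetical).
recvf_vs_baseline = 1.26   # RECVF energy relative to the non-fault-tolerant baseline
recvf_saving_vs_dmr = 0.37 # RECVF uses 37% less energy than DMR

# E_RECVF = (1 - 0.37) * E_DMR  and  E_RECVF = 1.26 * E_baseline,
# so E_DMR / E_baseline = 1.26 / (1 - 0.37).
dmr_vs_baseline = recvf_vs_baseline / (1.0 - recvf_saving_vs_dmr)

# Energy overhead of each scheme relative to the baseline, in percent.
recvf_overhead_pct = (recvf_vs_baseline - 1.0) * 100.0
dmr_overhead_pct = (dmr_vs_baseline - 1.0) * 100.0

print(f"Implied DMR energy: {dmr_vs_baseline:.2f}x baseline")
print(f"RECVF overhead: {recvf_overhead_pct:.0f}%, DMR overhead: {dmr_overhead_pct:.0f}%")
```

Read this way, RECVF reduces the energy overhead of redundant execution from roughly 100% to 26% of the baseline.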
- Electrical Engineering, Electronic Engineering, Information Engineering
The 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'10), Chicago, Illinois, USA, June 28-July 1, 2010.
Fairmont Chicago, Millennium Park