Maximizing Level of Confidence for Non-Equidistant Checkpointing
Authors
Summary, in English
To combat the increasing soft error rates in recent semiconductor technologies, it is important to employ fault tolerance techniques. While these techniques enable correct operation, they introduce a time overhead, which may cause deadline violations in real-time systems (RTS). Since correct operation of an RTS is defined as producing correct outputs while satisfying time constraints (deadlines), it is important to optimize the fault tolerance techniques so that the probability of meeting deadlines is maximized. This probability can be quantified by the Level of Confidence (LoC), i.e. the probability of meeting the deadline. Previous studies have focused on evaluating the LoC for Roll-back Recovery with Checkpointing (RRC) with an equidistant distribution of the checkpoints. However, no studies have addressed the problem of evaluating the LoC for a non-equidistant distribution of the checkpoints. In this work, we provide an expression to evaluate the LoC for a non-equidistant checkpointing scheme, and propose a method, Clustered Checkpointing, to distribute a given number of checkpoints with the goal of maximizing the LoC. The results show that the LoC can be improved when a non-equidistant checkpointing scheme is used.
Department/s
Publishing year
2016
Language
English
Document type
Conference paper
Subject
- Electrical Engineering, Electronic Engineering, Information Engineering
Keywords
- soft errors
- fault tolerance
- checkpointing
- real-time systems
- reliability analysis
Conference name
21st Asia and South Pacific Design Automation Conference ASP-DAC
Conference date
2016-01-25
Conference place
Macao
Status
In press
Research group
- Digital ASIC