[Seminar] Compiler Directed Lightweight Soft Error Resilience
Computer Science at Virginia Tech
■ Host: Prof. Jaejin Lee (이재진) (x1863, 02-880-1863)
In this talk, I will present Clover, a compiler-directed scheme for lightweight soft error detection and recovery. The compiler generates soft-error-tolerant code based on idempotent processing, without explicit checkpointing. During program execution, Clover relies on a small number of acoustic wave detectors deployed in the processor to identify soft errors by sensing the wave produced by a particle strike. To cope with detected unrecoverable errors (DUEs) caused by the sensing latency of error detection, Clover leverages a novel selective instruction duplication technique called tail-DMR (dual modular redundancy). Once a soft error is detected by either the sensor or tail-DMR, Clover handles it in the same way as an exception. To recover from the error, Clover simply redirects program control to the beginning of the code region in which the error was detected. Experimental results demonstrate that the average runtime overhead is only 26%, a 75% reduction compared to that of the state-of-the-art soft error resilience technique.
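The recovery idea in the abstract, restarting an idempotent region from its entry instead of restoring a checkpoint, can be illustrated with a toy software model. This is only a sketch under stated assumptions, not Clover's compiler or hardware mechanism: the names `SoftErrorDetected`, `run_region`, and `detect_error` are hypothetical, and the detector is simulated by a plain callback.

```python
class SoftErrorDetected(Exception):
    """Hypothetical signal standing in for a sensor or tail-DMR detection."""


def run_region(region, inputs, detect_error, max_retries=3):
    """Run an idempotent region; on a detected error, restart it from its entry.

    `region` must not overwrite its own inputs, so re-running it from the
    start yields the same result and no checkpoint has to be restored.
    """
    for attempt in range(max_retries + 1):
        try:
            result = region(inputs)
            # Simulated detector (assumption): reports whether this attempt
            # was hit by a soft error.
            if detect_error(attempt):
                raise SoftErrorDetected()
            return result, attempt
        except SoftErrorDetected:
            # Recovery: jump back to the region entry and re-execute.
            continue
    raise RuntimeError("region failed on every attempt")


def sum_of_squares(xs):
    # Idempotent: reads xs, writes only a fresh local value.
    return sum(x * x for x in xs)


# Pretend the first attempt is struck by a particle and the retry is clean.
result, retries = run_region(sum_of_squares, (1, 2, 3), lambda attempt: attempt == 0)
```

In this model the region is safe to re-execute only because it never modifies `inputs`; a region that updated its inputs in place would need a checkpoint, which is the cost the idempotence-based approach avoids.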
Changhee Jung is an Assistant Professor in Computer Science at Virginia Tech. His research interests include compilers, computer architectures, software engineering, and dependable systems. His work has appeared in top conferences such as MICRO, PLDI, ICSE, ASPLOS, PPoPP, and SC (Best Student Paper Finalist, 2016). He received a Google Faculty Research Award (2015) and the Silver Prize in the SAMSUNG HumanTech Thesis Competition (2005). Changhee received his PhD in Computer Science from Georgia Tech in 2013. During the three summers between 2010 and 2012, he worked as a software engineering intern with the compiler optimization team at Google. From 2005 to 2008, he was a member of the research staff at ETRI (Electronics and Telecommunications Research Institute), Korea.