Learning to Live with Errors: Architectural Solutions for Memory Reliability at Extreme Scaling

Speaker Prashant J. Nair
Organization Georgia Institute of Technology
Location 136 Monteith Research Center (MRC)
Start Date February 20, 2017 10:15 AM
End Date February 20, 2017 11:30 AM

 Faculty candidate seminar.

High-capacity, scalable memory systems play a vital role in enabling our desktop machines, smartphones, supercomputers, and pervasive technologies like the Internet of Things (IoT). Traditionally, the industry has relied on technology scaling to build high-capacity memory systems. Unfortunately, scaling memory cells into the sub-20nm regime tends to make them faulty. Even new memory technologies that offer higher densities display low reliability as they scale. To make matters worse, traditional solutions that improve reliability tend to incur impractical performance, power, and area overheads. My thesis showcases architectural techniques to enable reliable and scalable memory systems with negligible overheads. In this talk, I will discuss three low-cost architectural techniques to improve memory reliability.

First, I will talk about a cross-layer approach that tolerates scaling-related faults while incurring very low overheads. In this approach, faults are exposed to the architecture layer, and a simple error-correction code (ECC) together with a data management mechanism is used to mitigate them. Second, I will discuss a scheme called “XED” that can improve memory reliability by 172x by exposing ECC within each memory die while keeping the memory interface unchanged. Third, I will discuss how we can improve the reliability of new memory technologies like STT-RAM. I will show how we can use simple ECC and creatively apply skewed hashes with RAID-4 to improve reliability by 2000x. Overall, this talk will highlight how simple architectural techniques can enable scalable and reliable memories. I will also discuss how my ideas on memory reliability can be extended into the domains of IoT and Quantum Computing.

Prashant J. Nair is a Ph.D. candidate at the Georgia Institute of Technology. His Ph.D. advisor is Professor Moinuddin Qureshi. Prashant received his MS (2011-2013) from the Georgia Institute of Technology and his BE in Electronics Engineering (2005-2009) from the University of Mumbai. His research interests include reliability, performance, and power optimizations for memory systems, IoT devices, and Quantum Computers. Prashant has authored or co-authored 9 papers in top-tier conference venues such as ISCA, MICRO, HPCA, ASPLOS, and DSN. He also organized the “1st Memory Reliability Forum” (co-located with HPCA 2016) to highlight the importance of memory reliability to the wider architecture community. He served as the Submission Co-chair of MICRO 2015 and MICRO TOP-PICKS 2017. Prashant has also served on the External Review Committee of ISCA 2016. During his Ph.D., he interned at several leading industrial labs, including AMD, Samsung, Intel, and IBM.
