
Subho Sankar Banerjee
Subho is a software engineer at Google, where he focuses on identifying, triaging, and mitigating silent data corruption in the company's datacenters. His research interests include applying machine learning to improve system reliability and performance. He holds a PhD in Computer Science from the University of Illinois at Urbana-Champaign.
Authored Publications
Silent Data Corruption by 10× Test Escapes Threatens Reliable Computing
Subhasish Mitra
Rama Govindaraju
Eric Liu
Mike Fuller
IEEE Design & Test (2025) (to appear)
Summary:
"Silent Data Corruption by 10× Test Escapes Threatens Reliable Computing" highlights a critical issue: manufacturing defects, dubbed "test escapes," are evading current testing methods at a rate ten times higher than industry targets. These defects lead to Silent Data Corruption (SDC), where applications produce incorrect outputs without any error indication, imposing significant costs on companies in debugging, data recovery, and service disruptions. The paper proposes a three-pronged approach: quick diagnosis of defective chips directly from system-level behaviors, in-field detection using advanced testing and error detection techniques such as CASP, and new, rigorous test experiments to validate these solutions and improve manufacturing testing practices.