Prasanna Venkatesh Rengasamy

Hi! Welcome to my page.

My current work focuses on machine learning performance profiling, optimization, and hardware architecture.

Previously, my research centered on optimizing hardware-software architectures. I approached this by analyzing workload behaviors through advanced simulation and tracing, specifically targeting CPU and system caching, execution pipelines, and GPGPU memory optimizations. Prior to my current role, I also contributed to the development of Apple Silicon chips for various Apple products.


Education
  • Ph.D. in Computer Science and Engineering — Penn State University
  • M.S. in Computer Science and Engineering — Indian Institute of Technology (IIT) Madras, India
  • B.Tech. in Computer Science and Engineering — SASTRA University, India
Authored Publications
    XProf: An Open, Scalable and Extensible Profiling System for the Modern ML Stack
    Naveen Kumar, Jose Baiocchi Paredes, Scott Goodson, Kelvin Le, Yin Zhang, Kan Cai, Jiten Thakkar, Sai Ganesh Bandiatmakuri, Yogesh SY, Ani Udipi, Vikas Aggarwal
    Ninth Conference on Machine Learning and Systems (2026)
    Abstract: Optimizing large models across thousands of accelerators requires deep system expertise. To address modern machine learning (ML) optimization needs, we present XProf, the ML profiler for the OpenXLA ecosystem. XProf delivers actionable optimization suggestions and in-depth performance analysis, empowering ML researchers and framework users to improve efficiency without specialized systems knowledge. XProf provides a unified, full-stack view of both host (CPU) and device (accelerator: TPUs/GPUs) performance, leveraging tools like the Roofline Model for comprehensive analysis. XProf’s distributed architecture is designed to monitor thousands of chips with minimal workload overhead (<1%). This architecture is made pluggable through the open-source PJRT C API extension, which has facilitated its adoption by third-party accelerator vendors. XProf has been instrumental in achieving significant efficiency gains at Google and winning MLPerf submissions. This paper presents the design and architecture of XProf, showcases its differentiating tools and capabilities, and highlights its impact within Google and across the industry as a state-of-the-art ML profiler. XProf is available as part of the OpenXLA project at https://github.com/openxla/xprof.