
Rob Springer
Authored Publications
Sort By
A General Purpose Transpiler for Fully Homomorphic Encryption
Shruthi Gorantala
Sean Purser-Haskell
Asra Ali
Eric P. Astor
Itai Zukerman
Sam Ruth
Phillipp Schoppmann
Sasha Kulankhina
Alain Forget
David Marn
Cameron Tew
Rafael Misoczki
Bernat Guillen
Xinyu Ye
Damien Desfontaines
Aishe Krishnamurthy
Miguel Guevara
Yurii Sushko
Google LLC (2021)
Preview abstract
Fully homomorphic encryption (FHE) is an encryption scheme which enables computation on encrypted data without revealing the underlying data. While there have been many advances in the field of FHE, developing programs using FHE still requires expertise in cryptography. In this white paper, we present a fully homomorphic encryption transpiler that allows developers to convert high-level code (e.g., C++) that works on unencrypted data into high-level code that operates on encrypted data. Thus, our transpiler makes transformations possible on encrypted data.
Our transpiler builds on Google's open-source XLS SDK (https://github.com/google/xls) and uses an off-the-shelf FHE library, TFHE (https://tfhe.github.io/tfhe/), to perform low-level FHE operations. The transpiler design is modular, which means the underlying FHE library as well as the high-level input and output languages can vary. This modularity will help accelerate FHE research by providing an easy way to compare arbitrary programs in different FHE schemes side-by-side. We hope this lays the groundwork for eventual easy adoption of FHE by software developers. As a proof-of-concept, we are releasing an experimental transpiler (https://github.com/google/fully-homomorphic-encryption/tree/main/transpiler) as open-source software.
View details
Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild
Danner Stodolsky
Jeff Calow
Jeremy Dorfman
Clint Smullen
Aki Kuusela
Aaron James Laursen
Alex Ramirez
Alvin Adrian Wijaya
Amir Salek
Anna Cheung
Ben Gelb
Brian Fosco
Cho Mon Kyaw
Dake He
David Alexander Munday
David Wickeraad
Devin Persaud
Don Stark
Drew Walton
Elisha Indupalli
Fong Lou
Hon Kwan Wu
In Suk Chong
Indira Jayaram
Jia Feng
JP Maaninen
Kyle Alan Lucke
Maire Mahony
Mark Steven Wachsler
Mercedes Tan
Narayana Penukonda
Niranjani Dasharathi
Poonacha Kongetira
Prakash Chauhan
Raghuraman Balasubramanian
Ramon Macias
Richard Ho
Roy W Huffman
Sandeep Bhatia
Sarah J. Gwin
Sathish K Sekar
Srikanth Muroor
Ville-Mikko Rautio
Yolanda Ripley
Yoshiaki Hase
Yuan Li
Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Association for Computing Machinery, New York, NY, USA (2021), pp. 600-615
Preview abstract
Video sharing (e.g., YouTube, Vimeo, Facebook, TikTok) accounts for the majority of internet traffic, and video processing is also foundational to several other key workloads (video conferencing, virtual/augmented reality, cloud gaming, video in Internet-of-Things devices, etc.). The importance of these workloads motivates larger video processing infrastructures and – with the slowing of Moore’s law – specialized hardware accelerators to deliver more computing at higher efficiencies. This paper describes the design and deployment, at scale, of a new accelerator targeted at warehouse-scale video transcoding. We present our hardware design including a new accelerator building block – the video coding unit (VCU) – and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems. We evaluate these accelerators “in the wild" serving live data center jobs, demonstrating 20-33x improved efficiency over our prior well-tuned non-accelerated baseline. Our design also enables effective adaptation to changing bottlenecks and improved failure management, and new workload capabilities not otherwise possible with prior systems. To the best of our knowledge, this is the first work to discuss video acceleration at scale in large warehouse-scale environments.
View details
GPUCC - An Open-Source GPGPU Compiler
Jingyue Wu
Mark Heffernan
Chris Leary
Bjarke Roune
Xuetian Weng
Proceedings of the 2016 International Symposium on Code Generation and Optimization, ACM, New York, NY, pp. 105-116
Preview abstract
Graphics Processing Units have emerged as powerful accelerators for massively parallel, numerically intensive workloads. The two dominant software models for these devices are NVIDIA’s CUDA and the cross-platform OpenCL standard. Until now, there has not been a fully open-source compiler targeting the CUDA environment, hampering general compiler and architecture research and making deployment difficult in datacenter or supercomputer environments. In this paper, we present gpucc, an LLVM-based, fully open-source, CUDA compatible compiler for high performance computing. It performs various general and CUDA-specific optimizations to generate high performance code. The Clang-based frontend supports modern language features such as those in C++11 and C++14. Compile time is 8% faster than NVIDIA’s toolchain (nvcc) and it reduces compile time by up to 2.4x for pathological compilations (>100 secs), which tend to dominate build times in parallel build environments. Compared to nvcc, gpucc’s runtime performance is on par for several open-source benchmarks, such as Rodinia (0.8% faster), SHOC (0.5% slower), or Tensor (3.7% faster). It outperforms nvcc on internal large-scale end-to-end benchmarks by up to 51.0%, with a geometric mean of 22.9%.
View details