Aart J.C. Bik
Aart J.C. Bik received his PhD degree from Leiden University in 1996 (his sparse compiler research received the C.J. Kok Outstanding Thesis Award). He was a Principal Engineer at Intel, where he was the lead compiler architect of automatic vectorization in the Intel C++/Fortran compilers (recognized with the Intel Achievement Award). In 2007, he moved to Google, where he has worked on various projects such as the large-scale graph processing system Pregel (awarded the 2020 ACM SIGMOD Test of Time Award), Google Glass, and the optimizing compiler for the Android Runtime. In November 2019, Aart joined the MLIR compiler team, where he rekindled his passion for compiler support for sparse computations.
Authored Publications
Structured Operations: Modular Design of Code Generators for Tensor Compilers
Nicolas Vasilache
Oleksandr Zinenko
Mahesh Ravishankar
Thomas Raoux
Alexander Belyaev
Matthias Springer
Tobias Gysi
Diego Caballero
Stephan Herhut
Stella Laurenzo
LCPC 2022, Springer (2023)
The performance of machine learning systems relies heavily on code generators tailored to tensor computations. We propose an approach to the design and implementation of such code generators that leverages the natural structure of tensor algebra, and we illustrate the progressive lowering of domain-specific abstractions in the MLIR infrastructure.
Compiler Support for Sparse Tensor Computations in MLIR
Bixia Zheng
Fredrik Kjolstad
Nicolas Vasilache
Tatiana Shpeisman
ACM Transactions on Architecture and Code Optimization (2022, to appear)
Sparse tensors arise in problems in science, engineering, machine learning, and data analytics. Programs that operate on such tensors can exploit sparsity to reduce storage requirements and computational time. Developing and maintaining sparse software by hand, however, is a complex and error-prone task. Therefore, we propose to treat sparsity as a property, not a tedious implementation detail, and let a sparse compiler generate sparse code automatically from a sparsity-agnostic definition of the computation. This paper discusses the integration of this idea into MLIR.
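The core idea of the abstract above, exploiting sparsity to compute only on stored nonzeros, can be sketched in plain C. This is an illustrative sketch, not actual compiler-generated code, and the `coo_matrix` type and `spmv` function are hypothetical names chosen for the example:

```c
#include <stddef.h>

/* Sparsity-agnostic definition: y[i] += A[i][j] * x[j] over all i, j.
 * Given the property "A is sparse", a sparse compiler would instead
 * emit a loop over stored nonzeros only, as sketched below using a
 * coordinate (COO) representation. */
typedef struct {
  size_t nnz;        /* number of stored nonzero entries */
  const size_t *row; /* row index of each nonzero        */
  const size_t *col; /* column index of each nonzero     */
  const double *val; /* value of each nonzero            */
} coo_matrix;

/* Sparse matrix-vector product: work is O(nnz), not O(rows * cols). */
static void spmv(const coo_matrix *a, const double *x, double *y) {
  for (size_t k = 0; k < a->nnz; ++k)
    y[a->row[k]] += a->val[k] * x[a->col[k]];
}
```

The savings grow with sparsity: a matrix with a million entries but only a few thousand nonzeros costs a few thousand multiply-adds instead of a million.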
Compiler Support for Sparse Tensor Computations in MLIR
2021 LLVM Developers' Meeting, https://llvm.swoogo.com/2021devmtg/ (2021)
Sparse vectors, matrices, and their multidimensional generalization into tensors arise in many problems in science, engineering, machine learning, and data analytics. Software that operates on such tensors can exploit the sparsity to reduce both storage requirements and computational time by storing and computing on nonzero elements only. This exploitation comes at a cost, though, since developing and maintaining sparse software by hand is tedious and error-prone. Therefore, it makes sense to treat sparsity merely as a property, not a tedious implementation detail, and let the compiler generate sparse code automatically from a sparsity-agnostic definition of the computation. This idea was pioneered in the MT1 project for linear algebra and formalized to tensor algebra in the TACO (Sparse Tensor Algebra Compiler) project.

In this technical talk, we discuss how compiler support for sparse tensor computations was added to MLIR (LLVM's extensible infrastructure for building domain-specific compilers). We discuss the concept of sparse tensor types as first-class citizens and show how this simplifies the introduction of new front-ends and back-ends for systems that want to add sparse tensor support. We also show how MLIR can be used for rapid sparse library development, driven by either exhaustively searching for suitable sparse storage formats or using ML to find such formats more quickly, or even for end-to-end solutions mapping sparsity-agnostic specifications of kernels to efficient sparse code at runtime. Finally, we discuss how you can contribute to this new sparse tensor support in MLIR.
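As a concrete illustration of one of the many storage formats such a search could select, here is a minimal C sketch of compressed sparse row (CSR). The `csr` type and `csr_from_dense` helper are hypothetical names for this example only; the actual MLIR support expresses formats at the IR type level, not through a C API:

```c
#include <stddef.h>

/* Compressed sparse row (CSR): for an m x n matrix with nnz nonzeros,
 * it stores m+1 position entries, nnz column indices, and nnz values,
 * instead of m * n dense values. pos[i]..pos[i+1] delimits row i. */
typedef struct {
  size_t pos[16]; /* row start offsets (fixed small capacity for the sketch) */
  size_t crd[16]; /* column index of each nonzero                            */
  double val[16]; /* value of each nonzero                                   */
} csr;

/* Populate CSR from a row-major dense m x n matrix; returns nnz. */
static size_t csr_from_dense(const double *a, size_t m, size_t n, csr *s) {
  size_t k = 0;
  for (size_t i = 0; i < m; ++i) {
    s->pos[i] = k;
    for (size_t j = 0; j < n; ++j)
      if (a[i * n + j] != 0.0) {
        s->crd[k] = j;
        s->val[k] = a[i * n + j];
        ++k;
      }
  }
  s->pos[m] = k;
  return k;
}
```

Other formats (coordinate, compressed sparse column, block variants) trade off storage, access pattern, and insertion cost differently, which is why format selection is a search problem in the first place.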
Because all modern general-purpose CPUs support small-scale SIMD instructions (typically between 64-bit and 512-bit wide), modern compilers are becoming progressively better at taking advantage of SIMD instructions automatically, a translation often referred to as vectorization or SIMDization. Since the Android O release, the optimizing compiler of ART has joined the family of vectorizing compilers with the ability to translate bytecode into native SIMD code for the target Android device. This talk discusses the general organization of the retargetable part of the vectorizer, which automatically finds and exploits vector instructions in bytecode without committing to one of the target SIMD architectures (currently ARM NEON (Advanced SIMD), x86 SSE, and the MIPS SIMD Architecture). Furthermore, the talk presents particular details of deploying the vectorizing compiler on ARM platforms, including its overall impact on performance and some ARM-specific considerations and optimizations, and gives an update on the Linaro ART team's SIMD-related activities.
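The translation the abstract describes can be illustrated with a portable C sketch. The names `add_scalar` and `add_vectorized` are hypothetical, and a real vectorizer emits actual SIMD instructions (NEON, SSE, or MSA) rather than the unrolled C shown here, which only models the shape of the transformation:

```c
#include <stddef.h>

/* Scalar loop as it appears in source (or bytecode): a vectorization
 * candidate, since iterations are independent of each other. */
static void add_scalar(const int *a, const int *b, int *c, size_t n) {
  for (size_t i = 0; i < n; ++i)
    c[i] = a[i] + b[i];
}

/* Conceptual result of vectorizing with a 4-lane SIMD width: the main
 * loop handles 4 elements per iteration (one vector add on a real
 * target), and a scalar cleanup loop handles the remainder. */
static void add_vectorized(const int *a, const int *b, int *c, size_t n) {
  size_t i = 0;
  for (; i + 4 <= n; i += 4) { /* vector body */
    c[i]     = a[i]     + b[i];
    c[i + 1] = a[i + 1] + b[i + 1];
    c[i + 2] = a[i + 2] + b[i + 2];
    c[i + 3] = a[i + 3] + b[i + 3];
  }
  for (; i < n; ++i)           /* cleanup loop */
    c[i] = a[i] + b[i];
}
```

The retargetable part of such a vectorizer reasons only about lane count and loop structure as above; emitting the concrete NEON, SSE, or MSA encoding is left to the target-specific back-end.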
Pregel: a system for large-scale graph processing
Grzegorz Malewicz
Matthew H. Austern
James C. Dehnert
Ilan Horn
Grzegorz Czajkowski
Proceedings of the 2010 international conference on Management of data, ACM, New York, NY, USA, pp. 135-146
Pregel: A System for Large-Scale Graph Processing
Grzegorz Malewicz
Matthew H. Austern
James C. Dehnert
Ilan Horn
Grzegorz Czajkowski
28th ACM Symposium on Principles of Distributed Computing (2009), pp. 6-6