Private Intersection-Sum Protocols with Applications to Attributing Aggregate Ad Conversions

Mihaela Ion
Benjamin Kreuter
Erhan Nergiz
Sarvar Patel
Shobhit Saxena
David Shanahan
2020 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 370-389

Abstract

In this work, we discuss our successful efforts for
industry deployment of a cryptographic secure computation
protocol. The problem we consider is privately computing aggregate conversion rate of advertising campaigns.
This underlying functionality can be abstracted as Private
Intersection-Sum (PI-Sum) with Cardinality. In this setting
two parties hold datasets containing user identifiers, and one
of the parties additionally has an integer value associated
with each of its user identifiers. The parties want to learn
the number of identifiers they have in common and the sum
of the integer values associated with these users without
revealing any more information about their private inputs.
We identify the major properties and enabling factors
which make the deployment of a cryptographic protocol
possible, practical, and uniquely positioned as a solution for
the task at hand. We describe our deployment setting and
the most relevant efficiency measure, which in our setting is
communication overhead rather than computation. We also
present a monetary cost model that can be used as a unifying
cost measure and the computation model which reflect out
use-case: a low-priority batch computing.
We present three PI-Sum with cardinality protocols: our
currently deployed protocol, which relies on a Diffie-Hellman
style double masking, and two new protocols which leverage
more recent techniques for private set intersection (PSI) that
use Random Oblivious Transfer and encrypted Bloom filters.
We compare the later two protocol with our original solution
when instantiated with different additively homomorphic
encryption schemes. We implement our constructions and
compare their costs. We also compare with recent generic
approaches for computing on the intersection of two datasets
and show that our best protocol has monetary cost that is
20× less than the best known generic approach.