Composable core-sets for diversity and coverage maximization

Piotr Indyk
Sepideh Mahabadi
PODS (2014), pp. 100-108

Abstract

In this paper we consider efficient construction of “composable
core-sets” for basic diversity and coverage maximization
problems. A core-set for a point-set in a metric space is a
subset of the point-set with the property that an approximate
solution to the whole point-set can be obtained given
the core-set alone. A composable core-set has the property
that for a collection of sets, the approximate solution to the
union of the sets in the collection can be obtained given the
union of the composable core-sets for the point sets in the
collection. Using composable core-sets one can obtain effi-
cient solutions to a wide variety of massive data processing
applications, including nearest neighbor search, streaming
algorithms and map-reduce computation.
Our main results are algorithms for constructing composable
core-sets for several notions of “diversity objective
functions”, a topic that attracted a significant amount of
research over the last few years. The composable core-sets
we construct are small and accurate: their approximation
factor almost matches that of the best “off-line” algorithms
for the relevant optimization problems (up to a constant
factor). Moreover, we also show applications of our results
to diverse nearest neighbor search, streaming algorithms and
map-reduce computation. Finally, we show that for an alternative
notion of diversity maximization based on the maximum
coverage problem small composable core-sets do not
exist.