Evaluating job packing in warehouse-scale computing

Madhukar Korupolu
IEEE Cluster, Madrid, Spain (2014)

Abstract

One of the key factors in selecting a good scheduling
algorithm is using an appropriate metric for comparing
schedulers. But which metric should be used when evaluating
schedulers for warehouse-scale (cloud) clusters, which have
machines of different types and sizes, heterogeneous workloads
with dependencies and constraints on task placement, and long-running
services that consume a large fraction of the total
resources? Traditional scheduler evaluations that focus on metrics
such as queuing delay, makespan, and running time fail to
capture important behaviors, and those that rely on workload
synthesis and scaling often ignore important factors such as
constraints. This paper explains some of the complexities and
issues in evaluating warehouse-scale schedulers, focusing on what
we find to be the single most important aspect in practice: how
well they pack long-running services into a cluster. We describe
and compare four metrics for evaluating the packing efficiency
of schedulers, in increasing order of sophistication: aggregate
utilization, hole filling, workload inflation, and cluster compaction.
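
The abstract names the four metrics without defining them. As a rough illustration of why a more sophisticated metric might be wanted, the sketch below gives one plausible reading of the first two. Everything in it is an assumption made for illustration, not the paper's definition: the `Machine` type, the single CPU dimension, and the interpretation of hole filling as "how many fixed-size units still fit in each machine's free space" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical machine with one resource dimension (CPU)."""
    cpu_capacity: int
    cpu_used: int


def aggregate_utilization(machines):
    """Assumed reading: fraction of total capacity that is allocated."""
    total = sum(m.cpu_capacity for m in machines)
    used = sum(m.cpu_used for m in machines)
    return used / total if total else 0.0


def hole_filling(machines, unit_cpu):
    """Assumed reading: number of unit-sized tasks that still fit.

    Free space is counted per machine, because a task cannot be
    split across machines.
    """
    return sum((m.cpu_capacity - m.cpu_used) // unit_cpu for m in machines)


# Two hypothetical packings of the same total load onto the same hardware.
fragmented = [Machine(4, 3), Machine(4, 3)]
consolidated = [Machine(4, 4), Machine(4, 2)]
```

Under these assumptions both packings have the same aggregate utilization, but only the consolidated one leaves room for another task of size 2. A utilization figure alone would not distinguish the two.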