Economics and Electronic Commerce

Google is a global leader in electronic commerce. Not surprisingly, considerable attention is devoted to research in this area. Topics include 1) auction design, 2) advertising effectiveness, 3) statistical methods, 4) forecasting and prediction, 5) survey research, and 6) policy analysis, among others. This research involves interdisciplinary collaboration among computer scientists, economists, statisticians, and analytic marketing researchers, both at Google and at academic institutions around the world.

A major challenge is solving these problems at very large scale. For example, the advertising market has billions of transactions daily, spread across millions of advertisers, presenting a unique opportunity to test and refine economic principles as applied to a very large number of interacting, self-interested parties with a myriad of objectives. At Google, research is an opportunity to translate research directions into practice, influencing how production systems are designed and used.

Recent Publications

Effective model calibration is a critical and indispensable component in developing Media Mix Models (MMMs). One advantage of Bayesian MMMs lies in their capacity to accommodate information from experiment results and the modeler's domain knowledge about ad effectiveness by setting priors on the model parameters. However, it remains unclear how, and which, Bayesian priors should be tuned for calibration purposes. In this paper, we propose a new calibration method through model reparameterization. The reparameterized model includes Return on Ads Spend (ROAS) as a model parameter, enabling straightforward adjustment of its prior distribution to align with either experiment results or the modeler's prior knowledge. The proposed method also helps address several key challenges in combining MMMs and incrementality experiments. We use simulations to demonstrate that our approach can significantly reduce the bias and uncertainty in the resulting posterior ROAS estimates.
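To make the reparameterization concrete, below is a minimal sketch assuming a single-channel linear MMM with an illustrative saturation transform; the spend series, the lognormal ROAS prior, and the helper `beta_from_roas` are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weekly ad spend and a simple saturating media transform (illustrative).
spend = rng.gamma(shape=2.0, scale=50.0, size=104)   # two years of weekly spend
media = spend / (spend + 100.0)                      # hypothetical Hill-type saturation

# Standard parameterization: sales_t = baseline + beta * media_t + noise.
# Implied ROAS = (incremental sales) / (total spend) = beta * sum(media) / sum(spend).
# Reparameterization: treat ROAS as the parameter and derive beta from it, so a prior
# placed on ROAS (e.g. from an incrementality experiment) is applied directly.

def beta_from_roas(roas, media, spend):
    """Map a ROAS value to the media coefficient under the linear model above."""
    return roas * spend.sum() / media.sum()

# Hypothetical calibration prior: an experiment suggests ROAS around 2.5.
roas_prior_draws = rng.lognormal(mean=np.log(2.5), sigma=0.2, size=5000)
beta_draws = beta_from_roas(roas_prior_draws, media, spend)

# Sanity check: pushing the derived betas back through the model recovers the ROAS prior.
implied_roas = beta_draws * media.sum() / spend.sum()
print("prior ROAS mean:  ", roas_prior_draws.mean().round(3))
print("implied ROAS mean:", implied_roas.mean().round(3))
```

The point of the sketch is that a prior elicited directly on ROAS, for example from a lift experiment, maps deterministically to the media coefficient, so calibration no longer requires guessing which coefficient-level priors to tune.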
Data Exchange Markets via Utility Balancing
Aditya Bhaskara
Sungjin Im
Kamesh Munagala
Govind S. Sankar
WebConf (2024)
This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation in data sharing in the presence of such heterogeneity. Our market design approach focuses on interim utility balance, where participants contribute and receive equitable utility from the refinement of their models. We present a market model for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. Finally, we support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.
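As a rough illustration of interim utility balance on a mean-estimation task, the sketch below treats an agent's utility as the reduction in variance of its mean estimate from receiving others' samples; the sample sizes, the common noise level, and the simple balancing rule are assumptions for illustration, not the paper's mechanism.

```python
import numpy as np

sigma2 = 1.0                           # common observation noise (assumed)
n = np.array([20, 50, 200, 1000])      # heterogeneous local sample sizes

def utility(n_own, n_received):
    """Utility = reduction in variance of the agent's sample-mean estimate."""
    return sigma2 / n_own - sigma2 / (n_own + n_received)

# Unconstrained exchange: every agent receives all other agents' data.
full = np.array([utility(ni, n.sum() - ni) for ni in n])
print("full-exchange utilities:", full.round(4))   # highly unequal across agents

# A utility-balanced exchange (illustrative): pick a common target gain u and give
# each agent only as many samples as needed to reach it.
u = full.min()                                 # the largest gain every agent can attain
needed = sigma2 / (sigma2 / n - u) - n         # m_i solving utility(n_i, m_i) = u
print("target gain:", round(float(u), 4))
print("samples each agent receives:", np.ceil(needed).astype(int))
```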
Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models
Martin Mladenov
Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI-24), Vancouver (2024) (to appear)
Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors. Despite this, the majority of recommender research, and most practical recommenders of any import, focuses on the local, myopic optimization of the recommendations made to individual users. This comes at a significant cost to the long-term utility that recommenders could generate for their users. We argue that explicitly modeling the incentives and behaviors of all actors in the system, and the interactions among them induced by the recommender's policy, is strictly necessary if one is to maximize the value the system brings to these actors and improve overall ecosystem "health." Doing so requires: optimization over long horizons using techniques such as reinforcement learning; making inevitable tradeoffs among the utility that can be generated for different actors using the methods of social choice; reducing information asymmetry, while accounting for incentives and strategic behavior, using the tools of mechanism design; better modeling of both user and item-provider behaviors by incorporating notions from behavioral economics and psychology; and exploiting recent advances in generative and foundation models to make these mechanisms interpretable and actionable. We propose a conceptual framework that encompasses these elements, and articulate a number of research challenges that emerge at the intersection of these different disciplines.
Reach and frequency (R&F) is a core lever in the execution of ad campaigns, but it is not widely captured in the marketing mix models (MMMs) being fitted today due to the unavailability of accurate R&F metrics for some traditional media channels. Current practice usually uses impressions aggregated at the regional level as inputs to MMMs, which does not account for the fact that individuals can be exposed to an advertisement multiple times, and that the impact of an advertisement on an individual can change with the number of exposures. To address this limitation, we propose an R&F MMM, an extension of Geo-level Bayesian Hierarchical Media Mix Modeling (GBHMMM) that is applicable when R&F data are available for at least one media channel. By incorporating R&F into MMMs, the new methodology is shown to produce more accurate estimates of the impact of marketing on business outcomes, and helps users optimize their campaign execution based on optimal frequency recommendations.
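A minimal sketch of the underlying idea, assuming a Hill-shaped per-person frequency response: two time periods with identical impression totals but different reach/frequency mixes receive the same credit under an impressions-only MMM, but different credit once reach and frequency are modeled separately. The parameter values and the `hill` helper are illustrative, not the GBHMMM extension itself.

```python
import numpy as np

def hill(freq, half_sat=3.0, slope=2.0):
    """Illustrative Hill-shaped per-person response to ad frequency."""
    return freq**slope / (freq**slope + half_sat**slope)

beta = 0.8  # hypothetical per-reached-person effect at saturation

# Two weeks with identical impression counts but different reach/frequency mixes.
weeks = {
    "broad reach, low frequency":   {"reach": 1_000_000, "avg_freq": 2.0},
    "narrow reach, high frequency": {"reach": 250_000,   "avg_freq": 8.0},
}

for name, w in weeks.items():
    impressions = w["reach"] * w["avg_freq"]
    impressions_only_effect = beta * impressions / 1_000_000        # scale-free comparison
    rf_effect = beta * w["reach"] / 1_000_000 * hill(w["avg_freq"])
    print(f"{name}: impressions={impressions:,.0f}  "
          f"impressions-only effect={impressions_only_effect:.3f}  R&F effect={rf_effect:.3f}")
```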
We study the design of loss functions for click-through rates (CTR) to optimize (social) welfare in ad auctions. Existing works either focus only on CTR predictions without consideration of business objectives (e.g., welfare) in auctions, or assume that the distribution over the participants' expected cost-per-impression (eCPM) is known a priori and use various additional assumptions on the parametric form of the distribution to derive loss functions for predicting CTRs. In this work, we bring the welfare objectives of ad auctions back into CTR prediction and propose a novel weighted rank loss to train CTR models. Compared with the previous literature, our approach provides a provable guarantee on welfare without any assumptions on the eCPM distribution, while also avoiding the intractability of naively applying existing learning-to-rank methods. We further propose a theoretically justifiable technique for calibrating the losses using labels generated from a teacher network, assuming only that the teacher network has bounded $\ell_2$ generalization error. Finally, we demonstrate the practical advantages of the proposed loss on both synthetic and real-world data.
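A small sketch of a bid-weighted pairwise ranking loss in the spirit described above; the specific weighting (each clicked/non-clicked pair scaled by the clicked ad's bid) and the logistic surrogate are illustrative assumptions rather than the paper's exact loss.

```python
import numpy as np

def weighted_pairwise_rank_loss(scores, clicks, bids):
    """Bid-weighted pairwise logistic ranking loss (illustrative).

    For every (clicked, non-clicked) pair in an auction, the usual logistic
    rank penalty is scaled by the clicked ad's bid, so misranking high-value
    ads costs more than misranking low-value ones.
    """
    pos = np.where(clicks == 1)[0]
    neg = np.where(clicks == 0)[0]
    loss = 0.0
    for i in pos:
        for j in neg:
            margin = scores[i] - scores[j]
            loss += bids[i] * np.log1p(np.exp(-margin))
    return loss / max(len(pos) * len(neg), 1)

# Toy auction: predicted CTR logits, observed clicks, and advertiser bids.
scores = np.array([1.2, 0.3, -0.5, 0.8])
clicks = np.array([1, 0, 0, 1])
bids   = np.array([4.0, 1.0, 0.5, 2.0])
print("loss:", round(weighted_pairwise_rank_loss(scores, clicks, bids), 4))
```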
Subset-Reach Estimation in Cross-Media Measurement
Jiayu Peng
Rieman Li
Ying Liu
(2023)
We propose two novel approaches to a critical problem in reach measurement across multiple media: how to estimate the reach of an unobserved subset of buying groups (BGs) based on the observed reach of other subsets of BGs. Specifically, we propose a model-free approach and a model-based approach. The former provides a coarse estimate of the reach of any subset by leveraging the consistency among the reach of different subsets; linear programming is used to capture the reach-consistency constraints, producing an upper and a lower bound on the reach of any subset. The latter provides a point estimate of the reach of any subset. The key idea behind it is to exploit a conditional independence model, in which the latent groups are created by assuming each BG has either a high or a low reach probability within a group, and the weight of each group is determined by solving a non-negative least squares (NNLS) problem. In addition, we provide a framework that gives both confidence-interval and point estimates by integrating the two approaches with training-point selection and parameter fine-tuning through cross-validation. Finally, we evaluate the two approaches through experiments on synthetic data.
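To illustrate the model-based approach, the sketch below assumes a small set of latent audience groups in which each BG has either a high or a low reach probability, builds the implied non-reach probability of each observed subset, and fits the group weights with non-negative least squares; the group probabilities and the synthetic reach values are made up for illustration and are not the paper's data or exact procedure.

```python
import itertools
import numpy as np
from scipy.optimize import nnls

bgs = ["BG1", "BG2", "BG3"]
p_high, p_low = 0.6, 0.05          # illustrative per-group reach probabilities

# Candidate latent groups: every high/low pattern over the buying groups.
groups = list(itertools.product([p_high, p_low], repeat=len(bgs)))

def non_reach_prob(subset_idx, probs):
    """P(a person in this group is reached by none of the BGs in the subset)."""
    out = 1.0
    for i in subset_idx:
        out *= 1.0 - probs[i]
    return out

# Observed subsets and synthetic reach values used as training points.
observed = {
    (0,): 0.32, (1,): 0.28, (2,): 0.22,
    (0, 1): 0.48, (0, 2): 0.44, (1, 2): 0.40,
}

A = np.array([[non_reach_prob(s, g) for g in groups] for s in observed])
y = np.array([1.0 - r for r in observed.values()])

# Append a row encouraging the group weights to sum to one (a soft constraint).
A = np.vstack([A, np.ones(len(groups))])
y = np.append(y, 1.0)

w, _ = nnls(A, y)

# Point estimate for the unobserved subset {BG1, BG2, BG3}.
full = (0, 1, 2)
est = 1.0 - sum(wi * non_reach_prob(full, g) for wi, g in zip(w, groups))
print("estimated reach of all three BGs:", round(est, 3))
```

In the paper's setting the training subsets and fine-tuning would come from the cross-validation framework described above, which is not shown in this sketch.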