
Collin Green
Collin is the User Experience Research lead and manager of the Engineering Productivity Research team within Developer Intelligence. The team brings a data-driven approach to business decisions around engineering productivity, combining qualitative and quantitative methods to triangulate on measures of productivity. Collin received his Ph.D. in Cognitive Psychology from the University of California, Los Angeles.
Authored Publications
Measuring productivity is equivalent to building a model. All models are wrong, but some are useful. Productivity models are often “worryingly selective” (wrong because of omissions). Worrying selectivity can be combated by taking a holistic approach that includes multiple measurements of multiple outcomes. Productivity models should include multiple outcomes, metrics, and methods.
AI-powered software development tooling is changing the way that developers interact with tools and write code. However, the ability of AI to truly transform software development depends on developers' level of trust in the tools. In this work, we take a mixed-methods approach to measuring the factors that influence developers' trust in AI-powered code completion. We found that familiarity with AI suggestions, quality of the suggestion, and level of expertise with the language all increased the acceptance rate of AI-powered suggestions, while suggestion length and presence in a test file decreased acceptance rates. Based on these findings, we propose recommendations for the design of AI-powered development tools to improve trust.
This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into four types of quality, and the impact these types have on each other.
Measuring Developer Experience with a Longitudinal Survey
Jessica Lin
Jill Dicker
IEEE Software (2024)
At Google, we’ve been running a quarterly large-scale survey with developers since 2018. In this article, we will discuss how we run EngSat, some of our key learnings over the past 6 years, and how we’ve evolved our approach to meet new needs and challenges.
Using Logs Data to Identify When Software Engineers Experience Flow or Focused Work
Ben Holtz
ACM CHI Conference on Human Factors in Computing Systems (2023) (to appear)
Beyond self-report data, we lack reliable and non-intrusive methods for identifying flow. However, taking a step back and acknowledging that flow occurs during periods of focus gives us the opportunity to make progress towards measuring flow by isolating focused work. Here, we take a mixed-methods approach to design a logs-based metric that leverages machine learning and a comprehensive collection of logs data to identify periods of related actions (indicating focus), and we validate this metric against self-reported time in focus or flow using diary data and quarterly survey data. Our results indicate that we can determine when software engineers at a large technology company experience focused work, which includes instances of flow. This metric speaks to engineering work, but can be leveraged in other domains to non-disruptively measure when people experience focus. Future research can build upon this work to identify signals associated with other facets of flow.
Systemic Gender Inequities in Who Reviews Code
Emerson Murphy-Hill
Jill Dicker
Amber Horvath
Laurie R. Weingart
Nina Chen
Computer Supported Cooperative Work (2023) (to appear)
Code review is an essential task for modern software engineers, where the author of a code change assigns other engineers the task of providing feedback on the author’s code. In this paper, we investigate the task of code review through the lens of equity, the proposition that engineers should share reviewing responsibilities fairly. Through this lens, we quantitatively examine gender inequities in code review load at Google. We found that, on average, women perform about 25% fewer reviews than men, an inequity with multiple systemic antecedents, including authors’ tendency to choose men as reviewers, a recommender system’s amplification of human biases, and gender differences in how reviewer credentials are assigned and earned. Although substantial work remains to close the review load gap, we show how one small change has begun to do so.
Developer Productivity for Humans, Part 5: Onboarding and Ramp-Up
Lanting He
Demei Shen
Nan Zhang
IEEE Software, 40 (2023), pp. 13-19
In this installment of our column, we’ll describe some recent research on onboarding software developers, including some of the work that we’ve done with colleagues at Google to understand and measure developer onboarding and ramp-up at Google.
Measuring the productivity of software developers is inherently difficult; it requires measuring humans doing a complex, creative task. Developers are affected by both technological and sociological aspects of their jobs, and these need to be evaluated in concert to deeply understand developer productivity.
In this installment of the column, we discuss technical debt as a prime example of an entangled human and technical problem. At Google, we sought to better understand technical debt, to measure it, and to start managing it better.
Developer Productivity for Humans, Part 4: Build Latency, Predictability, and Developer Productivity
In this article, we explore how developers react to build latency, both in terms of the absolute speed and in terms of the predictability of their builds.