Collin Green

Collin is the User Experience Research lead and manager of the Engineering Productivity Research team within Developer Intelligence. The Engineering Productivity Research team brings a data-driven approach to business decisions around engineering productivity, using a combination of qualitative and quantitative methods to triangulate measurements of developer productivity. Collin received his Ph.D. in Cognitive Psychology from the University of California, Los Angeles.
Authored Publications
Measuring productivity is equivalent to building a model. All models are wrong, but some are useful. Productivity models are often “worryingly selective” (wrong because of omissions). This worrying selectivity can be combated by taking a holistic approach that includes multiple measurements of multiple outcomes. Productivity models should include multiple outcomes, metrics, and methods.
AI-powered software development tooling is changing the way that developers interact with tools and write code. However, the ability of AI to truly transform software development depends on developers' level of trust in these tools. In this work, we take a mixed-methods approach to measuring the factors that influence developers' trust in AI-powered code completion. We found that familiarity with AI suggestions, the quality of the suggestion, and level of expertise with the language all increased the acceptance rate of AI-powered suggestions, while suggestion length and presence in a test file decreased acceptance rates. Based on these findings, we propose recommendations for the design of AI-powered development tools to improve trust.
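The acceptance-rate measure at the center of this study is straightforward to compute once suggestion events are logged. The sketch below is a minimal illustration, not the instrumentation used in the paper; the SuggestionEvent fields and the toy data are assumptions. It computes an overall acceptance rate and breaks it down by a single factor, such as whether the suggestion appeared in a test file.

```python
from dataclasses import dataclass

@dataclass
class SuggestionEvent:
    """Hypothetical log record for one AI code-completion suggestion shown to a developer."""
    accepted: bool       # did the developer accept the suggestion?
    length_chars: int    # length of the suggested code, in characters
    in_test_file: bool   # was the suggestion shown in a test file?

def acceptance_rate(events):
    """Fraction of shown suggestions that were accepted."""
    if not events:
        return 0.0
    return sum(e.accepted for e in events) / len(events)

def rate_by_factor(events, predicate, label):
    """Compare acceptance rates for suggestions that do / do not exhibit a factor."""
    with_factor = [e for e in events if predicate(e)]
    without_factor = [e for e in events if not predicate(e)]
    return {label: acceptance_rate(with_factor),
            f"not {label}": acceptance_rate(without_factor)}

# Made-up events purely for illustration.
events = [
    SuggestionEvent(accepted=True, length_chars=30, in_test_file=False),
    SuggestionEvent(accepted=False, length_chars=180, in_test_file=True),
    SuggestionEvent(accepted=True, length_chars=45, in_test_file=False),
]
print(acceptance_rate(events))
print(rate_by_factor(events, lambda e: e.in_test_file, "in test file"))
```

In practice, each factor identified in the abstract (familiarity, suggestion quality, language expertise, length, file type) would be logged or inferred per event and compared in the same way.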
This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into four types of quality, and the impact these types have on each other.
At Google, we’ve been running a quarterly large-scale survey with developers since 2018. In this article, we discuss how we run this survey, EngSat, some of our key learnings over the past six years, and how we’ve evolved our approach to meet new needs and challenges.
Beyond self-report data, we lack reliable and non-intrusive methods for identifying flow. However, acknowledging that flow occurs during periods of focus gives us an opportunity to make progress towards measuring flow by isolating focused work. Here, we take a mixed-methods approach to design a logs-based metric that leverages machine learning and a comprehensive collection of logs data to identify periods of related actions (indicating focus), and we validate this metric against self-reported time in focus or flow using diary data and quarterly survey data. Our results indicate that we can determine when software engineers at a large technology company experience focused work, which includes instances of flow. This metric speaks to engineering work but can be leveraged in other domains to non-disruptively measure when people experience focus. Future research can build upon this work to identify signals associated with other facets of flow.
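The metric described in the abstract is built with machine learning over a comprehensive collection of logs. The sketch below is a deliberately simplified stand-in, not the actual metric; the event data, the set of "related" actions, and the gap and length thresholds are all assumptions. It groups a developer's timestamped tool events into candidate focus sessions whenever related actions occur close together in time.

```python
from datetime import datetime, timedelta

# Hypothetical (timestamp, action) events for one developer, assumed sorted by time.
events = [
    (datetime(2024, 1, 8, 9, 0), "edit"),
    (datetime(2024, 1, 8, 9, 4), "edit"),
    (datetime(2024, 1, 8, 9, 10), "build"),
    (datetime(2024, 1, 8, 9, 18), "edit"),
    (datetime(2024, 1, 8, 10, 30), "email"),
    (datetime(2024, 1, 8, 10, 31), "edit"),
]

FOCUS_ACTIONS = {"edit", "build", "debug", "review"}  # assumed "related" developer actions
MAX_GAP = timedelta(minutes=10)                       # assumed max gap within one session
MIN_SESSION = timedelta(minutes=15)                   # assumed minimum length to count as focus

def focus_sessions(events):
    """Group consecutive focus-related events separated by at most MAX_GAP."""
    sessions, current = [], []
    for ts, action in events:
        if action not in FOCUS_ACTIONS:
            continue
        if current and ts - current[-1] > MAX_GAP:
            sessions.append(current)
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    # Keep only sessions long enough to plausibly contain focused work.
    return [s for s in sessions if s[-1] - s[0] >= MIN_SESSION]

total_focus = sum((s[-1] - s[0] for s in focus_sessions(events)), timedelta())
print(total_focus)  # 0:18:00 for the toy data above
```

A validated version of such a metric would replace the fixed thresholds and action set with learned models, as the abstract describes, and be checked against self-reported focus time.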
Systemic Gender Inequities in Who Reviews Code
Emerson Murphy-Hill, Jill Dicker, Amber Horvath, Laurie R. Weingart, Nina Chen
Computer Supported Cooperative Work (2023) (to appear)
Code review is an essential task for modern software engineers, where the author of a code change assigns other engineers the task of providing feedback on the author’s code. In this paper, we investigate the task of code review through the lens of equity, the proposition that engineers should share reviewing responsibilities fairly. Through this lens, we quantitatively examine gender inequities in code review load at Google. We found that, on average, women perform about 25% fewer reviews than men, an inequity with multiple systemic antecedents, including authors’ tendency to choose men as reviewers, a recommender system’s amplification of human biases, and gender differences in how reviewer credentials are assigned and earned. Although substantial work remains to close the review load gap, we show how one small change has begun to do so.
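The headline comparison in this study is a gap in average review load between groups. The sketch below illustrates that comparison only; the review records are invented, and the study's actual analysis accounts for far more than this simple grouping.

```python
from collections import defaultdict
from statistics import mean

# Invented records: one entry per completed review, tagged with the reviewer and
# the reviewer's self-reported gender. Purely illustrative data.
reviews = [
    {"reviewer": "a", "gender": "woman"},
    {"reviewer": "a", "gender": "woman"},
    {"reviewer": "b", "gender": "man"},
    {"reviewer": "b", "gender": "man"},
    {"reviewer": "b", "gender": "man"},
    {"reviewer": "c", "gender": "man"},
    {"reviewer": "c", "gender": "man"},
    {"reviewer": "c", "gender": "man"},
]

def mean_review_load_by_gender(reviews):
    """Average number of reviews performed per reviewer, grouped by gender."""
    per_reviewer = defaultdict(int)
    gender_of = {}
    for r in reviews:
        per_reviewer[r["reviewer"]] += 1
        gender_of[r["reviewer"]] = r["gender"]
    by_gender = defaultdict(list)
    for reviewer, count in per_reviewer.items():
        by_gender[gender_of[reviewer]].append(count)
    return {gender: mean(counts) for gender, counts in by_gender.items()}

loads = mean_review_load_by_gender(reviews)
print(loads)
print(1 - loads["woman"] / loads["man"])  # relative gap in mean review load
```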
In this installment of our column, we describe some recent research on onboarding software developers, including work we’ve done with colleagues at Google to understand and measure developer onboarding and ramp-up.
Measuring the productivity of software developers is inherently difficult; it requires measuring humans doing a complex, creative task. Developers are affected by both technological and sociological aspects of their job, and these aspects need to be evaluated in concert to deeply understand developer productivity.
In this installment of the column, we discuss technical debt as a prime example of an entangled human and technical problem. At Google, we sought to better understand technical debt, to measure it, and to start managing it better.
In this article, we explore how developers react to build latency, both in terms of the absolute speed and in terms of the predictability of their builds.