Qiao Ma

Authored Publications
    Software development is a team sport
    Jie Chen
    Alison Chang
    Rayven Plaza
    Marie Huber
    Claire Taylor
    IEEE Software (2025)
    In this article, we describe our human-centered research focused on understanding the role of collaboration and teamwork in productive software development. We describe the creation of a logs-based metric that identifies collaboration through observable events and a survey-based multi-item scale that assesses team functioning.
    Test-retest reliability of four U.S. non-probability sample sources
    Mario Callegaro
    Inna Tsirlin
    American Association for Public Opinion Research (2022)
    It is a common practice in market research to set up cross-sectional survey trackers. Although many studies have investigated the accuracy of non-probability-based online samples, less is known about their test-retest reliability, which is of key importance for such trackers. In this study, we wanted to assess how stable measurement is over short periods of time, so that any changes observed over long periods in survey trackers could be attributed to true changes in sentiment rather than sample artifacts. To achieve this, we repeated the same 10-question survey of 1,500 respondents two weeks apart in four different U.S. non-probability-based samples: Qualtrics panels, representing a typical non-probability-based online panel; Google Surveys, representing a river sampling approach; Google Opinion Rewards, representing a mobile panel; and Amazon MTurk, not a survey panel in itself but de facto used as such in academic research. To quantify test-retest reliability, we compared the response distributions from the two survey administrations. Given that the attitudes measured were not expected to change over a short timespan and no relevant external events were reported during fielding that might affect them, the two measurements should be very close to each other, aside from transient measurement error. We found that two of the samples produced remarkably consistent results between the two administrations, one sample was less consistent, and the fourth sample had significantly different response distributions for three of the four attitudinal questions. This study sheds light on the suitability of different non-probability-based samples for cross-sectional attitude tracking.