Erin MacMurray van Liemt

I am a Senior Sociotechnical Researcher at Google Research based in the Los Angeles area and have served as both a researcher and a linguist at Google for roughly 10 years. My focus is on building evidence-based, community-informed datasets, namely culturally and linguistically diverse ontologies for pluralistic AI. In 2020, I was a Google Fellow in the Trevor Project Fellowship, contributing to the award-winning Crisis Contact Simulator, featured in TIME's 100 Best Inventions of 2021. I later extended this work in a second Google.org Fellowship in partnership with Reflex AI in 2023, where I crafted datasets to support peer-to-peer empathy conversation simulators for the Veteran community. Prior to Google Research, I worked on ontology design for Google's Knowledge Graph and on text/image classification with the Ads Privacy and Safety teams.
Authored Publications
    Socially Responsible Data for Large Multilingual Language Models
    Zara Wudiri
    Mbangula Lameck Amugongo
    Alex
    Stanley Uwakwe
    João Sedoc
    Edem Wornyo
    Seyi Olojo
    Amber Ebinama
    Suzanne Dikker
    2024
    Large Language Models (LLMs) have rapidly increased in size and apparent capability over the last three years, but their training data is largely English text. There is growing interest in language inclusivity in LLMs, and various efforts are striving for models to accommodate language communities outside of the Global North, which include many languages that have been historically underrepresented digitally. These languages have been termed "low-resource languages" or "long-tail languages," and LLM performance on them is generally poor. While expanding the use of LLMs to more languages may bring many potential benefits, such as assisting cross-community communication and language preservation, great care must be taken to ensure that data collection on these languages is not extractive and does not reproduce exploitative practices of the past. Collecting data from languages spoken by previously colonized people, Indigenous people, and speakers of non-Western languages raises many complex sociopolitical and ethical questions, e.g., around consent, cultural safety, and data sovereignty. Furthermore, linguistic complexity and cultural nuances are often lost in LLMs. This position paper builds on recent scholarship, and our own work, to outline several relevant social, cultural, and ethical considerations and potential ways to mitigate them through qualitative research, community partnerships, and participatory design approaches. We provide twelve recommendations for consideration when collecting language data on underrepresented language communities outside of the Global North.
    Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
    Jessica Quaye
    Oana Inel
    Charvi Rastogi
    Hannah Kirk
    Minsuk Kahng
    Max Bartolo
    Jay Tsang
    Justin White
    Nathan Clement
    Vijay Janapa Reddi
    Rafael Mosquera
    Juan Ciro
    2024
    With text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on "implicitly adversarial" prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativity is well suited to uncover. To this end, we built the Adversarial Nibbler Challenge, a red-teaming methodology for crowdsourcing a diverse set of implicitly adversarial prompts. We have assembled a suite of state-of-the-art T2I models, employed a simple user interface to identify and annotate harms, and engaged diverse populations to capture long-tail safety issues that may be overlooked in standard testing. We present an in-depth account of our methodology, a systematic study of novel attack strategies and safety failures, and a visualization tool for easy exploration of the dataset. The first challenge round resulted in over 10k prompt-image pairs with machine annotations for safety. A subset of 1.5k samples contains rich human annotations of harm types and attack styles. Our findings emphasize the necessity of continual auditing and adaptation as new vulnerabilities emerge. This work will enable proactive, iterative safety assessments and promote responsible development of T2I models.