Tiya Tiyasirichokchai

Google Research
Authored Publications
    TRINDs: Assessing the Diagnostic Capabilities of Large Language Models for Tropical and Infectious Diseases
    Nenad Tomašev
    Chintan Ghate
    Steve Adudans
    Oluwatosin Akande
    Sylvanus Aitkins
    Geoffrey Siwo
    Lynda Osadebe
    Eric Ndombi
    2024
    Neglected tropical diseases (NTDs) and infectious diseases disproportionately affect the poorest regions of the world. While large language models (LLMs) have shown promise for medical question answering, there is limited work focused on tropical and infectious disease-specific explorations. We introduce TRINDs, a dataset of 52 tropical and infectious diseases with demographic and semantic clinical and consumer augmentations. We evaluate various context and counterfactual locations to understand their influence on LLM performance. Results show that LLMs perform best when provided with contextual information such as demographics, location, and symptoms. We also develop TRINDs-LM, a tool that enables users to enter symptoms and contextual information to receive a most likely diagnosis. In addition to the LLM evaluations, we conducted a baseline study with 7 medical and public health experts to assess the accuracy of human experts in diagnosing tropical and infectious diseases. This work demonstrates methods for creating and evaluating datasets for testing and optimizing LLMs, and the use of a tool that could improve digital diagnosis and tracking of NTDs.
    Creating an Empirical Dermatology Dataset Through Crowdsourcing With Web Search Advertisements
    Abbi Ward
    Jimmy Li
    Julie Wang
    Sriram Lakshminarasimhan
    Ashley Carrick
    Jay Hartford
    Pradeep Kumar S
    Sunny Virmani
    Renee Wong
    Margaret Ann Smith
    Dawn Siegel
    Steven Lin
    Justin Ko
    JAMA Network Open (2024)
    Importance: Health datasets from clinical sources do not reflect the breadth and diversity of disease, impacting research, medical education, and artificial intelligence tool development. Assessments of novel crowdsourcing methods to create health datasets are needed. Objective: To evaluate if web search advertisements (ads) are effective at creating a diverse and representative dermatology image dataset. Design, Setting, and Participants: This prospective observational survey study, conducted from March to November 2023, used Google Search ads to invite internet users in the US to contribute images of dermatology conditions with demographic and symptom information to the Skin Condition Image Network (SCIN) open access dataset. Ads were displayed against dermatology-related search queries on mobile devices, inviting contributions from adults after a digital informed consent process. Contributions were filtered for image safety and measures were taken to protect privacy. Data analysis occurred January to February 2024. Exposure: Dermatologist condition labels as well as estimated Fitzpatrick Skin Type (eFST) and estimated Monk Skin Tone (eMST) labels. Main Outcomes and Measures: The primary metrics of interest were the number, quality, demographic diversity, and distribution of clinical conditions in the crowdsourced contributions. Spearman rank order correlation was used for all correlation analyses, and the χ² test was used to analyze differences between SCIN contributor demographics and the US census. Results: In total, 5749 submissions were received, with a median of 22 (14-30) per day. Of these, 5631 (97.9%) were genuine images of dermatological conditions. Among contributors with self-reported demographic information, female contributors (1732 of 2596 contributors [66.7%]) and younger contributors (1329 of 2556 contributors [52.0%] aged <40 years) had a higher representation in the dataset compared with the US population. Of 2614 contributors who reported race and ethnicity, 852 (32.6%) reported a racial or ethnic identity other than White. Dermatologist confidence in assigning a differential diagnosis increased with the number of self-reported demographic and skin-condition–related variables (Spearman R = 0.1537; P < .001). Of 4019 contributions reporting duration since onset, 2170 (54.0%) reported onset within less than 7 days of submission. Of the 2835 contributions that could be assigned a dermatological differential diagnosis, 2523 (89.0%) were allergic, infectious, or inflammatory conditions. eFST and eMST distributions reflected the geographical origin of the dataset. Conclusions and Relevance: The findings of this survey study suggest that search ads are effective at crowdsourcing dermatology images and could therefore be a useful method to create health datasets. The SCIN dataset bridges important gaps in the availability of images of common, short-duration skin conditions.
    A mobile-optimized artificial intelligence system for gestational age and fetal malpresentation assessment
    Ryan Gomes
    Bellington Vwalika
    Chace Lee
    Angelica Willis
    Joan T. Price
    Christina Chen
    Margaret P. Kasaro
    James A. Taylor
    Elizabeth M. Stringer
    Scott Mayer McKinney
    Ntazana Sindano
    William Goodnight, III
    Justin Gilmer
    Benjamin H. Chi
    Charles Lau
    Terry Spitz
    Kris Liu
    Jonny Wong
    Rory Pilgrim
    Akib Uddin
    Lily Hao Yi Peng
    Kat Chou
    Jeffrey S. A. Stringer
    Shravya Ramesh Shetty
    Communications Medicine (2022)
    Background: Fetal ultrasound is an important component of antenatal care, but a shortage of adequately trained healthcare workers has limited its adoption in low-to-middle-income countries. This study investigated the use of artificial intelligence for fetal ultrasound in under-resourced settings. Methods: Blind sweep ultrasounds, consisting of six freehand ultrasound sweeps, were collected by sonographers in the USA and Zambia, and novice operators in Zambia. We developed artificial intelligence (AI) models that used blind sweeps to predict gestational age (GA) and fetal malpresentation. AI GA estimates and standard fetal biometry estimates were compared to a previously established ground truth, and evaluated for difference in absolute error. Fetal malpresentation (non-cephalic vs cephalic) was compared to sonographer assessment. On-device AI model run-times were benchmarked on Android mobile phones. Results: Here we show that GA estimation accuracy of the AI model is non-inferior to standard fetal biometry estimates (error difference -1.4 ± 4.5 days, 95% CI -1.8, -0.9, n=406). Non-inferiority is maintained when blind sweeps are acquired by novice operators performing only two of six sweep motion types. Fetal malpresentation AUC-ROC is 0.977 (95% CI, 0.949, 1.00, n=613); sonographers and novices have similar AUC-ROC. Software run-times on mobile phones for both diagnostic models are less than 3 seconds after completion of a sweep. Conclusions: The gestational age model is non-inferior to the clinical standard and the fetal malpresentation model has high AUC-ROCs across operators and devices. Our AI models are able to run on-device, without internet connectivity, and provide feedback scores to assist in upleveling the capabilities of lightly trained ultrasound operators in low resource settings.