Quang Anh Duong

Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    LLM-based Lossless Text Simplification and its Effect on User Comprehension and Cognitive Load
    Theo Guidroz
    Diego Ardila
    Jimmy Li
    Adam Mansour
    Paul Jhun
    Nina Gonzalez
    Xiang Ji
    Mike Sanchez
    Miguel Ángel Garrido
    Divyansh Choudhary
    Jay Hartford
    Georgina Xu
    Henry Serrano
    Yifan Wang
    Jeff Shaffer
    Eric (Yifan) Cao
    Sho Fujiwara
    Peggy Bui
    arXiv (2025)
    Preview abstract Information on the web, such as scientific publications and Wikipedia, often surpasses users' reading level. To help address this, we used a self-refinement approach to develop a LLM capability for minimally lossy text simplification. To validate our approach, we conducted a randomized study involving 4563 participants and 31 texts spanning 6 broad subject areas: PubMed (biomedical scientific articles), biology, law, finance, literature/philosophy, and aerospace/computer science. Participants were randomized to viewing original or simplified texts in a subject area, and answered multiple-choice questions (MCQs) that tested their comprehension of the text. The participants were also asked to provide qualitative feedback such as task difficulty. Our results indicate that participants who read the simplified text answered more MCQs correctly than their counterparts who read the original text (3.9% absolute increase, p<0.05). This gain was most striking with PubMed (14.6%), while more moderate gains were observed for finance (5.5%), aerospace/computer science (3.8%) domains, and legal (3.5%). Notably, the results were robust to whether participants could refer back to the text while answering MCQs. The absolute accuracy decreased by up to ~9% for both original and simplified setups where participants could not refer back to the text, but the ~4% overall improvement persisted. Finally, participants' self-reported perceived ease based on a simplified NASA Task Load Index was greater for those who read the simplified text (absolute change on a 5-point scale 0.33, p<0.05). This randomized study, involving an order of magnitude more participants than prior works, demonstrates the potential of LLMs to make complex information easier to understand. Our work aims to enable a broader audience to better learn and make use of expert knowledge available on the web, improving information accessibility. View details
    Preview abstract Microscopic interpretation of histopathology images underlies many important diagnostic and treatment decisions. While advances in vision–language modeling raise new oppor- tunities for analysis of such images, the gigapixel-scale size of whole slide images (WSIs) introduces unique challenges. Additionally, pathology reports simultaneously highlight key findings from small regions while also aggregating interpretation across multiple slides, often making it difficult to create robust image–text pairs. As such, pathology reports remain a largely untapped source of supervision in computational pathology, with most efforts relying on region-of-interest annotations or self-supervision at the patch-level. In this work, we develop a vision–language model based on the BLIP-2 framework using WSIs paired with curated text from pathology reports. This enables applications utilizing a shared image–text embedding space, such as text or image retrieval for finding cases of interest, as well as integration of the WSI encoder with a frozen large language model (LLM) for WSI-based generative text capabilities such as report generation or AI-in-the-loop interactions. We utilize a de-identified dataset of over 350,000 WSIs and diagnostic text pairs, spanning a wide range of diagnoses, procedure types, and tissue types. We present pathologist evaluation of text generation and text retrieval using WSI embeddings, as well as results for WSI classification and workflow prioritization (slide-level triaging). Model-generated text for WSIs was rated by pathologists as accurate, without clinically significant error or omission, for 78% of WSIs on average. This work demonstrates exciting potential capabilities for language-aligned WSI embeddings. View details
    Preview abstract Background Skin conditions are extremely common worldwide, and are an important cause of both anxiety and morbidity. Since the advent of the internet, individuals have used text-based search (eg, “red rash on arm”) to learn more about concerns on their skin, but this process is often hindered by the inability to accurately describe the lesion’s morphology. In the study, we surveyed respondents’ experiences with an image-based search, compared to the traditional text-based search experience. Methods An internet-based survey was conducted to evaluate the experience of text-based vs image-based search for skin conditions. We recruited respondents from an existing cohort of volunteers in a commercial survey panel; survey respondents that met inclusion/exclusion criteria, including willingness to take photos of a visible concern on their body, were enrolled. Respondents were asked to use the Google mobile app to conduct both regular text-based search (Google Search) and image-based search (Google Lens) for their concern, with the order of text vs. image search randomized. Satisfaction for each search experience along six different dimensions were recorded and compared, and respondents’ preferences for the different search types along these same six dimensions were recorded. Results 372 respondents were enrolled in the study, with 44% self-identifying as women, 86% as White and 41% over age 45. The rate of respondents who were at least moderately familiar with searching for skin conditions using text-based search versus image-based search were 81.5% and 63.5%, respectively. After using both search modalities, respondents were highly satisfied with both image-based and text-based search, with >90% at least somewhat satisfied in each dimension and no significant differences seen between text-based and image-based search when examining the responses on an absolute scale per search modality. When asked to directly rate their preferences in a comparative way, survey respondents preferred image-based search over text-based search in 5 out of 6 dimensions, with an absolute 9.9% more preferring image-based search over text-based search overall (p=0.004). 82.5% (95% CI 78.2 - 86.3) reported a preference to leverage image-based search (alone or in combination with text-based search) in future searches. Of those who would prefer to use a combination of both, 64% indicated they would like to start with image-based search, indicating that image-based search may be the preferred entry point for skin-related searches. Conclusion Despite being less familiar with image-based search upon study inception, survey respondents generally preferred image-based search to text-based search and overwhelmingly wanted to include this in future searches. These results suggest the potential for image-based search to play a key role in people searching for information regarding skin concerns. View details
    Preview abstract Background: Skin conditions are extremely common worldwide, and are an important cause of both anxiety and morbidity. Since the advent of the internet, individuals have used text-based search (eg, “red rash on arm”) to learn more about concerns on their skin, but this process is often hindered by the inability to accurately describe the lesion’s morphology. In the study, we surveyed respondents’ experiences with an image-based search, compared to the traditional text-based search experience. Methods: An internet-based survey was conducted to evaluate the experience of text-based vs image-based search for skin conditions. We recruited respondents from an existing cohort of volunteers in a commercial survey panel; survey respondents that met inclusion/exclusion criteria, including willingness to take photos of a visible concern on their body, were enrolled. Respondents were asked to use the Google mobile app to conduct both regular text-based search (Google Search) and image-based search (Google Lens) for their concern, with the order of text vs. image search randomized. Satisfaction for each search experience along six different dimensions were recorded and compared, and respondents’ preferences for the different search types along these same six dimensions were recorded. Results: 372 respondents were enrolled in the study, with 44% self-identifying as women, 86% as White and 41% over age 45. The rate of respondents who were at least moderately familiar with searching for skin conditions using text-based search versus image-based search were 81.5% and 63.5%, respectively. After using both search modalities, respondents were highly satisfied with both image-based and text-based search, with >90% at least somewhat satisfied in each dimension and no significant differences seen between text-based and image-based search when examining the responses on an absolute scale per search modality. When asked to directly rate their preferences in a comparative way, survey respondents preferred image-based search over text-based search in 5 out of 6 dimensions, with an absolute 9.9% more preferring image-based search over text-based search overall (p=0.004). 82.5% (95% CI 78.2 - 86.3) reported a preference to leverage image-based search (alone or in combination with text-based search) in future searches. Of those who would prefer to use a combination of both, 64% indicated they would like to start with image-based search, indicating that image-based search may be the preferred entry point for skin-related searches. Conclusion: Despite being less familiar with image-based search upon study inception, survey respondents generally preferred image-based search to text-based search and overwhelmingly wanted to include this in future searches. These results suggest the potential for image-based search to play a key role in people searching for information regarding skin concerns. View details
    Iterative quality control strategies for expert medical image labeling
    Sonia Phene
    Abigail Huang
    Rebecca Ackermann
    Olga Kanzheleva
    Caitlin Taggart
    Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (2021)
    Preview abstract Data quality is a key concern for artificial intelligence (AI) efforts that rely upon crowdsourced data collection. In the domain of medicine in particular, labeled data must meet higher quality standards, or the resulting AI may lead to patient harm, and/or perpetuate biases. What are the challenges involved in expert medical labeling? What processes do such teams employ? In this study, we interviewed members of teams developing AI for medical imaging across 4 subdomains (ophthalmology, radiology, pathology, and dermatology). We identify a set of common practices for ensuring data quality. We describe one instance of low-quality labeling caught by post-launch monitoring. However, the more common pattern is to involve experts in an iterative process of defining, testing, and iterating tasks and instructions. Teams invest in these upstream efforts in order to mitigate downstream quality issues during large-scale labeling. View details
    Preview abstract Crowdsourcing has enabled the collection, aggregation and refinement of human knowledge and judgment, i.e. ground truth, for problem domains with data of increasing complexity and scale. This scale of ground truth data generation, especially towards the development of machine learning based medical applications that require large volumes of consistent diagnoses, poses significant and unique challenges to quality control. Poor quality control in crowdsourced labeling of medical data can result in undesired effects on patients' health. In this paper, we study medicine-specific quality control problems, including the diversity of grader expertise and diagnosis guidelines' ambiguity in novel datasets of three eye diseases. We present analytical findings on physicians' work patterns, evaluate existing quality control methods that rely on task completion time to circumvent the scarcity and cost problem of generating ground truth medical data, and share our experiences with a real-world system that collects medical labels at scale. View details