Avinatan Hassidim
Authored Publications
Earth AI: Unlocking Geospatial Insights with Foundation Models and Cross-Modal Reasoning
Aaron Bell
Aviad Barzilai
Roy Lee
Gia Jung
Charles Elliott
Adam Boulanger
Amr Helmy
Jacob Bien
Ruth Alcantara
Nadav Sherman
Hassler Thurston
Yotam Gigi
Bolous Jaber
Vered Silverman
Luke Barrington
Tim Thelin
Elad Aharoni
Kartik Hegde
Yuval Carny
Shravya Shetty
Yehonathan Refael
Stone Jiang
David Schottlander
Juliet Rothenberg
Luc Houriez
Yochai Blau
Joydeep Paul
Yang Chen
Yael Maguire
Aviv Slobodkin
Shlomi Pasternak
Alex Ottenwess
Jamie McPike
Per Bjornsson
Natalie Williams
Reuven Sayag
Thomas Turnbull
Ali Ahmadalipour
David Andre
Amit Aides
Ean Phing VanLee
Niv Efron
Monica Bharel
arXiv preprint arXiv:2510.18318 (2025)
https://doi.org/10.48550/arXiv.2510.18318
Geospatial data offers immense potential for understanding our planet. However, the sheer volume and diversity of this data along with its varied resolutions, timescales, and sparsity pose significant challenges for thorough analysis and interpretation. The emergence of Foundation Models (FMs) and Large Language Models (LLMs) offers an unprecedented opportunity to tackle some of this complexity, unlocking novel and profound insights into our planet.
This paper introduces a comprehensive approach to developing Earth AI solutions, built upon foundation models across three key domains—Planet-scale Imagery, Population, and Environment—and an intelligent Gemini-powered reasoning engine. We present rigorous benchmarks showcasing the power and novel capabilities of our foundation models and validate that they provide complementary value to improve geospatial inference. We show that the synergy between these models unlocks superior predictive capabilities. To handle complex, multi-step queries, we developed a Gemini-powered agent that jointly reasons over our multiple foundation models along with large geospatial data sources and tools to unlock novel geospatial insights. On a new benchmark of real-world crisis scenarios, our agent demonstrates the ability to deliver critical and timely insights, effectively bridging the gap between raw geospatial data and actionable understanding.
An Empirical Study of Time of Day Breakpoints in Traffic Light Plans
Eliav Buchnik
Tom Kalvari
Jack Haddad
Dan Karliner
Danny Veikherman
Shai Ferster
Ori Rottenstreich
2025
The fixed-time strategy is a common approach to traffic signal control in which signal plans are simple and periodic, allowing easy implementation without detection mechanisms. A traffic light is associated with several daily plans, each applied to several consecutive hours. Time-of-day breakpoints (TODs) refer to the times of day at which the plan is changed. TODs are often selected based on traffic, aiming to divide the day into groups of consecutive hours with similar traffic characteristics within each group. We present a methodology to study time-of-day breakpoints in practice. We use this methodology to estimate and analyze time-of-day breakpoints in the city of Rio de Janeiro, Brazil, based on traffic properties derived from traffic trajectories. Our study examines over 900 of the city's intersections. We examine properties such as the number of daily plans and the times at which plans start. We also provide traffic-aware insights on the potential improvement in the selection of TODs and identify key intersections where adjusting TODs could reduce average delay times. We identify potential improvements in over 8% of the examined intersections. These findings provide valuable insights for traffic engineers seeking to optimize signal timing.
Towards Conversational AI for Disease Management
Khaled Saab
David Stutz
Kavita Kulkarni
Sara Mahdavi
Joelle Barral
James Manyika
Ryutaro Tanno
Adam Rodman
arXiv (2025)
While large language models (LLMs) have shown promise in diagnostic dialogue, their capabilities for effective management reasoning - including disease progression, therapeutic response, and safe medication prescription - remain under-explored. We advance the previously demonstrated diagnostic capabilities of the Articulate Medical Intelligence Explorer (AMIE) through a new LLM-based agentic system optimised for clinical management and dialogue, incorporating reasoning over the evolution of disease and multiple patient visit encounters, response to therapy, and professional competence in medication prescription. To ground its reasoning in authoritative clinical knowledge, AMIE leverages Gemini's long-context capabilities, combining in-context retrieval with structured reasoning to align its output with relevant and up-to-date clinical practice guidelines and drug formularies. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) study, AMIE was compared to 21 primary care physicians (PCPs) across 100 multi-visit case scenarios designed to reflect UK NICE Guidance and BMJ Best Practice guidelines. AMIE was non-inferior to PCPs in management reasoning as assessed by specialist physicians and scored better in both preciseness of treatments and investigations, and in its alignment with and grounding of management plans in clinical guidelines. To benchmark medication reasoning, we developed RxQA, a multiple-choice question benchmark derived from two national drug formularies (US, UK) and validated by board-certified pharmacists. While AMIE and PCPs both benefited from the ability to access external drug information, AMIE outperformed PCPs on higher difficulty questions. While further research would be needed before real-world translation, AMIE's strong performance across evaluations marks a significant step towards conversational AI as a tool in disease management.
LLM-based Lossless Text Simplification and its Effect on User Comprehension and Mental Load
Theo Guidroz
Diego Ardila
Jimmy Li
Adam Mansour
Paul Jhun
Nina Gonzalez
Xiang Ji
Mike Sanchez
Sujay Kakarmath
Miguel Ángel Garrido
Faruk Ahmed
Divyansh Choudhary
Jay Hartford
Georgina Xu
Henry Serrano
Yifan Wang
Jeff Shaffer
Eric (Yifan) Cao
Sho Fujiwara
Peggy Bui
arXiv (2025)
Information on the web, such as scientific publications and Wikipedia, often surpasses users' reading level. To help address this, we used a self-refinement approach to develop an LLM capability for minimally lossy text simplification. To validate our approach, we conducted a randomized study involving 4563 participants and 31 texts spanning 6 broad subject areas: PubMed (biomedical scientific articles), biology, law, finance, literature/philosophy, and aerospace/computer science. Participants were randomized to view original or simplified texts in a subject area, and answered multiple-choice questions (MCQs) that tested their comprehension of the text. The participants were also asked to provide qualitative feedback such as task difficulty. Our results indicate that participants who read the simplified text answered more MCQs correctly than their counterparts who read the original text (3.9% absolute increase, p<0.05). This gain was most striking with PubMed (14.6%), while more moderate gains were observed for the finance (5.5%), aerospace/computer science (3.8%), and legal (3.5%) domains. Notably, the results were robust to whether participants could refer back to the text while answering MCQs. The absolute accuracy decreased by up to ~9% for both original and simplified setups where participants could not refer back to the text, but the ~4% overall improvement persisted. Finally, participants' self-reported perceived ease based on a simplified NASA Task Load Index was greater for those who read the simplified text (absolute change on a 5-point scale 0.33, p<0.05). This randomized study, involving an order of magnitude more participants than prior work, demonstrates the potential of LLMs to make complex information easier to understand. Our work aims to enable a broader audience to better learn and make use of expert knowledge available on the web, improving information accessibility.
Day-of-the-week Awareness in Time of Day Breakpoints for Traffic Light Plans
Ori Rottenstreich
Eliav Buchnik
Shai Ferster
Tom Kalvari
Ron Tsibulsky
Danny Veikherman
Jack Haddad
2025
Time-of-day breakpoints (TODs) refer to the times of day at which the plan of a traffic light is changed. Traditionally, TODs are selected jointly for all weekdays (Monday-Friday), typically with additional TODs dedicated to weekends. In this paper, we present an alternative approach, motivated by traffic characteristics that can differ among the weekdays Monday-Friday, and consider TODs that are day-of-the-week aware. The traffic-aware approach studies similarities among days and computes TODs that can be shared among days with similar characteristics while allowing distinct TODs for weekdays with unique characteristics. Based on traffic properties derived from anonymized trajectories, we apply the new methodology to compute time-of-day breakpoints that are day-of-the-week aware in the city of Rio de Janeiro, Brazil, and estimate the impact of the new methodology.
Study of Arterials in the City of Rio de Janeiro for Traffic Coordination
Ori Rottenstreich
Eliav Buchnik
Danny Veikherman
Dan Karliner
Tom Kalvari
Shai Ferster
Ron Tsibulsky
Jack Haddad
2025
Urban traffic congestion is a growing challenge, and optimizing signal timing strategies is crucial for improving traffic flow and reducing emissions. The coordination of signalized intersections improves both traffic operations and environmental aspects. Coordination is particularly important along arterials, sequences of signalized intersections that serve as the primary routes and carry a high volume of traffic. In this paper we analyze real data from the city of Rio de Janeiro to study properties of arterials. We examine their length, the distance between intersections, and properties of the traffic light plans such as cycle time. We then study their in-practice level of coordination in terms of the number of stops and their common locations along the arterials. We examine particular arterials in depth and provide insights that can be useful for the efficient design of arterials in other cities. Based on the analysis, we show how simple traffic properties can indicate the potential of coordinating two adjacent intersections as part of an arterial to improve traffic performance.
Fine-grained Measurement of Vehicle Delay Fairness
Eliav Buchnik
Tom Kalvari
Jack Haddad
Dan Karliner
Danny Veikherman
Ron Tsibulsky
Shai Ferster
Ori Rottenstreich
2025
Optimizing signal timing in traffic lights helps to improve traffic flow and reduce emissions by reducing delays. At intersections, vehicles from different movements experience different delays, which depend on the traffic light plan. This paper analyzes delay fairness among various vehicles at intersections. We study three cities, Rio de Janeiro, Hamburg and Seattle, with a total of over 5100 intersections. We present an intuitive methodology to compute delay fairness based on the Gini index, a common fairness measure in economics. We evaluate the fairness based on real traffic data and provide insights on the relationship of fairness with time of day and traffic demand. We also examine real changes in traffic light plans that occurred in practice to check whether improving delay is often aligned with increasing fairness.
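For readers unfamiliar with the Gini index mentioned in this abstract, the following is a minimal sketch of the standard textbook formula applied to a list of vehicle delays. It is not the paper's methodology, which the abstract does not detail; the function name `gini_index` and the sample delay values are hypothetical and chosen only for illustration.

```python
def gini_index(delays):
    """Gini index of non-negative values.

    Returns 0.0 when all values are equal and approaches 1.0 as the
    total becomes concentrated in a single value.
    """
    values = sorted(delays)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        # No vehicles or no delay at all: treat as perfectly equal.
        return 0.0
    # Sorted-rank form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n indexes the values in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical per-vehicle delays in seconds (illustrative only).
    equal_delays = [10, 10, 10, 10]
    skewed_delays = [0, 0, 0, 40]
    print(gini_index(equal_delays))
    print(gini_index(skewed_delays))
```

A lower value indicates that delay is spread more evenly across vehicles; a higher value indicates that a few vehicles bear most of the delay.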
Generative AI for medical education: Insights from a case study with medical students and an AI tutor for clinical reasoning
Amy Wang
Roma Ruparel
Paul Jhun
Julie Anne Seguin
Patricia Strachan
Renee Wong
2025
Generative Artificial Intelligence (AI), particularly Large Language Models (LLMs), has demonstrated significant potential in clinical reasoning skills such as history-taking and differential diagnosis generation, both critical aspects of medical education. This work explores how LLMs can augment medical curricula through interactive learning. We conducted a participatory design process with medical students, residents and medical education experts to co-create an AI-powered tutor prototype for clinical reasoning. As part of the co-design process, we conducted a qualitative user study, investigating learning needs and practices via interviews, and conducting concept evaluations through interactions with the prototype. Findings highlight the challenges learners face in transitioning from theoretical knowledge to practical application, and how an AI tutor can provide personalized practice and feedback. We conclude with design considerations, emphasizing the importance of context-specific knowledge and emulating positive preceptor traits, to guide the development of AI tools for medical education.
LLM-based Lossless Text Simplification and its Effect on User Comprehension and Cognitive Load
Theo Guidroz
Diego Ardila
Jimmy Li
Adam Mansour
Paul Jhun
Nina Gonzalez
Xiang Ji
Mike Sanchez
Sujay Kakarmath
Miguel Ángel Garrido
Faruk Ahmed
Divyansh Choudhary
Jay Hartford
Georgina Xu
Henry Serrano
Yifan Wang
Jeff Shaffer
Eric (Yifan) Cao
Sho Fujiwara
Peggy Bui
arXiv (2025)
Information on the web, such as scientific publications and Wikipedia, often surpasses users' reading level. To help address this, we used a self-refinement approach to develop an LLM capability for minimally lossy text simplification. To validate our approach, we conducted a randomized study involving 4563 participants and 31 texts spanning 6 broad subject areas: PubMed (biomedical scientific articles), biology, law, finance, literature/philosophy, and aerospace/computer science. Participants were randomized to view original or simplified texts in a subject area, and answered multiple-choice questions (MCQs) that tested their comprehension of the text. The participants were also asked to provide qualitative feedback such as task difficulty. Our results indicate that participants who read the simplified text answered more MCQs correctly than their counterparts who read the original text (3.9% absolute increase, p<0.05). This gain was most striking with PubMed (14.6%), while more moderate gains were observed for the finance (5.5%), aerospace/computer science (3.8%), and legal (3.5%) domains. Notably, the results were robust to whether participants could refer back to the text while answering MCQs. The absolute accuracy decreased by up to ~9% for both original and simplified setups where participants could not refer back to the text, but the ~4% overall improvement persisted. Finally, participants' self-reported perceived ease based on a simplified NASA Task Load Index was greater for those who read the simplified text (absolute change on a 5-point scale 0.33, p<0.05). This randomized study, involving an order of magnitude more participants than prior work, demonstrates the potential of LLMs to make complex information easier to understand. Our work aims to enable a broader audience to better learn and make use of expert knowledge available on the web, improving information accessibility.
A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations
Uri Hasson
Samuel A. Nastase
Harshvardhan Gazula
Aditi Rao
Tom Sheffer
Werner Doyle
Orrin Devinsky
Aditi Singh
Adeen Flinker
Patricia Dugan
Bobbi Aubrey
Sasha Devore
Daniel Friedman
Leonard Niekerken
Catherine Kim
Haocheng Wang
Zaid Zada
Gina Choe
Nature Human Behaviour (2025)
This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model’s speech embeddings, and higher-level language areas better align with the model’s language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.