Avinatan Hassidim

Avinatan Hassidim

Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract Computing efficient traffic signal plans is often based on the amount of traffic in an intersection, its distribution over the various intersection movements and hours as well as on performance metrics such as traffic delay. In their simple and typical form plans are fixed in the same hour over weekdays. This allows low operation costs without the necessity for traffic detection and monitoring tools. A critical factor on the potential efficiency of such plans is the similarity of traffic patterns over the days along each of the intersection movements. We refer to such similarity as the traffic stability of the intersection and define simple metrics to measure it based on traffic volume and traffic delay. In this paper, we propose an automatic probe data based method, for city-wide estimation of traffic stability. We discuss how such measures can be used for signal planning such as in selecting plan resolution or as an indication as which intersections can benefit from dynamic but expensive traffic detection tools. We also identify events of major changes in traffic characteristics of an intersection. We demonstrate the framework by using real traffic statistics to study the traffic stability in the city of Haifa along its 162 intersections. We study the impact of the time of day on the stability, detect major changes in traffic and find intersections with high and low stability. View details
    Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns
    Ariel Goldstein
    Avigail Grinstein-Dabush
    Haocheng Wang
    Zhuoqiao Hong
    Bobbi Aubrey
    Samuel A. Nastase
    Zaid Zada
    Eric Ham
    Harshvardhan Gazula
    Eliav Buchnik
    Werner Doyle
    Sasha Devore
    Patricia Dugan
    Roi Reichart
    Daniel Friedman
    Orrin Devinsky
    Adeen Flinker
    Uri Hasson
    Nature Communications (2024)
    Preview abstract Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. We demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns using stringent zero-shot mapping. The common geometric patterns allow us to predict the brain embedding of a given left-out word in IFG based solely on its geometrical relationship to other nonoverlapping words in the podcast. Furthermore, we show that contextual embeddings better capture the geometry of IFG embeddings than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain. View details
    A Neural Encoder for Earthquake Rate Forecasting
    Oleg Zlydenko
    Brendan Meade
    Alexandra Sharon Molchanov
    Sella Nevo
    Yohai bar Sinai
    Scientific Reports (2023)
    Preview abstract Forecasting the timing of earthquakes is a long-standing challenge. Moreover, it is still debated how to formulate this problem in a useful manner, or to compare the predictive power of different models. Here, we develop a versatile neural encoder of earthquake catalogs, and apply it to the fundamental problem of earthquake rate prediction, in the spatio-temporal point process framework. The epidemic type aftershock sequence model (ETAS) effectively learns a small number of parameters to constrain assumed functional forms for the space and time relationships of earthquake sequences (e.g., Omori-Utsu law). Here we introduce learned spatial and temporal embeddings for point process earthquake forecast models that capture complex correlation structures. We demonstrate the generality of this neural representation as compared with ETAS model using train-test data splits and how it enables the incorporation of additional geophysical information. In rate prediction tasks, the generalized model shows > 4% improvement in information gain per earthquake and the simultaneous learning of anisotropic spatial structures analogous to fault traces. The trained network can be also used to perform short-term prediction tasks, showing similar improvement while providing a 1,000-fold reduction in run-time. View details
    Caravan - A global community dataset for large-sample hydrology
    Frederik Kratzert
    Nans Addor
    Tyler Erickson
    Martin Gauch
    Lukas Gudmundsson
    Daniel Klotz
    Sella Nevo
    Guy Shalev
    Scientific Data, 10 (2023), pp. 61
    Preview abstract High-quality datasets are essential to support hydrological science and modeling. Several CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) datasets exist for specific countries or regions, however these datasets lack standardization, which makes global studies difficult. This paper introduces a dataset called Caravan (a series of CAMELS) that standardizes and aggregates seven existing large-sample hydrology datasets. Caravan includes meteorological forcing data, streamflow data, and static catchment attributes (e.g., geophysical, sociological, climatological) for 6830 catchments. Most importantly, Caravan is both a dataset and open-source software that allows members of the hydrology community to extend the dataset to new locations by extracting forcing data and catchment attributes in the cloud. Our vision is for Caravan to democratize the creation and use of globally-standardized large-sample hydrology datasets. Caravan is a truly global open-source community resource. View details
    AI Increases Global Access to Reliable Flood Forecasts
    Asher Metzger
    Dana Weitzner
    Frederik Kratzert
    Guy Shalev
    Martin Gauch
    Sella Nevo
    Shlomo Shenzis
    Tadele Yednkachw Tekalign
    Vusumuzi Dube
    arXiv (2023)
    Preview abstract Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Here we show that AI-based forecasting achieves reliability in predicting extreme riverine events in ungauged watersheds at up to a 5-day lead time that is similar to or better than the reliability of nowcasts (0-day lead time) from a current state of the art global modeling system (the Copernicus Emergency Management Service Global Flood Awareness System). Additionally, we achieve accuracies over 5-year return period events that are similar to or better than current accuracies over 1-year return period events. This means that AI can provide flood warnings earlier and over larger and more impactful events in ungauged basins. The model developed in this paper was incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings. View details
    Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
    Paul Roit
    Johan Ferret
    Geoffrey Cideron
    Matthieu Geist
    Sertan Girgin
    Léonard Hussenot
    Nikola Momchev
    Piotr Stanczyk
    Nino Vieillard
    Olivier Pietquin
    Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (2023), 6252–6272
    Preview abstract Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual-entailment rewards to optimize for factual consistency and explore the ensuing trade-offs, as improved consistency may come at the cost of less informative or more extractive summaries. Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience and conciseness of the generated summaries. View details
    Eddystone-EID: Secure and Private Infrastructural Protocol for BLE Beacons
    Liron David
    Alon Ziv
    IEEE Transactions on Information Forensics and Security (2022)
    Preview abstract Beacons are small devices which are playing an important role in the Internet of Things (IoT), connecting “things” without IP connection to the Internet via Bluetooth Low Energy (BLE) communication. In this paper we present the first private end-to-end encryption protocol called the Eddystone-Ephemeral-ID (Eddystone-EID) protocol. This protocol enables connectivity from any beacon to its remote owner, while supporting beacon’s privacy and security, and essentially preserving the beacon’s low power consumption. We describe the Eddystone-EID development goals, discuss the design decisions, show the cryptographic solution, and analyse its privacy, security, and performance. Finally, we present three secure IoT applications built on Eddystone-EID, demonstrating its utility as a security and privacy infrastructure in the IoT domain. Further, Eddystone-EID is a prototypical example of security design for an asymmetric system in which on one side there are small power-deficient elements (the beacons) and on the other side there is a powerful computing engine (a cloud). The crux of the design strategy is based on: (1) transferring work from the beacon to the cloud, and then (2) building a trade-off between cloud online work against cloud offline work, in order to enable fast real-time reaction of the cloud. These two principles seem to be generic and can be used for other problems in the IoT domain. View details
    Preview abstract Clinical notes often contain vital information not observed in other structured data, but their unstructured nature can lead to critical patient-related information being lost. To make sure this valuable information is utilized for patient care, algorithms that summarize notes into a problem list are often proposed. Focusing on identifying medically-relevant entities in the free-form text, these solutions are often detached from a canonical ontology and do not allow downstream use of the detected text-spans. As a solution, we present here a system for generating a canonical problem list from medical notes, consisting of two major stages. At the first stage, annotation, we use a transformer model to detect all clinical conditions which are mentioned in a single note. These clinical conditions are then grounded to a predefined ontology, and are linked to spans in the text. At the second stage, summarization, we aggregate over the set of clinical conditions detected on all of the patient's note, and produce a concise patient summary that organizes their important conditions. View details
    Preview abstract A streaming algorithm is said to be adversarially robust if its accuracy guarantees are maintained even when the data stream is chosen maliciously, by an adaptive adversary. We establish a connection between adversarial robustness of streaming algorithms and the notion of differential privacy. This connection allows us to design new adversarially robust streaming algorithms that outperform the current state-of-the-art constructions for many interesting regimes of parameters. View details
    Flood forecasting with machine learning models in an operational framework
    Asher Metzger
    Chen Barshai
    Dana Weitzner
    Frederik Kratzert
    Gregory Begelman
    Guy Shalev
    Hila Noga
    Moriah Royz
    Niv Giladi
    Ronnie Maor
    Sella Nevo
    Yotam Gigi
    HESS (2022)
    Preview abstract Google’s operational flood forecasting system was developed to provide accurate real-time flood warnings to agencies and the public, with a focus on riverine floods in large, gauged rivers. It became operational in 2018 and has since expanded geographically. This forecasting system consists of four subsystems: data validation, stage forecasting, inundation modeling, and alert distribution. Machine learning is used for two of the subsystems. Stage forecasting is modeled with the Long Short-Term Memory (LSTM) networks and the Linear models. Flood inundation is computed with the Thresholding and the Manifold models, where the former computes inundation extent and the latter computes both inundation extent and depth. The Manifold model, presented here for the first time, provides a machine-learning alternative to hydraulic modeling of flood inundation. When evaluated on historical data, all models achieve sufficiently high-performance metrics for operational use. The LSTM showed higher skills than the Linear model, while the Thresholding and Manifold models achieved similar performance metrics for modeling inundation extent. During the 2021 monsoon season, the flood warning system was operational in India and Bangladesh, covering flood-prone regions around rivers with a total area of 287,000 km2, home to more than 350M people. More than 100M flood alerts were sent to affected populations, to relevant authorities, and to emergency organizations. Current and future work on the system includes extending coverage to additional flood-prone locations, as well as improving modeling capabilities and accuracy. View details