Viren Jain

Authored Publications
    Biological neurons come in many shapes. High-fidelity generative modeling of their varied morphologies is challenging yet underexplored in neuroscience, and crucial for the subfield of connectomics. We introduce MoGen (Neuronal Morphology Generation), a flow matching model to generate high-resolution 3D point clouds of mouse cortex axon and dendrite fragments. This is enabled by an adaptation that injects local geometric context into a scalable latent transformer backbone, allowing for the generation of high-fidelity, realistic samples. To assess MoGen's generation quality, we propose a dedicated evaluation suite with interpretable geometric and topological features tailored to neuronal structures that we validate in a user study. MoGen's practical utility is showcased through controllable generation for visualization via smooth interpolation and a direct downstream application: we augment the training set of a shape plausibility classifier from a production connectomics neuron reconstruction pipeline with millions of generated samples, thereby improving classifier accuracy and reducing the number of remaining split and merge errors by 4.4%. We estimate this can reduce manual proofreading labor by over 157 person-years for reconstruction of a full mouse brain.
    Sexual dimorphism in the complete connectome of the Drosophila male central nervous system
    Stuart Berg
    Isabella R Beckett
    Marta Costa
    Philipp Schlegel
    Elizabeth C Marin
    Aljoscha Nern
    Stephan Preibisch
    Wei Qiu
    Shin-ya Takemura
    Andrew Champion
    Reed A. George
    Gary Huang
    William Katz
    Christopher Ordish
    Ken Hayworth
    Eric Trautman
    Vivek Jayaraman
    Wyatt Korff
    Geoffrey W Meissner
    Sandro Romani
    Jan Funke
    Christopher Knecht
    Stephan Saalfeld
    Louis Scheffer
    Scott Waddell
    Gwyneth Card
    Carlos Ribeiro
    Michael B. Reiser
    Harald Hess
    Gerry Rubin
    Gregory S.X.E. Jefferis
    bioRxiv (2026)
    Sex differences in behaviour exist across all animals, typically under strong genetic regulation. In Drosophila, fruitless/doublesex transcription factors can identify dimorphic neurons but their organisation into functional circuits remains unclear. We present the connectome of the entire Drosophila male central nervous system. This contains 166,691 neurons spanning the brain and nerve cord, fully proofread and annotated including fruitless/doublesex expression and 11,691 types. We provide the first comprehensive comparison between male and female brain connectomes to synaptic resolution, finding 7,205 isomorphic, 114 dimorphic, 262 male-specific and 69 female-specific types. This resource enables analysis of full sensory-to-motor circuits underlying complex behaviours and the impact of dimorphic elements. Sex-specific/dimorphic neurons are concentrated in higher brain centres while the sensory and motor periphery are largely isomorphic. Within higher centres, male-specific connections are organised into hotspots defined by male-specific neurons or arbours. Numerous circuit switches reroute sensory information to form antagonistic circuits controlling opposing behaviours. (Full author list included with the paper.)
    Light-microscopy-based dense connectomic reconstruction of mammalian brain tissue
    Mojtaba R. Tavakoli
    Julia Lyudchik
    Vitali Vistunou
    Nathalie Agudelo Duenas
    Jakob Vorlaufer
    Christoph Sommer
    Caroline Kreuzinger
    Barbara de Souza Oliveira
    Alban Cenameri
    Gaia Novarino
    Johann Danzl
    Nature (2025)
    The information-processing capability of the brain’s cellular network depends on the physical wiring pattern between neurons and their molecular and functional characteristics. Charting neurons and resolving the individual synaptic connections requires volumetric imaging at nanoscale resolution and comprehensive cellular contrast. Light microscopy is uniquely positioned to visualize specific molecules but dense, synapse-level circuit reconstruction by light microscopy has been out of reach due to limitations in resolution, contrast, and volumetric imaging capability. Here we developed light-microscopy based connectomics (LICONN). We integrated hydrogel embedding and expansion with comprehensive deep-learning based segmentation and analysis of connectivity, thus directly incorporating molecular information in synapse-level brain tissue reconstructions. LICONN will allow synapse-level brain tissue phenotyping in biological experiments in a readily adoptable manner.
    ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
    Alexander Immer
    Alex Bo-Yuan Chen
    Mariela D. Petkova
    Nirmala A. Iyer
    Luuk Willem Hesselink
    Aparna Dev
    Gudrun Ihrke
    Woohyun Park
    Alyson Petruncio
    Aubrey Weigel
    Wyatt Korff
    Florian Engert
    Jeff W. Lichtman
    Misha B. Ahrens
    International Conference on Learning Representations (ICLR) (2025)
    Data-driven benchmarks have led to significant progress in key scientific modeling domains including weather and structural biology. Here, we present the Zebrafish Activity Prediction Benchmark (ZAPBench), which quantitatively measures progress on the problem of predicting cellular-resolution neural activity throughout an entire vertebrate brain. The benchmark is based on a novel dataset containing 4d light-sheet microscopy recordings of more than 70,000 neurons in a larval zebrafish brain, along with motion stabilized and voxel-level cell segmentations of these data that facilitate development of a variety of forecasting methods. Initial results from a selection of time series and volumetric video modeling approaches achieve better performance than naive baseline methods, but also show room for further improvement. The specific brain used in the activity recording is also undergoing synaptic-level anatomical mapping, which will enable future integration of detailed structural information into ZAP forecasting methods.
    CURIE: Evaluating LLMs on multitask long context scientific understanding and reasoning
    Hao Cui
    Zahra Shamsi
    Gowoon Cheon
    Xuejian Ma
    Shutong Li
    Maria Tikhanovskaya
    Nayantara Mudur
    Paul Raccuglia
    Victor V. Albert
    Pranesh Srinivasan
    Haining Pan
    Philippe Faist
    Brian Rohr
    Ekin Dogus Cubuk
    Muratahan Aykol
    Amil Merchant
    Michael Statt
    Drew Purves
    Elise Kleeman
    Ruth Alcantara
    Matthew Abraham
    Muqthar Mohammad
    Ean Phing VanLee
    Chenfei Jiang
    Lizzie Dorfman
    Eun-Ah Kim
    International Conference on Learning Representations (ICLR) (2025)
    The core of the scientific problem-solving process involves synthesizing information while applying expert knowledge. Large Language Models (LLMs) have the potential to accelerate this process due to their extensive knowledge across a variety of domains. Recent advancements have also made it possible for LLMs to handle very long "in-context" content. However, existing evaluations of long-context LLMs have focused on assessing their ability to summarize or retrieve information within the given context, primarily in generalist tasks that do not require deep scientific expertise. To facilitate analogous assessments of domain-specific tasks, we introduce the scientific long-Context Understanding and Reasoning Inference Evaluations (CURIE) benchmark. This benchmark provides a set of 8 challenging tasks, derived from around 250 scientific research papers, requiring domain expertise, comprehension of long in-context information, and multi-step reasoning that tests the ability of LLMs to assist scientists in realistic workflows. Tasks in CURIE have been collected from experts in six disciplines - materials science, theoretical condensed matter physics, quantum computing, geospatial analysis, biodiversity, and protein sequencing - covering both experimental and theoretical workflows in science. We evaluate a range of closed and open LLMs on these tasks. Additionally, we propose strategies for task decomposition, which allow for a more nuanced evaluation of the models and facilitate staged multi-step assessments. We hope that insights gained from CURIE can guide the future development of LLMs.
    A petavoxel fragment of human cerebral cortex reconstructed at nanoscale resolution
    Alex Shapson-Coe
    Daniel R. Berger
    Yuelong Wu
    Richard L. Schalek
    Shuohong Wang
    Neha Karlupia
    Sven Dorkenwald
    Evelina Sjostedt
    Dongil Lee
    Luke Bailey
    Angerica Fitzmaurice
    Rohin Kar
    Benjamin Field
    Hank Wu
    Julian Wagner-Carena
    David Aley
    Joanna Lau
    Zudi Lin
    Donglai Wei
    Hanspeter Pfister
    Adi Peleg
    Jeff W. Lichtman
    Science (2024)
    To fully understand how the human brain works, knowledge of its structure at high resolution is needed. Presented here is a computationally intensive reconstruction of the ultrastructure of a cubic millimeter of human temporal cortex that was surgically removed to gain access to an underlying epileptic focus. It contains about 57,000 cells, about 230 millimeters of blood vessels, and about 150 million synapses and comprises 1.4 petabytes. Our analysis showed that glia outnumber neurons 2:1, oligodendrocytes were the most common cell, deep layer excitatory neurons could be classified on the basis of dendritic orientation, and among thousands of weak connections to each neuron, there exist rare powerful axonal inputs of up to 50 synapses. Further studies using this resource may bring valuable insights into the mysteries of the human brain.
    Early machine-learning systems were inspired by neural networks; now AI might allow neuroscientists to get to grips with the brain’s unique complexities.
    Multi-Layered Maps of Neuropil with Segmentation Guided Contrastive Learning
    Sven Dorkenwald
    Daniel R. Berger
    Agnes L. Bodor
    Forrest Collman
    Casey M. Schneider-Mizell
    Nuno Maçarico da Costa
    Jeff W. Lichtman
    Nature Methods (2023)
    Maps of the nervous system that identify individual cells along with their type, subcellular components and connectivity have the potential to elucidate fundamental organizational principles of neural circuits. Nanometer-resolution imaging of brain tissue provides the necessary raw data, but inferring cellular and subcellular annotation layers is challenging. We present segmentation-guided contrastive learning of representations (SegCLR), a self-supervised machine learning technique that produces representations of cells directly from 3D imagery and segmentations. When applied to volumes of human and mouse cortex, SegCLR enables accurate classification of cellular subcompartments and achieves performance equivalent to a supervised approach while requiring 400-fold fewer labeled examples. SegCLR also enables inference of cell types from fragments as small as 10 μm, which enhances the utility of volumes in which many neurites are truncated at boundaries. Finally, SegCLR enables exploration of layer 5 pyramidal cell subtypes and automated large-scale analysis of synaptic partners in mouse visual cortex.
    Structured sampling of olfactory input by the fly mushroom body
    Zhihao Zheng
    Feng Li
    Corey Fisher
    Iqbal J. Ali
    Nadiya Sharifi
    Steven Calle-Schuler
    Joseph Hsu
    Najla Masoodpanah
    Lucia Kmecova
    Tom Kazimiers
    Eric Perlman
    Matthew Nichols
    Davi Bock
    Current Biology, 32 (2022), pp. 3334-3349
    Associative memory formation and recall in the fruit fly Drosophila melanogaster is subserved by the mushroom body (MB). Upon arrival in the MB, sensory information undergoes a profound transformation from broadly tuned and stereotyped odorant responses in the olfactory projection neuron (PN) layer to narrowly tuned and nonstereotyped responses in the Kenyon cells (KCs). Theory and experiment suggest that this transformation is implemented by random connectivity between KCs and PNs. However, this hypothesis has been challenging to test, given the difficulty of mapping synaptic connections between large numbers of brain-spanning neurons. Here, we used a recent whole-brain electron microscopy volume of the adult fruit fly to map PN-to-KC connectivity at synaptic resolution. The PN-KC connectome revealed unexpected structure, with preponderantly food-responsive PN types converging at above-chance levels on downstream KCs. Axons of the overconvergent PN types tended to arborize near one another in the MB main calyx, making local KC dendrites more likely to receive input from those types. Overconvergent PN types preferentially co-arborize and connect with dendrites of αβ and α′β′ KC subtypes. Computational simulation of the observed network showed degraded discrimination performance compared with a random network, except when all signal flowed through the overconvergent, primarily food-responsive PN types. Additional theory and experiment will be needed to fully characterize the impact of the observed non-random network structure on associative memory formation and recall.