
Cesar Ilharco Magalhaes
Authored Publications
Many AI applications of interest require specialized multi-modal models. Yet, relevant data for training these models is inherently scarce. Human annotation is prohibitively expensive, error-prone, and time-consuming. Meanwhile, existing synthetic data generation methods often rely on manual prompts, evolutionary algorithms, or extensive seed data from the target distribution, limiting scalability and control. In this paper, we introduce Simula, a novel, seedless framework that balances global and local reasoning to generate synthetic datasets. We utilize taxonomies to capture a global coverage space and use a series of agentic refinements to promote local diversity and complexity. Our approach allows users to define desired dataset characteristics through an explainable and controllable process, without relying on seed data. This unlocks new opportunities for developing and deploying AI in domains where data scarcity or privacy concerns are paramount.
Recognizing Multimodal Entailment (tutorial at ACL 2021)
Afsaneh Hajiamin Shirazi
Blaž Bratanič
Christina Liu
Gabriel Fedrigo Barcik
Georg Fritz Osang
Jared Frank
Lucas Smaira
Ricardo Abasolo Marino
Roma Patel
Vaiva Imbrasaite
(2021) (to appear)
Neural Structured Learning in TensorFlow: Hands-On Tutorial at KDD
Chun-Sung Ferng
George Yu
(2020), pp. 3501-3502
We present Neural Structured Learning (NSL) in TensorFlow, a new learning paradigm to train neural networks by leveraging structured signals in addition to feature inputs. Structure can be explicit as represented by a graph, or implicit, either induced by adversarial perturbation or inferred using techniques like embedding learning. NSL is open-sourced as part of the TensorFlow ecosystem and is widely used in Google across many products and services. In this tutorial, we provide an overview of the NSL framework including various libraries, tools, and APIs as well as demonstrate the practical use of NSL in different applications. The NSL website is hosted at www.tensorflow.org/neural_structured_learning, which includes details about the theoretical foundations of the technology, extensive API documentation, and hands-on tutorials.