Robotics
Having a machine learning agent interact with its environment requires true unsupervised learning, skill acquisition, active learning, exploration, and reinforcement: all ingredients of human learning that are still not well understood or exploited by the supervised approaches that dominate deep learning today. Our goal is to improve robotics via machine learning, and to improve machine learning via robotics. We foster close collaborations between machine learning researchers and roboticists to enable learning at scale on real and simulated robotic systems.
Recent Publications

Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Jesse Zhang, Jiahui Zhang, Karl Pertsch, Ziyi Liu, Xiang Ren, Shao-Hua Sun, Joseph Lim
Conference on Robot Learning (CoRL), 2023

We propose BOSS, an approach that automatically learns to solve new long-horizon, complex, and meaningful tasks by autonomously growing a learned skill library. Prior work in reinforcement learning requires expert supervision, in the form of demonstrations or rich reward functions, to learn long-horizon tasks. Instead, our approach, BOSS (BOotStrapping your own Skills), learns to accomplish new tasks by performing “skill bootstrapping,” where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set. This bootstrapping phase is guided by large language models (LLMs) that inform the agent of meaningful skills to chain together. Through this process, BOSS builds a wide range of complex and useful behaviors from a basic set of primitive skills. We demonstrate through experiments in realistic household environments that agents trained with our LLM-guided bootstrapping procedure outperform those trained with naive bootstrapping, as well as prior unsupervised skill acquisition methods, on zero-shot execution of unseen, long-horizon tasks in new environments.
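
The bootstrapping loop this describes can be pictured with a minimal sketch. The environment interface, the skill objects, and the LLM call (llm_propose_next_skill) below are hypothetical placeholders standing in for the components the abstract names; this is an illustration, not the authors' implementation.

    # Minimal sketch of LLM-guided skill bootstrapping (illustrative only;
    # all interfaces are hypothetical placeholders, not the BOSS code).
    import random

    def skill_bootstrapping(env, skill_library, llm_propose_next_skill,
                            learn_chained_skill, num_rounds=100):
        """Grow a skill library by chaining existing skills into longer behaviors."""
        for _ in range(num_rounds):
            obs = env.reset()
            executed = []
            skill = random.choice(skill_library)  # start from a known skill
            while skill is not None:
                obs, success = skill.execute(env, obs)
                if not success:
                    break
                executed.append(skill)
                # Ask the LLM which skill would meaningfully extend the chain;
                # no task-specific reward is needed during this phase.
                skill = llm_propose_next_skill(skill_library, executed, obs)
            if len(executed) > 1:
                # Distill the successful chain into a new, longer-horizon skill
                # and add it to the library for later rounds.
                skill_library.append(learn_chained_skill(executed))
        return skill_library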
              
  
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson
Robotics: Science and Systems (RSS), 2023

In recent years, much progress has been made in learning robotic manipulation policies that can follow natural language instructions. Common approaches involve learning methods that operate on offline datasets, such as task-specific teleoperated demonstrations or hindsight-labeled robotic experience. Such methods work reasonably well but rely strongly on the assumption of clean data: teleoperated demonstrations are collected with specific tasks in mind, while hindsight language descriptions rely on expensive human labeling. Recently, large-scale pretrained language and vision-language models like CLIP have been applied to robotics in the form of learning representations and planners. However, can these pretrained models also be used to cheaply impart internet-scale knowledge onto offline datasets, providing access to skills contained in the offline data that weren't necessarily reflected in the ground-truth labels? We investigate fine-tuning a reward model on a small dataset of robot interactions with crowd-sourced natural language labels and using the model to relabel instructions of a large offline robot dataset. The resulting dataset with diverse language skills is used to train imitation learning policies, which outperform prior methods by up to 30% when evaluated on a diverse set of novel language instructions that were not contained in the original dataset.
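
As a rough sketch of the relabeling pipeline described above (all names and interfaces are assumptions for illustration, not the paper's code), a relevance model fine-tuned on crowd-sourced language labels can score candidate instructions against each trajectory in a large offline dataset, and high-scoring instructions become additional labels for imitation learning:

    # Illustrative instruction-relabeling loop (hypothetical interfaces).
    def relabel_offline_dataset(dataset, candidate_instructions, relevance_model,
                                score_threshold=0.5):
        """Attach new language instructions to trajectories in an offline dataset."""
        relabeled = []
        for trajectory in dataset:
            # Score every candidate instruction against the trajectory's frames.
            scores = [
                relevance_model.score(trajectory.frames, instruction)
                for instruction in candidate_instructions
            ]
            best_score = max(scores)
            if best_score >= score_threshold:
                best_instruction = candidate_instructions[scores.index(best_score)]
                # Keep an augmented copy alongside the original trajectory.
                relabeled.append(trajectory.with_instruction(best_instruction))
            relabeled.append(trajectory)
        return relabeled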
              
  
Scalable Multi-Sensor Robot Imitation Learning via Task-Level Domain Consistency
Armando Fuentes, Daniel Ho, Eric Victor Jang, Matt Bennice, Mohi Khansari, Nicolas Sievers, Sean Kirmani, Yuqing Du
ICRA, 2023 (to appear)

Recent work in visual end-to-end learning for robotics has shown the promise of imitation learning across a variety of tasks. However, such approaches are often expensive and require vast amounts of real-world training demonstrations. Additionally, they rely on a time-consuming evaluation process for identifying the best model to deploy in the real world. These challenges can be mitigated by simulation: supplementing real-world data with simulated demonstrations and using simulated evaluations to identify strong policies. However, this introduces the well-known "reality gap" problem, where simulator inaccuracies decorrelate performance in simulation from performance in reality. In this paper, we build on prior work in GAN-based domain adaptation and introduce the notion of a Task Consistency Loss (TCL), a self-supervised contrastive loss that encourages sim and real alignment at both the feature and action-prediction levels. We demonstrate the effectiveness of our approach on the challenging task of latched-door opening with a 9 Degree-of-Freedom (DoF) mobile manipulator from raw RGB and depth images. While most prior work in vision-based manipulation operates from a fixed, third-person view, mobile manipulation couples the challenges of locomotion and manipulation with greater visual diversity and action-space complexity. We find that we are able to achieve 77% success on seen and unseen scenes, a +30% increase over the baseline, using only ~16 hours of teleoperation demonstrations in sim and real.
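
A consistency loss of this kind can be sketched as a contrastive objective that pulls paired sim and real embeddings of the same task together and pushes unpaired ones apart. The snippet below is an InfoNCE-style illustration of that idea, not the paper's exact TCL (which also acts at the action-prediction level):

    # Illustrative contrastive sim/real consistency loss (a sketch only).
    import torch
    import torch.nn.functional as F

    def task_consistency_loss(sim_features, real_features, temperature=0.1):
        """sim_features, real_features: [batch, dim]; row i of each is a matched pair."""
        sim = F.normalize(sim_features, dim=1)
        real = F.normalize(real_features, dim=1)
        logits = sim @ real.t() / temperature      # [batch, batch] similarity matrix
        targets = torch.arange(sim.shape[0], device=sim.device)
        # Matched sim/real pairs sit on the diagonal; treat them as positives.
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))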
              
  
Single-Level Differentiable Contact Simulation
Simon Le Cleac'h, Mac Schwager, Zachary Manchester, Pete Florence
IEEE Robotics and Automation Letters (RA-L), 2023

We present a differentiable formulation of rigid-body contact dynamics for objects and robots represented as compositions of convex primitives. Existing optimization-based approaches to simulating contact between convex primitives rely on a bilevel formulation that separates collision detection from contact simulation. These approaches are unreliable in realistic contact simulation scenarios because isolating the collision detection problem introduces contact-location non-uniqueness. Our approach combines contact simulation and collision detection into a unified single-level optimization problem. This disambiguates the collision detection problem in a physics-informed manner. Compared to previous differentiable simulation approaches, our formulation offers improved simulation robustness and a computational complexity improved by more than an order of magnitude. We provide a numerically efficient implementation of our formulation in the Julia language, called DojoLight.jl (https://github.com/simon-lc/DojoLight.jl).
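
Schematically (a generic illustration of the bilevel versus single-level structure, not the paper's exact formulation), a bilevel simulator solves collision detection as an inner problem whose witness points feed the contact dynamics, whereas a single-level formulation optimizes over the witness points jointly with the contact impulses:

    % Generic sketch: v is velocity, lambda the contact impulse, phi the signed
    % distance, p_a, p_b witness points on the convex primitives A and B.
    \begin{aligned}
    \text{bilevel:}\quad & \text{find } (v^{+},\lambda)\ \text{s.t. } M(v^{+}-v) = J(p^{\star})^{\top}\lambda + h\,\tau,\qquad 0 \le \lambda \ \perp\ \phi(p^{\star}) \ge 0,\\
    & \text{where } p^{\star} = \arg\min_{p_a \in \mathcal{A},\ p_b \in \mathcal{B}} \|p_a - p_b\| \ \text{ is a separate collision-detection problem;}\\[4pt]
    \text{single level:}\quad & \text{find } (v^{+},\lambda,p_a,p_b)\ \text{jointly, with } p_a \in \mathcal{A},\ p_b \in \mathcal{B}\ \text{imposed as constraints of the same problem.}
    \end{aligned}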
              
  
A Connection between Actor Regularization and Critic Regularization in Reinforcement Learning
Benjamin Eysenbach, Matthieu Geist, Ruslan Salakhutdinov, Sergey Levine
International Conference on Machine Learning (ICML), 2023

As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting, with most methods regularizing either the actor or the critic. These methods appear distinct. Actor regularization (e.g., behavioral cloning penalties) is simpler and has appealing convergence properties, while critic regularization typically requires significantly more compute because it involves solving a game, but it has appealing lower-bound guarantees. Empirically, prior work alternates between claiming better results with actor regularization and with critic regularization. In this paper, we show that these two regularization techniques can be equivalent under some assumptions: regularizing the critic using a CQL-like objective is equivalent to updating the actor with a BC-like regularizer and a SARSA Q-value (i.e., “1-step RL”). Our experiments show that this theoretical model makes accurate, testable predictions about the performance of CQL and one-step RL. While our results do not definitively say whether users should prefer actor regularization or critic regularization, they hint that actor regularization methods may be a simpler way to achieve the desirable properties of critic regularization. The results also suggest that the empirically demonstrated benefits of both types of regularization might be more a function of implementation details than of objective superiority.
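
To make the two regularizers concrete, here is a minimal sketch of each for a discrete-action offline RL setup (an illustration of standard CQL-style and BC-style penalties, not the paper's code or its exact theoretical construction):

    # Hedged sketch: a CQL-style critic penalty vs. a BC-regularized actor update.
    import torch
    import torch.nn.functional as F

    def critic_regularizer_cql(q_values, dataset_actions, alpha=1.0):
        """CQL-style penalty: push Q down on all actions (log-sum-exp) and up on dataset actions.
        q_values: [batch, num_actions]; dataset_actions: [batch] of action indices."""
        logsumexp_q = torch.logsumexp(q_values, dim=1)
        data_q = q_values.gather(1, dataset_actions.unsqueeze(1)).squeeze(1)
        return alpha * (logsumexp_q - data_q).mean()

    def actor_loss_one_step(policy_logits, dataset_actions, sarsa_q, alpha=1.0):
        """One-step-RL-style update: maximize a SARSA Q-value under the policy,
        plus a behavioral-cloning penalty toward the dataset actions."""
        probs = F.softmax(policy_logits, dim=1)
        expected_q = (probs * sarsa_q).sum(dim=1)          # E_{a~pi}[Q_SARSA(s, a)]
        bc_loss = F.cross_entropy(policy_logits, dataset_actions)
        return -expected_q.mean() + alpha * bc_loss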
              
  
Robotic Table Tennis: A Case Study into a High Speed Learning System
David B. D'Ambrosio, Jon Abelian, Saminda Abeyruwan, Michael Ahn, Justin Boyd, Erwin Johan Coumans, Omar Escareno, Wenbo Gao, Navdeep Jaitly, Juhana Kangaspunta, Satoshi Kataoka, Gus Kouretas, Yuheng Kuang, Corey Lynch, Thinh Nguyen, Ken Oslund, Barney J. Reed, Anish Shankar, Pierre Sermanet, Avi Singh, Grace Vesom, Peng Xu
Robotics: Science and Systems (RSS), 2023

We present a deep dive into a learning robotic system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and can precisely return the ball to desired targets. This system puts together a highly optimized and novel perception subsystem, a high-speed low-latency robot controller, a simulation paradigm that can prevent damage in the real world and also train policies for zero-shot transfer, and automated real-world environment resets that enable autonomous training and evaluation on physical robots. We complement a complete system description, including numerous design decisions that are typically not widely disseminated, with a collection of ablation studies that clarify the importance of mitigating various sources of latency, accounting for training and deployment distribution shifts, robustness of the perception system, and sensitivity to policy hyperparameters and choice of action space. A video demonstrating the components of our system and details of experimental results is included in the supplementary material.
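
The kind of fixed-rate loop such a system must sustain can be pictured with a toy sketch (purely illustrative; the component names, interfaces, and rate below are assumptions, not the system described above). Any stage that overruns its time slice adds control latency, one of the effects the ablations examine:

    # Toy fixed-rate perception -> policy -> control loop with a latency budget
    # (illustrative only; hypothetical interfaces).
    import time

    def control_loop(camera, policy, robot, hz=100.0):
        period = 1.0 / hz
        while True:
            t_start = time.monotonic()
            ball_state = camera.latest_estimate()       # most recent perception output
            action = policy(ball_state, robot.state())  # policy inference
            robot.send_command(action)                  # low-level control command
            elapsed = time.monotonic() - t_start
            if elapsed < period:
                time.sleep(period - elapsed)
            # else: the loop overran its budget, adding control latency.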
              
  