🌊 Self-Improving Autonomous Underwater Manipulation 🤖

1Columbia University, 2Stanford University, 3University of Notre Dame

AquaBot is an autonomous manipulation system that combines behavior cloning from human demonstrations with self-learning to improve beyond human teleoperation performance.

Abstract

Underwater robotic manipulation faces significant challenges due to complex fluid dynamics and unstructured environments, causing most manipulation systems to rely heavily on human teleoperation. In this paper, we introduce AquaBot, a fully autonomous manipulation system that combines behavior cloning from human demonstrations with self-learning optimization to improve beyond human teleoperation performance. With extensive real-world experiments, we demonstrate AquaBot’s versatility across diverse manipulation tasks, including object grasping, trash sorting, and rescue retrieval. Our real-world experiments show that AquaBot’s self-optimized policy outperforms a human operator by 41% in speed. AquaBot represents a promising step towards autonomous and self-improving underwater manipulation systems. We open-source both hardware and software implementation details.

Video

Method

Our method combines behavior cloning and self-learning. In the first stage, we collect human demonstrations and train a behavior cloning model. In the second stage, we use the behavior cloning policy to collect its own rollout data for self-learning, improving performance beyond human teleoperation.
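To make the first stage concrete, below is a minimal behavior-cloning sketch in PyTorch. The network architecture, observation/action shapes, and hyperparameters are illustrative assumptions, not AquaBot's exact implementation; see the GitHub repo for the real one.

```python
# Minimal behavior-cloning sketch (illustrative, not AquaBot's exact architecture).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical demonstration data: camera observations and teleoperated actions.
obs = torch.randn(512, 3, 96, 96)       # RGB frames from the onboard camera
actions = torch.randn(512, 6)           # 6-DoF vehicle/gripper commands

policy = nn.Sequential(                 # small CNN encoder + MLP action head
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 128), nn.ReLU(),
    nn.Linear(128, 6),
)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loader = DataLoader(TensorDataset(obs, actions), batch_size=64, shuffle=True)

for epoch in range(10):
    for o, a in loader:
        loss = nn.functional.mse_loss(policy(o), a)  # regress demonstrated actions
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```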

Task 1: Rock Grasping

We show real-world deployment of our grasping policy under various lighting conditions, initial states, rock geometries, and environmental/human perturbations.

Task 2: Garbage Sorting

We can chain multiple skills into a long-horizon behavior. In this task, AquaBot sorts 3 categories of objects into their corresponding bins. We show an uncut, 20-minute video of a policy rollout; a minimal sketch of such skill chaining appears below.
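The following sketch shows one way skills could be chained sequentially. The skill names, termination checks, and environment API are hypothetical placeholders, not AquaBot's actual interfaces.

```python
# Illustrative skill-chaining loop (skill names and environment API are hypothetical).

def run_skill(policy, env, is_done, max_steps=600):
    """Roll out one learned skill until its termination condition or a step limit."""
    obs = env.get_observation()
    for _ in range(max_steps):
        obs = env.step(policy(obs))
        if is_done(obs):
            return True
    return False

def sort_object(env, skills):
    """Chain search -> grasp -> transport -> drop into a long-horizon behavior."""
    for name in ["search", "grasp", "transport", "drop"]:
        policy, is_done = skills[name]
        if not run_skill(policy, env, is_done):
            return False   # abort the chain if any skill times out
    return True
```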

Task 3: Robot Safeguard

Our policy can autonomously manipulate large objects with complicated dynamics that are twice the robot's own weight and size.

Self Learning for Policy Acceleration

We use the behavior cloning policy to collect its own rollout data for self-learning, improving execution speed beyond the human-teleoperation level.
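As a rough illustration of what such self-optimization can look like, the sketch below tunes an execution-speed parameter by measuring task completion time over autonomous rollouts. The parameterization, search strategy, and environment helpers are assumptions for illustration, not AquaBot's exact procedure.

```python
# Illustrative self-optimization loop: tune an execution-speed scale by
# measuring task completion time over autonomous rollouts.
import random

def rollout_time(env, policy, speed_scale, timeout=120.0):
    """Hypothetical helper: run the BC policy with scaled actions, return seconds to finish."""
    t, obs = 0.0, env.reset()
    while not env.task_done(obs) and t < timeout:
        obs = env.step(policy(obs) * speed_scale)
        t += env.dt
    return t

def self_optimize(env, policy, trials=20):
    best_scale, best_time = 1.0, rollout_time(env, policy, 1.0)
    for _ in range(trials):
        candidate = max(0.5, best_scale + random.uniform(-0.2, 0.2))
        t = rollout_time(env, policy, candidate)  # robot evaluates itself, no human in the loop
        if t < best_time:                         # keep parameters that finish the task faster
            best_scale, best_time = candidate, t
    return best_scale
```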

System Overview

Our system is built from low-cost hardware, and we fully open-source our software implementation, making this line of laboratory research more accessible to the community. Please refer to our GitHub repo for more details.

Have some fun with your friends and AquaBots :)

Watch with audio on!!!

Acknowledgement

We would like to thank Cheng Chi, Aurora Qian, Yunzhu Li, Zeyi Liu, Matei Ciocarlie, and Xia Zhou for their helpful feedback. We would also like to acknowledge the technical support from QYSEA. This work is supported in part by NSF Awards #2143601, #2037101, #2132519, and #1925157, and a Sloan Fellowship. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors.

BibTeX

@misc{liu2024selfimprovingautonomousunderwatermanipulation,
      title={Self-Improving Autonomous Underwater Manipulation}, 
      author={Ruoshi Liu and Huy Ha and Mengxue Hou and Shuran Song and Carl Vondrick},
      year={2024},
      eprint={2410.18969},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2410.18969}, 
    }