Robot automation is typically welcomed for its ability to handle dirty, dull, or dangerous tasks. As robot capabilities expand, however, robots are entering domains that are safe and enjoyable for humans, such as the creative industries. Although resistance to automation in creative fields is widespread, from hobbyists to professionals, many creators are willing to embrace tools with supportive or collaborative features.
Supporting creative tasks in real-world robot scenarios presents significant challenges: relevant datasets are extremely limited, creative tasks are inherently abstract and high-level, and real-world tools and materials are difficult to model and predict. Learning-based robot intelligence offers a promising path for creative support tools, but due to the extreme complexity of the tasks, common approaches like imitation learning require vast amounts of data, while reinforcement learning may never converge.
In this thesis, we propose multiple self-supervised learning techniques that enable robots to learn autonomously, thereby supporting humans in creative activities.
We formalize a new research field, Generative Robotics: real-world robots that support human creation from high-level goals. We propose methods for supporting 2D visual art creation (including painting and sketching) as well as 3D clay sculpting from a fixed viewpoint. Because no robot datasets exist for collaborative painting and sculpting, our methods learn real-world constraints from small-scale, self-generated robot datasets and enable collaborative interaction.
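To make the self-generated data idea concrete, here is a minimal sketch, not the thesis code: the robot executes randomly sampled strokes and photographs the canvas before and after each one, producing (action, before, after) triples for learning a dynamics model. The `robot` and `camera` interfaces and the four-parameter stroke encoding are illustrative assumptions.

```python
import numpy as np

def collect_stroke_dataset(robot, camera, n_strokes=200, seed=0):
    """Execute random strokes and record (action, before, after) triples.

    `robot` and `camera` are hypothetical interfaces: `robot.execute_stroke`
    traces a parameterized stroke, `camera.capture` returns a canvas image.
    """
    rng = np.random.default_rng(seed)
    dataset = []
    for _ in range(n_strokes):
        # Assumed stroke parameterization: (x, y) start position in the
        # canvas frame, stroke length, and brush pressure, all normalized.
        action = rng.uniform(low=[0.0, 0.0, 0.1, 0.0],
                             high=[1.0, 1.0, 1.0, 1.0])
        before = camera.capture()        # canvas state before the stroke
        robot.execute_stroke(action)     # physically apply the stroke
        after = camera.capture()         # canvas state after the stroke
        dataset.append((action, before, after))
    return dataset
```

Because the robot labels its own data by acting and then observing the result, no human demonstrations are required; dataset size is bounded only by execution time.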
The contributions of this thesis include: (1) a Real2Sim2Real technique that allows robots to construct complex dynamics models from small-scale, self-generated motion data; (2) a method for planning robot actions in long-horizon tasks within a semantically aligned representation space; (3) a self-supervised learning framework that adapts pre-trained models to be robot-compatible and to generate collaborative goals. We demonstrate how self-supervised learning enables model-based robot planning methods to collaborate with humans in painting across various media.
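As an illustration of how contributions (1) and (2) can compose, the sketch below optimizes stroke parameters so that a rendered canvas moves toward a goal in a semantically aligned embedding space. Here `render` stands in for a learned, differentiable stroke model of the kind a Real2Sim2Real pipeline might produce, and `encode` for a frozen CLIP-style image encoder; both callables, the cosine objective, and the four-parameter stroke layout are assumptions for illustration, not the thesis implementation.

```python
import torch

def plan_strokes(render, encode, goal_embedding,
                 n_strokes=32, steps=200, lr=1e-2):
    """Gradient-based planning in a semantic embedding space.

    `render(params)` differentiably maps stroke parameters to a simulated
    canvas image; `encode(image)` embeds the canvas into the same space
    as `goal_embedding` (e.g., an embedded text or image goal).
    """
    # Stroke parameters are the decision variables (layout is an assumption).
    params = torch.rand(n_strokes, 4, requires_grad=True)
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        canvas = render(params)           # simulate the resulting painting
        emb = encode(canvas)              # project into semantic space
        # Minimize angular distance between the plan and the goal.
        loss = 1.0 - torch.cosine_similarity(emb, goal_embedding, dim=-1).mean()
        opt.zero_grad()
        loss.backward()                   # gradients flow through render/encode
        opt.step()
    return params.detach()
```

Planning against an embedding-space objective lets the same loop serve abstract goals such as text prompts or rough sketches, without requiring pixel-level targets, which is what makes long-horizon creative tasks tractable.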
Finally, we extend the method from painting to sculpting, demonstrating that it generalizes to new materials, tools, action representations, and state representations.