Imitation Learning with Generative Methods

Imitation learning (behavior cloning) leverages human teleoperated demonstrations to solve tasks autonomously without designing a reward function. Recent advances in imitation learning rely on generative methods, such as Diffusion and Flow Matching, which excel at modeling complex multi-modal probability distributions. Originally developed for image generation, these methods efficiently sample from high-dimensional distributions, enabling policies that output not just single actions but entire future command trajectories. This capability improves temporal coherence and the feedback-feedforward trade-off of the reactive policy.
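As a rough illustration of the approach, the sketch below shows conditional flow matching for an action-chunk policy in PyTorch. The network architecture, dimensions, and horizon are invented for the example and do not correspond to our actual pipeline.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (not from our pipeline): observation, action, horizon.
OBS_DIM, ACT_DIM, HORIZON = 64, 20, 16

class VelocityNet(nn.Module):
    """Small MLP predicting the flow velocity for a whole action chunk."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM * HORIZON + 1, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ACT_DIM * HORIZON),
        )

    def forward(self, obs, actions_flat, t):
        return self.net(torch.cat([obs, actions_flat, t], dim=-1))

def flow_matching_loss(model, obs, expert_chunk):
    """Conditional flow matching: regress the straight-line velocity
    from Gaussian noise x0 toward the demonstrated action chunk x1."""
    x1 = expert_chunk.flatten(1)                    # (B, ACT_DIM * HORIZON)
    x0 = torch.randn_like(x1)                       # noise sample
    t = torch.rand(x1.shape[0], 1)                  # time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1                    # linear interpolation path
    target_v = x1 - x0                              # constant target velocity
    pred_v = model(obs, xt, t)
    return ((pred_v - target_v) ** 2).mean()

@torch.no_grad()
def sample_action_chunk(model, obs, steps=10):
    """Integrate the learned velocity field with explicit Euler steps."""
    x = torch.randn(obs.shape[0], ACT_DIM * HORIZON)
    dt = 1.0 / steps
    for k in range(steps):
        t = torch.full((obs.shape[0], 1), k * dt)
        x = x + dt * model(obs, x, t)
    return x.view(-1, HORIZON, ACT_DIM)

# Toy usage with random data in place of recorded demonstrations.
model = VelocityNet()
obs = torch.randn(8, OBS_DIM)
demo = torch.randn(8, HORIZON, ACT_DIM)
loss = flow_matching_loss(model, obs, demo)
actions = sample_action_chunk(model, obs)
```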

We proposed an imitation learning pipeline for the Talos humanoid robot based on Flow Matching, designed to perform multi-support tasks. This approach extends the robot's manipulation capabilities by learning "common sense" from human demonstrations about when and where to establish additional contacts. We also demonstrated that a shared-autonomy assisted teleoperation method, reusing the policy learned from demonstrations, effectively assists the operator in tasks outside the training distribution.
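Shared autonomy can arbitrate between the operator and the learned policy in many ways; a common and minimal scheme is a linear blend of the two commands, sketched below with an illustrative blending factor and command vectors (not necessarily the scheme used in our work).

```python
import numpy as np

def shared_autonomy_blend(operator_cmd, policy_cmd, alpha=0.5):
    """Blend the operator's Cartesian command with the learned policy's
    prediction. alpha = 0 gives pure teleoperation, alpha = 1 full autonomy."""
    operator_cmd = np.asarray(operator_cmd, dtype=float)
    policy_cmd = np.asarray(policy_cmd, dtype=float)
    return (1.0 - alpha) * operator_cmd + alpha * policy_cmd

# Example: blending 6D effector velocity commands (illustrative values).
operator = np.array([0.05, 0.0, 0.0, 0.0, 0.0, 0.1])
policy = np.array([0.04, 0.01, -0.02, 0.0, 0.0, 0.08])
print(shared_autonomy_blend(operator, policy, alpha=0.7))
```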

Multi-Contact Whole-Body Retargeting and Control

Using additional contacts for balance can enhance the locomotion and manipulation capabilities of complex multi-limb robots. However, multi-contact introduces challenges, particularly the redundancy of the contact force distribution. We investigated methods designed for teleoperation applications. Unlike in trajectory optimization, the operator's future intentions are unknown, so the system must remain reactive to the stream of commands. Additionally, since operator commands may be dangerous or infeasible, the method must enforce limits to ensure the safety of the robot and its environment.

We developed SEIKO (Sequential Equilibrium Inverse Kinematics Optimization), a whole-body retargeting formulation based on Sequential QP optimization. This approach enables real-time tracking of operator commands at the Cartesian effector level while enforcing physical limits such as balance, joint, and contact constraints in multi-contact scenarios. We also achieved smooth contact transitions for loco-manipulation applications. SEIKO has been implemented on the Valkyrie and Talos humanoid robots, as well as a bimanual Franka arms system capable of manipulating heavy objects.
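To give a flavor of the retargeting loop, here is a toy sketch in the spirit of a sequential QP step, using SciPy's SLSQP solver on an invented planar 3-link arm: it tracks a Cartesian target while staying within joint limits and a small trust region around the current posture. The real SEIKO formulation additionally enforces balance and contact force constraints and runs on the full whole-body model.

```python
import numpy as np
from scipy.optimize import minimize

# Toy planar 3-link arm: forward kinematics of the end effector.
LINKS = np.array([0.3, 0.3, 0.2])
Q_MIN, Q_MAX = -2.0, 2.0          # illustrative joint limits [rad]

def fk(q):
    angles = np.cumsum(q)
    x = np.sum(LINKS * np.cos(angles))
    y = np.sum(LINKS * np.sin(angles))
    return np.array([x, y])

def retarget_step(q_current, target_xy, max_step=0.05):
    """One SQP-like step: track the operator's Cartesian target while
    staying close to the current posture and inside joint limits."""
    def cost(q):
        track = np.sum((fk(q) - target_xy) ** 2)       # Cartesian tracking
        posture = 1e-2 * np.sum((q - q_current) ** 2)  # posture regularization
        return track + posture

    bounds = [(max(Q_MIN, qi - max_step), min(Q_MAX, qi + max_step))
              for qi in q_current]                     # trust region + limits
    res = minimize(cost, q_current, method="SLSQP", bounds=bounds)
    return res.x

# Simulated stream of operator commands toward a fixed target.
q = np.array([0.3, 0.4, 0.2])
for _ in range(50):
    q = retarget_step(q, target_xy=np.array([0.5, 0.3]))
print(fk(q))
```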

To enable experiments on the real robot, we proposed a whole-body admittance controller (SEIKO Controller), based on a similar SQP formulation, that regulates contact forces on a position-controlled robot. This indirect force control scheme models and compensates for the internal flexibility of the joints and actuators. We demonstrated its effectiveness through teleoperated locomotion tasks in multi-contact environments on the Talos robot.
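The underlying admittance idea can be sketched in a few lines: the measured contact force error shifts the position reference sent to the low-level position controller. The gains, time step, and crude contact stiffness model below are illustrative and are not those of the SEIKO Controller, which also models joint and actuator flexibility.

```python
import numpy as np

DT = 0.002                 # control period [s], illustrative
GAIN = 2e-4                # admittance gain [m / (N*s)], illustrative
MAX_OFFSET = 0.02          # clamp on the position correction [m]

def admittance_update(offset, force_measured, force_desired):
    """Shift the position reference proportionally to the force error,
    so a position-controlled robot indirectly regulates contact force."""
    force_error = force_desired - force_measured
    offset = offset + DT * GAIN * force_error
    return float(np.clip(offset, -MAX_OFFSET, MAX_OFFSET))

# Example: ramping the normal force on a hand contact toward 40 N,
# with a crude linear stiffness model standing in for the environment.
offset, force = 0.0, 10.0
for _ in range(1000):
    offset = admittance_update(offset, force, force_desired=40.0)
    force = 10.0 + 1500.0 * offset
print(round(force, 1), round(offset, 4))
```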

Teleoperation and Human-Robot Interaction

As robots become more complex and versatile, teleoperating them intuitively becomes increasingly challenging. This research investigated strategies for remotely commanding dual-arm and mobile-base humanoid robots, with the aim of achieving dexterous bimanual manipulation and multi-contact loco-manipulation. On Franka arms, applications include tasks like block assembly and medical ultrasound scanning. Additionally, developing a teleoperable system enables the collection of demonstration data for downstream imitation learning.

Numerous factors influence the operator's experience, performance, and cognitive load. The choice of input device is critical, with different devices offering unique features: we tested haptic feedback from a Sigma 7 device, compact velocity commands from a 6-DoF SpaceMouse, and absolute pose commands from wireless VR controllers. The frame in which commands are expressed, as well as the command mode (velocity for covering large workspace areas, position for precise, dexterous tasks), is also significant. Automatically coordinating the two arms for bimanual manipulation and object transport reduces the operator's mental load. Low latency in the video stream that provides visual feedback is also essential.
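The two command modes can be illustrated with a hypothetical mapping from device readings to an effector pose target; the scaling factors and flat pose representation below are simplifications for the example.

```python
import numpy as np

DT = 0.01            # command update period [s], illustrative
VEL_SCALE = 0.2      # maps normalized device deflection to m/s or rad/s

def velocity_mode(target_pose, device_twist):
    """SpaceMouse-style: integrate a scaled twist, convenient for large workspaces."""
    return target_pose + DT * VEL_SCALE * np.asarray(device_twist)

def position_mode(anchor_pose, device_pose, device_anchor):
    """VR-controller-style: apply the device's absolute displacement
    relative to a clutch anchor, convenient for precise, dexterous motions."""
    return anchor_pose + (np.asarray(device_pose) - np.asarray(device_anchor))

# Poses as [x, y, z, roll, pitch, yaw] for brevity (a real system would use
# proper rotation handling, e.g. quaternions).
pose = np.zeros(6)
pose = velocity_mode(pose, device_twist=[0.5, 0.0, 0.0, 0.0, 0.0, 0.2])
pose = position_mode(pose, device_pose=[0.1, 0.0, 0.05, 0, 0, 0],
                     device_anchor=[0.0, 0.0, 0.0, 0, 0, 0])
print(pose)
```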

In collaboration with researchers from Imperial College, we investigated these factors by demonstrating long-distance bimanual teleoperation between London and Edinburgh. We conducted a user study to evaluate the cognitive load associated with using a multi-view camera and a head-mounted augmented reality interface (HoloLens).

By combining impedance and admittance control schemes on torque-controlled robots, we explored human-robot physical collaboration. In this setup, a remote human operator commands the robot while collaborating with a local human assistant who physically interacts with the robot.
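A toy 1-DoF example of this combination is sketched below: an impedance law renders a spring-damper around a reference, while an admittance term lets the assistant's measured force displace that reference. All gains and the point-mass model are illustrative and do not describe our actual controller.

```python
DT = 0.001                          # simulation/control step [s]
K, D = 300.0, 30.0                  # impedance stiffness [N/m] and damping [N*s/m]
ADMITTANCE = 5e-4                   # reference shift per unit force [m/(N*s)]
MASS = 2.0                          # toy 1-DoF mass [kg]

x, xd, x_ref = 0.0, 0.0, 0.0
for step in range(2000):
    f_human = 10.0 if step < 1000 else 0.0     # assistant pushes for 1 s
    # Admittance: the human force gently displaces the commanded reference.
    x_ref += DT * ADMITTANCE * f_human
    # Impedance: the torque-controlled joint behaves like a spring-damper.
    f_cmd = K * (x_ref - x) - D * xd
    xdd = (f_cmd + f_human) / MASS
    xd += DT * xdd
    x += DT * xd
print(round(x, 4), round(x_ref, 4))
```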

RoboCup Football Competition

RoboCup is an annual international robotics competition, first held in 1997 and divided into several leagues, many of which involve autonomous robot soccer tournaments. Its ultimate goal is to develop a team of robots capable of winning against the best human team in an official FIFA match by 2050. Together with the Rhoban Football Club at the University of Bordeaux, I participated in the soccer Humanoid Kid-Size League, where teams build and program small humanoid robots up to 90 cm tall to play fully autonomous 4-vs-4 football games. The robots are also restricted to human-like sensors, which adds an extra layer of complexity to the competition.

RoboCup games offer a rigorous environment to evaluate and challenge the robustness and practical applicability of robotics methods, pushing ambitions beyond the simplified, controlled conditions of the lab. The competition places a strong emphasis on resilience and reliability, which are essential factors for success.

  • RoboCup 2018, Montréal (Canada): 1st place
  • German Open 2017, Magdeburg (Germany): 1st place
  • RoboCup 2016, Leipzig (Germany): 1st place
  • RoboCup 2015, Hefei (China): 3rd place
  • RoboCup 2014, João Pessoa (Brazil): quarter-finals
  • RoboCup 2013, Eindhoven (Netherlands): round robin

Model Correction for Small Humanoid Robots

Small humanoid robots used in RoboCup competitions are typically built from commercially available position-controlled servomotors (e.g. Dynamixel). These robots are affordable, convenient to experiment with, and durable enough to withstand frequent falls without breaking. However, unlike larger robots, small humanoid robots have limited accuracy, and their dynamics are heavily affected by actuator imperfections (e.g., servomotor control inaccuracies, backlash, friction, and electrical effects). These imperfections often cause significant deviations from the classical rigid-body model, complicating the use of traditional techniques such as ZMP walking, trajectory optimization, and similar model-based methods. Furthermore, the frequent falls and collisions in a typical RoboCup match lead to kinematic model inaccuracies, as the aluminum structural elements tend to bend and deform over time.

This research focused on leveraging machine learning to address these limitations and improve robot performance in RoboCup matches. Specifically, we applied incremental learning algorithms such as LWPR, as well as gradient-free optimization methods (e.g., CMA-ES), to improve several models: odometry and navigation estimation, camera model adjustments for accurate object position estimation and localization, and actuator models used to optimize kick motion trajectories.
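As an illustration of the gradient-free calibration step, the sketch below uses the cma package to fit hypothetical per-axis odometry correction factors against logged displacements; the data, parameterization, and error model are invented for the example.

```python
import numpy as np
import cma  # pip install cma

# Hypothetical log: commanded planar displacements vs. displacements measured
# by an external reference (e.g. vision), both as (dx, dy, dtheta).
commanded = np.array([[0.20, 0.00, 0.0], [0.15, 0.05, 0.1], [0.10, -0.05, -0.2]])
measured  = np.array([[0.17, 0.01, 0.0], [0.13, 0.05, 0.1], [0.09, -0.04, -0.2]])

def odometry_error(params):
    """Correction model: per-axis scale factors applied to the commanded motion."""
    scale = np.asarray(params)                 # (sx, sy, stheta)
    predicted = commanded * scale
    return float(np.sum((predicted - measured) ** 2))

# CMA-ES search around the identity correction with a small initial step size.
es = cma.CMAEvolutionStrategy([1.0, 1.0, 1.0], 0.2)
while not es.stop():
    candidates = es.ask()
    es.tell(candidates, [odometry_error(c) for c in candidates])
print(es.result.xbest)
```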