Learning From Imperfect Demonstrations In A Surgical Training Task

Robotic surgery offers several advantages over traditional techniques, including improved precision, greater consistency, and enhanced dexterity. Learning from demonstrations (LfD) is a promising approach for transferring expert skills to robots, thereby alleviating clinicians’ physical workload. However, a major challenge in surgical robotics is that demonstration data often includes suboptimal or failed behaviors due to human error, fatigue, or the inherent complexity of surgical tasks. Discarding such imperfect data results in the loss of valuable information and hinders the scalability of data-driven surgical skill acquisition. In this work, we propose a novel LfD optimization framework capable of learning from a broad spectrum of demonstrations, including successful, suboptimal, and failed attempts. Our method employs a dual probabilistic modeling strategy to encode demonstrations and formulates a multi-objective optimization problem under novel problem conditions to find an optimal reproduction. We validate our approach on the standard ring-and-rail task, a representative surgical training task requiring high-precision and dexterous manipulation. Real-world experiments using the da Vinci Research Kit (dVRK) show that, even in the presence of failure cases within the demonstration set, our method produces optimized trajectories that enable the patient-side manipulator to successfully guide the ring along the curved wire without contact. These results demonstrate the robustness and effectiveness of our approach in learning from imperfect data, underscoring its potential for real-world deployment in robot-assisted surgery.