M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation

UR2024 Best Student Paper Award

Fotios Lygerakis is a third-year PhD student and University Assistant at the University of Leoben in Austria. He speaks to us fresh from winning the Best Student Paper Award at the International Conference on Ubiquitous Robots (UR2024).

In this paper, Fotios and his fellow researchers tackle an important robotics problem: how to effectively fuse different sensing modalities, such as vision and touch, so that a reinforcement learning algorithm can make sense of them and robots can make better decisions during manipulation tasks. These algorithms are notoriously data-hungry, requiring vast numbers of examples before they can make informed decisions, and the problem only intensifies with high-dimensional inputs like vision and touch.

"Imagine you have big images that have a lot of pixels, and each pixel can take a lot of values," Fotios says. "The same goes for tactile sensing. You have very discrete information. It's also high dimensional, and if you give all this high-dimensional raw data to the algorithm, it will never figure out how to map observation to an actual action."
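To make that idea concrete, here is a minimal, hypothetical sketch in PyTorch of what learning compact representations from raw vision and touch can look like: each modality is encoded into a small latent vector, and a contrastive (InfoNCE-style) objective pulls matching vision/touch pairs together before anything reaches the RL policy. The network sizes, the 84x84 image shape, the 128-dimensional tactile vector, and the loss arrangement are illustrative assumptions, not the exact architecture from the M2CURL paper.

```python
# Hypothetical sketch: cross-modal contrastive representation learning
# for vision + touch, in the spirit of the approach described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisionEncoder(nn.Module):
    """Maps an 84x84 RGB observation to a compact latent vector."""
    def __init__(self, latent_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.Linear(64 * 7 * 7, latent_dim)

    def forward(self, x):
        return self.fc(self.conv(x))

class TactileEncoder(nn.Module):
    """Maps a flat tactile reading (e.g. a taxel array) to the same latent space."""
    def __init__(self, tactile_dim=128, latent_dim=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(tactile_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

def cross_modal_infonce(z_vision, z_tactile, temperature=0.1):
    """InfoNCE loss: the i-th vision latent should match the i-th tactile latent."""
    z_v = F.normalize(z_vision, dim=1)
    z_t = F.normalize(z_tactile, dim=1)
    logits = z_v @ z_t.t() / temperature   # (B, B) similarity matrix
    labels = torch.arange(z_v.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    vision_enc, tactile_enc = VisionEncoder(), TactileEncoder()
    images = torch.randn(16, 3, 84, 84)    # batch of camera frames
    touches = torch.randn(16, 128)         # matching tactile readings
    loss = cross_modal_infonce(vision_enc(images), tactile_enc(touches))
    loss.backward()                        # auxiliary signal trained alongside the RL objective
    print(f"contrastive loss: {loss.item():.3f}")
```

In a setup like this, the RL agent never sees raw pixels or raw taxel values; it acts on the 50-dimensional latents, which is what makes the learning problem tractable with far fewer samples.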