Computer Vision News

Exclusive Interview

Yann LeCun was keynote speaker at MICCAI 2023. He was so kind as to give a second interview to Ralph during his visit to ICCV 2023 in Paris.

Yann, thank you very much for being with us again. When we talked five years ago, you told me you had a clear plan for the next few years. Did you stick to it?

The plan hasn’t changed very much. The details have changed, and we’ve made progress, but the original plan is still the same. The original premise was that the limitation of current AI systems is that they’re not capable of understanding the world. You need a system that can understand the world if you want it to be able to plan. You need to imagine in your head what the consequences of your actions might be, and for this, you need a world model. I’ve been advocating for this for a long time. This is not a new idea. The concept is very old, from optimal control, but using machine learning to learn the world models is the big problem.

Back when we talked, I can’t remember if I’d made the transition between what I called latent variable generative models and what I’m advocating now, which I call JEPA, for joint embedding predictive architectures. I used to think that the proper way to do this would be to train a system on videos to predict what will happen in the video, perhaps as a consequence of some action being taken. If you have