CVPR Daily - Friday

Unified-IO 2 builds on the foundations laid by its predecessor, Unified-IO, aiming to create a model that can truly input and output anything. Training such a comprehensive model, especially with limited resources, has been incredibly tough. The team’s first major challenge was collecting the pretraining and instruction tuning data. The second was training a multimodal model from scratch rather than adapting existing unimodal models.

“We tried a few months of tricks to stabilize the model and make it train better,” Jiasen recalls. “We figured out a few key recipes that were used by later papers and shown to be very effective, even in other things like image generation. We’re training on a relatively large scale with 7B models and over 1 trillion data. More than 230 tasks were involved in training these giant models.”
