VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis
Google’s desperate attempt to showcase “something”
6 min read · Mar 21, 2024
A Google Research team led by Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan, Nikos Kolotouros, Thiemo Alldieck, and Cristian Sminchisescu has introduced a framework named VLOGGER. The system generates photorealistic, temporally coherent videos of humans talking and moving, all from a single input image and an audio sample. Sure, it is innovative, but I am really struggling…