ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao

Main: Speech and Multimodality Main-poster Paper

Poster_Demo_Industry Hybrid 4: Speech and Multimodality (Poster)
Conference Room: East Foyer (Virtual)
Conference Time: December 09, 09:00-10:30 (+08) (Asia/Singapore)
Global Time: December 09, Poster_Demo_Industry Hybrid 4 (01:00-02:30 UTC)
Abstract: