ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao
Main: Speech and Multimodality Main-poster Paper
Poster_Demo_Industry Hybrid 4: Speech and Multimodality (Poster)
Conference Room: East Foyer (Virtual)
Conference Time: December 09, 09:00-10:30 (+08) (Asia/Singapore)
Global Time: December 09, Poster_Demo_Industry Hybrid 4 (01:00-02:30 UTC)
Abstract: