A Video Is Worth 4096 Tokens: Verbalize Story Videos To Understand Them In Zero Shot
Aanisha Bhattacharyya, Yaman K Singla, Balaji Krishnamurthy, Rajiv Ratn Shah, Changyou Chen
Main: Speech & Multimodality 1 Main-oral Paper
Session 9: Speech & Multimodality 1 (Oral)
Conference Room: Central 3
Conference Time: December 10, 09:00-10:30 (+08) (Asia/Singapore)
Global Time: December 10, Session 9 (01:00-02:30 UTC)
Abstract: