GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebron, Sumit Sanghai

Main: Efficient Methods for NLP Main-poster Paper

Poster_Demo_Industry Hybrid 5: Efficient Methods for NLP (Poster)
Conference Room: East Foyer (Virtual)
Conference Time: December 09, 11:00-12:30 (+08) (Asia/Singapore)
Global Time: December 09, Poster_Demo_Industry Hybrid 5 (03:00-04:30 UTC)
TLDR:
Abstract: