Large pretrained language models such as BERT suffer from slow inference and high memory usage due to their enormous size. Recent approaches to compressing BERT rely on iterative pruning and knowledge distillation, which are often complicated and computationally intensive. This paper proposes a novel semi-structured one-shot pruning method for BERT, called Permutat...
Journal of Artificial Intelligence Research