Part of the Computer Sciences Commons
On the effect of dropping layers of pre-trained transformer models, Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov. Natural Language Processing Faculty Publications.