On the effect of dropping layers of pre-trained transformer models

Overview of attention for article published in Computer Speech & Language, January 2023

About this Attention Score

In the top 5% of all research outputs scored by Altmetric
One of the highest-scoring outputs from this source (#1 of 431)
High Attention Score compared to outputs of the same age (95th percentile)

Mentioned by

News: 1 news outlet
Blogs: 1 blog
X: 59 users
Reddit: 1 Redditor

Citations

Dimensions: 24 citations

Readers on

Mendeley: 59 readers

So far, Dimensions has found 24 publications that cite this research output.

The most recent citing publications are shown below.

A vulnerability severity prediction method based on bimodal data and multi-task learning

Article in Journal of Systems and Software (July 2024)

Less is more: Pruning BERTweet architecture in Twitter sentiment analysis

Article in Information Processing & Management (July 2024)

Towards Data- and Compute-Efficient Fake-News Detection: An Approach Combining Active Learning and Pre-Trained Language Models

Article in SN Computer Science (April 2024)