fBERT: A Neural Transformer for Identifying Offensive Content

Diptanu Sarkar, Marcos Zampieri, Tharindu Ranasinghe, Alexander Ororbia

    Research output: Chapter in Book/Published conference output › Conference publication

    Abstract

    Transformer-based models such as BERT, XLNet, and XLM-R have achieved state-of-the-art performance across various NLP tasks, including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available, with over 1.4 million offensive instances. We evaluate fBERT's performance in identifying offensive content on multiple English datasets, and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.
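    The threshold-based instance selection mentioned in the abstract can be sketched as follows. This is a minimal illustration only: the field names, the confidence scores, and the idea of filtering on an averaged offensiveness score are assumptions for demonstration, not the paper's actual SOLID data format or selection procedure.

    ```python
    # Hedged sketch: selecting training instances from a semi-supervised
    # corpus (like SOLID) by thresholding each instance's aggregated
    # offensiveness confidence. All data below is synthetic.

    def select_instances(corpus, threshold):
        """Keep instances whose average confidence meets the threshold."""
        return [inst for inst in corpus if inst["avg_conf"] >= threshold]

    # Toy corpus with made-up confidence scores.
    corpus = [
        {"text": "example A", "avg_conf": 0.95},
        {"text": "example B", "avg_conf": 0.60},
        {"text": "example C", "avg_conf": 0.82},
    ]

    # Trying several thresholds, as the paper describes, trades off
    # corpus size against label confidence.
    for t in (0.5, 0.8, 0.9):
        selected = select_instances(corpus, t)
        print(f"threshold={t}: {len(selected)} instances kept")
    ```

    Higher thresholds yield a smaller but higher-confidence retraining set; the paper evaluates which trade-off works best downstream.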
    Original language: English
    Title of host publication: Findings of the Association for Computational Linguistics: EMNLP 2021
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 1792–1798
    Publication status: Published - Nov 2021
    Event: The 2021 Conference on Empirical Methods in Natural Language Processing - Punta Cana, Dominican Republic
    Duration: 7 Nov 2021 – 11 Nov 2021
    https://2021.emnlp.org/

    Conference

    Conference: The 2021 Conference on Empirical Methods in Natural Language Processing
    Abbreviated title: EMNLP 2021
    Country/Territory: Dominican Republic
    City: Punta Cana
    Period: 7/11/21 – 11/11/21
