🌮🔥 BERTweet's TACO Fiesta: Contrasting Flavors On The Path Of Inference And Information-Driven Argument Mining On Twitter 🔥🌮
This repository contains the code used for the paper "BERTweet's TACO Fiesta: Contrasting Flavors On The Path Of Inference And Information-Driven Argument Mining On Twitter".
Table of Contents:
- notebooks
  - classifier_cv.ipynb: For the validation of closed and cross-topic classifications on TACO.
  - data_augmentation.ipynb: For creating A-TACO, which is an augmented copy of TACO.
  - fine_tuning_bertweet.ipynb: For the contrastive pre-classification fine-tuning of BERTweet.
  - target_space.ipynb: For visually analyzing the optimized embeddings of BERTweet.
  - train_classifier.ipynb: For training Augmented BERTweet and retraining WRAP for publication.
  - tweet_embeddings.ipynb: For generating embeddings of TACO and A-TACO for all BERTweet models.
- outputs
  - cv-6-shuffled.csv: The outputs of all models for closed-topic evaluation with dynamic embeddings.
  - cv-6-topic.csv: The outputs of all models for cross-topic evaluation with dynamic embeddings.
  - cv-6-shuffled-frozen.csv: The outputs of all models for closed-topic evaluation with frozen embeddings.
  - cv-6-topic-frozen.csv: The outputs of all models for cross-topic evaluation with frozen embeddings.
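
The output files above distinguish closed-topic (shuffled) from cross-topic evaluation. As a rough illustration of how the two 6-fold schemes could differ, the following sketch builds both kinds of folds in plain Python. It assumes that cross-topic validation holds out one whole topic per fold; the helper names (`closed_topic_folds`, `cross_topic_folds`, `train_test_splits`) and the toy data are hypothetical and are not taken from the notebooks.

```python
import random
from collections import defaultdict


def closed_topic_folds(samples, k=6, seed=0):
    """Shuffle all samples and deal them into k folds, so every fold mixes topics."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_topic_folds(samples):
    """Group sample indices by topic, so each fold contains exactly one topic."""
    by_topic = defaultdict(list)
    for idx, (topic, _text) in enumerate(samples):
        by_topic[topic].append(idx)
    return [by_topic[topic] for topic in sorted(by_topic)]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical toy data: (topic, text) pairs with six placeholder topics.
samples = [(f"topic_{t}", f"tweet {t}-{n}") for t in range(6) for n in range(4)]

for train, test in train_test_splits(cross_topic_folds(samples)):
    test_topics = {samples[i][0] for i in test}
    train_topics = {samples[i][0] for i in train}
    # In the cross-topic setting, the held-out topic never appears in training.
    assert len(test_topics) == 1 and test_topics.isdisjoint(train_topics)
```

In the closed-topic variant the same topics appear on both sides of each split, which is why cross-topic scores are usually the harder test of generalization.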
All models can be used via Hugging Face.

Note: To access the data, please contact the authors.
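
When one of these models is used as an encoder, a tweet embedding is typically obtained by pooling the token vectors. The `-CLS` and `-MEAN` suffixes in the results presumably refer to first-token (CLS-style) pooling and mean pooling; that reading is an assumption, as are the function names and the toy vectors in this minimal sketch, which uses plain Python lists instead of real model outputs.

```python
def cls_pooling(token_vectors):
    """Use the vector of the first token as the embedding of the whole tweet."""
    return list(token_vectors[0])


def mean_pooling(token_vectors, attention_mask):
    """Average the vectors of all non-padding tokens (mask value 1)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Hypothetical encoder output: 4 token positions with 3 dimensions each.
token_vectors = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [9.0, 9.0, 9.0],  # padding position, ignored by mean pooling
]
attention_mask = [1, 1, 1, 0]

print(cls_pooling(token_vectors))
print(mean_pooling(token_vectors, attention_mask))
```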

Evaluation on golden holdout tweets of #Abortion:

Model | Precision | Recall | F1 |
---|---|---|---|
Vanilla BERTweet-CLS | 50.00 | 100.00 | 66.67 |
Augmented BERTweet-CLS | 65.69 | 86.66 | 74.73 |
WRAPresentations-CLS | 66.00 | 84.32 | 74.04 |
WRAPresentations-MEAN | 63.05 | 88.91 | 73.78 |
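
The F1 values in the holdout results can be reproduced from the precision and recall columns as their harmonic mean. The short check below does this; the numbers are copied from the results above, while the function name `f1_score` and the dictionary layout are only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the holdout results.
holdout = {
    "Vanilla BERTweet-CLS": (50.00, 100.00, 66.67),
    "Augmented BERTweet-CLS": (65.69, 86.66, 74.73),
    "WRAPresentations-CLS": (66.00, 84.32, 74.04),
    "WRAPresentations-MEAN": (63.05, 88.91, 73.78),
}

for model, (precision, recall, reported) in holdout.items():
    computed = round(f1_score(precision, recall), 2)
    print(f"{model}: computed F1 = {computed:.2f}, reported F1 = {reported:.2f}")
```

A precision of 50.00 with a recall of 100.00, as for Vanilla BERTweet-CLS, is what a classifier produces when it labels every tweet as positive on a balanced set, although the README does not state the class balance of the holdout data.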
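
Length and the TF-IDF baselines report a single score per task, listed in the first column of each Frozen/Dynamic pair.

Model | (1) Inference: Frozen | (1) Inference: Dynamic | (2) Information: Frozen | (2) Information: Dynamic | (3) Multi-Class: Frozen | (3) Multi-Class: Dynamic |
---|---|---|---|---|---|---|
Closed-Topic (6-fold) Validation | | | | | | |
Length | 62.34 | | 71.47 | | 38.26 | |
SVM + TF-IDF | 76.00 | | 75.44 | | 55.39 | |
LR + TF-IDF | 76.87 | | 74.73 | | 54.76 | |
RF + TF-IDF | 76.12 | | 80.56 | | 55.65 | |
Vanilla BERTweet | 73.12 | 84.54 | 66.49 | 83.55 | 42.87 | 71.05 |
Augmented BERTweet | 84.49 | 86.68 | 79.22 | 84.57 | 67.07 | 73.80 |
WRAPresentations | 86.88 | 86.62 | 81.54 | 86.30 | 71.07 | 75.29 |
Cross-Topic (6-fold) Validation | | | | | | |
Length | 61.99 | | 71.55 | | 38.17 | |
SVM + TF-IDF | 72.24 | | 74.79 | | 50.55 | |
LR + TF-IDF | 72.20 | | 75.90 | | 50.41 | |
RF + TF-IDF | 73.93 | | 80.16 | | 53.29 | |
Vanilla BERTweet | 70.28 | 83.15 | 66.15 | 82.22 | 39.00 | 68.12 |
Augmented BERTweet | 84.20 | 84.25 | 79.38 | 83.31 | 66.41 | 69.99 |
WRAPresentations | 86.83 | 86.27 | 81.54 | 84.90 | 70.93 | 73.54 |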
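
Per-class results. Length and the TF-IDF baselines report a single score per class, listed in the first column of each Frozen/Dynamic pair.

Model | Reason: Frozen | Reason: Dynamic | Statement: Frozen | Statement: Dynamic | Notification: Frozen | Notification: Dynamic | None: Frozen | None: Dynamic |
---|---|---|---|---|---|---|---|---|
Closed-Topic (6-fold) Validation | | | | | | | | |
Length | 61.68 | | 20.19 | | 14.47 | | 56.72 | |
SVM + TF-IDF | 64.79 | | 24.57 | | 62.36 | | 69.85 | |
LR + TF-IDF | 65.75 | | 17.66 | | 62.62 | | 73.02 | |
RF + TF-IDF | 69.35 | | 17.30 | | 63.35 | | 72.62 | |
Vanilla BERTweet | 66.05 | 74.98 | 00.00 | 53.99 | 43.80 | 77.62 | 61.63 | 77.62 |
Augmented BERTweet | 74.50 | 76.82 | 49.53 | 58.37 | 70.95 | 80.28 | 73.29 | 79.71 |
WRAPresentations | 77.34 | 78.14 | 58.66 | 60.96 | 72.61 | 79.36 | 75.67 | 82.72 |
Cross-Topic (6-fold) Validation | | | | | | | | |
Length | 61.78 | | 19.32 | | 14.49 | | 57.09 | |
SVM + TF-IDF | 62.35 | | 18.68 | | 56.11 | | 65.05 | |
LR + TF-IDF | 65.19 | | 16.09 | | 55.30 | | 65.08 | |
RF + TF-IDF | 68.61 | | 13.33 | | 62.75 | | 68.46 | |
Vanilla BERTweet | 63.57 | 73.15 | 00.00 | 47.40 | 35.79 | 74.92 | 56.64 | 77.01 |
Augmented BERTweet | 75.18 | 75.10 | 46.34 | 51.74 | 71.61 | 75.71 | 72.50 | 77.42 |
WRAPresentations | 77.13 | 77.05 | 57.62 | 58.33 | 73.05 | 78.45 | 75.91 | 80.33 |

Topic | Original | Augmented |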
---|---|---|
Abortion | If you eat eggs, you shouldn't say anything against abortion #AbortionIsHealthcare #AbortionIsAWomansRight #AbortionBan #abortion HTTPURL | If you eat meat, you should not say anything against it..... |
Brexit | #OTD 1920 science fiction author Isaac Asimov was born. When stupidity is considered patriotism, it is unsafe to be intelligent. As advocated by #NotMyPM serial #LiarJohnson with the disaster called #Brexit. HTTPURL | A science fiction author @USER was born. When he is considered intelligent, it became unsafe to be intelligent. As advocated by a serial killer with the disaster called @USER. HTTPURL |
Twitter-Takeover | It's amazing that so much stupid could come out of someone so small... #TwitterTakeover HTTPURL | It is amazing that so much good could come out of someone so small... HTTPURL HTTPURL |

Topic | Original | Augmented |
---|---|---|
Abortion | Do the people against requiring the #vaccine- stating the argument 'it's against our #medicalfreedom'- realize that outlawing #abortion is against the same #rights they are leaning on? #VaccineMandate #AbortionBan #prochoice #ProLife #YCHYCAEIT | Do the people against requiring a #vaccine- stating the argument 'it is against our rights'- realize that outlawing it is against the same #rights they are leaning against? #VaccineMandate HTTPURL #CYCHYCAEIT |
Brexit | The #brexit countdown clock is like the years rolling back | The Christmas countdown advert is like the clocks rolling back |
Twitter-Takeover | @USER @USER Not really. They see that Twitter continues to make poor decisions that devalue the stock and product. | @USER Not really. They see that Apple continues to make poor decisions that devalue the stock and product. |

Topic | Original | Augmented |
---|---|---|
Abortion | BREAKING: Federal judge tells #Texas to shove its 6-week #abortion ban... HTTPURL | BREAKING: Federal judge tells Trump to shove his Muslim travel ban... HTTPURL |
Brexit | British lawmakers finally approve historic #Brexit deal HTTPURL | US lawmakers to approve historic trade deal HTTPURL |
Twitter-Takeover | The CEO of Twitter @USER hasn't posted in 4 days. Meanwhile, @USER has posted more than 15 times in that period. #TwitterTakeover HTTPURL | The CEO of the company has not posted in months. However, he has posted several times in that period. HTTPURL |

Topic | Original | Augmented |
---|---|---|
Abortion | @USER @USER Bastards! | Yes! |
Brexit | @USER 😆 | Lol... |
Twitter-Takeover | @USER @USER Hi Jules! 👋 | Hello! 👋 |
BERTweet’s TACO Fiesta: Contrasting Flavors On The Path Of Inference And Information-Driven Argument Mining On Twitter by Marc Feger is licensed under CC BY-NC-SA 4.0
Please contact the authors, Marc Feger or Stefan Dietze.
@inproceedings{feger-dietze-2024-bertweets,
title = "{BERT}weet{'}s {TACO} Fiesta: Contrasting Flavors On The Path Of Inference And Information-Driven Argument Mining On {T}witter",
author = "Feger, Marc and
Dietze, Stefan",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.findings-naacl.146",
doi = "10.18653/v1/2024.findings-naacl.146",
pages = "2256--2266"
}