Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles

This repository contains the code for the paper Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles.

Knowledge graph embeddings

For this research we used the GraphVite Wikidata5 embeddings.

Download the embeddings and save them to the kg_dump folder.

KG-based representations:

Concept extraction

The extraction is implemented in the Fuzzy_extractor.ipynb notebook.

For each dataset, KG concepts are extracted and saved in the dataset's kg_emb_dump folder in the format:

dataset_kgmethod.pkl
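A minimal sketch of reading one of these files back, assuming it is a standard Python pickle and that dataset and kgmethod in the file name stand for the dataset name and the KG embedding method. The function name, its arguments, and the example names in the note below are hypothetical and not part of the repository.

```python
import pickle
from pathlib import Path


def load_kg_representation(dataset_dir, dataset, kg_method):
    """Load <dataset_dir>/kg_emb_dump/<dataset>_<kg_method>.pkl.

    Assumption: the file is a plain pickle. The type of the stored
    object is not documented here, so it is returned unchanged.
    """
    path = Path(dataset_dir) / "kg_emb_dump" / f"{dataset}_{kg_method}.pkl"
    with path.open("rb") as handle:
        return pickle.load(handle)
```

For example, load_kg_representation("data/final/liar", "liar", "transe") would read data/final/liar/kg_emb_dump/liar_transe.pkl, provided such a file exists; both names are placeholders.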

Metadata concept extraction

The extraction is implemented in the Fuzzy_extractor.ipynb notebook.

For the LIAR and FakeNewsNet datasets only, KG concepts are extracted from the metadata and saved in the dataset's kg_emb_dump folder in the format:

dataset_method_entity.pkl
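The naming scheme above can be sketched as a small helper that builds the expected file paths and reports which ones are still missing. This is an illustration only: the helper name, the dataset directory, the method name "transe", and the entity names "speaker" and "party" are assumptions, not values taken from the repository.

```python
from pathlib import Path


def metadata_paths(dataset_dir, dataset, method, entities):
    """Map each metadata entity to <dataset>_<method>_<entity>.pkl."""
    dump = Path(dataset_dir) / "kg_emb_dump"
    return {entity: dump / f"{dataset}_{method}_{entity}.pkl" for entity in entities}


# Hypothetical names, used only to show the pattern.
paths = metadata_paths("data/final/liar", "liar", "transe", ["speaker", "party"])

# Entities whose file has not been produced yet.
missing = [entity for entity, path in paths.items() if not path.exists()]
```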

Language representations

The code for obtaining the language representations of each dataset is in the Extractor.ipynb notebook:

To extract KG_REPRESENTATIONS, run the export_kgs block at the end of the notebook.
To extract LANGUAGE_REPRESENTATIONS, run the export_LM block at the end of the notebook.

Classifiers

The neural learners are in the bind_learn.py script.
The LR baselines are in the cartesian_regression.py script.
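The paper combines KG-based representations with language representations before classification. One simple way to do that is to concatenate the per-document vectors, as in the sketch below. Concatenation is an assumption made here for illustration and may differ from what bind_learn.py and cartesian_regression.py actually do; the function name and the toy vectors are hypothetical.

```python
from itertools import chain


def concatenate_representations(*representations):
    """Join per-document vectors from several representations, row by row.

    Each representation is a list with one feature vector per document,
    and all representations must list the documents in the same order.
    """
    sizes = {len(rep) for rep in representations}
    if len(sizes) != 1:
        raise ValueError("all representations must cover the same documents")
    return [list(chain.from_iterable(rows)) for rows in zip(*representations)]


# Toy data: two documents, a 2-d KG vector and a 1-d language vector each.
kg_vectors = [[0.1, 0.2], [0.3, 0.4]]
lm_vectors = [[1.0], [2.0]]

combined = concatenate_representations(kg_vectors, lm_vectors)
print(combined)  # [[0.1, 0.2, 1.0], [0.3, 0.4, 2.0]]
```

The combined rows can then be passed to any classifier that accepts fixed-length feature vectors.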

Citation

@article{KOLOSKI2022,
title = {Knowledge graph informed fake news classification via heterogeneous representation ensembles},
journal = {Neurocomputing},
year = {2022},
issn = {0925-2312},
doi = {10.1016/j.neucom.2022.01.096},
url = {https://www.sciencedirect.com/science/article/pii/S0925231222001199},
author = {Boshko Koloski and Timen {Stepišnik Perdih} and Marko Robnik-Šikonja and Senja Pollak and Blaž Škrlj},
keywords = {Fake news detection, Knowledge graphs, Text representation, Representation learning, Neuro-symbolic learning},
abstract = {Increasing amounts of freely available data both in textual and relational form offers exploration of richer document representations, potentially improving the model performance and robustness. An emerging problem in the modern era is fake news detection—many easily available pieces of information are not necessarily factually correct, and can lead to wrong conclusions or are used for manipulation. In this work we explore how different document representations, ranging from simple symbolic bag-of-words, to contextual, neural language model-based ones can be used for efficient fake news identification. One of the key contributions is a set of novel document representation learning methods based solely on knowledge graphs, i.e., extensive collections of (grounded) subject-predicate-object triplets. We demonstrate that knowledge graph-based representations already achieve competitive performance to conventionally accepted representation learners. Furthermore, when combined with existing, contextual representations, knowledge graph-based document representations can achieve state-of-the-art performance. To our knowledge this is the first larger-scale evaluation of how knowledge graph-based representations can be systematically incorporated into the process of fake news classification.}
}

