Multi-domain Urdu fake news detection using pre-trained ensemble model.

Sheetal Harris, Hassan Jalil Hadi, Naveed Ahmad, Mohammed Ali Alshara

Author Information

Sheetal Harris: School of Cyber Science and Engineering, Wuhan University, Wuhan, China.
Hassan Jalil Hadi: School of Cyber Science and Engineering, Wuhan University, Wuhan, China. hjhwhu@whu.edu.cn.
Naveed Ahmad: Prince Sultan University, Riyadh, Saudi Arabia.
Mohammed Ali Alshara: Prince Sultan University, Riyadh, Saudi Arabia.

PMID: 40082485 DOI: 10.1038/s41598-025-91054-4

Fake News (FN) dissemination on websites and online platforms influences human behaviours, socio-political domains, and the sovereignty of a country. The outpouring of biased news and propaganda on online portals can be addressed by restricting it with an automated mechanism. Verifying the authenticity of news and information on online platforms is challenging in regional languages such as Urdu, which have limited resources and datasets. Furthermore, limited research on resource-constrained languages has created a language bias in Artificial Intelligence (AI) research, which this study addresses. Natural Language Processing (NLP) techniques have been used for Fake News Detection (FND) in English news and for various language-related tasks. Previous studies used Machine Learning (ML), Deep Learning (DL), and individual Pre-trained Language Models (PLMs) for Urdu FND, and an ML-based ensemble model performed better than the pre-trained models. We propose a methodology for Urdu FND that applies stacked ensemble learning to three PLMs, ELECTRA, mBERT, and XLM-RoBERTa, after appropriate fine-tuning and hyperparameter optimization. To overcome the limitations of each pre-trained transformer model, the models are fine-tuned individually on a publicly available Urdu dataset. The proposed stacking approach outperforms each individual pre-trained model. An accuracy of 0.914, a Matthews Correlation Coefficient (MCC) of 0.898, and an F1-score of 0.904 validate the efficacy of the proposed ensemble model.
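
The stacking idea and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each fine-tuned base model (ELECTRA, mBERT, XLM-RoBERTa) has already produced a probability that an article is fake, and it assumes a logistic-regression meta-learner, which the abstract does not specify. The probabilities in `train_probs`, the function names, and the hyperparameters `lr` and `epochs` are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def meta_score(row, weights, bias):
    """Meta-learner P(fake) from one row of base-model probabilities."""
    return sigmoid(sum(w * p for w, p in zip(weights, row)) + bias)

def fit_meta_learner(base_probs, labels, lr=1.0, epochs=5000):
    """Fit logistic-regression weights by batch gradient descent."""
    n, n_models = len(labels), len(base_probs[0])
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for row, y in zip(base_probs, labels):
            err = meta_score(row, weights, bias) - y
            grad_b += err
            for j, p in enumerate(row):
                grad_w[j] += err * p
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(base_probs, weights, bias):
    """Threshold the meta-learner probability at 0.5 (1 = fake, 0 = real)."""
    return [1 if meta_score(row, weights, bias) >= 0.5 else 0
            for row in base_probs]

def evaluate(y_true, y_pred):
    """Accuracy, F1-score and MCC for binary labels (1 = fake)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    f1_den = 2 * tp + fp + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / f1_den if f1_den else 0.0,
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }

# Hypothetical P(fake) values, one column per base model
# (stand-ins for ELECTRA, mBERT, XLM-RoBERTa outputs; not real data).
train_probs = [
    [0.9, 0.8, 0.7], [0.6, 0.9, 0.8], [0.8, 0.4, 0.9], [0.7, 0.7, 0.3],
    [0.2, 0.1, 0.3], [0.4, 0.2, 0.1], [0.1, 0.6, 0.2], [0.3, 0.3, 0.7],
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_meta_learner(train_probs, train_labels)
train_preds = predict(train_probs, weights, bias)
train_metrics = evaluate(train_labels, train_preds)
```

In this toy data the second and third base models each misjudge some articles, and the meta-learner learns how much to trust each model instead of averaging them equally. In a real setup the meta-learner would normally be trained on held-out or out-of-fold base-model predictions, and the metrics would be computed on a separate test split rather than on the training rows as done here.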

Humans

Natural Language Processing

Machine Learning

Deception

Artificial Intelligence

Language

Internet

Information Dissemination

Journal Article

Fake news detection in Urdu language using machine learning.
Deep Ensemble Fake News Detection Model Using Sequential Deep Learning Technique.
Supervised ensemble learning methods towards automatically filtering Urdu fake news within social media.
KG-MFEND: an efficient knowledge graph-based model for multi-domain fake news detection.
Multi-modal transformer for fake news detection.
Multimodal Fake-News Recognition Using Ensemble of Deep Learners.
Enhancing the Predictive Performance of Credibility-Based Fake News Detection Using Ensemble Learning.
Normalized effect size (NES): a novel feature selection model for Urdu fake news classification.
Text-image multimodal fusion model for enhanced fake news detection.
MCred: multi-modal message credibility for fake news detection using BERT and CNN.
