oai:arXiv.org:2410.06311
Computer Science
2024
16/10/2024
This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs), with the aim of developing more effective tools for countering the spread of health misinformation during the pandemic.
The study trained and tested various ML classifiers (Naive Bayes, SVM, Random Forest, etc.), DL models (CNN, LSTM, hybrid CNN+LSTM), and pretrained language models (DistilBERT, RoBERTa) on the "COVID19-FNIR DATASET".
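To illustrate the simplest of the conventional classifiers listed above, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is not the authors' implementation: the class name `NaiveBayesText`, the `tokenize` helper, the Laplace smoothing constant, and the six toy sentences are all assumptions made for illustration, and the toy sentences are invented rather than drawn from the COVID19-FNIR dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesText:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (an assumption, not from the paper)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples, for illustration only.
train_texts = [
    "drinking bleach cures the virus",
    "5g towers spread the virus",
    "garlic cures covid overnight",
    "vaccines reduce severe covid illness",
    "wash hands to reduce virus spread",
    "masks reduce transmission indoors",
]
train_labels = ["fake", "fake", "fake", "real", "real", "real"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("garlic cures the virus"))
print(model.predict("masks reduce transmission"))
```

A real experiment would use a full dataset, a held-out test split, and a library implementation; the sketch only shows how word counts per class turn into a prediction.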
These models were evaluated on accuracy, F1 score, recall, precision, and ROC, with preprocessing techniques such as stemming and lemmatization applied to the text.
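The evaluation metrics named above can be computed directly from predictions. The following is a minimal sketch in plain Python for the binary case, assuming label 1 marks misinformation and label 0 marks genuine content. The function names `binary_metrics` and `roc_auc` and the example labels and scores are hypothetical, and the ROC figure is computed here as the area under the ROC curve via pairwise ranking, which may differ from how the paper reports ROC.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    positive example scores higher than a randomly chosen negative one,
    with ties counted as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("roc_auc needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels and classifier scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print(auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas the area under the ROC curve depends only on how the scores rank the examples, which is why the two kinds of figure are usually reported side by side.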
The results showed that, among the conventional ML classifiers, SVM performed well, achieving a 94.41% F1 score.
DL models with Word2Vec embeddings exceeded 98% in all performance metrics (accuracy, F1 score, recall, precision, and ROC).
The CNN+LSTM hybrid models also exceeded 98% across performance metrics, outperforming pretrained models like DistilBERT and RoBERTa.
Our study concludes that DL and hybrid DL models are more effective than conventional ML algorithms for detecting COVID-19 misinformation on OSNs.
The findings highlight the importance of advanced neural network approaches and large-scale pretraining in misinformation detection.
Future research should optimize these models for various misinformation types and adapt to changing OSNs, aiding in combating health misinformation.
Comment: 8 pages, 4 tables; presented at the OASIS workshop of the ACM Hypertext and Social Media Conference 2024
Sikosana, Mkululi; Ajao, Oluwaseun; Maudsley-Barton, Sean, 2024, A Comparative Study of Hybrid Models in Health Misinformation Text Classification