An Algorithm Creating for Predicting the Inaccurate Information Presence in Social Networks in Russian Language

Authors

  • Alexander A. Chernyaev, Tyumen State University
  • Alexander G. Ivashko, Tyumen State University

DOI:

https://doi.org/10.17072/1993-0550-2024-1-60-71

Keywords:

Machine learning, neural network, data analysis, linguistic analysis, semantic analysis, social networks

Abstract

The growth of user-to-user communication channels such as social media has pushed the volume of inaccurate information to record levels. The problem affects not only ordinary social media users but also news outlets, which may cite such messages as a source of information. The spread of false information causes both financial harm and threats to life. Tracing these messages manually is practically impossible, so an algorithm is needed that can do it automatically. This paper attempts to build such an algorithm for the Russian language using machine learning methods. The models are built on a manually annotated data sample that was subsequently preprocessed and balanced. From this sample, 29 attributes were derived, falling into three categories: user, text, and distribution. These attributes were used to build classification models that predict the presence of inaccurate information with sufficiently high probability. The result of this work is an algorithm for predicting the presence of inaccurate information in a social network post.
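
The pipeline outlined in the abstract (attributes grouped into categories, a balanced sample, a classifier) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the five feature names stand in for the paper's 29 attributes, the toy posts are invented, random oversampling stands in for whatever balancing procedure the authors used, and a nearest-centroid rule stands in for their classification models.

```python
import random
from collections import Counter

# Hypothetical feature names grouped into the three categories named in the
# abstract. The paper's actual 29 attributes are not reproduced here.
FEATURE_GROUPS = {
    "user": ["followers", "account_age_days"],
    "text": ["exclamation_marks", "has_url"],
    "distribution": ["reposts"],
}
FEATURES = [name for group in FEATURE_GROUPS.values() for name in group]


def to_vector(post):
    """Flatten a post dict into a numeric vector in FEATURES order."""
    return [float(post[name]) for name in FEATURES]


def oversample(rows, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority-class rows."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def fit_centroids(rows, labels):
    """Nearest-centroid model: store the mean feature vector of each class."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, vector):
    """Return the class whose centroid is closest (squared Euclidean distance).

    Features are left unscaled to keep the sketch short; a real model would
    normalize them first.
    """
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, vector))
    return min(centroids, key=lambda cls: dist(centroids[cls]))


# Invented toy sample: label 1 = inaccurate information, 0 = accurate.
posts = [
    ({"followers": 10, "account_age_days": 5, "exclamation_marks": 4,
      "has_url": 0, "reposts": 90}, 1),
    ({"followers": 900, "account_age_days": 700, "exclamation_marks": 0,
      "has_url": 1, "reposts": 5}, 0),
    ({"followers": 800, "account_age_days": 650, "exclamation_marks": 1,
      "has_url": 1, "reposts": 8}, 0),
    ({"followers": 1000, "account_age_days": 900, "exclamation_marks": 0,
      "has_url": 1, "reposts": 3}, 0),
]
rows = [to_vector(post) for post, _ in posts]
labels = [label for _, label in posts]

balanced_rows, balanced_labels = oversample(rows, labels)
centroids = fit_centroids(balanced_rows, balanced_labels)

new_post = {"followers": 20, "account_age_days": 10, "exclamation_marks": 3,
            "has_url": 0, "reposts": 70}
prediction = predict(centroids, to_vector(new_post))
print(prediction)
```

Oversampling by duplication is the simplest way to equalize class counts; synthetic approaches that interpolate new minority samples are a common alternative when duplicates cause overfitting.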

Published

2024-03-29

How to Cite

Chernyaev A. A., & Ivashko A. G. (2024). An Algorithm Creating for Predicting the Inaccurate Information Presence in Social Networks in Russian Language. BULLETIN OF PERM UNIVERSITY. MATHEMATICS. MECHANICS. COMPUTER SCIENCE, 1(64), 60–71. https://doi.org/10.17072/1993-0550-2024-1-60-71