DIGITAL DEVELOPMENT OF THE SIBERIAN FEDERAL DISTRICT: CLUSTERING OF REGIONS IN A TAG CLOUD
DOI:
https://doi.org/10.17072/2079-7877-2021-3-62-73
Keywords:
human geography, big data, news flow, digital development, tag cloud, cluster analysis, Siberian Federal District
Abstract
Free access to regional news flows and to online tag cloud generators opens up new opportunities for human geography in processing big data and detecting geographical patterns. The aim of the study was to identify the current digital development priorities of ten Siberian regions by clustering them in a tag cloud built from the stream of official regional news. Twelve tags (labels, keywords) reflecting the priorities of digital development were identified. Clouds of the most common tags were created from the news texts published by the regional ministries of digital development during the first five months of 2021. Five bands of tag frequency were set. A measure of the distance between regions in the tag cloud and an algorithm for grouping regions into clusters were proposed. No two regions in Siberia were found to have the same digital development priorities, but the existing differences allow all the regions to be grouped into two clusters. On this basis, the initial hypothesis that digital development priorities are uniform across all the regions was rejected. Ten features of the digital development of Siberia are listed, and directions for further research on this issue are presented.
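The pipeline summarized in the abstract (tag counts, five frequency bands, a distance between regions, grouping into clusters) can be illustrated with a minimal sketch. The abstract does not state the twelve tags, the band boundaries, the distance measure, or the grouping rule, so everything below is an assumption made for illustration only: the tag list, the toy news texts, the scaling of bands by each region's most frequent tag, the Manhattan distance between band vectors, and the threshold-based single-linkage grouping are hypothetical stand-ins, not the author's method.

```python
from collections import Counter

# Hypothetical tags; the study's twelve tags are not listed in the abstract.
TAGS = ["broadband", "e-government", "cybersecurity"]

# Toy news texts per region (invented for illustration, not real data).
NEWS = {
    "Region A": ["Broadband rollout continues. Broadband reaches villages.",
                 "New e-government portal."],
    "Region B": ["Broadband for schools, broadband for clinics.",
                 "E-government services expand. E-government portal updated."],
    "Region C": ["Cybersecurity drills held. Cybersecurity center opens. "
                 "Cybersecurity training."],
}


def tag_counts(texts, tags):
    """Count case-insensitive occurrences of each tag across a region's texts."""
    counts = Counter({tag: 0 for tag in tags})
    for text in texts:
        lowered = text.lower()
        for tag in tags:
            counts[tag] += lowered.count(tag)
    return counts


def to_bands(counts, tags, n_bands=5):
    """Assumed banding: scale each count by the region's top tag, map to 1..n_bands."""
    top = max(counts[tag] for tag in tags)
    if top == 0:
        return {tag: 1 for tag in tags}
    return {tag: min(n_bands, counts[tag] * n_bands // top + 1) for tag in tags}


def distance(a, b, tags):
    """Assumed measure: Manhattan distance between two regions' band vectors."""
    return sum(abs(a[tag] - b[tag]) for tag in tags)


def cluster(bands_by_region, tags, threshold):
    """Assumed grouping: join regions whose distance to any cluster member <= threshold."""
    clusters = []
    for region, bands in bands_by_region.items():
        linked = [c for c in clusters
                  if any(distance(bands, bands_by_region[m], tags) <= threshold
                         for m in c)]
        merged = {region}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters


bands = {region: to_bands(tag_counts(texts, TAGS), TAGS)
         for region, texts in NEWS.items()}
clusters = cluster(bands, TAGS, threshold=3)

for group in clusters:
    print(sorted(group))
```

With these toy inputs, the two regions whose news is dominated by the same tags fall into one group and the third forms its own. A different band definition, distance measure, or threshold would change the grouping, which is why each of those choices is flagged above as an assumption.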