Invader/s
Post by Jaume Palmer Real. Supervised by Sune Lehmann. Formatted by Peter Edsberg Møllgaard. Published: January 11, 2022.
At the end of 2019 the SARS-CoV-2 virus emerged and developed into a global pandemic. Consequently, (virtually) all aspects of our lives were affected. However, despite the feeling that the virus paralyzed the world, the pandemic itself was far from still. The spread of the virus has taken many forms and raised many questions concerning our safety and well-being: the mortality rate of the virus, the imposition of nationwide lockdowns, or the effectiveness of masks and vaccines, just to name a few. The urgency of these topics pushed them to the front page of all our discussions, as we showed in the first issue of this project. This continuous presence raised a very simple question: do political beliefs affect COVID discussion?
Our intuition (and prejudices) might want to answer this question right away. For this reason, this work sets out to quantify and visualize whether such a political bias exists. To do so, we used data from Reddit and analyzed post submissions and comments from key communities. Specifically, we identified keywords that represent the topics and questions that the pandemic has raised. We then analyzed these keywords in mainstream political communities to evaluate whether there are differences in their usage.
The year 2020 will be forever linked to the pandemic. The virus started to spread globally during January, and by December the first people were getting vaccinated. Thus, 2020 encapsulates most of the major breakthroughs and controversies that defined COVID-19, and this work focuses only on that strange year. To represent the major COVID-related controversies that flooded 2020, 19 words were selected from the corpus of all submission titles in the r/Coronavirus community. This community is dedicated to sharing global news about the evolution of the pandemic, so it should work as a proxy for public discussion.
The following group of infamous words(1) were the most used during 2020:
China, confirm, death, die, health, hospit, infect, lockdown, mask, outbreak, pandem, patient, posit, record, report, spread, studi, test, vaccin.
Looking at all of them together, they appear as a foggy cloud of concepts that range from hardly surprising to not surprising at all. Checking them individually helps to untangle this cloud and sets out a timeline that all of us remember well.
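As a rough illustration of how frequent words can be pulled from a corpus of submission titles, here is a minimal sketch. It is not the code used for this post: the stop-word list and the example titles are made up, and the stemming step that produces forms such as hospit or vaccin is not reproduced.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "of", "in", "to", "and", "for", "is", "on", "as"}

def top_words(titles, n):
    """Return the n most frequent non-stop-word tokens across all titles."""
    counts = Counter()
    for title in titles:
        tokens = re.findall(r"[a-z]+", title.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

# Invented titles, for illustration only.
titles = [
    "New lockdown announced as cases spread",
    "Vaccine trial reports positive results",
    "Lockdown extended in the capital",
    "Mask mandate and lockdown rules updated",
]
print(top_words(titles, 1))
```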
Throughout the year, the relative importance of each topic changed as we collectively adapted to and overcame new challenges. This can be easily seen by comparing the usage of each word to that of the other 18 words: the higher up a word appears, the more important it was during a given period.
It is interesting to see the correlation between some of the words; as a starter, these are clashes worth checking:
- The takeover of outbreak by pandemic
- The inverse correlation between hospital and lockdown
- The ominous importance of the word death, despite the hopeful upward race of vaccine
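The ranking behind this kind of comparison can be sketched in a few lines. This is only an illustration under our own assumptions: the input layout (`{month: {word: count}}`) and the counts are invented, and ties are broken alphabetically.

```python
def monthly_ranks(counts):
    """Map each month to its words, ordered from most to least used."""
    return {
        month: sorted(by_word, key=lambda w: (-by_word[w], w))
        for month, by_word in counts.items()
    }

# Toy counts, not the real data.
counts = {
    "2020-01": {"outbreak": 50, "pandem": 5, "vaccin": 2},
    "2020-12": {"outbreak": 4, "pandem": 30, "vaccin": 45},
}
ranks = monthly_ranks(counts)
for month, words in ranks.items():
    print(month, words)
```

Following a word's position in each month's list shows whether it is climbing or falling relative to the others.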
The words presented above describe a landscape of topics that took over public discussion. However, the reactions they caused might have differed depending on the communities in which they were discussed. To sort this out, this section compares the usage of the 19 words in the comment sections of two major political subreddits. Given that Reddit is a US-based website, this study focuses on US politics. So, in order to check for differences in political discussion, we focused on communities dedicated to the two major US political parties: r/republican and r/democrats.
First of all, an easy check: the total use of the 19 corona words.
It already looks like the words were used more in the republican category; still, the size is comparable in both. We can go a bit deeper by comparing the total volume(2) of each word in each political community. This gives us a better picture of what is causing the difference that we see in the previous figure.
Notice that neutral words, such as test and confirm, have comparable sizes. On the other hand, words with more controversial undertones show significant differences in their volumes. It is fair to assume that controversy generates arguments. Typically, in such arguments, people post longer comments to make sure that their point gets across. So, with the next figure, we wanted to check whether comments containing corona-related words are longer than others.
It looks like corona words tend to appear in longer comments, and the longest comments contain words that relate to health (hospit and patient). Still, focusing on each particular word, there are no clear differences between the republican and democrat communities.
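A comparison of this kind can be sketched as follows. This is a simplified illustration, not the code behind the figure: length is measured in whitespace-separated tokens, a word is matched by prefix so that a stem such as hospit also catches hospital, and the example comments are invented.

```python
from statistics import mean

def mean_length_by_presence(comments, stem):
    """Return (mean length with stem, mean length without stem), in tokens.

    Either value is None when no comment falls in that group.
    """
    with_stem, without_stem = [], []
    for text in comments:
        tokens = text.lower().split()
        # Prefix match so that "hospit" also matches "hospital", "hospitals", ...
        has_stem = any(token.startswith(stem) for token in tokens)
        (with_stem if has_stem else without_stem).append(len(tokens))

    def average(lengths):
        return mean(lengths) if lengths else None

    return average(with_stem), average(without_stem)

# Invented comments, for illustration only.
comments = [
    "the hospital near me is full and staff are exhausted",
    "agreed",
    "hospitals need more funding",
    "nice post",
]
result = mean_length_by_presence(comments, "hospit")
print(result)
```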
As we saw above when studying the r/Coronavirus community, the volume of these 19 words changed with time, as the relevance of the topics they described shifted. It is therefore interesting to study how frequently these words were used in the two political communities. However, r/republican and r/democrats are as broad as political communities can be, so COVID-related discussion shares the space with many other topics. To narrow the search, we focused on days when the use of COVID words was abnormally high and marked them with a ⬛; the larger the ⬛, the higher the surge.
Notice that the surges persist over time: they seldom disappear after one day. Also, in general, they do not appear alone: when the volume of one word peaks, so do many of the others, creating vertical patterns in the previous figure. A priori, the triggers behind the surges are not known, but we included different 2020 milestones to give temporal context, and we invite the reader to explore and come up with a reasonable explanation.
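One simple way to flag "abnormally high" days is a z-score threshold, sketched below. This is an assumption made for illustration: the post does not state which rule was actually used, and both the threshold `k` and the daily counts are made up. The z-score could then drive the size of the marker.

```python
from statistics import mean, pstdev

def find_surges(daily_counts, k=2.0):
    """Return {day_index: z_score} for days more than k standard
    deviations above the mean daily count."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return {}  # a flat series has no surges
    z_scores = [(count - mu) / sigma for count in daily_counts]
    return {day: z for day, z in enumerate(z_scores) if z > k}

# Toy daily counts of one word, not the real data.
daily = [3, 4, 2, 3, 5, 4, 3, 40, 22, 4]
surges = find_surges(daily)
print(sorted(surges))
```

With a lower threshold (`k=1.0`) the day after the peak is flagged as well, which mirrors how a surge can linger for more than one day.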
The previous analysis suggests that there is a difference in how the words are treated in each political community and we would like to visualize it in a compact and clear way. Comparing the volumes of each word might not be enough, since we have seen that the time when a word is used can be as important. To combine the volume and temporal use we have computed the monthly volume of each word, and correlated the trend in both political communities.
# Metric
If the usage of a word in r/republican is similar to its usage in r/democrats, then the cloud of dots will fall along the reference line. If the dots are “tilted” towards one community’s axis, then that word was systematically used more in that community. The slope therefore summarizes how similarly a word was used in the two political communities during 2020.
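To make the idea of the slope concrete, here is a minimal sketch that fits a line through the origin to paired monthly volumes. The fitting method, the choice of axes (r/democrats on x, r/republican on y) and the numbers are assumptions made for illustration, not a description of the exact procedure used here. Under these assumptions, and taking the reference line to be the diagonal, a slope of 1 means equal usage and a slope above 1 means a tilt towards r/republican.

```python
def slope_through_origin(x, y):
    """Least-squares slope m of the model y = m * x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    denominator = sum(a * a for a in x)
    if denominator == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(a * b for a, b in zip(x, y)) / denominator

# Toy monthly volumes of one word, not the real data.
democrats = [10, 20, 30]
republican = [20, 40, 60]
slope = slope_through_origin(democrats, republican)
print(slope)
```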
But these communities already existed before COVID hit. So, was the usage of these words “normal”, or was it indeed triggered by the pandemic? We checked this by measuring the volume of each word during 2019 and computing the ratio between its volume in 2020 and in the previous year. Finally, the slope of the previous figure and the volume ratio are two metrics that are easily comparable, which allowed us to map the political bias in one single final plot.
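The year-over-year ratio itself is a one-line computation; the sketch below only adds a guard for words that did not appear at all in 2019. The volumes are invented for illustration.

```python
def volume_ratio(volume_2020, volume_2019):
    """Ratio of 2020 volume to 2019 volume; None if the word was absent in 2019."""
    if volume_2019 == 0:
        return None
    return volume_2020 / volume_2019

# Toy (2020, 2019) volumes, not the real data.
volumes = {"lockdown": (500, 5), "report": (120, 100)}
ratios = {word: volume_ratio(v20, v19) for word, (v20, v19) in volumes.items()}
print(ratios)
```

A ratio close to 1 suggests a word was already in regular use before the pandemic, while a large ratio suggests its usage was triggered by it.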
The map shows a clear bias towards the r/republican community. The severity of this bias is striking, which raises the question of whether this community is an appropriate representative of the republican mindset, or whether it caters to a more polarized audience than r/democrats. However, this question is outside the scope of this work. Notice that the lockdown was a far more important topic in the r/republican community, while the discussion of masks and vaccines was closer in both political communities. As already hinted in previous figures, there is a big difference in the usage of the word China, even though, since it appears closer to the horizontal axis, it looks like it was already a recurrent republican topic in 2019. In general, the lower a word appears, the broader and more politically neutral it is, which again highlights that the reaction to COVID discussion was different in both communities. Yet this map, and the rest of the work for that matter, is open for discussion (it deals with politics after all).
Now, to conclude, we believe there is one thing left to highlight about the last figure. Despite the noticeable differences, after all this effort and discussion, the pandemic is still up there, standing out in the middle of everything else.