This blog provides access to data on tweets using the #Covid19UK hashtag during the UK lockdown, which began on 24 March 2020. The data were extracted using TAGS, and then mapped using NodeXL. There were of course other UK-focused hashtags used during this period, and some UK-based Covid-19 tweeting did not use any hashtag. However, it was not feasible to capture all the data, so I stuck with one hashtag all the way through. You can see further information, including other search terms and ways of presenting data, in a tweet thread. These data are updated on a weekly basis (except where there is too much data to plot at once, in which case they are charted by day). There’s more about this work on the BMJ Opinion blog.
The URLs in the table point to NodeXL reports that describe the data more fully. Note that some of the early days of the UK lockdown had so much tweeting that it was not possible to obtain a complete data set (either because of the 18,000 tweet & retweet limit or because of difficulties accessing the Twitter API; you can only extract Twitter data 7-10 days into the past). Fortunately, NodeXL also allows us to look at the tweet data for posts retweeted during the periods where there are data. That means that we can access some of the more popular tweets from the periods that were “missed”. For example, for 20 March, pre-lockdown, only 8 hours 21 minutes of data were extracted (15:38 to 23:59). However, retweets during this period allowed the identification of tweets going back over 2 weeks. The following graph shows daily data from 3 to 17 March and hourly data thereafter, plotting tweets but not retweets.
Click the links in the table below to see the NodeXL reports. These are typically weekly reports, starting each Tuesday (as the lockdown began on a Tuesday). To download the raw data, scroll to the bottom of a NodeXL report and select the link “Download the Graph Data as a NodeXL Workbook”. Where the table below has a link for “data” instead, the file was too large to upload to the NodeXL Graph Gallery and was uploaded to Dropbox. Sometimes there was just too much data to analyse in one go, so I have processed those weeks by day instead. Most of the NodeXL reports were produced in September or October. By then some tweets had been deleted, some retweets retracted, and some accounts deleted. For example, for 24 March (re-extracted on 1 October), 14,735 tweets and retweets were identified out of 16,192 originally collected. There may be some interesting findings from looking at the posts that have disappeared in the intervening period: perhaps some were from “troll farm” and bot accounts that have been removed. I have a lot of the original TAGS data, which will allow further study of this question.
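To put the 24 March example in proportion, here is a minimal Python sketch of the arithmetic. It uses only the two figures quoted above; the variable names are my own, and it says nothing about why the posts disappeared.

```python
# Figures quoted in the post for 24 March (re-extracted on 1 October).
originally_collected = 16_192  # tweets and retweets in the original extract
still_available = 14_735       # tweets and retweets found on re-extraction

# Posts that could no longer be retrieved, and their share of the original.
missing = originally_collected - still_available
share_missing = missing / originally_collected

print(f"{missing} tweets and retweets no longer retrievable "
      f"({share_missing:.1%} of the original extract)")
```

The same calculation could be repeated for any other day where both the original TAGS extract and a later re-extraction are available.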
NodeXL graphs (e.g. the graph below for 24 March) show Twitter accounts that tweeted, retweeted, and/or were mentioned in tweets. These are called “vertices”. You can read more about how to interpret these maps in an article by NodeXL and Pew Research. You can also read a lot more about how to analyse and interpret the raw data on the ScotPublicHealth blog and in my papers published in peer-reviewed medical journals.
The number of Twitter accounts included in the weekly extracts has varied over time: it peaked at the start of lockdown, fell to a quiet period over the summer, and increased during September as we contemplated a return of stricter lockdown rules.
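The pattern can be checked against the vertex counts in the table in this post. The Python sketch below copies the figures for lockdown weeks 1 to 30 by hand (the pre-lockdown row is left out). It assumes that each figure is comparable from week to week, which may not hold for weeks that were processed day by day; the variable names are illustrative only.

```python
# Vertex counts for lockdown weeks 1-30, copied from the table in this post.
vertices = [
    86348, 61084, 42330, 40443, 36964, 35491, 55688, 55377, 44803, 44714,
    25091, 21481, 18857, 30340, 23065, 24174, 14243, 9191, 14549, 8357,
    7618, 6125, 8143, 9774, 35498, 24634, 24250, 13756, 17766, 19491,
]

# Map week number (1-based) to its vertex count.
by_week = dict(zip(range(1, len(vertices) + 1), vertices))

# Weeks with the most and fewest accounts.
peak_week = max(by_week, key=by_week.get)
quietest_week = min(by_week, key=by_week.get)

# Size of the September jump between week 24 and week 25.
september_ratio = by_week[25] / by_week[24]

print(f"Peak: week {peak_week} ({by_week[peak_week]} vertices)")
print(f"Quietest: week {quietest_week} ({by_week[quietest_week]} vertices)")
print(f"Week 25 had {september_ratio:.1f} times as many vertices as week 24")
```

On these figures the peak is week 1 (24 – 30 March), the quietest week is week 22 (18 – 24 August), and the count rises sharply between the weeks starting 1 and 8 September.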
Hopefully there will be time over coming months to analyse these data in collaboration with academic teams. Please contact me if you are interested in contributing. However, for the moment my focus is on GP exams and clinical work. Watch this space for further data over coming months.
|Week of lockdown||Dates||Vertices (number of tweeters, retweeters & mentioned accounts)||URL to access data|
|Pre lockdown||20 – 23 March||50737||20 March|
|1||24 – 30 March||86348||24 March|
|2||31 March – 6 April||61084||31 March|
|3||7 – 13 April||42330||Report|
|4||14 – 20 April||40443||Report|
|5||21 – 27 April||36964||Report|
|6||28 April – 4 May||35491||Report|
|7||5 – 11 May||55688||Report|
|8||12 – 18 May||55377||12 May|
|9||19 – 25 May||44803||Full week|
|10||26 May – 1 June||44714||Full week|
|11||2 – 8 June||25091||Full week|
|12||9 – 15 June||21481||Full week|
|13||16 – 22 June||18857||Full week|
|14||23 – 29 June||30340||Full week|
|15||30 June – 6 July||23065||Full week|
|16||7 – 13 July||24174||Full week|
|17||14 – 20 July||14243||Full week|
|18||21 – 27 July||9191||Full week|
|19||28 July – 3 August||14549||Full week|
|20||4 – 10 August||8357||Full week|
|21||11 – 17 August||7618||Full week|
|22||18 – 24 August||6125||Full week|
|23||25 – 31 August||8143||Full week|
|24||1 – 7 September||9774||Full week|
|25||8 – 14 September||35498||Full week|
|26||15 – 21 September||24634||Full week|
|27||22 – 28 September||24250||Full week|
|28||29 September – 5 October||13756||Full week|
|29||6 – 12 October||17766||Full week|
|30||13 – 19 October||19491||Full week|
Dr Graham Mackenzie, GPST2, Edinburgh, Scotland
12 October 2020.