Adventures in social media analysis for understanding healthcare and public health topics.

Between 2016 and 2020 inclusive I explored how to use social media and social network analysis to understand healthcare and public health topics. The list of peer-reviewed papers and blogs (click “Continue reading” below), with a brief description beneath each reference, summarises this work. I have tried to keep each publication original, making new discoveries and advances along the way. Hopefully these publications, and the blogs and pages on the ScotPublicHealth site, will help others make further advances over time, and also understand the pitfalls in social media analysis.

Thank you to all co-authors who have helped make this such an enlightening and enjoyable body of work. It has been a truly international collaboration, across multiple clinical fields, with new connections from Hong Kong, across Europe and North America, plus Australasia and South America along the way through conference abstracts and social media conference summaries.

This concludes planned work on this topic (there’s one further paper in press that explains how you can use a “user search” to gather data over a longer period of time, providing insights into a surgical topic).

Dr Graham Mackenzie, GPST2, Edinburgh, Scotland

January 2021

Continue reading “Adventures in social media analysis for understanding healthcare and public health topics.”

Full data for #Covid19uk during UK lockdown

This blog provides access to data on tweets using the #Covid19UK hashtag during the UK lockdown, which began on 24 March 2020. The data were extracted using TAGS, and then mapped using NodeXL. There were of course other UK-focused hashtags used during this period and some UK-based Covid-19 tweeting that did not use any hashtag. However, it was not feasible to capture all the data, so I stuck with one hashtag all the way through. You can see further information, including other search terms and ways of presenting data, in a tweet thread. These data are updated on a weekly basis (except where there is too much data to plot at once, in which case they are charted by day). There’s more about this work on the BMJ Opinion blog.

Details are provided below, but you may want to start with the summary outputs by month between March/April and December. Until the end of July 2020 these summarise the top tweets (by number of retweets received). From August 2020 onwards I take a different approach, attempting to capture more diversity in tweeters by making sure that no tweeter has more than 3 tweets in the summaries. Click to go to the Wakelet summary. Note that some tweets may have subsequently been deleted by Twitter or the tweeter, and some users will have left Twitter or been suspended. Accordingly, I have included PDF summaries that capture a permanent record. Links to the PDF summaries are included in the Wakelet summaries for each month:

Continue reading “Full data for #Covid19uk during UK lockdown”

A review of the BMJ’s social media content during the covid-19 pandemic

Another post in an occasional series of articles and papers that were not published in peer-reviewed journals or journal blogs.

This blog explores a sample of healthcare-focused tweets across the period of the UK lockdown. With the huge number of tweets posted and retweeted during the pandemic, and the wide range of different hashtags, it was necessary to narrow my search down to look at a single account. I therefore explored the exchange of information between the BMJ (@bmj_latest) and its readers via social media. While most social media searches are limited to a period of 7-10 days into the past, Twitter allows longer-range searches for a single account (up to 3,200 tweets and retweets).

On 20 June I used NodeXL to map several international medical journals (BMJ, BJSM, Lancet, NEJM, JAMA and Nature Medicine) to gain an overview of their tweeting patterns. There was considerable variation in the number of tweets posted and the level of engagement between the different journals and readers. More detail is provided in a Wakelet summary. The BMJ was the most engaged, posting and retweeting much more regularly than any of the other journals studied. The covid-19 lockdown started on 23 March 2020 in the UK. Fortuitously, the NodeXL extract for @bmj_latest extended back to 6 March 2020. The single user search gives a map focused on that user without showing the interactions between other users (Figure 1). We need an additional step to transform the data.

Figure 1. The user list for @BMJ_Latest (16 March to 20 June 2020). Source NodeXL.

Another quirk of Twitter data extraction allows us to extract original tweets back as far as we like, as long as we have the unique tweet ID. This is the number at the end of a tweet URL – e.g. for this BMJ tweet it is 1236199998192312320. I extracted the tweet and retweet IDs from the map in Figure 1 and imported the original tweets into a conventional social network map. This step identifies the original tweets that the BMJ retweeted, showing us connections between tweeters (Figure 2). NodeXL also groups tweeters into communities according to tweeting patterns.
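As a minimal sketch of the "number at the end of a tweet URL" step, the ID can be pulled out with a regular expression. The function name and the example URL shape (`https://twitter.com/<account>/status/<id>`) are assumptions for illustration; this is not how NodeXL does it. The ID is kept as a string because spreadsheets often round 19-digit numbers.

```python
import re
from typing import Optional

# Assumption: tweet URLs contain "/status/<numeric id>", optionally
# followed by a query string such as "?s=20".
_STATUS_ID = re.compile(r"/status(?:es)?/(\d+)")


def tweet_id_from_url(url: str) -> Optional[str]:
    """Return the tweet ID found in a tweet URL, or None if there is none."""
    match = _STATUS_ID.search(url)
    return match.group(1) if match else None


# Illustrative URL built around the ID quoted in the text above.
example_url = "https://twitter.com/bmj_latest/status/1236199998192312320?s=20"
example_id = tweet_id_from_url(example_url)
```

A list of IDs collected this way could then be pasted into whichever import tool is being used.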

Figure 2. The @BMJ_Latest network reconstructed from tweet and retweet IDs (16 March to 20 June 2020). Source NodeXL.

Figure 2 shows different types of connections – for example if mentioned in a tweet, if replied to or replying to a tweet, or if retweeting a post. This map demonstrates that the BMJ interacts broadly with its readers. Of the 3,196 tweets successfully extracted, 1,996 tweets were posted by the BMJ, mentioning 685 tweeters. The remaining 1,200 tweets (38%), each retweeted by the BMJ, were posted by 665 tweeters.
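The split between tweets and retweets can be checked with a few lines of arithmetic. This is only a worked check of the figures quoted above; the variable names are illustrative.

```python
total_extracted = 3196  # tweets successfully extracted (from the text)
posted_by_bmj = 1996    # tweets posted by the BMJ itself (from the text)

# Whatever the BMJ did not post itself, it retweeted from other accounts.
retweeted_by_bmj = total_extracted - posted_by_bmj
share_retweeted = retweeted_by_bmj / total_extracted

print(retweeted_by_bmj, f"{share_retweeted:.0%}")  # prints: 1200 38%
```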

Covid-19 and associated terms were the most used hashtags (Figure 3). The Wakelet summary lists the top tweets posted or retweeted by the BMJ in chronological order. Each tweet included in this summary received at least 100 retweets. These can be considered “viral tweets”: despite over 400,000 followers, the BMJ receives a median of 16 retweets per tweet. Examples of covid-19 coverage included face-to-face and remote assessment in primary care, health inequalities, moral injury and PPE shortages in healthcare workers, face covering, prediction models, risks for black and minority ethnic populations, lessons in loosening lockdown, mental health, schools reopening, concerns about poor quality research and lack of trust in government. The top tweets were not just about covid-19. Tackling racism in healthcare was discussed before lockdown started. Gabapentinoids in managing pain were the subject of a popular infographic. The impact of trade agreements on the NHS was also covered. The breadth of coverage in BMJ tweets and retweets – both for covid-19 and more generally – is impressive and reflects the healthcare and public health focus of the journal and readership.
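For readers who want to apply the same "viral tweet" filter to their own extract, a minimal sketch follows. It assumes you already have a list of retweet counts, one per tweet. The function name and the example counts are made up for illustration and are not BMJ data. The median is used rather than the mean because a handful of viral tweets would otherwise pull the typical value upwards.

```python
from statistics import median


def summarise_retweets(retweet_counts, viral_threshold=100):
    """Return the median retweet count and the counts at or above the threshold."""
    viral = [n for n in retweet_counts if n >= viral_threshold]
    return median(retweet_counts), viral


# Made-up counts for illustration only.
example_counts = [3, 8, 12, 16, 20, 45, 250]
typical, viral = summarise_retweets(example_counts)
```

In this toy example the median is 16 and only one tweet passes the 100-retweet threshold, even though the mean is over 50.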

Figure 3. Hashtags used in the tweets posted and retweeted by @bmj_latest 6 March to 20 June 2020.

This blog demonstrates the use of the “popularity” of a tweet to help sift through a lot of social media data. Applying this approach to the BMJ’s tweets during the pandemic has illustrated the breadth of the journal’s content and has helped identify the “viral content” posted and shared by the journal. Social media is a two-way process. The 120 tweets listed in the Wakelet summary were shared widely – 4% of tweets posted by 1.6% of tweeters in Figure 2 received 30% of all retweets. We can start to understand the most engaging content through the eyes of the journal’s readers and social media followers. This is useful information, but would have been missed had this analysis been attempted just a few days later. Just as the paper journal gets recycled, and tweets move as if by gravity towards the bottom of our screens, social media data is only available transiently. Blink and you miss it. Retrospective data collection becomes very expensive. Healthcare organisations, individual healthcare workers and journals need to adapt to understand a world where social media is integral to learning. Covid-19 accelerates this process.
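The concentration of retweets in a small share of tweets can be measured with a short calculation. This sketch assumes a list of retweet counts, one per tweet. The function name and the toy numbers are illustrative only. It covers the share of tweets, not the share of tweeters, which would need the account name alongside each count.

```python
def concentration(retweet_counts, top_n):
    """Share of tweets, and share of all retweets, accounted for by the top_n tweets."""
    ranked = sorted(retweet_counts, reverse=True)
    total = sum(ranked)
    if not ranked or total == 0:
        return 0.0, 0.0
    top_n = min(top_n, len(ranked))
    return top_n / len(ranked), sum(ranked[:top_n]) / total


# Toy data for illustration only: 10 tweets, two of which dominate.
toy_counts = [200, 150, 30, 10, 5, 5, 0, 0, 0, 0]
tweet_share, retweet_share = concentration(toy_counts, top_n=2)
```

Here 20% of the tweets account for 87.5% of the retweets. The pattern has the same shape as the one described above, only more exaggerated.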

Dr Graham Mackenzie, GPST2, Edinburgh, Scotland

7 August 2020.


How COVID-19 has changed infectious diseases communication on social media: Through the lens of @TheLancetInfDis

This is another post in an occasional series of social media analysis that didn’t reach the pages of peer-reviewed journals. It was completed on 28 June 2020 and rejected by Lancet Infectious Diseases on 16 July 2020.

Article, very slightly modified, follows…

So much has changed in such a short time. It is difficult to stand back and take a long view. In this article I explore the way that infectious diseases messages have been shared via social media over a period of more than two years (11 April 2018 to 18 June 2020), including the pandemic. The detail is provided in two Wakelet summaries – one for tweets by the Journal, the other for retweets by the Journal. Briefly, I used a quirk of Twitter analysis that allows data extracts over a longer period than normal by looking at a single account (@TheLancetInfDis). The original NodeXL extract provided data centred on the Journal’s Twitter account. I remapped this to chart wider connections and updated the data a week later. This approach provided up-to-date information on 1,356 tweets by the @TheLancetInfDis account and 1,840 tweets by 371 other accounts which the Journal retweeted.

thelancetinfdis - user map via NodeXL

thelancetinfdis - network map after extracting details of other tweeters

Prior to the pandemic the Journal covered a wide range of topics: an “angiostrongyliasis to zoster” of infectious disease, with TB, malaria, HIV, antimicrobial resistance and vaccination popular topics. Since January 2020 there has been a stark narrowing of topics, with COVID-19 related terms dominating: though the Journal continued to tweet about topics from anthrax to zika, two thirds of the hashtags used were COVID-19 related. Interactive word clouds provide detail and allow exploration of the individual tweets (you can access these by clicking on the images below, or fuller information in the Wakelet summary of tweets).

Hashtags in @TheLancetInfDis tweets prior to the pandemic - by number of tweets posted

Hashtags in @TheLancetInfDis tweets during the pandemic - by number of tweets posted
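Hashtag counts of the kind that feed these word clouds can be sketched in a few lines. This is a hedged illustration, not the tool used for the figures. The function names, the case-insensitive matching and the list of COVID-19 related terms are all assumptions, and the example tweets are invented.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(tweets):
    """Count hashtags across tweet texts, ignoring case."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


def share_matching(counts, terms):
    """Proportion of all hashtag uses whose tag contains any of the given terms."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    matched = sum(n for tag, n in counts.items() if any(t in tag for t in terms))
    return matched / total


# Invented tweets for illustration only.
example_tweets = ["New paper on #COVID19 and #TB", "#covid19 update", "#Malaria"]
counts = hashtag_counts(example_tweets)
covid_share = share_matching(counts, ("covid", "coronavirus", "sarscov2"))
```

In this toy example half of the hashtag uses are COVID-19 related. The result depends heavily on which terms are treated as related, so the term list should be reported alongside any figure.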

There has simultaneously been a step change in the reception of the Journal’s tweets, peaking in March. During the pandemic, tweets by the Journal received twice as many retweets as before and the content retweeted by the Journal reached a much wider audience (though that will be influenced by a range of factors as some of these retweeted posts were already “going viral”).

Number of tweets made and retweets received by @thelancetinfdis

For the full 26-month period 42/100 of the Journal’s top tweets and 83/100 of the top retweeted posts were about COVID-19. It is not possible from these data to separate out whether these tweets – by the Journal or other tweeters – were shared by the general public or individuals and organisations with a specialist interest in infectious disease. Nonetheless, the individual tweets posted and retweeted provide an international view of the pandemic including research, ethics, public health, clinical, epidemiological, pharmacological, political and social dimensions.

Many of the lessons from COVID-19 are transferable to other infectious diseases. Hopefully the awareness, interest and connections generated during the pandemic can be converted into wider knowledge of infectious diseases among professionals and public after the pandemic.

Dr Graham Mackenzie, GPST2, Edinburgh, Scotland

16 July 2020.


Taking a long-view of tweeting – an example looking at @HelenBevan’s account

Introduction: A few days ago I contacted Helen Bevan to share a social network map of her tweets over a period of almost 2 years. Helen has been a great source of support over the past 5+ years after we met in social media discussions about quality improvement and then in person with the Q Community.

I had run this most recent map of Helen’s tweets as an experiment to look at long range tweeting. Usually it is only possible to look at a few days of tweets. However Twitter allows you to extract tweets over a longer period if looking at a single account, providing access to up to 3,200 tweets and retweets. Helen is well known as one of the UK’s top healthcare tweeters. She is also very supportive of colleagues from across the world, reading and commenting on others’ tweets and blogs and is quick to share useful content with her 86,000+ followers. Mapping her social network connections would help her understand her audience and the content that had most impact. Helen tweeted my map. The map intrigued and confused some of Helen’s followers, so I have posted a blog on this analysis. (The blog is also available as a PDF file; there is also a PDF version of the associated Wakelet summary).

Continue reading “Taking a long-view of tweeting – an example looking at @HelenBevan’s account”

Finding the sweet spot in healthcare social media communication: A call for greater clarity in medical and science hashtags

Scientific communication relies on clarity, specificity and universality. In this blog I explain how communication between medical tweeters is held back by a lack of clarity in hashtag choice, and by the absence of a “fuzzy search” feature in Twitter. I explore lessons from the way that medical research papers are categorised (MeSH headings) and propose options for improving medical tweeting, helping people to look beyond their usual social media bubble. I also demonstrate ways to visualise intentions vs reception for hashtags in two topical issues using word clouds.

I have written this as a blog, because I wanted to include a more reflective exploration of this topic than I could in a traditional medical paper. Hopefully, with the contribution of other medical tweeters, the ideas presented here can be developed into a peer reviewed paper in a medical journal.

Continue reading “Finding the sweet spot in healthcare social media communication: A call for greater clarity in medical and science hashtags”

Health innovation and COVID-19 pandemic: Defining the need and understanding the response.


A question heard on the wards recently – how can we capture all the innovations that have emerged from the COVID-19 pandemic? I’m sure that there are similar questions in hospitals, GP surgeries and other organisations across the world.
In order to answer this question we need to start by defining innovation. The World Health Organization (which might want to drop the American spelling in light of recent political decisions) defines health innovation as follows:

“Health innovation is to develop new or improved health policies, systems, products and technologies, and services and delivery methods that improve people’s health, with a special focus on the needs of vulnerable populations.

  • WHO engages in health innovation in the context of universal health coverage
  • Health innovation adds value in the form of improved efficiency, effectiveness, quality, safety and/or affordability
  • Health innovation can be in preventive, promotive, therapeutic, rehabilitative and/or assistive care”

In classic Public Health style WHO identifies 3 overlapping domains necessary to capture health innovation fully – science innovation (R&D), social innovation, and business innovation – each of which we can see in evidence in the wider pandemic response.

This is a useful definition for a number of reasons:

Continue reading “Health innovation and COVID-19 pandemic: Defining the need and understanding the response.”

The Role of Patient Information Leaflets in the Treatment of Patients

A first year medical school report by Kirsty Mackenzie, University of Dundee, written as part of a Student Selected Component (SSC) on Human Factors (March 2018). This is also available as a PDF.


Over the past century, and particularly over the last few decades, there has been a huge shift in the way in which patients interact with doctors. In the past, patients were given very little information about their conditions or their treatments. Medicine was very paternalistic and there was little room for patients to question the doctor’s decisions or to make choices for themselves. The public had very little scientific knowledge and blindly agreed to treatments that they may not have needed or wanted(1). This was not in the best interests of patients because they had no control over their own health and this must have left them feeling less content and more anxious about what they were going through. The old model of ‘doctor knows best’ has in recent times been put aside in favour of ‘person-centred care’. The Royal College of Nursing states ‘[Person centred care] means that the person is an equal partner in the planning of care and that his or her opinions are important and are respected(2).’ This term was first coined by the psychotherapist Carl Rogers building on earlier ideas proposed by healthcare workers. Further building on Rogers’ ideas, the psychiatrist George Engel promoted ‘the move from a medical to a biopsychosocial model of health(3).’ His ideas have been widely credited with being responsible for the shift in the model of care.

Continue reading “The Role of Patient Information Leaflets in the Treatment of Patients”

One step beyond: Mapping older tweets and retweets

Over the past 3 years I have been studying social networks for health (e.g. public health campaigns, clinical conferences). I have been collaborating with clinicians and analysts across the world in this work, publishing some of the outputs in peer-reviewed journals as listed below, studying the content, influencers, components of tweets that could influence retweeting, commercial influences in conference tweeting, responses beyond the hashtag, and looking at hierarchy of tweeting. Some of these have been published already – e.g. most recently a paper with Muge Cevik and David Ong, available for a few more weeks in free full text. Look out for details of the remaining papers over coming months.

Summary of social media work - papers

One area that I have been keen to explore, but have not been able to until now, is mapping older tweets and retweets. Twitter provides access to tweets and retweets over the past 10 days. Sometimes a network will take longer than 10 days to establish, at which point the data become difficult – or expensive – to extract. This blog explains how to extract older tweets and retweets manually so that they can be mapped well beyond the 10 day limit, using NodeXL. I have used the example of #CwPAMS (Commonwealth Partnerships for Antimicrobial Stewardship) following a request by Diane Ashiru. I have illustrated this using 30 tweets first, and then the retweets that followed. It would be possible with patience and time to map all the #CwPAMS tweets using this method.

Continue reading “One step beyond: Mapping older tweets and retweets”

Reflections on live tweeting social media analysis from #RCGPAC.

The Royal College of General Practitioners Annual Conference (RCGPAC) 2019 has embraced the use of social media to disseminate information to conference delegates and a wider audience beyond the conference hall. In an experiment at RCGPAC 2019, social media analysis was shared live from the conference hall throughout the conference, summarising the top content, identifying the main contributors and encouraging delegates to use the conference hashtag (#RCGPAC) to aid identification and dissemination of tweets. This idea emerged after an analysis of tweeting from the 2018 conference showed that a substantial proportion of tweets had omitted the official conference hashtag and were therefore less likely to reach their intended audience. Throughout the conference I shared social media analysis in a tweet thread and an ongoing Wakelet summary capturing the most popular tweets. This analysis has, in turn, fed into GP Online articles.

Continue reading “Reflections on live tweeting social media analysis from #RCGPAC.”