Introduction: ScotPublicHealth work has always had a double purpose – exploring both public health and social media techniques. This post uses social media – and social network analysis (using NodeXL) in particular – to document tweeting about health/ healthcare. It looks at top influencers at UK and global level.
I have developed an interest in social network analysis over the past year. The method allows us to look at interactions between Twitter users. It also provides a complete extract of tweets between stated time limits that can be used for further analysis in Excel.
Back in July 2017 Helen Bevan’s team at NHS Horizons contacted me to see whether these methods could be used to identify the top 3% of influencers around health/healthcare tweets. I set out to refine the social network analysis methods that I had been applying to simple Twitter searches (eg around a public health awareness campaign or conference, with tweeting around a single hashtag) to answer this rather more complex question.
There were 3 main purposes to this work:
| To attempt to answer the 85%:3% question and identify the key influencers | Help these tweeters understand their reach and influence; identify other influential members of the health/healthcare community who may be creating useful content |
| To refine and describe the methodology | Allow others to repeat and refine the approach and apply it to other questions; allow peer review |
| To summarise the type of content identified | Understand the type of content that is achieving success, and the tweeters generating that content |
Methods: I ran a series of exploratory NodeXL searches and then identified a time period to run three searches, at a UK and global level:
- Health topics (general topics and words/ phrases emerging from exploratory searches)
- NHS (using geocode for UK focused search and #NHS (hashtag) for global search). I found that a search of “NHS” at a global scale was not specific enough for the purposes of this work – eg see this Storify. While the NHS is specifically a UK health service, there is tweeting about the NHS globally, in a health and healthcare context.
- Influential health/ healthcare tweeters at a UK and global level. I included the IHI in the UK search because a considerable volume of tweeting/retweeting about the IHI is from the UK, as illustrated in “Follow the Hashtag” searches.
I identified top tweets for each search using methods described previously, then removed duplicates and very widely retweeted tweets that were not relevant to health/healthcare (eg “viral” tweets with humorous content that had been retweeted by one or other of the prominent tweeters). I excluded a small number of tweets that did not use the Roman alphabet (tweets with emojis were included in the analysis).
Tweets were only included once in each analysis (UK and world searches were analysed separately so there may have been tweets included in both).
The way that NodeXL works has been well described previously. For the “influential tweeters” search, for example, the tweets identified included tweets by, or mentioning, the influential tweeter, or retweeted by the influential tweeter. By studying these interactions the NodeXL extract pulls in tweets from a much wider group than just the tweeters included in the original search term.
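To illustrate why the extract reaches beyond the named accounts, here is a minimal Python sketch of the inclusion rule described above (tweets by, mentioning, or retweeted by a searched account). It is not how NodeXL is implemented; the `Tweet` record, the function names and the account names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record; a real NodeXL extract has many more columns.
    author: str
    text: str
    mentions: tuple = ()
    retweeted_by: tuple = ()


def matches_seed(tweet, seeds):
    """True if the tweet is by, mentions, or was retweeted by a seed account.

    `seeds` is expected to be a set of lower-case account names.
    """
    involved = (tweet.author, *tweet.mentions, *tweet.retweeted_by)
    return any(name.lower() in seeds for name in involved)


def accounts_in_extract(tweets, seeds):
    """Collect every account that appears in a tweet matching the seed list."""
    seeds = {s.lower() for s in seeds}
    found = set()
    for t in tweets:
        if matches_seed(t, seeds):
            found.update(n.lower() for n in (t.author, *t.mentions, *t.retweeted_by))
    return found


# Toy data: one searched ("seed") account and three other accounts.
tweets = [
    Tweet("seed_account", "Flu jab reminder"),
    Tweet("nurse_a", "Great thread from @seed_account", mentions=("seed_account",)),
    Tweet("gp_b", "Clinic update", retweeted_by=("seed_account",)),
    Tweet("unrelated_c", "Football tonight"),
]
extract_accounts = accounts_in_extract(tweets, ["seed_account"])
```

In this toy example only one account was searched for, yet the extract also contains `nurse_a` and `gp_b`, because they interacted with it; `unrelated_c` is left out.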
For each tweet included, NodeXL extracts information about the tweeter, people interacting with that tweeter, plus date and URL of the tweet and number of times it has been replied to, liked and/or retweeted. I used number of “retweets” as the main metric as it is a measure of the number of times the tweet has been shared with others, and therefore a measure of the popularity or sometimes controversy of that tweet. Looking at retweets rather than tweets is a measure of influence – if somebody tweets a lot but is rarely retweeted then they will not rank prominently in this analysis; if somebody tweets infrequently but their post is shared a lot, then they will appear as influential tweeters in this analysis.
I collated the top tweets (by number of retweets) in two Storify summaries – one for UK, one for global search (20 tweets for each of the three main searches listed above).
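The de-duplication and ranking steps can be sketched as follows. This is an illustration in Python rather than the Excel workflow I used. It assumes that duplicates are recognised by tweet URL and that the highest retweet count seen for a URL is the one to keep; the function name and example rows are hypothetical.

```python
def top_tweets(rows, n=20):
    """Rank tweets by retweet count after removing duplicates.

    rows: iterable of (url, tweeter, retweets) tuples.
    Assumption: two rows with the same URL are the same tweet, and the
    larger retweet count is the more recent reading.
    """
    best = {}
    for url, tweeter, retweets in rows:
        if url not in best or retweets > best[url][2]:
            best[url] = (url, tweeter, retweets)
    return sorted(best.values(), key=lambda row: row[2], reverse=True)[:n]


# Hypothetical rows: tweet 1 appears twice with different retweet counts.
rows = [
    ("https://example.org/t/1", "tweeter_a", 120),
    ("https://example.org/t/2", "tweeter_b", 300),
    ("https://example.org/t/1", "tweeter_a", 125),
    ("https://example.org/t/3", "tweeter_c", 40),
]
top_two = top_tweets(rows, n=2)
```

With `n=20` the same function would give the "top 20" list for a single search.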
I focused searches on two periods:
- at a UK level from 19:29(UTC) on 19 Oct to 17:16(UTC) on 23 Oct 2017
- at a global level from 18:12(UTC) on 24 Oct to 05:38 on 26 Oct 2017
I had to decide on suitable terms and run these time-consuming analyses in a relatively narrow time frame (9-10 days). I included common health and health service topics, aiming for as broad a search as possible, and also topical issues that appeared in exploratory searches (eg polio). There are similarities to conducting a literature review – casting the net wide, then focusing in to answer the question.
Some of the searches looked at topics that were so large that they only included a few hours of tweets. I repeated these searches for the periods/ geographies listed above. NodeXL will capture tweets that were tweeted or retweeted during the window of study. As the periods extracted for larger searches were predominantly at the end of the day these extracts will have picked up some of the larger tweets from earlier in the day; some less popular tweets will have been missed. I am in the process of uploading the files to the NodeXL graph gallery website. Searches already uploaded are listed below:
- UK health/healthcare topic search: 20 October (19 hours, 13 minutes): go to NodeXL report (still to upload rest of these reports)
- “NHS” search focused on UK using geocode term: to upload
- Influential UK tweeters: complete period: go to NodeXL report
- Global health/healthcare topic search: 24 October (1 hour, 26 minutes): go to NodeXL report
- Global health/healthcare topic search: 25 October (1 hour, 10 minutes): go to NodeXL report
- Global health/ healthcare topic search: 26 October (49 minutes): go to NodeXL report
- Global #NHS search: complete period: go to NodeXL report
- Influential global tweeters: ran four searches as listed here (all complete period): search 1, search 2, search 3, search 4 (the latter consisting of US influential tweeters listed in a tweet by @cmichaelgibson that I had not already included in previous searches).
The list of tweeters and terms included in the searches was initially based on size of Twitter following and a record of high quality tweeting on health/healthcare. It does not represent a definitive list, and there may be prominent health/healthcare tweeters whom I have missed out, or who did not tweet during the period of the study.
Some additional influential tweeters were identified from the results of the searches listed above, but these accounts could have been substantially under-represented, so I ran a further series of searches to compare these individuals with the top 20 tweeters from the global analysis. These tweeters were @doctor_oxford, @drtedros, @keepnhspublic, @healingmb, @shaunlintern, @marcuschown and @cpeedell.
Results: Attempting to answer the 85%:3% question, the following tables show the results by search strategy and overall, for UK (Table 1) and global (Table 2) searches.
Table 1: UK searches:
Table 2: Global searches:
For the UK “big name” tweeters the results are very close to the 85%:3% “rule” (Figure 1). In total, 105 of the 3,260 tweeters identified overall (3.2%) accounted for 85% of the retweets. The result is less close for the global search of “big name” tweeters: 4.9% of tweeters accounted for 85% of the retweets.
Figure 1 shows the results for influential tweeters at a UK level (85% of retweets were for tweets from 3.2% of tweeters).
Figure 1: Retweets and cumulative % for the “influential tweeters” included in the UK extract.
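The cumulative calculation behind this kind of chart can be sketched in Python as below. It is a minimal illustration, assuming that retweets have already been totalled per tweeter; the function name and toy counts are made up, and the figures reported here came from NodeXL extracts analysed in Excel, not from this code.

```python
def share_for_threshold(retweets_by_tweeter, threshold=0.85):
    """Return (k, share): the smallest number of top tweeters whose
    retweets reach `threshold` of the total, and k as a fraction of
    all tweeters."""
    counts = sorted(retweets_by_tweeter.values(), reverse=True)
    if not counts:
        raise ValueError("no tweeters supplied")
    total = sum(counts)
    running = 0
    for k, count in enumerate(counts, start=1):
        running += count
        if running >= threshold * total:
            return k, k / len(counts)
    return len(counts), 1.0


# Toy data: 7 tweeters, 100 retweets in total.
toy = {"a": 50, "b": 30, "c": 10, "d": 4, "e": 3, "f": 2, "g": 1}
k, share = share_for_threshold(toy)

# The UK "influential tweeters" figure quoted in the text:
uk_share_pct = round(100 * 105 / 3260, 1)
```

In the toy data three of the seven tweeters are needed to reach 85% of retweets. For the UK extract, 105 of 3,260 tweeters works out at 3.2%.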
The top tweeters are listed at the end of this blog, separately for the UK (Appendix 1) and global searches (Appendix 2).
The content of the top 20 tweets for each of the 3 search strategies is shown in Storify summaries – one for the UK-based search, the other for global tweets. There is considerable diversity of tweeting for each of the searches, including some content that is of limited use, some tweets which are likely to be “one offs” as they are specific to a particular set of circumstances, and other tweets that are questionable scientifically or politically. However, many of the tweets are informative and some are from the influential tweeters included in the searches, suggesting influence at a UK and/or global scale.
Finally, a fuller search to include additional influencers identified during the course of this work was run on 30 October 2017 as summarised in this Storify. The results are shown in Appendix 3. This analysis of 4,156 tweets identified tweets by 2,593 tweeters. One third of the retweets were for @healingmb, a Canadian account posting health “memes”, aimed at the general public. For this whole body of tweeters 85% of retweets were for tweets from 45 tweeters (1.74%). Excluding @healingmb from the analysis, 84 tweeters made up 85% of retweets (3.24%). @NHSMillion tweets accounted for 23% of the retweets; excluding @NHSMillion from the analysis, 117 tweeters made up 85% of retweets (4.52%). Excluding the next biggest influencer (@WHO), 145 tweeters made up 85% of retweets (5.60%). This range (1.74-5.60%) illustrates the sensitivity of the analysis to inclusion criteria. Of note, the number of retweets does not follow a linear relationship to number of followers; some tweeters with a medium-sized following have very considerable impact.
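The percentages in this sensitivity analysis can be reproduced with simple arithmetic. The sketch below assumes that the exclusions are cumulative and that each excluded account is also removed from the denominator (2,593, then 2,592, 2,591 and 2,590). The text does not state this explicitly, but it is the reading that matches the reported figures. The variable names are illustrative only.

```python
TOTAL_TWEETERS = 2593  # tweeters identified in the 30 October 2017 extract

# (description, tweeters needed for 85% of retweets, accounts excluded so far)
steps = [
    ("all tweeters", 45, 0),
    ("excluding @healingmb", 84, 1),
    ("also excluding @NHSMillion", 117, 2),
    ("also excluding @WHO", 145, 3),
]


def pct(tweeters_needed, excluded):
    # Assumption: excluded accounts are dropped from the denominator too.
    return 100 * tweeters_needed / (TOTAL_TWEETERS - excluded)


shares = [f"{pct(needed, excluded):.2f}" for _, needed, excluded in steps]

for (label, _, _), share in zip(steps, shares):
    print(f"{label}: {share}%")
```

Removing just three very prominent accounts more than triples the share of tweeters needed to reach 85% of retweets.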
Discussion: This is, as far as I know, a first attempt at identifying top health and healthcare tweeters at a UK and global scale, using a method that looks at both connectedness and the popularity of tweets. The methods have been tried, tested and refined over a period of months, and applied to a recent time point to provide an up to date view of health and healthcare tweeting. All methods are written up and shared on ScotPublicHealth.com pages.
Knowing the “top 3%” of tweeters is helpful for these tweeters in understanding their influence and impact. The results also provide a benchmark for others to judge the impact of their tweets when counting the number of retweets over an equivalent period (as can be extracted easily using Twitter Analytics). For example @thelancet falls a little short of the top 35 list based on retweets of tweets posted on 30 October 2017 (remember that retweets will continue to accumulate, so care is needed to compare periods matched as closely as possible).
While the healthcare and related influential tweeters are largely well known people/organisations in the world of healthcare social media, there are some “breakthrough” posts, whether due to humour, controversy, topicality or serendipity. The positive reasons for such impact – creativity/connectedness/knowledge of audience/communication skills – should be fostered for the smaller tweeters who had an unexpectedly large impact in this analysis. The most retweeted individual tweeters and tweets can be identified by inspecting the Storify summaries. The rest can be identified from the raw data uploaded onto the NodeXL graph gallery (some files still to be charted and uploaded to the site).
This is a “best shot” analysis, with massive extracts obtained opportunistically when I had access to a suitably powerful computer, particularly the weekends of 20-22 October and 27-29 October. I chose these huge searches so that they could be left running with as little human input as possible to avoid disrupting work and family life. Some of the analyses took 5-6 hours by the time the data had been extracted, the interactions calculated, the resulting map drawn and the file uploaded.
There will inevitably be influential health/healthcare tweeters who have not been included. The “health topic” searches that I ran would have been better run as a series of single topic searches, obtaining longer periods per search, but that would have resulted in more post hoc processing that would have stretched the period of the study (Twitter only makes data available 9-10 days into the past). The “NHS” searches were surprisingly non-specific, but I only realised this after I had run three massive searches that took the best part of a day to complete. They identified lots of tweeting from across the world that was completely unrelated to the UK health service. The use of a geocode-based search focused on the UK (centred on Haltwhistle) helped with the UK focused “NHS” search, but by the time I realised that this was required there was not time to run this search across the whole period of the UK analysis. The use of the #NHS hashtag helped increase the specificity for the global search, but I did not have time to run searches looking for tweets about other healthcare systems internationally.
Despite these limitations, NodeXL will still pick up older tweets, if they have been retweeted during the period of the extract. This means that more popular tweets will be over-represented in the analyses that only obtained a few hours of activity. There will be a considerable body of tweeting that is not picked up for shorter extracts – tweets that were posted outside the extract window and not retweeted within it, or that only achieved a small number of retweets in the period immediately after posting. Together these two influences – the over-representation of popular tweets and under-counting of lower profile tweeters – will have an important impact on the analysis, and will affect the accuracy with which we can answer the “85%:3%” question. The “influential tweeters” analysis was very close to the 85%:3% rule for the UK tweeters, but less so for the global search. It is plausible that tweeting within a single country (albeit one with different healthcare systems in the 4 nations that make up the UK) will be more connected than tweeting at a global level. It should be noted that the nature of the searches means that the analysis is largely based on English language tweets.
Conclusions: The analyses of influential tweeters in the UK and globally have shown that an approximation of the 85%:3% rule applies, and is remarkably close for the UK-focused tweeting, and for the global influencers once the most publicly facing account had been removed from the analysis. The more disparate tweeting that goes on around health topics or the NHS follows the “rule” less closely, but nonetheless the figure is roughly centred around the 85%:3% level.
The analysis potentially allows us to identify tweeters with a smaller number of followers but unexpectedly large reach, though repeat analysis would be required to identify whether these popular tweets were “one offs”.
While there are limitations to the analysis, as highlighted above, most of the individuals and organisations listed in the appendices can be considered among the top influencers of health and wellbeing tweeting in the UK and world. (Note – influence on Twitter does not necessarily translate to influence in the real world!)
Dr Graham Mackenzie, Consultant in Public Health, NHS Lothian, 9 November 2017
Appendix 1) UK analysis (combined search, showing organisations or individuals with a professional link to health/healthcare, or a public role. Accounts marked with an asterisk were not specifically included in the “influential tweeters” search, but were identified as prominent tweeters in one or more of the NodeXL extracts). This list represents the top 1.1% of tweeters in the UK for this analysis.
+ NevilleSouthall is best known for his football career, but has become a prominent tweeter in support of the NHS.
Appendix 2) Global analysis (combined search – list below shows just those tweeters specifically named in the “influential tweeters” search). Tweeters down to mentalhealth are in the top 1% of influencers globally in this analysis. Tweeters down to hhsgov are in the top 3% of influencers globally in this analysis.
Appendix 3) Top influencers on 30 October 2017, adding in Twitter users who had not been included specifically in the UK or global searches, but who appeared in Appendix 1 or 2. I have excluded “nuisance tweeters” from this top 35 list: spam and other potentially offensive material (n=8 tweeters). On the basis of the analysis in Appendix 2, which includes the 3 search strategies (topic, NHS, influencers), the accounts listed below are likely to have been among the top 1% of health/healthcare tweeters for 30 October 2017. Many of these tweeters have a similar pattern of tweeting on other days.