Importing social follower numbers into Google Sheets – 101
This blog was due last week. It also wasn’t meant to be this blog – in actual fact, trying to piece together this section of the thing I was working on took much longer than I’d expected. So, while I’m hopefully going to return to this at some point as part of something larger, here’s how to import follower data into Google Sheets, so you don’t have to spend the time I did.
To begin with, it’s probably a good idea to explain what I’m talking about – this blog will detail how to set up a spreadsheet in Google Sheets that will automatically pull follower data for specified URLs or handles.
While it may at first seem to have limited utility (this is all public data, after all – you can just go and check), the possibilities are pretty good: competitor tracking, growth monitoring (if combined with another section of code, which I’ll hopefully get to near the end), ranking the social presence of publishers for link building, etcetera. Once set up, it can save hours of unproductive time gathering the data by hand.
So, how do you do it and which platforms are available?
The sites I’ve managed to get formulae working for are: Facebook, Google+, Twitter, Pinterest and YouTube. Other than LinkedIn, which I won’t be covering, this represents all of the social channels that Click monitors for reporting purposes. There may be other social channels this will work for – it should work for most sites that don’t render their content in JavaScript – but I haven’t tried. At least, not yet.
There is one special provision, which is for Twitter: I had to use the pared-back /intent/ URL to bypass the more complicated standard desktop code.
How to do it
There’s a link to the sheet below that you can make a copy of if you want to skip to the end, but if you want to know how I fumbled my way through the process, read on.
Most of the important work for this is done using two main formulae:
- IMPORTXML – a function which imports data from structured data types (XML, HTML, CSV etcetera). It lets you supply an XPath query (there’s a nice explanation of XPath on w3schools) pointing to the particular node in which the information you’re looking for is located. This, combined with the VALUE function, accounts for much of the data scraping in the sheet.
- REGEXEXTRACT – pulls out matching substrings using a regular expression. This is only needed in the Facebook example, where the data is being pulled from the meta description on the profile.
- VALUE – converts a string (that Google Sheets is able to understand) into a number.
- CONCAT – used a couple of times simply to append something onto a base URL, saving multiple entries for a platform where the data needs to be pulled from more than one location, or where URLs are differentiated only by the user name (YouTube and Twitter respectively, for example).
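To make the moving parts concrete, here’s a rough sketch of how these functions combine. The cell references and XPath queries below are purely illustrative placeholders – the real selectors depend on each platform’s current markup, and are in the sheet itself:

```
A follower count sitting in its own node (placeholder XPath),
converted to a number with VALUE:

  =VALUE(IMPORTXML(A2, "//span[@class='follower-count']"))

The Facebook-style case: pull the profile's meta description,
then extract the number next to a set string with REGEXEXTRACT:

  =VALUE(REGEXEXTRACT(IMPORTXML(A2, "//meta[@name='description']/@content"), "([0-9,]+) likes"))
```

The pattern is the same throughout the sheet: IMPORTXML fetches the node, and VALUE (with REGEXEXTRACT where needed) turns the text into a number you can chart and archive.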
Few tools are necessary, but one you’ll find useful is a Chrome extension called SelectorGadget, which will allow you to easily create XPaths for use in your formulae. Other than that, however, we’re relying on the power of Google Sheets alone.
As you’ll see when you open the sheet, the process is deceptively simple:
- The first column for each platform provides the sheet with a URL (or, for Twitter, the /intent/ URL is added using CONCAT in the hidden column F).
- The subsequent columns then use IMPORTXML to follow an XPath to the location of a follower count/like count/view count etcetera, and return the count, generally thanks to a VALUE function – or, in the case of Facebook, look up a set string in the meta description and return the associated number.
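For the Twitter case specifically, the two steps above might look something like the following. Again, the cell references and the XPath are illustrative assumptions rather than the formulae from the sheet itself:

```
Hidden column F builds the pared-back /intent/ URL from a handle
(assumed here to be in E2):

  =CONCAT("https://twitter.com/intent/user?screen_name=", E2)

The count column then scrapes that simpler page (placeholder XPath):

  =VALUE(IMPORTXML(F2, "//span[contains(@class, 'follow')]"))
```

Splitting the URL-building and the scraping into separate columns means one handle entry per account is enough, even when a platform needs data pulled from more than one page.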
The actual formulae take time (unless you want to steal them from the sheet below), but by tinkering with the syntax and using SelectorGadget to easily generate your XPaths, you can edit them to return public data from most non-JS sites.
VIEW AND COPY THE SHEET HERE
Another thing that has taken longer than I expected has been the archiving macro – which I would not have got even this far with without our very own Keith Docherty.
As things stand, the macro is messy – as will the sheets it will store data in, but they work and, if I can’t tidy them up, will be perfectly serviceable to use as a data source for Google Data Studio.
The way I’ve worked it is to clumsily adapt some of the aforementioned Keith’s macro wizardry to archive the value of the range on the day the macro runs (which is scheduled for the 1st of each month), by adding a column to a separate sheet. At present, there is a macro for each platform, each of which adds a column to a sheet for that specific platform. It’s inelegant, so I invite any readers with a better grasp to let me know in the comments below how they would approach the process, but the following image should give you an idea of how it’s been done.
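For the curious, the archiving step can be sketched in Google Apps Script roughly as follows. This is a simplified outline of the approach described above, not the macro from the sheet itself – the sheet names and ranges are placeholders:

```javascript
// Simplified sketch of one platform's monthly archiving macro.
// Sheet names and ranges are placeholders; run via a time-driven
// trigger scheduled for the 1st of each month.
function archiveFollowerCounts() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var source = ss.getSheetByName('Live Data');        // the sheet holding the IMPORTXML formulae
  var archive = ss.getSheetByName('Twitter Archive'); // one archive sheet per platform

  // Read the current counts as values, not formulae
  var counts = source.getRange('C2:C20').getValues();

  // Append a new column: today's date as the header, the values beneath
  var col = archive.getLastColumn() + 1;
  archive.getRange(1, col).setValue(new Date());
  archive.getRange(2, col, counts.length, 1).setValues(counts);
}
```

Because each run only appends a column of plain values, the archive sheets stay usable as a straightforward data source for Google Data Studio.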
Finally, a big thank you to this blog by Sharon Machlis at Computerworld, which I found after scrabbling around for a way past Twitter’s desktop site (I briefly got it working with the mobile version, but it broke almost immediately). While I couldn’t get the recording script from that post to work, it also includes a walkthrough for auto-logging results.
For more actionable advice on all aspects of search marketing, subscribe to our blog – or check out our comprehensive resource section. Alternatively, contact us today to see what we can do for your brand.