A screenshot of videos posted to the #voicetoparliament hashtag

How do you measure the popularity of videos on TikTok?

TikTok video stats, and the question of whether a video is “viral” on the platform or not, were in the spotlight recently due to TikTok users posting videos about Osama bin Laden’s letter to America.

This reminded me that I’d been meaning to write up some methods and background info about my own experience with analysing the use of TikTok for political purposes.

During the Indigenous Voice to Parliament referendum campaign in Australia, a significant number of videos about the proposed Voice were shared to TikTok, from both the official Yes and No campaigns, and from others – activists, politicians, and regular people.

Since we had already done quite a bit of work scrutinising how Facebook and Google ads were being used for political campaigning, we wanted to figure out what was happening on TikTok. However, while Facebook and Google both have ad transparency databases, APIs and other tools like CrowdTangle [1], TikTok has yet to open up transparency tools outside of Europe.

So, we had to figure out how to answer the following questions: what were the most viewed videos about the Voice to Parliament? Was either side of the campaign doing better on TikTok than the other? Was the content of the videos factual, or was the platform being used to mislead voters?

The last question is important, but it’s a more qualitative exercise. The first two questions feel like they should be possible to answer, if tricky – without an official stats API, I thought we might have to scrape a sample of videos on the web version of TikTok to get the public-facing stats the site provides: approximate counts of plays, likes, comments and so on.

The other issue is TikTok’s famous content algorithm [2]. If we were to scrape content, like the videos on a search page or hashtag page, would that content be a comprehensive ranked list of all the videos matching the search criteria, or would we only be shown the content TikTok’s algorithm thought we should see?

Given that this could render the analysis almost useless, we ran some tests using my TikTok account and those of two colleagues. For a given hashtag, we compared the videos shown to each user in the mobile app and on the website.

The result? The feed for hashtags on the app appears to be algorithmically sorted per user, but the feed for hashtags on the TikTok website appears to be the same across users.

This means we were able to use the website hashtag feed for our analysis.
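A check like this is easy to script. Here’s a minimal sketch, not the exact process we followed, assuming you’ve already collected the ordered list of video IDs each account saw for a hashtag. The function name `compare_feeds` and the sample IDs are made up for illustration.

```python
from itertools import combinations


def compare_feeds(feeds):
    """Compare every pair of feeds.

    feeds maps a label (e.g. "web_user_a") to the ordered list of
    video IDs that account was shown for the same hashtag.
    """
    results = {}
    for first, second in combinations(feeds, 2):
        ids_a, ids_b = feeds[first], feeds[second]
        union = set(ids_a) | set(ids_b)
        shared = set(ids_a) & set(ids_b)
        results[(first, second)] = {
            # True only if both accounts saw the same videos in the same order
            "same_order": ids_a == ids_b,
            # Jaccard overlap: share of all videos seen that both accounts saw
            "overlap": len(shared) / len(union) if union else 1.0,
        }
    return results


# Hypothetical sample data, not real TikTok video IDs
feeds = {
    "web_user_a": ["v1", "v2", "v3", "v4"],
    "web_user_b": ["v1", "v2", "v3", "v4"],
    "app_user_a": ["v3", "v9", "v1", "v7"],
}

report = compare_feeds(feeds)
for pair, result in report.items():
    print(pair, result)
```

Identical order across accounts suggests the feed isn’t personalised, while low overlap or a shuffled order suggests it is.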


To actually get the data, I used the TikTokApi Python library, which is a wrapper for the unofficial TikTok API. An “unofficial API” is basically an API that is publicly accessible and used to provide content for a website or app, but has no documentation because it isn’t built for use outside of those purposes.

Unofficial APIs are usually the best way to scrape website data these days. This is because so many websites dynamically generate content, or use frameworks that make reliably targeting specific elements tricky, so scraping the HTML itself isn’t possible without running a headless browser. Using the API directly is also much faster and less resource-intensive than running a headless browser.
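To show why API responses are nicer to work with than HTML, here’s a small sketch of turning one video item into a flat row using plain key lookups, with no selectors or browser involved. The field names are the ones the scraper in this post reads; the helper name `flatten_video` and the sample item are made up, and a real response will contain many more fields.

```python
def flatten_video(item):
    """Pull the fields we care about out of one video item (a dict)."""
    author = item.get("author", {})
    stats = item.get("stats", {})
    username = author.get("uniqueId")
    video_id = item.get("id")
    return {
        "username": username,
        "title": item.get("desc"),
        "video_id": video_id,
        # .get() returns None instead of raising if a stat is missing
        "plays": stats.get("playCount"),
        "liked": stats.get("diggCount"),
        "comments": stats.get("commentCount"),
        "shares": stats.get("shareCount"),
        "saved": stats.get("collectCount"),
        "url": f"https://www.tiktok.com/@{username}/video/{video_id}",
    }


# Hypothetical item, shaped like the data the scraper reads
sample_item = {
    "id": "123",
    "desc": "An example video",
    "author": {"uniqueId": "example_user"},
    "stats": {"playCount": 1200, "diggCount": 80, "commentCount": 5},
}

row = flatten_video(sample_item)
print(row)
```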

Putting this all together, the resulting code was quite simple. Get the data, stick it in a sqlite database [3], then run some analysis on the results for a few different hashtags, and compare the figures for the main official campaign accounts:

from TikTokApi import TikTokApi
import asyncio
import scraperwiki
import asyncstdlib as a

# Your own ms_token, taken from your cookies on tiktok.com
ms_token = 'YOUR TOKEN HERE'

async def get_hashtag(hashtag):
    async with TikTokApi() as api:
        # create_sessions expects a list of tokens, so the single token is wrapped here
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3)
        tag = api.hashtag(name=hashtag)
        # Ask for up to 200 videos; i records the order the feed returned them in
        async for i, video in a.enumerate(tag.videos(count=200)):
            item = video.as_dict
            data = {}
            data['username'] = item['author']['uniqueId']
            data['title'] = item['desc']
            data['video_id'] = item['id']
            data['saved'] = item['stats']['collectCount']
            data['comments'] = item['stats']['commentCount']
            data['liked'] = item['stats']['diggCount']
            data['plays'] = item['stats']['playCount']
            data['shares'] = item['stats']['shareCount']
            data['url'] = f"https://www.tiktok.com/@{data['username']}/video/{data['video_id']}"
            data['order'] = i
            print(data)
            # One table per hashtag; a row with an existing video_id is updated, not duplicated
            scraperwiki.sqlite.save(unique_keys=["video_id"], data=data, table_name=hashtag)

asyncio.run(get_hashtag("yes23"))
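If scraperwiki won’t install on your Python version, the save step can be approximated with the standard library’s sqlite3 module. This is a rough sketch rather than a drop-in replacement: the helper name `save_row` is my own invention, and unlike scraperwiki it won’t add new columns if a later row has keys the table doesn’t have yet.

```python
import sqlite3


def save_row(conn, table_name, unique_keys, data):
    """Create the table if needed, then insert or replace a single row."""
    columns = list(data)
    quoted_cols = ", ".join(f'"{c}"' for c in columns)
    quoted_keys = ", ".join(f'"{k}"' for k in unique_keys)
    # Untyped columns are fine in SQLite; the primary key gives us upserts
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table_name}" '
        f"({quoted_cols}, PRIMARY KEY ({quoted_keys}))"
    )
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f'INSERT OR REPLACE INTO "{table_name}" ({quoted_cols}) '
        f"VALUES ({placeholders})",
        [data[c] for c in columns],
    )
    conn.commit()


# In-memory database for the demo; pass a file path to keep the data
conn = sqlite3.connect(":memory:")
save_row(conn, "yes23", ["video_id"], {"video_id": "123", "plays": 100})
# Saving the same video_id again updates the row instead of duplicating it
save_row(conn, "yes23", ["video_id"], {"video_id": "123", "plays": 250})

rows = conn.execute('SELECT video_id, plays FROM "yes23"').fetchall()
print(rows)
```

Keying on `video_id` means the scraper can be re-run during a campaign and the stats for each video get refreshed in place.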

The results showed that videos from the No campaign’s official account, Fair Australia, dominated the most commonly used Voice to Parliament hashtags, including hashtags specific to the Yes campaign.

For the most widely used hashtag, #voicetoparliament, seven of the top 10 videos by total plays were from Fair Australia.
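A figure like this comes from a simple query over the scraped table. Here’s a sketch of the general approach, not the exact analysis code, using the column names the scraper saves. The rows and account names below are invented so the example runs on its own; with real data you would connect to the sqlite file the scraper wrote instead of an in-memory database.

```python
import sqlite3
from collections import Counter


def top_videos(conn, table_name, limit=10):
    """Return (username, plays, url) for the most-played videos in a hashtag table."""
    query = (
        f'SELECT username, plays, url FROM "{table_name}" '
        "ORDER BY plays DESC LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


def count_by_account(rows):
    """Count how many of the given videos each account posted, most first."""
    return Counter(username for username, _plays, _url in rows).most_common()


# Invented sample rows, not real TikTok data
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "voicetoparliament" (username, plays, url)')
conn.executemany(
    'INSERT INTO "voicetoparliament" VALUES (?, ?, ?)',
    [
        ("account_a", 900, "u1"),
        ("account_b", 500, "u2"),
        ("account_a", 700, "u3"),
        ("account_c", 100, "u4"),
    ],
)

top = top_videos(conn, "voicetoparliament", limit=3)
tally = count_by_account(top)
print(tally)
```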

You can read the full report on The Guardian here.

  1. While these tools are better than nothing, Google’s efforts at transparency in particular leave a lot to be desired. Why the world’s biggest search engine company cannot make its ad transparency content searchable is baffling, and I’ve spent a significant amount of time and energy making their ad library content searchable. ↩︎
  2. It might be my limited use, age, or the content available but personally I haven’t found TikTok’s recommended videos all that compelling. Especially compared with YouTube, which really seems to just get me – mixing up an hour-long video of a man silently building a hut in the Alaskan wilderness with 17th century cittern bangers ↩︎
  3. Yes, I’m still using the decade-old python library scraperwiki to save data into a sqlite database. Yes, it requires a fix to work with modern python versions, but the convenience of a one-liner that creates a sqlite db and saves your data is still very compelling, and I need to either find another library that does the same thing or write my own. If you know of something, then please email me!!! ↩︎