Capturing YouTube Video Views with Azure Data Factory

As a result of the pandemic, like many other community speakers, I’ve taken to YouTube as the current method for sharing knowledge and content. Recording videos isn’t as much fun as speaking at a physical event, but it partially ticks the box. Plus, my daughter thinks it’s “cool” that I have a YouTube channel, so totally worth the Dad points! Lol.

That said, my Community Talk Log and Dashboard, seen below, has become a little stale. I therefore decided to extend the data recorded to include this new avenue of online content.

Below are the high-level steps on how I did this, in case you’re interested in doing something similar.


  1. Get started with the YouTube Data API (https://developers.google.com/youtube/v3/getting-started). The API has many capabilities far exceeding my simple requirements, so try not to GET lost 😉
  2. Get an API key to authenticate with. This is done via the GCP portal here: https://console.developers.google.com/projectselector2/apis/dashboard where you also need to enable the key for use with the YouTube Data API. An OAuth token can also be used if you prefer.
  3. Get a YouTube video ID, which is simply the end part of a shareable video link.

  4. Perform a simple GET operation via Postman (seen below) to try out the request and view the results. The “part” parameter in the request can be a comma-separated list of resource parts to return all sorts of things. However, I just wanted the video stats. For all options check out the developer docs page here: https://developers.google.com/youtube/v3/docs/videos
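
If you would rather script the same request than click through Postman, a minimal Python sketch might look like the following. It assumes the standard `videos` endpoint with `part=statistics`; the function names and the sample response are made up for illustration, and the sample only shows the fields used here, not the full payload.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint for the YouTube Data API v3 "videos" resource.
API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_id, api_key, part="statistics"):
    """Build the GET URL; 'part' selects which sections of the resource come back."""
    query = urlencode({"part": part, "id": video_id, "key": api_key})
    return f"{API_URL}?{query}"


def extract_view_count(response_body):
    """Pull the view count out of the JSON response, or None if no video matched."""
    items = json.loads(response_body).get("items", [])
    if not items:
        return None
    # The API returns counts as strings, so convert before storing.
    return int(items[0]["statistics"]["viewCount"])


def fetch_view_count(video_id, api_key):
    """Call the API for one video (needs network access and a valid key)."""
    with urlopen(build_videos_url(video_id, api_key)) as response:
        return extract_view_count(response.read().decode("utf-8"))


# Hand-written sample in the shape of a real response (illustrative values only).
sample = '{"items": [{"id": "VIDEO_ID", "statistics": {"viewCount": "123"}}]}'
print(extract_view_count(sample))  # prints 123
```

The Data Factory Web activity does the same job in the pipeline; this is only a quick way to check the key and video ID outside of it.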

 

  5. Create a Data Factory pipeline to hit the YouTube API with the video ID passed in as a pipeline parameter and insert the views value into a database table for today’s date.

  6. Create a second parent Data Factory pipeline to iterate over all known YouTube video ID values returned from a database table of metadata. I also added a trigger to this pipeline so it runs daily. For new videos, a stored procedure creates the initial snapshot log entry with a view count of zero before inserting a snapshot record.

  7. Extend the parent pipeline to get the API key from Azure Key Vault using another Web activity. For details on how to do this check out the Microsoft docs page here: https://docs.microsoft.com/en-us/azure/data-factory/how-to-use-azure-key-vault-secrets-pipeline-activities
  8. Finally, view the results in a Power BI dashboard, currently showing the total number of views for all videos, by channel.
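
To make the snapshot logic from the two pipeline steps above a little more concrete (iterate over the known video IDs, seed new videos with a zero-view entry, then record the day's count), here is a rough Python sketch using an in-memory SQLite database as a stand-in for the real database and stored procedure. The table names, column names and function names are hypothetical and not taken from the repository, and dating the zero-view baseline row the same day as the first snapshot is my assumption.

```python
import sqlite3
from datetime import date


def create_schema(conn):
    """Hypothetical tables: a metadata table of videos and a snapshot log."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS Video (video_id TEXT PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS ViewSnapshot (
            snapshot_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            video_id      TEXT NOT NULL,
            snapshot_date TEXT NOT NULL,
            views         INTEGER NOT NULL
        );
    """)


def record_snapshot(conn, video_id, views, snapshot_date):
    """Stand-in for the stored procedure: seed new videos with zero, then log."""
    day = snapshot_date.isoformat()
    insert = "INSERT INTO ViewSnapshot (video_id, snapshot_date, views) VALUES (?, ?, ?)"
    (existing,) = conn.execute(
        "SELECT COUNT(*) FROM ViewSnapshot WHERE video_id = ?", (video_id,)
    ).fetchone()
    if existing == 0:
        # Assumption: the baseline row shares the date of the first snapshot.
        conn.execute(insert, (video_id, day, 0))
    conn.execute(insert, (video_id, day, views))


def run_daily(conn, get_views, snapshot_date):
    """Stand-in for the parent pipeline: loop over every known video ID."""
    video_ids = [row[0] for row in conn.execute("SELECT video_id FROM Video ORDER BY video_id")]
    for video_id in video_ids:
        record_snapshot(conn, video_id, get_views(video_id), snapshot_date)
    conn.commit()


# Usage with made-up view counts instead of real API calls.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany("INSERT INTO Video (video_id) VALUES (?)", [("vidA",), ("vidB",)])
fake_counts = {"vidA": 10, "vidB": 25}
run_daily(conn, fake_counts.get, date(2020, 6, 1))
fake_counts["vidA"] = 14
run_daily(conn, fake_counts.get, date(2020, 6, 2))
for row in conn.execute(
    "SELECT video_id, snapshot_date, views FROM ViewSnapshot ORDER BY snapshot_id"
):
    print(row)
```

The zero-view baseline only appears on a video's first run, so day-on-day growth can be worked out from the first real snapshot onwards.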


Of course, currently this is fairly boring as I’ve only just created the process, so there isn’t much data to work with. I’ll revisit this in maybe a month’s time to view the growth and break things down to work out which of my videos are the most popular by day etc. Then I’ll build out the Power BI dashboard with some more interesting visuals.


The code is in the Community Speaking Log GitHub repository:
https://github.com/mrpaulandrew/CommunitySpeakingLog


Many thanks for reading.

 
