Welcome to my new series, where I will explore common data tasks for marketers, analysts, and data scientists.
Project Overview
For our first project, we’ll be embarking on a four-step project using SQL, Python, and Tableau to primarily pull status code data for landing pages. The steps will include:
- querying BigQuery for Google Analytics 4 (GA-4) landing page data
- using the Requests library in Python to return status codes said pages
- marrying the status code and other marketing data in BigQuery
- building a dashboard with this data in Tableau
Video
Breakdown of Video
Below is a list of the minute marks of the main topics I cover in the video:
- 0:00:05 – Overview of the project
- 0:01:38 – BigQuery vocab review (e.g., project name, id, dataset, table, view, partitioning)
- 0:05:00 – Introduction to the GA-4 dataset schema
- 0:11:09 – Differences between GA-3 and GA-4 data (specific to querying it)
- 0:19:27 – Why querying GA-4 data is so much more complicated
- 0:22:23 – The CTE part of the query explained in plain English (using a kitchen prep analogy)
- 0:26:02 – The main query explained
- 0:26:45 – Explaining the event_params field + unnesting in the most Gen X way possible
- 0:36:24 – Demo of unnesting using GA-3 (because it’s easier for neophytes)
- 0:43:19 – Pulling it all together in the final query
Tip: I also include these timestamps in the video description. If you click on a minute mark, it will advance you to that part of the video.
Query
The query I used to pull landing pages:
-- This CTE flattens the event_params field so that I can select the page_location, entrances, and ga_session_id keys
WITH events AS (
SELECT
event_date,
(SELECT ep.value.string_value
FROM UNNEST(event_params) AS ep
WHERE ep.key = 'page_location') AS page_location,
(SELECT ep.value.int_value
FROM UNNEST(event_params) AS ep
WHERE ep.key = 'entrances') AS entrances,
(SELECT ep.value.int_value
FROM UNNEST(event_params) AS ep
WHERE ep.key = 'ga_session_id') AS ga_session_id
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_2021*`
)
SELECT
e.page_location AS landing_page,
COUNT(DISTINCT e.ga_session_id) AS sessions
FROM
events AS e
WHERE
e.entrances = 1
GROUP BY
e.page_location
ORDER BY
sessions DESC
Links
You can view the query here.
You can access the helper files here.
You can view Google’s paltry few recipes here.
Leave a Reply