[GA4] BigQuery Export: How to Extract GA4 User Properties and Metrics

GA4 Google BigQuery
Big Query Extension

In today's fast-paced digital landscape, where understanding user behavior can make or break a campaign, leveraging the right tools is not just advantageous – it's essential. Google Analytics 4 (GA4) combined with BigQuery offers a powerhouse solution for marketers, enabling deep dives into user data, uncovering trends, and driving smarter strategies.

Whether you're looking to refine your audience targeting, optimize conversions, or simply stay ahead of the competition, this guide will walk you through the process of exporting GA4 user properties and metrics to BigQuery, understanding their significance, and applying advanced querying techniques to maximize the value of your data.

This article is Part 5 in our series, "[GA4] BigQuery Export."

If you're following along, check out the previous articles in the series:

Benefits of Exporting GA4 Data to BigQuery

Here are the key benefits of exporting GA4 data to BigQuery:

  • Merge GA4 data with information from Google Ads, CRM systems, and external datasets to comprehensively view your business performance.
  • BigQuery’s BI Engine caches frequently used data, speeding up SQL queries and improving performance in BI tools like Google Data Studio.
  • Set up automated queries and reports in BigQuery to eliminate manual data extraction, saving time and resources.
  • Use BigQuery for complex analyses, predictive modeling, and identifying trends that might be hidden in standard GA4 reports.
  • Access raw, unsampled GA4 data in BigQuery, bypassing limitations like sampling, cardinality issues, and thresholding.

Pro tip: Set up the GA4 to BigQuery export link as soon as possible to begin accumulating valuable historical data immediately, as the export is not retroactive.

Uncover in-depth insights

How to Set up [GA4] to BigQuery Export

Download now

Bonus for readers

How to Set up [GA4] to BigQuery Export

Understanding GA4 User Properties and Metrics

GA4 user properties and metrics are essential components that help understand user behavior and interactions on your site. These elements provide granular insights that can be used to tailor marketing strategies effectively.

User Properties

User properties in GA4 are attributes associated with individual users, such as age, gender, location, or custom properties that you define based on specific needs. These properties help you understand who your users are and how they interact with your site or app, providing valuable insights for personalizing user experiences and tailoring marketing strategies.

In BigQuery, user-scope parameters are typically located in the events_YYYYMMDD tables, nested within the event data. These parameters provide a consistent reference for analyzing user behavior across multiple sessions and events, making them vital for longitudinal studies and targeted marketing efforts.

However, it's important to note that not all user-scope parameters can be exported directly from GA4 to BigQuery. Certain sensitive or personally identifiable information (PII), such as user email addresses or phone numbers, cannot be collected or exported in GA4 due to strict privacy regulations and data protection policies.

GA4 is designed to prevent the collection of such data, ensuring it remains absent both in the platform and any exports, including those to BigQuery.

Instead, GA4 focuses on exporting anonymized or pseudonymous data, which still allows for effective analysis without compromising user privacy. While you have access to a wealth of user-level data, some specific details will need to be calculated or inferred through other means.

You can often calculate those user-scope parameters that aren't directly exported using SQL queries within BigQuery.

For example:

  • While GA4 might not export a direct count of sessions per user, you can calculate this by aggregating session IDs within the events_YYYYMMDD table.
  • Similarly, you can create custom metrics and dimensions by combining and transforming existing data fields.

This flexibility allows you to tailor your analysis to your specific needs, ensuring you can extract maximum value from your GA4 data.

Dimensions and Metrics

In GA4, dimensions and metrics work together to provide a complete picture of user interactions on your site or app. Dimensions, such as 'Page Title' or 'User Location,' describe the attributes of your data, while metrics like 'Page Views' or 'Event Count' measure the quantity of these interactions.

Alternatively, metrics like 'Page Views' or 'Session Duration' measure both the quantity and duration of user interactions.

By analyzing these together, you can gain deep insights into how users engage with your content, helping you to make informed decisions and optimize your marketing strategies.

Please check out this video to learn more about effectively using dimensions and metrics in your analysis.

How to Set Up GA4 User Properties for Data Export

Before extracting user properties and metrics from GA4, it’s important to ensure that they are set up correctly. Without properly configuring GA4 to capture the right user properties, your export data may be incomplete or inaccurate.

If you need a step-by-step guide on how to configure and export GA4 user properties to BigQuery, we’ve got you covered. Check out our comprehensive guide: How to Set Up and Export GA4 User Properties to BigQuery for detailed instructions.

This article explains why user properties are critical, how to configure them in GA4, and how to export them to BigQuery for advanced analysis. Once you’ve set everything up, you’ll be ready to extract and analyze these properties with advanced queries in BigQuery.

Advanced Querying of [GA4] BigQuery Export Data

Mastering advanced querying techniques is essential to unlocking the full potential of your GA4 data in BigQuery. GA4 data is often nested and complex, requiring a solid understanding of how to work with nested fields and user-scope parameters.

Description of the Nested Field Schema for User-Scope Parameters

GA4 exports data to BigQuery with a nested field schema, particularly for user-scope parameters. These parameters, which provide valuable insights at the user level, are typically nested within event data, making it necessary to use functions like UNNEST to access them. Understanding this schema is crucial for accurate data extraction and analysis, enabling you to maximize your user-level data.

User-Scope Parameters from the Table event_YYYYMMDD

In the event_YYYYMMDD tables, user-scope parameters are nested within the event data. These parameters track user-specific attributes across various sessions and events, providing a consistent view of user behavior over time. Querying these parameters allows you to analyze user interactions in detail and draw meaningful conclusions about user engagement.

User-Scope Parameters from the Table pseudonymous_users_YYYYMMDD

The pseudonymous_users_YYYYMMDD table contains user-scope parameters associated with anonymized or pseudonymous user identifiers. These parameters help you track user behavior without exposing personally identifiable information (PII), ensuring data privacy while allowing for comprehensive user activity analysis across sessions.

Understanding How to Work with Nested Structures in GA4 BigQuery Export

When working with GA4 data in BigQuery, you’ll often encounter nested structures like arrays and structs, which organize data into complex, multi-level formats.

Arrays hold multiple values within a single field, such as a list of event parameters for an event, while structs group related fields into a single, composite data type. To analyze these nested structures effectively, you must know how to use BigQuery functions like UNNEST and STRUCT.

For example, let's say you want to extract the value of a specific event parameter, such as page_location, from the event_params array within the event_YYYYMMDD table.

(And by the way, you can run this query yourself - it takes the public data available for everybody to play with)

Here’s how you might write the query:

SELECT
user_pseudo_id,
event_name,
param.value.string_value AS page_location
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`, UNNEST(event_params) AS param
WHERE
event_name = 'page_view'
AND param.key = 'page_location';

Here:

  • user_pseudo_id: This field represents the unique identifier for each user, allowing you to track user behavior across different events and sessions.
  • event_name: This field captures the name of the event (e.g., 'page_view'). It is used in the query to filter and focus on specific events of interest.
  • param.value.string_value AS page_location: This part of the query extracts the value associated with the page_location key from the event_params array.
  • FROM Clause: Specifies the GA4 event table being queried. The format event_YYYYMMDD indicates a daily partitioned table, where YYYYMMDD represents the specific date.
  • UNNEST(event_params) AS param: The UNNEST function is used to flatten the event_params array, turning each element of the array into a separate row.
  • WHERE Clause: The WHERE clause filters the data to include only rows where event_name is 'page_view'. Additionally, it filters the unnested parameters to include only those with the key 'page_location'.

This query demonstrates how to navigate and extract data from nested arrays and structs in GA4 using BigQuery. By mastering these techniques, you can perform detailed analyses and gain deeper insights into your data.

💡To dive deeper into how to work with arrays in BigQuery, including using functions like UNNEST, be sure to read our detailed article on BigQuery Array Functions. It’s a must-read for mastering advanced data analysis techniques!

Dive deeper with this read

Understanding BigQuery Array Functions

Image for article: Understanding BigQuery Array Functions

Extracting User Dimensions & Metrics

Extracting user dimensions and metrics from GA4 data in BigQuery is essential for in-depth analysis and reporting. User dimensions describe various attributes of your users, such as location or device type, while metrics quantify user interactions, like session counts or conversions.

In GA4, these data points are stored in the events_YYYYMMDD tables, which hold daily event data with nested user properties. By querying these tables effectively, you can unlock detailed insights into user behavior, segment audiences more precisely, and enhance your marketing strategies.

Let's explore how to extract both default and calculated user dimensions and metrics using specific SQL queries in BigQuery.

Default User Dimensions

GA4 provides several default user dimensions that help describe your users' attributes, such as their ID, activity status, and properties associated with them.

These dimensions are crucial for understanding user behavior and can be extracted from the events_YYYYMMDD tables in BigQuery. Below, each dimension is presented with a sample query to help you understand how to retrieve it.

user_id

This dimension represents the user ID set via the setUserId API. It's a unique identifier for users who have logged in or been assigned a specific ID.

SELECT
user_id
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY user_id

user_pseudo_id

This dimension is the pseudonymous ID (e.g., app instance ID) assigned to users. It's used to track users anonymously across sessions.

SELECT
user_pseudo_id
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY
user_pseudo_id

is_active_user

This dimension indicates whether the user was active (true) or inactive (false) at any point during the calendar day. This is available from 2023-07-17 onward.

SELECT
is_active_user
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce..events_*`
GROUP BY is_active_user

user_first_touch_timestamp

This dimension records the time (in microseconds) when the user first opened the app or website. It helps to understand when the user first interacted with your site or app.

SELECT
timestamp_micros(user_first_touch_timestamp) AS user_first_touch_timestamp
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY user_first_touch_timestamp

user_properties.key

This dimension represents the name of the user property. Replace <insert key> with the specific property key you are interested in, or delete the WHERE clause to select all properties.

SELECT
(SELECT key FROM UNNEST(user_properties) WHERE key = '') AS user_properties_key
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_properties.value.string_value

This dimension extracts the string value of a specific user property. Replace <insert key> with the specific property key, or delete the WHERE clause to select all.

SELECT
(SELECT value.string_value FROM UNNEST(user_properties) WHERE key = '') AS user_string_value
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_properties.value.set_timestamp_micros

This dimension indicates the time (in microseconds) when the user property was last set. Replace <insert key> with the specific property key, or delete the WHERE clause to select all.

SELECT


timestamp_micros((SELECT value.set_timestamp_micros FROM UNNEST(user_properties) WHERE key = '')) AS user_set_timestamp_micros
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_ltv.currency

This dimension represents the currency of the user's lifetime value, helping to understand the financial contribution of users in different currencies.

SELECT
user_ltv.currency AS user_ltv_currency
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY user_ltv_currency

You can create more targeted analyses and improve your overall data-driven strategies by leveraging these dimensions, such as user IDs, activity status, and custom user properties.

Default User Metrics

Default user metrics in GA4 are vital for quantifying user interactions and behavior on your website or app. These metrics help you measure and analyze aspects such as user engagement, lifetime value, and custom properties that provide valuable insights into the effectiveness of your marketing strategies.

Extracting these metrics from the events_YYYYMMDD tables in BigQuery allows you to perform detailed analysis and reporting.

Below are some key default user metrics and example queries to help you retrieve and analyze them in BigQuery.

user_properties.value.int_value

This metric captures the integer value of a specific user property. Replace <insert key> with the specific property key you wish to analyze, or remove the WHERE clause to select all.

SELECT
  
    (SELECT value.int_value FROM UNNEST(user_properties) WHERE key = '') AS user_int_value
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_properties.value.float_value

This metric represents the float value of a user property. Replace <insert key> with the desired property key, or delete the WHERE clause to access all float values.

SELECT

    (SELECT value.float_value FROM UNNEST(user_properties) WHERE key = '') AS user_float_value
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_properties.value.double_value

This metric captures the double value of a user property, which is useful for more precise measurements. Replace <insert key> with the specific property key, or remove the WHERE clause to retrieve all double values.

SELECT
    (SELECT value.double_value FROM UNNEST(user_properties) WHERE key = '') AS user_double_value
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

user_ltv.revenue

This metric measures the lifetime value (revenue) generated by the user. It's essential for understanding users' financial contributions over time.

SELECT
  
    user_ltv.revenue AS user_ltv_revenue
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

These queries provide a foundation for extracting and analyzing key user metrics from GA4 data in BigQuery. By tailoring these queries to your specific needs, you can gain deeper insights into user behavior, helping you make data-driven decisions that improve your marketing efforts and overall business performance.

Calculated User Dimensions

Calculated user dimensions in GA4 allow you to create custom dimensions by combining or transforming existing data. These dimensions help you gain deeper insights into user behavior by categorizing users in ways that standard dimensions may not cover.

For instance, you might want to distinguish between new and returning visitors or calculate the total number of sessions for each user. These calculated dimensions are essential for creating more tailored reports and understanding user patterns at a granular level.

Here are some examples of calculated user dimensions, along with the queries used to generate them:

User Type

The user type dimension categorizes users as either new or returning visitors. This is based on the ga_session_number parameter, which tracks the session count for each user.

SELECT
 CASE
WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE event_name = 'session_start' AND key = 'ga_session_number') = 1 THEN 'new visitor'
WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE event_name = 'session_start' AND key = 'ga_session_number') > 1 THEN 'returning visitor'
ELSE NULL END AS user_type,
COUNT(DISTINCT user_pseudo_id) AS users
FROM
-- change this to your Google Analytics 4 export location in BigQuery
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
-- define static and/or dynamic start and end date
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY
user_type
HAVING
user_type IS NOT NULL

Count of Sessions

The count of sessions dimension calculates the total number of sessions each user has had. This helps in understanding how often users return to your site or app.

WITH prep AS (
SELECT
user_pseudo_id,
MAX((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number')) OVER (PARTITION BY user_pseudo_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_session_number
FROM
-- change this to your Google Analytics 4 export location in BigQuery
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
-- define static and/or dynamic start and end date
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
)

SELECT
-- count of sessions (dimension | the session index for a user, each session from a unique user will get its own incremental index starting from 1 for the first session)
max_session_number AS count_of_sessions,
COUNT(DISTINCT user_pseudo_id) AS users
FROM
prep
GROUP BY
max_session_number
ORDER BY
max_session_number

Calculated user dimensions help tailor your analysis to understand your audience better and optimize your marketing strategies, ensuring you make data-driven decisions that align with your business goals.

Calculated User Metrics

Calculated user metrics in GA4 allow you to measure user behavior and engagement more precisely by applying custom logic to your data. These metrics are essential for understanding how different user segments interact with your site or app, and they provide insights that go beyond standard metrics.

Below are some key-calculated user metrics, along with queries and explanations.

Total Users

This metric counts the total number of unique users based on the user_pseudo_id.

SELECT
   COUNT(DISTINCT user_pseudo_id) AS users
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
-- define static and/or dynamic start and end date
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Active Users

This metric identifies active users based on engagement criteria, such as time spent on the site or specific interactions.

SELECT
COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0
OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1'
THEN user_pseudo_id ELSE NULL END) AS active_users
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

New Users (Based on Total Users)

This metric counts the number of users who interacted with your site or app for the first time.

SELECT


COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1
THEN user_pseudo_id ELSE NULL END) AS new_users
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

% New Users (Based on Total Users)

This metric calculates the percentage of users who are new, compared to the total user base.

SELECT

COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1
THEN user_pseudo_id ELSE NULL END) / COUNT(DISTINCT user_pseudo_id) AS percentage_new_users
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

New Sessions

This metric counts the number of sessions started by new users, providing insights into how effectively your site or app attracts first-time visitors.

SELECT
    COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1
THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) ELSE NULL END) AS new_sessions
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

% New Sessions

This metric calculates the percentage of sessions that were initiated by new users.

SELECT

COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1
THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) ELSE NULL END)
/ COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) AS percentage_new_sessions
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Number of Sessions per User (Based on Total Users)

This metric calculates the average number of sessions per user, offering a clear view of user engagement over time.

SELECT
   
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
/ COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Event Count per User (Based on Total Users)

This metric counts how many times a specific event, like page_view, was triggered per user.

SELECT

COUNTIF(event_name = 'page_view') / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Engaged Sessions per User (Based on Total Users)

This metric calculates the average number of engaged sessions per user.

SELECT
    COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1'
THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END)
/ COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Calculated user metrics in GA4 allow you to analyze user behavior and engagement more meaningfully. Whether you're measuring total users, active users, new users, or the number of sessions per user, these metrics provide deeper insights into how users interact with your site or app.

Extracting Session Metrics

Session metrics in GA4 provide crucial insights into how users interact with your site or app during a specific visit or session. These metrics help you understand user engagement, session frequency, and overall behavior patterns.

By extracting and analyzing session metrics from GA4 data in BigQuery, you can gain a deeper understanding of the effectiveness of your content, the user experience, and the impact of your marketing efforts.

Calculated Session Metrics

Session metrics in GA4 provide a comprehensive view of how users interact with your website or app during individual sessions. These metrics, including session counts, engagement rates, bounce rates, and session duration, are crucial for understanding user behavior and evaluating the effectiveness of your content and marketing strategies.

Below, you'll find examples of both straightforward and calculated session metrics and the corresponding queries to extract them.

Sessions

This metric counts the total number of unique sessions users initiate, providing a baseline understanding of overall user engagement. Tracking sessions help assess how frequently users interact with your site or app.

SELECT
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) AS sessions
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Engaged Sessions

This metric counts the number of "engaged" sessions based on user interaction or engagement time. Engaged sessions indicate the quality of user interactions during visits.

SELECT
   COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1'
THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END) AS engaged_sessions
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Engagement Rate

This metric calculates the percentage of sessions that were engaged compared to all sessions, offering insight into how effective your site or app is at keeping users engaged.

SELECT
    SAFE_DIVIDE(
COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1'
THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
) AS engagement_rate
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Engagement Time

This metric calculates the average length of time in seconds that the app was in the foreground or the website had focus during a session.

WITH prep AS (
SELECT
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
    MAX((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged')) AS session_engaged,
    SUM((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec'))/1000 AS engagement_time_seconds
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

GROUP BY
    user_pseudo_id, session_id
)

SELECT
    SAFE_DIVIDE(SUM(engagement_time_seconds), COUNT(DISTINCT CASE WHEN session_engaged = '1' THEN CONCAT(user_pseudo_id, session_id) END)) AS engagement_time
FROM
    prep

Bounces

This metric counts the number of sessions that resulted in a bounce, meaning the session ended without significant engagement. Monitoring bounces helps identify areas where user experience might need improvement.

SELECT
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) 
        - COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
        THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END) AS bounces
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Bounce Rate

This metric calculates the percentage of sessions that resulted in a bounce, providing insight into how well your site or app retains user interest. A high bounce rate may indicate issues with content relevance or user experience.

SELECT
    SAFE_DIVIDE(
        COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) 
            - COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
            THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
        COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
    ) AS bounce_rate
FROM
   `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Event Count per Session

This metric calculates the number of times an individual event (e.g., 'page_view') was triggered per session, offering a measure of user activity during each visit.

SELECT
    SAFE_DIVIDE(
        COUNT(DISTINCT CASE WHEN event_name = 'page_view' 
            THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
        COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
    ) AS event_count_per_session
FROM
   `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))

Average Session Duration (in seconds)

This metric calculates the average duration of users' sessions in seconds, reflecting how long users stay engaged during a visit. Longer session durations often suggest better content or user experience.

WITH prep AS (
SELECT
    CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) AS session_id,
    (MAX(event_timestamp) - MIN(event_timestamp))/1000000 AS session_length_in_seconds
FROM  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY
    session_id
)

SELECT
    SUM(session_length_in_seconds) / COUNT(DISTINCT session_id) AS average_session_duration_seconds
FROM
    prep

Views per Session

This metric calculates the number of web pages viewed per session, indicating user engagement and interest. Higher views per session generally indicate that users find the content engaging and relevant.

WITH prep AS (
SELECT
    CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) AS session_id,
    COUNTIF(event_name = 'page_view') AS views
FROM
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_table_suffix BETWEEN '20210131'
AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY
    session_id
)

SELECT
    SUM(views) / COUNT(DISTINCT session_id) AS views_per_session
FROM
    prep

These session metrics provide a detailed view of how users interact with your site or app during their sessions. By leveraging these queries, you can gain valuable insights into user engagement, session quality, and overall user behavior.

Extracting Data from the User Properties in BigQuery

When working with GA4 data in BigQuery, extracting user properties is essential for gaining insights into user behavior and characteristics. In GA4, user properties are often stored in the pseudonymous_users_YYYYMMDD and users_YYYYMMDD tables.

  • pseudonymous_users_YYYYMMDD Table

The pseudonymous_users_YYYYMMDD table holds data associated with anonymized or pseudonymous user identifiers. By querying this table, you can analyze user engagement, segment users based on properties like device type or location, and ensure data privacy while gaining meaningful insights.

  • users_YYYYMMDD Table

The users_YYYYMMDD table contains user properties linked to identified users, such as those who have logged in or provided specific identifiers. Queries to this table enable you to track user journeys, analyze lifetime value, and create personalized experiences based on user-specific information.

💡For a deeper dive into understanding the GA4 BigQuery export, including event table schemas and handling dates, check out this article on our blog. It’s essential for mastering your GA4 data analysis! Read the article here.

Dive deeper with this read

[GA4] BigQuery Export: Events Table Schema and Managing Dates

Image for article: [GA4] BigQuery Export: Events Table Schema and Managing Dates

Count of Occurrences by Property with UNNEST Function

Counting the occurrences of each user property in GA4 data is essential for understanding the distribution and importance of various user attributes. Using the UNNEST function in BigQuery, you can break down nested fields, such as user properties, and count how often specific attributes—like device type or subscription status – appear across your user base.

Let's walk through an example of counting the occurrences of different user properties to gain insights into user preferences.

Example:

Suppose you manage an e-commerce platform and want to understand the distribution of user properties such as "preferred_device" and "subscription_status" among your customers.

SELECT
  up.key AS user_property,
  COUNT(*) AS occurrences
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`, UNNEST(user_properties) AS up
WHERE up.key IN ('preferred_device', 'subscription_status')
GROUP BY 1
ORDER BY 2 DESC

Here:

  • UNNEST Function: The UNNEST function is used to flatten the user_properties array, turning each element within the array into a separate row.
  • SELECT Clause: The query selects the key of each user property (e.g., "preferred_device", "subscription_status") and renames it as user_property.
  • FROM Clause: The query is run against your specific BigQuery table, denoted as ... The UNNEST function is applied directly within the FROM clause to expand the user_properties array.
  • GROUP BY and ORDER BY: The results are grouped by the user_property key, and the counts are ordered in descending order to show the most common properties first.

Using the UNNEST function in this query, you can efficiently count the occurrences of various user properties in your GA4 data.

Extract Specific Property Value

Extracting specific user property values in GA4 data allows you to focus on particular attributes most relevant to your analysis. By filtering the key field in your query, you can target and extract only the specific property you’re interested in, such as a user’s device type or the time of their first interaction with your app.

Example:

Imagine you manage a mobile app and want to analyze the "user_type" property to understand whether users logging into your app are new or returning customers.

To extract a specific property value, use a subquery within the main query to filter for the desired property. Here’s how you can extract the "user_type" property:

SELECT
  (SELECT value.string_value FROM UNNEST(user_properties) WHERE key = 'user_type') AS user_type
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`
WHERE event_name = 'login'

If the property you want to extract is stored as an integer, you would modify the query to pull from the value.int_value field. For example, to extract the "first_open_time" property:

SELECT
  (SELECT value.int_value FROM UNNEST(user_properties) WHERE key = 'first_open_time') AS first_open_time
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`

Here:

  • Subquery within SELECT Clause: The subquery is used to extract the string_value associated with the user_type property. This targets only the specific property of interest.
  • UNNEST Function: The UNNEST function is used to break down the user_properties array so that each property can be accessed and filtered individually.
  • WHERE Clause: The WHERE clause filters the events to focus on a specific event, such as 'login', which is relevant to the analysis.
  • String vs. Integer Values: Depending on the data type of the property, you might use value.string_value or value.int_value to extract the information.

Using targeted queries to extract specific user property values in GA4 data can provide precise insights into user behavior and characteristics.

Execute Queries

Executing queries in BigQuery allows you to extract, analyze, and visualize data from your GA4 to BigQuery data export. You can dive into specific user behaviors by crafting precise SQL queries like the above, segment data by various properties, and uncover valuable insights that drive business decisions.

Whether you're counting occurrences, extracting specific values, or analyzing session metrics, running queries effectively in BigQuery is essential for leveraging the full potential of your GA4 data.

pipeline

Explore BigQuery Data in Google Sheets

Bridge the gap between corporate BigQuery data and business decisions. Simplify reporting in Google Sheets without manual data blending and relying on digital analyst resources availability

Simplify Analytics Now

Use OWOX: Reports, Charts & Pivots Extension to Query GA4 Data from Google Sheets

The OWOX: Reports, Charts & Pivots Extension is a powerful tool that allows you to query GA4 data directly from Google Sheets seamlessly.

This integration simplifies analyzing your GA4 data, enabling you to perform complex queries without leaving the familiar Google Sheets environment.

💡For those looking to enhance their querying skills, don't miss our GA4 BigQuery Export SQL Queries Library, which includes ready-made SQL queries for analyzing your GA4 data more efficiently.

With just a few clicks, you can pull data from BigQuery into your sheets, making it easier to work with and share insights across your team.

Beyond querying data, the OWOX BI Reports Extension also helps you automatically build pivot tables and charts, transforming raw GA4 data into meaningful visualizations. This feature saves time and enhances your ability to interpret and act on the data.

Whether tracking key metrics or exploring user behavior trends, this tool provides a streamlined approach to making your GA4 data more accessible and actionable.

report-v2

Unlock BigQuery Insights in Google Sheets

Report on what matters to you. Integrate corporate BigQuery data into a familiar spreadsheet interface. Get insightful, up-to-date reports with just a few clicks

Unlock Data Insights Now

FAQ

Expand all Close all
  • Why should I export GA4 data to BigQuery?

    Exporting GA4 data to BigQuery allows you to unlock advanced analytical capabilities that go beyond what GA4's default interface offers. In BigQuery, you can perform complex queries, integrate GA4 data with other business data sources, and avoid limitations like data sampling and thresholding.

  • What are GA4 user properties, and why are they important?

    GA4 user properties are attributes associated with individual users, such as demographics, device information, or custom parameters that you define. These properties are crucial for understanding who your users are and how they interact with your website or app.

  • How do I set up GA4 to transmit user properties to Google Analytics?

    To set up GA4 to transmit user properties to Google Analytics, you typically use Google Tag Manager (GTM). First, create custom user properties in GTM and configure tags to capture these properties from your website or app. Then, these tags will be linked to GA4 by instructing Google Analytics to accept the transmitted user properties. Finally, test the setup to ensure the properties are being collected correctly in GA4, allowing you to analyze them effectively.

  • What steps are involved in setting up a GA4 export to BigQuery?

    To set up a GA4 export to BigQuery, start by linking your Google Analytics 4 property to BigQuery. This is done by navigating to the "Admin" section in GA4, selecting "BigQuery Linking" under the "Property" column, and clicking "Link". Next, choose the data streams you want to export and decide whether you want daily or streaming exports. You’ll then need to select or create a Google Cloud project where the data will be stored. After reviewing and confirming your settings, the export will be set up, and you can verify that the data is successfully appearing in your BigQuery project as configured.

  • What is the importance of understanding nested structures in BigQuery exports from GA4?

    To set up custom metrics in GA4, you first define the metric you want to track, either through your website's code or via Google Tag Manager (GTM). In the GA4 interface, navigate to the "Custom Definitions" section and create a new custom metric, specifying the parameters and calculation.

  • How do I set up custom metrics in GA4, and why are they necessary?

    To set up custom metrics in GA4, define the desired interactions using GTM or tracking code, then create a custom metric in GA4’s "Custom Definitions." Custom metrics are essential for tracking specific, business-relevant actions, enabling tailored insights.