[GA4] BigQuery Export SQL Querries Library

icon Get in-depth insights

How to Set up [GA4] to BigQuery Export

icon Get in-depth insights

How to Set up [GA4] to BigQuery Export

If you haven't set up your GA4 BigQuery Export yet, on the right side, you can find a link to our detailed guide on how you can set it all up in minutes. 

However, you don’t need to have your own GA4 export data available in BigQuery to practice querying GA4 data. Google has created the BigQuery Public Dataset for GA4 export data so you can play with it. 

This ga4_obfuscated_sample_ecommerce dataset contains a sample GA4 event export data from the Google Merchandise Store. The data covers three months from November 2020 to January 2021 and includes information on user behavior, product views, and transactions.

To access this dataset, you will need to have a Google Cloud project with the BigQuery API enabled.

How to query multiple GA4 'events_' tables

Before we start querying GA4 data in BigQuery let's figure out how to work with dates. Every day of Google Analytics 4 data in BigQuery is stored in a separate table.

If you only need data from one day (let's say December 31st, 2023) the FROM clause in your query will look something like this:

FROM
`project_id.dataset_id.events_20231231`

However, in most cases you will need to query a slightly longer period of time. You have one of 2 options here: WILDCARD or _TABLE_SUFFIX.

Define Dates using WILDCARD

When specifying the tables you want to retrieve the data FROM, you can use WILDCARD to select the date range.

FROM
`project_id.dataset_id.events_*` -- All tables in the dataset
FROM
`project_id.dataset_id.events_2021*` -- 2021 full year
FROM
`project_id.dataset_id.events_202101*` -- January 2021
FROM
`project_id.dataset_id.events_2021010*` -- First 9 days of January 2021

Define Dates using WHERE and Table Suffix

Right after specifying the tables you want to retrieve the data FROM, you can use the WHERE clause to specify the date range.

Static Date Range - Last 30 days

SELECT
*
FROM
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
_TABLE_SUFFIX BETWEEN `20210101` AND `20210131`

Dynamic Date Range - Last 30 days

WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 30 day))
AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Dynamic Date Range - Last 7 days

WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 7 day))
AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Static & Dynamic Date Range

You can also use the WHERE statement to mix both dynamic and static dates, for example you can query the data from a specific date till yesterday (In our example - From January 1st, 2021 till Yesterday)

WHERE
_TABLE_SUFFIX BETWEEN '20210101' AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

How to Query Events Details

Events Count

SELECT
  event_name,
  COUNT(*) AS event_count
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`
GROUP BY
  event_name
ORDER BY
  event_count DESC
LIMIT
  100

Traffic Report

SELECT
  Traffic_source.source,
  Traffic_source.medium,
  Traffic_source.name,
  COUNT(DISTINCT user_pseudo_id) AS Users
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY
  Traffic_source.source,
  Traffic_source.medium,
  Traffic_source.name
ORDER BY
  Users DESC
LIMIT
  100

Traffic Report by date

SELECT
  Event_date AS date,
  Traffic_source.source,
  Traffic_source.medium,
  Traffic_source.name,
  COUNT(DISTINCT user_pseudo_id) AS Users
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
GROUP BY
  date,
  Traffic_source.source,
  Traffic_source.medium,
  Traffic_source.name
ORDER BY
  date DESC,
  Users DESC
LIMIT
  1000

How to Query Specific Events

If you want to query just the specific events, like purchase or page_view, you can use the WHERE clause to filter just those event names you need.

SELECT
  *
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
  event_name = "page_view"
LIMIT
  100

Query BigQuery Data right from Google Sheets

While mastering SQL queries is a powerful skill, visualizing these results can often be just as challenging.

Look, BigQuery is a data warehouse, and it's not designed to visualize data: build pivots, charts, graphs or share data with stakeholders.

That's why we've created a solution to simplify this step.

Report

Simplify BigQuery Reporting in Sheets

Easily analyze corporate data directly into Google Sheets. Query, run, and automatically update reports aligned with your business needs

Simplify Reporting Now

It's as simple as that:

1. Install this extension;

2. Copy-paste any query from this guide;

3. Run it directly from Google Sheets;

4. Set automated updates to get a fresh report.

For most of the advanced queries below, you won't even have to edit the query to enter your project_id, dataset_id, or write any query to select dates. Simply add a new query in extension, run and you'll see a little window on the right side to enter your variables. 

Without ever touching the SQL code, making it exceptionally user-friendly for anyone, especially those not familiar with SQL or just getting started.

Install Extension

👉 By using this BigQuery Reports Extension for Google Sheets, you can instantly create dynamic and interactive reports, visualize BigQuery data, set automated updates, and share actionable data insights with stakeholders—all within the familiar interface of Google Sheets.

Note: for some of the queries, you'll need to manually change the project_id and the dataset_id to the one with your data (however, you can still run the query on public data to see how it goes).

How to Query User Metrics & Dimensions

Now it's time to move forward and talk about how you can query default user metrics and dimensions from the [GA4] BigQuery export, like users, active users, session id, user pseudo id, and user properties key and values.

Some of Google Analytics 4 dimensions and metrics can be retrieved straight from the tables without performing any calculations or unnests for specific event parameters.  Other metrics and dimension that you might need are a bit harder to retrieve.

Please, keep in mind that GA4 has multiple concepts of users:

  • total users - based on unique user_pseudo_id’s (ex. clientId)
  • active users - based on unique user_pseudo_id’s and engagement 
  • users based on user_id

For every query that involves users, you have to first figure out which concept of user calculations is applicable to your specific calculations. 

You will have to calculate them, or extract, which can require some more complex SQL statements that we specified below:

How to Query User details

SELECT
  -- user_id (use if you set this up in GA)
  user_id,
  -- user_pseudo_id ( formerly client_id - the pseudonymous id for the user)
  user_pseudo_id,
  -- user_first_touch_timestamp (the time in microseconds at which the user first opened your website)
  TIMESTAMP_MICROS(user_first_touch_timestamp) AS user_first_touch_timestamp,
  -- user_ltv.revenue (the lifetime value of the user)
  user_ltv.revenue AS user_ltv_revenue,
  -- user_ltv.currency (the currency of the user lifetime value)
  user_ltv.currency AS user_ltv_currency
FROM
  --select your project, dataset and the table
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
  -- define the wildcard or static and/or dynamic start and end date
  _TABLE_SUFFIX BETWEEN '20210101'
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)) AND
  user_ltv.revenue > 0
LIMIT
  100

How to Calculate New / Returning Visitors

SELECT
  -- user type (new visitor or returning visitor)
  CASE
    WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE event_name = 'session_start' AND KEY = 'ga_session_number') = 1 THEN 'new visitor'
    WHEN (
  SELECT
    value.int_value
  FROM
    UNNEST(event_params)
  WHERE
    event_name = 'session_start'
    AND KEY = 'ga_session_number') > 1 THEN 'returning visitor'
  ELSE
  NULL
END
  AS user_type,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM
  --select your project, dataset and the table
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
  -- define the wildcard or static and/or dynamic start and end date
  _TABLE_SUFFIX BETWEEN '20210101'
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
  user_type
HAVING
  user_type IS NOT NULL
LIMIT
  100

How to Query User Types Grouped by Date

SELECT
--query date
 event_date,
--count new visitors
 SUM(CASE WHEN user_type = 'new visitor' THEN 1 ELSE 0 END) AS new_visitors,
--count returning visitors
 SUM(CASE WHEN user_type = 'returning visitor' THEN 1 ELSE 0 END) AS returning_visitors
FROM (
 SELECT
   event_date,
   user_pseudo_id,
   CASE
     WHEN MAX(CASE WHEN event_name = 'session_start' AND params.key = 'ga_session_number' THEN params.value.int_value END) = 1 THEN 'new visitor'
     WHEN MAX(CASE WHEN event_name = 'session_start' AND params.key = 'ga_session_number' THEN params.value.int_value END) > 1 THEN 'returning visitor'
   END AS user_type
 FROM
--replace with your project_id.dataset_id.events_* or run as it is in OWOX BI BigQuery Reports Extension and change later
   `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,
   UNNEST(event_params) AS params
 GROUP BY
   event_date,
   user_pseudo_id
)
WHERE
 user_type IS NOT NULL
GROUP BY
 event_date
ORDER BY
 event_date

How to Count # of Users by # of Sessions

WITH
  prep AS (
  SELECT
    user_pseudo_id,
    MAX((
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_number')) OVER (PARTITION BY user_pseudo_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_session_number
  FROM
    --select your project, dataset and the table, as well as the date range
    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
  WHERE
    -- define static and/or dynamic start and end date
    _table_suffix BETWEEN '20210101'
    AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)))
SELECT
  -- count of sessions (Each sessions gets it's number, where the first session is specified as 1)
  max_session_number AS count_of_sessions,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM
  prep
GROUP BY
  max_session_number
ORDER BY
  Max_session_number DESC
LIMIT
  100

Need Helping Hand Writing Complex SQL Queries?

Diving into advanced SQL can be daunting. But if you're looking to write complex queries without spending hours troubleshooting or refining your scripts... we've created a ChatGPT App - SQL Copilot for BigQuery.

Just check out the video below and use ChatGPT to write the queries that you need:

This gpt app integrates directly with your BigQuery, allowing you to use natural language to generate and optimize SQL queries.

You can share your table schema, add more context, and get the query that you need in seconds. 
You can even perform complex JOINs to query data from different datasets or tables into the same report.

👉 The SQL Copilot for BigQuery by OWOX not only assists in creating accurate SQL queries but also teaches you how to handle complex data effectively. 

It's like having an SQL expert sitting right beside you, ready to guide you through the intricacies of data querying.

How to Query Total users / New users / Active Users

SELECT
  -- users (the total number of users)
  COUNT(DISTINCT user_pseudo_id) AS users,
  -- active users (the number of engaged users)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'engagement_time_msec') > 0 OR ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN user_pseudo_id
    ELSE
    NULL
  END
    ) AS active_users,
  -- % active users (the percentage of engaged users from the total users)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'engagement_time_msec') > 0 OR ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN user_pseudo_id
    ELSE
    NULL
  END
    ) / COUNT(DISTINCT user_pseudo_id) AS percent_active_users,
  -- new users (the number of users who interacted with your site for the first time)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN user_pseudo_id
    ELSE
    NULL
  END
    ) AS new_users,
  -- % new users (the percentage of new users from the total users)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN user_pseudo_id
    ELSE
    NULL
  END
    ) / COUNT(DISTINCT user_pseudo_id) AS percent_new_users
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
  -- define static and/or dynamic start and end date
  _table_suffix BETWEEN '20210101'
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
LIMIT
  100

How to Query New Sessions / Engaged Sessions / Events per Users

SELECT
  -- sessions (the total number of sessions)
  COUNT(DISTINCT CONCAT(user_pseudo_id,(
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_id'))) AS sessions,
  -- new sessions (the number of sessions by users who had never visited your site before)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
    ELSE
    NULL
  END
    ) AS new_sessions,
  -- % new sessions (the percentage of sessions by users who had never visited your site before)
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
    ELSE
    NULL
  END
    ) / COUNT(DISTINCT CONCAT(user_pseudo_id,(
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_id'))) AS percent_new_sessions,
  -- number of sessions per user (the total number of sessions divided by the total number of users)
  COUNT(DISTINCT CONCAT(user_pseudo_id,(
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_id'))) / COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user,
  -- event count per user (the number of times all of the events were triggered divided by amount of users)
  COUNT(*) / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user,
  -- engaged sessions per user
  COUNT(DISTINCT
    CASE
      WHEN ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
  END
    ) / COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
FROM
  --select your project, dataset and the table, as well as the date range
  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE
  -- define static and/or dynamic start and end date
  _table_suffix BETWEEN '20210101'
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
LIMIT
  100

How to Query Engagement Details & Bounce Rate

select
 --query date
 event_date as date,
 --query sessions count
 count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) as sessions,
 --query engaged sessions count
 count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end) as engaged_sessions,
 --query engagement rate
 safe_divide(count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as engagement_rate,
 --query bounced sessions count
 count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) - count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end) as bounces,
 --query bounce rate
 safe_divide(count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) - count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as bounce_rate,
 -- query event count per session
 safe_divide(count(distinct case when event_name = 'page_view' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as event_count_per_session
 from
   --replace with your `project_id.dataset_id.events_*` or run as it is in OWOX BI BigQuery Reports Extension and change later
  `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
where
 -- define start and end date
 _TABLE_SUFFIX BETWEEN
 --replace with start date eg '20240301'
 '{start_date default="20210101" type="input"}' AND
 --replace with end date eg '202403131'
 '{end_date default="20210110" type="input"}'
   GROUP BY date
   ORDER BY date DESC
LIMIT 100

How to Query User Count by Source / Medium and Date

SELECT
 -- date (dimension)
 event_date,
 -- source (dimension | extracted from event_params)
 (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
 -- medium (dimension | extracted from event_params)
 (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
 -- users (metric | the total number of users)
 COUNT(DISTINCT user_pseudo_id) AS users,
 -- active users (metric | users with engagement time or session engaged)
 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN user_pseudo_id ELSE NULL END) AS active_users,
 -- new users (metric | users with ga_session_number = 1)
 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) AS new_users,
 -- % new users (metric | percentage of new users out of total users)
 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS percentage_new_users,
 -- new sessions (metric | count of new sessions by new users)
 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) ELSE NULL END) AS new_sessions,
 -- % new sessions (metric | percentage of new sessions out of total sessions)
 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) ELSE NULL END) * 1.0 / COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) AS percentage_new_sessions,
 -- sessions per user (metric | average number of sessions per user)
 COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user,
 -- event count per user (metric | average number of page_view events per user)
 COUNTIF(event_name = 'page_view') * 1.0 / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user,
 -- engaged sessions per user (metric | average number of engaged sessions per user)
 COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
FROM
 --replace with your project_id.dataset_id.events_* or run as it is in OWOX BI BigQuery Reports Extension and change later
 `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
WHERE
 -- define start and end date
  _TABLE_SUFFIX BETWEEN
--replace with start date eg '20240301'
'{start_date default="20210101" type="input"}' AND
--replace with end date eg '202403131'
'{end_date default="20210110" type="input"}'
GROUP BY
 event_date, source, medium
ORDER BY
event_date DESC,
users DESC

How to Query User Count by Source / Medium and Landing Page

SELECT
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
  CONCAT((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source'), ' / ', (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium')) AS source_medium,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS landing_page,
  COUNT(DISTINCT user_pseudo_id) AS new_users
FROM
  `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
WHERE
  event_name = 'session_start' AND
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 AND
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
GROUP BY
  source,
  medium,
  source_medium,
  landing_page
ORDER BY
  new_users DESC

How to Run Page Performance Analysis

SELECT
  event_date,
  page_location,
  page_title,
  COUNTIF(event_name = 'page_view') AS page_views,
  COUNT(DISTINCT user_pseudo_id) AS users,
  SAFE_DIVIDE(COUNTIF(event_name = 'page_view'), COUNT(DISTINCT user_pseudo_id)) AS pageviews_per_user,
  SUM(IF(event_name = 'user_engagement', engagement_time, 0)) AS total_engagement_time,
  AVG(IF(event_name = 'user_engagement', engagement_time, NULL)) AS avg_engagement_time
-- For Unique Pageviews and Engagements per User, consider calculating these based on distinct user interactions within subqueries if necessary
FROM (
  SELECT
    event_date,
    user_pseudo_id,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_title') AS page_title,
    event_name,
    IF(event_name = 'user_engagement', (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') / 1000, 0) AS engagement_time
  FROM
--replace next line with your project_id.dataset_id if running in the BigQuery interface
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
  WHERE
--replace next line with your dates if running in the BigQuery interface
    _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
    AND (event_name = 'page_view' OR event_name = 'user_engagement')
)
GROUP BY
  event_date, page_location, page_title
ORDER BY
  event_date, page_views DESC
LIMIT 100

How to Perform E-commerce Purchase Events Analysis

SELECT
  event_timestamp,
  event_date,
  event_name,
  user_pseudo_id,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'page_location'
  ) AS landing_page,
  (
    SELECT COALESCE(value.int_value, value.float_value, value.double_value)
    FROM UNNEST(event_params)
    WHERE key = 'value'
  ) AS event_value,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'transaction_id'
  ) AS transaction_id,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'currency'
  ) AS currency,
  -- Assuming item details are directly under event_params for simplicity
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'item_name'
  ) AS item_name,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'item_id'
  ) AS item_id,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'item_brand'
  ) AS item_brand,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'item_category'
  ) AS item_category,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'item_variant'
  ) AS item_variant,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'coupon'
  ) AS coupon,
  (
    SELECT COALESCE(value.float_value, value.double_value)
    FROM UNNEST(event_params)
    WHERE key = 'shipping'
  ) AS shipping,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'payment_type'
  ) AS payment_type,
  (
    SELECT value.string_value
    FROM UNNEST(event_params)
    WHERE key = 'page_title'
  ) AS page_title,
  COUNT(*) AS number_of_events
FROM
  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
WHERE
  --replace with conversion even name
  event_name = '{event_name default="purchase" type="input"}'
  AND 
  -- replace w/ your dates if running outside of BigQuery Reports Extension
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
GROUP BY
  event_timestamp, event_date, event_name, user_pseudo_id, landing_page, event_value, transaction_id, currency, item_name, item_id, item_brand, item_category, item_variant, coupon, shipping, payment_type, page_title
ORDER BY
  event_date, event_timestamp

GA4 Standard Reports

User Acquisition Report

-- Query to retrieve a comprehensive user acquisition report from GA4 data in BigQuery
WITH base_data AS (
 SELECT
   -- The dimensions
   event_date,
   event_name,
   user_pseudo_id,
   traffic_source.source AS source,
   traffic_source.medium AS medium,
   -- Combining source and medium into a single dimension
   CONCAT(traffic_source.source, '/', traffic_source.medium) AS source_medium,
   traffic_source.name AS campaign,
   -- Extracting various user parameters for the metrics calculation
   (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') AS session_engaged,
   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') AS engagement_time_msec,
   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') AS ga_session_number,
   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
FROM
  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
WHERE
  -- replace w/ your dates if running outside of BigQuery Reports Extension
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'),
-- Calculation of metrics for each user
metrics AS (
 SELECT
   event_date,
   event_name,
   user_pseudo_id,
   source,
   medium,
   source_medium,
   campaign,
   session_engaged,
   engagement_time_msec,
   ga_session_number,
   ga_session_id,
  
   -- Calculating engagement time in seconds for clarity
   SAFE_DIVIDE(engagement_time_msec, 1000) AS engagement_time_sec,
  
   -- Determining if the session is engaged based on session_engaged value or engagement_time_msec
   (session_engaged = '1' OR engagement_time_msec > 0) AS is_engaged,
  
   -- Determining if the session is the first session
   ga_session_number = 1 AS is_first_session,
  
   -- Creating a unique session identifier
   CONCAT(user_pseudo_id, CAST(ga_session_id AS STRING)) AS session_id
 FROM
   base_data
)
-- Aggregating & Selecting metrics across all users
 SELECT
   event_date,
   source,
   medium,
   source_medium,
   campaign,
  
   -- Counting total and active users
   COUNT(DISTINCT user_pseudo_id) AS total_users,
   COUNT(DISTINCT CASE WHEN is_engaged THEN user_pseudo_id END) AS active_users,
  
   -- Calculating the percentage of active users
   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN user_pseudo_id END), COUNT(DISTINCT user_pseudo_id)) AS percent_active_users,
  
   -- Counting new users and calculating the percentage of new users
   COUNT(DISTINCT CASE WHEN is_first_session THEN user_pseudo_id END) AS new_users,
   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_first_session THEN user_pseudo_id END), COUNT(DISTINCT user_pseudo_id)) AS percent_new_users,
  
   -- Counting total sessions, new sessions and calculating percentages
   COUNT(DISTINCT session_id) AS sessions,
   COUNT(DISTINCT CASE WHEN is_first_session THEN session_id END) AS new_sessions,
   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_first_session THEN session_id END), COUNT(DISTINCT session_id)) AS percent_new_sessions,
  
   -- Calculating sessions per user and engaged sessions per user
   SAFE_DIVIDE(COUNT(DISTINCT session_id), COUNT(DISTINCT user_pseudo_id)) AS sessions_per_user,
   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END), COUNT(DISTINCT user_pseudo_id)) AS engaged_sessions_per_user,
  
   -- Counting engaged sessions and calculating the engagement rate
   COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END) AS engaged_sessions,
   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END), COUNT(DISTINCT session_id)) AS engagement_rate,
  
   -- Calculating average engagement time and events per user
   SAFE_DIVIDE(SUM(CASE WHEN is_engaged THEN engagement_time_sec ELSE 0 END), COUNT(DISTINCT user_pseudo_id)) AS average_engagement_time_sec,
   SAFE_DIVIDE(COUNT(*), COUNT(DISTINCT user_pseudo_id)) AS events_per_user
 -- Additional metrics for conversions and user conversion rate would be added here
 -- For this example, we assume 'purchase' events signify conversions and we sum up purchase revenue
 -- Placeholder for revenue calculation (requires exact field specification)
 FROM
   metrics
 GROUP BY
   event_date,
   source,
   medium,
   source_medium,
   Campaign
ORDER BY 
event_date DESC, 
total_users DESC

Customer Reports

Product Report

SELECT
  event_date AS Date,
  item_id AS ID,
  item_name AS Product_Name,
  item_variant AS Product_Variant,
  item_brand AS Brand,
  item_category AS Category,
  price AS Price,
  SUM(quantity) AS Quantity,
  COUNT(DISTINCT user_pseudo_id) AS User_count
FROM
  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`,
  UNNEST(items)
WHERE
  -- Replace date range if you’re running outside if BigQuery Reports Extension
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
  AND event_name IN ('purchase')
GROUP BY
  1, 2, 3, 4, 5, 6, 7
ORDER BY
  Quantity DESC

Audience Report

SELECT
  user_pseudo_id AS Client_id,
  user_id AS User_id,
  traffic_source.`source` AS Source,
  traffic_source.medium AS Medium,
  geo.continent AS Continent,
  geo.sub_continent AS Sub_Continent,
  geo.country AS Country,
  geo.region AS Region,
  geo.city AS City,
  device.category AS Device_Category,
  device.mobile_brand_name AS Mobile_Brand,
  device.mobile_model_name AS Mobile_Model,
  device.operating_system AS OS,
  device.operating_system_version AS OS_version,
  device.`language` as Language
FROM
   --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`,
  UNNEST(items)
WHERE
  -- Replace date range.
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
GROUP BY
  1, 2, 3, 4,  5,  6, 7, 8, 9, 10, 11, 12, 13, 14, 15
ORDER BY
  Client_id ASC

Ecommerce Report

SELECT
  *,
  revenue/transactions AS average_order_value
FROM (
  SELECT
    event_date,
    --use line 8-10 instead of 11-12 if you need data and the source/channel/company of the session, not the user
    --collected_traffic_source.manual_source as source,
    --collected_traffic_source.manual_medium as medium,
    --collected_traffic_source.manual_campaign_name as campaign,
    traffic_source.source AS source,
    traffic_source.medium AS medium,
    COUNT(DISTINCT user_pseudo_id) AS users,
    COUNT(DISTINCT
    IF
    (event_name='purchase',user_pseudo_id,NULL)) AS converted_users,
    COUNT(DISTINCT ecommerce.transaction_id) AS transactions,
    SUM(ecommerce.total_item_quantity) AS products,
    SUM(ecommerce.purchase_revenue) AS revenue
  FROM
--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
  WHERE
-- Replace date range if running outside of BigQuery Reports Extension
  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
  GROUP BY
    event_date,
    --if you use lines 8-10, then use lines 29-31 instead of 32-33
    --collected_traffic_source.manual_source,
    --collected_traffic_source.manual_medium,
    --collected_traffic_source.manual_campaign_name
    traffic_source.source,
    traffic_source.medium 
    ORDER BY
    transactions DESC)

TrafficSource Report (w/ Channel Grouping Template)

Metrics

  • # Sessions,
  • # First_sessions,
  • # Users,
  • # Pageviews,
  • # Purchases,
  • $ Revenue,
  • % Bounce_rate,
  • # Page_per_session,
  • # Page_per_user

    Dimensions

    • Event_date,
    • source,    
    • Medium,
    • channelGrouping.

      SELECT
        CASE
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(organic)') THEN 'Organic Search'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(none)') THEN 'Direct'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(display)') THEN 'Paid'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(not set)') THEN 'Not set'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(^offline$)') THEN 'Offline'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(cpc)|(ppc)|(cpm)|(adv)|(_ad)|(^ad)|(paid)|(remark)|(target)') THEN 'Paid'
          WHEN REGEXP_CONTAINS(medium, r'(?mi)(insta)|(twitter)|(telegram)|(social)|(smm)|(facebook)|(youtube)') THEN 'Social'
        ELSE
        'Referral'
      END
        AS channelGrouping,
        *,
        pageviews/sessions AS page_per_session,
        pageviews/users AS page_per_user
      FROM (
        SELECT
          event_date,
          traffic_source.source AS source,
          traffic_source.medium AS medium,
          --use lines 23-25 for session source/medium/campaign instead of user details
          --collected_traffic_source.manual_source as source,
          --collected_traffic_source.manual_medium as medium,
          --collected_traffic_source.manual_campaign_name as campaign,
          SUM(
          IF
            (event_name='session_start',1,NULL)) AS sessions,
          SUM(
          IF
            (event_name='first_visit',1,NULL)) AS first_sessions,
          count (DISTINCT user_pseudo_id) AS users,
          SUM(
          IF
            (event_name='page_view',1,NULL)) AS pageviews,
          SUM(
          IF
            (event_name='purchase',1,NULL)) AS purchases,
          SUM(ecommerce.purchase_revenue) AS revenue,
          --query bounce rate
          SAFE_DIVIDE(COUNT(DISTINCT CONCAT(user_pseudo_id,(
                SELECT
                  value.int_value
                FROM
                  UNNEST(event_params)
                WHERE
                  KEY = 'ga_session_id'))) - COUNT(DISTINCT
              CASE
                WHEN ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
            END
              ),COUNT(DISTINCT CONCAT(user_pseudo_id,(
                SELECT
                  value.int_value
                FROM
                  UNNEST(event_params)
                WHERE
                  KEY = 'ga_session_id')))) AS bounce_rate,
        FROM
      --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
          `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
        WHERE
      -- Replace date range if running outside of BigQuery Reports Extension
        _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
        GROUP BY
          event_date,
          traffic_source.source,
          traffic_source.medium
          --if you use line 23-35, then use lines 69-71 instead of 66-67
          --collected_traffic_source.manual_source,
          --collected_traffic_source.manual_medium,
          --collected_traffic_source.manual_campaign_name
        ORDER BY
          sessions DESC )

      Events Report

      SELECT
        event_date,
        event_name,
        count (DISTINCT event_timestamp ) AS events,
        COUNT(DISTINCT user_pseudo_id) AS users
      FROM
      --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
          `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
        WHERE
      -- Replace date range if running outside of BigQuery Reports Extension
        _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
      GROUP BY
        1,
        2
      ORDER BY
        events desc

      GeoReport

      SELECT
        event_date,
        device.category,
        device.mobile_brand_name,
        device.mobile_model_name,
        device.operating_system,
        device.operating_system_version,
        device.web_info.browser,
        device.web_info.browser_version,
        count (DISTINCT user_pseudo_id) AS users,
        SUM(
        IF
          (event_name='session_start',1,NULL)) AS sessions,
        SUM(
        IF
          (event_name='page_view',1,NULL)) AS pageviews,
      FROM
      --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
          `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
        WHERE
      -- Replace date range if running outside of BigQuery Reports Extension
        _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
      GROUP BY
        event_date,
        device.category,
        device.mobile_brand_name,
        device.mobile_model_name,
        device.operating_system,
        device.operating_system_version,
        device.web_info.browser,
        device.web_info.browser_version
       ORDER BY
        users DESC

      Device and Browser Report

      SELECT
        event_date,
        device.category,
        device.mobile_brand_name,
        device.mobile_model_name,
        device.operating_system,
        device.operating_system_version,
        device.web_info.browser,
        device.web_info.browser_version,
        count (DISTINCT user_pseudo_id) AS users,
        SUM(
        IF
          (event_name='session_start',1,NULL)) AS sessions,
        SUM(
        IF
          (event_name='page_view',1,NULL)) AS pageviews,
      FROM
      --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
          `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
        WHERE
      -- Replace date range if running outside of BigQuery Reports Extension
        _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
      GROUP BY
        event_date,
        device.category,
        device.mobile_brand_name,
        device.mobile_model_name,
        device.operating_system,
        device.operating_system_version,
        device.web_info.browser,
        device.web_info.browser_version
       ORDER BY
        users DESC
      Book a demo

      Spend Smarter, See Results Sooner

      Learn how to optimize every marketing dollar with personalized insights from OWOX BI

      Book a demo

      icon Get in-depth insights

      How to Set up [GA4] to BigQuery Export

      icon Get in-depth insights

      How to Set up [GA4] to BigQuery Export