[GA4] BigQuery Export SQL Querries Library

i-radius

If you haven't set up your GA4 BigQuery Export yet, on the right side, you can find a link to our detailed guide on how you can set it all up in minutes.

However, you don’t need to have your own GA4 export data available in BigQuery to practice querying GA4 data. Google has created the BigQuery Public Dataset for GA4 export data so you can play with it.

This ga4_obfuscated_sample_ecommerce dataset contains a sample GA4 event export data from the Google Merchandise Store. The data covers three months from November 2020 to January 2021 and includes information on user behavior, product views, and transactions.

To access this dataset, you will need to have a Google Cloud project with the BigQuery API enabled.

How to query multiple GA4 'events_' tables

Before we start querying GA4 data in BigQuery let's figure out how to work with dates. Every day of Google Analytics 4 data in BigQuery is stored in a separate table.

If you only need data from one day (let's say December 31st, 2023) the FROM clause in your query will look something like this:

1FROM
2`project_id.dataset_id.events_20231231`

However, in most cases you will need to query a slightly longer period of time. You have one of 2 options here: WILDCARD or _TABLE_SUFFIX.

Define Dates using WILDCARD

When specifying the tables you want to retrieve the data FROM, you can use WILDCARD to select the date range.

1FROM
2`project_id.dataset_id.events_*` -- All tables in the dataset

1FROM
2`project_id.dataset_id.events_2021*` -- 2021 full year

1FROM
2`project_id.dataset_id.events_202101*` -- January 2021

1FROM
2`project_id.dataset_id.events_2021010*` -- First 9 days of January 2021

Define Dates using WHERE and Table Suffix

Right after specifying the tables you want to retrieve the data FROM, you can use the WHERE clause to specify the date range.

Static Date Range - Last 30 days

1SELECT
2*
3FROM
4`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
5WHERE
6_TABLE_SUFFIX BETWEEN `20210101` AND `20210131`

Dynamic Date Range - Last 30 days

1WHERE
2_TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 30 day))
3AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Dynamic Date Range - Last 7 days

1WHERE
2_TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 7 day))
3AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Static & Dynamic Date Range

You can also use the WHERE statement to mix both dynamic and static dates, for example you can query the data from a specific date till yesterday (In our example - From January 1st, 2021 till Yesterday)

1WHERE
2_TABLE_SUFFIX BETWEEN '20210101' AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

How to Query Events Details

Events Count

1SELECT
2  event_name,
3  COUNT(*) AS event_count
4FROM
5  --select your project, dataset and the table, as well as the date range
6  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131`
7GROUP BY
8  event_name
9ORDER BY
10  event_count DESC
11LIMIT
12  100

i-shadow

Traffic Report

1SELECT
2  Traffic_source.source,
3  Traffic_source.medium,
4  Traffic_source.name,
5  COUNT(DISTINCT user_pseudo_id) AS Users
6FROM
7  --select your project, dataset and the table, as well as the date range
8  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
9GROUP BY
10  Traffic_source.source,
11  Traffic_source.medium,
12  Traffic_source.name
13ORDER BY
14  Users DESC
15LIMIT
16  100

i-shadow

Traffic Report by date

1SELECT
2  Event_date AS date,
3  Traffic_source.source,
4  Traffic_source.medium,
5  Traffic_source.name,
6  COUNT(DISTINCT user_pseudo_id) AS Users
7FROM
8  --select your project, dataset and the table, as well as the date range
9  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
10GROUP BY
11  date,
12  Traffic_source.source,
13  Traffic_source.medium,
14  Traffic_source.name
15ORDER BY
16  date DESC,
17  Users DESC
18LIMIT
19  1000

i-shadow

How to Query Specific Events

If you want to query just the specific events, like purchase or page_view, you can use the WHERE clause to filter just those event names you need.

1SELECT
2  *
3FROM
4  --select your project, dataset and the table, as well as the date range
5  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
6WHERE
7  event_name = "page_view"
8LIMIT
9  100
10

i-shadow

Query BigQuery Data right from Google Sheets

While mastering SQL queries is a powerful skill, visualizing these results can often be just as challenging.

Look, BigQuery is a data warehouse, and it's not designed to visualize data: build pivots, charts, graphs or share data with stakeholders.

That's why we've created a solution to simplify this step.

It's as simple as that:

1. Install this extension;

2. Copy-paste any query from this guide;

3. Run it directly from Google Sheets;

4. Set automated updates to get a fresh report.

For most of the advanced queries below, you won't even have to edit the query to enter your project_id, dataset_id, or write any query to select dates. Simply add a new query in extension, run and you'll see a little window on the right side to enter your variables.

Without ever touching the SQL code, making it exceptionally user-friendly for anyone, especially those not familiar with SQL or just getting started.

Install Extension

👉 By using this BigQuery Reports Extension for Google Sheets, you can instantly create dynamic and interactive reports, visualize BigQuery data, set automated updates, and share actionable data insights with stakeholders—all within the familiar interface of Google Sheets.

Note: for some of the queries, you'll need to manually change the project_id and the dataset_id to the one with your data (however, you can still run the query on public data to see how it goes).

How to Query User Metrics & Dimensions

Now it's time to move forward and talk about how you can query default user metrics and dimensions from the [GA4] BigQuery export, like users, active users, session id, user pseudo id, and user properties key and values.

Some of Google Analytics 4 dimensions and metrics can be retrieved straight from the tables without performing any calculations or unnests for specific event parameters.  Other metrics and dimension that you might need are a bit harder to retrieve.

Please, keep in mind that GA4 has multiple concepts of users:

  • total users - based on unique user_pseudo_id’s (ex. clientId)
  • active users - based on unique user_pseudo_id’s and engagement
  • users based on user_id

For every query that involves users, you have to first figure out which concept of user calculations is applicable to your specific calculations. You will have to calculate them, or extract, which can require some more complex SQL statements that we specified below:

How to Query User detailsSELECT

1SELECT
2  -- user_id (use if you set this up in GA)
3  user_id,
4  -- user_pseudo_id ( formerly client_id - the pseudonymous id for the user)
5  user_pseudo_id,
6  -- user_first_touch_timestamp (the time in microseconds at which the user first opened your website)
7  TIMESTAMP_MICROS(user_first_touch_timestamp) AS user_first_touch_timestamp,
8  -- user_ltv.revenue (the lifetime value of the user)
9  user_ltv.revenue AS user_ltv_revenue,
10  -- user_ltv.currency (the currency of the user lifetime value)
11  user_ltv.currency AS user_ltv_currency
12FROM
13  --select your project, dataset and the table
14  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
15WHERE
16  -- define the wildcard or static and/or dynamic start and end date
17  _TABLE_SUFFIX BETWEEN '20210101'
18  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)) AND
19  user_ltv.revenue > 0
20LIMIT
21  100

i-shadow

How to Calculate New / Returning Visitors

1SELECT
2  -- user type (new visitor or returning visitor)
3  CASE
4    WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE event_name = 'session_start' AND KEY = 'ga_session_number') = 1 THEN 'new visitor'
5    WHEN (
6  SELECT
7    value.int_value
8  FROM
9    UNNEST(event_params)
10  WHERE
11    event_name = 'session_start'
12    AND KEY = 'ga_session_number') > 1 THEN 'returning visitor'
13  ELSE
14  NULL
15END
16  AS user_type,
17  COUNT(DISTINCT user_pseudo_id) AS users
18FROM
19  --select your project, dataset and the table
20  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
21WHERE
22  -- define the wildcard or static and/or dynamic start and end date
23  _TABLE_SUFFIX BETWEEN '20210101'
24  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
25GROUP BY
26  user_type
27HAVING
28  user_type IS NOT NULL
29LIMIT
30  100

i-shadow

How to Query User Types Grouped by Date

i-shadow
1SELECT
2--query date
3 event_date,
4--count new visitors
5 SUM(CASE WHEN user_type = 'new visitor' THEN 1 ELSE 0 END) AS new_visitors,
6--count returning visitors
7 SUM(CASE WHEN user_type = 'returning visitor' THEN 1 ELSE 0 END) AS returning_visitors
8FROM (
9 SELECT
10   event_date,
11   user_pseudo_id,
12   CASE
13     WHEN MAX(CASE WHEN event_name = 'session_start' AND params.key = 'ga_session_number' THEN params.value.int_value END) = 1 THEN 'new visitor'
14     WHEN MAX(CASE WHEN event_name = 'session_start' AND params.key = 'ga_session_number' THEN params.value.int_value END) > 1 THEN 'returning visitor'
15   END AS user_type
16 FROM
17--replace with your project_id.dataset_id.events_* or run as it is in OWOX BI BigQuery Reports Extension and change later
18   `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,
19   UNNEST(event_params) AS params
20 GROUP BY
21   event_date,
22   user_pseudo_id
23)
24WHERE
25 user_type IS NOT NULL
26GROUP BY
27 event_date
28ORDER BY
29 event_date

i-shadow

How to Count # of Users by # of Sessions

1WITH
2  prep AS (
3  SELECT
4    user_pseudo_id,
5    MAX((
6      SELECT
7        value.int_value
8      FROM
9        UNNEST(event_params)
10      WHERE
11        KEY = 'ga_session_number')) OVER (PARTITION BY user_pseudo_id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_session_number
12  FROM
13    --select your project, dataset and the table, as well as the date range
14    `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
15  WHERE
16    -- define static and/or dynamic start and end date
17    _table_suffix BETWEEN '20210101'
18    AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)))
19SELECT
20  -- count of sessions (Each sessions gets it's number, where the first session is specified as 1)
21  max_session_number AS count_of_sessions,
22  COUNT(DISTINCT user_pseudo_id) AS users
23FROM
24  prep
25GROUP BY
26  max_session_number
27ORDER BY
28  Max_session_number DESC
29LIMIT
30  100

i-border

Need Helping Hand Writing Complex SQL Queries?

Diving into advanced SQL can be daunting. But if you're looking to write complex queries without spending hours troubleshooting or refining your scripts... we've created a ChatGPT App - SQL Copilot for BigQuery.

Just check out the video below and use ChatGPT to write the queries that you need:


This gpt app integrates directly with your BigQuery, allowing you to use natural language to generate and optimize SQL queries.

You can share your table schema, add more context, and get the query that you need in seconds.
You can even perform complex JOINs to query data from different datasets or tables into the same report.

👉 The SQL Copilot for BigQuery by OWOX not only assists in creating accurate SQL queries but also teaches you how to handle complex data effectively.

It's like having an SQL expert sitting right beside you, ready to guide you through the intricacies of data querying.

How to Query Total users / New users / Active Users

1SELECT
2  -- users (the total number of users)
3  COUNT(DISTINCT user_pseudo_id) AS users,
4  -- active users (the number of engaged users)
5  COUNT(DISTINCT
6    CASE
7      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'engagement_time_msec') > 0 OR ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN user_pseudo_id
8    ELSE
9    NULL
10  END
11    ) AS active_users,
12  -- % active users (the percentage of engaged users from the total users)
13  COUNT(DISTINCT
14    CASE
15      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'engagement_time_msec') > 0 OR ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN user_pseudo_id
16    ELSE
17    NULL
18  END
19    ) / COUNT(DISTINCT user_pseudo_id) AS percent_active_users,
20  -- new users (the number of users who interacted with your site for the first time)
21  COUNT(DISTINCT
22    CASE
23      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN user_pseudo_id
24    ELSE
25    NULL
26  END
27    ) AS new_users,
28  -- % new users (the percentage of new users from the total users)
29  COUNT(DISTINCT
30    CASE
31      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN user_pseudo_id
32    ELSE
33    NULL
34  END
35    ) / COUNT(DISTINCT user_pseudo_id) AS percent_new_users
36FROM
37  --select your project, dataset and the table, as well as the date range
38  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
39WHERE
40  -- define static and/or dynamic start and end date
41  _table_suffix BETWEEN '20210101'
42  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
43LIMIT
44  100

i-shadow

How to Query New Sessions / Engaged Sessions / Events per Users

1SELECT
2  -- sessions (the total number of sessions)
3  COUNT(DISTINCT CONCAT(user_pseudo_id,(
4      SELECT
5        value.int_value
6      FROM
7        UNNEST(event_params)
8      WHERE
9        KEY = 'ga_session_id'))) AS sessions,
10  -- new sessions (the number of sessions by users who had never visited your site before)
11  COUNT(DISTINCT
12    CASE
13      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
14    ELSE
15    NULL
16  END
17    ) AS new_sessions,
18  -- % new sessions (the percentage of sessions by users who had never visited your site before)
19  COUNT(DISTINCT
20    CASE
21      WHEN ( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
22    ELSE
23    NULL
24  END
25    ) / COUNT(DISTINCT CONCAT(user_pseudo_id,(
26      SELECT
27        value.int_value
28      FROM
29        UNNEST(event_params)
30      WHERE
31        KEY = 'ga_session_id'))) AS percent_new_sessions,
32  -- number of sessions per user (the total number of sessions divided by the total number of users)
33  COUNT(DISTINCT CONCAT(user_pseudo_id,(
34      SELECT
35        value.int_value
36      FROM
37        UNNEST(event_params)
38      WHERE
39        KEY = 'ga_session_id'))) / COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user,
40  -- event count per user (the number of times all of the events were triggered divided by amount of users)
41  COUNT(*) / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user,
42  -- engaged sessions per user
43  COUNT(DISTINCT
44    CASE
45      WHEN ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
46  END
47    ) / COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
48FROM
49  --select your project, dataset and the table, as well as the date range
50  `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
51WHERE
52  -- define static and/or dynamic start and end date
53  _table_suffix BETWEEN '20210101'
54  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
55LIMIT
56  100

i-shadow

How to Query Engagement Details & Bounce Rate

i-shadow
1select
2 --query date
3 event_date as date,
4 --query sessions count
5 count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) as sessions,
6 --query engaged sessions count
7 count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end) as engaged_sessions,
8 --query engagement rate
9 safe_divide(count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as engagement_rate,
10 --query bounced sessions count
11 count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) - count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end) as bounces,
12 --query bounce rate
13 safe_divide(count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id'))) - count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as bounce_rate,
14 -- query event count per session
15 safe_divide(count(distinct case when event_name = 'page_view' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as event_count_per_session
16 from
17   --replace with your `project_id.dataset_id.events_*` or run as it is in OWOX BI BigQuery Reports Extension and change later
18  `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
19where
20 -- define start and end date
21 _TABLE_SUFFIX BETWEEN
22 --replace with start date eg '20240301'
23 '{start_date default="20210101" type="input"}' AND
24 --replace with end date eg '202403131'
25 '{end_date default="20210110" type="input"}'
26   GROUP BY date
27   ORDER BY date DESC
28LIMIT 100

i-shadow

How to Query User Count by Source / Medium and Date

i-shadow

1SELECT
2 -- date (dimension)
3 event_date,
4 -- source (dimension | extracted from event_params)
5 (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
6 -- medium (dimension | extracted from event_params)
7 (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
8 -- users (metric | the total number of users)
9 COUNT(DISTINCT user_pseudo_id) AS users,
10 -- active users (metric | users with engagement time or session engaged)
11 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN user_pseudo_id ELSE NULL END) AS active_users,
12 -- new users (metric | users with ga_session_number = 1)
13 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) AS new_users,
14 -- % new users (metric | percentage of new users out of total users)
15 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS percentage_new_users,
16 -- new sessions (metric | count of new sessions by new users)
17 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) ELSE NULL END) AS new_sessions,
18 -- % new sessions (metric | percentage of new sessions out of total sessions)
19 COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) ELSE NULL END) * 1.0 / COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) AS percentage_new_sessions,
20 -- sessions per user (metric | average number of sessions per user)
21 COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user,
22 -- event count per user (metric | average number of page_view events per user)
23 COUNTIF(event_name = 'page_view') * 1.0 / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user,
24 -- engaged sessions per user (metric | average number of engaged sessions per user)
25 COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
26FROM
27 --replace with your project_id.dataset_id.events_* or run as it is in OWOX BI BigQuery Reports Extension and change later
28 `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
29WHERE
30 -- define start and end date
31  _TABLE_SUFFIX BETWEEN
32--replace with start date eg '20240301'
33'{start_date default="20210101" type="input"}' AND
34--replace with end date eg '202403131'
35'{end_date default="20210110" type="input"}'
36GROUP BY
37 event_date, source, medium
38ORDER BY
39event_date DESC,
40users DESC

i-shadow

How to Query User Count by Source / Medium and Landing Page

1SELECT
2  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
3  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
4  CONCAT((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source'), ' / ', (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium')) AS source_medium,
5  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS landing_page,
6  COUNT(DISTINCT user_pseudo_id) AS new_users
7FROM
8  `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
9WHERE
10  event_name = 'session_start' AND
11  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 AND
12  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
13GROUP BY
14  source,
15  medium,
16  source_medium,
17  landing_page
18ORDER BY
19  new_users DESC

i-shadow

1SELECT
2  event_date,
3  page_location,
4  page_title,
5  COUNTIF(event_name = 'page_view') AS page_views,
6  COUNT(DISTINCT user_pseudo_id) AS users,
7  SAFE_DIVIDE(COUNTIF(event_name = 'page_view'), COUNT(DISTINCT user_pseudo_id)) AS pageviews_per_user,
8  SUM(IF(event_name = 'user_engagement', engagement_time, 0)) AS total_engagement_time,
9  AVG(IF(event_name = 'user_engagement', engagement_time, NULL)) AS avg_engagement_time
10-- For Unique Pageviews and Engagements per User, consider calculating these based on distinct user interactions within subqueries if necessary
11FROM (
12  SELECT
13    event_date,
14    user_pseudo_id,
15    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location,
16    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_title') AS page_title,
17    event_name,
18    IF(event_name = 'user_engagement', (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') / 1000, 0) AS engagement_time
19  FROM
20--replace next line with your project_id.dataset_id if running in the BigQuery interface
21    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
22  WHERE
23--replace next line with your dates if running in the BigQuery interface
24    _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
25    AND (event_name = 'page_view' OR event_name = 'user_engagement')
26)
27GROUP BY
28  event_date, page_location, page_title
29ORDER BY
30  event_date, page_views DESC
31LIMIT 100

i-shadow

How to Perform E-commerce Purchase Events Analysis

i-shadow

1SELECT
2  event_timestamp,
3  event_date,
4  event_name,
5  user_pseudo_id,
6  (
7    SELECT value.string_value
8    FROM UNNEST(event_params)
9    WHERE key = 'page_location'
10  ) AS landing_page,
11  (
12    SELECT COALESCE(value.int_value, value.float_value, value.double_value)
13    FROM UNNEST(event_params)
14    WHERE key = 'value'
15  ) AS event_value,
16  (
17    SELECT value.string_value
18    FROM UNNEST(event_params)
19    WHERE key = 'transaction_id'
20  ) AS transaction_id,
21  (
22    SELECT value.string_value
23    FROM UNNEST(event_params)
24    WHERE key = 'currency'
25  ) AS currency,
26  -- Assuming item details are directly under event_params for simplicity
27  (
28    SELECT value.string_value
29    FROM UNNEST(event_params)
30    WHERE key = 'item_name'
31  ) AS item_name,
32  (
33    SELECT value.string_value
34    FROM UNNEST(event_params)
35    WHERE key = 'item_id'
36  ) AS item_id,
37  (
38    SELECT value.string_value
39    FROM UNNEST(event_params)
40    WHERE key = 'item_brand'
41  ) AS item_brand,
42  (
43    SELECT value.string_value
44    FROM UNNEST(event_params)
45    WHERE key = 'item_category'
46  ) AS item_category,
47  (
48    SELECT value.string_value
49    FROM UNNEST(event_params)
50    WHERE key = 'item_variant'
51  ) AS item_variant,
52  (
53    SELECT value.string_value
54    FROM UNNEST(event_params)
55    WHERE key = 'coupon'
56  ) AS coupon,
57  (
58    SELECT COALESCE(value.float_value, value.double_value)
59    FROM UNNEST(event_params)
60    WHERE key = 'shipping'
61  ) AS shipping,
62  (
63    SELECT value.string_value
64    FROM UNNEST(event_params)
65    WHERE key = 'payment_type'
66  ) AS payment_type,
67  (
68    SELECT value.string_value
69    FROM UNNEST(event_params)
70    WHERE key = 'page_title'
71  ) AS page_title,
72  COUNT(*) AS number_of_events
73FROM
74  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
75    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
76WHERE
77  --replace with conversion even name
78  event_name = '{event_name default="purchase" type="input"}'
79  AND 
80  -- replace w/ your dates if running outside of BigQuery Reports Extension
81  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
82GROUP BY
83  event_timestamp, event_date, event_name, user_pseudo_id, landing_page, event_value, transaction_id, currency, item_name, item_id, item_brand, item_category, item_variant, coupon, shipping, payment_type, page_title
84ORDER BY
85  event_date, event_timestamp

i-shadow

GA4 Standard Reports

User Acquisition Report

i-shadow

1-- Query to retrieve a comprehensive user acquisition report from GA4 data in BigQuery
2WITH base_data AS (
3 SELECT
4   -- The dimensions
5   event_date,
6   event_name,
7   user_pseudo_id,
8   traffic_source.source AS source,
9   traffic_source.medium AS medium,
10   -- Combining source and medium into a single dimension
11   CONCAT(traffic_source.source, '/', traffic_source.medium) AS source_medium,
12   traffic_source.name AS campaign,
13   -- Extracting various user parameters for the metrics calculation
14   (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') AS session_engaged,
15   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') AS engagement_time_msec,
16   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') AS ga_session_number,
17   (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
18FROM
19  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
20    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
21WHERE
22  -- replace w/ your dates if running outside of BigQuery Reports Extension
23  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'),
24-- Calculation of metrics for each user
25metrics AS (
26 SELECT
27   event_date,
28   event_name,
29   user_pseudo_id,
30   source,
31   medium,
32   source_medium,
33   campaign,
34   session_engaged,
35   engagement_time_msec,
36   ga_session_number,
37   ga_session_id,
38  
39   -- Calculating engagement time in seconds for clarity
40   SAFE_DIVIDE(engagement_time_msec, 1000) AS engagement_time_sec,
41  
42   -- Determining if the session is engaged based on session_engaged value or engagement_time_msec
43   (session_engaged = '1' OR engagement_time_msec > 0) AS is_engaged,
44  
45   -- Determining if the session is the first session
46   ga_session_number = 1 AS is_first_session,
47  
48   -- Creating a unique session identifier
49   CONCAT(user_pseudo_id, CAST(ga_session_id AS STRING)) AS session_id
50 FROM
51   base_data
52)
53-- Aggregating & Selecting metrics across all users
54 SELECT
55   event_date,
56   source,
57   medium,
58   source_medium,
59   campaign,
60  
61   -- Counting total and active users
62   COUNT(DISTINCT user_pseudo_id) AS total_users,
63   COUNT(DISTINCT CASE WHEN is_engaged THEN user_pseudo_id END) AS active_users,
64  
65   -- Calculating the percentage of active users
66   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN user_pseudo_id END), COUNT(DISTINCT user_pseudo_id)) AS percent_active_users,
67  
68   -- Counting new users and calculating the percentage of new users
69   COUNT(DISTINCT CASE WHEN is_first_session THEN user_pseudo_id END) AS new_users,
70   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_first_session THEN user_pseudo_id END), COUNT(DISTINCT user_pseudo_id)) AS percent_new_users,
71  
72   -- Counting total sessions, new sessions and calculating percentages
73   COUNT(DISTINCT session_id) AS sessions,
74   COUNT(DISTINCT CASE WHEN is_first_session THEN session_id END) AS new_sessions,
75   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_first_session THEN session_id END), COUNT(DISTINCT session_id)) AS percent_new_sessions,
76  
77   -- Calculating sessions per user and engaged sessions per user
78   SAFE_DIVIDE(COUNT(DISTINCT session_id), COUNT(DISTINCT user_pseudo_id)) AS sessions_per_user,
79   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END), COUNT(DISTINCT user_pseudo_id)) AS engaged_sessions_per_user,
80  
81   -- Counting engaged sessions and calculating the engagement rate
82   COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END) AS engaged_sessions,
83   SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN is_engaged THEN session_id END), COUNT(DISTINCT session_id)) AS engagement_rate,
84  
85   -- Calculating average engagement time and events per user
86   SAFE_DIVIDE(SUM(CASE WHEN is_engaged THEN engagement_time_sec ELSE 0 END), COUNT(DISTINCT user_pseudo_id)) AS average_engagement_time_sec,
87   SAFE_DIVIDE(COUNT(*), COUNT(DISTINCT user_pseudo_id)) AS events_per_user
88 -- Additional metrics for conversions and user conversion rate would be added here
89 -- For this example, we assume 'purchase' events signify conversions and we sum up purchase revenue
90 -- Placeholder for revenue calculation (requires exact field specification)
91 FROM
92   metrics
93 GROUP BY
94   event_date,
95   source,
96   medium,
97   source_medium,
98   Campaign
99ORDER BY 
100event_date DESC, 
101total_users DESC

i-shadow

Customer Reports

Product Report

i-shadow

1SELECT
2  event_date AS Date,
3  item_id AS ID,
4  item_name AS Product_Name,
5  item_variant AS Product_Variant,
6  item_brand AS Brand,
7  item_category AS Category,
8  price AS Price,
9  SUM(quantity) AS Quantity,
10  COUNT(DISTINCT user_pseudo_id) AS User_count
11FROM
12  --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
13    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`,
14  UNNEST(items)
15WHERE
16  -- Replace date range if you’re running outside if BigQuery Reports Extension
17  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
18  AND event_name IN ('purchase')
19GROUP BY
20  1, 2, 3, 4, 5, 6, 7
21ORDER BY
22  Quantity DESC

i-shadow

Audience Report

i-shadow

1SELECT
2  user_pseudo_id AS Client_id,
3  user_id AS User_id,
4  traffic_source.`source` AS Source,
5  traffic_source.medium AS Medium,
6  geo.continent AS Continent,
7  geo.sub_continent AS Sub_Continent,
8  geo.country AS Country,
9  geo.region AS Region,
10  geo.city AS City,
11  device.category AS Device_Category,
12  device.mobile_brand_name AS Mobile_Brand,
13  device.mobile_model_name AS Mobile_Model,
14  device.operating_system AS OS,
15  device.operating_system_version AS OS_version,
16  device.`language` as Language
17FROM
18   --replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
19    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`,
20  UNNEST(items)
21WHERE
22  -- Replace date range.
23  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
24GROUP BY
25  1, 2, 3, 4,  5,  6, 7, 8, 9, 10, 11, 12, 13, 14, 15
26ORDER BY
27  Client_id ASC

i-shadow

Ecommerce Report

i-shadow

1SELECT
2  *,
3  revenue/transactions AS average_order_value
4FROM (
5  SELECT
6    event_date,
7    --use line 8-10 instead of 11-12 if you need data and the source/channel/company of the session, not the user
8    --collected_traffic_source.manual_source as source,
9    --collected_traffic_source.manual_medium as medium,
10    --collected_traffic_source.manual_campaign_name as campaign,
11    traffic_source.source AS source,
12    traffic_source.medium AS medium,
13    COUNT(DISTINCT user_pseudo_id) AS users,
14    COUNT(DISTINCT
15    IF
16    (event_name='purchase',user_pseudo_id,NULL)) AS converted_users,
17    COUNT(DISTINCT ecommerce.transaction_id) AS transactions,
18    SUM(ecommerce.total_item_quantity) AS products,
19    SUM(ecommerce.purchase_revenue) AS revenue
20  FROM
21--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
22    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
23  WHERE
24-- Replace date range if running outside of BigQuery Reports Extension
25  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
26  GROUP BY
27    event_date,
28    --if you use lines 8-10, then use lines 29-31 instead of 32-33
29    --collected_traffic_source.manual_source,
30    --collected_traffic_source.manual_medium,
31    --collected_traffic_source.manual_campaign_name
32    traffic_source.source,
33    traffic_source.medium 
34    ORDER BY
35    transactions DESC)

i-shadow

TrafficSource Report (w/ Channel Grouping Template)Metrics:

  • # Sessions,
  • # First_sessions,
  • # Users,
  • # Pageviews,
  • # Purchases,
  • $ Revenue,
  • % Bounce_rate,
  • # Page_per_session,
  • # Page_per_user

Dimensions:

  • Event_date,
  • source,    
  • Medium,
  • channelGrouping.
i-shadow

1SELECT
2  CASE
3    WHEN REGEXP_CONTAINS(medium, r'(?mi)(organic)') THEN 'Organic Search'
4    WHEN REGEXP_CONTAINS(medium, r'(?mi)(none)') THEN 'Direct'
5    WHEN REGEXP_CONTAINS(medium, r'(?mi)(display)') THEN 'Paid'
6    WHEN REGEXP_CONTAINS(medium, r'(?mi)(not set)') THEN 'Not set'
7    WHEN REGEXP_CONTAINS(medium, r'(?mi)(^offline$)') THEN 'Offline'
8    WHEN REGEXP_CONTAINS(medium, r'(?mi)(cpc)|(ppc)|(cpm)|(adv)|(_ad)|(^ad)|(paid)|(remark)|(target)') THEN 'Paid'
9    WHEN REGEXP_CONTAINS(medium, r'(?mi)(insta)|(twitter)|(telegram)|(social)|(smm)|(facebook)|(youtube)') THEN 'Social'
10  ELSE
11  'Referral'
12END
13  AS channelGrouping,
14  *,
15  pageviews/sessions AS page_per_session,
16  pageviews/users AS page_per_user
17FROM (
18  SELECT
19    event_date,
20    traffic_source.source AS source,
21    traffic_source.medium AS medium,
22    --use lines 23-25 for session source/medium/campaign instead of user details
23    --collected_traffic_source.manual_source as source,
24    --collected_traffic_source.manual_medium as medium,
25    --collected_traffic_source.manual_campaign_name as campaign,
26    SUM(
27    IF
28      (event_name='session_start',1,NULL)) AS sessions,
29    SUM(
30    IF
31      (event_name='first_visit',1,NULL)) AS first_sessions,
32    count (DISTINCT user_pseudo_id) AS users,
33    SUM(
34    IF
35      (event_name='page_view',1,NULL)) AS pageviews,
36    SUM(
37    IF
38      (event_name='purchase',1,NULL)) AS purchases,
39    SUM(ecommerce.purchase_revenue) AS revenue,
40    --query bounce rate
41    SAFE_DIVIDE(COUNT(DISTINCT CONCAT(user_pseudo_id,(
42          SELECT
43            value.int_value
44          FROM
45            UNNEST(event_params)
46          WHERE
47            KEY = 'ga_session_id'))) - COUNT(DISTINCT
48        CASE
49          WHEN ( SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id,( SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'))
50      END
51        ),COUNT(DISTINCT CONCAT(user_pseudo_id,(
52          SELECT
53            value.int_value
54          FROM
55            UNNEST(event_params)
56          WHERE
57            KEY = 'ga_session_id')))) AS bounce_rate,
58  FROM
59--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
60    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
61  WHERE
62-- Replace date range if running outside of BigQuery Reports Extension
63  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
64  GROUP BY
65    event_date,
66    traffic_source.source,
67    traffic_source.medium
68    --if you use line 23-35, then use lines 69-71 instead of 66-67
69    --collected_traffic_source.manual_source,
70    --collected_traffic_source.manual_medium,
71    --collected_traffic_source.manual_campaign_name
72  ORDER BY
73    sessions DESC )

i-shadow

Events Report

i-shadow

1SELECT
2  event_date,
3  event_name,
4  count (DISTINCT event_timestamp ) AS events,
5  COUNT(DISTINCT user_pseudo_id) AS users
6FROM
7--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
8    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
9  WHERE
10-- Replace date range if running outside of BigQuery Reports Extension
11  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
12GROUP BY
13  1,
14  2
15ORDER BY
16  events desc

i-shadow

GeoReport

i-shadow

1SELECT
2  event_date,
3  device.category,
4  device.mobile_brand_name,
5  device.mobile_model_name,
6  device.operating_system,
7  device.operating_system_version,
8  device.web_info.browser,
9  device.web_info.browser_version,
10  count (DISTINCT user_pseudo_id) AS users,
11  SUM(
12  IF
13    (event_name='session_start',1,NULL)) AS sessions,
14  SUM(
15  IF
16    (event_name='page_view',1,NULL)) AS pageviews,
17FROM
18--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
19    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
20  WHERE
21-- Replace date range if running outside of BigQuery Reports Extension
22  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
23GROUP BY
24  event_date,
25  device.category,
26  device.mobile_brand_name,
27  device.mobile_model_name,
28  device.operating_system,
29  device.operating_system_version,
30  device.web_info.browser,
31  device.web_info.browser_version
32 ORDER BY
33  users DESC

i-shadow

Device and Browser Report

i-shadow

1SELECT
2  event_date,
3  device.category,
4  device.mobile_brand_name,
5  device.mobile_model_name,
6  device.operating_system,
7  device.operating_system_version,
8  device.web_info.browser,
9  device.web_info.browser_version,
10  count (DISTINCT user_pseudo_id) AS users,
11  SUM(
12  IF
13    (event_name='session_start',1,NULL)) AS sessions,
14  SUM(
15  IF
16    (event_name='page_view',1,NULL)) AS pageviews,
17FROM
18--replace with your project_id.dataset_id if running outside of BigQuery Reports Extension
19    `{gcp_project_id default="bigquery-public-data"}.{dataset_id default="ga4_obfuscated_sample_ecommerce"}.events_*`
20  WHERE
21-- Replace date range if running outside of BigQuery Reports Extension
22  _TABLE_SUFFIX BETWEEN '{start_date default="20210101" type="input"}' AND '{end_date default="20210110" type="input"}'
23GROUP BY
24  event_date,
25  device.category,
26  device.mobile_brand_name,
27  device.mobile_model_name,
28  device.operating_system,
29  device.operating_system_version,
30  device.web_info.browser,
31  device.web_info.browser_version
32 ORDER BY
33  users DESC

i-shadow

FAQ

No items found.
Oops! Something went wrong while submitting the form...