How to Calculate GA4 Total Users and Active Users in BigQuery

Recreating GA4 Basic Metrics in BigQuery series

In the previous post, I showed the SQL to calculate the “Sessions” metrics in BigQuery.

In this post, we will recreate the “Users” and “Active Users” metrics in BigQuery.

GA4 tracks users in three different ways.

  1. By default, it uses an anonymous user ID for each user and stores it in the user_pseudo_id column.

  2. If you have enabled known user_id tracking in GA4 for all the users then by default GA4 uses that ID to track users.

  3. Finally, if you enabled user_id tracking in GA4 but not all the users have a User ID then GA4 uses user_id where it can find it else it uses anonymous_user_id

The first scenario is the most common one and the last one is a bit complicated. So I will cover the first two scenarios in this post and will cover the third one in the future.

Please note that GA4 also provides two different metrics for users: “Total Users:” and “Active Users”. The “Active Users” is also called “Users” in GA4. (This is confusing but there are a lot of things in GA4 that are confusing). Let’s take a look at both these metrics.

Total User Metrics

“Total Users” is the total number of people who visited your site or app in the specified date range. As mentioned above we will use two different methods to calculate the “Total Users” metrics.

  1. Using user_pseudo_id column in GA4, the SQL will be:

SELECT COUNT(DISTINCT user_pseudo_id)
FROM your_project_id.your_dataset_id.event_######

Since each user’s ID is stored in the user_pseudo_id column, we just need to a distinct count (unique count) of those IDs.

  1. Using user_id column in GA4, the SQL will be

SELECT COUNT(DISTINCT user_id)
FROM your_project_id.your_dataset_id.event_######

In the above SQL, instead of using user_pseudo_id, I used the user_id column, which stores the known user IDs.

That’s it. It is that simple.

Active User Metrics

"Active users" (or just "Users") is the number of people who engaged with your site or app in the specified date range. This is the User count that’s shown in the default reports.

Now to make your life easy, GA4 data export in BigQuery also contains a column called “is_active_user”. This column contains “true” if the user is an active user and “false” if the user is not active. So you don’t have to do any extensive calculations to figure out this metric.

We can simply use that column to only count those users who have the value of “true” in “is_active_user” column. So here are the two modified SQL scripts:

SELECT COUNT(DISTINCT user_pseudo_id)
FROM 'your_project_id.your_dataset_id.event_######'
WHERE is_active_user = true
SELECT COUNT(DISTINCT user_id)
FROM 'your_project_id.your_dataset_id.event_######'
WHERE is_active_user = true

That’s it. This one wasn’t difficult either.

We will look at various user types in future lessons. For now, practice writing these SQL queries.

If you want to support this newsletter then Upgrade to the Premium version.

You May Also Like

Happy Learning!

Anil Batra, Founder Optizent

If you need help with GA4 and BigQuery then contact me.

Subscribe to keep reading

This content is free, but you must be subscribed to GA4 BigQuery For Marketing Analytics to continue reading.

Already a subscriber?Sign In.Not now

Join the conversation

or to participate.