How Does GA4 Identify Users? An In-depth Analysis of Blended, Observed, and Device-Based Identity Models

Do your GA4 user counts look implausible? Are you struggling to track users across devices? Both problems usually trace back to one setting: GA4 Reporting Identity. This inconspicuous option largely determines the quality of your user data. This article walks through the three core models GA4 uses to piece together a user’s profile and provides a clear decision-making framework, so you can choose the setting that best reflects real user behavior for your business and make more precise, data-driven decisions.

Why is an Accurate "GA4 Reporting Identity" So Crucial?

Before we delve into the various models, we must first understand why this setting is so critical. It’s not just a technical option; it’s a strategic decision with far-reaching consequences.

| From Universal Analytics (UA) to GA4: A Paradigm Shift in User Identification

Remember the old Universal Analytics (UA)? It primarily used browser cookies to track users, a method that proved inadequate in the multi-device era. GA4 has brought a fundamental change with its user-centric, event-driven model. This means GA4 is inherently designed to understand the complete user journey, and accurate user identification is the key to achieving this goal.

| Cross-Device, Cross-Platform: The Challenge of the Modern User Journey

Imagine a typical modern user journey: a customer browses your product on their mobile app during their morning commute, adds the product to their cart on their work computer at noon, and finally completes the purchase on their tablet at home in the evening. Without effective cross-device tracking, GA4 might misinterpret these three interactions as three different “new users.” This is why your GA4 reports may be double-counting users. In our analysis experience, there was a case where a client, due to improper identity settings, severely overestimated their new user count while underestimating the value of returning users, nearly leading them to cut the marketing budget for loyal members.
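The journey above can be made concrete with a small sketch. The event records, field names (`device_id`, `user_id`), and values are hypothetical and only illustrate the counting logic; this is not how GA4 stores or processes data.

```python
# Three interactions from the same customer on three devices (hypothetical data).
events = [
    {"device_id": "phone-app-1", "user_id": "member-42", "event": "view_item"},
    {"device_id": "work-pc-7", "user_id": "member-42", "event": "add_to_cart"},
    {"device_id": "tablet-3", "user_id": "member-42", "event": "purchase"},
]


def count_users(events, key):
    """Count distinct users when identity is keyed on the given field."""
    return len({e[key] for e in events})


users_by_device = count_users(events, "device_id")  # device-only identification
users_by_login = count_users(events, "user_id")     # login-based identification

print(users_by_device, users_by_login)  # prints: 3 1
```

Keyed on the device, one customer becomes three "users"; keyed on the login, the same three events collapse into a single person.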

| How Does This Affect Your Business Decisions?

Incorrect user counts can create a chain reaction, directly misleading your business decisions. It affects:

  • Marketing Budget Allocation: You might mistakenly believe that certain channels for acquiring new customers are performing exceptionally well, while overlooking the value of retaining existing ones.
  • Customer Lifetime Value (LTV): The calculated LTV will be lower because a user’s multiple purchases are scattered across different “identities.”
  • Conversion Attribution: You will be unable to accurately determine which touchpoint truly led to the final conversion, causing your conversion attribution model to be inaccurate.

Understanding the importance of this setting, let’s now break down the three core technologies GA4 uses to identify users. They are the fundamental components that make up the different reporting models.

The Core of GA4 User Identification: A Detailed Explanation of the Three Identity Spaces

GA4 uses three different “Identity Spaces” in a specific order of priority to piece together a complete picture of a single user. Understanding how these three work, along with their pros and cons, is the first step in making the right choice.

| User-ID: The "Gold Standard" of Highest Accuracy

The GA4 User-ID is a unique identifier that you (the website owner) proactively provide to GA after a user completes a member login. It’s like issuing a unique digital ID card to each member. Because this is based on your own system’s first-party data, its accuracy is the highest.

  • Pros: It connects a user’s behavior across every device on which they log in, whether it’s a phone, computer, or tablet, and identifies them as the same person.
  • Prerequisites: Your website or app must have a member login system, and you’ll need developer assistance to deploy the code to successfully send the User-ID to GA4.
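As a rough illustration of what "sending the User-ID" means, the sketch below builds a request body in the shape used by the GA4 Measurement Protocol, where `client_id` carries the device-level identifier and `user_id` carries your own member identifier. On a website the User-ID is more commonly set in the tag itself; the server-side form is used here only as an example. The helper name `build_login_payload`, the sample IDs, and the simple "@" check are assumptions for illustration, and nothing is actually sent.

```python
import json


def build_login_payload(client_id: str, member_id: str) -> dict:
    """Build a Measurement Protocol style body that carries a User-ID.

    `member_id` is assumed to be an opaque internal ID, not an email or name.
    """
    if "@" in member_id:
        # Crude guard: an email address must never be used as the User-ID.
        raise ValueError("User-ID must not contain personally identifiable information")
    return {
        "client_id": client_id,  # device-level identifier (web Client ID)
        "user_id": member_id,    # your own first-party member identifier
        "events": [{"name": "login", "params": {"method": "password"}}],
    }


payload = build_login_payload("1234567890.1687654321", "member-42")
print(json.dumps(payload, indent=2))
```

Actually delivering the payload would also require your property's measurement ID and an API secret, which are deliberately left out of this sketch.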

| Google Signals: A Powerful Tool for Cross-Device Remarketing

When a User-ID is not available, GA4 will try to use Google Signals. This leverages data from users who are logged into their Google accounts and have agreed to ad personalization settings to perform cross-device identification. In simple terms, as long as a user is logged into the same Google account on different devices, GA4 has a chance to connect these behaviors.

  • Pros: It can be enabled without requiring users to log into your website and also unlocks demographic and interest reports in GA4, which is very helpful for remarketing campaigns.
  • Key Drawback: This is the main cause of “data thresholding.” To protect user privacy, when the number of users in a report is too small, GA4 automatically withholds the affected rows, so reports can look incomplete or appear to have insufficient data.

| Device ID: The Most Basic Anonymous Identification Method

If neither of the first two is available, GA4 will fall back to the most basic Device ID. On a website, this refers to the Client ID stored in the browser’s cookies (in the _ga cookie); on an app, it’s the App-Instance ID.

  • Drawback: Its fundamental limitation is that it cannot track across devices or browsers. As soon as a user changes devices, clears their cookies, or uses incognito mode, GA4 treats them as a new user, which severely fragments user data.
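To make the web Device ID less abstract, here is a minimal sketch that pulls the Client ID out of a `_ga` cookie value. It assumes the commonly seen layout `GA1.<n>.<random>.<timestamp>`, in which the Client ID is the last two dot-separated fields; the function name and the sample cookie values are hypothetical.

```python
def client_id_from_ga_cookie(cookie_value: str) -> str:
    """Extract the Client ID from a `_ga` cookie value.

    Assumes the layout `GA1.<n>.<random>.<timestamp>`; other layouts are rejected.
    """
    parts = cookie_value.split(".")
    if len(parts) < 4 or not parts[0].startswith("GA"):
        raise ValueError(f"unexpected _ga cookie format: {cookie_value!r}")
    return ".".join(parts[-2:])


# The same person in two browsers gets two unrelated Client IDs.
chrome_id = client_id_from_ga_cookie("GA1.1.1234567890.1687654321")
firefox_id = client_id_from_ga_cookie("GA1.1.987654321.1690000000")

print(chrome_id)                # prints: 1234567890.1687654321
print(chrome_id == firefox_id)  # prints: False
```

Because each browser generates its own value, nothing in the cookie links the two IDs to one person, which is exactly the fragmentation described above.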

Understanding these three identity spaces will help you better understand the three reporting models they combine to form. Next, we will enter the core of this article and discuss how to make the best strategic choice among these three models.

How to Choose the Best Model for You: A Strategic Showdown of Blended, Observed, and Device-Based Models

GA4 combines the three identity spaces mentioned above into three reporting models, allowing you to choose based on your business needs and privacy considerations. This is a strategic decision, and your choice will directly determine the shape of your data.

| Model 1: Blended - For the Most Comprehensive User View

This is GA4’s default and most powerful model. It works by sequentially using User-ID > Google Signals > Device ID to identify users, making the best effort to piece together the most complete user profile.

  • Best for: Ideal for e-commerce sites or SaaS platforms with a member login system that want to maximize the completeness of their cross-device data. The Blended model allows you to see user behavior that is closest to reality.
  • Note: You must accept the potential data thresholding issues that may arise from enabling Google Signals.

| Model 2: Observed - A Balance Between Data and Privacy

This model deliberately skips Google Signals, working by sequentially using User-ID > Device ID. Its goal is to maintain high-accuracy User-ID tracking while avoiding the interference of data thresholding in reports.

  • Best for: Suitable for businesses with a member system that prioritize data integrity and stability. For example, B2B websites, financial institutions, or research organizations with extremely high data accuracy requirements that cannot tolerate hidden data would find the Observed model very suitable.
  • Note: You will not be able to use the demographic and interest reports that are based on Google Signals.

| Model 3: Device-Based - Focusing on Privacy and Raw Data

This is the most basic model, using only the Device ID to identify users. This means GA4 will completely give up its cross-device tracking capabilities, reverting to a tracking method similar to traditional analytics tools.

  • Best for: Primarily suitable for content websites and blogs without a member login system, or for enterprises with extremely strict internal privacy policies that prevent the use of any login information or Google Signals. The Device-Based model is the most basic but has the lowest cross-device accuracy.
  • Note: User counts will be severely overestimated, and user journeys will be fragmented.
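The priority orders described for the three models can be sketched as a simple fallback lookup. This is a toy model of the logic as this article describes it, not GA4's actual implementation. In particular, `google_signals_id` is a hypothetical stand-in: Google Signals data is never exposed to you as an identifier.

```python
from typing import Optional

# Identifier priority per model, following the descriptions in this article.
MODEL_PRIORITY = {
    "blended": ["user_id", "google_signals_id", "device_id"],
    "observed": ["user_id", "device_id"],
    "device_based": ["device_id"],
}


def resolve_identity(hit: dict, model: str) -> Optional[str]:
    """Return the first identifier available on the hit for the chosen model."""
    for field in MODEL_PRIORITY[model]:
        value = hit.get(field)
        if value:
            return f"{field}:{value}"
    return None


# Hypothetical hits: one logged-in member on two devices, one Google-signed-in
# visitor on two devices, and one fully anonymous device.
hits = [
    {"device_id": "phone", "user_id": "member-42"},
    {"device_id": "laptop", "user_id": "member-42"},
    {"device_id": "tablet", "google_signals_id": "g-abc"},
    {"device_id": "desktop", "google_signals_id": "g-abc"},
    {"device_id": "kiosk"},
]

user_counts = {
    model: len({resolve_identity(h, model) for h in hits})
    for model in MODEL_PRIORITY
}
print(user_counts)  # prints: {'blended': 3, 'observed': 4, 'device_based': 5}
```

The same five hits yield three, four, or five users depending only on which identifiers the model is allowed to use.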

A Quick Comparison of the Three Reporting Identity Models

| Feature | Blended Model | Observed Model | Device-Based Model |
| --- | --- | --- | --- |
| Accuracy | Highest | High | Low |
| Cross-Device Capability | Best (User-ID + Google Signals) | Medium (User-ID only) | None |
| Data Thresholding Risk | High | None | None |
| Implementation Prerequisite | User-ID Recommended | User-ID Required | None |
| Recommended Business Type | E-commerce, SaaS Platforms | B2B, Finance, Research | Content Websites, Blogs |

Now that you clearly understand the differences between the three models, it’s time to check and configure your GA4 settings. In the next step, we’ll walk you through the GA4 backend for a hands-on exercise.

A Practical Guide: How to Set and Change Your Reporting Identity in the GA4 Backend

Theoretical knowledge must ultimately be put into practice. This step is very simple, but you need to pay special attention to the impact before and after the operation.

| Step 1: Check Your Prerequisites

Before switching models, be sure to confirm:

  • Have you deployed User-ID? If you want to use the “Blended” or “Observed” model, ensure your engineering team has correctly set up User-ID tracking.
  • Have you enabled Google Signals? If you want to use the “Blended” model, you need to enable Google Signals first.

| Step-by-Step Illustrated Guide: Finding and Switching Your Reporting Identity

You can easily find the settings page via the following path:

  1. Go to your GA4 backend.
  2. Click the “Admin” (gear icon) in the bottom-left corner.
  3. In the “Property” column, find and click on “Reporting Identity.”
  4. Here, you can see the currently selected model and click another one to switch. If not all options are visible, click “Show all” to reveal them.

| Important Warning: What to Pay Attention to After Switching Models

What are the effects of switching the reporting identity? This is a very important question. Please remember the following three points:

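  1. Collected Data Is Not Altered: This is the most crucial point. According to Google’s documentation, the reporting identity does not affect data collection or processing, so you can switch models at any time without permanently changing anything. What changes is how reports are calculated, which means the same date range, including past dates, can show different user counts before and after the switch.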
  2. User Counts May Change Drastically: When you switch from “Device-Based” to “Blended,” because GA4 starts merging previously fragmented users, you will observe a change in user numbers. Typically, the total user count will decrease significantly, while “events per user” will increase. This is a more realistic picture.
  3. Be Sure to Create an Annotation: It is highly recommended to use the GA4 Annotations feature to mark this major change on the day you switch. This will help you or other team members understand the reason for data changes at a specific point in time when analyzing data in the future.
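To see why the total user count falls while "events per user" rises, here is a short worked example. All figures are made up for illustration; the only assumption is that the number of events stays the same while fragmented users are merged.

```python
def events_per_user(total_events: int, users: int) -> float:
    """Average number of events attributed to each identified user."""
    return total_events / users


total_events = 12_000         # unchanged by the identity model
users_device_based = 3_000    # every device counted as its own user
users_blended = 2_000         # hypothetical count after merging devices

before = events_per_user(total_events, users_device_based)
after = events_per_user(total_events, users_blended)
user_change_pct = (users_blended - users_device_based) / users_device_based * 100

print(before, after)           # prints: 4.0 6.0
print(round(user_change_pct))  # prints: -33
```

The activity itself did not change; it is simply attributed to fewer, more complete users.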

As one senior data analyst advises, “After switching models, give the data at least a week to ‘stabilize’ before making a before-and-after comparison. Focus on ‘per-user’ metrics like ‘events per user’ and ‘user conversion rate,’ rather than just the total user count.”

After learning the basic operations, let’s explore some deeper topics that will help you fully grasp GA4’s data strategy.

Advanced Topics: Data Thresholding and the Cookieless Future

Choosing a reporting identity is not just a one-time setting; it’s about your future data strategy. Understanding the causes and responses to data thresholding and thinking about your strategy in the Cookieless era will put you one step ahead.

| An In-depth Look at "Data Thresholding": Why Does Your Report Show Insufficient Data?

Data Thresholding is a nightmare for many GA4 users. Its existence is fundamentally to protect the privacy of users after Google Signals is enabled. When the number of users in the date range or specific dimension you are viewing (e.g., city, interest) is too small and falls below Google’s internal threshold, GA4 will hide this data to prevent you from reverse-engineering the identity of a specific individual.
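The effect can be imitated with a simple filter. Google does not publish the real threshold, so the cutoff of 50 below is purely hypothetical, as are the city labels and user counts. The second dataset shows how counting over a longer period can lift a small row above the cutoff.

```python
HYPOTHETICAL_THRESHOLD = 50  # assumption: the real value is not published


def apply_threshold(rows: dict, minimum_users: int = HYPOTHETICAL_THRESHOLD) -> dict:
    """Keep rows whose user count reaches the minimum; withhold the rest."""
    return {dim: users for dim, users in rows.items() if users >= minimum_users}


# Users per city over one week and over four weeks (made-up numbers).
one_week = {"City A": 320, "City B": 61, "City C": 12}
four_weeks = {"City A": 1250, "City B": 240, "City C": 55}

visible_one_week = apply_threshold(one_week)
visible_four_weeks = apply_threshold(four_weeks)

print(visible_one_week)    # City C is withheld
print(visible_four_weeks)  # all three cities are shown
```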

So, how do you avoid GA4 data thresholding?

  • A temporary fix: Extend the date range of your report to increase the total number of users included in the calculation.
  • A fundamental fix: In the Reporting Identity settings, switch to the “Observed” or “Device-Based” model, which does not use Google Signals.
  • A professional solution: Export your GA4 data to BigQuery for analysis. The raw data in BigQuery is not subject to data thresholding, allowing you to perform deep analysis freely.
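Once the raw rows are in your hands, you decide how users are counted. The sketch below works on a few hypothetical rows reduced to the export's `user_pseudo_id` and `user_id` columns and applies one common stitching approach: any device that was ever seen with a `user_id` is counted under that member. It is written in plain Python for illustration; in practice the same logic would be expressed as a BigQuery query.

```python
# Hypothetical rows shaped like the GA4 export, reduced to two columns.
rows = [
    {"user_pseudo_id": "111.aaa", "user_id": None},         # before login
    {"user_pseudo_id": "111.aaa", "user_id": "member-42"},  # after login
    {"user_pseudo_id": "222.bbb", "user_id": "member-42"},  # second device
    {"user_pseudo_id": "333.ccc", "user_id": None},         # anonymous visitor
]


def count_devices(rows) -> int:
    """Count distinct device-level identifiers."""
    return len({r["user_pseudo_id"] for r in rows})


def count_stitched_users(rows) -> int:
    """Count users, mapping each device to a known user_id when one exists."""
    known = {r["user_pseudo_id"]: r["user_id"] for r in rows if r["user_id"]}
    return len({known.get(r["user_pseudo_id"], r["user_pseudo_id"]) for r in rows})


devices = count_devices(rows)
stitched = count_stitched_users(rows)
print(devices, stitched)  # prints: 3 2
```

No rows are withheld in this calculation, so small segments remain visible, but the responsibility for handling that data in a privacy-respecting way moves to you.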

| Responding to the Cookieless Era: The Importance of GA4's Identity Recognition

With browsers increasingly restricting third-party cookies, the Cookieless era is upon us. This means that past methods of tracking and ad targeting that relied on cookies will face huge challenges. Tracking based on “Device ID” (which on the web is essentially a first-party cookie) will also become increasingly unreliable as browser privacy policies tighten.

In this trend, proactively building your brand’s own first-party data has become the core of future data strategy. Encouraging users to register and log in, and using User-ID for precise identification, is the fundamental way to break free from cookie dependence. GA4’s powerful identity recognition model is a key tool designed to help you achieve this strategic transformation.

You now have a comprehensive understanding of GA4’s reporting identity, from basic to advanced. It’s time to summarize and plan your data strategy.

Conclusion: Empower Your Data and Make Smarter Decisions

We started with the importance of GA4’s identity recognition, broke down the three core identity spaces (User-ID, Google Signals, Device ID), and delved into a strategic comparison of the “Blended,” “Observed,” and “Device-Based” models.

Ultimately, the core point we need to reiterate is: choosing a GA4 Reporting Identity model is not a one-time technical setting but a strategic business decision that needs to be continuously evaluated based on your business model, data needs, and privacy strategy.

The foundation of your data determines the ceiling of your analysis. The wrong setting will lead you to make misjudgments based on distorted data, while the right setting will provide you with a clear perspective to gain insight into real user behavior. We encourage you to immediately check your current settings and, based on the decision-making framework provided in this article, re-evaluate if it is the optimal choice.

Frequently Asked Questions (FAQ)

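Q: Will my old data change after switching the GA4 reporting identity?
A: The collected data itself does not change. According to Google’s documentation, the reporting identity does not affect data collection or processing, and you can switch at any time without any permanent impact on data. Reports are calculated under the selected model, however, so figures for past date ranges can look different after a switch, which is why it’s recommended to create an annotation when you make the change.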

Q: Which GA4 reporting identity model is the best?
A: There is no standard answer, only the “most suitable.” For e-commerce with a member system that values cross-device analysis, the “Blended” model is recommended. For those who value data stability and want to avoid data thresholding, the “Observed” model can be considered. For content websites without a member system, the “Device-Based” model is the only option.

Q: Why can’t I see demographic data after enabling Google Signals?
A: This is most likely due to “data thresholding.” When the number of users matching the criteria within the time frame you are viewing is too small, GA4 will hide this data to protect privacy. Try extending the date range or checking if other filters are limiting the amount of data.

Q: Does the GA4 reporting identity affect BigQuery data?
A: Not the exported data itself. Reporting identity is a reporting-layer setting: the export contains the raw events with the device-level `user_pseudo_id` and, where you send it, the `user_id`, regardless of which model is selected. In BigQuery, you still have the most complete raw logs, unrestricted by the data thresholding of the GA4 reporting interface, which is one of its great strengths.

Q: Should I update the GA4 reporting identity after adding a member feature?
A: Absolutely. Once you have successfully deployed User-ID tracking, you should switch your reporting identity from “Device-Based” to “Blended” or “Observed” as soon as possible to take full advantage of the higher-accuracy first-party data for identifying and analyzing your users.
