Migrating your Reporting from Google Universal Analytics to Google Analytics 4
Google announced that it will focus on its newer web tracking service, Google Analytics 4, and that the older version, Universal Analytics, will no longer receive hits after June 30th, 2023. This change will affect many users who rely on the free version of Google Analytics. The paid version, Google Analytics 360, will continue for another year before also being discontinued. Google has been encouraging users to switch to the new version, and we have helped our clients with this process. This article will focus on the impact of the switch to Google Analytics 4 on business intelligence dashboards and reports, which is an important consideration for many users.
Since the launch of GA4, we have supported many migration projects for our clients, and we are regularly asked what migrating actually means. While other articles cover the migration itself, this one focuses on the stages downstream of Google Analytics: business intelligence dashboards and reports, and how a switch to Google Analytics 4 affects them.
Comparison of UI of Universal Analytics (left) vs Google Analytics 4 (right)
Reporting on the same but different
For many companies, be they B2B or B2C, Google Analytics is not only an essential tool in its own UI but also a key component of their (marketing) analytics and attribution modeling. Data from Google Analytics feeds reports and dashboards, whether these are built on a data warehouse or import data directly from Google Analytics into the visualization tool of choice. If you are using Universal Analytics data in this way, now is the time to take stock: from July 1st, 2023, your dashboards will not receive new data. And as we learned, switching to Google Analytics 4 might not be straightforward, given the change in the underlying data model.
The most prominent example is the “old” Universal Analytics event. It comes in a standard form with four fields: category, action, label, and value.
These fields no longer exist in this form in GA4. If you base your funnel reporting on the structure of specific event labels, you will have to rework your dashboard.
Another prominent example is eCommerce tracking, where GA4 expects data in a different format than Universal Analytics did: https://developers.google.com/analytics/devguides/migration/ecommerce/gtagjs-before-you-begin
An Analytics migration is already a challenge for some companies to handle in-house. Having to understand its impact on a reporting and analytics landscape that has grown over time can make it seem overwhelming.
GA4 – Not just a new Analytics, a completely new way of tracking
To understand why the new version is such a big move, we have to take a quick look at the basic data model, the foundation of web tracking in both the old and the new Google Analytics (https://support.google.com/analytics/answer/9964640?hl=en#zippy=%2Cin-this-article).
Google Analytics 4 takes a modern, event-based approach to web tracking. Universal Analytics, in contrast, had a rigid data model that focused mainly on sessions and individual visits to a web page. It received data in the form of hits with a specific scope and type (page, event, e-commerce and social). As a result, data had to be provided in a very specific way, and only the values predefined by Google could be tracked.
With GA4, Google put a lot of focus on integrating information from multiple platforms, on merging what users do on mobile with what they do on desktop, and on at least trying to be more privacy-centric. Whether that succeeds is for legislation to decide in the end. To achieve these goals, Google chose to track data as an event stream. The stream captures any information as long as it conforms to a base structure: an event name, such as pageview, plus individual parameters that characterize the specific event.
For a pageview, a parameter can be something as straightforward as the URL of the page, but also something more complex (added from an internal or external source), such as whether the user is registered and what their customer lifetime value (CLTV) is so far. The power of flexible events becomes clear when you compare them with the old Universal Analytics event, which was limited to three text fields (category, action and label) and a single numeric value. With parameters you can specify an event in great detail. This gives you many options for structuring your tracking and allows for far more use cases than before.
As you might imagine, the greater freedom and customization in how Google Analytics 4 captures data also leads to a significantly different underlying data structure.
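To make the difference between the two event shapes concrete, here is a minimal sketch in Python. The class names, the helper `ua_to_ga4`, and the parameter names (`event_category`, `event_action`, `event_label`, `is_registered_user`) are assumptions chosen for illustration; they are a convention for this example, not something GA4 prescribes.

```python
from dataclasses import dataclass, field
from typing import Optional, Union

ParamValue = Union[str, int, float, bool]


@dataclass
class UAEvent:
    """Universal Analytics event: three text fields and one numeric value."""
    category: str
    action: str
    label: Optional[str] = None
    value: Optional[int] = None


@dataclass
class GA4Event:
    """GA4 event: an event name plus free-form parameters."""
    name: str
    params: dict[str, ParamValue] = field(default_factory=dict)


def ua_to_ga4(event: UAEvent, name: str) -> GA4Event:
    """Carry the UA fields over as parameters of a GA4 event.

    The parameter names are a convention picked for this sketch.
    """
    params: dict[str, ParamValue] = {
        "event_category": event.category,
        "event_action": event.action,
    }
    if event.label is not None:
        params["event_label"] = event.label
    if event.value is not None:
        params["value"] = event.value
    return GA4Event(name=name, params=params)


legacy = UAEvent(category="checkout", action="click", label="step_2", value=1)
migrated = ua_to_ga4(legacy, name="checkout_progress")

# A GA4 event can carry more context than the four UA fields allowed.
migrated.params["is_registered_user"] = True
print(migrated)
```

Carrying the old fields over as parameters is only one possible mapping. It keeps label-based funnel logic reproducible downstream while leaving room for additional parameters.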
A modern data stack lifts most of the load
Gathering the requirements that current visualizations place on the underlying Universal Analytics data already goes some way towards tackling the switch in reporting. A modern data stack, however, can make the process far more comfortable. For one of our clients, after migrating their Universal Analytics setup to Google Analytics 4, we created an overview of where Universal Analytics data was used. The client's data stack looks as follows:
Fivetran loads data into a RAW database within the client’s Snowflake account. The RAW database is then accessed by dbt, which creates a first level of data models called staging models. These models normalize data across different sources (for example timezones and currencies) and make data readable where the source delivers it in a convoluted form. On top of these staging models we built the client's initial infrastructure with two additional layers. The first consists of interim models that pre-aggregate information regularly needed in evaluations, for example the daily revenue per product or the link between acquired users and the marketing cost needed to acquire them. The final layer consists of the so-called mart models, a set of highly aggregated tables used for performant reporting. These combine the pre-aggregated numbers into meaningful datasets.
The client's visualization tool of choice was Metabase, an approachable cloud visualization tool that lets entry-level users easily create dashboards, while data-savvy users can query the staging tables with SQL for exploratory analysis.
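The layering described above can be illustrated with a small Python sketch. In the actual stack these layers are dbt models written in SQL; the functions, field names, sample rows, and conversion rate below are made up for illustration only.

```python
from collections import defaultdict

# Hypothetical raw order lines with mixed currencies.
raw_orders = [
    {"date": "2023-07-01", "product": "A", "amount": 100.0, "currency": "EUR"},
    {"date": "2023-07-01", "product": "A", "amount": 110.0, "currency": "USD"},
    {"date": "2023-07-01", "product": "B", "amount": 50.0, "currency": "EUR"},
]

# Made-up fixed conversion rates to EUR, for illustration only.
RATES_TO_EUR = {"EUR": 1.0, "USD": 0.9}


def staging(rows):
    """Staging layer: normalize every order line to EUR."""
    return [
        {
            "date": r["date"],
            "product": r["product"],
            "revenue_eur": r["amount"] * RATES_TO_EUR[r["currency"]],
        }
        for r in rows
    ]


def interim_daily_revenue_per_product(staged):
    """Interim layer: pre-aggregate revenue per day and product."""
    totals = defaultdict(float)
    for r in staged:
        totals[(r["date"], r["product"])] += r["revenue_eur"]
    return dict(totals)


def mart_daily_revenue(interim):
    """Mart layer: combine pre-aggregated numbers into a reporting table."""
    totals = defaultdict(float)
    for (date, _product), revenue in interim.items():
        totals[date] += revenue
    return dict(totals)


stg = staging(raw_orders)
interim = interim_daily_revenue_per_product(stg)
mart = mart_daily_revenue(interim)
print(interim)
print(mart)
```

Each layer only reads from the one below it, which is what later makes it possible to swap the source of a staging model without touching the marts.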
Data governance as the base for change
How did this stack help us in migrating reports from Universal Analytics to Google Analytics 4? In our projects we spend quite some time “feeding” documentation to dbt. In return, dbt visualizes the dependencies between models in the so-called dbt docs (more information can be found here: https://docs.getdbt.com/docs/collaborate/documentation).
These dependencies allowed us to understand not only which visualizations depended on Universal Analytics but also which specific columns we received in our raw data from this source. Fivetran provides a ready-made Google Analytics 4 connector, making it a five-minute task to connect to your Google Analytics 4 account and get your data into your data warehouse of choice. With the “shopping list” we had prepared from the dbt docs, we set up the GA4 connector in Fivetran to fetch the data we knew mapped to what we had in Universal Analytics, since we had structured the Analytics concept during the migration from UA to GA4. Custom reports in Fivetran let you pre-select a combination of dimensions and metrics for your specific use case. Given that you pay for data transfer with Fivetran, we highly value the option to test any connector there for 14 days for free.
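The dependency lookup that produced our “shopping list” can be sketched as a simple graph traversal. This is not how dbt exposes lineage; the model names and the hand-written `children` mapping below are hypothetical and only illustrate the idea of finding everything downstream of the Universal Analytics source.

```python
from collections import deque

# Hypothetical lineage: each key feeds the models listed as its value.
children = {
    "raw.universal_analytics": ["stg_ua_sessions", "stg_ua_events"],
    "raw.shop_orders": ["stg_orders"],
    "stg_ua_sessions": ["int_marketing_cost_per_user"],
    "stg_ua_events": ["int_funnel_steps"],
    "stg_orders": ["int_daily_revenue_per_product", "int_marketing_cost_per_user"],
    "int_funnel_steps": ["mart_funnel"],
    "int_marketing_cost_per_user": ["mart_marketing"],
    "int_daily_revenue_per_product": ["mart_revenue"],
}


def downstream(start, graph):
    """Return every model that directly or indirectly depends on `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


affected = downstream("raw.universal_analytics", children)
print(sorted(affected))
```

In this made-up lineage, `mart_revenue` is not in the result, so a dashboard built only on that table would be unaffected by the switch.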
After the data arrived in the raw layer, we built staging tables from the Google Analytics 4 data that mirrored the way the Universal Analytics data was structured. In this way we made sure that the data our visualizations are based on has exactly the same structure in the old and the new tracking.
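As a rough sketch of what mirroring the structure means, the following Python snippet reshapes GA4-style rows into the column layout a Universal Analytics staging table might have had. The row layout, column names, and the fallback from a missing action parameter to the event name are assumptions for illustration; in the project itself this step lives in dbt staging models.

```python
# Hypothetical GA4 rows as they might arrive in the raw layer.
ga4_rows = [
    {
        "date": "2023-07-01",
        "event_name": "checkout_progress",
        "params": {
            "event_category": "checkout",
            "event_action": "click",
            "event_label": "step_2",
        },
        "event_count": 40,
    },
    {
        "date": "2023-07-01",
        "event_name": "newsletter_signup",
        "params": {"event_category": "engagement"},
        "event_count": 7,
    },
]

# Assumed column layout of the old Universal Analytics staging table.
UA_COLUMNS = ("date", "event_category", "event_action", "event_label", "total_events")


def to_ua_shape(row):
    """Flatten one GA4 row into the UA-style columns."""
    params = row["params"]
    return {
        "date": row["date"],
        "event_category": params.get("event_category"),
        # Assumption: fall back to the event name when no action was sent.
        "event_action": params.get("event_action", row["event_name"]),
        "event_label": params.get("event_label"),
        "total_events": row["event_count"],
    }


stg_ga4_events = [to_ua_shape(r) for r in ga4_rows]
for row in stg_ga4_events:
    print(row)
```

Where a field cannot be filled, as with the missing label in the second row, the gap stays visible as an empty value instead of being guessed.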
In dbt, models are typically built by referencing other underlying models via SQL. After evaluating data consistency between the old and new tracking locally, flipping the switch was as easy as pointing the reports from the old Universal Analytics staging models to the new GA4 models. Or, to put it more simply, we just used a different source table that was structured identically to the previous one.
All existing dashboards continued to run smoothly with the underlying data now coming in from Google Analytics 4. You still need to check what your data looks like after the migration. Checking data consistency between what you see in the Google Analytics 4 UI and in your dashboard can be tedious, but taking this time when setting up your data pipeline keeps you from running for days and weeks on reports that are only 98% correct. Also make sure to document where a 1:1 mapping between the two Analytics versions was not possible, and actively communicate this to all stakeholders.
Documentation often has low priority, but a common understanding of what you are looking at is key when working with data.
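A consistency check of this kind can be as simple as comparing daily totals from two sources against a tolerance. The function name, the 1% default tolerance, and the sample numbers below are illustrative assumptions, not figures from the project.

```python
def compare_daily_totals(reference, candidate, tolerance=0.01):
    """Return the days whose relative deviation exceeds the tolerance.

    Days missing from either source are always reported.
    """
    mismatches = {}
    for day in sorted(set(reference) | set(candidate)):
        ref = reference.get(day)
        cand = candidate.get(day)
        if ref is None or cand is None:
            mismatches[day] = (ref, cand)
            continue
        denominator = max(abs(ref), 1)  # avoid division by zero
        if abs(ref - cand) / denominator > tolerance:
            mismatches[day] = (ref, cand)
    return mismatches


# Made-up daily event counts: GA4 UI export vs. dashboard table.
ga4_ui = {"2023-07-01": 1000, "2023-07-02": 1200, "2023-07-03": 900}
dashboard = {"2023-07-01": 1000, "2023-07-02": 1176, "2023-07-03": 895}

print(compare_daily_totals(ga4_ui, dashboard))
```

With these sample numbers only the second day is flagged, because its deviation of 2% exceeds the tolerance, while the third day stays below it.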
Don’t break your reporting, and start your migration now
Using a modern cloud data stack built on Fivetran, dbt, Snowflake, and Metabase can make a migration project more manageable. With solid documentation and flexible tools, re-engineering your data for reporting becomes a to-do list rather than a daunting task. If you need help with modernizing your data stack or migrating from Universal Analytics to Google Analytics 4, we’re here to assist you. We can also improve your tracking capabilities and focus on user privacy while mapping out your business in web tracking. Contact us today for support!