Introduction To Event-Based Analytics - Issue 142
Or: how to set up your events-based tracking for product analytics
Welcome to the Data Analysis Journal, a weekly newsletter about data science and analytics.
Addressing event analytics is way overdue in my newsletter. It should be taught as the opening series of Olga’s Masterclass Data Analytics 101 if that will ever happen.
A warning, as below is an analyst’s pure frustration pouring out:
Many data professionals and influencers refer to bad data as “Garbage in, garbage out”. After which they ironically spend a lot of time drilling into “garbage out”, poring through every little aspect of downstream data management, depicting every possible tooling and process breakdown to improve data processing, storage, and visualization. Every day there is a new article or LinkedIn post on the importance of data cleaning, data governance, data contracts, data observability, etc., all while never taking the time to dive into the source of analytics.
I am surprised and confused about how such a foundational aspect of analytics is underrated in today’s modern data landscape and is not properly addressed by data evangelists and content creators out there. Why?
Well, I’m excited to assertively discuss this today.
In this publication you will learn:
How to manage and leverage event-based data for analytics.
The best practice (my best practice, because I couldn’t find anything helpful out there) of setting up events, properties, and attributes for user activity tracking.
How to successfully maintain event-based data governance and documentation.
A guide on how not to get lost in the event data noise.
The difference between session-based and event-based analytics
Most analytics used for reporting can be broken down into session-based and event-based. Many product and data leaders don’t differentiate between these two. But there is a big difference - they measure activity differently and require different data handling.
Session-based data fuels marketing analytics
Session-based (also known as page-view) data aims to report sessions on the website including page views, exits, bounce rates, session length, and everything that is used to measure traffic on your website. It captures how the user session started (the source a user came in from) and how the session ended (bounced, converted, etc), along with session length, number of pages per visit, and more. You use session-based analytics to report on traffic, sources, campaigns, conversions, referrals, keywords, etc.
A common downside of session-based analytics is that user funnels are recorded within one session or visit. They are not accurate in capturing the actual user flows or paths, especially when your funnels are complex and consist of many steps that can be completed during separate visits.
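To make this downside concrete, here is a minimal sketch of a funnel counted per session versus across visits. The event names, the timestamps, the four-step funnel, and the 30-minute inactivity timeout are all assumptions made up for illustration, not the behavior of any specific analytics tool.

```python
from datetime import datetime, timedelta

# Assumed inactivity cutoff that ends a session (hypothetical value).
SESSION_TIMEOUT = timedelta(minutes=30)

# Hypothetical funnel and events for a single user: (timestamp, event_name).
FUNNEL = ["pricing_view", "signup_started", "signup_completed", "payment_submitted"]
events = [
    (datetime(2023, 3, 1, 9, 0), "pricing_view"),
    (datetime(2023, 3, 1, 9, 5), "signup_started"),
    (datetime(2023, 3, 2, 18, 0), "signup_completed"),  # next day, a new visit
    (datetime(2023, 3, 2, 18, 4), "payment_submitted"),
]


def split_into_sessions(events, timeout=SESSION_TIMEOUT):
    """Group events into sessions separated by gaps longer than the timeout."""
    sessions = []
    for ts, name in sorted(events):
        # Continue the current session if the gap to its last event is small.
        if sessions and ts - sessions[-1][-1][0] <= timeout:
            sessions[-1].append((ts, name))
        else:
            sessions.append([(ts, name)])
    return sessions


def funnel_steps_reached(events, funnel=FUNNEL):
    """Count how many funnel steps appear, in order, in the given events."""
    step = 0
    for _, name in sorted(events):
        if step < len(funnel) and name == funnel[step]:
            step += 1
    return step


sessions = split_into_sessions(events)
per_session = [funnel_steps_reached(s) for s in sessions]  # funnel restarts each visit
across_visits = funnel_steps_reached(events)               # funnel follows the user

print(per_session, across_visits)
```

In this made-up example the per-session view sees a user who dropped off after two steps and a second visit that never entered the funnel, while the user-level view sees all four steps completed.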
Session-based analytics is usually reported via Universal Analytics or Google Analytics. Last year I covered why Google changed UA to GA4. In a nutshell, they made a transition from session-based to event-based data tracking, because the world has evolved, and everyone wants better user engagement measurement now.
Event-based data is the foundation of product analytics
If you want a more granular view of the actions users take, you need to implement and configure event-based analytics. Events are user interactions, e.g. “upsell_view”, “button_click”, and “payment_submitted”, and can also include scrolls, hovers, toggles, and more.
Event-based analytics includes user actions and attributes. It’s more precise in capturing user flows and serves as a foundation for user behavioral reports.
Event tracking is passed from the client in a specific structure, similar to:
Event group (e.g. revenue)
Event action (e.g. purchase_completed)
Event label (e.g. item_name)
Event property (e.g. promotional)
For event-based data, you can configure as many custom dimensions and values as your system allows. It is also used for real-time analytics reporting.
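As a rough sketch, the structure above could be modeled on the client like this before it is sent. The field names mirror the list (group, action, label, property); `user_id`, `timestamp`, and the `plan` custom dimension are hypothetical additions for illustration, and real tools each define their own payload format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Event:
    group: str    # e.g. "revenue"
    action: str   # e.g. "purchase_completed"
    label: str    # e.g. "item_name"
    # Event properties plus any custom dimensions your system allows.
    properties: dict = field(default_factory=dict)
    user_id: str = ""                                 # hypothetical field
    timestamp: str = field(default_factory=_utc_now)  # hypothetical field


event = Event(
    group="revenue",
    action="purchase_completed",
    label="item_name",
    properties={"promotional": True, "plan": "annual"},  # "plan" is made up
    user_id="u_123",
)

# Serialize to JSON, the shape a client might pass to a tool or server.
payload = json.dumps(asdict(event), sort_keys=True)
print(payload)
```

Keeping the fixed fields separate from a free-form `properties` dictionary is one possible design choice: the core structure stays stable while custom dimensions can grow.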
Event-based data is commonly accessed via product analytics tools that are designed for tracking events.
Working with event-based analytics is not easy
It all starts with the source - a client.
A client is a mobile app, a tablet, a desktop computer, or a browser that captures events through pixels, tags, an API, or an SDK and passes them to your analytics tool (e.g. Mixpanel, Amplitude, Heap, Braze, Google Analytics, or Pendo) or to a server. Your ability to report data will heavily depend on the nature of this integration. You might get to the point where, for one client or platform, you have a rich set of clean user activity events that is intuitive and precise, while for another client more than 60% of events are missing, and what remains won’t make any sense. Your analyst might also spend many months cleaning data and putting together puzzle pieces that don’t fit. Read more on this in Why Most Analytics Efforts Fail.
Why is this the case?
Disconnect between product, analytics, and development
Developers can design event streams and analytics in different ways. They can configure the system to capture every piece of data movement and send this tracking payload as a massive unstructured bulk (if this is the case, it doesn’t matter what analytics tool you use, as you won’t be able to make sense of your events anyway). Or they can set the logic for it and aggregate it in a particular way, making it easy for product digital analytics tools (and the backend) to ingest and read the data.
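One way to picture the difference between an unstructured bulk and a structured payload is a small validation step on incoming events. This is only a sketch under assumptions: the required fields and the snake_case naming rule are hypothetical conventions chosen for the example, not a standard and not something any particular tool enforces.

```python
import re

# Hypothetical contract agreed between product, analytics, and development.
REQUIRED_FIELDS = {"group", "action", "user_id", "timestamp"}
SNAKE_CASE = re.compile(r"^[a-z]+(_[a-z]+)*$")


def validate_event(payload):
    """Return a list of problems; an empty list means the payload passes."""
    problems = []
    for name in sorted(REQUIRED_FIELDS - set(payload)):
        problems.append(f"missing field: {name}")
    action = payload.get("action", "")
    if action and not SNAKE_CASE.match(action):
        problems.append(f"action is not snake_case: {action}")
    return problems


structured = {
    "group": "revenue",
    "action": "purchase_completed",
    "user_id": "u_123",
    "timestamp": "2023-03-01T09:00:00Z",
}
unstructured = {"action": "Purchase Completed", "data": "raw dump"}

print(validate_event(structured))
print(validate_event(unstructured))
```

The structured payload passes with no problems, while the unstructured one is flagged for three missing fields and an action name that breaks the naming rule.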
The nature of such design will mostly depend on 2 things: