Analysis for Optimal Cadence and Frequency - Issue 276
Bringing data science into marketing analytics: How to locate the right frequency for upsells, ad impressions, or notifications
You’ve probably noticed that I write mostly about analytics for product-related initiatives and rarely touch on marketing in my newsletter. There are many reasons for that, starting with the fact that accurate attribution tracking doesn’t exist, and ending with the fact that no marketer has ever grown a product to 100M MAU.
In my experience, no matter how much effort you put into improving attribution measurement, you still end up with Meta over-reporting trials by at least 3x, and marketing channels somehow reporting more signups in November than you actually received in all of Q4.
It’s ironic - product analysts work with low-trust data, while marketing analysts have to work with blatantly over-reported data from every source. Literally, you have to divide every metric by 7 just to get it somewhere close to reality. No, thank you.
Anyway. Even though I don’t enjoy marketing analytics and advocate for growth and pricing to be owned by product teams, we should still have the skills to run these analyses and support the poor marketing folks as best we can. That means knowing the frameworks and types of analyses around ad ROI, email cadence, push notifications, payment upsells, and more.
So today, I want to resurface my long-overlooked framework for using ML to determine the right frequency of upsells, ad impressions, and different types of app notifications: how to find the frequency that drives the highest DAU, conversion, or sales without harming (or while only acceptably impacting) user engagement, unsubscribes, or churn.
Regression vs. Classification: choosing the right ML
Some time ago, I shared a guide to help you decide which ML model to pick to solve a problem. I pointed out that regression is the most common ML we use to answer many product and BI questions, including LTV predictions, revenue forecasts, estimating the number of new page views needed to improve signups, how many notifications increase DAU, and so on.
However, linear regression isn’t the right method for finding a threshold that balances multiple output metrics.
Regression is the right ML for understanding the relationship between a metric and user actions (and how much a change in one variable affects another), but it is not the right approach for finding the right threshold or cadence. Weighing one variable (e.g. emails, notifications, ads) against 2 or more output metrics (like DAU, activations, CVRs, unsubscribes, churn, etc.) is not a regression task. The best method here is to combine (a) classification models and (b) testing.
Before we dive deep into that, a quick reminder on regression:
Use a simple regression model with a single independent variable:
screen view → activation
campaign impressions → trials
Use a multiple regression model with 2+ independent variables:
screen view + age → activation
upsell click + blog opt-in + referred → churn
Remember, multiple regression is designed for multiple input variables, not multiple output metrics. So, you can’t apply it to estimate clicks and unsubscribes at the same time. That’s why classification models fit this task better than regression.
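To make the reminder concrete, here is a minimal Python sketch of both setups. Everything in it is an assumption for illustration: the toy numbers are made up, and fit_ols is a hypothetical helper (plain least squares via the normal equations), not a library function. The point to notice is that both fits take one or more input columns but only ever one output column.

```python
def fit_ols(rows, y):
    """Least-squares fit; returns [intercept, coef_1, ..., coef_k].

    rows: list of tuples of input variables (one tuple per observation)
    y:    list with the single output metric
    """
    X = [[1.0, *r] for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented normal-equations matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical toy data, one row per user cohort.
views = [1, 2, 3, 4, 5, 6]                   # screen views
age = [25, 40, 31, 52, 28, 45]               # average age
activation = [0.8, 1.9, 1.7, 3.1, 2.2, 3.3]  # the single output metric

# Simple regression: screen view -> activation
simple = fit_ols([(v,) for v in views], activation)

# Multiple regression: screen view + age -> activation
multiple = fit_ols(list(zip(views, age)), activation)

print("simple  :", [round(b, 3) for b in simple])
print("multiple:", [round(b, 3) for b in multiple])
```

There is no way to hand this kind of fit a second output such as unsubscribes and have it trade the two off. You would have to fit a separate model per metric and reconcile them yourself.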
Methods and steps to estimate the right threshold
Below, I share 2 approaches to locating the right frequency or cadence:
A quick “back of the napkin” analysis, for when you only have a few hours to produce an estimate.
A multi-label classification model, for when you (hopefully) have at least a week to classify users and build a model that predicts how likely users are to click on the upsell or ad without opting out or churning.
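As a rough illustration of the framing behind both approaches, the Python sketch below tabulates two labels per user (clicked and opted out) by frequency, then picks the frequency with the highest click rate among those that stay under an opt-out tolerance. This is only one possible shape for such an estimate, not necessarily the exact analysis used in this issue. The data, the 25% tolerance, and the helper names rates_by_frequency and best_frequency are all hypothetical. A real multi-label model would replace the raw per-frequency rates with predicted probabilities.

```python
from collections import defaultdict

# Hypothetical user-level log: (notifications_per_week, clicked, opted_out).
users = [
    (1, 0, 0), (1, 1, 0), (1, 0, 0), (1, 0, 0),
    (3, 1, 0), (3, 1, 0), (3, 0, 0), (3, 1, 0),
    (5, 1, 0), (5, 1, 1), (5, 1, 0), (5, 1, 0),
    (7, 1, 1), (7, 1, 1), (7, 0, 1), (7, 1, 0),
]


def rates_by_frequency(rows):
    """Return {frequency: (click_rate, opt_out_rate)} from labeled rows."""
    counts = defaultdict(lambda: [0, 0, 0])  # users, clicks, opt-outs
    for freq, clicked, opted_out in rows:
        bucket = counts[freq]
        bucket[0] += 1
        bucket[1] += clicked
        bucket[2] += opted_out
    return {
        freq: (clicks / n, opt_outs / n)
        for freq, (n, clicks, opt_outs) in sorted(counts.items())
    }


def best_frequency(rates, max_opt_out):
    """Highest click rate among frequencies within the opt-out tolerance."""
    allowed = {
        freq: click_rate
        for freq, (click_rate, opt_out_rate) in rates.items()
        if opt_out_rate <= max_opt_out
    }
    return max(allowed, key=allowed.get) if allowed else None


rates = rates_by_frequency(users)
choice = best_frequency(rates, max_opt_out=0.25)  # assumed tolerance

for freq, (click_rate, opt_out_rate) in rates.items():
    print(f"{freq}/week: click={click_rate:.2f} opt-out={opt_out_rate:.2f}")
print("chosen frequency:", choice)
```

Tightening max_opt_out moves the choice toward a lower frequency, which is the trade-off between the upside metric and the harm metric that a single regression cannot express.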



