Welcome to Issue 18 of my Data Analysis Journal newsletter, where I write about data analysis, data science, and business intelligence.
✨ In today’s newsletter:
Expert Spotlight on SaaS and Product Analysis.
How to calculate churn rate and why it can be so difficult to understand.
2020 database management survey results: which databases and cloud applications companies use in 2020.
Using SQL for Machine Learning in PostgreSQL and MariaDB.
📚 Weekend Longread
Open Source Data Management just published the results of its 2020 Database Survey. Here are some interesting takeaways:
41% of buying decisions are now made by architects, giving them significant power over software adoption within a company.
To keep pace with their competition, many companies needed to upgrade or migrate their databases and software in 2020. Only 12% of respondents made no changes to their environment in the last year, compared to 28% who made changes 2-3 times over the last year.
Performance issues were the most significant problem that companies experienced, impacting 74% of respondents.
Companies that ran a combination of MongoDB and PostgreSQL showed the most significant jump, rising from 24% in 2019 to 30% in 2020, compared with companies that use MongoDB with other RDBMSs.
AWS continues to dominate the public cloud provider market, with a bump from 46% of respondents using its cloud platform in 2019 to 50% in 2020. Microsoft Azure usage jumped to 23% from 16% in 2019, but Google Cloud failed to make an impact this year with an adoption rate that remained static year-over-year at 18%.
🔥 What’s new this week
SQL for Machine Learning:
A few months ago I wrote about how ML is becoming more accessible to analysts without any ML frameworks via aito.ai, a predictive database that allows you to run ML using simple queries. This week I ran into a deep-dive explaining how you can use simple SQL queries to run predictive modeling in MariaDB through its integration with MindsDB! Looking forward to trying it out!
If you are using PostgreSQL, check out this guide explaining why foreign keys are important (they ensure your data is stored consistently) and how they also constrain the order in which data has to be written, which can limit how you modify data.
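To make the ordering problem concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example runs without a server), and the customers and subscriptions tables are hypothetical. The behavior it illustrates is the general one: a child row cannot be written before its parent, and a parent cannot be deleted while children still reference it.

```python
import sqlite3

# In-memory SQLite database in autocommit mode (stand-in for PostgreSQL).
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables: each subscription must point at an existing customer.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE subscriptions ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)


def try_sql(statement):
    """Run one statement; return True on success, False on a constraint violation."""
    try:
        conn.execute(statement)
        return True
    except sqlite3.IntegrityError:
        return False


# Wrong order: the subscription arrives before its customer exists.
child_before_parent = try_sql("INSERT INTO subscriptions (id, customer_id) VALUES (1, 42)")

# Right order: parent first, then child.
parent_inserted = try_sql("INSERT INTO customers (id, name) VALUES (42, 'Acme')")
child_after_parent = try_sql("INSERT INTO subscriptions (id, customer_id) VALUES (1, 42)")

# Modification limit: the customer cannot be deleted while a subscription references it.
parent_deleted = try_sql("DELETE FROM customers WHERE id = 42")

print(child_before_parent, parent_inserted, child_after_parent, parent_deleted)
```

The same load-order rule is what you run into when bulk-loading or cleaning up related tables: parents go in first and come out last.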
Here is a step-by-step walkthrough of how you can use SQL for feature engineering to prepare your dataset for ML in PostgreSQL. For example, you can use ALTER and UPDATE to rename columns and to merge or break down DateTime values, SELECT to subset the data, WITH to create a common table expression, etc. The possibilities are only limited by your imagination! (That’s not entirely true, but we try to be optimistic in this journal!)
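Here is a small sketch of what a few of those statements look like in practice. It is not taken from the walkthrough: the signups table and its columns are made up for illustration, and it runs on Python's built-in sqlite3 module rather than PostgreSQL so that it works anywhere. The date function is the main dialect difference (SQLite's strftime here, where PostgreSQL would typically use EXTRACT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw data: one row per sign-up.
conn.execute("CREATE TABLE signups (user_id INTEGER, signed_up_at TEXT, plan TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?, ?)",
    [
        (1, "2020-10-03 09:15:00", "free"),
        (2, "2020-11-07 18:40:00", "pro"),
        (3, "2020-11-09 11:05:00", "pro"),
    ],
)

# ALTER + UPDATE: break the datetime down into a numeric month feature.
conn.execute("ALTER TABLE signups ADD COLUMN signup_month INTEGER")
conn.execute(
    "UPDATE signups SET signup_month = CAST(strftime('%m', signed_up_at) AS INTEGER)"
)

# WITH + SELECT: subset to paying users in a CTE, then aggregate per month.
pro_signups_per_month = conn.execute(
    """
    WITH pro_users AS (
        SELECT user_id, signup_month
        FROM signups
        WHERE plan = 'pro'
    )
    SELECT signup_month, COUNT(*) AS n_users
    FROM pro_users
    GROUP BY signup_month
    ORDER BY signup_month
    """
).fetchall()

print(pro_signups_per_month)
```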
🏆 Nailed It
Be prepared for your next interview
Every SaaS company struggles with churn. Really. I haven’t met any business that has historically had a negative or zero churn rate. Even though some may say that it’s an unrealistic goal, a flawless churn rate is what we all should be aiming for. As a data analyst, it’s your job to break churn down, analyze it, estimate its impact, and propose an action plan for what your team can do to reduce it. To do that, you must have a thorough understanding of the churn rate and how to tackle it through a business or product lens.
Make sure to read this recent walkthrough on how to approach and understand churn, as well as some examples of how churn is calculated at different companies.
Here is a quick summary:
The concept is simple: churn rate is the % of your customers that leave your service over a given time period. However, breaking it down can become complex. Why?
Counting customers is complicated. You might be getting a flow of new sign-ups and cancellations every hour/day/week. So, good luck picking a robust calculation with a time frame! For any given month, you have 3 customer segments:
Renewed users in the current month (users who signed up earlier).
New users during the month.
Churned users during the month.
So, your denominator for the churn rate formula (which is the total number of customers) will be different within the same month.
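A quick sketch shows how much the choice of denominator matters. The customer counts below are invented for illustration, and the three denominators are common options rather than the ones any particular company uses.

```python
# Hypothetical month: all numbers are made up for illustration.
start_customers = 1000   # customers active on the first day of the month
new_customers = 200      # sign-ups during the month
churned_customers = 50   # cancellations during the month

end_customers = start_customers + new_customers - churned_customers


def churn_rate(churned, denominator):
    """Churn rate as a fraction: churned customers divided by the chosen customer base."""
    return churned / denominator


# Option 1: customers at the start of the month.
rate_vs_start = churn_rate(churned_customers, start_customers)

# Option 2: average of the start and end customer counts.
rate_vs_average = churn_rate(churned_customers, (start_customers + end_customers) / 2)

# Option 3: everyone who was a customer at any point in the month.
rate_vs_start_plus_new = churn_rate(churned_customers, start_customers + new_customers)

for label, rate in [
    ("start of month", rate_vs_start),
    ("average of start and end", rate_vs_average),
    ("start plus new", rate_vs_start_plus_new),
]:
    print(f"{label}: {rate:.2%}")
```

Same month, same 50 cancellations, three different churn rates. Whichever definition you pick, keep it fixed across reports so that month-over-month comparisons mean something.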
Keep in mind that new users usually churn at a higher rate than long-term customers. Therefore, if you run a big marketing campaign or are growing fast, the influx of new users will skew your overall churn rate and make it look much higher than the rate for your established customers. Don’t freak out your CEO too early!
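The effect is easy to reproduce with a toy example. The cohort sizes and per-cohort churn rates below are hypothetical. The point is only that the blended rate can jump when the mix of customers changes, even though neither cohort behaves any differently.

```python
def blended_churn(cohorts):
    """Overall churn rate for a list of (cohort_size, cohort_churn_rate) pairs."""
    total_customers = sum(size for size, _ in cohorts)
    total_churned = sum(size * rate for size, rate in cohorts)
    return total_churned / total_customers


# Hypothetical rates: long-term customers churn at 2% a month, new users at 20%.
LONG_TERM_RATE = 0.02
NEW_USER_RATE = 0.20

# A quiet month: 1,000 long-term customers and 100 new users.
quiet_month = blended_churn([(1000, LONG_TERM_RATE), (100, NEW_USER_RATE)])

# A campaign month: same long-term base, but 1,000 new users.
campaign_month = blended_churn([(1000, LONG_TERM_RATE), (1000, NEW_USER_RATE)])

print(f"quiet month:    {quiet_month:.1%}")
print(f"campaign month: {campaign_month:.1%}")
```

Reporting churn per cohort (for example, by sign-up month) alongside the blended number makes it clear when a spike comes from the customer mix rather than from the product.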
Other challenges/questions with churn you need to solve: are you defining churn as downgrades or cancellations? Or simply lack of renewals? Is your customer an individual or an enterprise? Churn impact for each will be different. Also, do seasons affect your product? If yes, have fun normalizing your annual reports!
Fixing churn is very important. Churn directly affects most of your SaaS metrics: MRR, LTV, CAC (read the SaaS Growth Metrics One-Pager to learn more about SaaS metrics and how they are connected to and dependent on churn).
🧭 Expert Spotlight on SaaS and Product Analysis
I’m so excited to launch a new section dedicated exclusively to interviews. This is where data analysis experts share their expectations for candidates, day-to-day projects, challenges, and values.
My first interview is with Wes Bush, the bestselling author of "Product-Led Growth: How To Build a Product That Sells Itself". He is the founder of the Product-Led Institute and is best known for challenging the way SaaS leaders approach growth.
Read the interview with Wes about SaaS trends, challenging metrics like churn, PQLs, the benefits of product-led growth, and product analysis.
You're going to find problems wherever you look. The trick is to find the problems worth solving right now. A brilliant product or data analyst will not just identify problems but develop a system to prioritize what needs to be solved first.
🍸 Drink and Mingle
Upcoming free events, meetups, talks, and webinars.
Nov 12, DSS: Become a Data Science Superhero with Python
Nov 12-19, Google: Online #IamRemarkable Workshop
Nov 14, WWC: Data Visualization through Python
Nov 17, Bay Area Chromatin: ML In Epigenomics
Nov 18, Anaconda: Data Exploration, Visuals, and Dashboards Using Familiar APIs
Nov 19, WiMLDS: ML For Climate Change
Nov 20, SWD: Accelerating ETL for Recommender Systems
Nov 21, WWC: Data Visualization through Tableau
🙏 Do Some Good.
It pays back.
I have a favor to ask. As my audience grows, I’d like to better understand how this journal can stay interesting and help you become a better data analyst or data scientist and progress in your career. I’d appreciate it if you could take a moment to fill out this brief survey. (And thank you to those who have already done it!)
⚙️ Try It Out
It just might solve all of your problems.
If you are looking for a SQL client for PostgreSQL or MySQL, try Arctype. It allows you to create tables, save your SQL queries, and build visualizations from your query results.
Thanks for reading, everyone. Until next Wednesday!