Goodbye 2022 🎄 - Issue 125
A recap of 2022 and its best publications, a little gratitude, a lot of confetti, and what’s coming next
Welcome to the last newsletter of the year 🎊 from the Data Analysis Journal, a weekly advice column about data science and product analytics.
If you’re not a paid subscriber, here’s what you missed this month:
How To Get Randomly Distributed Users in SQL - learn how to apply different sampling techniques in SQL for your analysis and get randomly, uniformly, or normally distributed users. A RANDOM() function does not always do the trick. Read about the difference between normal and uniform distribution and how to replicate it in SQL.
A/B Test Checklist - a short and sweet one-pager to remind you of all the things to know to launch and analyze an A/B test. Updated and improved.
Python For Data Science: The Difference Between Merge, Join, And Concat - if you want to combine 2 or more datasets for analysis, in SQL you would run a LEFT, RIGHT, INNER, or OUTER (or CROSS if you are creative) JOIN. In pandas, you have 3 options: merge(), join(), and concat(). Learn when and why to use each method and how to tell them apart.
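To give a taste of the sampling topic from the first item above, here is a minimal Python sketch of the difference between uniform and normal sampling. It is not the SQL from the article. The function names, the seed, and the choice of spread (one sixth of the list length) are assumptions made only for illustration, and the normal draw uses the Box-Muller transform, one common way to turn uniform random numbers into normally distributed ones.

```python
import math
import random


def uniform_sample(user_ids, k, seed=0):
    """Pick k distinct users; every user is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(user_ids, k)


def box_muller(rng):
    """Turn two uniform draws into one standard normal draw (Box-Muller)."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is always defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def normal_index_sample(user_ids, k, seed=0):
    """Pick k users (with replacement) whose positions cluster mid-list."""
    rng = random.Random(seed)
    n = len(user_ids)
    mean = (n - 1) / 2
    sd = n / 6  # assumed spread: roughly three standard deviations per side
    picks = []
    while len(picks) < k:
        i = round(mean + sd * box_muller(rng))
        if 0 <= i < n:  # discard the rare draws that land outside the list
            picks.append(user_ids[i])
    return picks
```

With a uniform sample, users from the start, middle, and end of the list show up equally often. With the normal sample, users near the middle dominate and the tails are rare, which is why a plain random function is not always enough.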
A horrible, dark, devastating year
2022 was very dark. My hometown in Ukraine was completely destroyed. My family and friends are scattered around the world. Available data on casualties is petrifying. Since February 24th, 2022, deaths in Ukraine are estimated at around 8,300 civilians, over 13,000 Ukrainian forces, and over 21,000 Russian forces. At least 440 children have been killed, 847 wounded, 239 missing, and over 7,894 were deported to Russia. These are estimated numbers, as the true death toll could be significantly higher.
Many of us share the same feeling of helplessness. While not many can commit to monetary support, I want to encourage everyone to please not buy or use Russian products or services. Refuse any donations, funding, or sponsorship from Russian organizations or affiliates. Cancel and decline any cooperation with Russian artists, writers, and scientists if they openly support Putin’s war. At this moment, please refrain from promoting Russian art, poetry, culture, and cuisine, as it is a trigger for many.
Russia is losing the war and facing multiple counterattacks at the moment. As a tactic, they target electric stations, leaving Ukraine without light, heating, and water in below-freezing winter temperatures. Despite the lack of electricity and between daily air raid sirens, Ukrainians are as strong as ever, hopeful, and united.
⏪ A rewind of 2022
Earlier this year I joined the Board of Directors of UCARE (Ukrainian Children's Aid and Relief Effort) whose mission is to support Ukrainian orphans and children deprived of parental care. Many years ago, when I lost both my parents in a car crash, I was supported and guided by UCARE as well. Today I am joining their efforts to help Ukrainian children impacted by the war. We are raising funds to purchase food, clothing, medicine, and bare necessities. Please consider supporting us - Help Ukrainian Children During Russian Attack.
As for this newsletter, 2022 was a pivotal year:
Published over 50 articles across SQL, Python, experimentation, product analytics, statistics, KPIs and metrics monitoring, visualizations, and tooling.
Reached the 100th newsletter milestone 🎉.
Partnered with DCLA, ODSC, and Women In Analytics.
Published my first guest post, How to measure cohort retention, in the #1 business newsletter on Substack, which has over 300,000 subscribers 🤯.
Introduced my first guest writer, the founder and CEO of Eppo, a next-gen A/B experimentation platform for data science and product teams, with his practical guide How To Develop a Highly Trusted Experiment Analysis Workflow.
⭐ Made it to the Top 25 Data Science and Analytics influencers and content creators by IBM and Databand.ai 🤯
Knowing how small and specialized the data analytics function is, it feels unreal to reach tens of thousands of readers, thousands of subscribers, and hundreds of paid subscribers within one year. This growth was achieved without any sponsorship or ads and with a very limited (or absent) social media presence. It would not have been possible without your sharing and word of mouth (which I am about to quantify better than Yousuf Bhaijee did himself). Thank you, thank you for reading my publications, subscribing to my newsletter, supporting me, and sharing ❤️🙏.
🚀 What’s next in 2023
This year I met many great established bloggers, analysts, founders, and VCs. When creative minds come together, great things happen. Next year expect more practical guides, solutions, and learnings shared by the best experts out there on setting an analytics foundation, ML adoption, reporting facilitation, SaaS benchmarking, growing data teams, and scaling data ecosystems.
I received many requests to break down the technical details and adoption of more metrics, similar to what I did in the guest post on how to measure cohort retention. In 2023, I aim to do the same for Churn, LTV, and maybe ARPU, as it's a dark and cruel beast itself.
I have a passion for user engagement analytics and pay a lot of attention to user behavior by identifying product personas, developing quantitative and qualitative user profile analyses, building feature usage matrices, and proving relationships between user engagement and revenue. I have never seen revenue growth with low or poor product usage. So expect many publications and guides around analyzing, monitoring, and scaling user engagement.
In 2023 I plan to share consolidated templates and maps on calculating SaaS waterfalls, analyzing onboarding funnels (and trees), measuring the test impact, and some napkin math to confirm proportion change, distribution type, and more.
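As one example of what napkin math for a proportion change can look like, here is a minimal Python sketch of a pooled two-proportion z-test. It is an illustration under assumptions, not the template mentioned above: the function name and the conversion counts in the usage lines are hypothetical, and the test assumes two independent groups that are large enough for the normal approximation to hold.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 100 of 1,000 users convert in A, 130 of 1,000 in B
z, p = two_proportion_z(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the change in proportion is unlikely to be noise alone, but it says nothing about whether the change is large enough to matter for the business.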
I do not do sponsorships or paid promotions in my newsletter.
You might see occasional “Try It Out” chapters where I share an application or a tool, but these are not sponsored in any way. I simply fell in love with the tool on my own and would like to recommend it personally. For example, you may notice that my newsletters often mention Reforge, Kaggle, and Amplitude. I also like Anaconda and Periscope Data (well, Sisense now, I guess…). But also please be forewarned that I am not a fan of Segment (or any CDP, for that matter, as I find them unnecessarily complex, expensive, and redundant) or Tableau (which I believe is causing more harm to your analytics today). I’ll almost certainly be talking about these topics more as we move onward in this next year…
💰 Invest in your growth and shape your own path
Career growth is your responsibility. Define your path and your strengths, and double down on your knowledge and the value you bring to your team. The best investment you can make is investing in your own growth. Most companies offer professional development, training, and education budgets for their employees. A subscription to the Data Analysis Journal is reimbursable, and you can expense it through your company.
Paid subscribers receive:
SQL and Python solutions, interview questions, and guidance.
Deep dives and analytical case studies.
Data analysis project examples and walkthroughs.
Examples of successful (and not so successful) A/B tests, rollout procedures, and their analysis.
My frameworks and playbooks for analysis, reporting, metric calculations, test evaluations, and forecasts.
📚 What you read matters
Over the years, I’ve created a lot of checklists, playbooks, and code snippets. My favorite articles across different domains, best frameworks, good examples, gotchas, and caveats are based on my own experience, learnings, or never-ending research into best practices and working solutions out there. I keep them available here for my subscribers to reuse, upskill, and grow.
Data analysis is a challenging field, still emerging and constantly transforming. Navigating through it is both tough and exciting. I hope this newsletter can help you find a path for professional growth, point you to the right sources and documentation, inspire you to learn and love data analysis, and simply become better at your job.
⭐ The most popular publications in 2022
SQL
Python
Product Analytics
A/B Testing
Career growth, advancement, and self-learning
Creating an analytics legacy, making your company truly data-driven, and setting the right frameworks based on the proper foundation are all challenging and exciting. It doesn’t happen overnight, and sometimes takes years to bring data into decision-making and business strategy.
In my journal, I want to create a ladder showing you how to get there. I want to encourage and inspire my fellow data analysts to enjoy genuine analytics and build a data heritage at their companies that they would be proud of.
Now, before I go skiing or contemplate the meaning of life (or data analytics, unless they’re one and the same), here’s one last thing from my favorite source - 2022 In 5 Charts. Here’s a similar one from McKinsey - 2022: The year in charts. Enjoy!
Hope you have a great end of the year ☃️. See you in January!