Building BI: Behind The Scenes | Benn Stancil
Interviewing Benn Stancil: BI landscape and emerging trends in analytics
Welcome to my Data Analytics Journal, where I write about data science and analytics.
This month, paid subscribers learned about:
Ranking The Top Used Product Features - How to locate the most frequently used product features to learn what drives the highest user engagement - steps and SQL.
How To Make A Sandwich in 587 Steps - because it’s my newsletter, so I can disagree about data platforms.
WWDC 2024 Recap: Top Announcements Impacting Analytics - What you need to know about recent key updates from Apple and how they impact mobile analytics.
When I launched my newsletter a few years ago, there weren't many bloggers writing about analytics, unlike today. Substack now lists more than 360 newsletters about analytics. However, only a few writers truly stand out, one of whom is Benn Stancil.
For many years, Benn has embraced the classic, foundational themes of data analytics and BI. For this reason, I love his blog because, truly, there is nothing more impactful or exciting than analytics.
Today marks a historic milestone for my humble newsletter - I am excited to share my interview with my favorite blogger and analyst, the co-founder of Mode, Benn Stancil! ⭐
In our conversation, we discuss the tricky nature of BI, the pitfalls of building data analytics tools, new trends in analytics, and more.
What is the most challenging aspect of building BI tools?
You have to be really disciplined about identifying who your customer is and focus on selling to them.
This is why BI is tricky: it’s easy to end up selling to 3-4 different people. Especially early on when you are looking for customers, it's very tempting to drift away from your primary persona.
For Mode, we built a tool primarily for analysts. However, analysts serve different stakeholders, and then there are also admins and management. So it was actually a tool meant to serve one group of people but consumed by another, and within those groups there are different categories of users. Although we were mostly focused on data teams rather than these other groups, you inevitably find yourself questioning how much you should build features for them, too.
If I did it again, I’d basically try to recognize from the beginning whom we are serving and be very disciplined about it: “This is the product for just this group of people.” Stick with this persona, make something that's as good for them as possible, and don't worry too much about other types of buyers.
Other successful data tools do this pretty well. RStudio and Jupyter Notebooks, for example, mostly don’t deviate from being technical tools. And tools like Mixpanel know where to draw their boundaries - e.g., primarily for product managers.
It’s hard to do this with BI, because BI tools get used by so many people, but it’s necessary.
You wrote some time ago something that stuck with me for years: "No matter how much a visualization technology can do, people will want more...Give people a line chart, and they’ll want to add another line on a second axis. Then they’ll want to turn one of the lines into a bar chart; the bar chart will need multiple series; first stacked, then grouped. If the chart is a time series, they’ll then want to group data by different intervals: by day, by week, by month. But do weeks start on Sunday or Monday? They’ll want to choose. Which time zone are dates grouped in? Can you exclude weekends? Can you treat incomplete periods differently?"
I agree, but how do you navigate all these requests and directions when building a BI tool?
This is one of the reasons BI is hard. There's no easy way out of these requests—it's just about effort.
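To make the week-start question from the quote concrete, here is a minimal Python sketch of how one small option changes what a weekly chart shows. The function names and the event dates are hypothetical and only illustrate the point; they are not taken from Mode or any other BI tool.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day: date, first_weekday: int = 0) -> date:
    """Return the first day of the week that contains `day`.

    `first_weekday` follows date.weekday() numbering: 0 = Monday, 6 = Sunday.
    """
    offset = (day.weekday() - first_weekday) % 7
    return day - timedelta(days=offset)


def count_by_week(days, first_weekday: int = 0) -> Counter:
    """Count events per week, keyed by the week's first day."""
    return Counter(week_start(d, first_weekday) for d in days)


# Hypothetical events on a Saturday, a Sunday, and a Monday.
events = [date(2024, 6, 8), date(2024, 6, 9), date(2024, 6, 10)]

monday_weeks = count_by_week(events, first_weekday=0)
sunday_weeks = count_by_week(events, first_weekday=6)
```

With weeks starting on Monday, the Saturday and Sunday events land in the same week; with weeks starting on Sunday, the Sunday and Monday events do. The same three events produce two different charts, and that is before time zones, weekends, or incomplete periods enter the picture.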
Take Twitter: they released one killer feature that worked really well, and people loved it. For BI, however, I think there is no one killer feature. People need the ability to do many things. People need more. So, the default mode for every BI company is to put effort into building a thousand features.
Looker had a clever thing with its semantic layer and LookML, and Tableau, for example, focused on visualizations. Many companies try to be clever at one thing or another, but for the most part, to become really big and last as a BI or visualization tool, you need to support many different things - integrations, alerts, different dashboard formatting, different ways to consume data, permissions, management, different ways to distribute things. It’s a very, very long list.
The trap for many companies who want to get into the BI space is the idea that we can build 20% of Power BI and cover 80% of usage, and it will be enough. Some use the 80/20 rule (you can get 80% of the experience by building 20% of the features), but the thing is, you can’t do something 20% and think it’s enough. You need to build 100% for that 80%. Otherwise, you will end up with a very frustrating product. With BI, it’s really hard to build this 80%. That’s the trap.
What’s your favorite feature that was never shipped?
Tons.
We always wanted to build an internal Quora - a place where people can ask questions, like a forum. A resource to find answers to questions, a place where conversation would live. At Mode, every report was like a snapshot in a way, and someone could ask a question, but the answer is fixed in time to the question they saw. I don’t know if this would really work, but I would love to see it.
When I was at Yammer, which was a product similar to Workplace (aka Facebook for work), this was how the data team handled answering questions, and it worked really well.
I was thinking of version control for analysts. It’s a must-have that is absent from today’s data landscape.
Yeah, I am surprised that there is no Asana or Linear for analysts yet, given how much exposure data teams have gotten over the last few years.
It’d probably have to be attached to another tool you are using, though. It’s hard to get right.
Which BI features are difficult to build?
Visualizations are hard. People always want to do new stuff with them. It’s complicated and takes effort.
Also, version control: not for code, but for who solved what, and when. For example, if I share a link to a report - what am I sharing? Is it a snapshot, when is it updated, which parameters does it use, what do we do about it, etc.
Another related thing is that there is no sense of production. In engineering, the scope you are maintaining is clear: there is a sandbox, staging, etc. Everything in data is half in production. If someone asks me a question, and I send a link in Slack to a chart - is that production? Kind of. It was at the moment I shared it. It’s production in the sense that it was sent by someone on the data team, which is a kind of implicit confirmation that it works… People want to trust BI, and trust that it’s verified. But it’s impossible to know who maintains it, or if it’s maintained at all. This would be nice, but very hard to do.
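One way to picture the questions raised here (what am I sharing, is it a snapshot, who maintains it) is as metadata attached to every shared link. The sketch below is purely illustrative: `SharedReportLink` and all of its fields are hypothetical and do not describe how Mode or any other tool works.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SharedReportLink:
    """Hypothetical record of what a shared report link actually points to."""
    report_id: str
    shared_by: str
    shared_at: datetime
    parameters: dict = field(default_factory=dict)
    snapshot_at: Optional[datetime] = None  # None means the link is live
    maintainer: Optional[str] = None        # None means nobody owns it

    def describe(self) -> str:
        """Summarize freshness and ownership for whoever opens the link."""
        if self.snapshot_at:
            freshness = f"snapshot from {self.snapshot_at:%Y-%m-%d}"
        else:
            freshness = "live, re-runs on open"
        if self.maintainer:
            owner = f"maintained by {self.maintainer}"
        else:
            owner = "not maintained"
        return f"{self.report_id}: {freshness}, {owner}"


# Hypothetical link sent in a chat message.
link = SharedReportLink(
    report_id="weekly-signups",
    shared_by="analyst@example.com",
    shared_at=datetime(2024, 6, 10, tzinfo=timezone.utc),
    parameters={"region": "EU"},
    snapshot_at=datetime(2024, 6, 10, tzinfo=timezone.utc),
)
summary = link.describe()
```

Writing the record down is the easy part. The hard part described above is keeping fields like the maintainer true over time, which is an organizational problem more than a schema problem.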
The best versions of this that I've seen are when companies have relatively few dashboards. They say, "We'll maintain these exact things, and that's it. If you need more, come talk to us."
We were trying to expand the self-serve realm - if you need more, talk to us or bring it to the mechanic.
Over 10 years ago, we had bad tools, and most of the work was done manually. Now we have this amazing tech, yet the time to value hasn’t changed. The productivity output didn’t change. Why do you think this is the case?
We've made it much easier to build dashboards, which is great, but having more data doesn't mean that we know what to do with it.
For example, take a car dashboard - no information there will make you a better driver. The same is true with BI. A few basic numbers are often more useful than data on everything. People don’t know what to do with that data.
This is roughly what we have done - move away from thinking about how to be a better driver to thinking about how to put more metrics on the dashboard.
What new directions in BI tooling do you see today? Do you agree with or like new trends?
Different things will catch on for sure. Ten years ago, when notebooks became popular, it was nice. Same with Tableau. GenAI could make tools easier to work with, too. For example, the Tableau interface is hard. There is a learning curve to using any BI tool. GenAI could be a way to put a conversational interface on top of BI. Even if it’s not AI agents solving all our problems, it could be a more accessible interface on top of BI.
If I have to pick what would really change things - and I don’t think I’m optimistic about it ever existing - it’d be someone making Excel work at scale. There are some attempts, like Sigma, though these are more spreadsheet interfaces on top of databases. But that’s not really Excel because charts and reports are a representation of data that lives in a database. The genius of Excel is that it’s both - the representation and data itself. It’s a file, which is a very clean concept to understand and not just a view.
Like Equals?
Equals is interesting. They started as a Google Sheets competitor. I don’t know if that quite replaces it, but for sure it gets closer. But the direction they have gone is more BI. Now they offer dashboards and other things. I am afraid of it becoming a BI tool with a spreadsheet backend instead of a spreadsheet tool.
It’s similar to Hex. It’s a very good notebook. And initially, they were focused on being a notebook tool. But now, they position themselves more as BI, and the notebook is the backend for the BI tool. It’s notebook-based BI, rather than a notebook. This is the drift that makes BI so hard - it’s easy to evolve from building a great, specialized tool into building a much more general BI tool with a twist.
Is there anything else you want to share to encourage or inspire people to learn data?
Don’t do it for the data alone. Do it for the problem you want to solve.
It’s easy to say here is a list of technical things I want to learn, but my question would be - do you want to learn technical things or do the work those technical things are meant to help you do?
The times I had the most fun were when I worked on a problem I wanted to solve and got to the bottom of it. That is how you learn: by solving these problems.
Thank you, Benn!
Although I am currently an AI engineer, I have a background in data engineering. I am very familiar with metric tracing, metadata tracing, and data lineage. These technologies are crucial because version control fundamentally relies on these tracing techniques. Data engineering often involves the tedious task of continuously cleaning data. Because of project timelines, fixes for data quality issues are often delayed. We then keep improving data quality based on the data processing workflows, creating a cycle that repeats throughout the year.