Introduction To Analytics Engineering
How analytics engineering is changing the data landscape and what you need to know to keep up with the industry - a guest post by Madison Schott.
Good morning and welcome to another edition of the Data Analysis Journal newsletter, where I write about data science and product analytics. If you’re not a paid subscriber, here’s what you missed this month:
Playbook For Launching, Monitoring, and Analyzing A/B Tests - my step-by-step process of analytical support for the full A/B test lifecycle.
How To Locate The Right Frequency Of Push Notifications - methods and analysis for finding the right threshold of notification frequency (or email cadence, ad impressions, payment upsells, etc.) so that it drives the highest DAU without harming user engagement.
Getting Date Functions in SQL - a SQL guide on date and time formatting, providing pointers on which Date Time function to use for which question or case.
Today’s newsletter is a special one. I am excited to invite my second guest writer, a blogger, an analyst, an engineer, and a health and wellness enthusiast - Madison Schott 🥰.
I am way overdue with my coverage of analytics engineering. I keep receiving questions on what dbt is, why analytics engineering has become so popular so quickly, and how we can expect it to continue to change the industry. Should analysts re-qualify now and become data engineers to keep up with the industry demands?
There’s a lot to unpack here. There is no one better to introduce analytics engineering than Madison Schott, the author of the Learn Analytics Engineering newsletter.
I am very grateful Madison found the time to write a guest piece for us while staying busy with getting married and celebrating life!
Madison Schott is an analytics engineer at ConvertKit and the technical writer behind the Learn Analytics Engineering newsletter, where she writes about transitioning from a non-traditional tech background to a data role, working with dbt and SQL, and best practices. She is also the author of the ebook "The ABCs of Analytics Engineering", which I highly recommend for anyone looking to dive deeper into the field and learn the skills necessary to transition to this role. When she's not writing about data, you can find her cooking up a meal with ingredients from a farmer's market or on a hike in the sunshine.
What is analytics engineering, how it is transforming data analysis and data engineering, and why it is important
Data analysts work to serve marketing, growth, product, and financial teams by providing them with reports and dashboards highlighting KPIs and other various metrics that help drive business decisions. Data engineers work to capture data from different applications and external sources and deliver that to the data analysts to use. But there is a major gap in these two roles that nobody seems to be talking about.
I’ve experienced this gap firsthand as an analytics engineer. Data engineers capture the data they think should be captured, never consulting with business teams on what data is most important to them. Data analysts then have to piece together the data they do have, producing a report that may not paint the entire picture. Oftentimes, these two roles don’t know how to properly communicate with one another about what they need due to a lack of technical or business knowledge. This is where the analytics engineer steps in.
What is analytics engineering?
Analytics engineers sit in between the data engineer and the data analyst. They have both technical skills and an understanding of how the business functions. Their primary skills include:
Data modeling
SQL
Technical documentation
Data pipelining
Testing
Much of what analytics engineers do revolves around producing high-quality data.
Many analytics engineers use a data transformation tool called dbt. While dbt isn’t a defining characteristic of an analytics engineer, the company that created the tool is actually the one that created this new role. Most analytics engineers use dbt because of the best practices it instills when data modeling. It allows them to write modular, fully tested, fully documented, and advanced data models.
dbt compiles SQL code so that the same piece of code doesn’t have to be written over and over again and can instead be referenced in multiple data models. It also uses a templating language called Jinja, which lets your SQL reference other models and sources, while dbt itself handles the connection to your data warehouse, reading from raw data and writing to your development and production databases. dbt is also powerful because of its macros, which are essentially Jinja functions that can be used throughout your SQL code.
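To make the idea of "compiling" templated SQL a little more concrete, here is a toy sketch in plain Python. It is not how dbt or Jinja are actually implemented; it only imitates the idea with a regular expression. The table names, the `MODEL_TABLES` mapping, and the `cents_to_dollars` helper are all made up for illustration.

```python
import re

# Hypothetical mapping from model names to warehouse tables (assumed for this sketch).
MODEL_TABLES = {
    "stg_orders": "analytics.staging.stg_orders",
}

def ref(model_name: str) -> str:
    """Resolve a model name to the table it is stored in."""
    return MODEL_TABLES[model_name]

def cents_to_dollars(column: str) -> str:
    """A macro-like helper: returns a SQL snippet that can be reused in many models."""
    return f"round({column} / 100.0, 2)"

# Functions that the template is allowed to call.
FUNCTIONS = {"ref": ref, "cents_to_dollars": cents_to_dollars}

# Matches placeholders of the form {{ name('argument') }}.
CALL_PATTERN = re.compile(r"\{\{\s*(\w+)\('([^']*)'\)\s*\}\}")

def compile_sql(template: str) -> str:
    """Replace every placeholder with the text returned by the named function."""
    def render(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        return FUNCTIONS[name](argument)
    return CALL_PATTERN.sub(render, template)

MODEL_SQL = """
select
    o.order_id,
    {{ cents_to_dollars('o.amount_cents') }} as amount_dollars
from {{ ref('stg_orders') }} as o
"""

compiled = compile_sql(MODEL_SQL)
print(compiled)
```

The point of the sketch is that the conversion logic lives in one place: if the rounding rule changes, you edit `cents_to_dollars` once and every model that calls it picks up the change the next time it is compiled.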
How analytics engineers are different from data engineers
The difference between analytics engineers and data engineers is still a nuanced topic. Data engineering has become such a broad term nowadays with all of the different languages, tools, and skills grouped under it. It’s becoming harder and harder to narrow down job descriptions and find what you really need as both an employer and a job candidate.
We have slowly seen data engineering branch into different types: DevOps or site reliability engineering, machine learning engineering, and now analytics engineering. This is because the responsibilities and skills are different! They deserve to have their own title to make it easier for everyone in the job market.
Analytics engineers still have a lot of responsibilities similar to those of data engineers. They are in charge of moving data from external sources to a single source of truth, or the data warehouse. They own ingestion, transformation, and orchestration. The big difference between analytics engineers and data engineers is really that transformation component. Analytics engineers aren’t just capturing and moving data around, but understanding it on a deeper level.
The transformation component done by analytics engineers requires an understanding of business processes and what that data should look like. Data engineers typically don’t need to understand the data itself so much as its metadata. Analytics engineers can really bridge the gap between the more technical processes and the characteristics of the specific data being collected.
How analytics engineers help data analysts
I like to say that the data analyst is the analytics engineer’s stakeholder. Everything the analytics engineer produces is essentially for the data analyst to then use in their reports and dashboards. Data analysts serve the business and analytics engineers serve them.
Data analysts without an analytics engineer on their team may be writing data transformations, all of which live directly in their BI tool. This not only slows dashboards down but also slows down the whole analytics process. Analysts are forced to standardize data every time they want to write something new and repeat code that they’ve already written.
When an analytics engineer joins the team, their job is to transform the raw data into data models that can then be directly used by the data analyst. Column standardization, joins, and any other necessary transformations are done within these data models instead of the BI layer. This ensures key datasets are always available to be used and contain the highest quality data that has already been validated.
Now, data analysts can focus on using these datasets to simplify the BI layer and produce the KPIs needed by the business. Dashboards and reports can be built more quickly and trusted to be accurate.
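As a rough illustration of moving transformations out of the BI layer, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a data warehouse. The table names, columns, and sample rows are invented for the example, and a real project would define the model in a tool like dbt rather than in a script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Raw data as it might arrive from a source system (made-up sample rows).
create table raw_customers (id integer, email text, country text);
create table raw_orders (id integer, customer_id integer, amount_cents integer);
insert into raw_customers values (1, ' Ana@Example.com ', 'us'), (2, 'bo@example.com', 'US');
insert into raw_orders values (10, 1, 2500), (11, 1, 1500), (12, 2, 1000);

-- The data model: standardization and joins happen once, here.
create view customer_orders as
select
    c.id as customer_id,
    lower(trim(c.email)) as email,
    upper(c.country) as country,
    count(o.id) as order_count,
    sum(o.amount_cents) / 100.0 as total_spent
from raw_customers as c
left join raw_orders as o on o.customer_id = c.id
group by c.id, c.email, c.country;
""")

# The analyst's query stays simple because the cleanup already happened upstream.
rows = conn.execute(
    "select customer_id, email, country, order_count, total_spent "
    "from customer_orders order by customer_id"
).fetchall()

for row in rows:
    print(row)
```

Every dashboard that reads from `customer_orders` gets the same cleaned emails and country codes, so nobody has to repeat the trimming and joining inside the BI tool.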
Why it’s important
Data quality is a bigger issue now than ever. We have so much data available at our fingertips, but how do we know it's accurate? What’s the point of using data to drive important decisions like where to spend money on advertising or how to increase customer retention if it’s wrong?
Analytics engineers make producing high-quality data their top priority. By bridging the gap between data engineers and data analysts, they make it easier to understand discrepancies between how data is being captured and what is actually needed. Analytics engineers draw attention to the issues they see in how data is being moved and to how it can be improved.
By standardizing data close to the source using a tool like dbt, analytics engineers are ensuring everyone within the company is using clean and accurate data. When transformations occur at the source, rather than within the BI tool, mistakes are minimized and quality issues are caught before that data has a chance to make it to the BI tool.
The same goes for the complex data transformations that analytics engineers write directly within dbt. These can be validated while they are being written, then properly tested, before being deemed reliable and ready to use to make business decisions. Then, they are orchestrated to run on a cadence so that these datasets are always highly available when they are needed, speeding up the entire analytics process!
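To give a feel for what testing a data model can look like, here is a small sketch, again using `sqlite3` as a stand-in warehouse. The two checks are loosely modeled on the `not_null` and `unique` tests that dbt provides, but the `failing_rows` helper, the table, and the sample rows are assumptions made for this example, not dbt's own code.

```python
import sqlite3

def failing_rows(conn: sqlite3.Connection, table: str, column: str, check: str) -> int:
    """Return how many rows (or duplicated values) violate the given check."""
    if check == "not_null":
        sql = f"select count(*) from {table} where {column} is null"
    elif check == "unique":
        # Counts the distinct values that appear more than once.
        sql = (
            f"select count(*) from ("
            f"select {column} from {table} "
            f"where {column} is not null "
            f"group by {column} having count(*) > 1)"
        )
    else:
        raise ValueError(f"unknown check: {check}")
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Made-up sample data with one missing email and one duplicated id.
create table customer_orders (customer_id integer, email text);
insert into customer_orders values
    (1, 'ana@example.com'),
    (2, null),
    (2, 'bo@example.com');
""")

results = {
    (column, check): failing_rows(conn, "customer_orders", column, check)
    for column in ("customer_id", "email")
    for check in ("not_null", "unique")
}

for (column, check), failures in results.items():
    status = "pass" if failures == 0 else f"FAIL ({failures})"
    print(f"{column} {check}: {status}")
```

Running checks like these on a schedule, before a dataset is exposed to the BI tool, is one way the quality issues described above get caught early instead of showing up in a dashboard.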
If you want to become an analytics engineer…
Hopefully, you’ve gained a better understanding of what analytics engineering is, how it relates to the other data roles, and why it is more important now than ever. I’m a firm believer that analytics engineers are here to stay because of the value they provide for data teams adopting a modern data stack. They help to bridge the gap between data analysts and data engineers, focusing on producing high-quality data for business teams.
If you think analytics engineering might be the right career for you, I’m here to support you in your journey! Brush up on your SQL skills, follow some dbt tutorials, and continue reading newsletters like this one. The first step is to learn as much as possible; the second is to put that knowledge to work.
Thanks, Madison!
Find and connect with Madison on LinkedIn and Twitter.
Thanks for reading, everyone. Until next Wednesday!