Data Analysis Journal

Documentation: How We Do It All Wrong - Issue 114

The never-ending battle of creating, adopting, and maintaining documentation - and why we all fail at it.

Olga Berezovsky
Oct 12, 2022

Hi everyone! I am traveling to Canada for a week, so if you are in Toronto and empowered by analytics and data, let me know! I’d love to meet my Toronto readers or get advice on the best coffee shops in town.

In other news, if you are in Austin and are into marketing and growth, you still have time to register for the CXL Live 2022 event on October 25-26, organized by the one and only Peep Laja. If you work with experimentation and don’t know the “conversion optimization champion”, you really should. He has written a lot on A/B tests and was voted the #1 most influential conversion rate optimization expert in the world in 2015.

I’m still in disbelief that I decided to write on documentation. It’s somehow both the most boring topic ever and the most vital. The way you handle documentation speaks to your leadership, experience, strategy, execution, and much more. It’s one of the secret ladders you can use to climb toward career growth and showcase your expertise, management, and influence. So bear with me today as I go through the clinically dry and flavorless foundational guides on how to create, adopt, and maintain proper documentation in analytics.

Why we can’t have nice things

To start with, I haven’t met a tool or an application in analytics yet that handles documentation well (go on, send me your SaaS!). 

At every company I was part of, analytics teams ended up managing documentation across multiple applications that overlapped to some extent, were insufficient on their own, or required duplicating the same work two or more times.

For example: 

  • We relied on Mixpanel, Amplitude, or Secoda to create and store a dictionary of analytics events. Such tools offer the advantage of having a consolidated list of analytics events related to user activity and attributes. The downside was always the need to manually maintain it and keep it up to date.

  • We used GitHub or GitLab to store SQL and Python and to document any changes or variations made to the code logic. This helped keep track of changes and versions and gave us the best framework for a review process. The downside was that such technical documentation wasn’t used by business or product analytics teams, who do most of their analysis out of data lakes and dashboards. There was always a gap between “this is the source metric definition we have” and “this is how we need the metric to report”. As a result, we often had to duplicate our SQL in spreadsheets, wikis, etc.

  • We used Jupyter Notebooks and Colab to maintain analysis, forecasting, and modeling. I love both of them for how easy and intuitive they make it to keep, retrieve, and present your analysis. They “marry” code with results, which is priceless for analytics. The downside was maintaining versions and drafts and collaborating on projects with other analysts.

  • We stored product specs and history documentation in Confluence, Asana, Notion, and ClickUp. The biggest advantage was that everyone could easily access and contribute to these. The constant downside was that these tools aren’t meant to serve as technical documentation.
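The manual upkeep of an event dictionary mentioned in the first bullet can be partly automated by comparing what is documented against what is actually being tracked. Below is a minimal sketch, assuming both lists can be exported as plain sets of event names. The function, the class, and the event names are hypothetical and are not part of any Mixpanel, Amplitude, or Secoda API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class DictionaryDrift:
    """Differences between the documented and the observed event names."""
    undocumented: List[str]  # seen in tracking, missing from the dictionary
    unused: List[str]        # documented, but never seen in tracking


def find_drift(documented: Set[str], observed: Set[str]) -> DictionaryDrift:
    """Compare the event dictionary with the events actually tracked."""
    return DictionaryDrift(
        undocumented=sorted(observed - documented),
        unused=sorted(documented - observed),
    )


# Hypothetical exports: one from the event dictionary, one from raw tracking data.
documented_events = {"signup_completed", "trial_started", "plan_upgraded"}
observed_events = {"signup_completed", "trial_started", "checkout_viewed"}

drift = find_drift(documented_events, observed_events)
print("Needs documenting:", drift.undocumented)  # ['checkout_viewed']
print("Possibly retired:", drift.unused)         # ['plan_upgraded']
```

A check like this does not write the documentation for you, but it turns "keep it up to date" into a concrete list of events to review.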

I am convinced that the challenge of maintaining and adopting documentation in analytics is the hardest one to solve because analytics is a very cross-functional discipline. Thus, it requires fitting into technical, business, and product domains at the same time.

To be fair, owning and maintaining proper documentation is a full-time job that requires skills and experience. After working in consulting, I now understand why we invested in technical writers who worked with product and customer success teams to develop the right content strategy for our clients. For any project to be successful, you have to start with 2 basic things:

  1. Hire a designated curator (“librarian”) to structure documentation, provide templates and guidelines, and help enforce them.

  2. Set up a process that generates, updates, and archives documentation.

If you can’t afford a writer, you have to double down on the process. How do you get there?  

Documentation 101

I am adopting the best documentation practices from the Divio documentation system that was first presented at PyCon Australia in 2017. To recap, there are 4 different types of documentation: tutorials, how-to guides, technical references, and explanations. Each requires its own unique mode, structure, and purpose: 

Tutorials:

Tutorials are learning-oriented: they are about learning how rather than learning that. They are meant to turn your learners into users by letting them learn by doing. The point of a tutorial is to get your learner started on their journey, not to propel them to a final destination. Tutorials need to be concrete and built around specific actions and outcomes.

How-to guides:

How-to guides take the reader through the steps required to solve a real-world problem. They are different from tutorials. A tutorial is what you decide a beginner needs to know. A how-to guide is an answer to a question that only a user with some experience could formulate. How-to guides must contain a list of steps and must focus on achieving a practical goal. 

They should not explain or discuss. If explanations are important, link to them. With guides, practical usability is more valuable than completeness.

Reference guides:

Reference guides are technical descriptions of an application or tool, along with how to use it or get started with it. They are information-oriented and code-determined: they describe key classes, functions, and APIs, cover things like fields, attributes, and methods, and set out how to use them. A reference guide may contain examples to illustrate usage, but it should not attempt to explain basic concepts or how to achieve common tasks. In reference guides, structure, tone, and format must all be consistent. Explanation, discussion, instruction, speculation, and opinion should be left out.

Explanation:

Explanations or discussions are meant to clarify and illuminate a particular topic. They expand on the topic and are understanding-oriented. This is the right documentation type for providing background and context and for discussing alternatives and opinions. They can also explain why things are set up in a given way, whether by design, by decision, or because of technical constraints.

Bringing this into analytics

In analytics there are mainly 4 types of content we work with: 

  1. Technical documentation: architecture overview, database diagrams, SQL, and Python code.

  2. Onboarding and general documentation: infrastructure overview, contact information, stakeholders overview, etc. 

  3. Project documentation: analyses, requirements, recaps, case studies. 

  4. General documentation: meeting notes, process descriptions, retrospective reports, recaps, roadmap, etc.  

The documentation should match unique team needs and meet these requirements:

  • Up to date: queries, diagrams, and write-ups should reflect the latest changes and be refreshed with every code iteration.

  • Credible: any data point should be reviewed for correctness and precision before being publicly shared.

  • Comprehensible: analysis should be understandable to readers and stakeholders.

  • Consistent: visuals, terminology, and formatting should be consistent across research and analysis.

  • Discoverable: anyone should be able to access and navigate through the documentation.

Structuring documentation will depend on the company's organization, size, and tech stack. 

Over the years, I have developed the following hierarchy, which I find the most helpful and efficient for maintaining both the documentation and the process around it. Even though I wasn’t able to adopt it at every company (again, organization, size, and tech stack affect it), I believe this is one of the most common good practices for structuring documentation in analytics:

🔑 General: general information, onboarding guides, access details, an overview of infrastructure and data sources, one-pagers, and introductions to main dashboards and data landscape documented in Confluence, Asana, or Notion. 

🗓️ Planning: all things related to the process, planning, and specifications. This contains things like backlogs, roadmaps, meeting notes, wishlists, and actual active/open projects. Often it’s a combination of Jira, Google Drive, SharePoint, and Confluence.

⚒️ Development: reference guides, technical overviews, tutorials, how-to guides, manuals, explanations, and some end-user documentation stored on GitHub or GitLab.  

📊 Finished projects: old completed projects, case studies, deep dives, A/B test recaps, and previous analyses. I find these are best documented via Jupyter Notebooks, Amplitude Notebooks, and Colab.
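One way to make a hierarchy like this easy to follow is to write it down as a small, machine-readable map that a script (or a new teammate) can query. The sketch below is only an illustration of the four categories above: the dictionary layout, the `category_for` helper, and the exact content labels are my assumptions, not an established format.

```python
from typing import Optional

# The four categories, with the tools and content types named in this post.
DOC_STRUCTURE = {
    "General": {
        "tools": ["Confluence", "Asana", "Notion"],
        "contents": ["onboarding guide", "access details",
                     "infrastructure overview", "one-pager",
                     "dashboard introduction"],
    },
    "Planning": {
        "tools": ["Jira", "Google Drive", "SharePoint", "Confluence"],
        "contents": ["backlog", "roadmap", "meeting notes",
                     "wishlist", "active project"],
    },
    "Development": {
        "tools": ["GitHub", "GitLab"],
        "contents": ["reference guide", "technical overview", "tutorial",
                     "how-to guide", "manual", "explanation"],
    },
    "Finished projects": {
        "tools": ["Jupyter Notebooks", "Amplitude Notebooks", "Colab"],
        "contents": ["case study", "deep dive", "A/B test recap",
                     "previous analysis"],
    },
}


def category_for(content_type: str) -> Optional[str]:
    """Return the category a content type belongs to, or None if unmapped."""
    wanted = content_type.strip().lower()
    for category, spec in DOC_STRUCTURE.items():
        if wanted in (item.lower() for item in spec["contents"]):
            return category
    return None


print(category_for("A/B test recap"))  # Finished projects
print(category_for("Roadmap"))         # Planning
print(category_for("press release"))   # None
```

An unmapped content type returning `None` is deliberate: it is a prompt to decide where that kind of document should live rather than letting it land wherever is convenient.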

Everyone on your team should understand and follow your documentation structure. If you are introducing this just now, start with a culture change and give incentives to analysts for supporting and maintaining documentation.  

Additionally, you have to incorporate this into your development process. For example, if you use Jira, centralize unfinished or temporary documentation, assign a ticket for it to a person in the active sprint, and set up a periodic review of the current documentation and its status.
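The periodic review can be backed by a very small script. Here is a minimal sketch, assuming each document is tracked in a hypothetical registry with an owner and a last-reviewed date. The `DocRecord` fields and the 90-day review window are assumptions made for illustration, not a recommendation from any tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class DocRecord:
    title: str
    owner: str
    last_reviewed: date


def overdue_docs(docs: Iterable[DocRecord], today: date,
                 max_age: timedelta = timedelta(days=90)) -> List[DocRecord]:
    """Return docs not reviewed within max_age, oldest review first."""
    cutoff = today - max_age
    stale = [doc for doc in docs if doc.last_reviewed < cutoff]
    return sorted(stale, key=lambda doc: doc.last_reviewed)


# Hypothetical registry entries.
registry = [
    DocRecord("Onboarding guide", "ana", date(2022, 9, 20)),
    DocRecord("Event dictionary", "li", date(2022, 6, 15)),
    DocRecord("Metric definitions", "sam", date(2022, 3, 1)),
]

for doc in overdue_docs(registry, today=date(2022, 10, 12)):
    print(f"{doc.title} (owner: {doc.owner}, last reviewed {doc.last_reviewed})")
# Metric definitions (owner: sam, last reviewed 2022-03-01)
# Event dictionary (owner: li, last reviewed 2022-06-15)
```

Each overdue entry maps naturally to one ticket assigned to the document's owner in the next sprint.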

As the company and your team grow, knowledge transfer across teams naturally becomes slower and more challenging. Inefficient or missing documentation slows down both analysis and decision-making. That’s why it’s important to invest in streamlined documentation.

Thanks for reading, everyone. Until next Wednesday!
