Why Heap Isn't Built on a Manual Tracking Foundation

November 18, 2021 · 7 min read

Manual tracking can’t keep up with the modern digital experience

Today every product, design, and engineering team is acutely aware of the importance of capturing behavioral data to inform product decisions. Product analytics was invented to help these teams understand their users. In the golden age of the digital economy, all businesses acknowledge that if you don’t understand your users, you won’t be able to build a market-leading product.

The traditional data-capture method used by a majority of the current product analytics providers involves inserting code for every event that you want to track, and then waiting hours, days, or weeks for data to roll in. Once seen as best practice, this tracking method can no longer scale with the increasing complexity of modern digital experiences.

At Heap our focus has always been on building the most robust digital insights platform for agile teams. We know that traditional manual tracking might deliver baseline KPIs. But given the explosion in big data and increasing complexity of digital experiences, if you’re still tracking events by hand, you’ll never be able to move fast enough to capture a complete picture of user behavior. Incomplete data leads to inaccurate insights, which ultimately result in ineffective actions. At Heap that ruled out any strategy that would force teams to rely on old manual tracking methods.

Instead, in 2013 we built Heap from the ground up with a different approach: automatically capture all the data. This saves our customers the headache of defining events upfront, maintaining brittle tracking code, and waiting for data to accumulate. This approach is the only way to obtain a complete set of user data, and offers the best of both worlds: you can still manually track anything you want, but you can also trust that we’ll automatically capture every single event in your product or site.

Why do we believe this is a better approach? A few reasons.

The first is that insight-finding is an iterative experience requiring multiple questions to uncover the truth. If each of those steps requires weeks, you’ll never unlock the insights you need in time to keep up with the competition or market trends. In a manual tracking environment, if you don’t have the data you need, you’ll need weeks (at best) to get it.

The second is that insights are all about uncovering surprises, and these surprises are always in the places you aren’t currently looking. The problem with manual tracking is that it encourages you to track and analyze only the “happy path” you hope that users will take. What happens outside that happy path? You’ll never know.

To enable these possibilities, we not only automatically capture data; we also keep that data organized in a way that makes it accessible and usable for even non-technical users. The Heap Data Foundation enables teams to surface these surprises, and Heap Illuminate delivers proactive data-driven insights from this complete data set. Heap is the only digital insights platform that allows product teams to discover the insights they didn’t know they needed.

Unfortunately, when new methods and technologies are brought to market, they meet resistance from the existing providers and users of the old practices. I want to touch on a few of the “Myths of Autocapture” that new customers often ask us about as they switch from manual tracking and legacy analytics providers.

Isn’t manual tracking better than implicit tracking via Autocapture?

"Autocapture vs. precision tracking" is a false choice.

Heap supports both. Legacy platforms (such as Amplitude) offer only one: manual tracking. A typical manual implementation requires instrumenting 2,500 events on average and 6 months of onboarding before a team starts generating product insights.

It's a question of the right tool for the job, and you want both. Autocapture helps our customers automate a very large portion of their analytics implementation, resulting in better data, smarter decisions, and more agile, data-driven product development.

That said, we know that each business is unique, and Autocapture may not cover every distinct data point our customers want to analyze. We typically advise that teams manually capture 5-10 core KPIs with our Track API (actions like “Payment Processed” or “Sign Up Submitted”, for example).

The events that should be manually tracked in Heap typically have some common qualities. They tend to be mission-critical, not automatically captured by a client-side user interaction, and/or unlikely to change much over time. Because your core KPIs don’t change very much, limiting manual tracking to just these means it will be less brittle and require far less maintenance than a typical 100% manual implementation.

Because Heap captures all our event data, won’t that be difficult to govern and manage?

Autocapture doesn't "flood" you with data. Heap’s automatically-captured dataset is stored in the background until you decide you want to define an event and analyze it.

As the leader in autotracking, Heap architected the platform to solve this exact problem. This process starts with Virtual Events. All the raw data that Autocapture collects sits quietly under the surface, out of view, waiting to be called into action. Once you need a data point, you create a Virtual Event, which is just a label that points to the underlying data stream.

Since we’ve been collecting data in the background all along, the Virtual Event that you create is retroactive to the moment you installed Heap. In this environment, changes to the front end of your site don’t produce data gaps or broken events, and data stays clean and consistent, no matter how big or extensive your dataset.
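To make the idea of a retroactive label concrete, here is a minimal sketch. It is not Heap's implementation: the `RawEvent` and `VirtualEvent` shapes, the `query` helper, and the selector strings are all hypothetical, chosen only to show how a definition created today can be applied to data captured earlier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RawEvent:
    """One automatically captured interaction (hypothetical shape)."""
    timestamp: int  # seconds since install, for illustration only
    kind: str       # e.g. "click" or "pageview"
    selector: str   # CSS-like target of the interaction


@dataclass(frozen=True)
class VirtualEvent:
    """A named label over the raw stream; it stores no data of its own."""
    name: str
    matches: Callable[[RawEvent], bool]


def query(raw_log: list[RawEvent], event: VirtualEvent) -> list[RawEvent]:
    """Apply the label to the whole log, including events captured
    before the label was defined."""
    return [e for e in raw_log if event.matches(e)]


# Raw data that was collected before anyone defined an event.
raw_log = [
    RawEvent(10, "click", "button#signup"),
    RawEvent(20, "pageview", "/pricing"),
    RawEvent(30, "click", "button#signup"),
]

# The label is created "today", long after the clicks above happened.
signup_click = VirtualEvent(
    name="Sign Up Clicked",
    matches=lambda e: e.kind == "click" and e.selector == "button#signup",
)

history = query(raw_log, signup_click)
print(len(history))  # 2
```

Because the label is only a predicate over stored raw events, defining it later changes nothing about what was captured; it only changes what you can ask of the data.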

In other tools, events are typically managed via a “tracking plan”: a spreadsheet managed outside of the platform. If you’ve ever tried to manage a spreadsheet at scale, you know how easily things can break. By building governance tools into the platform, we keep all of your automatically captured data clean and consistent from the moment you define your events. As your events change, the data structure automatically adapts. While other tools offer you precision at the expense of your freedom, Heap’s complete and self-governing data foundation gives you both.

Won’t Autocapture break and create more work for our team?

Actually, manual tracking breaks a lot! When a manually tracked event breaks, and they do break, the data it missed can't be recovered. There is a permanent gap in the dataset. Every breakdown means more gaps in your data, and more uncertainty in your insights.

When a virtual event based on Autocapture breaks, it's obvious and easy to repair. (We even alert you.) With Heap, if an event goes stale, we still capture data from the new version, and we proactively prompt you to repair the event, then guide you through the process.

The result: no missing data in your customer journey. This is another benefit of an automatically captured dataset: you can always update it retroactively. With a manual tracking approach from legacy tools, a broken event means missing data, so you'll need to push new code to start picking up the new event. This means lost engineering time and a permanent gap in your dataset.
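The difference between repairing a definition and repairing tracking code can be sketched in a few lines. This is an illustration under assumed data, not a description of Heap's internals: the click records, the selector names, and the moment the fix ships (t=40) are all made up for the example.

```python
# Raw autocaptured clicks. A redesign at t=30 renamed the button id.
raw_clicks = [
    {"t": 10, "selector": "button#signup"},
    {"t": 20, "selector": "button#signup"},
    {"t": 30, "selector": "button#register"},
    {"t": 40, "selector": "button#register"},
]


def count(selectors: set[str]) -> int:
    """Count raw clicks whose target is covered by the event definition."""
    return sum(1 for c in raw_clicks if c["selector"] in selectors)


# The original definition goes stale after the redesign...
stale_definition = {"button#signup"}
# ...and is repaired by widening it. No new capture code is needed,
# because the raw clicks on the new button were stored all along.
repaired_definition = {"button#signup", "button#register"}

print(count(stale_definition))     # 2
print(count(repaired_definition))  # 4

# Manual tracking, for contrast: the hard-coded call only fired for the
# old id, and the fixed tracking code is assumed to ship at t=40.
manual_log = [
    c for c in raw_clicks
    if c["selector"] == "button#signup" or c["t"] >= 40
]
print(len(manual_log))  # 3
```

In the manual case the click at t=30 is simply absent, and no later fix can bring it back.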

We heard that Autocapture might accidentally capture sensitive data?

Security and compliance are the most critical parts of any analytics platform, and they are our #1 focus. For this reason, we want to be very clear about the intentional limits of Autocapture and the precautions we take to avoid putting our customers and their users at risk.

Out of the box, Heap only tracks behaviors that happen, and nothing about the users that perform those interactions, or anything sensitive about the behaviors themselves (remember — this is where our APIs and Snapshots come in).

For example, if a user enters their Social Security Number into a field within your app, Heap will automatically track that something was typed into the field, but will NOT capture the value. It is possible to send relevant, non-sensitive user-level data into Heap, but this is a proactive, deliberate process that is managed by the admin user on the account, not something that we attempt to do automatically.
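One simple way to picture this behavior is an allowlist: the capture step copies only the fields it is meant to keep, so a typed value never enters the event at all. The sketch below is an assumption-laden illustration, not Heap's code; the `CAPTURED_KEYS` tuple, the `autocapture` function, and the interaction shape are hypothetical.

```python
# Hypothetical allowlist of fields that are safe to record.
CAPTURED_KEYS = ("kind", "target")


def autocapture(interaction: dict) -> dict:
    """Record that an interaction happened, copying only allowlisted
    fields. Anything else, including the typed value, is dropped."""
    return {key: interaction[key] for key in CAPTURED_KEYS}


interaction = {
    "kind": "change",
    "target": "input#ssn",
    "value": "123-45-6789",  # present in the browser, never copied
}

event = autocapture(interaction)
print(event)  # {'kind': 'change', 'target': 'input#ssn'}
```

An allowlist fails safe: a new, unexpected field is excluded by default rather than captured by accident.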

Part of our unrelenting effort to protect the data of our customers and their users has been making sure to meet industry privacy and security standards. This includes SOC 2 Types 1 and 2 compliance, as well as GDPR compliance, among others.

Any analytics product with APIs is prone to accidentally being sent sensitive information by the user. To mitigate this, we have automatic PII detection built into the product that is designed to recognize when sensitive data does end up in Heap so we can stop and eliminate any potential issues before they blow up. Additionally, we have a user deletion API that makes it simple for our users to handle user deletion requests under GDPR.
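As a rough idea of what detection at ingestion can look like, the sketch below scans event properties for values that resemble a US Social Security Number or an email address. Pattern matching is a stand-in technique chosen for illustration; this post does not describe how Heap's detection works, and the patterns, the `flag_sensitive` function, and the property names are all assumptions.

```python
import re

# Deliberately simple patterns; real detection would need to be broader.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def flag_sensitive(properties: dict[str, str]) -> list[str]:
    """Return the names of properties whose values look like PII."""
    return sorted(
        name
        for name, value in properties.items()
        if SSN_PATTERN.search(value) or EMAIL_PATTERN.search(value)
    )


# Properties a customer might send by mistake through a tracking API.
sent = {
    "plan": "pro",
    "note": "ssn 123-45-6789",
    "contact": "a@b.com",
}

print(flag_sensitive(sent))  # ['contact', 'note']
```

Flagged properties could then be held back for review or deleted before they are stored.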

Being selective and intentional about the data we ingest has been a priority since day one. Heap was built from the ground up with both Autocapture and privacy top-of-mind, which is the only way to make both work concurrently. This is especially hard to get right if Autocapture is created as an add-on to a traditional analytics platform. When this happens, disaster can strike.

It all begins with better data capture

As you identify and assess your analytics needs for the next few years, we invite you to consider solutions that were designed for the agile nature and scale of today’s digital experiences.

To grow your product and accelerate business growth in the golden age of the digital economy, product and go-to-market teams need to continuously discover friction and growth opportunities across their user journey, effectively and quickly, even in the things they didn’t decide to track upfront. They need faster and better insights, so they can properly assess and quantify their roadmap investments and their impact on moving the needle.