Across all of the teams we’ve spoken to that have dealt with data woes, “untrustworthy data” falls into 4 categories:
What do you do next?
Surely, this event did something at some point. Did it refer to an old version of the feature that has since been deprecated? Is the tracking code broken? Am I looking in the wrong place? It’s usually hard to tell why the data doesn’t exist, and the situation can be an annoying blocker to making a product decision grounded in the truth. All too often, we hear about PMs giving up on the data altogether and making a decision based on gut feeling.
In a paradigm that involves tracking code and manual instrumentation, event-tracking whack-a-mole commonly takes precedence as new features and use cases pop up, while the effort involved in cleaning data moves to the back burner.
Which one is the correct one? Are the other ones
Any inconsistency, duplication, or poor naming convention starts as a quick decision, a simple mistake, or someone saying
Imagine you believe that a call-to-action at the bottom of your app’s homepage is outperforming a similar one at the top of the page. You got this information from a report that clearly showed the bottom CTA was clicked more often, so you decide to deprecate the top one.
You never find out, but the inverse was actually true: the top CTA was more effective than the bottom one. Maybe the event names got mixed up during implementation, or maybe the tracking code on the top CTA was flawed and not every occurrence was logged.
The potential causes are many, but the scariest part is that you will probably never even know you were wrong.
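To make the CTA scenario concrete, here is a minimal sketch of how incomplete logging alone can flip a comparison. Every number and name in it (`top_cta`, `bottom_cta`, the click counts, the 60% capture rate) is hypothetical and chosen only for illustration, not taken from any real report.

```python
# Hypothetical numbers, chosen only to illustrate the failure mode.

def observed_clicks(true_clicks: int, capture_rate: float) -> int:
    """Clicks that reach the report when only a fraction of events are logged."""
    return round(true_clicks * capture_rate)

# What users actually did (never visible to the person reading the report).
true_clicks = {"top_cta": 1000, "bottom_cta": 800}

# Assumption: the top CTA's tracking code silently drops 40% of clicks,
# while the bottom CTA's tracking logs every click.
capture_rate = {"top_cta": 0.6, "bottom_cta": 1.0}

# What the report shows.
reported = {
    name: observed_clicks(true_clicks[name], capture_rate[name])
    for name in true_clicks
}

true_winner = max(true_clicks, key=true_clicks.get)
reported_winner = max(reported, key=reported.get)

print("reported clicks:", reported)
print("true winner:", true_winner)
print("reported winner:", reported_winner)
```

In this sketch the report shows the bottom CTA ahead even though the top CTA drew more real clicks, and nothing in the reported numbers hints that anything is missing.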
At this point, you might have a new set of questions that can’t be answered, and the cycle continues.