Optimizing Postgres Queries at Scale

Matt Dupree
December 21, 2021 · 4 min read

Heap is a product analytics tool that automatically captures web and mobile behavior like page views, clicks, and taps. We're operating at a scale of billions of events per day, which we store across a distributed Postgres cluster.

Heap's thousands of customers can build queries in the Heap UI to answer almost any question about how users use their product. Optimizing all of these queries across all our customers presents challenges you wouldn't typically encounter when tuning a small set of queries within a single app.

This post explains why this scale requires us to run performance experiments to optimize our SQL, and it details how we conduct those experiments.

Why it’s hard to optimize Heap queries

We want the querying experience within Heap to feel interactive, so we need most queries to return in just a few seconds. There are multiple reasons why this is particularly difficult to do with Heap queries, but the most important one is that Heap's query builder allows customers to issue a wide variety of queries:

[Screenshot: Heap's query builder]

As you can see, we support filters, group bys, and comparisons across different groups of users and time periods. Using these features may result in only a small change to the SQL, but that small change can have a big impact on performance.

Here's a recent example of this.

One of our customers created a query that typically hits one of the millions of indexes we have across our cluster. More specifically, the query resulted in an index-only scan, a special kind of (speedy!) operation in which the database can satisfy a query with just the information that’s in the index, without having to read the table’s actual rows.

Then the user made a slight modification to their query by changing a simple value in a dropdown:

[Screenshot: changing a value in a query builder dropdown]

This change resulted in SQL that could not use an index-only scan, which made the query take much longer to execute.
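To make the distinction concrete, here's a minimal sketch of the effect against a hypothetical events table with an index on (user_id, time); the schema, connection string, and values are illustrative, not our actual setup:

    import psycopg2

    # Hypothetical schema:
    #   CREATE TABLE events (user_id bigint, "time" timestamptz, properties jsonb);
    #   CREATE INDEX ON events (user_id, "time");
    conn = psycopg2.connect("dbname=example")
    cur = conn.cursor()

    # Every referenced column lives in the index, so Postgres can answer this
    # with an Index Only Scan -- no reads of the table's actual rows (assuming
    # the table's visibility map is mostly up to date).
    cur.execute('EXPLAIN SELECT user_id, "time" FROM events WHERE user_id = 42')
    print("\n".join(row[0] for row in cur.fetchall()))

    # Selecting one column that is NOT in the index forces Postgres to visit
    # the table rows too, so the plan falls back to a slower scan.
    cur.execute('EXPLAIN SELECT user_id, "time", properties FROM events WHERE user_id = 42')
    print("\n".join(row[0] for row in cur.fetchall()))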

Because of the wide variety of questions customers want to ask about their data, we have many examples like this. This means that when we're assessing the performance of a set of queries that support a new feature within Heap, we can't simply tweak some SQL, look at a few query plans from EXPLAIN ANALYZE, and call it a day. There may be a small but common tweak to the query we've optimized that could tank its overall performance. (This has happened to us multiple times.)

Here’s a histogram of some of our queries that helps show the variability of performance:

[Chart: histogram of query execution times, showing the wide variability of performance]

Experiments to optimize SQL

One of the things we do to combat this variability is to construct 95% confidence intervals around our estimates of performance. These confidence intervals help us identify the range of plausible values our p90 latency may take once we launch a new Heap query for all of our customers. This range accounts for the variability we introduce into query performance by supporting so many ways for customers to tweak and slice their data.
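One standard way to build a confidence interval for a percentile statistic like the p90 is bootstrapping. Here's a minimal sketch in Python (the function name and parameters are illustrative, not our production code):

    import numpy as np

    def p90_confidence_interval(latencies_ms, n_boot=10_000, alpha=0.05, seed=0):
        # Resample the observed latencies with replacement, compute the p90
        # of each resample, and take the middle 95% of those p90s as the CI.
        rng = np.random.default_rng(seed)
        data = np.asarray(latencies_ms)
        resamples = rng.choice(data, size=(n_boot, data.size), replace=True)
        boot_p90s = np.percentile(resamples, 90, axis=1)
        lo, hi = np.percentile(boot_p90s, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        return lo, hi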

We typically visualize new Heap query performance like this:

[Chart: estimated p90 query latency with a 95% confidence interval]

The width of the confidence bounds helps us see how much query performance could vary across all our customers and across all the various tweaks they make to their queries.

To test whether a SQL change makes a query faster, we first select a few hundred Heap queries, which are stored as JSON. Then, for each JSON query, we generate the control SQL and the experimental SQL that we hope will improve performance.

Next, we run the queries in counterbalanced order, alternating which SQL statement gets run first for a particular JSON query. For example, given JSON queries A and B, we’d run our queries in the following order:

  1. Control SQL for Query A

  2. Experimental SQL for Query A

  3. Experimental SQL for Query B

  4. Control SQL for Query B

We do this to control for the effect of caching. If the experimental SQL were always run after the control SQL for every query, the experimental SQL might look faster merely because those queries benefit from caching.
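In rough sketch form, the counterbalancing might look like this (gen_control_sql, gen_experimental_sql, and execute are hypothetical stand-ins for our internal SQL generation and execution hooks, and each JSON query is assumed to carry an id):

    import time

    def run_counterbalanced(json_queries, gen_control_sql, gen_experimental_sql, execute):
        # Alternate which arm runs first (ABBA ordering) so neither arm
        # systematically benefits from a warm cache.
        timings = []  # (query id, arm, seconds)
        for i, query in enumerate(json_queries):
            pair = [("control", gen_control_sql(query)),
                    ("experimental", gen_experimental_sql(query))]
            if i % 2 == 1:
                pair.reverse()
            for arm, sql in pair:
                start = time.monotonic()
                execute(sql)
                timings.append((query["id"], arm, time.monotonic() - start))
        return timings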

Why run experimental and control SQL on the same queries instead of running a true experiment, in which we'd randomly assign queries to experimental and control groups? Because of the variability of query performance, a randomized experiment would require a large sample of queries, which would slow down our experiments and put additional load on our systems.

Once we’ve executed these queries, we plot the performance estimates with confidence intervals of the experimental and control queries.

The plot looks like this:

[Chart: experimental vs. control performance estimates with 95% confidence intervals]

If the confidence intervals of the queries don't overlap (as in the above picture), we can have some confidence that we've actually improved performance with our tweak.
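Building on the p90_confidence_interval sketch above, the non-overlap check itself is simple (the latency samples here are made up for illustration):

    def intervals_overlap(a, b):
        # Two (lo, hi) intervals overlap unless one ends before the other starts.
        return a[0] <= b[1] and b[0] <= a[1]

    control_ci = p90_confidence_interval([120, 140, 95, 180, 210, 160] * 20)
    experimental_ci = p90_confidence_interval([80, 90, 70, 110, 130, 100] * 20)

    # Non-overlapping intervals give us some confidence the tweak helped.
    if not intervals_overlap(control_ci, experimental_ci):
        print("Experimental SQL is plausibly faster at the p90.")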

Here’s a recap of the above process as a sequence diagram:

[Diagram: sequence diagram of the performance experiment rig]

If you like nerding out about Postgres performance, @-me on Twitter @philosohacker, and if you like working on large data systems, we’re hiring! Check out our team and open roles.

I’d like to thank John Krauss for his help with reviewing early drafts of this post.
