Building Target Personas with SQL, Salesforce, and Clearbit
Last week Todd, Heap’s Head of New Business, wrote a blog post describing how our new Salesforce → Heap integration is a powerful tool for defining strategy around the entire customer life cycle. I wanted to dive deeper into the difficulty of defining a target persona, and share a bit of our process before and after we built the Salesforce → Heap integration.
The Problem: Defining a Target Persona
As a company, our goal is to know exactly who to sell to, who to design the product for, and who to have distinct messaging for to maximize our value add, growth, and revenue. To define our target persona, our first step was trying to answer some seemingly simple questions:
Who signs up for Heap?
What’s their job title?
What’s their industry?
How much are they paying?
Once we know who we can answer questions like:
Which iteration of our landing page leads to the most conversions of the best buyers?
What channels bring in these buyers?
What campaigns are most effective?
Defining a target persona is an iterative process that takes a lot of fine-tuning, and changes as the company evolves. Whoever is responsible for doing the analysis on the target persona should have moderate SQL expertise or dedicated engineering resources to handle the ETL process and analysis.
Defining Our Target Persona
We started defining our target persona using three data sets – Heap for our user and attribution data, Salesforce for our business logic (such as ARR), and Clearbit for our user enrichment.
Heap does a great job ensuring that all user interaction data is collected – we have a complete, retroactive data set to see who is signing up, who is visiting our blog, our marketing page, who is running queries, what queries our users are running, etc. Heap also captures every referrer, UTM parameter, and every landing page for every user across the entirety of their usage.
Salesforce has all of the contract information, revenue, key contacts on each account, etc. We enrich each Contact, Lead, and Account using Clearbit’s Salesforce Integration, so Salesforce also holds the user/account information like industry and role.
Using Mode, a SQL based BI tool, we have to join our signup event with our users with our Contacts (or leads ) with our Accounts (or Opportunities depending on where revenue is stored within your Salesforce instance) with our Clearbit Objects. The query looks something like this:
SELECT signup.TIME :: DATE,
cbit.cbit\_\_\_employment\_role\_\_\_c,
cbit.cbit\_\_\_employment\_seniority\_\_\_c,
cbit.cbit\_\_\_company\_category\_industry\_\_\_c,
cbit.cbit\_\_\_company\_category\_sector\_\_\_c,
account.mrr\_\_\_c \* 12 AS ARR
join main\_production.sign\_up signup
join salesforce.\_contact contact
join salesforce.\_cbit\_\_\_clearbit\_\_\_c cbit
join salesforce.\_account account
WHERE account.TYPE = 'Paying Customer'
This gives us a table with all the information I need. The next step is to aggregate the data.
Before we can start to group our data meaningfully, we have to understand what the possible result values are, and how to categorize them with a query such as:
SELECT cbit.cbit\_\_\_employment\_title\_\_\_c,
FROM main\_production.users u
ON u.user\_id = signup.user\_id
ON u."identity" = contact.email
ON contact.cbit\_\_\_clearbit\_\_\_c = cbit.id
ON contact.account\_id = account.id
GROUP BY 1
If title is too specific for your situation, Clearbit includes broader categories like Industry, and Role so it’s easier to group without having to use case when statements to categorize specific titles or group similar industries. Using Clearbit’s categories, we ended up with the query below to calculate the most common roles, industries, company size and the ARR associated with each bucket of users.
SELECT ROLE,
company\_size,
Avg(arr) AS average\_rev,
FROM (SELECT signup.TIME :: DATE,
cbit.cbit\_\_\_employment\_role\_\_\_c AS ROLE,
cbit.cbit\_\_\_company\_metrics\_employees\_\_\_c AS company\_size,
FROM main\_production.users u
ON u.user\_id = signup.user\_id
ON u."identity" = contact.email
ON contact.cbit\_\_\_clearbit\_\_\_c = cbit.id
ON contact.account\_id = account.id
ORDER BY 1)
2
Analyzing Attribution
At this point we are ready to figure out what marketing campaigns are associated with each user, and how much revenue they are associated with overall. We look at two different touches, the first touch – what first brought a user to Heap, and the last touch – what ultimately was responsible for the conversion. (Although there are many ways to model attribution data, we’re just going to focus on these two in this post).
In order to find the UTM campaign associated with the conversion, we use the following SQL:
SELECT utm\_campaign,
Count(DISTINCT users.user\_id) AS total\_users
JOIN main\_production.users users
JOIN salesforce.\_contact contact
JOIN salesforce.\_cbit\_\_\_clearbit\_\_\_c cbit
JOIN salesforce.\_account account
WHERE account.type = 'Paying Customer'
AND (cbit.cbit\_\_\_company\_category\_industry\_\_\_c = 'Software' OR cbit.cbit\_\_\_company\_category\_industry\_\_\_c = 'E-commerce' )
GROUP BY utm\_campaign
Essentially, we found all the attribution information associated with our target buyers, and then looked at the total revenue associated with each of those campaigns, and users converting from those campaigns to each of those campaigns. Then we update the query to analyze the number of sessions and sign ups associated with each of those campaigns. This data allows us to allocate our spend, and refine our campaigns to align with those driving the most revenue.
Heap and Salesforce Data should Co-Exist:
Cleaning, organizing, and analyzing this data requires a lot of resources. In order to enable marketing, sales, and customer success team to answer their own questions, we decided to build a Heap + SFDC integration. This integration automatically captures touch points, account updates, contacts created, etc. so our business logic exists in Heap as well.
We no longer have to worry about holes in the data, because our integration updates Heap each time a Salesforce contact, lead, task, opportunity, or account is updated. Downstream, our integration also transforms Salesforce’s convoluted data model into a straightforward user – event model. The current state of the user’s salesforce contact (or lead), account, any opportunity data is stored as properties on the user, and every time that information changes, Heap captures an event.
This simplifies our queries immensely making it easier to iterate on them. We can perform the same analysis to define our target persona with a simplified query:
SELECT users.sfdc\_account\_cbit\_company\_industry,
Sum(users.sfdc\_account\_arr) AS Total\_Rev,
FROM main\_production.users users
ON users.user\_id = signup.user\_id
GROUP BY 1,
Similarly, the query to look at attribution for these users looks more like this:
SELECT utm\_campaign,
Count(DISTINCT users.user\_id)
JOIN main\_production.users users
WHERE sfdc\_account\_account\_type = 'Paying Customer'
sfdc\_contact\_cbit\_role = 'Product'
AND (
OR sfdc\_account\_cbit\_company\_industry = 'SaaS'
AND sfdc\_account\_cbit\_company\_size >= 20 GROUP BY utm\_campaign
Using this Data in Heap
Not only is the query simplified, but having both data sets within Heap’s UI empowers our end users (in this case marketing and sales) to do the vast majority of this analysis without a data team or a BI tool. Within Heap’s UI you can perform very similar analysis. This graph shows us the title and industry of each user who signed up for Heap and became a paying customer, as well as the ARR associated with each grouping of categories.
Now that we have data dictating where we should allocate spend and which campaigns were most effective, our marketing team can perform ad-hoc analysis to rapidly iterate on campaigns and monitor their effectiveness.
What does this mean for us?
The Salesforce → Heap Integration solved a huge pain point and a blocker in our outbound and marketing efforts. Our go-to-market team now has the data they need, in an interface everyone can use, making us smarter as a company. Heap’s flexibility allows us to constantly analyze, explore, and iterate easily. As our product develops and our customer base grows, the new data is automatically included in our reports, allowing us to adapt accordingly.