Overview of Warmly's Data & Data Quality

A brief overview of Warmly's data and data quality

Written by Maximus Greenwald
Updated over a week ago

Overview of our Data & Data Quality

Warmly’s goal is to get our customers the most accurate data possible. We are best known for our Website Traffic Warm Signal (first-party intent data). Here is what our customers can expect:

  • Company and Contact Identification Rates:

    • Company identification rate around 60%

    • Contact identification rate around 20% (this can go up to 25% if Warmly's customers utilize their Warmly UTM parameter for individual-level identification from email sequencing, and for customers that drive significant traffic to forms on their website)

  • Understanding the Numbers: These rates are influenced by several key factors.

    • Bot Traffic: A portion of web traffic consists of bots, which can skew identification metrics. Warmly's identification system excludes bot traffic.

    • Industry and Location Variance: Effectiveness can vary significantly depending on the visitor's industry or location. For example, niche industries or non-US traffic may have lower identification rates due to less available data or differences in data collection methodologies and data privacy regulations.

    • Contact Identification Business vs Personal:

      • 60% of contacts have a business email or company name associated with them while 40% of contacts only have a personal email (Warmly doesn't recommend emailing these)

      • We are constantly working on adding more business related emails given their importance

  • Prioritization of High-Quality Sources: Our system aims to strike the best balance between quality and quantity of data. We employ a built-in mechanism designed to distinguish and prioritize high-quality sources over those with lower performance metrics. Some of our sources include 6sense, Clearbit & Bombora. Forrester reports (read more here) named 6sense the highest-quality data vendor, while Clearbit was acquired by HubSpot due to its high data quality.

  • Data Security & Privacy: Warmly and its data providers comply with privacy & security standards such as GDPR, CCPA & SOC2. Examples include no person-level de-anonymization in Europe and subscribing to California’s data collection opt-out registry.
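
As a rough illustration of the rates above, the sketch below turns them into expected counts for a given amount of traffic. The function name and the idea of applying the rates to non-bot visitors are assumptions made for this example; the percentages are the approximate figures quoted above, not guarantees.

```python
def estimate_identified(non_bot_visitors: int,
                        company_rate: float = 0.60,
                        contact_rate: float = 0.20,
                        business_share: float = 0.60) -> dict:
    """Rough expected counts from the approximate rates quoted in this article.

    Assumption: the rates apply to visitors that remain after bot traffic
    has been excluded.
    """
    contacts = non_bot_visitors * contact_rate
    return {
        "companies": round(non_bot_visitors * company_rate),
        "contacts": round(contacts),
        # Contacts with a business email or company name attached
        "contacts_with_business_info": round(contacts * business_share),
        # Contacts with only a personal email (not recommended for emailing)
        "contacts_personal_only": round(contacts * (1 - business_share)),
    }


print(estimate_identified(1000))
```

For 1,000 non-bot visitors this gives roughly 600 identified companies and 200 identified contacts, of which about 120 carry business information. Passing contact_rate=0.25 models the higher rate described for customers who use the Warmly UTM parameter and drive traffic to forms.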

Prospecting Data Quality: By The Numbers

Warmly has acquired a data provider called Immagnify, which is now our data provider for Prospecting. This is the data you use through Warmly Orchestrator (unless you are connected to your own provider).

Here is an approximate breakdown of the data quality of Immagnify:

Email:

  1. Quantity: 98% of prospects will have emails

  2. Quality:

    1. When using Verified emails: 92% accurate

    2. When using Qualified emails: 75% accurate

Mobiles:

  • 93% accurate mobile numbers
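
To see how the email quantity and quality figures combine, here is a small sketch. It assumes the accuracy rate applies uniformly to the prospects that have an email, which is a simplification; the function name is hypothetical.

```python
def expected_accurate_emails(prospects: int,
                             accuracy: float,
                             coverage: float = 0.98) -> float:
    """Expected number of prospects with an accurate email.

    Assumption: accuracy applies evenly to the `coverage` share of
    prospects that have an email at all.
    """
    return prospects * coverage * accuracy


verified = expected_accurate_emails(1000, accuracy=0.92)   # Verified emails
qualified = expected_accurate_emails(1000, accuracy=0.75)  # Qualified emails
print(round(verified), round(qualified))
```

Under these assumptions, a list of 1,000 prospects yields roughly 900 accurate emails when using Verified emails and roughly 735 when using Qualified emails.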

Our Unique Data Value Proposition

  • Data Waterfalls: For each data signal that is purchased through Warmly, we’ve optimized it by working with multiple vendors. We maximize the quantity of data found by looking across a significant number of vendors, and we maximize the quality of data found through proprietary data cleansing algorithms and confidence scoring that checks whether multiple providers independently recognize the same source.

  • Quality Optionality: Warmly believes in customer choice and transparency. We allow customers to get a full view of how and what we’re trying to identify. Furthermore, customers can, in their Warmly settings, make informed decisions to toggle their quality up (quantity down) or quantity up (quality down) depending on their GTM needs and priorities.

  • Contact-level data: Warmly is one of the only companies in the world that has developed the ability to tell our customers the name and the LinkedIn profile of the actual people who visited their website but have not yet converted. Some companies (like HubSpot and Qualified) claim that they do this, but it's very limited (they can only tell their customers which people have clicked on links in outbound sequencing driving traffic back to the website). While this is great, and Warmly does this too, it typically only identifies up to ~5% of traffic at the individual level. Warmly can find the name and LinkedIn profile of actual net new visitors that our customers did not know about, thanks to our first-party cookie and fingerprint graph partnerships with ad networks.

  • Signal-based selling: The future of sales is signal-based selling, where companies figure out which leads are the warmest not just because they fit the ICP, but also because they are actually in-market or could be in-market for the product or service. Warm signals are the cheat code to focusing sales efforts on what will convert the best, and each company has different signals that will convert better or worse. Warmly comes with six different common signals and we run data quality analyses on each of them. Additionally, our customers can plug any signal that they are monitoring into Warmly's warm lead scoring system via their CRM and then run outbound to those companies.

  • Acquiring Immagnify: In early 2024, Warmly acquired an Israeli data scraping & cleansing company called Immagnify. Their team specializes in big data analysis and quality assurance and they were acquired to make Warmly the leading trusted source for accurate warm signal data.
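
The data waterfall and quality toggle described above can be pictured with a minimal sketch. This is not Warmly's proprietary algorithm: the provider functions, the `min_agreement` knob, and the simple vote-based confidence score are all hypothetical stand-ins used to illustrate the idea of asking several vendors and trusting answers that multiple vendors independently agree on.

```python
from collections import Counter
from typing import Callable, Optional

# Hypothetical provider type: takes a lookup key, returns a value or None.
Provider = Callable[[str], Optional[str]]


def waterfall_lookup(key: str,
                     providers: list[Provider],
                     min_agreement: int = 1) -> Optional[dict]:
    """Query every provider and score the most common answer.

    min_agreement is an illustrative quality knob: raising it trades
    quantity (fewer results) for quality (more independent agreement).
    """
    answers = []
    for provider in providers:
        answer = provider(key)
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None  # no vendor recognized the key

    value, votes = Counter(answers).most_common(1)[0]
    if votes < min_agreement:
        return None  # not enough independent agreement for this setting
    return {"value": value, "votes": votes, "confidence": votes / len(answers)}


# Hypothetical vendors, for illustration only
vendors = [
    lambda key: "acme.com",
    lambda key: None,
    lambda key: "acme.com",
    lambda key: "other.com",
]
print(waterfall_lookup("203.0.113.7", vendors))
print(waterfall_lookup("203.0.113.7", vendors, min_agreement=3))
```

With the default setting the lookup returns "acme.com" backed by two of three responding vendors; requiring three agreeing vendors returns nothing, which mirrors the quality-up, quantity-down tradeoff.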

How Warmly Gets our Data

  • Website Signal:

    • When a visitor comes to our customers' website, Warmly investigates to provide the best possible response

    • Warmly looks to see whether or not we have Cookies, Fingerprinting or the IP Address of the visitor. Cookies are the most accurate signal, followed by Fingerprinting data and lastly IP Address matches. Here are the steps to investigate:

      • Step 1 - Clean the data: we filter out bots, data centers, VPNs and Outlook scramblers.

      • Step 2 - Cookies: we look for cookies in the visitor's browser. This means a 100% verified match, but these are rarer to find. 93% of our dataset is 1st party cookies and 7% is 3rd party cookies. 3rd party cookies are being phased out by Google (though this will have an immaterial impact on Warmly). 1st party cookies are set when the visitor has agreed to let a website drop a cookie in their browser. For example, the New York Times is one of the data sources that sell 1st party cookies to our cookie aggregators. When someone logs in to a free version of the NYT with an email, NYT sells that email to make money and the visitor agrees to it in the terms & conditions.

      • Step 3 - Fingerprinting: If there are no cookies, Warmly looks at the visitor’s fingerprint information, specifically what browser they are using, what device size and operating system they use, the location of the visit and other minute factors that would indicate a match. Warmly believes this to be 80% accurate, and as long as our customers are filtering the signal for their ICP (e.g., fishing companies will visit fishing websites), this goes to 95% accurate.

      • Step 4 - IP Address: If there is not enough fingerprinting information, we take the IP address of the visitor (everyone has one in order to load a website) and check which people and companies use that IP address. After filtering out data centers and VPNs, we see whether we can make a probabilistic match. If someone is accessing a website from a company headquarters that occupies a whole floor of a building or a whole building, a match is easy. If the IP address belongs to a co-working space or a remote household with multiple workers, it is more complicated. This is where limited fingerprinting information and location data can be used to increase the confidence of the match. Confidence scores for a match are always going up and down depending on how many websites and hits we detect for a person or company across an IP or IP range. We believe this to be 60% accurate, and as long as our customers are filtering the signal for their ICP (e.g., fishing companies will visit fishing websites), this goes to 95% accurate.

    • We also look for a match from your contribution: Warmly offers three ways that our customers can increase the match rate of visitors with 100% accuracy.

      • #1: Tagging outbound. By adding a Warmly UTM parameter globally in all hyperlinks, anyone who clicks a link in our customers' emails/messages and comes to their website will be cookied & tagged

      • #2: Form fills / partial form fills. Warmly will monitor our customers' inbound demo form and try to autocomplete partially filled forms so we can tag that person for our customers when they return in the future

      • #3: HubSpot UTK Cookies. By syncing HubSpot, we can pull in our customers' historical cookies already assigned in past years and then alert our customers when those companies revisit their website

    • For visitors where Warmly can only provide the company name (no individual-level identification), Warmly will then suggest the most likely visitors (individual people) based on the location of the visitor, who our customer sells to and what company the visitor came from.

  • Job Changes Signal: Warmly monitors all the contacts in our customers' CRMs and contacts at accounts our customers care about for changes to their LinkedIn profiles. We do this by crawling LinkedIn’s public dataset that is indexed by Google and Bing search. LinkedIn is legally required to make this data available by default. Warmly does not crawl private profiles (~10% of profiles).

  • Research Intent Signal: Warmly partners with Bombora & one other data provider to tell our customers the company names of who is researching topics related to what they sell or their competitors but has not yet visited their website. Bombora, for example, has 15,000 blog site and news site partners. Let’s say that a review site called SoftwareReviews.com has created an article called “Top 10 best accounting softwares in 2024.” When someone visits that article, Bombora’s tracking pixel knows what company they come from and then matches that to the fact that they’re researching accounting software.
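
The Website Signal steps above (clean, then cookies, then fingerprinting, then IP address) can be sketched as a simple fallback chain. The `Visit` fields and the `identify` function are hypothetical names invented for this illustration, and the accuracy values are just the approximate figures stated in the steps above, so treat this as a reading aid rather than Warmly's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Visit:
    # Hypothetical fields; each *_match holds an identity if that method found one.
    filtered_out: bool = False           # bot, data center, VPN or Outlook scrambler
    cookie_match: Optional[str] = None
    fingerprint_match: Optional[str] = None
    ip_match: Optional[str] = None


def identify(visit: Visit, icp_filtered: bool = False) -> Optional[tuple]:
    """Return (method, identity, approximate accuracy) or None if unidentified."""
    # Step 1: clean the data
    if visit.filtered_out:
        return None
    # Step 2: cookies are treated as a verified match
    if visit.cookie_match:
        return ("cookie", visit.cookie_match, 1.00)
    # Step 3: fingerprinting, about 80% accurate (about 95% with ICP filtering)
    if visit.fingerprint_match:
        return ("fingerprint", visit.fingerprint_match, 0.95 if icp_filtered else 0.80)
    # Step 4: IP address, about 60% accurate (about 95% with ICP filtering)
    if visit.ip_match:
        return ("ip", visit.ip_match, 0.95 if icp_filtered else 0.60)
    return None


print(identify(Visit(fingerprint_match="Jane Doe", ip_match="Acme Inc")))
print(identify(Visit(ip_match="Acme Inc"), icp_filtered=True))
```

The ordering reflects the statement that cookies are the most accurate signal, followed by fingerprinting and then IP address matches: a lower-ranked method is only consulted when the higher-ranked ones found nothing.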

Setting Expectations - Why the Data Won't be Perfect

  • Warmly and its data vendors do our best to get our customers the most and best data possible. Our system aims to strike the best balance between quality and quantity of data. We employ a built-in mechanism designed to distinguish and prioritize high-quality sources over those with lower performance metrics. However, this data does not offer 100% coverage or 100% accuracy.

  • Big data web scraping is difficult and constantly evolving. The rise of remote work, mobile privacy restrictions, GDPR, and more make it a challenge to keep match rates high. That's why Warmly encourages our customers to lean into the good data and filter out the weaker data. Think about it this way... which scenario would you prefer to be in?

    • Warmly is 10% wrong and you choose not to de-anonymize your website traffic. You miss out on all the leads.

    • Warmly is 10% wrong and you still choose to de-anonymize your website traffic. You accidentally chat/email a few wrongly ID’d people who then proceed to ignore your outreach since they didn’t actually visit your website. You still correctly email/chat a bunch of correctly ID’d people who then reply and book meetings.

  • Our customers will find inaccuracies and anonymous sessions. Unlike other companies that sweep this under the rug, Warmly offers customers a full view into their data and lets them make informed decisions on a quantity versus quality tradeoff depending on their goals.

Frequently Asked Questions

(A) Does identification work on mobile?

  • Yes.

(B) How do you deal with remote workers / multi-worker households?

  • This has made matching more difficult for IP address matches. 6sense estimates that its non-corporate matches are 85% accurate. When remote work took off, 6sense's 6signal intent graph began using available secondary marker information, like mobile advertising IDs, to triangulate data connections. Read more by 6sense’s CTO.

(C) How do VPNs affect the data quality?

  • While companies can and do use VPN for company resources, Warmly doesn’t need someone to be on a VPN to match accounts.

  • Even though many employees use a VPN to access company resources, most companies limit the types of activities routed through the VPN. If you access your company intranet, data center or cloud resources, then that traffic is routed through a VPN. But when you are browsing the internet for memes or the news, that traffic isn’t routed through the VPN to reduce the load on the VPN tunnel.

  • If a user or company uses a Cloud VPN (hosted solution), our data vendors tend to ignore these types of IP addresses, as they are difficult to verify and susceptible to deception. For example, they could be associated with a bot crawling the internet. 6sense's 6signal Graph focuses instead on more reliable data.

This article was written by the Customer Success team at Warmly. Please feel free to reach out to your CSM directly or [email protected]
