Marketing Attribution · February 25, 2026 · 20 min read

Self-Attributed Attribution: Why 'How Did You Hear About Us?' Is Your Most Important Survey Question in the Age of AI

GA4 can't track ChatGPT referrals, dark social, or Slack word-of-mouth. Self-attributed attribution — asking customers directly — is now the only reliable way to know where your best customers actually come from.

Your analytics dashboard says 38% of signups came from organic search, 22% from paid ads, and 31% from direct. You present these numbers in your marketing review. Everyone nods. Decisions get made.

Almost none of it is accurate.

The 31% labeled "direct" is a catch-all for everything GA4 cannot explain: word-of-mouth, Slack threads, private newsletters, WhatsApp forwards, and — increasingly — recommendations made by ChatGPT, Claude, and Perplexity. These are your highest-intent visitors, often your highest-converting ones, and your analytics tool has no idea they exist.

There is exactly one method that can tell you the truth about where your customers actually come from: asking them.

Note

What is self-attributed attribution? Self-attributed attribution (also called survey-based attribution or self-reported attribution) is the practice of asking customers directly how they first heard about your product. Instead of inferring the source from cookies, UTM parameters, or referrer headers — all of which fail for large categories of traffic — you ask a simple question at or after the point of conversion: "How did you first hear about us?" The answer comes from the customer themselves, making it the only method that captures word-of-mouth, private communities, and AI tool recommendations.


TL;DR

  • Google Analytics cannot track referrals from ChatGPT, Claude, or Perplexity — they show up as "direct"
  • Dark social (Slack, WhatsApp, private email, communities) has been breaking attribution for a decade and is now joined by LLM recommendations
  • Self-attributed attribution — one survey question at signup — is the only method that captures all of these
  • Include "AI tools (ChatGPT, Claude, Perplexity, etc.)" explicitly as an answer option or you'll undercount it
  • The data tells you which acquisition bets to double down on and which channels are overreported in your dashboard

The attribution lie you've been living with

Last-click attribution — the default model in GA4 — gives full credit to the last touchpoint before a conversion. If someone clicked a Google search result and signed up, that's an organic search conversion. Simple, measurable, trackable.

The problem: the last click is often not how the customer found you. It is how they navigated to you after already knowing they wanted to check you out.

The actual discovery moment — the colleague mentioning your product in a team standup, the AI assistant recommending you when asked for survey tools, the podcast host dropping your name — happened earlier, in a place no analytics tool can see.

What GA4 actually tracks

GA4 tracks the following reliably:

  • Paid search and paid social: because your ad platforms pass UTM parameters automatically
  • Organic search: when Google passes referrer data (increasingly blocked in private browsing)
  • Direct referral links: when another website links to you with a proper HTTP referrer header

That covers maybe 60-65% of traffic for a typical SaaS site with a content strategy. The rest ends up in the "direct / none" bucket — which is not a channel, it is a confession that the data is missing.

What hides in "direct" traffic

| Traffic source | Why it appears as "direct" |
| --- | --- |
| Someone typing your URL from memory | True direct — accurate |
| Slack / Teams recommendation | No referrer passed in desktop apps |
| WhatsApp / SMS link | Referrer stripped by mobile apps |
| Private newsletter (opened in Gmail app) | Referrer not passed |
| Bookmarked link from weeks ago | Session expired, referrer lost |
| HTTPS-to-HTTPS referral blocked | Strict referrer policy strips it |
| Link copied from PDF or document | No referrer header |
| ChatGPT, Claude, Perplexity recommendation | By design: no referrer passed |

For most SaaS companies, "direct" traffic converts at 2-3x the rate of organic search. It is largely composed of high-intent visitors who already know who you are — they arrived with context, not curiosity. Your analytics tool treats this as undifferentiated "direct." Your self-attribution survey will reveal it is several completely different channels, each requiring a different strategic response.


Dark social: the first wave that broke attribution

The term "dark social" was coined by Alexis Madrigal in The Atlantic in 2012 to describe sharing via private, untraceable channels. When someone copies a link and pastes it into a WhatsApp group, a Slack DM, or an email — the analytics are dark. No source. No medium. No referrer.

Research from RadiumOne (one of the earliest dark social studies) found that roughly 69% of online content sharing happens through dark social channels, not public social networks. Your Twitter shares are measurable. Your Slack shares are invisible.

This was already a serious problem before AI chatbots existed. For SaaS companies with strong word-of-mouth, the dark social blind spot can mean underreporting referral and community channels by 50% or more while overstating paid acquisition's contribution.

The startup that thinks its paid ads are driving 30% of signups might actually be looking at a number that's 50% word-of-mouth filtered through "direct."


The new dark social: AI referrals that GA4 will never see

Dark social — frustrating as it is — is at least a known unknown. Marketers have been aware of the Slack-and-WhatsApp problem for over a decade.

This second wave is larger, growing faster, and still largely invisible to most marketing teams.

ChatGPT crossed 300 million weekly active users in early 2025. Perplexity reached 15+ million monthly active users in 2024 and is growing fast. Claude, Google AI Overviews, Microsoft Copilot, and a dozen other AI assistants collectively handle hundreds of millions of queries per week. A meaningful fraction of those queries are product discovery questions:

  • "What's the best website survey tool for a SaaS startup?"
  • "What are good alternatives to Hotjar?"
  • "How do I collect exit intent feedback from my pricing page?"

When an AI assistant recommends your product, the user closes the chat window and either types your URL directly or runs a branded search. When they arrive on your site, GA4 sees either:

  1. Direct — they typed the URL
  2. Organic branded search — they searched your brand name, which analytics counts as organic search but is actually an AI-prompted visit

Neither entry in your dashboard reveals the actual source: an AI recommendation.

Why this matters more than dark social

Dark social volume is large but has been roughly stable — WhatsApp and Slack aren't growing at the rate they did in 2015. LLM referral traffic, on the other hand, is compounding. Every week, more users turn to AI assistants for product recommendations rather than running a Google search.

If you are building content specifically to rank in AI responses (which you should be), you need to know whether it is working. The only way to know is to ask.

Tip

Test this yourself right now: Open ChatGPT or Perplexity and ask "What's a good lightweight website survey tool?" Note which products come up. Then imagine those are your potential customers — and that GA4 will mark all of them as "direct."


Self-attributed attribution: the method that closes the gap

Survey-based attribution has existed for decades. Direct mail companies were asking "How did you hear about us?" in the 1980s. The concept is not new. What is new is that it is now the only method with no ceiling — it can capture every channel, including ones that did not exist when your analytics setup was built.

The mechanics are simple:

  1. At or immediately after signup, ask one question: "How did you first hear about us?"
  2. Present a list of specific options that covers every real acquisition channel your product actually uses
  3. Include "I don't remember" as an option — it is honest data, not a failed survey
  4. Store the response and segment by it
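The steps above can be sketched in a few lines. This is an illustrative sketch, not Selge's actual API: the function names, the storage shape, and the exact option labels are assumptions for the example.

```javascript
// Illustrative sketch: validate, store, and segment attribution answers.
// Names and data shapes here are assumptions, not a real product API.
const ATTRIBUTION_OPTIONS = [
  "Organic search (Google or Bing)",
  "Paid ad (Google, LinkedIn, Facebook, etc.)",
  "Friend or colleague recommendation",
  "Online community (Slack, Reddit, Discord, etc.)",
  "Social media (LinkedIn, X/Twitter, Instagram...)",
  "AI tool (ChatGPT, Claude, Perplexity, etc.)",
  "Newsletter or email",
  "Podcast",
  "Blog post or article",
  "I don't remember",
];

function recordAttribution(userId, answer, respondedAt = new Date()) {
  // Reject free-form answers so the data stays segmentable later.
  if (!ATTRIBUTION_OPTIONS.includes(answer)) {
    throw new Error(`Unknown attribution option: ${answer}`);
  }
  return { userId, answer, respondedAt: respondedAt.toISOString() };
}

// Step 4: segment by response. Turns stored answers into channel percentages.
function channelBreakdown(responses) {
  const counts = {};
  for (const r of responses) {
    counts[r.answer] = (counts[r.answer] || 0) + 1;
  }
  return Object.fromEntries(
    Object.entries(counts).map(([channel, n]) => [
      channel,
      Math.round((n / responses.length) * 100),
    ])
  );
}
```

Keeping the option list fixed (rather than an open text field as the primary input) is what makes the breakdown computable at all; the open "anything more specific?" follow-up belongs in a separate field.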

The data you collect will diverge materially from your GA4 channel breakdown. In almost every case, survey data reveals more word-of-mouth, more community referrals, and more AI-assisted discovery than any analytics tool reports. The analytics data overweights last-click paid channels. The survey data reflects actual memory of first exposure.

Neither is perfectly accurate. The survey has recall bias; people sometimes remember the second or third touchpoint, not the first. Analytics has attribution-model bias; it counts the click, not the awareness. Used together, they triangulate something closer to truth than either provides alone.

For the channels that are completely invisible to analytics — AI recommendations, Slack word-of-mouth, private newsletters — the survey is the only data you have. There is no other way.


The question wording and answer options that matter in 2026

Getting the survey right requires getting the answer options right. Vague options produce vague data. The goal is a list specific enough that respondents can recognize their actual experience, not just pick the closest approximation.

The question

"How did you first hear about [Product]?"

"First" matters. Without it, respondents often report the last touchpoint (their branded search), not the actual discovery moment. "First hear" anchors them to the right memory.

The answer options

| Option | Why it's here |
| --- | --- |
| Organic search (Google or Bing) | Captures SEO-driven discovery |
| Paid ad (Google, LinkedIn, Facebook, etc.) | Isolates paid spend that's actually driving discovery |
| Friend or colleague recommendation | Word-of-mouth — the channel that predicts retention best |
| Online community (Slack, Reddit, Discord, etc.) | Community and dark social |
| Social media (LinkedIn, X/Twitter, Instagram...) | Public social — separable from community |
| AI tool (ChatGPT, Claude, Perplexity, etc.) | The critical 2026 addition — without this, AI referrals disappear |
| Newsletter or email | Creator/influencer and email-driven discovery |
| Podcast | Spoken referrals — underreported everywhere |
| Blog post or article | Content-driven discovery |
| I don't remember | Honest option that prevents forcing bad answers |

The "AI tool" option is not optional if you publish content, optimize for search, or have any presence in AI responses. Without it, AI-referred visitors will select "online search," "I don't remember," or "other" — and your data will systematically undercount one of the fastest-growing acquisition channels in SaaS.

When you add the AI tools option, the response rate for it typically surprises people. Teams that expected 2-3% have found 15-20% of new signups saying an AI assistant was their first touchpoint. That data changes your content strategy.

Where to place this survey

Post-signup is the gold standard. The moment after account creation is the highest-intent moment with the freshest recall. The visitor just committed — they're engaged, the experience is live in memory, and they have a reason to respond.

Second-best: on the homepage after 20-30 seconds, shown once per visitor. This captures visitors who browse but don't convert yet, which gives you a different but valuable picture of discovery.

Do not place this on every page. It is a one-time attribution question, not a page-feedback survey.

Cooldown

Set a 365-day cooldown or — better — fire it exactly once per user account. Attribution data from the same person twice is noise.
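One way to enforce "ask once" is a pure check over stored user state, which keeps it easy to test. The field names here are illustrative, not Selge's schema:

```javascript
// Sketch of the cooldown rule: once per account wins; otherwise
// suppress for 365 days after the last display. Field names assumed.
const DAY_MS = 24 * 60 * 60 * 1000;

function shouldShowAttributionSurvey(user, now = Date.now(), cooldownDays = 365) {
  // A stored answer suppresses the survey forever (once per account).
  if (user.attributionAnswered) return false;
  // Otherwise apply the 365-day cooldown on the last display.
  if (user.lastShownAt && now - user.lastShownAt < cooldownDays * DAY_MS) {
    return false;
  }
  return true;
}
```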


What the data tells you (and what to do with it)

The self-attribution survey pays for itself the first time it changes a strategic decision. Here is the decision framework for each channel finding:

| If this is your #1 channel | What it reveals | What to do |
| --- | --- | --- |
| Organic search | SEO is working as primary discovery | Keep publishing, deepen keyword clusters, prioritize structured content that AI can cite |
| AI tools (ChatGPT, Claude, etc.) | Your content ranks in AI responses | Double down: more FAQ sections, more definition boxes, more structured data, explicit FAQ schema on all posts — feed the signal that's working |
| Friend / colleague | Strong product, weak top-of-funnel | Build a formal referral program; this channel can be amplified |
| Online community | Organic community traction | Invest in community presence; you have product-community fit |
| Paid ads | Ads driving first impression | Check whether organic can replace paid over 12 months; high ad dependency at early stage is a fragility risk |
| Podcast | Spoken word is working | Find more podcasts; audio-driven discovery is durable and high-trust |
| Newsletter / email | Creator/influencer path is real | Build relationships with the newsletters that sent you traffic; consider newsletter sponsorships |

The AI signal deserves special attention

If 15% of your new signups say an AI tool is how they found you, that is an unusually clear strategic directive: create more content that AI assistants can cite.

What makes this particularly telling is the gap between the two data sources. SaaS teams that have implemented both last-touch analytics and self-attribution are seeing a consistent pattern: last-touch says paid media (Google Ads, LinkedIn) — self-attribution says ChatGPT or Perplexity. The same customer. Two completely different stories.

What actually happened: an AI assistant recommended the product, the visitor ran a branded search, your paid ad appeared (or your organic listing), they clicked, and converted. GA4 credited the ad. The customer remembers the chatbot. The ad got the credit for work the AI did.

This is not a reason to pause paid campaigns. It is a reason to ask harder questions about what is creating the conditions that make those campaigns convert. If the AI recommendation never happens, the branded search never happens, the ad impression never happens, and the conversion never happens. The paid channel is real — but it is executing on awareness that originated somewhere else entirely.

The industry is shifting. Teams that only look at last-touch will keep optimizing ads that are really just harvesting AI-driven demand they did not know they had. Teams that layer in self-attribution data will start to see the full chain — and can invest in the top of it.

Even where the attribution data is imperfect — recall is fuzzy, people sometimes report the second touchpoint, not the first — the insights it surfaces open discussions that last-touch data never would. That alone is worth asking the question.

AI assistants prefer content that is:

  • Definitionally precise (clear definitions of concepts and terms)
  • Structured with numbered lists and comparison tables
  • Factually specific (real numbers, real benchmarks)
  • Covering niche questions comprehensively rather than popular questions shallowly
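The "FAQ schema" recommended above is schema.org's FAQPage markup, emitted as JSON-LD in a `<script type="application/ld+json">` tag so each question-and-answer pair is machine-readable. A minimal generator, with a hypothetical helper name:

```javascript
// Build schema.org FAQPage JSON-LD from [question, answer] pairs.
// faqSchema is a hypothetical helper name; the @type / mainEntity /
// acceptedAnswer structure is the actual schema.org vocabulary.
function faqSchema(pairs) {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: pairs.map(([question, answer]) => ({
      "@type": "Question",
      name: question,
      acceptedAnswer: { "@type": "Answer", text: answer },
    })),
  };
}

const jsonLd = JSON.stringify(
  faqSchema([
    [
      "What is self-attributed attribution?",
      "The practice of asking customers directly how they first heard about your product.",
    ],
  ]),
  null,
  2
);
// jsonLd can now be inlined inside <script type="application/ld+json">…</script>
```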

An article titled "Self-Attributed Attribution: Why 'How Did You Hear About Us?' Is Your Most Important Survey Question" has a better chance of being cited by an AI assistant than a generic piece on "Marketing Attribution 101" — because it answers a specific question that users are asking AI tools, with a level of specificity that makes it citable.

If your survey says AI is sending you customers, prioritize writing more posts like this one.


What a healthy channel mix looks like

There are no universal benchmarks — channel mix varies enormously by go-to-market, stage, and product type. But here are directional ranges for B2B SaaS companies with content strategies, based on self-attribution data:

| Channel | Healthy range | Warning signs |
| --- | --- | --- |
| Organic search | 35-55% | Below 20%: underinvested in content. Above 70%: fragile if algorithm shifts. |
| Word of mouth / colleague | 20-35% | Below 10%: product isn't generating organic conversation |
| AI tools | 5-20% (2026, growing) | 0%: either not cited by AI or not asking the right question |
| Paid ads | 10-20% | Above 40%: growth stops when budget stops |
| Community | 5-15% | Below 5%: not engaged where your ICP spends time |
| Newsletter / podcast / other | 5-15% | |

The healthiest channel mix has high word-of-mouth and organic search (both compounding channels) with paid ads as acceleration, not foundation. If your survey shows the inverse — paid ads dominant, word-of-mouth below 10% — you have a structural risk.

A company that grows primarily through paid acquisition can be accurate in its analytics. A company that grows primarily through word-of-mouth and AI recommendations will look, in GA4, like it barely markets at all.


How Selge's "How did you find us?" template is built for exactly this

Selge ships a pre-built How did you find us? template that is configured specifically for self-attributed attribution. It includes:

  • The correct question wording ("How did you first hear about us?")
  • All the answer options listed above, including the AI tools option
  • A follow-up open text field ("Anything more specific you can share?") for context
  • Default trigger: post-signup (fires once, immediately after account creation)
  • 365-day cooldown pre-set

The template is part of Selge's library of 14 expert templates built from real CRO work. You can start collecting attribution data within 5 minutes of installing the embed script — no manual question writing, no option-list agonizing, no trigger configuration.

You can also combine it with a Homepage Clarity Check to understand whether visitors who arrived through AI recommendations found your homepage as clear as those who arrived through organic search. (They might not — AI-referred visitors arrive with context the page doesn't assume they have.)

See all templates at selge.app/templates.


Frequently asked questions

What is self-attributed attribution?

Self-attributed attribution is the practice of directly asking customers how they first heard about your product, rather than inferring it from click tracking, UTM parameters, or referrer headers. It is the only attribution method that captures word-of-mouth, private channel shares, and AI assistant recommendations — all of which are invisible to web analytics tools. The data comes from a single survey question asked at or after the point of conversion.

Why doesn't Google Analytics track ChatGPT referrals?

AI chat interfaces like ChatGPT, Claude, and Perplexity do not pass HTTP referrer headers when a user navigates from the chat to an external website. This is by design — it protects user privacy within the chat context. When someone clicks a link recommended by an AI assistant, the receiving website sees either no referrer (classified as "direct") or a branded search if the user searched for the company name after the recommendation. GA4 has no mechanism to identify these as AI-assisted visits.
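The mechanics can be seen in a minimal sketch of referrer-based classification, which is essentially the only signal an analytics tool has for unpaid traffic. The rules below are a simplified assumption, not GA4's actual logic, but they show the core problem: an AI-recommended visit arrives with an empty referrer and is indistinguishable from a typed-in URL.

```javascript
// Simplified referrer classifier (not GA4's real rules).
// With no referrer header there is nothing to classify on, so a
// ChatGPT-recommended visit lands in the same bucket as a typed URL.
function classifyVisit(referrer) {
  if (!referrer) return "direct"; // typed URL, Slack paste, or AI chat: identical
  const host = new URL(referrer).hostname;
  if (host.includes("google.") || host.includes("bing.")) return "organic search";
  if (["linkedin.com", "facebook.com", "t.co"].some((s) => host.endsWith(s))) {
    return "social";
  }
  return "referral";
}
```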

How do I track where my customers come from?

The most reliable method is a combination of three approaches: (1) ask with a post-signup self-attribution survey, (2) cross-reference with your GA4 channel data as a directional check, (3) look at branded search volume trends over time as a proxy for word-of-mouth and awareness growth. Self-attribution captures channels that analytics can't see. Analytics captures volume and trends that surveys can't measure. Together they give you a more complete picture than either alone.

What is dark social?

Dark social refers to web traffic that arrives via private sharing channels — WhatsApp messages, Slack threads, Signal groups, copied-and-pasted links in email, SMS links — where the HTTP referrer header is stripped or absent. The traffic shows up in analytics as "direct" but was actually referred by another person in a private context. The term was coined by Alexis Madrigal in The Atlantic in 2012. For most SaaS companies with word-of-mouth traction, dark social is one of the largest untracked acquisition channels.

Should I ask visitors how they found my website?

Yes, but with precision. Ask the question post-signup or post-purchase, not on first visit. First-visit visitors cannot yet tell you why they converted. Post-signup visitors just committed and have the clearest recall of their first exposure. One well-timed question after account creation captures more accurate attribution data than a survey shown to thousands of non-converting browsers. Selge's How did you find us? template is pre-configured for this exact placement.

How do I measure LLM and AI referral traffic?

Directly, you can't — AI chat tools do not pass referrer data. The only reliable method is to explicitly include "AI tool (ChatGPT, Claude, Perplexity, etc.)" as an answer option in a post-signup self-attribution survey. Without this option, AI-referred customers will select "online search" or "I don't remember," and your data will systematically undercount this channel. To get a rough directional signal without a survey, watch your branded search volume in Google Search Console — AI mentions of your product typically produce branded search spikes, since many users search the company name after seeing it recommended in a chat.

What questions should I ask in a marketing attribution survey?

Start with one question: "How did you first hear about [Product]?" Include specific answer options rather than generic categories — "AI tool (ChatGPT, Claude, Perplexity, etc.)" rather than just "other online source." Add an optional open text follow-up ("Anything more specific?") for context. Avoid multi-question attribution surveys — customers don't want to reconstruct their entire journey; they want to answer one thing and move on. If you ask about the first touchpoint, a second question about the decision-making touchpoint can be useful, but keep it optional and low-friction.

How accurate is self-attributed attribution?

More accurate than people assume. Research on survey-based attribution (Northstar Research, Nielsen) consistently shows 80-90% correlation between self-reported first exposure and independently verified first exposure for channels that can be verified (such as paid campaigns with unique URLs). For channels that cannot be independently verified (AI referrals, Slack word-of-mouth, private newsletters), there is no ground truth to compare against — which is exactly the point. Self-attribution is not perfectly accurate, but it is the only source of data for invisible channels. Partial accuracy beats complete blindness.


The bottom line

GA4 is not broken. It measures what it can measure: clicks, referrers, sessions, goals. The problem is that a growing share of your actual acquisition is happening in channels that don't produce clicks with referrer headers — private conversations, AI recommendations, community word-of-mouth.

"How did you first hear about us?" is a question that has existed for decades. The 2026 version requires one update: add an AI tools option to the answer list. That single change transforms a good survey into an essential one.

If you learn that 20% of your new signups found you through an AI assistant, you know exactly what to invest in: structured content, FAQ schema markup, niche question coverage, and the kind of specific, expert-backed articles that AI assistants cite when a user asks a product discovery question.

If you don't ask, you'll see that 20% buried in "direct traffic" — indistinguishable from people who typed your URL from memory.

One question. One answer option nobody else is adding. Data that changes your strategy.


Selge is a lightweight on-site survey tool built for SaaS websites. The How did you find us? template is pre-built with the right question wording, the AI tools answer option, and post-signup trigger configuration. Install in 2 minutes. Browse all 14 templates or start free.

Tags: self-attributed attribution, marketing attribution, dark social, LLM referral tracking, how did you hear about us survey, website survey