May 20, 2026 · 6 min read

How to Test a Tagline With Real Customers From an AI Agent

TL;DR

Tagline testing used to take 2-3 weeks: brief a research vendor, wait for recruitment, schedule moderated sessions, and synthesize a report. With an MCP-connected AI agent and User Intuition's agentic research platform, the same test takes 30 minutes of setup plus 2-3 hours of fielding. The agent calls ask_humans with mode set to preference, passes your tagline options as stimuli, and returns a structured result: a preference distribution showing which option won, driving themes explaining why, minority objections surfacing edge cases, and verbatim quotes from real participants. Studies recruit from a vetted 4M+ global panel across 50+ languages at $25 per audio interview, with results in 24 hours and 98% participant satisfaction. User Intuition's agentic research platform supports Claude, ChatGPT, Cursor, Claude Code, and VS Code via the Model Context Protocol. The Starter plan includes 3 free interviews with no credit card required, making it practical to test a single tagline decision before committing to a full launch.

Tagline testing used to take 2-3 weeks. Brief a research vendor, wait for panel recruitment, schedule moderated sessions, debrief, synthesize a report. By the time you had real customer signal, your launch date had already moved.

With an MCP-connected agent and the User Intuition agentic research platform, the same test takes 30 minutes of setup and 2-3 hours of fielding. The agent handles everything: participant recruitment from the 4M+ panel, AI-moderated conversations, preference scoring, and structured output with verbatim quotes.

This guide shows the exact workflow — end to end, with real tool calls.

The 30-Minute Path

Four steps from “I need to test these taglines” to “here are the results”:

Step 1: Get an API key. Sign up at app.userintuition.ai/sign-up — Starter plan is free, 3 interviews included, no credit card.

Step 2: Connect your agent. One config block. Here are the two most common:

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "userintuition": {
      "command": "npx",
      "args": ["-y", "@userintuition-ai/mcp"],
      "env": {
        "USERINTUITION_API_KEY": "ui_sk_your_key_here"
      }
    }
  }
}

Cursor (Settings → MCP → Add Server):

{
  "userintuition": {
    "command": "npx",
    "args": ["-y", "@userintuition-ai/mcp"],
    "env": {
      "USERINTUITION_API_KEY": "ui_sk_your_key_here"
    }
  }
}

ChatGPT, VS Code, and custom agents follow the same pattern — point at https://mcp.userintuition.ai/mcp for Streamable HTTP/OAuth.
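If the Claude Desktop config file already lists other MCP servers, pasting a whole new block over it would drop them. The following is a minimal sketch, not an official installer, of merging the entry shown above into an existing config. The helper names `add_server` and `update_config_file` and the `other-tool` entry are hypothetical; the server entry itself is copied from the config in this guide.

```python
import json
from pathlib import Path

# Server entry copied from the config shown in this guide; the key is a placeholder.
USERINTUITION_SERVER = {
    "command": "npx",
    "args": ["-y", "@userintuition-ai/mcp"],
    "env": {"USERINTUITION_API_KEY": "ui_sk_your_key_here"},
}


def add_server(config: dict, name: str, server: dict) -> dict:
    """Return a copy of config with the server added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = server
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, "userintuition", USERINTUITION_SERVER)
    path.write_text(json.dumps(updated, indent=2) + "\n")


# In-memory example: a config that already has another (hypothetical) server.
example = add_server(
    {"mcpServers": {"other-tool": {"command": "other"}}},
    "userintuition",
    USERINTUITION_SERVER,
)
print(sorted(example["mcpServers"]))
```

Calling `update_config_file` with the path to `claude_desktop_config.json` would apply the same merge on disk; restart the client afterwards so it picks up the change.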

Step 3: Write your options. Three taglines work well. More than five dilutes the preference signal, because participants lose track of their reasoning across too many options.

Step 4: Run the study. Ask your agent:

“Run a preference study on these 3 taglines with 25 real people: [Option A], [Option B], [Option C]. Target audience is B2B SaaS buyers, product managers and growth leads.”

The agent calls ask_humans with mode: "preference", specifying your stimuli and context. Recruitment starts immediately from the vetted panel. Results arrive in 2-3 hours.
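Steps 3 and 4 can be sketched as a small payload builder that checks the options before the agent makes the call. This is an illustration under assumptions: the function name `build_preference_request` is hypothetical, the two-to-five limit simply encodes the guidance in Step 3, and the field names follow the ask_humans call used in this guide.

```python
def build_preference_request(stimuli, sample_size, context):
    """Build the arguments for an ask_humans preference study.

    The limits below encode this guide's advice; they are not
    documented platform constraints.
    """
    options = [s.strip() for s in stimuli if s.strip()]
    if len(options) < 2:
        raise ValueError("a preference study needs at least two options")
    if len(options) > 5:
        raise ValueError("more than five options dilutes the preference signal")
    if len(set(options)) != len(options):
        raise ValueError("options must be distinct")
    if sample_size < 1:
        raise ValueError("sample_size must be positive")
    return {
        "mode": "preference",
        "stimuli": options,
        "sample_size": sample_size,
        "context": context,
    }


request = build_preference_request(
    ["Option A", "Option B", "Option C"],
    sample_size=25,
    context="B2B SaaS buyers, product managers and growth leads",
)
print(request["mode"], len(request["stimuli"]))
```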

Real Example: 3 SaaS Taglines Tested With 25 People

Let’s walk through what this looks like end to end.

The brief: A SaaS research platform is choosing between three positioning taglines before a website redesign.

The stimuli:

Option A: “Customer research at the speed of your product”
Option B: “Know what customers think before you build”
Option C: “From question to customer evidence in 24 hours”

The call:

ask_humans({
  mode: "preference",
  stimuli: [
    "Customer research at the speed of your product",
    "Know what customers think before you build",
    "From question to customer evidence in 24 hours"
  ],
  sample_size: 25,
  context: "Landing page tagline for a B2B SaaS research platform targeting product managers and growth leads at Series A-C companies"
})

2-3 hours later, get_results returns:

{
  "headline_metric": {
    "winner": "From question to customer evidence in 24 hours",
    "distribution": {
      "From question to customer evidence in 24 hours": 0.52,
      "Know what customers think before you build": 0.28,
      "Customer research at the speed of your product": 0.20
    }
  },
  "driving_themes": [
    {
      "theme": "Specificity of the time promise",
      "prevalence": 0.68,
      "summary": "The '24 hours' framing made the value tangible and verifiable. Participants said it felt like a real commitment, not a slogan.",
      "evidence": [
        { "quote": "24 hours is specific. If you can actually do that, it changes how I think about fit-for-purpose research." },
        { "quote": "The first two could be any research tool. This one has a number I can hold you to." }
      ]
    },
    {
      "theme": "Option B's framing resonates with the pain but not the solution",
      "prevalence": 0.44,
      "summary": "Several participants liked Option B's acknowledgment of the guessing problem but felt it didn't differentiate on how it was solved.",
      "evidence": [
        { "quote": "This one gets the problem right. But every research tool says they'll help me understand customers — I need to know what makes this different." }
      ]
    }
  ],
  "minority_objections": [
    {
      "theme": "'Speed of your product' resonated with engineering-adjacent buyers",
      "prevalence": 0.20,
      "summary": "PMs with strong engineering backgrounds preferred Option A's framing around product velocity.",
      "evidence": [
        { "quote": "I'm always fighting for research budget at sprint planning. 'Speed of your product' speaks to that context better than a turnaround time." }
      ]
    }
  ],
  "recommendations": [
    "Lead with Option C — the 24-hour specificity is the strongest differentiator",
    "Consider a hybrid for engineering-adjacent audiences: 'Customer evidence in 24 hours — fast enough for sprint planning'",
    "Option B works better as a sub-header than a primary tagline — pair it with C"
  ]
}

That is the complete result. Preference split, themes, minority objection, and actionable recommendations — all traced to real participant quotes. The decision is now based on evidence, not opinion.
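Because the result is structured, an agent or a script can act on it directly. The sketch below reads the winner and its lead over the runner-up from an abbreviated copy of the result above. The function name `summarize_preference` is hypothetical, and the 10-point "close call" threshold comes from this guide's own sample-size advice rather than from the platform.

```python
# Abbreviated copy of the result above; only the fields used here are included.
result = {
    "headline_metric": {
        "winner": "From question to customer evidence in 24 hours",
        "distribution": {
            "From question to customer evidence in 24 hours": 0.52,
            "Know what customers think before you build": 0.28,
            "Customer research at the speed of your product": 0.20,
        },
    },
}


def summarize_preference(result, close_margin=0.10):
    """Return the winner, the runner-up, and the gap between them."""
    distribution = result["headline_metric"]["distribution"]
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    (winner, top_share), (runner_up, second_share) = ranked[0], ranked[1]
    margin = round(top_share - second_share, 4)
    return {
        "winner": winner,
        "runner_up": runner_up,
        "margin": margin,
        # Flag results where a larger sample may be worth the cost.
        "close_call": margin <= close_margin,
    }


summary = summarize_preference(result)
print(summary)
```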

What Does `ask_humans` Return?

The result structure is consistent across all study modes. For a preference study:

| Field | What it contains |
| --- | --- |
| `headline_metric.winner` | The winning option |
| `headline_metric.distribution` | Preference percentages per option |
| `driving_themes` | Ranked themes explaining the preference, each with prevalence score and verbatim evidence |
| `minority_objections` | Themes from participants who chose other options; surfaces edge cases |
| `recommendations` | Concrete suggested actions, generated from the pattern across all conversations |

Every evidence entry traces to a real participant conversation. The recommendations come from the pattern across all 25 conversations, not just one or two outlier quotes.
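Before an agent reasons over a result, it can be worth checking that the expected fields are present. The following is a hedged sketch: the field names come from the table above, while the function name `check_preference_result` and the assumption that the distribution sums to roughly 1 are mine, not documented behavior.

```python
import math

# Top-level fields listed in the table above.
REQUIRED_FIELDS = (
    "headline_metric",
    "driving_themes",
    "minority_objections",
    "recommendations",
)


def check_preference_result(result):
    """Return a list of problems; an empty list means the shape looks right."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in result
    ]
    headline = result.get("headline_metric", {})
    distribution = headline.get("distribution", {})
    if distribution:
        # Assumption: shares are fractions that add up to about 1.
        if not math.isclose(sum(distribution.values()), 1.0, abs_tol=0.01):
            problems.append("distribution does not sum to 1")
        if headline.get("winner") not in distribution:
            problems.append("winner is not one of the options")
    return problems


complete = {
    "headline_metric": {"winner": "C", "distribution": {"C": 0.52, "B": 0.28, "A": 0.20}},
    "driving_themes": [],
    "minority_objections": [],
    "recommendations": [],
}
incomplete = {k: v for k, v in complete.items() if k != "recommendations"}

print(check_preference_result(complete))
print(check_preference_result(incomplete))
```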

Edge Cases: When 25 Isn’t Enough?

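Twenty-five participants give you a clear signal on most decisions. Use the standard 25 unless:

The split is close. If two options are within 8-10 percentage points, go to 50. A 48%/44% split at n=25 is a one-vote difference (12 versus 11 participants) and carries substantial uncertainty. Doubling the sample to 50 narrows the margin of error, though a gap that small may still need a larger sample to call.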
The stakes are high. Brand repositioning, a new product name, a campaign headline for a 7-figure media buy — go to 50-100. The research cost is small relative to the downstream risk.
The audience is narrow. If your target is “CFOs at PE-backed companies with $50M+ ARR,” you may need a custom panel request. Standard preference studies work best with audiences the 4M+ panel reliably reaches.
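The effect of sample size on a close split can be estimated with a rough calculation. The sketch below uses a normal approximation for the gap between two options chosen by the same group of participants. It is a back-of-the-envelope aid, not the platform's scoring method; the function name `margin_between_options` is hypothetical, and the approximation is coarse at samples as small as 25.

```python
import math


def margin_between_options(p1, p2, n, z=1.96):
    """Approximate 95% margin of error on the gap p1 - p2.

    Both shares come from the same n participants, so the variance of
    the gap is (p1 + p2 - (p1 - p2)**2) / n.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)


for n in (25, 50, 100):
    margin = margin_between_options(0.48, 0.44, n)
    print(f"n={n}: observed gap 0.04, margin of error on the gap {margin:.2f}")
```

The margin shrinks with the square root of the sample size, so doubling the sample narrows it by roughly 30%.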

When to use claim or message modes instead:

claim mode: you have one statement you want to test for believability or credibility, not a competition between options. “AI-moderated interviews at 98% participant satisfaction” — is that believable?
message mode: you want to test a longer piece of copy (email, landing page section, ad) in context. Participants react to the full message, not just a tag or headline fragment.
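The choice between the three modes can be written as a simple rule. This is only a sketch of the guidance above: the mode names come from this guide, while the function `choose_mode` and its parameters are hypothetical.

```python
def choose_mode(option_count, is_long_copy=False):
    """Pick a study mode from the number of options and the kind of copy."""
    if option_count < 1:
        raise ValueError("at least one stimulus is required")
    if option_count >= 2:
        # Several competing options: ask participants to choose.
        return "preference"
    # A single stimulus: longer copy is tested in context, a short statement
    # is tested for believability.
    return "message" if is_long_copy else "claim"


print(choose_mode(3))
print(choose_mode(1))
print(choose_mode(1, is_long_copy=True))
```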

How Does User Intuition Handle Tagline Testing?

User Intuition’s agentic research platform is purpose-built for exactly this kind of quick, evidence-backed decision. Three things make it particularly well-suited for tagline work.

First, the ask_humans tool returns structured preference data immediately usable by an agent — not a raw transcript dump that requires human synthesis. The preference distribution, themes, and recommendations are all in a format the agent can parse, reason about, and act on in the same workflow that triggered the study.

Second, the 4M+ vetted panel across 50+ languages means you can test taglines with a globally representative sample or narrow to a specific audience profile (B2B buyers, a particular age range, category purchasers) without a separate recruitment process. Multi-layer fraud prevention (bot detection, duplicate suppression, professional respondent filtering) ensures that the 25 responses come from 25 genuine participants, and participant satisfaction averages 98%.

Third, every tagline test feeds the Intelligence Hub. An agent calling query_intelligence six months from now can query “what did we learn about time-specificity in positioning language?” and get the relevant themes back across all past studies — not just this one. Studies compound. The second tagline test is informed by the first. The third by both. This is the advantage of running research on a platform with institutional memory rather than one-off polling tools.

Try it on the Starter plan — 3 free interviews at app.userintuition.ai/sign-up, no credit card needed.

Note from the User Intuition Team

Human moderation, done well, is the gold standard. A skilled moderator reads silence, follows a half-thought, knows when to push and when to wait. The trouble is what that costs at scale: one moderator, one participant, one hour at a time — and by interview a hundred, even the best aren't asking the same questions they asked at interview one.

User Intuition keeps what makes great moderation great — the depth, the laddering, the patient probing — and removes what holds it back. The AI moderator ladders 5–7 levels deep on every interview, with no fatigue wall and no calendar to manage. It runs hundreds of conversations in parallel, so a study fills in hours instead of weeks. Setup takes five minutes: upload your study guide and we turn it into a plan, write the screener, recruit from our 4M+ panel, and launch. Every interview is automatically scored on Length, Depth, and Coverage; if it doesn't pass, you don't pay. No refund required.

Preview a real study output before you pay — the only platform in the industry that lets you evaluate the work first. A 5-interview study lands at $150 in 24 hours. Already convinced? Sign up and try with 3 free quality interviews.

Frequently Asked Questions

How do I test a tagline with an AI agent?

Connect your AI agent to User Intuition via MCP (one config block, one API key), then ask the agent to run a preference study on your tagline options. The agent calls ask_humans with mode: 'preference', specifying your stimuli and sample size. Results arrive in 2-3 hours with preference splits, driving themes, and verbatim quotes from real participants.

What does ask_humans mode preference return?

ask_humans with mode preference returns a structured result with: a headline metric showing which option won and the preference distribution, driving themes ranked by prevalence with supporting quotes, minority objections from participants who preferred other options, and actionable recommendations. Every finding traces to a real verbatim quote.

How many participants do I need for a tagline test?

25 participants is a reliable signal for most tagline decisions, enough to surface clear preferences and themes. If the preference split is close (within 10 percentage points) or if the stakes are high (brand repositioning, major campaign), go to 50. Use the dry_run parameter in ask_humans to estimate cost before committing.

Which AI agents can run tagline tests with User Intuition?

Any MCP-compatible agent: Claude Desktop, Claude Code, ChatGPT (with connected apps), Cursor, VS Code, and custom agents built on LangChain, CrewAI, or the OpenAI Agents SDK. All use the same configuration pattern: add the MCP server URL and your USERINTUITION_API_KEY.

What is the difference between preference, claim, and message modes?

Use preference mode when participants choose between multiple options (taglines, headlines, names). Use claim mode when you need to know if a single statement is believable or compelling. Use message mode when testing how a piece of marketing copy (email, ad, landing page section) lands in context. All three return the same structured format with themes and verbatim quotes.

How much does a 25-person tagline test cost?

A 25-participant audio interview preference study costs approximately $625 at the standard $25/interview rate (Pro plan). The Starter plan includes 3 free interviews with no credit card, which is useful for a small pilot before committing to a full sample.
