
How to Test a Tagline With Real Customers From an AI Agent


Tagline testing used to take 2-3 weeks. Brief a research vendor, wait for panel recruitment, schedule moderated sessions, debrief, synthesize a report. By the time you had real customer signal, your launch date had already moved.

With an MCP-connected agent and the User Intuition agentic research platform, the same test takes 30 minutes of setup and 2-3 hours of fielding. The agent handles everything: participant recruitment from the 4M+ panel, AI-moderated conversations, preference scoring, and structured output with verbatim quotes.

This guide shows the exact workflow — end to end, with real tool calls.

The 30-Minute Path

Four steps from “I need to test these taglines” to “here are the results”:

Step 1: Get an API key. Sign up at app.userintuition.ai/sign-up — Starter plan is free, 3 interviews included, no credit card.

Step 2: Connect your agent. One config block. Here are the two most common:

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "userintuition": {
      "command": "npx",
      "args": ["-y", "@userintuition-ai/mcp"],
      "env": {
        "USERINTUITION_API_KEY": "ui_sk_your_key_here"
      }
    }
  }
}

Cursor (Settings → MCP → Add Server):

{
  "userintuition": {
    "command": "npx",
    "args": ["-y", "@userintuition-ai/mcp"],
    "env": {
      "USERINTUITION_API_KEY": "ui_sk_your_key_here"
    }
  }
}

ChatGPT, VS Code, and custom agents follow the same pattern — point at https://mcp.userintuition.ai/mcp for Streamable HTTP/OAuth.
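
If claude_desktop_config.json already lists other MCP servers, pasting the Claude Desktop block over the file would drop them. The following is a minimal Python sketch of merging the entry instead. The helper name add_mcp_server is hypothetical, and the sketch assumes the file uses the same top-level mcpServers key shown in the Claude Desktop example.

```python
import json
from pathlib import Path

# Entry copied from the Claude Desktop example; replace the placeholder key.
USERINTUITION_ENTRY = {
    "command": "npx",
    "args": ["-y", "@userintuition-ai/mcp"],
    "env": {"USERINTUITION_API_KEY": "ui_sk_your_key_here"},
}


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one MCP server entry into a config file, keeping existing servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the mcpServers object if it is missing, then add or update one entry.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example usage (macOS path from this guide), left commented out on purpose:
# path = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
# add_mcp_server(path, "userintuition", USERINTUITION_ENTRY)
```

Restart the client after editing the file so it picks up the new server entry.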

Step 3: Write your options. Three taglines work well. More than five dilutes the preference signal because participants lose track of their reasoning across too many options.

Step 4: Run the study. Ask your agent:

“Run a preference study on these 3 taglines with 25 real people: [Option A], [Option B], [Option C]. Target audience is B2B SaaS buyers, product managers and growth leads.”

The agent calls ask_humans with mode: "preference", specifying your stimuli and context. Recruitment starts immediately from the vetted panel. Results arrive in 2-3 hours.

Real Example: 3 SaaS Taglines Tested With 25 People

Let’s walk through what this looks like end to end.

The brief: A SaaS research platform is choosing between three positioning taglines before a website redesign.

The stimuli:

  • Option A: “Customer research at the speed of your product”
  • Option B: “Know what customers think before you build”
  • Option C: “From question to customer evidence in 48 hours”

The call:

ask_humans({
  mode: "preference",
  stimuli: [
    "Customer research at the speed of your product",
    "Know what customers think before you build",
    "From question to customer evidence in 48 hours"
  ],
  sample_size: 25,
  context: "Landing page tagline for a B2B SaaS research platform targeting product managers and growth leads at Series A-C companies"
})
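
For custom agents that assemble tool arguments in code, a small guard can catch a malformed option list before any interviews are fielded. This is a minimal Python sketch: build_preference_args is a hypothetical helper, the field names mirror the call above, and the two-to-five limit reflects this guide's advice rather than a documented platform constraint.

```python
def build_preference_args(stimuli, sample_size=25, context=""):
    """Build the argument dict for a preference-mode ask_humans call."""
    # Drop empty entries and surrounding whitespace.
    options = [s.strip() for s in stimuli if s and s.strip()]
    if len(set(options)) != len(options):
        raise ValueError("stimuli must be distinct")
    # Guideline from this guide, not an API rule: 2 to 5 options.
    if not 2 <= len(options) <= 5:
        raise ValueError("a preference study needs between 2 and 5 options")
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return {
        "mode": "preference",
        "stimuli": options,
        "sample_size": sample_size,
        "context": context,
    }


args = build_preference_args(
    [
        "Customer research at the speed of your product",
        "Know what customers think before you build",
        "From question to customer evidence in 48 hours",
    ],
    context="Landing page tagline for a B2B SaaS research platform",
)
print(args["mode"], len(args["stimuli"]), args["sample_size"])
# preference 3 25
```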

2-3 hours later, get_results returns:

{
  "headline_metric": {
    "winner": "From question to customer evidence in 48 hours",
    "distribution": {
      "From question to customer evidence in 48 hours": 0.52,
      "Know what customers think before you build": 0.28,
      "Customer research at the speed of your product": 0.20
    }
  },
  "driving_themes": [
    {
      "theme": "Specificity of the time promise",
      "prevalence": 0.68,
      "summary": "The '48 hours' framing made the value tangible and verifiable. Participants said it felt like a real commitment, not a slogan.",
      "evidence": [
        { "quote": "48 hours is specific. If you can actually do that, it changes how I think about fit-for-purpose research." },
        { "quote": "The first two could be any research tool. This one has a number I can hold you to." }
      ]
    },
    {
      "theme": "Option B's framing resonates with the pain but not the solution",
      "prevalence": 0.44,
      "summary": "Several participants liked Option B's acknowledgment of the guessing problem but felt it didn't differentiate on how it was solved.",
      "evidence": [
        { "quote": "This one gets the problem right. But every research tool says they'll help me understand customers — I need to know what makes this different." }
      ]
    }
  ],
  "minority_objections": [
    {
      "theme": "'Speed of your product' resonated with engineering-adjacent buyers",
      "prevalence": 0.20,
      "summary": "PMs with strong engineering backgrounds preferred Option A's framing around product velocity.",
      "evidence": [
        { "quote": "I'm always fighting for research budget at sprint planning. 'Speed of your product' speaks to that context better than a turnaround time." }
      ]
    }
  ],
  "recommendations": [
    "Lead with Option C — the 48-hour specificity is the strongest differentiator",
    "Consider a hybrid for engineering-adjacent audiences: 'Customer evidence in 48 hours — fast enough for sprint planning'",
    "Option B works better as a sub-header than a primary tagline — pair it with C"
  ]
}

That is the complete result. Preference split, themes, minority objection, and actionable recommendations — all traced to real participant quotes. The decision is now based on evidence, not opinion.
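
Because the result is plain JSON, an agent or script can rank the options and measure the winner's lead in a few lines. The sketch below assumes the result has already been parsed into a Python dict with the shape shown above; summarize_preference is a hypothetical helper name, and the sample dict copies only the distribution from this example.

```python
def summarize_preference(result: dict) -> str:
    """Rank options by share and describe the winner's lead over the runner-up."""
    distribution = result["headline_metric"]["distribution"]
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    (winner, top_share), (runner_up, second_share) = ranked[0], ranked[1]
    lead_points = round((top_share - second_share) * 100)
    return (
        f'"{winner}" leads "{runner_up}" by {lead_points} points '
        f"({top_share:.0%} vs {second_share:.0%})"
    )


sample_result = {
    "headline_metric": {
        "winner": "From question to customer evidence in 48 hours",
        "distribution": {
            "From question to customer evidence in 48 hours": 0.52,
            "Know what customers think before you build": 0.28,
            "Customer research at the speed of your product": 0.20,
        },
    }
}
print(summarize_preference(sample_result))
```

Here the lead is 24 points (52% versus 28%), which is well outside the 8-10 point band this guide treats as a close split.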

What Does ask_humans Return?

The result structure is consistent across all study modes. For a preference study:

| Field | What it contains |
| --- | --- |
| headline_metric.winner | The winning option |
| headline_metric.distribution | Preference share per option, as a fraction of 1 (0.52 = 52%) |
| driving_themes | Ranked themes explaining the preference, each with a prevalence score and verbatim evidence |
| minority_objections | Themes from participants who chose other options; surfaces edge cases |
| recommendations | Concrete suggested actions, generated from the pattern across all conversations |

Every evidence entry traces to a real participant conversation. The recommendations come from the pattern across all 25 conversations, not just one or two outlier quotes.
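
Before an agent acts on a result, it can be worth checking that the fields listed above are present. The following Python sketch does that under stated assumptions: validate_preference_result is a hypothetical helper, and the checks that shares sum to roughly 1 and that every theme carries evidence are inferred from the example output, not from a published schema.

```python
REQUIRED_FIELDS = (
    "headline_metric",
    "driving_themes",
    "minority_objections",
    "recommendations",
)


def validate_preference_result(result: dict, tolerance: float = 0.01) -> list[str]:
    """Return a list of problems found in a preference result (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in result]

    headline = result.get("headline_metric", {})
    distribution = headline.get("distribution", {})
    if distribution:
        total = sum(distribution.values())
        # Assumption: shares are fractions that add up to about 1.
        if abs(total - 1.0) > tolerance:
            problems.append(f"distribution sums to {total:.2f}, expected about 1.00")
        if headline.get("winner") not in distribution:
            problems.append("winner is not one of the options in distribution")

    themes = result.get("driving_themes", []) + result.get("minority_objections", [])
    for theme in themes:
        # Assumption: each theme should cite at least one verbatim quote.
        if not theme.get("evidence"):
            problems.append(f"theme without evidence: {theme.get('theme')}")
    return problems
```

An empty list means the result matches the shape described here; anything else is a reason to inspect the raw output before making a decision.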

Edge Cases: When Isn’t 25 Enough?

Twenty-five participants give you a clear signal on most decisions. Use the standard 25 unless:

  • The split is close. If two options are within 8-10 percentage points, go to 50. A 48%/44% split at n=25 is a one-person gap (12 participants versus 11), which is effectively a tie. Doubling the sample to 50 narrows the uncertainty and shows whether the lead holds.
  • The stakes are high. Brand repositioning, a new product name, a campaign headline for a 7-figure media buy — go to 50-100. The research cost is small relative to the downstream risk.
  • The audience is narrow. If your target is “CFOs at PE-backed companies with $50M+ ARR,” you may need a custom panel request. Standard preference studies work best with audiences the 4M+ panel reliably reaches.
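
To get a feel for how sample size affects a preference share, a rough margin of error can be computed with the standard normal approximation for a proportion. This is a back-of-the-envelope Python sketch, not the platform's own method: share_margin is a hypothetical helper, z = 1.96 corresponds to a 95% interval, and the approximation is coarse at small n.

```python
import math


def share_margin(share: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for one option's share (normal approximation)."""
    return z * math.sqrt(share * (1 - share) / n)


for n in (25, 50, 100):
    margin_points = share_margin(0.48, n) * 100
    print(f"n={n}: a 48% share is roughly +/-{margin_points:.0f} points")
# n=25: a 48% share is roughly +/-20 points
# n=50: a 48% share is roughly +/-14 points
# n=100: a 48% share is roughly +/-10 points
```

The margin shrinks with the square root of the sample size, so each further gain in precision costs proportionally more interviews. Treat these figures as a sanity check alongside the themes and quotes, which often settle a close call on their own.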

When to use claim or message modes instead:

  • claim mode: you have one statement you want to test for believability or credibility, not a competition between options. “AI-moderated interviews at 98% participant satisfaction” — is that believable?
  • message mode: you want to test a longer piece of copy (email, landing page section, ad) in context. Participants react to the full message, not just a tag or headline fragment.
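
The choice between the three modes can be written down as a simple rule. The Python sketch below is only an illustration of the guidance above: choose_mode is a hypothetical helper, and the long_form flag stands in for a human judgment about whether the stimulus is a full piece of copy.

```python
def choose_mode(stimuli: list[str], long_form: bool = False) -> str:
    """Pick a study mode from the number of stimuli and the kind of copy."""
    if not stimuli:
        raise ValueError("at least one stimulus is required")
    if len(stimuli) >= 2:
        # Several competing options: participants choose between them.
        return "preference"
    # A single stimulus: full copy in context, or one statement to test for credibility.
    return "message" if long_form else "claim"


print(choose_mode(["Tagline A", "Tagline B", "Tagline C"]))  # preference
print(choose_mode(["AI-moderated interviews at 98% participant satisfaction"]))  # claim
print(choose_mode(["Full onboarding email draft"], long_form=True))  # message
```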

How Does User Intuition Handle Tagline Testing?

User Intuition’s agentic research platform is purpose-built for exactly this kind of quick, evidence-backed decision. Three things make it particularly well-suited for tagline work.

First, the ask_humans tool returns structured preference data immediately usable by an agent — not a raw transcript dump that requires human synthesis. The preference distribution, themes, and recommendations are all in a format the agent can parse, reason about, and act on in the same workflow that triggered the study.

Second, the 4M+ vetted panel across 50+ languages means you can test taglines with a globally representative sample or narrow to a specific audience profile (B2B buyers, a particular age range, category purchasers) without a separate recruitment process. Multi-layer fraud prevention (bot detection, duplicate suppression, professional respondent filtering) is designed to ensure that the 25 responses come from 25 genuine people, with participant satisfaction averaging 98%.

Third, every tagline test feeds the Intelligence Hub. An agent calling query_intelligence six months from now can query “what did we learn about time-specificity in positioning language?” and get the relevant themes back across all past studies — not just this one. Studies compound. The second tagline test is informed by the first. The third by both. This is the advantage of running research on a platform with institutional memory rather than one-off polling tools.

Try it on the Starter plan — 3 free interviews at app.userintuition.ai/sign-up, no credit card needed.

Note from the User Intuition Team

Your research informs million-dollar decisions — we built User Intuition so you never have to choose between rigor and affordability. We price at $20/interview not because the research is worth less, but because we want to enable you to run studies continuously, not once a year. Ongoing research compounds into a competitive moat that episodic studies can never build.

Don't take our word for it — see an actual study output before you spend a dollar. No other platform in this industry lets you evaluate the work before you buy it. Already convinced? Sign up and try today with 3 free interviews.

Frequently Asked Questions

Connect your AI agent to User Intuition via MCP (one config block, one API key), then ask the agent to run a preference study on your tagline options. The agent calls ask_humans with mode: 'preference', specifying your stimuli and sample size. Results arrive in 2-3 hours with preference splits, driving themes, and verbatim quotes from real participants.
ask_humans with mode preference returns a structured result with: a headline metric showing which option won and the preference distribution, driving themes ranked by prevalence with supporting quotes, minority objections from participants who preferred other options, and actionable recommendations. Every finding traces to a real verbatim quote.
25 participants is a reliable signal for most tagline decisions — enough to surface clear preferences and themes. If the preference split is close (within 10 percentage points) or if the stakes are high (brand repositioning, major campaign), go to 50. Use the dry_run parameter in ask_humans to estimate cost before committing.
Any MCP-compatible agent: Claude Desktop, Claude Code, ChatGPT (with connected apps), Cursor, VS Code, and custom agents built on LangChain, CrewAI, or the OpenAI Agents SDK. All use the same configuration pattern: add the MCP server URL and your USERINTUITION_API_KEY.
Use preference mode when participants choose between multiple options (taglines, headlines, names). Use claim mode when you need to know if a single statement is believable or compelling. Use message mode when testing how a piece of marketing copy (email, ad, landing page section) lands in context. All three return the same structured format with themes and verbatim quotes.
A 25-participant audio interview preference study costs approximately $500 at the standard $20/interview rate (Pro plan). The Starter plan includes 3 free interviews with no credit card — useful for a small pilot before committing to a full sample.
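
The cost figure is simple arithmetic on the quoted rate. The Python sketch below tabulates it for the sample sizes discussed in this guide; estimate_study_cost is a hypothetical helper, it assumes a flat $20 per interview with no plan credits or discounts, and the dry_run parameter of ask_humans remains the way to get an actual estimate.

```python
RATE_PER_INTERVIEW_USD = 20  # standard rate quoted in this guide


def estimate_study_cost(sample_size: int, rate: int = RATE_PER_INTERVIEW_USD) -> int:
    """Estimate study cost in USD as sample size times a flat per-interview rate."""
    if sample_size < 0:
        raise ValueError("sample_size cannot be negative")
    return sample_size * rate


for n in (25, 50, 100):
    print(f"n={n}: ${estimate_study_cost(n):,}")
# n=25: $500
# n=50: $1,000
# n=100: $2,000
```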