Reference Deep-Dive · March 20, 2026 · 11 min read

CPG Innovation Pipeline Screening Framework

By Kevin, Founder & CEO

TL;DR

Most CPG innovation pipelines carry 10-15 concepts competing for 3-4 development slots, but traditional agency screening costs $250,000-$750,000 and takes 6-12 months, forcing teams to cut concepts by committee rather than consumer evidence. This framework replaces that process with a four-stage AI-moderated screening system that evaluates the entire pipeline for $18,000-$35,000 in 1-2 weeks. Stage 1 runs 30-50 interviews per concept to triage the full pipeline using go/kill thresholds for appeal, problem-solution fit, barrier severity, and differentiation. Stage 2 runs 100 interviews per surviving concept for full evaluation. Stage 3 refines borderline concepts with 50-100 additional interviews. Stage 4 applies a portfolio scoring matrix to make final slot allocation decisions. User Intuition's 4M+ panel delivers verified category purchasers at each stage, with results in 24 hours per concept batch. The framework includes the concept scoring matrix, go/no-go criteria at each stage, and portfolio-level prioritization methodology.

This framework covers how to screen CPG innovation pipelines from 10-15 concepts to 3-4 winners using consumer evidence. Most CPG teams enter their annual innovation cycle with more concepts than capacity, and the unspoken default is committee-led selection: brand directors advocate for their favorites, R&D pushes the technically interesting ideas, and the concepts that survive are the ones with internal champions rather than consumer pull. This framework replaces that politics with a sequenced consumer-evidence pass that anyone in the organization can defend. For the full innovation research methodology, see Product Innovation Research Template for CPG. For the complete concept testing guide, see Concept Testing for CPG.

The four stages build on each other. Stage 1 triages the entire pipeline at low cost so that no concept advances on a hunch. Stage 2 spends real evaluation budget only on the survivors. Stage 3 gives marginal concepts a second chance with targeted refinement. Stage 4 lifts out of individual-concept scoring to portfolio composition, which is where the actual development-slot decision happens. The economics matter here: running the full sequence ($18,000-$35,000) costs about as much as a single traditional agency concept test ($25,000-$50,000), while producing evidence on 10-15 ideas instead of one. Teams that adopt this framework typically discover that two or three of their internal favorites are killed in Stage 1, freeing development slots for ideas that internal champions had overlooked.

The Four-Stage Screening Process

Stage 1: Quick Screen (30-50 interviews per concept)

Objective: Rapid go/kill assessment on the full pipeline.

Method: AI-moderated interviews with 30-50 verified category purchasers per concept (monadic design).

Questions (compressed, 15-minute interview):

“Tell me your initial reaction to this concept.”
“Does this solve a real problem for you? What problem?”
“What is the biggest concern or hesitation you have?”
“How is this different from what is already available?”
“If this were on the shelf, would you stop and pick it up?”

Timeline: 24 hours per batch of 3-5 concepts. Full pipeline in 1-2 weeks.

Cost: $600-$1,000 per concept. $6,000-$15,000 for a 10-15 concept pipeline.

Go/No-Go Criteria:

Metric	Advance	Refine	Kill
Spontaneous appeal (% positive)	>60%	40-60%	<40%
Problem-solution fit (% real problem)	>50%	30-50%	<30%
Barrier severity (% dealbreaker)	<30%	30-50%	>50%
Differentiation (% articulate difference)	>40%	20-40%	<20%

Expected outcome: 10-15 concepts triage to 5-7 that advance, 3-5 that go to refine, and 3-5 that are killed.
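The go/no-go table can be applied mechanically once each concept has its four percentages. The sketch below is illustrative only: the thresholds are copied from the table, but the metric names and the rule for combining the four ratings into one decision (any Kill rating kills, all Advance ratings advance, anything else goes to refine) are assumptions, since the framework does not state how mixed results are resolved.

```python
from typing import Dict

# Thresholds from the Stage 1 table, stored as (advance_cutoff, kill_cutoff).
# Metric names are hypothetical labels for the four table rows.
THRESHOLDS = {
    "appeal": (60.0, 40.0),
    "problem_fit": (50.0, 30.0),
    "barrier": (30.0, 50.0),
    "differentiation": (40.0, 20.0),
}
# For barrier severity a LOWER percentage is better, so comparisons flip.
LOWER_IS_BETTER = {"barrier"}


def rate_metric(name: str, pct: float) -> str:
    """Return 'advance', 'refine', or 'kill' for a single metric."""
    advance_cut, kill_cut = THRESHOLDS[name]
    if name in LOWER_IS_BETTER:
        if pct < advance_cut:
            return "advance"
        if pct > kill_cut:
            return "kill"
        return "refine"
    if pct > advance_cut:
        return "advance"
    if pct < kill_cut:
        return "kill"
    return "refine"


def triage(metrics: Dict[str, float]) -> str:
    """Combine per-metric ratings. ASSUMED rule: the worst rating wins."""
    ratings = {rate_metric(name, pct) for name, pct in metrics.items()}
    if "kill" in ratings:
        return "kill"
    if ratings == {"advance"}:
        return "advance"
    return "refine"
```

A team that prefers a more forgiving rule (for example, allowing one Refine rating among the Advance concepts) would only need to change `triage`; the per-metric thresholds stay as published.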

The 15-minute interview length is deliberate. At Stage 1, the team is making a binary call (does this concept deserve full evaluation, or not), and that decision rarely benefits from more interview time. Longer interviews at this stage often produce false positives: respondents talk themselves into liking concepts they would never buy because the conversation invites elaboration. The five compressed questions force the respondent to react fast, the way they would react in a shelf moment. Stage 1 is also where the AI moderator’s consistency pays the biggest dividend. Across 30-50 interviews per concept on 10-15 concepts, that is 300-750 conversations evaluated against identical probing logic, a level of consistency no human moderator panel could match.

Stage 2: Deep Evaluation (100 interviews per surviving concept)

Objective: Full concept evaluation of the 5-7 survivors, using the complete concept testing discussion guide.

Method: 100 verified category purchasers per concept, 30-minute AI-moderated interviews.

Timeline: 24 hours per concept.

Cost: $2,000 per concept. $10,000-$14,000 for 5-7 concepts.

Assessment dimensions:

Motivation hierarchy (laddering from attribute to value)
Price-value perception
Competitive displacement potential
Barrier addressability
Repurchase likelihood indicators

Expected outcome: 5-7 concepts triage to 3-4 with strong consumer evidence for advancement.

The motivation hierarchy is where Stage 2 earns its budget. Stage 1 tells you whether a concept appeals on first reaction; Stage 2 tells you whether that appeal is rooted in something durable. The five-to-seven-level laddering — from attribute, to functional benefit, to emotional benefit, to identity, to underlying value — separates concepts that win on novelty (short-lived) from concepts that win on a value connection (durable). When 100 respondents ladder to the same underlying value across a concept, you have a defensible advancement case. When the ladders fragment across unrelated values, the concept is more polarizing than the Stage 1 scores suggested, and the team should weigh portfolio fit carefully before slot allocation.
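One way to make "ladder to the same underlying value" measurable is to count how concentrated the terminal values are. The sketch below assumes each interview has already been coded to a single terminal-value label; that coding step, the labels, and the sample counts are hypothetical. It reports the share held by the most common value and deliberately sets no cutoff, because the framework does not define one.

```python
from collections import Counter
from typing import List, Tuple


def value_convergence(terminal_values: List[str]) -> Tuple[str, float]:
    """Return the most common terminal value and its share of respondents."""
    if not terminal_values:
        raise ValueError("need at least one coded ladder")
    counts = Counter(terminal_values)
    top_value, top_count = counts.most_common(1)[0]
    return top_value, top_count / len(terminal_values)


# Hypothetical coded output: one terminal value per respondent (n = 100).
coded = ["family health"] * 62 + ["convenience"] * 23 + ["status"] * 15
value, share = value_convergence(coded)
print(f"{value}: {share:.0%} of ladders")  # family health: 62% of ladders
```

A high top share points to the convergent case described above; a flat distribution across unrelated values points to the fragmented case.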

Stage 3: Refinement Testing (50-100 interviews per refined concept)

Objective: Test modified versions of concepts that showed potential but had addressable barriers.

Method: 50-100 interviews testing the refined concept versus the original.

Timeline: 24 hours.

Cost: $1,000-$2,000 per concept.

Key question: Did the refinement address the barrier without weakening the core appeal?

Stage 3 is the most under-used part of the framework. Most teams treat the refine bucket as a junk drawer — concepts that did not quite clear the Stage 2 bar but felt too promising to kill — and then never actually re-screen them. The discipline of running the refined-versus-original side-by-side test, even at half the sample size, separates concepts where a small wording or claim adjustment unlocks the appeal from concepts where the underlying idea was the problem all along. Common refinement targets include simplifying the value proposition, shifting the occasion claim, addressing the most-cited barrier directly in the concept copy, or repositioning against a different competitive frame. Each of those changes is testable in 24 hours at $1,000-$2,000.
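The key question for Stage 3 (did the refinement address the barrier without weakening the core appeal?) can be written as a two-part check on the refined and original cells. This is a minimal sketch: the field names and the 5-point tolerance are assumptions added for illustration, not thresholds from the framework.

```python
from dataclasses import dataclass


@dataclass
class CellResult:
    """Headline percentages for one test cell (original or refined)."""
    appeal_pct: float   # % positive spontaneous appeal
    barrier_pct: float  # % citing the targeted barrier as a dealbreaker


def refinement_worked(original: CellResult, refined: CellResult,
                      tolerance_pts: float = 5.0) -> bool:
    """True if the barrier fell and appeal held.

    tolerance_pts is an ASSUMED noise allowance in percentage points:
    the barrier must fall by more than it, and appeal may drop by at
    most that much.
    """
    barrier_fell = refined.barrier_pct < original.barrier_pct - tolerance_pts
    appeal_held = refined.appeal_pct >= original.appeal_pct - tolerance_pts
    return barrier_fell and appeal_held
```

With 50-100 interviews per cell, small differences are noisy, so the tolerance should be set with the sample size in mind rather than left at the placeholder value.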

Stage 4: Portfolio Decision

Objective: Select the 3-4 concepts for full development investment.

Inputs: Consumer evidence from Stages 1-3, plus business feasibility data (margin, supply chain, distribution, cannibalization risk).

Stage 4 is where the team explicitly stops asking “which concept won?” and starts asking “which portfolio of 3-4 concepts wins?” Those are different questions. A concept that ranks third on absolute consumer evidence may belong in the development slate because it opens a segment the top two ignore, or because its margin profile balances a high-risk top-ranked concept. Conversely, two top-ranked concepts that target identical occasions and demographics may cannibalize each other on launch, and one should be deferred. The consumer evidence from Stages 1-3 narrows the candidate set; Stage 4 is where business judgment finishes the job.

Concept Scoring Matrix

For each concept that reaches Stage 2, score on these dimensions:

Dimension	Weight	Weighted Score
Consumer appeal strength	25%
Problem-solution fit	20%
Motivation depth (value connection)	15%
Competitive differentiation	15%
Barrier addressability	10%
Price-value acceptance	10%
Repurchase indicators	5%
Total	100%	/5.00

Score interpretation:

4.0+: Strong advance. High confidence in consumer demand.
3.0-3.9: Conditional advance. Strong in some areas but has gaps to address.
2.0-2.9: Requires significant refinement. Re-screen after modification.
<2.0: Kill. Consumer evidence does not support advancement.

The weighting itself is a strategic statement. A brand premiumizing its portfolio should weight competitive differentiation and price-value acceptance more heavily than a value-tier extension; a brand defending market share against a new entrant should weight problem-solution fit and consumer appeal strength. The mistake to avoid is keeping default weights across every cycle: the matrix should reflect the specific strategic question this round of innovation is meant to answer. Document the weighting rationale alongside the scores so that future cycles can compare results against intent rather than against shifting goalposts.
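The matrix arithmetic is a plain weighted sum. The sketch below uses the default weights from the table and the published score bands; it assumes each dimension is rated on a 1-5 scale (implied by the /5.00 total but not stated outright), and the dimension keys and sample ratings are made up for illustration.

```python
# Default weights from the Concept Scoring Matrix (keys are hypothetical names).
WEIGHTS = {
    "consumer_appeal": 0.25,
    "problem_solution_fit": 0.20,
    "motivation_depth": 0.15,
    "competitive_differentiation": 0.15,
    "barrier_addressability": 0.10,
    "price_value_acceptance": 0.10,
    "repurchase_indicators": 0.05,
}


def weighted_score(ratings: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of ratings (ASSUMED 1-5 scale), rounded to 2 decimals."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 100%")
    return round(sum(ratings[d] * weights[d] for d in weights), 2)


def interpret(score: float) -> str:
    """Map a total score to the bands in the score interpretation list."""
    if score >= 4.0:
        return "Strong advance"
    if score >= 3.0:
        return "Conditional advance"
    if score >= 2.0:
        return "Requires significant refinement"
    return "Kill"


# Illustrative ratings for one concept.
example = {
    "consumer_appeal": 4, "problem_solution_fit": 4, "motivation_depth": 3,
    "competitive_differentiation": 5, "barrier_addressability": 3,
    "price_value_acceptance": 4, "repurchase_indicators": 3,
}
score = weighted_score(example)
print(score, interpret(score))  # 3.85 Conditional advance
```

Passing a different `weights` dictionary is how a cycle-specific weighting would be applied; the sum-to-100% check catches the common mistake of raising one weight without lowering another.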

Portfolio-Level Prioritization

After individual scoring, assess the portfolio:

Coverage: Do the 3-4 winners address different consumer segments or occasions? A portfolio of concepts that all target the same segment creates cannibalization risk.
Risk balance: Does the portfolio include both incremental (low risk, moderate upside) and breakthrough (higher risk, high upside) concepts?
Cross-concept patterns: What themes emerged across concepts? If consumers consistently value a specific attribute across multiple concepts, that is a category-level insight that should inform all future innovation.

The Intelligence Hub surfaces these cross-concept patterns automatically when all screening data is stored in the same system.

Portfolio composition is also where margin and operational reality enter the decision explicitly. A concept that scored 4.2 on consumer evidence but requires a new supply chain and carries a 30% margin profile may rank below a 3.6-scoring concept with existing supply and a 55% margin. Stage 4 forces the team to make that tradeoff transparently, with consumer evidence and business feasibility on the same page rather than in separate slide decks. The output is a development slate the CFO can defend to the board and the CMO can defend to the brand teams whose concepts did not advance.
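The coverage check can be partly automated by flagging concepts in a candidate slate that target the same segment and occasion. This is a rough sketch under stated assumptions: each concept is tagged with exactly one segment and one occasion, and the concept names and tags are hypothetical. Real overlap is a judgment call that a flag like this only prompts.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def overlap_groups(slate: Dict[str, Tuple[str, str]]) -> List[List[str]]:
    """Group concepts that share the same (segment, occasion) pair.

    Returns only groups of two or more concepts, which are the
    candidates for a cannibalization review.
    """
    by_target = defaultdict(list)
    for concept, target in slate.items():
        by_target[target].append(concept)
    return [sorted(names) for names in by_target.values() if len(names) > 1]


# Hypothetical candidate slate: concept -> (segment, occasion).
slate = {
    "Concept A": ("young families", "breakfast"),
    "Concept B": ("young families", "breakfast"),
    "Concept C": ("students", "afternoon snack"),
}
print(overlap_groups(slate))  # [['Concept A', 'Concept B']]
```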

Total Pipeline Screening Cost

Stage	Per Concept	Concepts	Total
Stage 1: Quick screen	$600-$1,000	10-15	$6,000-$15,000
Stage 2: Deep evaluation	$2,000	5-7	$10,000-$14,000
Stage 3: Refinement testing	$1,000-$2,000	2-3	$2,000-$6,000
Total			$18,000-$35,000

Compare to traditional agency screening of the same pipeline: $250,000-$750,000 over 6-12 months. The headline number understates the operational impact: the AI-moderated framework also unlocks a parallel-fielding model in which all 10-15 concepts go to panel simultaneously, so the team has a comparable view of every concept on the same day rather than carrying early concepts in memory for three months while later concepts field. That single change — parallel rather than sequential evaluation — is often more valuable than the cost savings, because it eliminates the recency bias that quietly shapes most committee decisions.
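The totals in the cost table follow directly from the per-concept prices and concept counts. A short sketch that reproduces the arithmetic, using only the figures from the table (the function and variable names are illustrative):

```python
# (per-concept low, per-concept high, concepts low, concepts high), from the table.
STAGES = {
    "Stage 1: Quick screen": (600, 1_000, 10, 15),
    "Stage 2: Deep evaluation": (2_000, 2_000, 5, 7),
    "Stage 3: Refinement testing": (1_000, 2_000, 2, 3),
}


def stage_range(cost_lo, cost_hi, n_lo, n_hi):
    """Cheapest case: low price x fewest concepts; costliest: high x most."""
    return cost_lo * n_lo, cost_hi * n_hi


def pipeline_total(stages=STAGES):
    """Sum the low ends and the high ends across all stages."""
    lows, highs = zip(*(stage_range(*row) for row in stages.values()))
    return sum(lows), sum(highs)


print(pipeline_total())  # (18000, 35000)
```

Swapping in a team's own concept counts gives a budget range for a specific pipeline rather than the generic 10-15 concept case.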

How Does This Compare to Traditional Agency Screening?

The cost gap is the headline number, but the operational differences are what change the innovation cycle. Traditional agency screening is sequenced because each concept is a discrete engagement: recruit, schedule, moderate, transcribe, analyze, report. Running 15 concepts in parallel through a single agency is logistically impossible without doubling fees, so teams batch concepts and screen 3-4 at a time over 8-12 weeks. That sequencing forces early concepts to compete against later concepts in memory rather than on evidence — and forces development decisions before the full pipeline has been evaluated. Running all 15 concepts simultaneously on User Intuition’s 4M+ panel, in 24 hours per batch, produces a side-by-side evaluation no agency can offer.

Dimension	AI-moderated screening	Traditional agency screening
Cost (15-concept pipeline)	$18,000-$35,000	$250,000-$750,000
Timeline (full pipeline)	1-2 weeks	6-12 months
Parallel concepts	All 15 simultaneous	3-4 at a time, sequenced
Interviews per concept (Stage 1)	30-50	8-12
Moderator consistency	Identical AI logic across every interview	Variable across human moderators and sessions
Knowledge persistence	Searchable Intelligence Hub	Static report on a shared drive
Iteration speed	Re-screen refined concept in 24h	Re-engage agency, 4-8 week cycle
Per-concept cost (Stage 1)	$600-$1,000	$25,000-$50,000

For the complete concept testing guide, see Concept Testing for CPG. Related guides (concept screening before full testing, concept test sample size, and AI-moderated interviews vs. focus groups for CPG) cover the screening, sizing, and methodology questions this framework assumes are already settled.

What Goes Wrong When Teams Skip Stage 1?

The single most common failure mode in CPG innovation is collapsing Stages 1-3 into a single agency engagement that evaluates 4-5 concepts in moderate depth. The economic argument seems sound — fewer engagements, less coordination — but the strategic cost is large. When a team commits to deep evaluation on five concepts pre-selected by committee, they have already made the most important decision (which five) on the weakest evidence (internal preference). The Stage 1 quick screen exists specifically to reverse that order: let consumer evidence select the five, then commit deep evaluation budget to them.

Skipping Stage 1 also masks an asymmetry that matters at the portfolio level. In any 10-15 concept pipeline, two to three concepts will produce evidence so weak that they should never have reached deep evaluation. Without a quick screen, those concepts still consume full-evaluation budget — typically 30-40% of total spend — and crowd out the marginal concepts in Stages 2 and 3 that could have benefited from refinement. Teams that report disappointing innovation hit rates often have a Stage 1 problem, not a launch problem.

A CPG innovation pipeline is a portfolio decision dressed up as a sequence of concept decisions, and the framework that wins is the one that respects that. Stage 1 exists because committee selection is faster than evidence selection, and every concept that survives committee selection without quick-screen evidence is a bet placed against the market on the basis of internal politics. Stage 2 exists because appeal alone is not durability, and the laddering depth that separates novelty wins from value wins is where pre-launch confidence is earned. Stage 3 exists because most concepts are not killed by their core idea — they are killed by a specific barrier that targeted refinement could address. Stage 4 exists because portfolio composition is the actual development decision, and ranking individual concepts on absolute appeal often produces a slate that cannibalizes itself on launch. Run the four stages in sequence and the politics fade behind the evidence.

Running the four stages on User Intuition

The framework only works if every concept in the pipeline can be fielded at once and re-fielded in days; sequence the screening and the recency bias the framework exists to defeat creeps right back in. User Intuition makes the parallel model the default: all 10-15 concepts go to panel simultaneously as separate monadic studies, each drawing verified category purchasers, with batches returning in 24 hours so the team sees every concept side-by-side on the same day rather than carrying early ideas in memory for a quarter. For product innovation screening specifically, the capability that changes the decision is iteration speed at Stage 3: a refined concept can be tested against its original within two days, so the refine bucket stops being a junk drawer and becomes a real second-chance gate. Because every screening interview is moderated by the same AI logic, with identical probing across 300-750 Stage 1 conversations, the go/kill thresholds compare cleanly, and all of it persists in one hub where cross-concept patterns surface as category-level insight for the next cycle. To see how a full pipeline screen is structured before you commit a development slate, book a demo and review a worked four-stage example.

What Should the Output of Stage 4 Look Like?

The Stage 4 deliverable is not a ranked list. It is a development slate — typically 3-4 concepts — with a written rationale that ties each slot to specific consumer evidence, portfolio role, and business case. Each entry should answer four questions: what consumer problem does this concept solve, which segment is it for, how does it complement the other concepts in the slate rather than overlap them, and what is the launch risk profile (incremental, moderate, breakthrough). Teams that produce that deliverable have built a defensible plan; teams that produce a ranked list with no portfolio logic have built a list of favorites with consumer-evidence cover.
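The four questions each slate entry should answer map naturally onto a small record type. The sketch below is one possible shape, not a prescribed template: the class and field names are assumptions, and the only rule it enforces is that no answer is left blank.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RiskProfile(Enum):
    INCREMENTAL = "incremental"
    MODERATE = "moderate"
    BREAKTHROUGH = "breakthrough"


@dataclass
class SlateEntry:
    """One development slot and the rationale behind it."""
    concept: str
    consumer_problem: str  # what consumer problem does this concept solve?
    target_segment: str    # which segment is it for?
    portfolio_role: str    # how does it complement the rest of the slate?
    risk: RiskProfile      # launch risk profile

    def missing_fields(self) -> List[str]:
        """Names of text fields left blank."""
        text_fields = ("concept", "consumer_problem",
                       "target_segment", "portfolio_role")
        return [name for name in text_fields
                if not getattr(self, name).strip()]


# Hypothetical entry with the portfolio rationale not yet written.
draft = SlateEntry(
    concept="Concept A",
    consumer_problem="No low-sugar option for school lunchboxes",
    target_segment="Parents of children aged 5-10",
    portfolio_role="",
    risk=RiskProfile.INCREMENTAL,
)
print(draft.missing_fields())  # ['portfolio_role']
```

A ranked list with no portfolio logic would show up here as entries whose `portfolio_role` is empty, which is the gap the paragraph above warns about.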

The framework also creates an audit trail. Six months after launch, when results come in, the team can look back at Stage 1 and Stage 2 evidence for each concept and ask which signals predicted launch performance and which did not. That feedback loop is how the framework improves over time — the Intelligence Hub surfaces the patterns automatically when all screening data is stored in the same system, and the team’s next pipeline is screened against the lessons of the last one. Over three or four innovation cycles, the framework starts producing not just better individual decisions but a sharper internal model of what wins in the category.

For the full CPG innovation research framework, see Product Innovation Research Template for CPG. For agency-specific discussion-guide patterns, see agency concept testing discussion guide template. To screen your innovation pipeline with verified purchasers, launch a study or book a demo.

Note from the User Intuition Team

Human moderation, done well, is the gold standard. A skilled moderator reads silence, follows a half-thought, knows when to push and when to wait. The trouble is what that costs at scale: one moderator, one participant, one hour at a time — and by interview a hundred, even the best aren't asking the same questions they asked at interview one.

User Intuition keeps what makes great moderation great — the depth, the laddering, the patient probing — and removes what holds it back. The AI moderator ladders 5–7 levels deep on every interview, with no fatigue wall and no calendar to manage. It runs hundreds of conversations in parallel, so a study fills in hours instead of weeks. Setup takes five minutes: upload your study guide and we turn it into a plan, write the screener, recruit from our 4M+ panel, and launch. Every interview is automatically scored on Length, Depth, and Coverage; if it doesn't pass, you don't pay. No refund required.

Preview a real study output before you pay — the only platform in the industry that lets you evaluate the work first. A 5-interview study lands at $150 in 24 hours. Already convinced? Sign up and try with 3 free quality interviews.

Frequently Asked Questions

What does a four-stage CPG innovation screening process look like?

Stage 1 runs a quick screen of the full pipeline, with 30-50 AI-moderated interviews per concept scored against go/kill thresholds for appeal, problem-solution fit, barrier severity, and differentiation. Stage 2 runs a deep evaluation of the survivors with 100 interviews per concept. Stage 3 re-tests refined versions of borderline concepts with 50-100 interviews each. Stage 4 prioritizes across the portfolio, combining concept scores with strategic fit and business feasibility before allocating development resources.

How does the concept scoring matrix work in an innovation pipeline screen?

The scoring matrix evaluates each concept that reaches Stage 2 against seven weighted dimensions (consumer appeal strength, problem-solution fit, motivation depth, competitive differentiation, barrier addressability, price-value acceptance, and repurchase indicators) and assigns a composite score out of 5.00. Weighting can be adjusted by strategic priority: a brand targeting premiumization weights differentiation and price-value acceptance more heavily than a value-brand extension.

Why does AI-moderated screening reduce CPG innovation pipeline costs so dramatically compared to traditional methods?

Traditional agency-run concept testing runs $25,000-$50,000 per concept because it bundles recruiter fees, moderator time, facility rental, and analyst hours. AI-moderated screening automates each of those cost centers: automated recruitment, AI moderation, and platform-generated analysis bring the cost to $600-$1,000 per concept for a quick screen and $2,000 per concept for a deep evaluation, while delivering equivalent or richer qualitative depth.

Can User Intuition screen an entire CPG innovation pipeline simultaneously?

Yes. Because User Intuition fields interviews in parallel across its 4M+ panel, teams can run all 10-15 concepts in a pipeline concurrently, with results returning in 24 hours per batch. This compresses a screening cycle that traditionally ran in sequenced 8-12 week batches into 1-2 weeks, allowing teams to make development allocation decisions before competitive windows close.

How do you prioritize across concepts at the portfolio level after individual concept scores are in?

Portfolio-level prioritization layers concept scores against strategic filters: white space coverage, cannibalization risk, and margin profile. A concept that scores highly on appeal but overlaps heavily with an existing SKU may rank below a moderate-scoring concept that opens a new occasion or demographic. The framework combines consumer evidence with business logic rather than treating score rank as the final answer.