Concept screening is a rapid evaluation pass that reduces a large portfolio of product ideas to a shortlist of viable candidates before committing resources to full concept testing. Companies that screen first typically spend 40-60% less on total concept research while launching stronger products, because screening prevents weak ideas from consuming expensive full-test budgets. For the full treatment of concept testing methodology, see the concept testing complete guide.
The distinction between screening and testing is not simply one of rigor. They serve different decision functions. Screening answers “which of these ideas are worth developing further” while testing answers “how well does this developed concept perform and how should it be optimized.” Conflating the two leads to either over-investing in rough ideas that should be killed quickly or under-evaluating refined concepts that need detailed diagnostic feedback. The error is symmetric and expensive: skipping screening produces full-test budgets spent on ideas consumers do not want, while skipping full testing produces launches based on rough first-impression data that has not been validated against execution. The discipline of running both — in order, with different stimuli and different sample sizes — is what separates innovation programs that hit at sustainable rates from those that depend on luck.
Why Screen Before You Test
Most organizations generate far more product concepts than they can afford to test thoroughly. Without a systematic screening step, the selection of which concepts receive full testing relies on internal politics or arbitrary criteria.
The economics are compelling. Screening 15 concepts with AI-moderated interviews costs roughly $9,000-$15,000 total. Full testing those same 15 would cost $30,000-$75,000. Screening the 15, then full-testing the top 4, costs $17,000-$35,000 while concentrating resources on the most promising ideas. Screening also takes 24 hours versus one to two weeks for full testing.
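The arithmetic behind these totals can be reproduced in a few lines. The sketch below assumes the per-concept prices implied by the figures above ($600-$1,000 to screen, $2,000-$5,000 to full-test); the function names are illustrative, not part of any tool.

```python
# Per-concept price ranges (low, high) in dollars, implied by the totals above.
SCREEN_COST = (600, 1_000)
FULL_TEST_COST = (2_000, 5_000)


def cost_range(per_concept, n_concepts):
    """Scale a (low, high) per-concept price range to n_concepts."""
    low, high = per_concept
    return (low * n_concepts, high * n_concepts)


def screen_then_test(n_screened, n_advanced):
    """Total (low, high) cost of screening everything, then full-testing the survivors."""
    s_low, s_high = cost_range(SCREEN_COST, n_screened)
    t_low, t_high = cost_range(FULL_TEST_COST, n_advanced)
    return (s_low + t_low, s_high + t_high)


def savings_pct(with_screening, without_screening):
    """Percent saved at the low end and at the high end of the two ranges."""
    return tuple(
        round(100 * (1 - w / wo)) for w, wo in zip(with_screening, without_screening)
    )


screen_only = cost_range(SCREEN_COST, 15)    # (9000, 15000)
full_all = cost_range(FULL_TEST_COST, 15)    # (30000, 75000)
combined = screen_then_test(15, 4)           # (17000, 35000)
print(savings_pct(combined, full_all))       # (43, 53)
```

Under these assumptions the saving lands at roughly 43-53%, inside the 40-60% range quoted for teams that screen first. Changing the number of concepts advanced shows how quickly the saving erodes when too many concepts survive screening.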
The hidden cost that screening eliminates is the opportunity cost of organizational attention. When a brand team commits to full testing five concepts, those five concepts occupy strategic conversations, executive review meetings, and brand planning cycles for the duration of the test. If two of the five should have been killed at screening, the team has spent weeks of organizational bandwidth defending and analyzing concepts that were never going to launch. Screening surfaces those kills early and frees attention for the concepts that have a chance.
Designing the Screening Stimulus
Screening stimuli should be deliberately lower fidelity than full concept test stimuli. This is a feature, not a limitation. Low-fidelity stimuli test the underlying idea rather than the execution quality.
A screening stimulus consists of three to four sentences: a consumer insight establishing the need, a product description, a key benefit, and optionally a reason to believe. No visual design, no packaging, no branding.
Standardize the format across all concepts. When every concept follows the same template, differences in reactions reflect concept appeal rather than stimulus quality. Write in consumer language, not marketing language. Resist polishing favored concepts more than others, as unequal effort creates a self-fulfilling prophecy where better-written stimuli outperform.
The discipline of low-fidelity, equal-effort stimuli is harder than it looks. Brand teams instinctively over-invest in concepts they believe in, layering on visual mockups, claim hierarchies, and benefit-laddering language that read like a launch deck rather than a screening stimulus. The result is a screening study that confirms the team’s existing preferences rather than testing them. The fix is procedural: agree on a stimulus template before any individual concept is written, write all stimuli on the same day, and have a single person who is not the concept author do a final pass to normalize tone and length. The goal is for every stimulus to communicate the underlying idea at roughly equal clarity, so that consumer reactions reflect the idea rather than the writing.
Screening Methodology
Screening interviews take 15-20 minutes per concept compared to 30-45 minutes for full testing. Assign each concept to a separate sample cell of 30-50 verified category purchasers.
The interview explores four dimensions: relevance (does this address a real need), novelty (does it offer something new), clarity (can consumers understand it), and initial appeal (would they want to try it). AI-moderated interviews add diagnostic depth that surveys lack. When a respondent rates a concept as “not relevant,” the AI probes why, helping the team understand not just which concepts to cut but whether the underlying insight could be salvaged.
Score each concept on a consistent five-point scale across all four dimensions. Concepts above threshold on all four advance. Those below threshold on relevance or appeal are killed. Those that clear relevance and appeal but fall short on clarity or novelty are candidates for refinement.
The four-dimension model is deliberately spartan. Adding more dimensions at the screening stage produces composite scores that look precise but are actually less diagnostic — the team ends up averaging across noise. Relevance and appeal carry the most weight because they are the hardest dimensions to fix later: a concept that fails on relevance is solving a problem consumers do not have, and no amount of execution can repair that. Clarity and novelty are softer: clarity is a communication problem that better stimulus writing can solve in a refine cycle, and novelty is contextual to the category and competitive frame. Weight the dimensions in advance and document the rationale so that the scoring decisions are not relitigated when results arrive.
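As a rough illustration of scoring with pre-committed weights, the sketch below averages each dimension across respondents and then computes a weighted composite. The weights and the sample ratings are invented for the example; only the four dimensions and the five-point scale come from the text.

```python
from statistics import mean

DIMENSIONS = ("relevance", "novelty", "clarity", "appeal")

# Hypothetical weights, fixed before fieldwork; relevance and appeal weigh most.
WEIGHTS = {"relevance": 0.35, "appeal": 0.35, "clarity": 0.15, "novelty": 0.15}


def dimension_means(ratings):
    """Average each dimension across respondents.

    `ratings` is a list of dicts, one per respondent, mapping each
    dimension name to a 1-5 rating.
    """
    return {d: mean(r[d] for r in ratings) for d in DIMENSIONS}


def weighted_score(means, weights=WEIGHTS):
    """Weighted composite on the same 1-5 scale (weights must sum to 1)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(means[d] * weights[d] for d in DIMENSIONS)


# Three made-up respondents for one concept.
sample = [
    {"relevance": 4, "novelty": 3, "clarity": 5, "appeal": 4},
    {"relevance": 5, "novelty": 2, "clarity": 4, "appeal": 4},
    {"relevance": 3, "novelty": 4, "clarity": 3, "appeal": 4},
]
means = dimension_means(sample)
print(means, round(weighted_score(means), 2))
```

The composite is useful for ranking concepts within a round, but the per-dimension means should still be read individually, since a strong composite can hide a failure on relevance or appeal.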
Go, Refine, or Kill Criteria
Establishing decision criteria before seeing results prevents post-hoc rationalization. Define three outcome categories and the thresholds that determine them.
Go concepts score above threshold on all four dimensions and proceed to full concept testing with refined stimuli and deeper diagnostic questioning. Refine concepts clear the threshold on relevance and appeal but fall short on clarity or novelty: high relevance but low clarity signals a communication problem, not a concept problem. Refine candidates return to the ideation team with specific diagnostic feedback.
Kill concepts score below threshold on relevance, appeal, or both. Killing concepts is the primary value of screening: it prevents the organization from investing full-test resources in ideas consumers do not want. Limit each concept to one refine cycle. If it still does not meet go thresholds in the second round, kill it.
The political reality is that killing concepts is the hardest organizational move in innovation research. Concepts have internal sponsors, and sponsors fight for their concepts. The only defense is decision criteria committed to in writing before the data arrives. When the threshold is documented, the kill is impersonal — the concept did not clear the bar the team agreed to in advance, and the conversation is about the data rather than about the sponsor. Without that pre-commitment, every kill becomes a debate, and the discipline of screening collapses into negotiation.
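One way to make the pre-commitment concrete is to write the go, refine, and kill rules down as a function before fieldwork starts. The sketch below is a minimal version of those rules; the 3.5 threshold is a placeholder, and treating a score exactly at threshold as a pass is an assumption the team would need to settle in advance.

```python
THRESHOLD = 3.5  # hypothetical bar on the five-point scale; agree on your own in advance


def decide(means, threshold=THRESHOLD, refine_used=False):
    """Return 'go', 'refine', or 'kill' for one concept.

    `means` maps relevance, novelty, clarity, and appeal to the concept's
    average score. `refine_used` is True once the concept has already had
    its single refine cycle.
    """
    below = {d for d, score in means.items() if score < threshold}
    if not below:
        return "go"
    if below & {"relevance", "appeal"}:
        return "kill"
    # Only clarity and/or novelty missed: a communication or framing problem.
    return "kill" if refine_used else "refine"


strong = {"relevance": 4.2, "novelty": 3.8, "clarity": 4.0, "appeal": 4.1}
unclear = {"relevance": 4.2, "novelty": 3.8, "clarity": 3.0, "appeal": 4.1}
unwanted = {"relevance": 3.0, "novelty": 4.5, "clarity": 4.4, "appeal": 3.9}

print(decide(strong))                      # go
print(decide(unclear))                     # refine
print(decide(unclear, refine_used=True))   # kill
print(decide(unwanted))                    # kill
```

Because the function takes no input other than the scores and the refine history, the sponsor's identity has no way into the decision.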
Reducing Total Research Costs Through Screening
At AI-moderated pricing, screening one concept with 40 respondents costs approximately $1,000. Full testing costs $2,000-$5,000 per concept. Screening 15 concepts ($15,000) then full-testing the top 4 ($8,000-$20,000) totals $23,000-$35,000. Without screening, full-testing all 15 costs $30,000-$75,000.
Beyond direct cost savings, screening reduces opportunity cost. Concepts that survive screening reach full testing faster, accelerating time to launch in competitive CPG categories.
AI-Moderated Screening Interviews
AI-moderated interviews transform screening from a blunt sorting tool into a diagnostic instrument. The AI interviewer adapts follow-up questions to each respondent’s reactions, extracting more diagnostic information in 15 minutes than a static survey can.
Cross-conversation synthesis identifies patterns no single interview reveals. When respondents consistently compare the concept to a specific existing product, that reveals the competitive frame it will occupy. The evidence-traced output means kill decisions come with supporting consumer language that depersonalizes politically charged portfolio decisions.
The evidence-traced kill is the single most under-appreciated feature of AI-moderated screening. In traditional screening, a kill decision arrives in the form of a moderator’s summary judgment, which a concept’s internal champion can dispute as moderator bias. In AI-moderated screening, the kill comes with the consumer language that produced it — 25 verbatims from the screening sample, all expressing the same barrier in their own words. That evidence depersonalizes the conversation: the team is not arguing with the moderator, it is reading what consumers said. Concept champions who would have fought a moderator kill quietly accept a verbatim-backed kill, because the evidence is consumer voice rather than research interpretation.
Integrating Screening into the Innovation Process
Screening delivers maximum value when it becomes a standard phase in the product development process rather than an ad hoc activity.
Build screening into the innovation calendar with quarterly rounds. Connect it to the stage-gate system so no concept advances past ideation without passing the screening threshold.
Accumulate screening data over time. Each round adds to a growing database that reveals category-level patterns: which concept types consistently pass and which consistently fail. Kept in a searchable concept testing hub, this accumulated intelligence makes ideation itself more efficient. Track the hit rate from screening to market success to recalibrate the screening instrument over time.
The screening database also changes the economics of ideation. When the team can see — across 50+ historical concepts — that a specific framing approach consistently fails on relevance, ideation sessions can avoid it from the start. When a specific consumer insight reliably generates strong scoring concepts, the team can return to it as a fertile starting point. The accumulated signal compounds over cycles: a screening practice that started as a research methodology becomes, after three or four years, a category-knowledge asset that gives the brand a structural advantage over competitors who treat each round as a fresh start.
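A simple form of this lookup is a pass rate per framing tag across historical rounds. The sketch below assumes each stored record carries a list of framing tags and a final outcome; the record layout, tag names, and sample data are all hypothetical.

```python
from collections import defaultdict


def pass_rate_by_tag(records):
    """Share of concepts that reached 'go', grouped by framing tag.

    Each record is a dict with a 'tags' list and an 'outcome' of
    'go', 'refine', or 'kill'.
    """
    totals = defaultdict(lambda: [0, 0])  # tag -> [go_count, concept_count]
    for rec in records:
        for tag in rec["tags"]:
            totals[tag][1] += 1
            if rec["outcome"] == "go":
                totals[tag][0] += 1
    return {tag: go / n for tag, (go, n) in totals.items()}


# Made-up history from earlier screening rounds.
history = [
    {"concept": "A", "tags": ["convenience"], "outcome": "go"},
    {"concept": "B", "tags": ["convenience", "premium"], "outcome": "kill"},
    {"concept": "C", "tags": ["premium"], "outcome": "kill"},
    {"concept": "D", "tags": ["convenience"], "outcome": "go"},
]
rates = pass_rate_by_tag(history)
print(rates)
```

With only a handful of concepts per tag the rates are noisy, so a real version would likely report the count alongside each rate and ignore tags below a minimum sample.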
How Does Concept Screening Compare to Full Concept Testing?
Screening and full testing are distinct phases with different stimuli, sample sizes, costs, and decision functions. Confusing them is the most common reason organizations under-perform on innovation research investment.
| Dimension | Concept screening | Full concept testing |
|---|---|---|
| Decision function | Which ideas advance? | How does this concept perform, and how should it be optimized? |
| Stimulus fidelity | Low (3-4 sentences) | High (visual, copy, claims, price) |
| Sample size per concept | 30-50 | 100-200 |
| Interview length | 15-20 minutes | 30-45 minutes |
| Concepts evaluated | 10-15+ | 3-5 (post-screening survivors) |
| Cost per concept | $600-$1,000 | $2,000-$5,000 |
| Timeline | 24 hours | 1-2 weeks |
| Output | Go / refine / kill decision | Optimization roadmap + launch-readiness assessment |
| Risk if skipped | Full-test budget spent on weak ideas | Launches based on first-impression data |
The two phases are sequential, not substitutable. A screening pass tells you which ideas deserve full-test investment; a full test tells you how to refine those ideas for launch. Running only screening produces under-developed launches; running only full testing produces over-investment in ideas that should have been killed earlier. The combined sequence (screen 15, full-test 4) is the standard discipline for programs that hit at scale.
What Does a Good Screening Stimulus Actually Look Like?
A working screening stimulus runs three to four sentences and follows a consistent template across every concept in the round. Sentence one establishes the consumer insight or problem (“Most parents struggle to get protein into their kids without sugar”). Sentence two describes the product in plain language (“A line of protein-fortified yogurt pouches sweetened only with fruit”). Sentence three names the key benefit (“Each pouch delivers 8g of protein with no added sugar”). Optional sentence four is the reason to believe (“Made by a team of pediatric nutritionists using a clinically tested fortification process”). No visual, no packaging mockup, no claim hierarchy, no price.
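The template can be enforced mechanically so that every concept in a round is written to the same shape. The sketch below models the three-to-four-sentence stimulus as a small data class and flags stimuli whose length drifts from the rest of the round; the class name, the 25% tolerance, and the length check itself are assumptions for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class ScreeningStimulus:
    """One concept written to the shared three-to-four-sentence template."""

    name: str
    insight: str                              # sentence 1: consumer insight or problem
    description: str                          # sentence 2: the product in plain language
    benefit: str                              # sentence 3: the key benefit
    reason_to_believe: Optional[str] = None   # optional sentence 4

    def sentences(self) -> List[str]:
        parts = [self.insight, self.description, self.benefit, self.reason_to_believe]
        return [p.strip().rstrip(".") + "." for p in parts if p and p.strip()]

    def render(self) -> str:
        return " ".join(self.sentences())

    def word_count(self) -> int:
        return len(self.render().split())


def length_outliers(stimuli: List[ScreeningStimulus], tolerance: float = 0.25) -> List[str]:
    """Names of stimuli whose word count is more than `tolerance` away from the median."""
    mid = median(s.word_count() for s in stimuli)
    return [s.name for s in stimuli if abs(s.word_count() - mid) > tolerance * mid]


yogurt = ScreeningStimulus(
    name="protein yogurt pouches",
    insight="Most parents struggle to get protein into their kids without sugar",
    description="A line of protein-fortified yogurt pouches sweetened only with fruit",
    benefit="Each pouch delivers 8g of protein with no added sugar",
    reason_to_believe="Made by a team of pediatric nutritionists using a clinically tested fortification process",
)
print(yogurt.render())
```

Running `length_outliers` over a full round gives the person doing the final normalization pass a short list of stimuli to trim or expand before fieldwork.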
The reason for this minimalism is that screening tests the underlying idea. A concept that wins at screening on this 4-sentence stimulus has a real shot at consumer pull regardless of execution. A concept that needs polished visuals to test well at screening is a concept that depends on execution to work, which is a much weaker innovation bet. The 4-sentence test is harder for a concept to clear, which is exactly why it is the right filter at this stage. Teams that screen with high-fidelity stimuli routinely advance concepts that win on production values but cannot survive in-market when the consumer encounters them without the polished test environment.
What Goes Wrong When Teams Skip Screening?
Three failure modes are typical. First, the team full-tests too many concepts at moderate sample sizes, dilutes the budget across all of them, and ends up with statistically weak data on every option. The right move would have been to screen at low fidelity, kill three or four concepts cheaply, and concentrate the full-test budget on the survivors. Second, the team full-tests too few concepts (typically three) because the per-concept cost is high, and ends up with a development slate selected from a pre-narrowed candidate set. The concepts that were never written, or were killed in internal review without consumer evidence, may have been stronger than the three that made it to test. Third, the team treats internal preference as a proxy for screening. Brand and R&D leaders rank concepts by gut, the top-ranked concepts advance to full testing, and the screening signal that would have surfaced the consumer-led winners is never collected.
Screening is not optional and it is not the same thing as full testing. It is a discrete, cheap, fast pass that exists to make full testing more effective by removing the weak ideas before they consume real budget. The economics are clear: screening a concept costs $600-$1,000 against $2,000-$5,000 for a full test, so screening 15 concepts and full-testing the top 4 costs roughly half of what full-testing all 15 would. The strategic value is larger than the cost savings: screening protects the organization from full-testing ideas that internal politics has elevated past their consumer-evidence weight. Teams that adopt screening as a standard phase in their innovation calendar typically see hit-rate improvements within two cycles, because the development slate is finally being selected by consumer pull rather than internal championship. Combined with an AI-moderated platform, screening reaches an operational cost and timeline where running it on every innovation cycle becomes the default, not the exception.
How User Intuition Runs a Concept Screening Pass
Screening only delivers its 40-60% cost saving if it stays cheap and fast — the moment a screening pass starts behaving like a full test, the economics collapse. User Intuition keeps screening in its proper lane: AI-moderated interviews expose participants to multiple low-fidelity concepts in a single session, capture first reactions, and collect go/refine/kill signal across a 30-50 person sample per concept in 24 hours. The format is conversational rather than survey-based, so when a participant rates a concept “not relevant,” the AI probes why — giving the team the qualitative signal to tell genuine appeal from polite response.
The capability that resolves the hardest part of screening — killing a concept with an internal sponsor — is the evidence-traced kill. A kill arrives not as a moderator’s summary judgment the concept’s champion can dispute, but with the consumer verbatims that produced it: 25 participants expressing the same barrier in their own words. That depersonalizes the portfolio decision, which is what lets a screening practice survive the politics that otherwise turn every kill into a negotiation. Run every cycle, the accumulated screening data becomes a category-knowledge asset, and the screening pass itself sits ahead of full testing within the broader concept testing workflow. Teams that want to run a screening pass on a live concept portfolio can start with a demo.
When Should a Concept Get a Second Screening Round?
A concept earns a refine round when it scored above threshold on relevance and appeal but below threshold on clarity or novelty. The refine instruction is specific: rewrite the stimulus to address the diagnostic feedback, hold relevance and appeal anchors constant, and re-screen against the same threshold. Limit each concept to one refine cycle. If a refined version still does not clear the threshold, kill it — the underlying idea is weaker than the original scores suggested, or the team is not finding the right framing on this cycle. Document the refine attempt in the screening database so that future ideation can avoid repeating the same framing in a new wrapper.
For related guides, see the CPG innovation pipeline screening framework for the four-stage screen → evaluate → refine → portfolio model, concept test sample size guidance for the sizing math that anchors screening and full testing, and AI-moderated interviews vs. focus groups for CPG for the methodology comparison that determines which research instrument to use at each stage. To run a screening pass with verified category purchasers at $25 per interview and 24-hour turnaround, launch a study or book a demo.