Concept Testing for Household Cleaning CPG Products

By Kevin

Household cleaning is a CPG category where concept testing must account for dimensions that most research methodologies overlook: the deeply ingrained sensory cues that signal whether a product works, the efficacy anxiety that drives brand loyalty, and the emerging tension between sustainability aspirations and cleaning performance expectations. A concept that scores well on appeal but fails to address these category-specific dynamics will underperform at shelf regardless of its innovation merit.

This guide covers a concept testing methodology designed specifically for household cleaning products, addressing the unique challenges of testing efficacy-dependent concepts, navigating the sustainability-performance trade-off, and validating format innovations in a category where consumer habits are extraordinarily sticky.


The Efficacy Perception Problem

Household cleaning products face a concept testing challenge that most other CPG categories do not: consumers cannot verify the core product claim until after purchase. You can taste a food product at a demo station. You can smell a fragrance on a test strip. But you cannot know whether an all-purpose cleaner removes grease better than your current product until you take it home and use it.

This means that concept testing for cleaning products is fundamentally a test of perceived efficacy, not actual efficacy. The research must measure whether consumers believe the product will work, and what drives or undermines that belief.

The Perceived Efficacy Framework identifies four consumer belief drivers that concept testing must probe:

Ingredient Signals. Consumers use ingredient information as a proxy for cleaning power, even when they do not understand the chemistry. “Contains bleach” communicates power for bathroom cleaning. “Plant-based surfactants” communicates gentleness for kitchen surfaces. The concept test must probe whether the ingredient story matches consumer expectations for the cleaning task.

Brand Transfer Equity. When an established cleaning brand extends into a new segment, consumers transfer their beliefs about the parent brand to the new product. This transfer can be positive (“if Lysol makes it, it must kill germs”) or negative (“that brand makes gentle products, I don’t trust it for heavy-duty cleaning”). AI-moderated interviews can map these brand permission boundaries in more depth than closed-ended survey questions typically allow.

Format-Efficacy Associations. Consumers hold strong, often unconscious beliefs about which product formats are effective for which tasks. Sprays are trusted for surface cleaning. Powders signal deep cleaning. Wipes suggest convenience but not heavy-duty performance. A format innovation must either align with these associations or deliberately overcome them through messaging and evidence.

Visual and Sensory Cues. Product color, label design, bottle shape, and especially scent all communicate efficacy before the product is used. Concept testing must explore these sensory expectations explicitly, because a formulation that is chemically effective but sensorially wrong will be perceived as ineffective.


Scent as a Category-Specific Testing Dimension

Few CPG categories depend on a single sensory dimension as heavily as household cleaning depends on scent. Research from the American Cleaning Institute has found that scent is the second most important purchase driver after cleaning performance, ranking above price and brand in many segments.

Concept testing for cleaning products must treat scent as a first-class testing variable, not an afterthought. The Scent Expectation Protocol embeds scent exploration into every concept interview:

Step 1: Baseline Scent Mapping. Before introducing the concept, the interview establishes the consumer’s scent expectations for the category and occasion. “When you finish cleaning your bathroom, what should it smell like? What scent tells you the room is clean?” These baseline expectations define the target for the new concept’s scent profile.

Step 2: Concept Scent Alignment. When the concept is introduced, probe whether the described or implied scent matches the consumer’s expectation. If the concept describes a “lavender-scented bathroom cleaner,” does that scent align with what the consumer associates with bathroom cleanliness? Some consumers associate lavender with relaxation, not cleaning. This misalignment would undermine purchase intent regardless of actual efficacy.

Step 3: Scent Trade-Off Exploration. For concepts that emphasize natural or sustainable ingredients, the interview probes how scent expectations change. Consumers accustomed to strong chemical scents as a signal of cleaning power may perceive unscented or naturally scented products as less effective. The AI-moderated depth interview captures these trade-offs in the consumer’s own language, giving formulation teams specific guidance.

Step 4: Scent in Context. The same scent may be evaluated differently depending on the room, the cleaning task, and who else is in the home. Interviews explore these contextual preferences to prevent the common mistake of selecting a single scent profile for a product used across multiple occasions.

The output of the Scent Expectation Protocol is a scent brief that bridges consumer expectations and R&D capabilities. This brief specifies not just which scent families are preferred but why, in which contexts, and with what performance associations.


The Sustainability-Performance Tension

No category illustrates the consumer tension between sustainability and performance more starkly than household cleaning. In surveys, consumers consistently report wanting sustainable cleaning products, yet sustainable cleaning brands still trail their conventional competitors in market share. This gap between stated preference and actual behavior is where concept testing must focus.

AI-moderated interviews reveal the three consumer segments that brands must understand and target:

The Sustainability-First Segment (20-25% of category buyers). These consumers have already adopted sustainable cleaning products and accept potential performance trade-offs. For this segment, concept testing focuses on whether the sustainability credentials are credible and differentiated from existing green alternatives. The risk is not performance skepticism but green fatigue: another product making the same claims as the five sustainable cleaners they have already tried.

The Proof-Required Segment (30-35% of category buyers). This segment wants sustainable products but will not sacrifice cleaning performance. They are the swing voters of the category. Concept testing for this segment must measure whether the concept resolves the tension between sustainability and efficacy. What evidence or guarantees would make them believe this product cleans as well as their current conventional option? The answer is almost never a claim on the packaging; it is a specific, testable mechanism they can verify.

The Performance-First Segment (40-50% of category buyers). Sustainability is a bonus, not a driver. Concept testing for this segment should not lead with sustainability messaging, which may actually reduce perceived efficacy through the “green means weak” heuristic. Instead, test whether the sustainable attributes can be communicated as secondary benefits after the performance story is established.

The Tension Resolution Test presents the concept and then systematically probes: “You mentioned this sounds like it would clean well. Now let me tell you it is also made from plant-based ingredients and comes in recycled packaging. How does that change your view?” The order matters. Leading with sustainability primes the efficacy concern. Leading with performance and then introducing sustainability adds value without triggering skepticism.
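One way to summarize the Tension Resolution Test across interviews is to code each respondent's efficacy belief before and after the sustainability reveal and compare the shift by segment. The sketch below is a minimal illustration under stated assumptions: the 1-5 coded rating, the segment labels, the `TensionResponse` and `efficacy_shift_by_segment` names, and the sample values are all hypothetical and not part of the method as described.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TensionResponse:
    segment: str          # segment label (names here are assumptions)
    efficacy_before: int  # hypothetical 1-5 coded rating after the performance story
    efficacy_after: int   # same rating after the sustainability reveal


def efficacy_shift_by_segment(responses):
    """Average change in coded efficacy belief after the sustainability reveal."""
    shifts = {}
    for r in responses:
        shifts.setdefault(r.segment, []).append(r.efficacy_after - r.efficacy_before)
    return {segment: mean(values) for segment, values in shifts.items()}


# Illustrative data only; not results from any study.
sample = [
    TensionResponse("performance_first", 4, 3),
    TensionResponse("performance_first", 5, 3),
    TensionResponse("proof_required", 4, 4),
    TensionResponse("sustainability_first", 3, 4),
]
shift = efficacy_shift_by_segment(sample)
```

A negative average for a segment would suggest that the sustainability reveal is triggering the “green means weak” heuristic for those consumers, which is the signal the interview probes are meant to surface.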

This testing approach relies on adaptive probing, which AI-moderated interviews can apply consistently across a large sample, and it has helped CPG brands including Seventh Generation and Method optimize their messaging hierarchy for each consumer segment.


Format Innovation Testing: Tabs, Concentrates, and Refill Systems

The household cleaning category is experiencing a wave of format innovation driven by sustainability requirements, e-commerce optimization, and cost pressures. Concentrated formulas, dissolvable tabs, refill pouches, and subscription models all represent format changes that require specific concept testing approaches.

Format innovation introduces a behavioral change requirement that standard concept testing often misses. It is not enough to measure whether consumers like the idea of a concentrated cleaner; the test must determine whether they will actually change their behavior to use it correctly.

The Behavioral Adoption Assessment evaluates format concepts across four adoption barriers:

Comprehension Barrier. Does the consumer understand how the new format works? Concentrated formulas require dilution. Tabs require dissolution. Refill systems require keeping the original bottle. If the format requires steps that are unfamiliar, concept testing must measure the comprehension gap and the willingness to learn.

Habit Disruption Cost. How much does the format change disrupt existing cleaning routines? A consumer who reaches for a spray bottle under the sink has a deeply embedded motor habit. Switching to a concentrated formula that requires mixing before each use introduces friction at the moment of need. The interview probes whether the consumer perceives the benefit as worth the habit change.

Trust Transfer. Does the consumer trust that the new format delivers the same result as the familiar format? Concentrated formulas face a “less product = less clean” perception. Tabs face skepticism about dissolution completeness. The concept test must identify and quantify these format-specific trust barriers.

Household Context. New formats often fail because they do not fit the physical context of the home. Where does the concentrate bottle go? Is the refill pouch easy to pour? Do the tabs dissolve in cold water if the consumer does not have a hot water connection in the utility area? These practical concerns emerge only through in-depth conversation about the consumer’s actual cleaning environment.

Format innovation testing should sample both current category users and lapsed or light users, because format changes that simplify the experience may attract consumers who had previously disengaged from the category.


Claims Development: From Consumer Language to Shelf Communication

In the household cleaning category, the claims on the front of the package are the primary selling mechanism at shelf. Most purchase decisions are made in seconds, and the claim that catches the consumer’s eye determines whether the product enters the consideration set. Concept testing must produce not just a go/no-go decision on the overall concept but specific, validated claims that the marketing and packaging teams can use.

The Claims Extraction Protocol uses AI-moderated concept interviews to systematically capture the language consumers use when reacting to the concept, and then converts that language into candidate claims.

Phase 1: Unstructured Reaction. Present the concept and capture the consumer’s spontaneous language. The phrases they use to describe what the product does, who it is for, and why they would or would not buy it are raw material for claims development. “So basically it cleans everything without smelling like chemicals” is a consumer-generated claim, and language like this often resonates more than copy drafted in isolation.

Phase 2: Competitive Comparison Language. Ask the consumer to compare the concept to their current product. The dimensions they choose for comparison reveal what matters and how they frame the decision. “It is like [competitor] but without the residue” tells you that residue is a pain point and that the comparison frame is a specific competitor, not the category in general.

Phase 3: Proof Point Identification. For every positive reaction, probe what would make the claim credible on shelf. “What would you need to see on the packaging to believe this actually works?” Consumers generate proof point ideas that range from specific test results to certifications to ingredient callouts. These proof points become the supporting evidence architecture for the lead claim.

Phase 4: Claim Stress Testing. Present candidate claims and probe for skepticism, confusion, or negative associations. The claim “kills 99.9% of bacteria” generates different reactions than “eliminates germs naturally.” Testing specific claim language at this level of granularity prevents the expensive mistake of launching with a claim that tested well in a survey but fails to communicate at shelf.

The output of the Claims Extraction Protocol is a prioritized claims hierarchy: a lead claim, two to three supporting claims, and the proof points required for each. This hierarchy feeds directly into packaging design and retailer sell-in materials.
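The claims hierarchy described above can be captured as a small data structure so that packaging and sell-in teams work from the same record. This is only a sketch under stated assumptions: the `Claim` and `ClaimsHierarchy` names, the `problems` check, and the example proof points are hypothetical; only the shape (one lead claim, two to three supporting claims, proof points for each) comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    proof_points: list = field(default_factory=list)


@dataclass
class ClaimsHierarchy:
    lead: Claim
    supporting: list

    def problems(self):
        """Return a list of ways this hierarchy departs from the described shape."""
        issues = []
        if not 2 <= len(self.supporting) <= 3:
            issues.append("expected two to three supporting claims")
        for claim in [self.lead, *self.supporting]:
            if not claim.proof_points:
                issues.append(f"no proof points for: {claim.text}")
        return issues


# Illustrative content only; the proof points are invented for the example.
hierarchy = ClaimsHierarchy(
    lead=Claim("Cleans everything without smelling like chemicals",
               ["ingredient callout on front panel"]),
    supporting=[
        Claim("Leaves no residue", ["third-party test result"]),
        Claim("Plant-based ingredients", ["certification mark"]),
    ],
)
incomplete = ClaimsHierarchy(
    lead=Claim("Leaves no residue"),
    supporting=[Claim("Plant-based ingredients", ["certification mark"])],
)
```

Calling `hierarchy.problems()` on a complete record returns an empty list, while `incomplete.problems()` flags both the missing supporting claim and the lead claim that has no proof point.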


Competitive Context Testing for Category Disruption

Household cleaning concepts do not exist in a vacuum. They compete for shelf space, shopping cart space, and mental availability against entrenched brands with decades of consumer trust. Concept testing that ignores the competitive context produces misleadingly optimistic results.

The Competitive Context Method embeds the competitive environment into every concept evaluation:

Current Repertoire Mapping. Before any concept exposure, the interview maps the consumer’s complete cleaning product repertoire: every product in the home, for which task, how long they have used it, and why they chose it. This baseline reveals the real competitive set, which is often different from what the brand’s competitive analysis assumes.

Concept-in-Context Evaluation. The concept is presented within the consumer’s actual repertoire. “You currently use [their product] for [their task]. Imagine you saw this new product next to it at the store. What goes through your mind?” This framing forces the consumer to evaluate the concept against their actual alternative, not an abstract ideal.

Switching Barrier Identification. For consumers who express interest, the interview probes every barrier to actually switching: stockpile of current product, concern about trying something new on important surfaces, family members who prefer the current product, loyalty points or subscription commitments. These barriers are invisible in survey data but decisive in actual purchase behavior.

Competitive Response Scenario. For truly disruptive concepts, the interview explores how the consumer would react if their current brand responded with a similar product or a price promotion. This scenario reveals whether the concept has genuine differentiation or whether it is vulnerable to competitive response.

The competitive context method is particularly important for CPG brands launching into concentrated or sustainable formats where the primary competition is not another sustainable brand but the conventional product the consumer has used for years. Understanding the depth and nature of that competitive attachment is essential for realistic launch planning.

Frequently Asked Questions

Pre-production efficacy testing uses descriptive concept boards that specify the cleaning scenario, the product's claimed action, and the result. AI-moderated interviews probe whether consumers believe the claim based on their experience with the category, what evidence or cues would make it credible, and how it compares to claims from products they currently trust. This identifies claims that resonate before formulation investment.

Scent is one of the strongest trust signals in household cleaning. Consumers consistently report that scent communicates cleanliness, efficacy, and product quality. Concept testing must probe scent expectations for each product format and cleaning occasion, because the right scent for a bathroom cleaner is different from the right scent for a kitchen surface spray. Mismatched scent expectations are a leading cause of repeat purchase failure.

Sustainability claims on cleaning products trigger a specific consumer tension: will it still clean as well? AI-moderated interviews reveal that consumers segment into three groups on this dimension. Approximately 20-25% will accept a perceived efficacy trade-off for sustainability. Another 30-35% want sustainability only if efficacy is guaranteed. The remaining 40-50% prioritize cleaning power above all else. Testing must identify which segment the concept targets and whether the messaging resolves the efficacy-sustainability tension.