Product teams use RICE and Kano to prioritize features, but these frameworks often rely on assumptions. Here's how customer research turns them into strategic tools.

Product managers face a recurring challenge: how to choose which features to build when resources are finite and stakeholder opinions are infinite. The industry has responded with prioritization frameworks—RICE scoring, Kano analysis, value vs. effort matrices—that promise to bring rigor and objectivity to roadmap decisions.
These frameworks work, to a point. They create structure. They force explicit tradeoffs. They give teams a shared language for debate. But they share a critical weakness: they depend entirely on the quality of inputs. When teams estimate reach, impact, confidence, and effort without systematic customer research, they're building sophisticated mathematical models on top of guesswork.
The gap between framework-driven roadmaps and customer reality shows up in predictable ways. Features that scored high on paper fail to drive adoption. "Must-have" capabilities generate indifference. Competitive threats that seemed urgent turn out to be noise. A recent analysis of B2B SaaS roadmaps found that 64% of features delivered in a given quarter failed to meet their adoption targets, with the primary cause attributed to "misaligned customer priorities."
The solution isn't to abandon prioritization frameworks. It's to recognize what they actually are: calculation engines that amplify the quality of your inputs. Feed them assumptions, and they'll give you precisely calculated guesses. Feed them research-backed insights, and they become strategic tools.
RICE—Reach, Impact, Confidence, Effort—has become the default prioritization framework for many product teams. Its appeal is obvious: it reduces complex decisions to a single number, making comparison straightforward. A feature that scores 850 clearly beats one that scores 340.
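The arithmetic behind that comparison is simple: the score is reach times impact times confidence, divided by effort. A minimal sketch, with illustrative inputs chosen only to reproduce the 850-versus-340 comparison above (the function and values are hypothetical, not any particular team's numbers):

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE score = (Reach x Impact x Confidence) / Effort.

    reach: users or events affected per time period
    impact: multiplier, typically 0.25 (minimal) to 3 (massive)
    confidence: probability the reach and impact estimates hold, 0-1
    effort: person-months
    """
    return (reach * impact * confidence) / effort


# Illustrative inputs chosen to reproduce the comparison in the text
feature_a = rice_score(reach=1700, impact=1.0, confidence=1.0, effort=2)  # 850.0
feature_b = rice_score(reach=850, impact=0.5, confidence=0.8, effort=1)   # 340.0
```

The formula makes the framework's dependence on inputs explicit: double any of the numerator estimates and the score doubles with it.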
The problem emerges when teams examine how they're actually calculating these scores. Reach estimates often come from product analytics showing how many users touch a particular area of the product. But analytics can't tell you whether those users care about improvements in that area, or whether they're simply passing through on their way to something else. A feature area with high traffic might be working perfectly, or it might be so confusing that users keep returning to figure it out.
Impact scores suffer from similar issues. Teams typically estimate impact on a scale—minimal (0.25x), low (0.5x), medium (1x), high (2x), massive (3x)—based on internal discussions about potential value. But these discussions happen in conference rooms, not with customers. One enterprise software company discovered this gap when they scored a proposed analytics dashboard as "high impact" (2x), only to learn through systematic customer research that their users had already built workarounds using existing export features. The actual impact, measured six months post-launch, was closer to 0.3x.
Confidence scores reveal the most honest assessment of uncertainty, but teams rarely use them honestly. A confidence score of 50% should mean the team believes there's a coin-flip chance their reach and impact estimates are correct. In practice, confidence scores tend to cluster around 80%, reflecting optimism bias more than genuine probability assessment.
Effort estimation faces different challenges. Engineering teams have decades of practice estimating technical complexity, and for a well-defined scope those estimates are reasonably reliable. The disconnect comes from scope creep driven by customer reality: a feature estimated at 2 person-months can balloon to 6 person-months when customer research reveals edge cases, integration requirements, and workflow variations that weren't visible in the original spec.
The Kano model takes a different approach to prioritization, categorizing features based on their relationship to customer satisfaction. Basic features cause dissatisfaction when absent but don't increase satisfaction when present—they're table stakes. Performance features create satisfaction proportional to their quality—more is better. Delighters generate disproportionate satisfaction even when minimally implemented, often because customers didn't expect them.
This framework explicitly acknowledges that customer perception matters. A feature isn't inherently a delighter or a basic expectation—its category depends on what customers think and feel. Yet teams often conduct Kano classification through internal workshops, with product managers and designers debating which category each feature belongs to.
The traditional Kano survey methodology offers a more rigorous approach: ask customers how they'd feel if a feature was present, then ask how they'd feel if it was absent. Plot the responses on a matrix to determine classification. But this method has its own limitations. It works well for features customers can easily imagine, but struggles with novel capabilities that require context or demonstration. It also captures stated preferences rather than revealed behavior, a gap that behavioral economics research has shown to be substantial.
A more fundamental issue: Kano classification isn't static. Features migrate between categories as markets mature and competitor offerings evolve. Video conferencing was a delighter in project management software five years ago. Today it's approaching basic expectation status. Teams that classified features once, at the beginning of a planning cycle, find their categorization outdated by the time features ship.
Customer research addresses these limitations by capturing not just what customers say they want, but the context around their needs. When a B2B software company asked customers about desired integrations using traditional Kano surveys, Salesforce integration scored as a performance feature—important and expected. But qualitative research revealed nuance: customers in certain industries needed deep, bidirectional sync with custom fields, while others needed only basic contact import. This distinction transformed roadmap decisions, shifting resources from building a comprehensive integration that would satisfy few users completely to building a flexible integration framework that could be configured for specific use cases.
The 2x2 matrix plotting value against effort has become a staple of product management. The logic is appealingly simple: pursue high-value, low-effort opportunities first (quick wins), then tackle high-value, high-effort projects (major initiatives), while avoiding low-value work regardless of effort.
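In code, the whole framework reduces to two comparisons against a cutoff. A minimal sketch, assuming a shared 1-10 scoring scale and an arbitrary midpoint (both are illustrative choices, not part of any standard):

```python
def quadrant(value: float, effort: float, cutoff: float = 5.0) -> str:
    """Place an opportunity on a value-vs-effort 2x2.

    value, effort: scores on a shared 1-10 scale (assumed here).
    cutoff: the line between 'low' and 'high'; teams often use the
    median of the opportunities under consideration instead.
    """
    if value >= cutoff:
        return "quick win" if effort < cutoff else "major initiative"
    return "low value: deprioritize regardless of effort"
```

Everything interesting happens before this function is called, in how `value` and `effort` are estimated.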
This framework's weakness is the same as RICE's: it requires accurate value estimation. But value is inherently customer-dependent. A feature delivers value only if customers adopt it, use it correctly, and achieve their goals with it. Without systematic research, value estimates reflect internal assumptions about customer needs rather than validated understanding.
The effort axis presents its own challenges. Technical effort is relatively predictable, but total effort includes design, testing, documentation, support training, and go-to-market work. Customer research often reveals that the real effort in launching a feature isn't building it—it's making it discoverable, understandable, and integrated into customer workflows.
One SaaS company estimated a new reporting feature at 3 weeks of engineering effort, placing it firmly in the "quick win" quadrant. But customer research during the design phase revealed that their users had widely varying definitions of basic reporting concepts. What constituted a "conversion" varied by customer segment. Date range handling needed to account for fiscal calendars, not just calendar years. The feature still took 3 weeks to build, but required an additional 4 weeks of design iteration and 2 weeks of in-app guidance development to be usable. Total effort: 9 weeks. The feature migrated from "quick win" to "major initiative" once real customer needs were understood.
The solution isn't to replace prioritization frameworks with pure qualitative research. Frameworks provide necessary structure for comparing disparate opportunities. Research provides the ground truth that makes that comparison meaningful.
Consider reach estimation in RICE scoring. Product analytics might show that 40% of users visit the settings page monthly. But customer research can reveal that half of those visits are users trying to accomplish tasks that should happen elsewhere in the product. The true reach of a settings improvement isn't 40% of users—it's 20%. This distinction changes the feature's RICE score from 800 to 400, potentially dropping it below the funding line.
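To make that arithmetic concrete, here is the same adjustment with hypothetical numbers: a 10,000-user base and placeholder impact, confidence, and effort values chosen only to reproduce the 800-to-400 drop described above.

```python
user_base = 10_000
analytics_reach = 0.40 * user_base        # 4,000 users visit settings monthly
validated_reach = 0.50 * analytics_reach  # research: half are there by mistake

impact, confidence, effort = 1.0, 0.8, 4.0  # placeholder RICE inputs

score_before = analytics_reach * impact * confidence / effort  # 800.0
score_after = validated_reach * impact * confidence / effort   # 400.0
```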
Impact estimation becomes more honest with research. Instead of debating whether a feature will have "medium" or "high" impact in a conference room, teams can ask customers directly: "If we built this capability, how would it change your workflow?" The answers reveal not just whether customers want the feature, but whether they'd actually use it, whether it solves a real problem or just a perceived one, and whether the proposed solution matches how they'd actually want to use it.
A fintech company used this approach to evaluate a proposed bill-pay feature. Internal RICE scoring estimated high impact (2x) based on survey data showing that 70% of users wanted bill-pay functionality. But qualitative research revealed that most users already had bill-pay through their bank and weren't planning to switch. The actual impact—measured as users who would change their behavior—was closer to 0.5x. The feature still made the roadmap, but with realistic expectations and a revised go-to-market strategy focused on the segment that would actually use it.
Confidence scores benefit most from research. When teams base estimates on assumptions, confidence should be low—but acknowledging uncertainty feels uncomfortable, so scores drift upward. Research provides evidence, which legitimately increases confidence. A team that interviews 50 customers about a proposed feature and finds consistent patterns can honestly assign an 80% confidence score. A team working from assumptions should be scoring 30-40%, and the RICE framework will appropriately deprioritize those uncertain bets.
The Kano model becomes substantially more useful when classification is based on systematic customer research rather than internal debate. But the research methodology matters.
Traditional Kano surveys ask functional and dysfunctional questions: "How would you feel if this feature was present?" and "How would you feel if this feature was absent?" Responses fall on a standard five-point scale: "I like it," "I expect it," "I'm neutral," "I can live with it," and "I dislike it." The combination of answers determines classification.
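That pairing of answers maps onto the standard Kano evaluation table. A compact sketch of the mapping, using this article's category names (basic, performance, delighter) plus the table's reverse, indifferent, and questionable outcomes, with a deliberately simple aggregation across respondents:

```python
from collections import Counter

# Answer options for both the functional ("if present") and
# dysfunctional ("if absent") questions.
ANSWERS = ("like", "expect", "neutral", "live_with", "dislike")


def kano_category(functional: str, dysfunctional: str) -> str:
    """Classify one respondent via the standard Kano evaluation table."""
    if functional not in ANSWERS or dysfunctional not in ANSWERS:
        raise ValueError("answers must be one of: " + ", ".join(ANSWERS))
    if functional == dysfunctional and functional in ("like", "dislike"):
        return "questionable"  # contradictory answers
    if functional == "like":
        return "performance" if dysfunctional == "dislike" else "delighter"
    if functional == "dislike":
        return "reverse"       # the respondent prefers not having the feature
    # functional answer is expect / neutral / live_with
    if dysfunctional == "like":
        return "reverse"
    return "basic" if dysfunctional == "dislike" else "indifferent"


def classify_feature(responses: list[tuple[str, str]]) -> str:
    """One simple aggregation: take the most frequent per-respondent category."""
    return Counter(kano_category(f, d) for f, d in responses).most_common(1)[0][0]
```

That last function is exactly where aggregate survey data can mislead: a feature that is basic for one segment and indifferent for another averages into a muddle, which is the problem the next example illustrates.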
This approach works for straightforward features but struggles with complexity. When evaluating a proposed collaboration feature, survey responses might indicate it's a performance feature—satisfaction increases with quality. But qualitative research might reveal that the feature is actually a delighter for remote teams and irrelevant for co-located teams. The aggregate survey data obscures critical segmentation.
More sophisticated research approaches combine quantitative Kano surveys with qualitative interviews that explore context. Why would this feature matter to you? When would you use it? What are you doing today instead? How would your workflow change? These questions reveal not just classification but intensity and frequency of need.
Research also helps teams understand Kano migration patterns. By tracking classification over time and across customer segments, teams can identify features that are moving from delighters to performance features to basic expectations. This temporal understanding informs not just what to build, but when to build it. A feature that's currently a delighter but will be a basic expectation in 18 months might warrant earlier investment than one that will remain a delighter for years.
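One lightweight way to watch for migration is to keep per-segment classifications by planning period and check whether a feature is drifting toward basic-expectation status. A sketch with invented data (the feature name, segments, and quarters are hypothetical):

```python
# Classification history per (feature, segment), keyed by planning period
history = {
    ("video_calls", "remote_teams"): {"2023Q1": "delighter", "2023Q3": "performance", "2024Q1": "basic"},
    ("video_calls", "onsite_teams"): {"2023Q1": "delighter", "2023Q3": "delighter", "2024Q1": "performance"},
}

# Typical migration direction as a market matures
ORDER = ("delighter", "performance", "basic")


def migration_trend(timeline: dict[str, str]) -> str:
    """Flag features drifting toward basic-expectation status."""
    ranks = [ORDER.index(category) for _, category in sorted(timeline.items())]
    return "migrating toward basic" if ranks[-1] > ranks[0] else "stable"


for (feature, segment), timeline in history.items():
    print(feature, segment, "->", migration_trend(timeline))
```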
A healthcare software company used this approach to evaluate telemedicine features. Initial Kano surveys suggested video consultations were a delighter—nice to have but not expected. But longitudinal research revealed rapid migration toward basic expectation status, accelerated by the pandemic. This insight prompted earlier investment than the initial classification would have suggested, allowing the company to meet customer expectations as they evolved rather than lagging behind market changes.
The challenge for product teams is conducting research with sufficient rigor and speed to inform roadmap decisions. Traditional research methodologies—recruit participants, schedule interviews, conduct sessions, analyze transcripts, synthesize findings—often take 6-8 weeks. By the time insights arrive, the planning cycle has moved on.
Modern research approaches address this timing problem through several mechanisms. AI-moderated interviews can reach dozens or hundreds of customers simultaneously, compressing the data collection phase from weeks to days. Platforms like User Intuition conduct conversational interviews at scale, using adaptive questioning to explore individual customer contexts while maintaining methodological consistency.
The key is maintaining research quality while increasing speed. Poorly designed surveys that ask leading questions or miss important nuances can be conducted quickly, but they generate misleading data that leads to bad decisions. The research methodology needs to support genuine exploration—following up on unexpected responses, probing for underlying motivations, identifying edge cases that wouldn't emerge from scripted surveys.
For RICE scoring, research should focus on validating reach and impact assumptions. Reach validation involves confirming that the users who could access a feature are actually the users who would use it. Impact validation requires understanding not just whether customers want a feature, but whether they'd change their behavior, whether the proposed solution matches their mental model, and whether the feature solves a real problem or just an articulated desire.
For Kano classification, research should explore the emotional response to features—not just stated preferences but genuine reactions. This requires showing or describing features in enough detail that customers can imagine using them, then probing for context: When would this matter? What would you do differently? How important is this compared to other capabilities you're missing?
A consumer software company used this approach to evaluate a proposed premium tier. Traditional Kano surveys suggested that advanced analytics would be a performance feature—customers wanted more and better analytics. But qualitative research revealed segmentation: power users saw analytics as a basic expectation and would churn without them, while casual users saw them as irrelevant. This finding transformed the roadmap decision from "should we build advanced analytics?" to "should we segment our product and serve these audiences differently?"
The practical challenge is making research a regular input to roadmap decisions rather than a special project that happens occasionally. This requires changing how planning cycles work.
Traditional planning often starts with internal brainstorming—what features could we build?—then narrows through prioritization frameworks. Research, if it happens at all, comes late in the process, validating decisions that are already largely made. This sequence wastes research's highest value: shaping which opportunities even make it onto the consideration list.
A more effective sequence starts with research. Before brainstorming features, teams conduct systematic customer research to understand current pain points, unmet needs, and emerging workflow changes. This research generates a set of validated customer problems. Brainstorming then focuses on solutions to these problems rather than features that might be interesting to build.
Prioritization frameworks come next, but with research-backed inputs. Reach estimates are based on validated understanding of which customers face which problems. Impact estimates reflect actual customer behavior patterns rather than assumptions. Confidence scores honestly reflect the quality of available evidence.
This approach requires research that's fast enough to fit into planning cycles. Quarterly planning needs research that completes in 2-3 weeks, not the 6-8 weeks of a traditional study. This timing constraint has driven adoption of AI-moderated research platforms that can conduct and analyze dozens of interviews in 48-72 hours, delivering insights while they're still actionable.
One B2B software company restructured their planning cycle around this research-first approach. They conduct customer research in weeks 1-2 of each quarter, generating a validated set of customer problems and opportunities. Weeks 3-4 involve solution brainstorming and technical scoping. Weeks 5-6 focus on prioritization using RICE scoring with research-backed inputs. The result: their feature adoption rates increased from 47% to 68%, and the percentage of features meeting adoption targets rose from 36% to 71%.
Sometimes research and frameworks point in different directions. A feature might score well on RICE but fail to resonate in customer interviews. A capability might classify as a Kano delighter but require massive technical effort. These conflicts reveal important tensions that deserve explicit discussion.
When RICE scoring suggests building a feature but customer research reveals lukewarm response, the disconnect often traces to reach or impact overestimation. The feature might benefit a smaller segment than analytics suggested, or the impact might be lower than internal assumptions predicted. These conflicts should prompt score revision rather than research dismissal.
When Kano classification suggests a feature is a delighter but effort is prohibitive, the decision becomes strategic: is it worth building something that will delight customers but consume substantial resources? Sometimes yes—delighters can differentiate products in crowded markets. Sometimes no—basic expectations and performance features might deliver more value per unit of effort.
Research can also reveal that frameworks are asking the wrong questions entirely. A consumer app company used RICE scoring to prioritize social sharing features, which scored well based on reach (many users) and assumed impact (viral growth). But customer research revealed that users actively avoided sharing because the app involved personal finance management—a private activity. The framework said build it. Customers said they wouldn't use it. The company trusted research and invested those resources elsewhere, avoiding a costly mistake.
Making research a regular input to roadmap decisions requires building organizational capability, not just running occasional studies. This means establishing research rhythms, building research literacy across the product team, and creating systems for translating research insights into framework inputs.
Research rhythms should align with planning cycles. If planning happens quarterly, research should happen quarterly. If teams use continuous discovery with weekly prioritization, research needs to be continuous as well. The cadence matters less than the consistency—research should be a regular input, not a special event.
Research literacy means product managers, designers, and engineers understand how to interpret research findings and translate them into prioritization inputs. When research reveals that a feature would be used by 20% of customers rather than the assumed 40%, the team needs to update their RICE reach estimate. When Kano research shows a feature is a basic expectation for one segment and irrelevant for another, the team needs to decide whether to build for the segment or find a solution that works for both.
Translation systems help teams move from research insights to framework inputs systematically. Templates can guide this process: "Based on research with [N] customers, we estimate reach at [X]% because [evidence]. We estimate impact at [Y]x because [evidence]. Our confidence is [Z]% because [evidence quality]." This structure forces explicit connection between research findings and prioritization inputs.
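One way to make that template operational is to store each prioritization input alongside its evidence, so a RICE score can always be traced back to the research behind it. A hypothetical structure (field names follow the template above; this is not any specific tool's schema):

```python
from dataclasses import dataclass


@dataclass
class ResearchBackedInput:
    """A prioritization input with its supporting evidence attached."""
    feature: str
    respondents: int           # N: customers in the supporting research
    reach_pct: float           # estimated reach as a share of the user base
    reach_evidence: str
    impact_multiplier: float   # e.g. 0.25 to 3.0
    impact_evidence: str
    confidence: float          # 0-1, reflecting evidence quality
    confidence_evidence: str


example = ResearchBackedInput(
    feature="Configurable CRM sync",
    respondents=42,
    reach_pct=0.20,
    reach_evidence="Only customers with enterprise CRM workflows described the pain point",
    impact_multiplier=1.0,
    impact_evidence="Most interviewees would replace a weekly manual export",
    confidence=0.7,
    confidence_evidence="Consistent pattern across three of four segments interviewed",
)
```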
Technology platforms that conduct research and integrate with product management tools can streamline this translation. When research insights flow directly into roadmap planning tools, with clear traceability from customer quotes to RICE scores, teams can make evidence-based decisions without manual data transfer.
A key capability is knowing when you have enough research to make a decision. Perfect information is impossible and waiting for it creates its own costs. Teams need frameworks for assessing research sufficiency: Have we talked to representatives from each major customer segment? Have we explored edge cases and workflow variations? Do we understand not just what customers want but why they want it and how they'd use it? When these questions have solid answers, research is sufficient. When they don't, more research is warranted.
The ultimate test is whether research-informed prioritization leads to better outcomes. Several metrics reveal this impact.
Feature adoption rates measure whether customers actually use what you build. Industry benchmarks suggest that 40-50% of features achieve meaningful adoption (typically defined as use by 20%+ of target users within 90 days of launch). Teams that systematically incorporate research into prioritization typically see adoption rates of 65-75%, with the improvement driven by better understanding of which features customers actually need and will use.
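Using that benchmark definition, the check itself is a one-liner; the inputs below are illustrative, and teams set their own threshold and window.

```python
def meets_adoption_target(users_active_in_90_days: int, target_users: int,
                          threshold: float = 0.20) -> bool:
    """'Meaningful adoption' per the benchmark above: 20%+ of target users
    have used the feature within 90 days of launch."""
    return users_active_in_90_days / target_users >= threshold


meets_adoption_target(users_active_in_90_days=340, target_users=2000)  # False: 17% < 20%
```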
Adoption velocity—how quickly customers start using new features—also improves with research-backed roadmaps. When features align with existing customer workflows and mental models, adoption happens faster. When they require behavior change or solve problems customers don't recognize, adoption is slow. Research helps teams identify these patterns before building, either improving the feature design or setting realistic adoption expectations.
Customer satisfaction with the product roadmap provides another signal. Regular surveys asking customers whether recent releases addressed their needs typically show 50-60% positive response. Teams using research to inform prioritization often see this rise to 70-80%, with customers reporting that the company "understands what we need" and "builds features we actually use."
Resource efficiency matters as well. Building features that customers don't adopt wastes engineering capacity. Teams that measure engineering time spent on features that fail to achieve adoption targets often find 30-40% of capacity goes to low-impact work. Research-backed prioritization typically reduces this waste to 15-20%, freeing resources for higher-impact initiatives.
Competitive win rates in deals where product capabilities are a key decision factor provide external validation. When research reveals which capabilities actually drive purchase decisions—as opposed to which ones customers mention in RFPs—teams can prioritize features that improve win rates rather than features that check boxes.
Prioritization frameworks like RICE and Kano aren't flawed—they're incomplete without research. They provide structure and rigor for comparing opportunities, but they depend entirely on the quality of inputs. Feed them assumptions, and they'll give you precisely calculated guesses. Feed them research-backed insights, and they become powerful strategic tools.
The shift from assumption-based to research-backed roadmapping requires changes in process, capability, and culture. Process changes mean conducting research before prioritization rather than after, integrating research insights into framework inputs systematically, and establishing research rhythms that align with planning cycles. Capability changes mean building research literacy across product teams, establishing methods for translating insights into prioritization inputs, and potentially adopting technology platforms that make research faster and more accessible. Cultural changes mean valuing evidence over opinions, acknowledging uncertainty honestly, and being willing to revise estimates when research reveals new information.
The payoff for making these changes is substantial. Higher feature adoption rates mean less wasted engineering capacity. Better alignment with customer needs means higher satisfaction and lower churn. More accurate prioritization means resources flow to initiatives that actually drive business outcomes. In competitive markets where product differentiation matters, these advantages compound over time.
The question isn't whether to use prioritization frameworks—they're too valuable to abandon. The question is whether to use them with honest inputs based on research, or continue feeding them assumptions and hoping for the best. Teams that choose research-backed prioritization don't just build better roadmaps. They build better products.