The discussion guide is the operational artifact that makes in-depth interviews reproducible. Most teams don’t have one. They start a study with three or four research questions in a doc, run the first interview off the top of their head, and let each subsequent interview drift wherever the respondent takes it. By interview #5 the team has effectively run five different studies. The findings doc that comes out of that work is a Frankenstein — comparable on surface themes, incomparable on anything that matters.
This guide covers what a discussion guide actually is, the six sections that make IDI quality consistent across a study, the sample prompts that go in each section, and the mistakes that quietly destroy data quality before analysis even starts. It also covers how a guide functions when AI moderation executes against it — which changes what a “good” guide looks like in practice.
What a discussion guide is — and what it is not
A discussion guide is a structured probe library plus a flow. It defines the sections of the interview, the primary prompts that open each section, and the follow-up layers a moderator uses to push past surface answers into the reasoning behind them. It does not specify the exact words the moderator says, the exact order of every question, or the boundaries of any single response.
A script, by contrast, is a verbatim list of questions read in order. Scripts make sense for structured interviews where comparability of phrasing matters more than depth. In an in-depth interview the value proposition is the opposite: comparability of structure, with freedom inside each section to follow what the respondent actually says. A guide that reads like a script is usually a sign the team is using IDIs to do work surveys would do faster.
The discussion guide is the artifact you would hand to a moderator who has never run this study and reasonably expect them to produce comparable output to your senior researcher. If your guide cannot do that, it is too thin.
The six sections of a strong IDI discussion guide
The structure that holds up across study types — churn, win-loss, concept testing, jobs-to-be-done, foundational discovery — has six sections in this sequence:
1. Opening rapport (2-3 minutes)
The goal is not introduction; it is psychological safety. The respondent needs to know that the conversation is open, there are no wrong answers, and the moderator is going to follow their thinking rather than test them on a script.
Sample prompts:
- “Before we get into the topics, I’d love to hear a little about you. Where are you joining from and what does a typical workday look like?”
- “We’re going to spend about 30 minutes together. There are no right or wrong answers — I’m just trying to understand your perspective. Sound good?”
Skip the company introduction. Respondents do not care what your platform does, and front-loading it primes them to give you the answers they think you want.
2. Warm-up context (3-5 minutes)
The respondent answers questions about the domain you’re studying without yet being asked anything sharp. The point is to surface the language they actually use — categories, vendor names, workflow steps, pain points — so that later questions can be asked in their vocabulary, not yours.
Sample prompts:
- “Tell me about the last time you had to [domain task]. Walk me through what happened.”
- “When you think about [category], what’s top of mind? Just whatever comes up first.”
- “How do you usually go about [process]? I’m just trying to understand how it actually works for you.”
Avoid evaluative or attitudinal questions here. “How satisfied are you with…” or “How important is…” both invite the respondent to start grading rather than describing.
3. Behavioral grounding (6-8 minutes)
Now you anchor the conversation in a specific event. Behavioral questions about a particular instance produce dramatically more reliable data than questions about general behavior, because respondents misremember frequency and intensity but remember stories.
Sample prompts:
- “Take me back to the last time you [specific behavior]. What was happening that day?”
- “Walk me through what you did first. And then what?”
- “What were you trying to accomplish when you started?”
This is also the section where you separate stated behavior from actual behavior. “I always do X” is a self-report; “the last time I did this, I actually did Y” is data. The grounding section is where the discrepancy shows up.
4. Deep dive with laddering (12-15 minutes)
This is the section that produces the insight. Laddering is the technique of repeatedly asking “why does that matter” or “what would that mean for you” after each answer, moving from surface attribute preferences down to underlying motivations and decision drivers.
Sample prompts:
- “You mentioned [specific thing they said]. Why was that important to you?”
- “What would have happened if that hadn’t worked?”
- “Tell me more about that. What do you mean specifically?”
- “Help me understand — when you say [X], what does that look like in practice?”
The laddering structure is what separates an IDI from a long survey. Stopping at level two or three of probing — which is what fatigued human moderators tend to do — leaves the most strategically useful insight unreached. Stable underlying motivations typically surface between five and seven levels deep.
Build your guide with primary prompts at the top of each topic and a bank of follow-up probes underneath, organized by what the respondent might say. The moderator reads the room and picks the probe that fits the thread.
5. Counter-factual exploration (5-7 minutes)
The counter-factual section is where you test the robustness of what you’ve heard. If a respondent has described a behavior or preference, ask them what would change it.
Sample prompts:
- “What would have to be true for you to switch?”
- “If [alternative scenario], what would you have done differently?”
- “Imagine [your tool] disappeared tomorrow — what would you reach for?”
- “Has there ever been a time when you considered [opposite behavior]? What was that about?”
Counter-factuals separate genuinely held preferences from default behaviors. A respondent who describes a strong preference but cannot articulate any condition under which they would switch is probably running on inertia, not preference.
6. Wrap-up (2-3 minutes)
The respondent gets the floor. The wrap-up surfaces the topic you forgot to ask about — which is, often, the topic that mattered most.
Sample prompts:
- “Before we close, is there anything I didn’t ask about that you think I should have?”
- “If you were running this conversation, what would you have asked yourself?”
- “Anything else on your mind about [topic]?”
This is the single highest-leverage two minutes in most discussion guides. Cutting it for time is one of the most common mistakes in IDI execution.
Mistakes that wreck IDI quality
Three patterns destroy data quality more often than all other mistakes combined.
Leading questions. “Don’t you think the new design is cleaner?” tells the respondent which answer you want. The signal is gone before the answer arrives. The fix is to rewrite toward neutral framing — “What do you think when you look at this?” — and to read every question in the draft guide while asking “could a positive, negative, and neutral answer all sound natural here?”
Closed questions in qualitative settings. “Did you find that helpful?” produces a yes or a no. Open questions — “What was it like to use that?” — produce a story. If your guide has more than two or three closed questions in the body sections, you are running a structured interview, not an IDI.
Intent-assuming prompts. “Why did you choose us?” assumes a deliberate choice was made. Many “choices” are defaults, drift, inheritance from a previous team, or coercion by procurement. Reframe to behavioral: “How did you end up using [product]?” The story that comes back may not look like a choice at all.
Two other mistakes show up often enough to flag:
- Overscripting probes. Pre-writing the exact follow-ups for every primary prompt is a sign the team is afraid the moderator will improvise badly. The fix is to train better moderators and write a better guide, not to handcuff both.
- Stacking too many primary prompts per section. A guide with 15 primary prompts in a 12-minute deep dive section will be rushed through at under a minute per prompt, leaving no room for laddering. Three to four primary prompts per section is the practical ceiling.
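The review pass described above (reading every draft question for leading, closed, or intent-assuming framing, and counting primary prompts per section) can be partly automated. The following is a minimal sketch, assuming simple first-word heuristics; the regular expressions, the function names `lint_prompt` and `lint_section`, and the word lists are illustrative assumptions, not a validated classifier, and a human read of the guide is still needed.

```python
import re

# Heuristic patterns only. These word lists are illustrative assumptions,
# not a complete taxonomy of question types.
LEADING = re.compile(r"^(don't|doesn't|isn't|wouldn't|aren't|didn't)\s+(you|it|that)\b", re.I)
CLOSED = re.compile(r"^(did|do|does|is|are|was|were|can|could|would|will|have|has)\s", re.I)
INTENT = re.compile(r"^why did you (choose|pick|select|decide)\b", re.I)

MAX_PRIMARY_PROMPTS = 4  # "three to four primary prompts per section"


def lint_prompt(prompt: str) -> list[str]:
    """Return the list of suspected problems for one draft prompt."""
    # Normalize curly apostrophes so "Don’t" and "Don't" match the same pattern.
    text = prompt.strip().replace("\u2019", "'")
    issues = []
    if LEADING.search(text):
        issues.append("leading")
    elif CLOSED.search(text):
        issues.append("closed")
    if INTENT.search(text):
        issues.append("intent-assuming")
    return issues


def lint_section(name: str, prompts: list[str]) -> list[tuple[str, str]]:
    """Flag problem prompts and an over-stacked section."""
    findings = [(p, issue) for p in prompts for issue in lint_prompt(p)]
    if len(prompts) > MAX_PRIMARY_PROMPTS:
        findings.append(
            (name, f"{len(prompts)} primary prompts exceeds ceiling of {MAX_PRIMARY_PROMPTS}")
        )
    return findings
```

Run against the examples in this section, `lint_prompt("Did you find that helpful?")` flags the prompt as closed, while the behavioral rewrite "How did you end up using [product]?" passes clean. A pass from this kind of check only means no obvious pattern matched; it does not mean the prompt is neutral.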
Structure for adaptive probing — scaffolding, not handcuffs
A discussion guide is scaffolding. The moderator climbs through the sections in sequence, but inside each section the path through the primary prompts is determined by what the respondent says.
The way to write a guide that supports adaptive probing is to layer it:
- Section header with goal stated in one sentence.
- Primary prompts at the top of each section, in priority order. Three or four, not ten.
- Probe banks below each primary prompt — six to eight follow-up questions organized by what the respondent might say. If they describe a behavior, here are the laddering probes. If they describe a problem, here are the impact probes. If they describe a workaround, here are the substitution probes.
- Must-hit questions flagged at the top of the guide. Two to three per study, total. These are the questions that must appear in every interview regardless of how the conversation flows. Everything else is optional.
This structure lets the moderator pursue the respondent’s thread without losing the study’s frame.
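For teams that keep the guide in a structured file rather than a doc, the layering above maps onto a small data model. This is a sketch under stated assumptions: the class names `Section` and `PrimaryPrompt`, the helper functions, and the response-kind keys ("behavior", "problem", "workaround") are hypothetical labels chosen for illustration, and the workaround probe is an invented example rather than a prompt from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryPrompt:
    text: str
    # Probe bank keyed by the kind of thing the respondent described.
    # The keys are illustrative labels, not a fixed vocabulary.
    probes: dict[str, list[str]] = field(default_factory=dict)
    must_hit: bool = False  # must appear in every interview


@dataclass
class Section:
    name: str
    goal: str  # stated in one sentence
    prompts: list[PrimaryPrompt] = field(default_factory=list)  # priority order


def pick_probes(prompt: PrimaryPrompt, response_kind: str) -> list[str]:
    """Return the probe bank matching what the respondent just described."""
    return prompt.probes.get(response_kind, [])


def must_hit_questions(sections: list[Section]) -> list[str]:
    """Collect the questions that must be asked regardless of flow."""
    return [p.text for s in sections for p in s.prompts if p.must_hit]


deep_dive = Section(
    name="Deep dive with laddering",
    goal="Move from surface preferences to underlying motivations.",
    prompts=[
        PrimaryPrompt(
            text="You mentioned [specific thing they said]. Why was that important to you?",
            probes={
                "behavior": ["Why does that matter?", "What would that mean for you?"],
                "problem": ["What would have happened if that hadn't worked?"],
                "workaround": ["What would you reach for if that disappeared?"],
            },
            must_hit=True,
        ),
    ],
)
```

Keeping the probe bank keyed by response kind, rather than as a flat ordered list, is what leaves the moderator free to choose: the structure offers options for the thread the respondent opened without prescribing a sequence.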
Copy-able template format
A working template looks like this in a doc:
| Section | Time | Primary prompts | Probe bank |
|---|---|---|---|
| Opening rapport | 2-3 min | Background + framing | (Light follow-ups on what they share) |
| Warm-up context | 3-5 min | “Tell me about the last time…” / “When you think about [category]…” | “What does that usually look like?” / “Who else is involved?” |
| Behavioral grounding | 6-8 min | “Walk me through the last time…” / “What were you trying to do?” | “Then what?” / “How did you decide?” / “What did you look at first?” |
| Deep dive + laddering | 12-15 min | 3-4 priority topics | “Why does that matter?” / “What would that mean for you?” / “Tell me more about that” (5-7 layers) |
| Counter-factuals | 5-7 min | “What would have to be true to switch?” / “If [scenario], what would change?” | “Has there ever been a time…?” / “What would happen if…?” |
| Wrap-up | 2-3 min | “What didn’t I ask about?” / “What would you have asked?” | (Open) |
The structure is study-agnostic. Swap the primary prompts for your specific research questions, keep the section sequence and probe-bank layout, and you have a working guide.
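Before scheduling sessions, it is worth checking that the section timings in the template fit the slot you promise respondents. The sketch below sums the per-section ranges from the table; the function names and the "too short" / "tight" / "ok" labels are assumptions made for illustration.

```python
# Minimum and maximum minutes per section, taken from the template table.
SECTION_MINUTES = {
    "Opening rapport": (2, 3),
    "Warm-up context": (3, 5),
    "Behavioral grounding": (6, 8),
    "Deep dive + laddering": (12, 15),
    "Counter-factuals": (5, 7),
    "Wrap-up": (2, 3),
}


def total_range(sections: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the minimum and maximum durations across all sections."""
    low = sum(lo for lo, _ in sections.values())
    high = sum(hi for _, hi in sections.values())
    return low, high


def fits(sections: dict[str, tuple[int, int]], slot_minutes: int) -> str:
    """Classify a scheduled slot against the guide's total time range."""
    low, high = total_range(sections)
    if slot_minutes < low:
        return "too short"
    if slot_minutes < high:
        return "tight"
    return "ok"


def minutes_per_prompt(section_minutes: float, n_prompts: int) -> float:
    """Average time available per primary prompt in one section."""
    return section_minutes / n_prompts


print(total_range(SECTION_MINUTES))  # (30, 41)
print(fits(SECTION_MINUTES, 30))     # tight
print(minutes_per_prompt(12, 15))    # 0.8
```

The ranges sum to 30 to 41 minutes, so an "about 30 minutes" framing in the opening only holds if every section runs at its minimum; a 45-minute slot leaves room for the wrap-up. The last line shows the arithmetic behind the prompt ceiling: 15 primary prompts in a 12-minute deep dive leaves 0.8 minutes each.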
How does User Intuition handle discussion guide design?
User Intuition treats the discussion guide as the operational contract of the study. Teams either upload an existing guide or build one inside the platform across the six standard sections — opening, warm-up, behavioral grounding, deep dive, counter-factuals, wrap-up — with primary prompts and probe banks defined per section. The platform then executes against the guide for every participant in the study.
The shift that AI moderation introduces is not replacement of the discussion guide. It is consistent execution of it. A human moderator brings creativity but also fatigue, inconsistent probing, and unconscious bias — interview #1 and interview #50 in the same study read as if they were conducted by different people because they effectively were. AI moderation applies the same guide, with the same probing structure and the same depth, in every session of the study. The artifact becomes reliable.
Three ways this shows up in practice:
- Identical structure across the sample. Every respondent moves through the same six sections in the same sequence, so cross-interview synthesis works on a like-for-like basis instead of fighting moderator drift.
- Laddering depth held consistently. Where a fatigued human moderator stops at the second or third “why,” the AI moderator stays in the laddering sequence until the respondent’s underlying reasoning surfaces.
- Adaptive thread-following inside the guide’s frame. The moderator pursues whatever the respondent surfaces, but it returns to uncovered guide topics rather than letting the conversation drift off the study.
Teams running in-depth interviews on the platform spend their time designing the guide and reviewing the findings, not running fieldwork. The guide is what teams iterate on; the platform handles execution at full sample size. For the broader use-case framing, see the user research solutions page.
Bottom line
A discussion guide is the difference between an IDI study that produces a usable findings doc and one that produces 20 unrelated transcripts. The six-section structure — opening, warm-up, behavioral grounding, deep dive, counter-factuals, wrap-up — holds across study types. The mistakes are predictable: leading questions, closed questions in qualitative settings, intent-assuming prompts. The fix is the same for all three: rewrite toward open, behavioral, non-judgmental framing.
If your team is running IDIs without a discussion guide, the cheapest quality improvement available is to write one. If your team has a guide that drifts in execution because human moderators inevitably vary across a study, the next leverage point is consistent execution at scale.