The AI Customer Research Stack in 2026: A Real Process
Customer research has a backlog problem. Research teams, when they exist, are perpetually behind. Product teams move fast, ship things, and by the time formal research catches up, the team has already made the decision based on intuition. Research that arrives after the decision can't inform it.
The AI tools available for research in 2026 don't fix the fundamental tension between research timelines and product timelines. But they compress the synthesis phase significantly. The part where you sit with 20 interview transcripts and try to pattern-match themes used to take days. With the right tools, it takes a few hours.
This guide covers the three-tool stack that research and product teams are actually using (Notably, Maze, and Claude), along with notes on where each breaks down and what you still have to do yourself.
The three tools and what they handle
Notably: Qualitative research synthesis. Upload interview recordings or transcripts, tag themes, identify patterns across multiple sessions. The AI features help with auto-tagging and pattern surfacing.
Maze AI: Unmoderated usability testing, survey distribution, quantitative research at scale. The AI layer helps with question suggestions, analysis, and insight generation from test results.
Claude: Synthesis and communication. Takes the structured outputs from Notably and Maze and turns them into stakeholder reports, product briefs, and decision-ready insights.
Notably for qualitative synthesis
Notably is the most purpose-built tool in this stack. It's designed specifically for qualitative research, which puts it ahead of general-purpose tools for synthesis work.
Pricing: The Starter plan is $29/month per editor. The Team plan is $49/month per editor. Most small research teams can work with Starter unless they need the collaboration features in Team.
What it actually does:
You upload interview recordings, transcripts (from Zoom, Otter.ai, or wherever you record), or notes, and Notably organizes everything in a shared workspace. The AI features are the main differentiation:
Auto-tagging: Notably identifies moments in transcripts where participants mention themes, problems, or emotions and suggests tags. It's not 100% accurate but it's accurate enough that reviewing and accepting/rejecting suggestions is faster than manually reading and tagging from scratch.
Insight surfacing: After you've tagged across multiple sessions, Notably surfaces patterns, with summaries such as "this theme appeared in 8 of 12 interviews." The automated insight generation is useful for a first pass, but you should always verify the connections yourself before presenting to stakeholders.
Highlight reels: Notably lets you create video highlight clips from interview recordings by selecting transcript moments. This is genuinely useful for sharing evidence with product and engineering teams who won't read a 20-page synthesis document but will watch a 4-minute highlight reel of customers describing their pain.
Where Notably falls short: It's a tool for organizing and surfacing what's in your data, not interpreting it. The AI identifies patterns, but the judgment about which patterns matter and what they mean for product decisions is still the researcher's job. A junior researcher who doesn't know what questions to ask going in will get organized noise, not useful insight.
The practical workflow:
- Upload 10 to 15 interview transcripts for a research project.
- Let Notably auto-tag on your predefined theme codes.
- Review and correct the auto-tags (30 to 45 minutes for 15 interviews vs. 3+ hours manually).
- Review the surfaced patterns.
- Export the tagged insights.
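Once the tagged insights are exported, the "theme appeared in N of M interviews" counts are easy to re-derive by hand as a check on the surfaced patterns. The sketch below assumes the export can be reduced to (interview ID, tag) pairs; that shape and the sample data are hypothetical, not Notably's actual export format.

```python
from collections import defaultdict

# Hypothetical export: one (interview_id, tag) pair per tagged highlight.
# This is an assumed shape with made-up data, not Notably's real export schema.
tagged_highlights = [
    ("p01", "onboarding confusion"),
    ("p01", "pricing concern"),
    ("p02", "onboarding confusion"),
    ("p02", "onboarding confusion"),  # repeat mentions count once per interview
    ("p03", "pricing concern"),
    ("p04", "onboarding confusion"),
]


def theme_coverage(highlights):
    """Return {tag: number of distinct interviews in which the tag appears}."""
    interviews_by_tag = defaultdict(set)
    for interview_id, tag in highlights:
        interviews_by_tag[tag].add(interview_id)
    return {tag: len(ids) for tag, ids in interviews_by_tag.items()}


def coverage_report(highlights):
    """Format coverage as 'tag: N of M interviews', most common first."""
    # M counts only interviews that have at least one tagged highlight.
    total = len({interview_id for interview_id, _ in highlights})
    coverage = theme_coverage(highlights)
    ranked = sorted(coverage.items(), key=lambda item: (-item[1], item[0]))
    return [f"{tag}: {count} of {total} interviews" for tag, count in ranked]


for line in coverage_report(tagged_highlights):
    print(line)
```

Counting distinct interviews rather than raw highlights keeps one talkative participant from inflating a theme.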
Maze for quantitative and usability research
Maze handles the quantitative side: unmoderated usability tests, concept tests, card sorting, tree testing, and surveys. It's a full research platform with an AI layer added on top.
Pricing: The Starter plan is free but limited to 5 participant responses per study. The Team plan is $99/month and covers most research team needs. The Organizations plan has custom pricing.
What the AI features add:
Question suggestions: When building a test or survey, Maze's AI suggests follow-up questions based on your research goals. These suggestions are hit-or-miss but occasionally surface angles you hadn't considered. Worth scanning.
Metrics interpretation: After a usability test, Maze's AI provides an interpretation of the task success rates, time-on-task data, and satisfaction scores. The interpretation is basic and you'd reach the same conclusions manually, but it writes a readable summary that you can use as a starting point for stakeholder reports.
Heatmaps and click path analysis: Standard usability research features that Maze now presents with AI-generated summaries of where users clicked and what that suggests about navigation expectations.
Study templates: Maze has AI-generated study templates for common research goals (first-click testing, 5-second tests, navigation testing). These are useful if you're newer to usability research and want a structured starting point.
Where Maze works best: Early-stage concept validation and usability testing where you need data quickly and can accept the tradeoffs of unmoderated testing (participants can be confused by instructions, some sessions are unusable, you lose the ability to ask follow-up questions in the moment).
Where to use something else: Deep exploratory research, sensitive topics, and anything that requires probing follow-up questions in response to what participants say. For those, moderated research with a real interviewer produces better data. Maze is for speed and scale, not depth.
Claude for synthesis and communication
This is where the three-tool stack comes together. Claude Pro at $20/month is the tool that takes structured outputs from Notably and Maze and turns them into the artifacts that actually influence product decisions.
Research synthesis reports:
Export your insights from Notably (tagged themes, key quotes, frequency data) and your metrics summary from Maze (task completion rates, key findings) and paste them into Claude.
Ask Claude to write a research synthesis report in a specific format: executive summary (3 bullets), key findings (5 to 7 findings with supporting evidence), implications for product decisions, and open questions for future research.
Give it the specific decisions the research was meant to inform: "This research was commissioned to understand why users drop off in the onboarding flow between step 2 and step 3. The product team needs to decide whether to change the flow design or simplify the copy."
Claude's synthesis will be as good as the inputs you give it. If you paste in well-tagged, clean data, it produces a solid first draft of a research report. If you paste in raw, unstructured notes, it produces a messier summary.
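If you run this kind of synthesis regularly, it can help to keep the prompt structure in a small template so every project gives Claude the same format and context. The sketch below only assembles the text to paste in; the function name, section headings, and placeholder inputs are illustrative assumptions, and the "use only the data below" instruction is an optional addition rather than part of the format described above.

```python
def build_synthesis_prompt(decision_context, notably_export, maze_summary):
    """Assemble one prompt from the research context and the two exports."""
    sections = [
        "Write a research synthesis report with these sections:",
        "1. Executive summary (3 bullets)",
        "2. Key findings (5 to 7 findings, each with supporting evidence)",
        "3. Implications for product decisions",
        "4. Open questions for future research",
        "",
        # Optional guardrail (an assumption, not required by the workflow).
        "Use only the data below. Flag any finding that lacks supporting evidence.",
        "",
        "## Decision context",
        decision_context.strip(),
        "",
        "## Qualitative insights (tagged themes, key quotes, frequency data)",
        notably_export.strip(),
        "",
        "## Quantitative results (task completion rates, key findings)",
        maze_summary.strip(),
    ]
    return "\n".join(sections)


prompt = build_synthesis_prompt(
    decision_context=(
        "This research was commissioned to understand why users drop off in "
        "the onboarding flow between step 2 and step 3. The product team needs "
        "to decide whether to change the flow design or simplify the copy."
    ),
    notably_export="(paste tagged insights here)",  # placeholder text
    maze_summary="(paste metrics summary here)",    # placeholder text
)
print(prompt)
```

Putting the decision context ahead of the data mirrors the advice above: the report is only useful if it is written toward the decision the research was meant to inform.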
Stakeholder presentations:
After generating the synthesis, ask Claude to write the talking points for a 10-minute stakeholder presentation of the research findings. This is usually 5 to 8 bullet points per finding with the key quote or data point that makes each finding credible.
For non-research audiences (product managers, founders, engineering leads), framing matters as much as content. Claude can translate research language into product language: "users find the onboarding confusing" becomes "3 of 5 users couldn't identify what to do after signing up, and the help text was read by 0 of 5 users in our test."
Survey question review:
Before you run a survey through Maze, paste your draft questions into Claude and ask it to review for leading questions, double-barreled questions, and response scale issues. Claude catches problems that are easy to miss when you've been staring at your own questions for too long.
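A crude local pre-check can catch the most obvious problems before the draft goes to Claude. The sketch below is a keyword heuristic only: the phrase list and the "contains and/or" rule are illustrative assumptions that will produce false positives and miss subtler issues, so treat it as a first filter and not a substitute for the review.

```python
import re

# Illustrative patterns only; these are assumptions, not a validated taxonomy.
LEADING_PHRASES = ("don't you", "wouldn't you", "how much do you love", "how great")
DOUBLE_BARREL = re.compile(r"\b(and|or)\b", re.IGNORECASE)


def flag_question(question):
    """Return a list of possible issues for one draft survey question."""
    flags = []
    lowered = question.lower()
    if any(phrase in lowered for phrase in LEADING_PHRASES):
        flags.append("possibly leading")
    if DOUBLE_BARREL.search(question):
        flags.append("possibly double-barreled")
    return flags


# Made-up draft questions for illustration.
draft_questions = [
    "How easy and fast was the signup process?",
    "Don't you agree the new dashboard is clearer?",
    "How satisfied are you with onboarding?",
]

for question in draft_questions:
    print(question, "->", flag_question(question) or "no flags")
```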
The full research workflow
Here's how the stack works in practice for a typical product research project:
Week 1: Planning
- Define research questions and participant criteria.
- Use Claude to draft a discussion guide for interviews and review for question quality.
- Set up a Maze study for any quantitative validation you need.
Week 2: Data collection
- Run 10 to 12 user interviews (still requires a human interviewer for qualitative work).
- Transcribe with Otter.ai or Zoom AI transcription.
- Distribute Maze study to your panel or recruiting pool.
Week 3: Synthesis
- Upload transcripts to Notably.
- Auto-tag and manually review/correct tags (3 to 4 hours instead of 2 days).
- Review Maze results and AI-generated metrics summary.
- Export everything to Claude.
- Generate synthesis report draft (1 to 2 hours).
- Edit for accuracy and add researcher interpretation.
Week 4: Communication
- Present findings to product team with Claude-assisted talking points.
- Share highlight reel from Notably with clips of key moments.
- File synthesis report in research repository.
Total researcher time for a 10-session project: roughly 30 to 35 hours instead of 55 to 65 hours. Most of the savings are in synthesis.
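Because both figures are ranges, the savings are a range too. A quick sketch of the arithmetic, using only the hour estimates above (the function name is illustrative):

```python
def savings_range(before, after):
    """before/after are (low, high) hour ranges; returns (min_saved, max_saved)."""
    before_low, before_high = before
    after_low, after_high = after
    # Smallest saving: best-case old process vs. worst-case new process.
    # Largest saving: worst-case old process vs. best-case new process.
    return (before_low - after_high, before_high - after_low)


without_stack = (55, 65)  # hours per project, from the estimate above
with_stack = (30, 35)     # hours per project, from the estimate above

low, high = savings_range(without_stack, with_stack)
print(f"Hours saved per project: {low} to {high}")
```

That works out to 20 to 35 hours saved per project.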
The costs
| Tool | Plan | Monthly |
|---|---|---|
| Notably | Starter | $29 (per editor) |
| Maze | Team | $99 |
| Claude | Pro | $20 |
| Total | | $148/month |
For a full-time researcher or a product team doing regular research, $148/month is justified quickly. A single round of research that saves 20 hours of researcher time pays for months of subscriptions.
For infrequent research (one project per quarter), the cost calculus is different. Maze charges per month, not per project, so if you're doing one study in February and nothing until May, you're paying for idle months. Consider using Maze's free tier for lower-volume needs.
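One way to compare the two cases is to ask how much an hour of researcher time has to be worth before the subscriptions pay for themselves. The sketch below uses the list prices from the table, assumes a single Notably editor, and takes the 20 hours saved per project mentioned above; the function names are illustrative.

```python
# Monthly list prices from the table above (one Notably editor assumed).
MONTHLY_COST = {"Notably Starter": 29, "Maze Team": 99, "Claude Pro": 20}


def monthly_total(costs):
    """Sum of all monthly subscription prices."""
    return sum(costs.values())


def break_even_hourly_rate(costs, months_between_projects, hours_saved):
    """Hourly value of researcher time at which subscriptions break even."""
    return monthly_total(costs) * months_between_projects / hours_saved


HOURS_SAVED = 20  # per project, the conservative figure used above

monthly = break_even_hourly_rate(MONTHLY_COST, 1, HOURS_SAVED)
quarterly = break_even_hourly_rate(MONTHLY_COST, 3, HOURS_SAVED)

# Same quarterly cadence, but with Maze on its free tier.
without_maze = {k: v for k, v in MONTHLY_COST.items() if k != "Maze Team"}
quarterly_free_maze = break_even_hourly_rate(without_maze, 3, HOURS_SAVED)

print(f"One project per month:    ${monthly:.2f}/hour")
print(f"One project per quarter:  ${quarterly:.2f}/hour")
print(f"Quarterly, Maze on free:  ${quarterly_free_maze:.2f}/hour")
```

With one project per quarter, three months of subscriptions are spread over a single project, so the break-even rate triples. Dropping to Maze's free tier brings it back to roughly the monthly level, at the price of the 5-response limit per study.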
What the tools won't do
They won't tell you what questions to ask. The research plan still requires someone who understands what the product team needs to decide and how to design a study that answers those questions. Good research starts before any of these tools are opened.
They won't replace interviewer skill. In qualitative research, some things are irreplaceable: the follow-up question, the moment you notice the participant's hesitation and ask about it, and the ability to distinguish what someone says from what they mean. Unmoderated tools like Maze are useful for some research questions and wrong for others.
They won't validate your sampling. Getting research participants who actually represent your target users is a hard problem that AI tools don't address. Synthesizing data from unrepresentative participants faster still gives you unrepresentative data.
The value of this stack is that it makes the part of research that doesn't require human judgment much faster. That frees researchers to do more of the work that does require judgment: design, facilitation, interpretation, and communication.