
How we compare interviews to ensure we improve our Synthetic Organic Parity — 85 to 92%
How do we know we are right? How do we know our Synthetic Users are as real as organic users? We compare.
At Synthetic Users we have a page stuck to our wall that simply reads: The best possible user interviews using LLMs. That’s great but how do we ensure they are indeed the best? How do we know we are right?
How we compare qualitative interviews
Step 0: Recruiting
We use various services to recruit users: Prolific.com, Usertesting.com and UserInterviews.com. All we can say is that it’s painful. Participants don’t show up… but this whole process is absolutely necessary for us. It’s the only way we know we’re getting better Synthetic Users.
Step 1: Running organic interviews
We use Lookback to run the interviews, which are then transcribed.
In this case, the interviews aim to explore how primary and secondary school teachers in the UK incorporate technology into their classrooms.

Here is a zoomed out version. At the end of every interview we append the topics mentioned in that interview (effectively starting our codebook: a collection of themes/topics we can then compare with the Synthetic side).
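As a rough illustration of what such a codebook could look like in code, here is a minimal Python sketch. It assumes each interview ends with a plain list of topics; the function name `build_codebook`, the interview ids and the topic lists are all hypothetical and not taken from the study.

```python
from collections import defaultdict


def build_codebook(interview_topics):
    """Map each topic to the sorted list of interviews that mention it."""
    codebook = defaultdict(set)
    for interview_id, topics in interview_topics.items():
        for topic in topics:
            # Normalise casing and whitespace so near-identical labels merge.
            codebook[topic.strip().lower()].add(interview_id)
    return {topic: sorted(ids) for topic, ids in codebook.items()}


# Hypothetical topic lists appended at the end of two interviews.
organic_topics = {
    "teacher_01": ["Digital equity", "Training and support"],
    "teacher_02": ["digital equity ", "Impact on learning"],
}

codebook = build_codebook(organic_topics)
for topic, interviews in sorted(codebook.items()):
    print(f"{topic}: mentioned in {len(interviews)} interview(s)")
```

Keeping the interview ids next to each topic makes it easy to see which themes recur across participants and which come from a single interview.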

Step 2: Identifying Common Themes
First, we identify the themes and topics that are common in both interviews.

Qualitative Analysis: We manually review the flow of conversation, use of idiomatic expressions, and overall readability. We call this glanceability, and with it we assess whether the responses are relevant and appropriate for the questions asked, mirroring the real interview's context-responsiveness. To be honest, for the purposes of this comparison study we are initially more interested in content overlap, because we can always drill deeper with Synthetic Interviews if we feel an interview stays too shallow.
Context Analysis: We determine how well the participant responds to the context provided in the interview questions. We do this through topic consistency, comparing the number of topics brought up, first in these organic interviews and later in the synthetic interviews.
Here’s the manual process from a previous comparison study.

To make this more measurable we also run a Word and Phrase Frequency Analysis plus an N-gram Analysis, where we compare the frequency of certain words or phrases to identify linguistic patterns. There is software out there, like NVivo, that can help you do this quicker.
To perform a quantitative textual analysis we extract and count the occurrences of individual words and phrases that are relevant to the themes of both interviews. In this case we use a simple GPT. Given the length of the interviews, we focus on key terms related to participants’ challenges, emotions, and solutions. It’s important to narrow down the criteria for parity, otherwise the target is too large.
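For readers who want to try the frequency step without a GPT or NVivo, the sketch below shows one simple way to count key terms and n-grams in a transcript using only the Python standard library. It is an illustration under assumptions, not the tooling used in the study: the helper names, the tokenizer and the two sample sentences are all made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=1):
    """Count every run of n consecutive tokens in the text."""
    tokens = tokenize(text)
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def term_frequencies(text, key_terms):
    """Count only the key terms we care about (single words or phrases)."""
    counts = {}
    for term in key_terms:
        n = len(term.split())
        counts[term] = ngram_counts(text, n)[term.lower()]
    return counts


# Made-up snippets standing in for an organic and a synthetic transcript.
organic = "The training was too short. We need more training and better wifi."
synthetic = "Teachers need more training. Digital equity is a concern."
key_terms = ["training", "more training", "digital equity"]

organic_counts = term_frequencies(organic, key_terms)
synthetic_counts = term_frequencies(synthetic, key_terms)
for term in key_terms:
    print(f"{term}: organic={organic_counts[term]}, synthetic={synthetic_counts[term]}")
```

Restricting the count to a short list of key terms follows the point above about narrowing the criteria: comparing every word in two long transcripts mostly measures noise.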
Step 3: We summarise the 8 reports into Organic Insights
If you run organic interviews on a regular basis, you are familiar with the process up to here.

Step 4: We run the same interview script, using a Custom Interview within Synthetic Users

Step 5: We then run the Synthetic Interviews.

You can see the 8 Synthetic Interviews on the right.

Step 6: Quantifying Overlaps
Based on these common themes, we quantify the level of overlap. Given that these themes are quite fundamental to both sets of interviews, they contribute significantly to the similarity score. If these were the only factors considered, the score might be quite high (around 90% or more), since these themes are central to both.
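One possible way to put a number on theme overlap is the Jaccard index: shared themes divided by all themes seen on either side. The sketch below assumes that measure, which the study itself does not name. The first six themes are those listed in this comparison, while "classroom wifi reliability" is a hypothetical organic-only theme added purely to show a mismatch.

```python
def theme_overlap(organic_themes, synthetic_themes):
    """Compare two theme lists and report shared, unique and Jaccard overlap."""
    organic = {theme.strip().lower() for theme in organic_themes}
    synthetic = {theme.strip().lower() for theme in synthetic_themes}
    shared = organic & synthetic
    union = organic | synthetic
    return {
        "shared": sorted(shared),
        "organic_only": sorted(organic - synthetic),
        "synthetic_only": sorted(synthetic - organic),
        # Jaccard index: size of the intersection over size of the union.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


common = [
    "use of technology",
    "integration challenges",
    "support and training",
    "impact on learning",
    "digital equity",
    "future aspirations",
]
# Hypothetical extra theme that only appears in the organic interviews.
organic_themes = common + ["classroom wifi reliability"]
synthetic_themes = list(common)

result = theme_overlap(organic_themes, synthetic_themes)
print(f"Shared themes: {len(result['shared'])}")
print(f"Organic only: {result['organic_only']}")
print(f"Jaccard overlap: {result['jaccard']:.0%}")
```

The two "only" lists are often the most useful output, since they point at exactly where the organic and synthetic interviews diverge.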

A method to arrive at a parity score

Quantitatively calculating a parity number in this context involves subjective judgment since we're dealing with qualitative data. Here is a simple framework we use:
Thematic overlap (30%)
Depth and specificity of insights (30%)
Comprehensiveness of coverage (20%)
Qualitative alignment (20%)
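The weighting above amounts to a weighted average of four sub-scores. Here is a minimal sketch of that arithmetic in Python, assuming each criterion is judged on a 0 to 100 scale. The weights come from the framework above; the example sub-scores are invented for illustration and are not the results of this study.

```python
# Weights taken from the framework above; they must sum to 1.
WEIGHTS = {
    "thematic_overlap": 0.30,
    "depth_and_specificity": 0.30,
    "comprehensiveness": 0.20,
    "qualitative_alignment": 0.20,
}


def parity_score(scores, weights=WEIGHTS):
    """Return the weighted average of per-criterion scores (each 0 to 100)."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted criteria")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(weights[name] * scores[name] for name in weights)


# Invented sub-scores, for illustration only.
example_scores = {
    "thematic_overlap": 95,
    "depth_and_specificity": 80,
    "comprehensiveness": 90,
    "qualitative_alignment": 95,
}

print(f"Parity: {parity_score(example_scores):.1f}%")
```

Each sub-score is still a human judgment, so the final number is only as reliable as the reviewers who assign them. The formula just makes the trade-offs between criteria explicit and repeatable.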
In this comparison study, both sets of interviews score highly across all these criteria, with particularly strong performance in thematic overlap and qualitative alignment.
Thematic Overlap: Both reports cover essentially the same themes, including the use of technology, integration challenges, support and training, impact on learning, digital equity, and future aspirations. This comprehensive thematic overlap is a strong indicator of alignment in the core areas of interest.
Depth and Specificity of Insights: While the Organic report provides more specific examples and the Synthetic report offers broader insights, both approaches are complementary rather than contradictory. The Organic report's specificity enriches the Synthetic report's broader themes, making them more tangible and relatable. This complementary nature enhances the overall understanding of technology's role in education, suggesting a closer alignment than initially assessed.
Comprehensiveness of Coverage: Both reports, despite their differences in presentation, contribute to a holistic understanding of the current state and future potential of educational technology. They address not only the practical aspects of technology use but also the pedagogical, ethical, and strategic considerations. This comprehensive coverage further supports a higher parity number.
Qualitative Alignment: Beyond the visible topics, there's a qualitative alignment in the underlying sentiments and concerns expressed in both sets of insights. Issues like digital equity, the need for ongoing support and training, and the excitement for future technologies are universally acknowledged. This shared understanding and prioritization of key issues in education technology suggest a deeper alignment.
What have we learned from comparing interviews?
a. At first glance, Synthetic Users lack the depth of organic users. You and I would be surprised if this wasn’t the case. One of our learnings has been to enrich our Synthetic Users with more personal accounts and challenges (it’s in the dataset, we just had to surface it). As of late Feb 2024, when you run interviews you’ll find them much richer in personal accounts.
b. Follow-on questions are really important. Don’t give up at first glance. Unlike with organic interviews, with Synthetic Users you can ask follow-on questions. Drill down into areas where you feel the answers are too generalist.

c. Be specific and get specificity back. Specifying users’ qualities will provide more specific answers. Be as specific as you can be. This is especially relevant if you don’t want your Synthetic Users to default to more encompassing levels of consciousness, i.e. prioritising policy over personal budget.
PRO TIP: Start with a recent study you have top of mind and compare it yourself in order to gain confidence.
This is what we tell all our users when they are starting out. Try it, compare it with an existing study. Put the Synthetic interviews next to the organic and see how comfortable you feel about it.