
Three research papers that helped us build ❤️ Synthetic Users
For the sceptics amongst us who need more tangible research in order to engage with this brave new world. Full disclosure: we are part of the sceptics.
Three papers that we have relied upon whilst developing our product. If you wish to understand the power of LLMs to simulate human behaviour, both at the individual and collective (market) level, these three papers will change your mind. We have no direct connection with the authors.
E pluribus unum
In September 2022, Lisa Argyle, Ethan C. Busby, Nancy Fulda, Joshua Gubler, Christopher Rytting and David Wingate wrote the seminal paper: Out of One, Many: Using Language Models to Simulate Human Samples. It had a profound influence on our team.
Language Models as Social Science Tools: The paper explores the potential of using language models as proxies for human sub-populations in social science research. This approach could revolutionize how researchers study human attitudes, behaviors, and demographics by simulating a wide variety of human responses.
Algorithmic Fidelity and Silicon Sampling: The concept of "algorithmic fidelity" is introduced, referring to the degree to which a language model's outputs accurately reflect the complex patterns of ideas, attitudes, and socio-cultural contexts found in human populations. Through "silicon sampling," where the model is conditioned on socio-demographic backstories, language models demonstrate a high degree of algorithmic fidelity, closely mirroring human response distributions across various studies.
Applications and Implications: The findings suggest that language models can be effectively used in social science research to generate insights into human society, potentially reducing the need for costly and time-consuming data collection from human subjects.
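To make "silicon sampling" concrete, here is a minimal sketch of the idea: condition a model on a first-person socio-demographic backstory before asking the survey question. The backstory fields and wording below are our own illustration, not prompts taken from the Argyle et al. paper.

```python
# Minimal sketch of "silicon sampling": build a first-person backstory
# prompt for one simulated respondent, then ask the survey question.
# Illustrative only -- the fields and phrasing are our own assumptions,
# not the prompts used in the paper.

def silicon_sample_prompt(backstory: dict, question: str) -> str:
    """Return a persona-conditioned prompt for a language model."""
    persona = (
        f"I am a {backstory['age']}-year-old {backstory['gender']} "
        f"from {backstory['state']}. I work as a {backstory['occupation']} "
        f"and I consider myself politically {backstory['ideology']}."
    )
    return f"{persona}\n\nInterviewer: {question}\nMe:"

prompt = silicon_sample_prompt(
    {"age": 42, "gender": "woman", "state": "Ohio",
     "occupation": "nurse", "ideology": "moderate"},
    "How likely are you to vote in the next election, and why?",
)
print(prompt)
```

Sampling many such backstories from census-like distributions, and collecting one completion per persona, is what produces the "silicon sample" whose response distribution can be compared against human survey data.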
Homo Silicus
The second paper that we recommend is John J. Horton’s Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?
Large Language Models (LLMs) can simulate human economic behavior (termed "homo silicus") in various scenarios, allowing researchers to explore economic theories and behaviors through computational simulations. This approach can generate insights into human decision-making processes by endowing these models with specific preferences, information, and scenarios, then observing their choices.
Experiments with LLMs can replicate and extend findings from traditional behavioral economics studies, offering a cost-effective and rapid method to pilot studies and explore variations in experimental design. This includes manipulating the models' endowments (e.g., fairness, self-interest, political views) to observe changes in decision-making, which can help in understanding the underlying factors influencing human economic behaviors.
While promising, using LLMs for social science research also raises questions about the representativeness and validity of the findings, given the models' training on vast, uncurated text data. However, the ability to condition these models to adopt various personas and the low cost of experimentation make them a valuable tool for preliminary research, hypothesis generation, and exploring the parameter space of economic behaviors.
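In the spirit of Horton's homo silicus experiments, here is a sketch of how one might endow otherwise identical prompts with different preferences and observe the effect on choices. The dictator-game framing and the endowment wordings are our own illustration, not the paper's prompts.

```python
# Sketch of endowing simulated economic agents with different preferences.
# Same game, different injected preference -- the researcher then compares
# the model's answers across endowments. Wordings are our own assumptions.

ENDOWMENTS = {
    "self-interested": "You care only about maximising your own payoff.",
    "fairness-minded": "You believe payoffs should be split fairly.",
}

def dictator_game_prompt(endowment: str, pot: int = 100) -> str:
    """One prompt variant: a dictator game with an injected preference."""
    return (
        f"{ENDOWMENTS[endowment]}\n"
        f"You have ${pot} to divide between yourself and a stranger.\n"
        "How much do you give the stranger? Answer with a number."
    )

variants = {name: dictator_game_prompt(name) for name in ENDOWMENTS}
for name, text in variants.items():
    print(f"--- {name} ---\n{text}\n")
```

Because each variant costs only a model call, the full grid of endowments, stake sizes, and framings can be piloted in minutes before any human study is designed.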
Synthetic Users learning: Front-load AI into your research, as opposed to using it only at the end to summarise transcripts and findings.
Using GPT for Market Research
In the Summer of 2023, James Brand (Microsoft), Ayelet Israeli (Harvard Business School, Marketing Unit) and Donald Ngwe (Microsoft) published this paper. Through various experiments, they reach the following conclusions:
LLMs Mimic Consumer Behavior: Research shows that Generative Pre-trained Transformer models can simulate consumer responses to market research surveys, reflecting well-known consumer behavior patterns such as price sensitivity and brand preference. This suggests LLMs can be valuable tools for understanding consumer preferences without the need for traditional surveys.
Realistic Estimates of Consumer Preferences: LLM responses to market research prompts provide realistic estimates of willingness to pay for products and features. These estimates align closely with those obtained from actual consumer surveys, indicating that LLMs can offer a fast, cost-effective alternative to traditional market research methods.
Guidelines for Effective Use: Successful application of LLMs in market research requires careful prompt design to elicit meaningful responses. Factors such as response order, specificity of the requested output, and inclusion of an option not to purchase can significantly influence the quality of the data obtained from LLMs. We are following these guidelines as we roll out Surveys with Synthetic Users.
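As a sketch of those prompt-design guidelines, the snippet below builds a purchase question that randomises response order per respondent and always appends an explicit "no purchase" outside option. The product names and prices are invented for illustration.

```python
import random

# Sketch of the prompt-design guidelines from the paper: shuffle the
# response order across simulated respondents, and include an explicit
# outside option. Products and prices are invented for illustration.

def purchase_survey_prompt(options: list[str], seed: int) -> str:
    """Build a choice question with shuffled options and an opt-out."""
    rng = random.Random(seed)          # seed per respondent: reproducible
    shuffled = options[:]
    rng.shuffle(shuffled)              # vary order to avoid position bias
    shuffled.append("None of these -- I would not purchase")  # outside option
    lines = [f"{i + 1}. {opt}" for i, opt in enumerate(shuffled)]
    return "Which of the following would you buy?\n" + "\n".join(lines)

prompt = purchase_survey_prompt(
    ["Toothpaste A at $4", "Toothpaste B at $5", "Toothpaste C at $6"],
    seed=7,
)
print(prompt)
```

Varying the price of one option across many such prompts, and counting how often the simulated respondents still choose it rather than the opt-out, is how willingness-to-pay estimates are recovered.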
LLMs offer market insights into consumer preferences with efficiency and at a lower cost than traditional methods. This is where Synthetic Users comes into play. By leveraging the capabilities of LLMs, Synthetic Users can provide businesses and researchers with rich, simulated qualitative data that mirrors real human responses. This is particularly useful in the early stages of product development, market research, and social science studies, where gathering large volumes of qualitative data from human participants is prohibitively expensive and time-consuming.
Synthetic Users' technology, inspired by the principles outlined in these papers, offers a glimpse into the future of research and market analysis, where simulated data complements or even substitutes for traditional methods, making research more accessible and efficient.