
Three research papers that helped us build ❤️ Synthetic Users
For the sceptics amongst us who need more tangible research in order to engage with this brave new world. Full disclosure: we are part of the sceptics.
Three papers that we have relied upon whilst developing our product. If you wish to understand the power of LLMs to simulate human behaviour, both at the individual and collective (market) level, these three papers will change your mind. We have no direct connection with the authors.
E pluribus unum
In September 2022, Lisa Argyle, Ethan C. Busby, Nancy Fulda, Joshua Gubler, Christopher Rytting and David Wingate wrote the seminal paper: Out of One, Many: Using Language Models to Simulate Human Samples. It had a profound influence on our team.
Language Models as Social Science Tools: The paper explores the potential of using language models as proxies for human sub-populations in social science research. This approach could revolutionize how researchers study human attitudes, behaviors, and demographics by simulating a wide variety of human responses.
Algorithmic Fidelity and Silicon Sampling: The concept of "algorithmic fidelity" is introduced, referring to the degree to which a language model's outputs accurately reflect the complex patterns of ideas, attitudes, and socio-cultural contexts found in human populations. Through "silicon sampling," where the model is conditioned on socio-demographic backstories, language models demonstrate a high degree of algorithmic fidelity, closely mirroring human response distributions across various studies.
Applications and Implications: The findings suggest that language models can be effectively used in social science research to generate insights into human society, potentially reducing the need for costly and time-consuming data collection from human subjects.
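To make "silicon sampling" concrete, here is a minimal sketch of the idea: condition a model on a first-person socio-demographic backstory before asking the survey question. The backstory fields and wording below are our own illustration, not prompts taken from the Argyle et al. paper.

```python
# Minimal sketch of "silicon sampling": build a first-person backstory
# prompt for one simulated respondent, then ask the survey question.
# Illustrative only -- the fields and phrasing are our own assumptions,
# not the prompts used in the paper.

def silicon_sample_prompt(backstory: dict, question: str) -> str:
    """Return a persona-conditioned prompt for a language model."""
    persona = (
        f"I am a {backstory['age']}-year-old {backstory['gender']} "
        f"from {backstory['state']}. I work as a {backstory['occupation']} "
        f"and I consider myself politically {backstory['ideology']}."
    )
    return f"{persona}\n\nInterviewer: {question}\nMe:"

prompt = silicon_sample_prompt(
    {"age": 42, "gender": "woman", "state": "Ohio",
     "occupation": "nurse", "ideology": "moderate"},
    "How likely are you to vote in the next election, and why?",
)
print(prompt)
```

Sampling many such backstories from census-like distributions, and collecting one completion per persona, is what produces the "silicon sample" whose response distribution can be compared against human survey data.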
Homo Silicus
The second paper that we recommend is John J. Horton’s Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?
Large Language Models (LLMs) can simulate human economic behavior (termed "homo silicus") in various scenarios, allowing researchers to explore economic theories and behaviors through computational simulations. This approach can generate insights into human decision-making processes by endowing these models with specific preferences, information, and scenarios, then observing their choices.
Experiments with LLMs can replicate and extend findings from traditional behavioral economics studies, offering a cost-effective and rapid method to pilot studies and explore variations in experimental design. This includes manipulating the models' endowments (e.g., fairness, self-interest, political views) to observe changes in decision-making, which can help in understanding the underlying factors influencing human economic behaviors.
While promising, using LLMs for social science research also raises questions about the representativeness and validity of the findings, given the models' training on vast, uncurated text data. However, the ability to condition these models to adopt various personas and the low cost of experimentation make them a valuable tool for preliminary research, hypothesis generation, and exploring the parameter space of economic behaviors.
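In the spirit of Horton's homo silicus experiments, here is a sketch of how one might endow otherwise identical prompts with different preferences and observe the effect on choices. The dictator-game framing and the endowment wordings are our own illustration, not the paper's prompts.

```python
# Sketch of endowing simulated economic agents with different preferences.
# Same game, different injected preference -- the researcher then compares
# the model's answers across endowments. Wordings are our own assumptions.

ENDOWMENTS = {
    "self-interested": "You care only about maximising your own payoff.",
    "fairness-minded": "You believe payoffs should be split fairly.",
}

def dictator_game_prompt(endowment: str, pot: int = 100) -> str:
    """One prompt variant: a dictator game with an injected preference."""
    return (
        f"{ENDOWMENTS[endowment]}\n"
        f"You have ${pot} to divide between yourself and a stranger.\n"
        "How much do you give the stranger? Answer with a number."
    )

variants = {name: dictator_game_prompt(name) for name in ENDOWMENTS}
for name, text in variants.items():
    print(f"--- {name} ---\n{text}\n")
```

Because each variant costs only a model call, the full grid of endowments, stake sizes, and framings can be piloted in minutes before any human study is designed.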
Synthetic Users learning: Front-load AI into your research, as opposed to using it only at the end to summarise transcripts and findings.
Using GPT for Market Research
In the Summer of 2023, James Brand (Microsoft), Ayelet Israeli (Harvard Business School, Marketing Unit) and Donald Ngwe (Microsoft) published this paper. Through various experiments, they reach the following conclusions:
LLMs Mimic Consumer Behavior: Research shows that Generative Pre-trained Transformer models can simulate consumer responses to market research surveys, reflecting well-known consumer behavior patterns such as price sensitivity and brand preference. This suggests LLMs can be valuable tools for understanding consumer preferences without the need for traditional surveys.
Realistic Estimates of Consumer Preferences: LLM responses to market research prompts provide realistic estimates of willingness to pay for products and features. These estimates align closely with those obtained from actual consumer surveys, indicating that LLMs can offer a fast, cost-effective alternative to traditional market research methods.
Guidelines for Effective Use: Successful application of LLMs in market research requires careful prompt design to elicit meaningful responses. Factors such as response order, specificity of the requested output, and inclusion of an option not to purchase can significantly influence the quality of the data obtained from LLMs. We are following these guidelines as we roll out Surveys with Synthetic Users.
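As a sketch of those prompt-design guidelines, the snippet below builds a purchase question that randomises response order per respondent and always appends an explicit "no purchase" outside option. The product names and prices are invented for illustration.

```python
import random

# Sketch of the prompt-design guidelines from the paper: shuffle the
# response order across simulated respondents, and include an explicit
# outside option. Products and prices are invented for illustration.

def purchase_survey_prompt(options: list[str], seed: int) -> str:
    """Build a choice question with shuffled options and an opt-out."""
    rng = random.Random(seed)          # seed per respondent: reproducible
    shuffled = options[:]
    rng.shuffle(shuffled)              # vary order to avoid position bias
    shuffled.append("None of these -- I would not purchase")  # outside option
    lines = [f"{i + 1}. {opt}" for i, opt in enumerate(shuffled)]
    return "Which of the following would you buy?\n" + "\n".join(lines)

prompt = purchase_survey_prompt(
    ["Toothpaste A at $4", "Toothpaste B at $5", "Toothpaste C at $6"],
    seed=7,
)
print(prompt)
```

Varying the price of one option across many such prompts, and counting how often the simulated respondents still choose it rather than the opt-out, is how willingness-to-pay estimates are recovered.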
LLMs offer market insights into consumer preferences with efficiency and at a lower cost than traditional methods. This is where Synthetic Users comes into play. By leveraging the capabilities of LLMs, Synthetic Users can provide businesses and researchers with rich, simulated qualitative data that mirrors real human responses. This is particularly useful in the early stages of product development, market research, and social science studies, where gathering large volumes of qualitative data from human participants is prohibitively expensive and time-consuming.
Synthetic Users' technology, inspired by the principles outlined in these papers, offers a glimpse into the future of research and market analysis, where simulated data complements or even substitutes for traditional methods, making research more accessible and efficient.