The wisdom of the silicon crowd

In light of an ancient parable, we explore a new paper that examines how ensembles of large language models match the prediction accuracy of human crowds. It finds that combining machine predictions with human insight yields the most robust forecasting results.

The study "Wisdom of the Silicon Crowd" examines how ensembles of large language models (LLMs) compare to human crowds in forecasting accuracy. It introduces a novel ensemble approach where predictions from multiple LLMs are aggregated, demonstrating that this method achieves forecasting results on par with those from a large group of human forecasters. This finding underscores the potential of LLMs to replicate the "wisdom of the crowd" effect traditionally observed in human groups. In other words, aggregated predictions outperform individual forecasts.

A hybrid approach is the correct one

The question then posed is whether LLM forecasts can be enhanced by integrating human-generated predictions. The results indicate that while LLM predictions improve when exposed to human forecasts, the most effective strategy is a hybrid approach, combining both human and machine predictions.

The study showed that the LLMs’ predictions could be improved by incorporating median human predictions, which resulted in a notable increase in accuracy.
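To make the aggregation concrete, here is a minimal sketch of the two steps described above: taking the median of a "silicon crowd" of LLM forecasts, then blending it with the human crowd's median. The 50/50 weight and all the numbers are illustrative assumptions, not values from the paper.

```python
import statistics

def aggregate_crowd(predictions):
    """Aggregate a crowd's probability forecasts by taking the median."""
    return statistics.median(predictions)

def hybrid_forecast(llm_predictions, human_median, weight=0.5):
    """Blend the LLM crowd's median with the human crowd's median.

    `weight` is the share given to the human median; 0.5 is an
    illustrative choice, not a value from the study.
    """
    llm_median = aggregate_crowd(llm_predictions)
    return weight * human_median + (1 - weight) * llm_median

# Example: twelve LLMs forecasting the probability of an event.
llm_preds = [0.62, 0.55, 0.70, 0.58, 0.65, 0.60,
             0.72, 0.50, 0.68, 0.57, 0.63, 0.59]
print(aggregate_crowd(llm_preds))        # median of the silicon crowd: 0.61
print(hybrid_forecast(llm_preds, 0.45))  # blended with a human median of 0.45: 0.53
```

The median, rather than the mean, is the usual choice here because it is robust to a single model producing an extreme outlier forecast.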

John Horton had made this point before the study came out.


This suggests a promising direction for leveraging both human intuition and machine efficiency in decision-making processes.

At Synthetic Users we advocate this approach. Organic users are not to be underestimated. A hybrid approach is by far the best way forward.


The introduction and methodology sections of the study provided a detailed background on the increasing capabilities of LLMs in various complex and economically valuable tasks. The models used were diverse, ranging from GPT-4 to smaller models, each with unique training data and parameters, which could potentially enhance prediction accuracy by reducing individual biases and errors. The study also highlighted the importance of using real-world forecasting questions to ensure external validity and practical applicability of the findings. Hypotheticals have little value in this space.


LLMs perform on par with human crowds in forecasting. The "wisdom of the silicon crowd" marks a significant step forward in applying artificial intelligence to complex cognitive tasks like forecasting, opening up new avenues for future research and application.

The blind men and the elephant

There’s an elephant in the room, surrounded by blind men. In this ancient story, several blind men each touch a different part of an elephant and then attempt to describe the entire animal based on their limited experience. The man who feels the elephant's tusk insists it must be like a spear, while the one who feels its side is certain it is like a wall. Each perspective is both valid and limited.

The parable of the blind men and an elephant is a great metaphor when considering the limitations of both human and machine forecasting: go hybrid (for now). Let’s be honest, it’s just a matter of time before the steering wheel in most cars becomes obsolete, but for the time being, we still need it.

We tend to claim absolute truth based on our limited subjective experience, disregarding other people's subjective experiences. Both human intuition and machine learning can fall into this trap. Humans often rely on personal experiences or biases, while machines can overfit to their training data, failing to generalize to real-world situations.

In the context of forecasting, the parable underscores the value of the hybrid approach. Just as a more accurate understanding of the elephant comes from integrating all the blind men's perspectives, a more accurate forecast comes from integrating both human and machine predictions.

We can mitigate the limitations of each if we combine both, much like the blind men would have a more complete understanding of the elephant if they shared their experiences. This hybrid approach, which leverages the strengths of both human intuition and machine efficiency, holds the promise of providing a more comprehensive and accurate picture in complex forecasting tasks.


Signup to our newsletter

AI-powered user research platform that replaces traditional participant recruitment with synthetic agents. Get research-grade insights in minutes, not weeks.

© 2026 Synthetic Users Inc.
