Why we shuffle between models: to ensure both parity and diversity

Synthetic Users balances aligned and unaligned models to maintain diversity and authenticity in simulated users while ensuring ethical standards and user expectations are met.

A recent paper, "Creativity Has Left the Chat: The Price of Debiasing Language Models" by Behnam Mohammadi, brings to light a critical issue: alignment, particularly through Reinforcement Learning from Human Feedback (RLHF), reduces bias and toxic content but also significantly diminishes the diversity and creativity of large language models (LLMs). This poses a particular challenge for services like Synthetic Users, which rely on LLMs to create realistic and varied users for in-depth interviews and surveys.

The findings from Mohammadi’s research are eye-opening. Aligned models, meaning those refined through RLHF to adhere closely to human values, exhibit lower entropy in their token predictions, form distinct clusters in the embedding space, and gravitate towards attractor states that limit the range of possible responses. This loss of syntactic and semantic diversity can lead to repetitive, homogeneous synthetic users, which undermines the richness and variability that comprehensive user research depends on. In contrast, unaligned base models retain more flexibility and creativity and produce more varied outputs. This presents a dilemma: how do we harness the safety and reliability of aligned models while preserving the creative potential needed to generate diverse synthetic users?
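To make the entropy point concrete, here is a minimal sketch that measures the Shannon entropy of the words in a response. It is a simplification for illustration: the paper looks at the entropy of the model's token predictions, whereas this counts whitespace-separated words in finished text. The function name and the two sample responses are made up for the example.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical distribution of `tokens`."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical responses, invented for this example (not real model output).
repetitive = "i like it i like it i like it".split()
varied = "honestly the onboarding felt slow but pricing seemed fair".split()

print(f"{shannon_entropy(repetitive):.3f}")  # 1.585 (three words, equally frequent)
print(f"{shannon_entropy(varied):.3f}")      # 3.170 (nine distinct words)
```

Lower numbers mean the text keeps reusing the same few words. A real diversity check would likely also compare responses to each other, for example in embedding space, rather than looking at one response at a time.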

At Synthetic Users, we recognize the importance of balancing these two facets to ensure what we term Synthetic Organic Parity (SOP). SOP is crucial for creating synthetic users that not only reflect real human diversity but also adhere to ethical standards and user expectations. To achieve this, we have adopted a hybrid approach that shuffles between aligned and unaligned (base) models. This lets us draw on the strengths of both: by alternating between the consistency of aligned models and the creativity of base models, we aim to keep our synthetic users diverse, realistic, and free from harmful biases.
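One way to picture the shuffle is as a random assignment of each synthetic user to either pool. The sketch below is an illustration under assumptions, not a description of our production system: the model identifiers, the `base_share` parameter, and both function names are hypothetical.

```python
import random

# Hypothetical model identifiers, placeholders rather than real model names.
ALIGNED_MODELS = ["aligned-model-a", "aligned-model-b"]
BASE_MODELS = ["base-model-a"]


def pick_model(rng, base_share=0.5):
    """Pick a base model with probability `base_share`, else an aligned one."""
    pool = BASE_MODELS if rng.random() < base_share else ALIGNED_MODELS
    return rng.choice(pool)


def assign_models(n_users, base_share=0.5, seed=0):
    """Assign one model per synthetic user, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [pick_model(rng, base_share) for _ in range(n_users)]


# Example: assign models to five synthetic users.
print(assign_models(5))
```

Seeding the generator keeps a study reproducible, and `base_share` is the single knob for trading consistency against variety. Outputs from base models would still need a safety and quality check before they reach a researcher; that step is left out here.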

This dual approach not only enhances the authenticity of our simulated interviews and surveys but also upholds the integrity of the data collected, which lets us provide richer insights into user behavior and preferences. As we innovate in this space, we will keep refining how we balance alignment and diversity so that our synthetic users remain both varied and ethically sound. This is the future of AI-driven user and market research: a future where diversity and alignment coexist, driving deeper understanding and better outcomes for all. That ending was a bit vanilla, but you get the picture.