
Enrich your Synthetic Users with your data. RAG tutorial.
Learn how to enrich Synthetic Users with your own data using Retrieval-Augmented Generation (RAG) — making your AI participants more context-aware and more accurate.
What is RAG, in plain terms? Think of it as handing your Synthetic Users a briefing pack before the interview. You upload documents you trust — previous interviews, past surveys, consumer insight reports — and whenever a Synthetic User answers a question, the system first looks through that pack, pulls out the most relevant passages, and uses them to ground its response. That's all "Retrieval-Augmented Generation" means: the model retrieves the right bits of your material, then generates its answer with them in hand.
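For readers who like to see the idea in code, here is a toy sketch of the retrieve-then-generate loop described above. It is not how Synthetic Users works internally: the function names (`retrieve`, `build_prompt`) are made up for illustration, and relevance is scored by simple word overlap, whereas real systems usually use embedding-based search. The point is only the shape of the process: find the most relevant passages first, then hand them to the model along with the question.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def score(question: str, passage: str) -> int:
    """Count how often the question's words appear in the passage (toy relevance)."""
    question_words = set(tokenize(question))
    passage_counts = Counter(tokenize(passage))
    return sum(passage_counts[word] for word in question_words)


def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k passages that share at least one word with the question."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return [p for p in ranked[:top_k] if score(question, p) > 0]


def build_prompt(question: str, passages: list[str], top_k: int = 2) -> str:
    """Assemble the 'briefing pack' plus the question into one prompt string."""
    briefing = "\n".join(f"- {p}" for p in retrieve(question, passages, top_k))
    return (
        "Use these excerpts to ground your answer:\n"
        f"{briefing}\n\n"
        f"Question: {question}"
    )


# Hypothetical example data, standing in for uploaded documents.
briefing_pack = [
    "Customers in our 2023 survey said delivery fees were the main reason for churn.",
    "Interviewees praised the onboarding flow.",
    "Office relocation notes for the facilities team.",
]

prompt = build_prompt("Why do customers churn?", briefing_pack)
print(prompt)
```

In this sketch only the survey passage shares words with the question, so it is the only excerpt placed in the prompt; the unrelated passages are left out.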
Why does this matter? Large language models, left to their own devices, answer from what they learned during training, which is broad but not specific to your customers, your category, or your past research. By anchoring Synthetic Users in data you've curated, you reduce the chance of generic or off-target responses and help the findings reflect your unique context.
One honest caveat: the frontier models we use (we shuffle between several) were trained on a vast amount of internet data spanning roughly the last 25 years, so portions of your documents may already exist in their training set. In that case, uploading them doesn't introduce new information so much as it adds trust and control — you're telling the system, explicitly, "prioritize this." Either way, your data takes precedence.
Where to find it. RAG has a dedicated tab inside every project (1) — click Upload File (2) to add documents. RAG data is applied per project; it never crosses over into your other projects.

You can also add context files from the audience panel while planning a study. Open the Context files (RAG) section (2), then upload a document or pick one already in your project (3). Supported formats include PDF, PPTX, DOCX, XLSX, ODP, ODT, and ODS, up to 10 MB per file. Within those formats we're agnostic about how the content is laid out, so feel free to upload your data in an unstructured manner.
The context space indicator shows how much room you have left. A word of advice: more isn't always better — overloading the context can make your Synthetic Users less reliable, so upload what's relevant rather than everything you have.
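If you prepare files in bulk, a quick local check against the format and size limits mentioned above can save a failed upload. The sketch below is an assumption-laden helper, not part of the product: `check_upload` is a hypothetical name, the extension list mirrors the formats named in this tutorial (which may not be exhaustive), and 10 MB is read conservatively as 10,000,000 bytes.

```python
from pathlib import Path

# Formats named in this tutorial; the platform may accept others.
SUPPORTED_EXTENSIONS = {".pdf", ".pptx", ".docx", ".xlsx", ".odp", ".odt", ".ods"}

# Assumption: "10 MB" taken as 10,000,000 bytes (the stricter reading).
MAX_BYTES = 10 * 1000 * 1000


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems with a candidate file; an empty list means it looks fine."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(no extension)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


# Hypothetical usage with made-up file names and sizes.
print(check_upload("interviews-2023.PDF", 2_500_000))  # no problems expected
print(check_upload("raw-notes.txt", 40_000))           # flagged: format not in the list
print(check_upload("big-deck.pptx", 11_000_000))       # flagged: over the size limit
```

To check real files, pass `Path(p).stat().st_size` as `size_bytes`. Passing the checks only means a file fits the stated limits; whether it is worth uploading is still a judgment about relevance.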

Questions? Book time with our team, or email us at support@syntheticusers.com.