Soak. Get to saturation faster
Rapid, reproducible analysis of qualitative data using LLMs.
Transparent and privacy-friendly
Soak is free to use and open to community contributions. The core system is open source and available on GitHub. Run analyses on your own computer or in a secure, managed environment.
Unlike interactive AI sessions, Soak sticks to the process. Analyses are transparent and the entire process can be audited by reviewers or shared with colleagues.
Expert-directed
Soak is designed to encode genuine expertise in qualitative data analysis. Rather than ad-hoc queries, Soak pipelines are carefully designed and validated, building on existing literature in AI-assisted analyses and allowing domain experts to direct and shape the analysis according to the specific research questions of interest.
Analysis at scale
Even LLMs have limits to their attention. Growing evidence suggests that LLMs exhibit peak/recency effects in their attention [1], which makes it important to present large datasets in manageable chunks. Soak automatically splits data in sensible ways to ensure that the LLM can focus on the most important aspects of the data. Codes and themes generated during independent readings of hundreds or thousands of documents are clustered and consolidated to ensure that nothing important is lost.
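To make the idea of manageable chunks concrete, here is a minimal Python sketch that groups paragraphs into chunks under a word budget. It is an illustration only, not Soak's own splitting logic; the function name chunk_paragraphs and the max_words parameter are hypothetical.

```python
def chunk_paragraphs(text: str, max_words: int = 200) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_words.

    Paragraphs are never split, so a single paragraph longer than
    max_words becomes a chunk of its own.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip empty paragraphs left by extra blank lines
        n = len(para.split())
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps each participant turn or passage intact, which is one plausible reading of "sensible ways".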
Trust but verify
Hallucination and misrepresentation are fundamental risks in LLM-based analyses [1], but ad-hoc analytic sessions make it hard to detect and correct errors. Soak provides tools to thoroughly cross-check all quotes and codes, ensuring that analyses are based on what participants actually said. All quotes and paraphrases from sources are scored against the original texts, and potential issues are flagged for human review.
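As a rough sketch of what scoring a quote against its source could look like, the Python example below uses the standard library's difflib. This is an assumption-laden illustration, not Soak's scoring method: the function names, the sliding-window approach and the 0.9 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lower-case and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def quote_score(quote: str, source: str) -> float:
    """Return a 0..1 similarity between a quote and its best match in source."""
    q, s = normalise(quote), normalise(source)
    if not q:
        return 0.0
    if q in s:
        return 1.0  # verbatim match after normalisation
    q_words, s_words = q.split(), s.split()
    width = len(q_words)
    best = 0.0
    # Compare the quote with every window of the same word length in the source.
    for i in range(max(1, len(s_words) - width + 1)):
        window = " ".join(s_words[i:i + width])
        best = max(best, SequenceMatcher(None, q, window).ratio())
    return best


def flag_quotes(quotes: list[str], source: str, threshold: float = 0.9) -> list[str]:
    """Return the quotes whose best score falls below threshold."""
    return [q for q in quotes if quote_score(q, source) < threshold]
```

Quotes returned by flag_quotes are candidates for human review rather than confirmed fabrications: a low score may simply reflect a legitimate paraphrase.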
Compare and contrast
As they say, "comparison can be the thief of joy". But to verify that an analysis is robust, it can be helpful to compare versions made with different LLMs, different research questions, different analytic frameworks, or even different datasets or subsets of the data. Soak makes comparisons easy and provides visual and quantitative measures of the overlap between the themes extracted.
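One simple way to quantify overlap between two sets of themes is to match theme labels by shared words, as in the Python sketch below. It is only an illustration under stated assumptions: Soak's actual measures are not described here, and the function names and the 0.5 threshold are hypothetical. A word-based match will miss themes that share meaning but not wording.

```python
def tokens(label: str) -> set[str]:
    """Split a theme label into a set of lower-cased words."""
    return set(label.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def theme_overlap(themes_a: list[str], themes_b: list[str], threshold: float = 0.5) -> float:
    """Fraction of themes in A with at least one match in B at or above threshold."""
    if not themes_a:
        return 0.0
    matched = sum(
        1
        for a in themes_a
        if any(jaccard(tokens(a), tokens(b)) >= threshold for b in themes_b)
    )
    return matched / len(themes_a)
```

Note that this measure is asymmetric: theme_overlap(a, b) asks how much of analysis A is recovered in analysis B, so it is worth computing in both directions.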
Structured data extraction for mixed methods
Soak provides tools to extract structured data as part of the analysis. A simple text-based prompting language constrains LLMs to provide answers from fixed sets of categories or scores within a specific range. Structured data can be analysed alongside thematic or qualitative analyses: for example to characterise a sample or quantify other aspects of talk within the data.
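The Python sketch below illustrates the general idea of constraining answers to a fixed set of categories or a numeric range by validating the raw text an LLM returns. It does not reproduce Soak's prompting language, whose syntax is not shown here; the Choice and IntRange classes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """A field whose answer must be one of a fixed set of categories."""
    options: tuple[str, ...]

    def parse(self, raw: str) -> str:
        value = raw.strip().lower()
        if value not in self.options:
            raise ValueError(f"{raw!r} is not one of {self.options}")
        return value


@dataclass(frozen=True)
class IntRange:
    """A field whose answer must be an integer between low and high inclusive."""
    low: int
    high: int

    def parse(self, raw: str) -> int:
        value = int(raw.strip())  # raises ValueError for non-numeric text
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value


# Example field definitions for one document (illustrative names only).
sentiment = Choice(("positive", "negative", "mixed"))
severity = IntRange(1, 5)
```

Rejecting out-of-range answers rather than silently coercing them means that malformed responses can be retried or flagged for review instead of contaminating the quantitative data.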
Ready to try Soak?
Soak is currently in private beta. Request access to start analysing your qualitative data.