Connection to the special issue theme
The call asks how computational tools can enter qualitative workflows without compromising interpretive rigour, methodological reflexivity, or accountable links between analytic claims and empirical materials, and it asks specifically about AI-assisted analytic exploration that surfaces candidate phenomena for closer inspection rather than serving as an analytic endpoint. This paper answers with a worked methodological position: a minimalist form of causal mapping in which the only task given to a large language model is a narrow, locally checkable extraction, while every interpretive decision stays with the analyst. It speaks directly to the issue's themes of the limits of automation, the irreducibility of human judgement, and auditability in AI-supported workflows, and it gives explicit answers to the required disclosures on the type of AI used, its role, and how accountability was maintained.
Theoretical and analytical orientation
The debate about generative AI in qualitative analysis is usually drawn as a line between rejecting the technology to protect non-positivist, meaning-based practice (Jowsey et al. 2025) and embracing it to work at scale. We take a position orthogonal to that line. We share the anti-positivist instinct that a research question is not a decision procedure an algorithm can execute, and we do not claim that our method suits reflexive, Big-Q thematic work in the sense developed by Braun and Clarke (Braun & Clarke 2023; Braun & Clarke 2025). Yet for the large class of applied questions about what people say causes what, a deliberately generic causal mapping approach is surprisingly useful, precisely because its coding act is narrow enough to be checked locally.
The orientation is small-q and exploratory. The unit of analysis is a single causal claim made by a source, recorded as an ordered pair of a cause and an effect, attached to the verbatim quote that supports it and to a source identifier (Axelrod 1976; Eden 1988; Powell et al. 2024; Narayanan 2005). The dataset of such claims is a links table, and that table, rather than a narrative, is the core qualitative product. It has a recognisable relative in Mayring's rule-guided qualitative content analysis (Mayring 2000), differing in that its unit is an ordered pair, which is what makes pathway analysis possible.
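The shape of a links table, and the reason an ordered pair permits pathway analysis, can be shown in a minimal Python sketch. The `Link` class, the `pathways` helper, and the three rows are invented for illustration; they are not drawn from the corpus and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source_id: str  # which interview the claim comes from
    cause: str      # factor the source says does the influencing
    effect: str     # factor the source says is influenced
    quote: str      # verbatim passage supporting the claim


# Invented toy rows for illustration only, not data from the study.
links = [
    Link("S01", "moving away for work", "losing touch with friends",
         "when I moved for the job I just lost touch with everyone"),
    Link("S01", "losing touch with friends", "feeling lonely",
         "not seeing my mates, that's when it got really lonely"),
    Link("S02", "social media use", "feeling lonely",
         "scrolling just makes me feel more alone"),
]


def pathways(links, start, end, max_steps=3):
    """Return every directed chain of links that leads from start to end."""
    found = []

    def walk(node, chain):
        if node == end and chain:
            found.append(chain)
            return
        if len(chain) == max_steps:
            return
        for link in links:
            # Follow links in their stated direction only, never reusing a link.
            if link.cause == node and link not in chain:
                walk(link.effect, chain + [link])

    walk(start, [])
    return found


for chain in pathways(links, "moving away for work", "feeling lonely"):
    print(" -> ".join([chain[0].cause] + [link.effect for link in chain]))
```

Because each row is directed, the same query run from "feeling lonely" back to "moving away for work" returns nothing. An unordered code co-occurrence could not make that distinction.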
A bridge to the interpretive tradition runs through the difference between a code and a theme. On Braun and Clarke's own account, a real theme is not a topic summary but a meaning-based interpretive story built around a central organising idea: it could not have been written before the analysis, because it says something in relation to the research question (Braun & Clarke 2023). A finding, in other words, is answer-shaped. Causal claims are one species of answer-shaped finding, and causal mapping captures that species at scale while keeping each instance tied to its evidence.
Data and methodology
The methodological argument is illustrated with a corpus of 48 interviews on the experience of loneliness among young adults aged 18 to 24, recruited from four deprived London boroughs in 2019 and available as a de-identified open dataset (n.d.). The corpus is analysed in three contrasting ways, which together clarify the position.
First, a causal-mapping pass. Each interview is processed in short chunks. The model is given one instruction: identify each passage where the text says one thing influenced another, and for each record the cause, the effect, and the exact supporting quote. The corpus-level structure is recovered by aggregating the resulting links, around 3,392 quote-grounded causal claims produced in roughly twenty minutes. The analyst then queries the links table with explicit, reversible operations: which factors are most often said to drive an outcome, how pathways differ across subgroups, which claims are contested. This is a qualitative version of the split-apply-combine strategy (Wickham 2011): the model splits, a deterministic pipeline applies, and the human combines. The division of labour is strict. The model is a clerk that proposes quote-backed candidate links; the human is the architect who frames the question, curates the factor vocabulary, designs the pipeline, and writes the interpretation. Evidence that narrow, codebook-anchored extraction is the usable regime for LLM coding (Xiao et al. 2023), and a validation study of AI-assisted causal mapping in particular (n.d.), support treating this one step as reliable enough to be worth automating.
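A minimal Python sketch of this one-way pipeline follows. Several things in it are assumptions made for illustration: `build_links_table` and `drivers` are hypothetical names, `toy_extract` is a canned stand-in for the model call (no real model or API is involved), chunking by paragraph count is one arbitrary choice, and the transcripts are invented. Counting distinct sources rather than raw mentions is also an assumption about how "most often said" might be operationalised.

```python
from collections import defaultdict


def build_links_table(transcripts, extract, chunk_size=2):
    """Split each transcript into short chunks and collect the proposed links.

    `extract` stands in for the model call: it takes one chunk of text and
    returns (cause, effect, quote) triples. It sees one chunk at a time.
    """
    table = []
    for source_id, paragraphs in transcripts.items():
        for start in range(0, len(paragraphs), chunk_size):
            chunk = "\n".join(paragraphs[start:start + chunk_size])
            for cause, effect, quote in extract(chunk):
                table.append((source_id, cause, effect, quote))
    return table


def drivers(table, outcome):
    """Deterministic query: distinct sources naming each cause of `outcome`."""
    sources = defaultdict(set)
    for source_id, cause, effect, _quote in table:
        if effect == outcome:
            sources[cause].add(source_id)
    counts = [(cause, len(ids)) for cause, ids in sources.items()]
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Canned stand-in for the extraction step, invented for illustration only.
CANNED = [
    ("working night shifts", "feeling lonely",
     "the night shifts cut me off from everyone"),
    ("social media use", "feeling lonely",
     "scrolling just makes me feel more alone"),
]


def toy_extract(chunk):
    return [row for row in CANNED if row[2] in chunk]


# Invented transcripts, not data from the study.
transcripts = {
    "S01": ["I do nights at the warehouse.",
            "Honestly the night shifts cut me off from everyone."],
    "S02": ["I'm on my phone a lot.",
            "But scrolling just makes me feel more alone.",
            "And the night shifts cut me off from everyone too."],
}

table = build_links_table(transcripts, toy_extract)
print(drivers(table, "feeling lonely"))
```

Everything after `build_links_table` is ordinary deterministic code, so a query can be rerun, altered, or undone without calling the model again.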
Second, for contrast, a fully autonomous AI thematic pass on the same data, in which an agent was given the transcripts and a high-level instruction and developed and applied a thematic method on its own, producing a readable account of four "mechanism stories". This is neither our position nor the dialogic one: there was no real conversation, and the machine ran the whole interpretation itself, which makes it, oddly, the most positivist of the three. On checking, some of its quotations were not verbatim, one was attributed to the wrong interview, and some of its claims about its own process overstated what had been done. These are exactly the errors that handing interpretation wholesale to a model makes hard to catch.
Third, by reference, the human-AI dialogue that conversational frameworks prescribe (Friese 2025; n.d.; Nguyen-Trung & Nguyen 2026), in which the analyst keeps interpretation by staying in the exchange. The comparison shows that in both the dialogic and the autonomous cases the quote must be requested and then verified after the fact, whereas in causal mapping the quote is the unit of coding, present on every link by construction.
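Because each link carries its quote and source identifier, the two quotation errors found in the autonomous pass (non-verbatim wording and wrong attribution) are the kind of thing a mechanical check over a links table can flag. The sketch below is one hedged way to write such a check. The function names, the whitespace normalisation rule, and the toy rows are assumptions made for illustration, not the procedure used in the study.

```python
import re


def normalise(text):
    """Collapse runs of whitespace so line breaks do not cause false alarms."""
    return re.sub(r"\s+", " ", text).strip()


def audit_quotes(table, transcripts):
    """Return (row, problem) for every link whose quote fails the check."""
    clean = {sid: normalise(text) for sid, text in transcripts.items()}
    problems = []
    for row in table:
        source_id, _cause, _effect, quote = row
        q = normalise(quote)
        if q and q in clean.get(source_id, ""):
            continue  # quote appears verbatim in the stated source
        # Look for the quote in the other transcripts before giving up.
        elsewhere = [sid for sid, text in clean.items() if q and q in text]
        problems.append((row, "misattributed" if elsewhere else "not verbatim"))
    return problems


# Invented transcripts and links, not data from the study.
transcripts = {
    "S01": "I do nights at the warehouse.\n"
           "Honestly the night shifts cut me off from everyone.",
    "S02": "I'm on my phone a lot. But scrolling just makes me feel more alone.",
}
table = [
    ("S01", "working night shifts", "feeling lonely",
     "the night shifts cut me off from everyone"),
    ("S01", "working night shifts", "feeling lonely",
     "night shifts isolated me from everybody"),
    ("S01", "social media use", "feeling lonely",
     "scrolling just makes me feel more alone"),
]

for row, problem in audit_quotes(table, transcripts):
    print(problem, "|", row[0], "|", row[3])
```

The check only establishes that the quoted words exist in the named source. Whether the quote actually supports the coded cause and effect remains a judgement for the human reader.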
Main claims in relation to the literature
- The choice between rejecting and embracing AI is a false one for questions about what people say causes what. A one-way pipeline (narrow extraction, then deterministic transforms, then human synthesis) is more accountable than a conversation precisely because it is not a conversation.
- The position withstands the strongest objections in the current literature. To the methodological-incongruence and skeuomorphism argument (n.d.), we answer that our links table is not a memory crutch imposed on the model, which is stateless across chunks, but an audit artefact for humans and readers; holding the analysis "in the model's head" is the move that reinstates the black box. To the empirical bias finding that LLM coding errors are systematically non-random (Ashwin et al. 2025; Wei et al. 2025), we answer that our workflow relocates bias into an explicit, shareable prompt and makes each unit checkable against its quote, and that the bias literature's own remedy, a narrow model trained on human-coded samples, points toward checkable coding rather than holistic synthesis. To the post-coding claim that the future is dialogic (Friese 2025; n.d.), we answer that the dialogic paradigm puts the analytic state where it cannot be reproduced, and that even its proponents concede there is no guaranteed way to detect AI bias or to prevent fabricated quotes (n.d.). To the deflationary objection that the labour has merely moved into prompt-fiddling, we answer that the concentration of judgement at a few explicit, shareable moments is the gain.
- The anti-positivist rejection and our position are orthogonal rather than opposed. The reject letter scopes its refusal to Big-Q reflexive work and concedes that rule-governed techniques such as content analysis can be automated (Jowsey et al. 2025); our narrow extraction falls on the automatable side of its own line. This extends, rather than contradicts, calls to move beyond binary acceptance and rejection (Friese et al. 2026; n.d.).
AI-use declaration
- Type. Commercial large language models, used at the extraction step only; specific model versions are reported because behaviour changes across versions.
- Role. The model proposes candidate causal links from short text chunks, each paired with a verbatim quote. It does not summarise across documents, choose the codebook, write the report, or make any claim about the world.
- Accountability. Every link is traceable to a specific quote and source; every step beyond extraction is deterministic and human-authored; the extraction prompt functions as a shareable codebook; the links table can be inspected line by line. Sensitive material is handled with training-on-input disabled where available, with attention to consent, confidentiality and removal of identifying information at the chunking stage.
The full paper develops the argument and the worked comparison in detail, with practical guidance on the workflow and a full account of its limits.
References
Ashwin, Chhabra, & Rao (2025). Using Large Language Models for Qualitative Analysis Can Introduce Serious Bias. SAGE Publications Inc. https://doi.org/10.1177/00491241251338246.
Axelrod (1976). The Analysis of Cognitive Maps. In Structure of Decision: The Cognitive Maps of Political Elites.
Braun, & Clarke (2023). Toward Good Practice in Thematic Analysis: Avoiding Common Problems and Be(Com)Ing a Knowing Researcher. Taylor & Francis. https://doi.org/10.1080/26895269.2022.2129597.
Braun, & Clarke (2025). Reporting Guidelines for Qualitative Research: A Values-Based Approach. Routledge. https://doi.org/10.1080/14780887.2024.2382244.
Eden (1988). Cognitive Mapping. https://doi.org/10.1016/0377-2217(88)90002-1.
Friese (2025). Conversational Analysis with AI - CA to the Power of AI: Rethinking Coding in Qualitative Analysis. https://doi.org/10.2139/ssrn.5232579.
Friese, Nguyen-Trung, Powell, & Morgan (2026). Beyond Binary Positions: Making Space for Critical and Reflexive GenAI Integration in Qualitative Research. https://doi.org/10.2139/ssrn.5962174.
Jowsey, Braun, Clarke, Lupton, & Fine (2025). We Reject the Use of Generative Artificial Intelligence for Reflexive Qualitative Research. https://doi.org/10.2139/ssrn.5676462.
Mayring (2000). Qualitative Content Analysis. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research. https://doi.org/10.17169/FQS-1.2.1089.
Narayanan (2005). Causal Mapping: An Historical Overview. In Causal Mapping for Research in Information Technology. https://www.google.co.uk/books/edition/_/61z36j6QgmAC?hl=en&gbpv=1.
Nguyen-Trung, & Nguyen (2026). Narrative-Integrated Thematic Analysis (NITA): How Can LLMs Support Theme Generation without Coding?. Routledge. https://doi.org/10.1080/14780887.2026.2638348.
Powell, Copestake, & Remnant (2024). Causal Mapping for Evaluators. https://doi.org/10.1177/13563890231196601.
Wei, Liu, Barany, Ocumpaugh, Mehta, Nasiar, Baker, Zambrano, Vanacore, & Giordano (2025). Cultural Alignment and Biases in Qualitative Coding: Comparing GPT and Human Coders. https://doi.org/10.35542/osf.io/h8u4f_v1.
Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. https://doi.org/10.18637/jss.v040.i01.
Xiao, Yuan, Liao, Abdelghani, & Oudeyer (2023). Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding. In 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581754.3584136.