Wednesday, 22 April 2026

Analysing Documents with AI: A Multi-Stage Prompting Approach

What happens when a data scientist and a statistician are asked to challenge each other's reading of the same paper?

The coding-focused prompting technique described in a previous post has a natural sibling: the same multi-stage, dual-persona approach works remarkably well for document analysis. Instead of building software through iterative expert review, you are analysing a piece of work — a research paper, a dataset report, a literature review — and subjecting it to exactly the same kind of structured, adversarial scrutiny.

This post walks through how that adapted prompt works, why the underlying techniques make it more than a glorified summarisation tool, and what happened when it was tested on a social network analysis of co-authorship patterns in an academic repository.

[Figure: Process flow for the example and approach]


Why Not Just Ask for a Summary?

A single-shot summary prompt is fine if you want a précis. But analysis is different. Analysis requires asking uncomfortable questions: Are the methods appropriate? Do the conclusions follow from the evidence? What has been left out? What assumptions are buried in the framing?

The problem is that a single AI voice tends to hedge. It reports what is there. A prompt that forces the AI to embody two distinct 'expert' perspectives — with different training, different instincts, and different things they are likely to notice — changes the texture of the output entirely.

"Defining your own personas is a deliberate act of prompt engineering. You are essentially programming the lens through which the document will be evaluated."

— Adapted from the original Vibe Coding post

The Three Stages

Stage 1 — Context Gathering

Before any analysis begins, the AI asks questions about the document — one at a time. This single-question interviewing technique is borrowed directly from the coding approach. It prevents the AI from making assumptions about your purpose, your audience, or what you actually want to get out of the analysis. The process continues until you type "stop it", at which point an initial analysis is produced. Think of this as the first draft: a baseline reading of the document before the real scrutiny begins.
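The shape of that Stage 1 loop can be sketched in a few lines of Python. Everything here is illustrative: `ask_model` and `get_answer` are hypothetical stand-ins for the model call and the user's reply, neither of which is specified in the prompt itself.

```python
def gather_context(ask_model, get_answer, stop_phrase="stop it"):
    """Single-question interviewing: the model poses ONE question per
    turn, informed by the answers so far, until the user types the
    stop phrase. Returns the collected question/answer pairs."""
    transcript = []
    while True:
        question = ask_model(transcript)   # model sees prior Q&A, asks one question
        answer = get_answer(question)      # user replies, or types the stop phrase
        if answer.strip().lower() == stop_phrase:
            return transcript
        transcript.append((question, answer))
```

The point of returning the transcript is that it becomes the context for the initial analysis: the baseline reading is conditioned on your stated purpose and audience, not on the model's guesses.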

Stage 2 — Persona Definition

You define two expert personas. The choice matters more than it might seem. For a research paper in the social sciences, a data scientist and a statistician will notice very different things — and will disagree in productive ways. A data scientist may focus on what the data could reveal if the analysis were extended; a statistician may immediately reach for questions of validity, sampling, and inference. Both share one trait: they play devil's advocate, but constructively. They are not there to dismiss — they are there to sharpen.
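The "programming the lens" idea can be made concrete with a small, hypothetical data structure for a persona. The field names are assumptions for illustration, not part of the original prompt.

```python
from dataclasses import dataclass

@dataclass
class Persona:
    """A reviewer persona to be simulated by the model (illustrative)."""
    name: str
    instincts: str                # what this expert tends to notice first
    devils_advocate: bool = True  # challenges, but constructively

data_scientist = Persona(
    name="Data Scientist",
    instincts="network topology; what the data could reveal if extended",
)
statistician = Persona(
    name="Statistician",
    instincts="validity, sampling, and whether the inference is justified",
)
```

Writing the personas down this explicitly, even informally in the prompt, is what stops the two voices from collapsing into one.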

Stage 3 — Iterative Expert Review

This is where the depth emerges. The two personas review the document together, one insight at a time. Each round surfaces observations, suggested improvements, points of agreement or disagreement, and a shared refinement. After each round, a single focused question is posed to you. Your answer shapes the next round. When you are done, type "stop please" and the AI produces a revised analysis reflecting everything discussed.
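A rough sketch of that Stage 3 loop, again with hypothetical `run_round` and `get_answer` stand-ins for the model's review round and the user's reply:

```python
def review_rounds(personas, run_round, get_answer, stop_phrase="stop please"):
    """Iterative expert review: one insight per round. Each round yields
    observations, improvements, (dis)agreements, a shared refinement,
    and a single focused question whose answer steers the next round."""
    history = []
    while True:
        round_result = run_round(personas, history)  # one round of dual review
        answer = get_answer(round_result["question"])
        if answer.strip().lower() == stop_phrase:
            return history  # basis for the revised analysis
        history.append((round_result, answer))
```

The key design choice is that `history` is threaded back into every round: each insight builds on the last, rather than the model re-reading the document from scratch.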

— ✦ —

The Prompt Techniques at Work

Tree of Thoughts · Chain of Thought · Persona Simulation · Iterative Refinement · Single-Question Prompting

Tree of Thoughts (ToT) is what gives the dual-persona review its analytical range. Each expert independently explores the document from their own vantage point, following different branches of reasoning before the two perspectives are brought together. You see not just what they conclude, but how they arrived there — and where they diverge.
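That branch-then-merge shape can be sketched as follows; `explore` and `merge` are hypothetical stand-ins for a model call per persona and a synthesis step:

```python
def tree_of_thoughts(document, personas, explore, merge):
    """Each persona explores the document along its own branch of
    reasoning; only then are the branches brought together, so the
    synthesis can report both conclusions and divergences."""
    branches = {p: explore(p, document) for p in personas}  # independent branches
    return merge(branches)  # combine: shared conclusions plus points of divergence
```

Keeping the branches independent until the merge step is what prevents the second persona from simply agreeing with the first.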

Chain of Thought (CoT) ensures the reasoning is made explicit at every step. Rather than jumping to a verdict on the document's methodology or conclusions, each expert shows their working. This transparency is what distinguishes genuinely useful feedback from superficially confident assertions.
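One simple way to elicit that explicit reasoning, shown here as an assumed phrasing rather than the original prompt's wording, is to append a show-your-working instruction to each expert's turn:

```python
# Illustrative CoT suffix appended to each expert's instructions
# (an assumption about phrasing, not part of the original prompt).
COT_SUFFIX = (
    "Before giving your verdict, show your reasoning step by step: "
    "what you noticed, why it matters, and what would change your mind."
)

def with_cot(expert_instructions: str) -> str:
    """Append the chain-of-thought instruction to an expert's prompt."""
    return expert_instructions.rstrip() + "\n\n" + COT_SUFFIX
```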

The single-question interviewing in Stage 1 draws on what is sometimes called Socratic elicitation — pacing the context-gathering so that each answer genuinely informs the next question, rather than overwhelming the user with a checklist upfront.

A Real Test: Co-Authorship Networks

The prompt was tested on a document analysing social networks of authors within an academic repository — specifically, who co-authored with whom and what those collaboration patterns revealed about the field.

The two personas were a data scientist and a statistician. The differences in perspective surfaced quickly. The data scientist was drawn to questions of network topology: what did the clustering patterns suggest about invisible research communities? Were there bridging authors who connected otherwise separate groups? What would a different graph layout reveal?

The statistician pushed back on some of those interpretations. Co-authorship networks are noisy. Repository coverage is uneven. Are we confident the network represents the field, or just the slice of it that happens to deposit papers in this particular archive? The apparent clusters — are they intellectually meaningful groupings, or artefacts of institutional proximity?

Example Exchange — Illustrative of the Dynamic
Data Scientist

The high-degree nodes in this network are doing important structural work — they're not just prolific authors, they look like connectors between research communities that don't otherwise interact. That's worth making explicit in the analysis.

Statistician

Agreed on the observation, but I'd want to be careful about the inference. High degree could reflect institutional size as much as intellectual bridging. Do we have any way to control for departmental co-location before claiming these nodes are doing cross-community work?

That kind of exchange — one voice extending the interpretation, another immediately questioning its basis — is exactly what the prompt is designed to produce. Neither persona is wrong. The tension between them is the point.

— ✦ —

Taking It Further

Once the iterative review has run its course, the natural next step is to ask the AI to produce a structured write-up of the findings — one that reflects not just what the document says, but the questions that were raised and the refinements that emerged. For academic work in particular, this can form the skeleton of a methods critique or a literature review commentary, ready to be developed further.

Is it perfect? No. Some things are missed, and work is probably needed to make the personas question each other's answers more rigorously; these are, after all, simulated experts. It is the start of an idea, not the end solution.

The Complete Prompt

-- DOCUMENT ANALYSIS PROMPT --

I am going to upload a file to be analysed. Keep asking me questions about it until you have enough to proceed with analysis, or I type "stop it" — then do the initial analysis.

You will then prompt the user to provide details of persona1.
You will then prompt the user to provide details of persona2.
Both personas like to play devil's advocate, but phrase their ideas in a constructive way.

You will act as Persona1 and Persona2, and following the approach in STAGE 3, provide iterative feedback — one insight at a time — with the goal of deepening the analysis. You will ask questions until "stop please" is typed in, then provide the revised analysis based on the discussion.

-------------------------
STAGE 3: EXPERT REVIEW
-------------------------
Simulate Persona1, Persona2 review. For each stage of review:
- Provide each expert's observations
- Suggested improvements
- Points of agreement/disagreement
- A shared refinement
— ✦ —

The Bigger Idea

What this prompt demonstrates — like its coding counterpart — is that the structure of a conversation with an AI shapes the quality of what comes out. A flat, single-shot prompt produces a flat, single-pass response. A prompt that builds in stages, defines perspectives, and creates space for genuine disagreement produces something closer to peer review.

The personas do not have to be a data scientist and a statistician. They could be a domain expert and a complete outsider. A methods specialist and a reader focused entirely on implications. A critic and an advocate. The choice of who sits at the table determines what gets noticed — and what gets challenged.

Try it on your next document. You might be surprised what your experts find. But check their answers.


All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon
