Analysing Documents with AI: A Multi-Stage Prompting Approach

What happens when a data scientist and a statistician are asked to challenge each other's reading of the same paper?

The coding-focused prompting technique described in a previous post has a natural sibling: the same multi-stage, dual-persona approach works remarkably well for document analysis. Instead of building software through iterative expert review, you are analysing a piece of work — a research paper, a dataset report, a literature review — and subjecting it to exactly the same kind of structured, adversarial scrutiny.

This post walks through how that adapted prompt works, why the underlying techniques make it more than a glorified summarisation tool, and what happened when it was tested on a social network analysis of co-authorship patterns in an academic repository.

[Figure: Process flow for the example and approach]


Why Not Just Ask for a Summary?

A single-shot summary prompt is fine if you want a précis. But analysis is different. Analysis requires asking uncomfortable questions: Are the methods appropriate? Do the conclusions follow from the evidence? What has been left out? What assumptions are buried in the framing?

The problem is that a single AI voice tends to hedge. It reports what is there. A prompt that forces the AI to embody two distinct 'expert' perspectives — with different training, different instincts, and different things they are likely to notice — changes the texture of the output entirely.

Defining your own personas is a deliberate act of prompt engineering. You are essentially programming the lens through which the document will be evaluated. (Adapted from the original Vibe Coding post)

The Three Stages

Stage 1 — Context Gathering

Before any analysis begins, the AI asks questions about the document — one at a time. This single-question interviewing technique is borrowed directly from the coding approach. It prevents the AI from making assumptions about your purpose, your audience, or what you actually want to get out of the analysis. The process continues until you type "stop it", at which point an initial analysis is produced. Think of this as the first draft: a baseline reading of the document before the real scrutiny begins.

Stage 2 — Persona Definition

You define two expert personas. The choice matters more than it might seem. For a research paper in the social sciences, a data scientist and a statistician will notice very different things — and will disagree in productive ways. A data scientist may focus on what the data could reveal if the analysis were extended; a statistician may immediately reach for questions of validity, sampling, and inference. Both share one trait: they play devil's advocate, but constructively. They are not there to dismiss — they are there to sharpen.
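One way to make the persona choice explicit is to write each persona down as a small record before typing it into the chat. The sketch below is only an illustration: the `Persona` class, its field names, and the `describe` helper are hypothetical, and the focus wording is paraphrased from the paragraph above rather than taken from the tested prompt.

```python
from dataclasses import dataclass

# Shared trait, worded after the post: both personas play devil's advocate, constructively.
SHARED_TRAIT = "You like to play devil's advocate, but phrase your ideas in a constructive way."


@dataclass(frozen=True)
class Persona:
    """Hypothetical record for one expert persona (not part of the original prompt)."""
    role: str   # e.g. "data scientist"
    focus: str  # what this expert is likely to notice

    def describe(self) -> str:
        """Render the persona as text that could be typed in at the persona step."""
        return f"A {self.role} who focuses on {self.focus}. {SHARED_TRAIT}"


data_scientist = Persona(
    role="data scientist",
    focus="what the data could reveal if the analysis were extended",
)
statistician = Persona(
    role="statistician",
    focus="validity, sampling, and inference",
)

for persona in (data_scientist, statistician):
    print(persona.describe())
```

Swapping in a different pair only means changing the two records, which keeps the "lens" visible and easy to compare between runs.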

Stage 3 — Iterative Expert Review

This is where the depth emerges. The two personas review the document together, one insight at a time. Each round surfaces observations, suggested improvements, points of agreement or disagreement, and a shared refinement. After each round, a single focused question is posed to you. Your answer shapes the next round. When you are done, type "stop please" and the AI produces a revised analysis reflecting everything discussed.
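The three stages can be read as a small state machine driven by what the user types. The sketch below is a minimal illustration under that assumption: the `Stage` and `Session` names are hypothetical, no AI model is called, and it only models the two stop phrases and the two persona descriptions (the real prompt also lets the AI move on once it decides it has enough context).

```python
from enum import Enum


class Stage(Enum):
    CONTEXT = "context gathering"
    PERSONAS = "persona definition"
    REVIEW = "expert review"
    DONE = "revised analysis"


class Session:
    """Tracks which stage the conversation is in, based only on user input."""

    def __init__(self):
        self.stage = Stage.CONTEXT
        self.personas = []

    def handle(self, user_message):
        """Update the stage for one user message and return the resulting stage."""
        text = user_message.strip()
        if self.stage is Stage.CONTEXT:
            # Stage 1 ends when the user types the first stop phrase.
            if text.lower() == "stop it":
                self.stage = Stage.PERSONAS
        elif self.stage is Stage.PERSONAS:
            # Stage 2 collects two persona descriptions, one per message.
            self.personas.append(text)
            if len(self.personas) == 2:
                self.stage = Stage.REVIEW
        elif self.stage is Stage.REVIEW:
            # Stage 3 continues round by round until the second stop phrase.
            if text.lower() == "stop please":
                self.stage = Stage.DONE
        return self.stage


session = Session()
messages = [
    "It is a research paper on co-authorship",  # answer to a context question
    "stop it",
    "a data scientist",
    "a statistician",
    "Yes, please look at co-location",          # answer to a review question
    "stop please",
]
trace = [session.handle(message) for message in messages]
for message, stage in zip(messages, trace):
    print(f"{message!r} -> {stage.value}")
```

Using two different stop phrases, as the prompt does, means a stray "stop" in one stage cannot accidentally end the other.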

— ✦ —

The Prompt Techniques at Work

Tree of Thoughts · Chain of Thought · Persona Simulation · Iterative Refinement · Single-Question Prompting

Tree of Thoughts (ToT) is what gives the dual-persona review its analytical range. Each expert independently explores the document from their own vantage point, following different branches of reasoning before the two perspectives are brought together. You see not just what they conclude, but how they arrived there — and where they diverge.

Chain of Thought (CoT) ensures the reasoning is made explicit at every step. Rather than jumping to a verdict on the document's methodology or conclusions, each expert shows their working. This transparency is what distinguishes genuinely useful feedback from superficially confident assertions.

The single-question interviewing in Stage 1 draws on what is sometimes called Socratic elicitation: pacing the context-gathering so that each answer genuinely informs the next question, rather than overwhelming the user with a checklist upfront.

A Real Test: Co-Authorship Networks

The prompt was tested on a document analysing social networks of authors within an academic repository — specifically, who co-authored with whom and what those collaboration patterns revealed about the field.

The two personas were a data scientist and a statistician. The differences in perspective surfaced quickly. The data scientist was drawn to questions of network topology: what did the clustering patterns suggest about invisible research communities? Were there bridging authors who connected otherwise separate groups? What would a different graph layout reveal?

The statistician pushed back on some of those interpretations. Co-authorship networks are noisy. Repository coverage is uneven. Are we confident the network represents the field, or just the slice of it that happens to deposit papers in this particular archive? The apparent clusters — are they intellectually meaningful groupings, or artefacts of institutional proximity?

Example Exchange — Illustrative of the Dynamic
Data Scientist

The high-degree nodes in this network are doing important structural work — they're not just prolific authors, they look like connectors between research communities that don't otherwise interact. That's worth making explicit in the analysis.

Statistician

Agreed on the observation, but I'd want to be careful about the inference. High degree could reflect institutional size as much as intellectual bridging. Do we have any way to control for departmental co-location before claiming these nodes are doing cross-community work?

That kind of exchange — one voice extending the interpretation, another immediately questioning its basis — is exactly what the prompt is designed to produce. Neither persona is wrong. The tension between them is the point.
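The statistician's question can be turned into a quick first check. The sketch below uses a made-up toy network (the author labels, institutions, and the `cross_institution_share` helper are all hypothetical, not data or code from the analysed document) and compares each author's degree with the share of their co-authors who sit at a different institution. It is a rough descriptive check, not a statistical control.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is one co-authorship tie.
coauthorships = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
# Hypothetical institution for each author.
institution = {"A": "Uni1", "B": "Uni1", "C": "Uni1", "D": "Uni2", "E": "Uni2"}


def neighbours(edges):
    """Build an undirected adjacency map: author -> set of co-authors."""
    adjacency = defaultdict(set)
    for first, second in edges:
        adjacency[first].add(second)
        adjacency[second].add(first)
    return adjacency


def cross_institution_share(edges, inst):
    """For each author, the fraction of co-authors based at another institution."""
    adjacency = neighbours(edges)
    return {
        author: sum(inst[other] != inst[author] for other in others) / len(others)
        for author, others in adjacency.items()
    }


adjacency = neighbours(coauthorships)
degree = {author: len(others) for author, others in adjacency.items()}
share = cross_institution_share(coauthorships, institution)

for author in sorted(degree):
    print(f"{author}: degree={degree[author]}, cross-institution share={share[author]:.2f}")
```

In this toy case the highest-degree author has most ties inside their own institution, which is the kind of pattern that would support the statistician's caution about reading high degree as cross-community bridging.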

— ✦ —

Taking It Further

Once the iterative review has run its course, the natural next step is to ask the AI to produce a structured write-up of the findings — one that reflects not just what the document says, but the questions that were raised and the refinements that emerged. For academic work in particular, this can form the skeleton of a methods critique or a literature review commentary, ready to be developed further.

Is it perfect? No. Some things are missed, and the prompt probably needs more work to make the personas question each other's answers more. These are simulated experts. It is the start of an idea, not the finished solution.

The Complete Prompt

-- DOCUMENT ANALYSIS PROMPT --
I am going to upload a file to be analysed. Keep asking me questions about it until you have enough to proceed with analysis, or I type "stop it", then do the initial analysis.
You will then prompt the user to provide details of persona1.
You will then prompt the user to provide details of persona2.
Both personas like to play devil's advocate, but phrase their ideas in a constructive way.
You will act as Persona1 and Persona2, and following the approach in STAGE 3, provide iterative feedback, one insight at a time, with the goal of deepening the analysis.
You will ask questions until "stop please" is typed in, then provide the revised analysis based on the discussion.
-------------------------
STAGE 3: EXPERT REVIEW
-------------------------
Simulate Persona1, Persona2 review.
For each stage of review:
- Provide each expert's observations
- Suggested improvements
- Points of agreement/disagreement
- A shared refinement

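If you want to keep a record of the review outside the chat, the four items the prompt asks for in each round map naturally onto a small record. The sketch below is an assumption about how one might log a round: the `ReviewRound` class and its field names are hypothetical, and the example values are paraphrased from the illustrative exchange earlier in the post, with the shared refinement invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReviewRound:
    """Hypothetical record of one Stage 3 round (not part of the prompt itself)."""
    observations: Dict[str, str]  # persona name -> that expert's observation
    improvements: List[str]       # suggested improvements
    agreement: str                # point the experts agree on
    disagreement: str             # point the experts disagree on
    shared_refinement: str        # what both would change in the analysis

    def render(self) -> str:
        """Lay the round out under the four headings the prompt asks for."""
        lines = ["Observations:"]
        lines += [f"  {name}: {text}" for name, text in self.observations.items()]
        lines.append("Suggested improvements:")
        lines += [f"  - {item}" for item in self.improvements]
        lines.append(f"Agreement: {self.agreement}")
        lines.append(f"Disagreement: {self.disagreement}")
        lines.append(f"Shared refinement: {self.shared_refinement}")
        return "\n".join(lines)


round_one = ReviewRound(
    observations={
        "Data Scientist": "High-degree nodes look like connectors between communities.",
        "Statistician": "High degree could reflect institutional size, not bridging.",
    },
    improvements=["Make the structural role of high-degree authors explicit."],
    agreement="High-degree nodes are worth discussing in the analysis.",
    disagreement="Whether high degree shows cross-community work.",
    # Illustrative wording only; the post does not report the actual refinement.
    shared_refinement="Treat the connector claim as tentative until co-location is checked.",
)
print(round_one.render())
```

Keeping one record per round also makes it easier to check, afterwards, whether the revised analysis really reflects everything that was discussed.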
— ✦ —

The Bigger Idea

What this prompt demonstrates — like its coding counterpart — is that the structure of a conversation with an AI shapes the quality of what comes out. A flat, single-shot prompt produces a flat, single-pass response. A prompt that builds in stages, defines perspectives, and creates space for genuine disagreement produces something closer to peer review.

The personas do not have to be a data scientist and a statistician. They could be a domain expert and a complete outsider. A methods specialist and a reader focused entirely on implications. A critic and an advocate. The choice of who sits at the table determines what gets noticed — and what gets challenged.

Try it on your next document. You might be surprised what your experts find, but check their answers.


All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon
