Tuesday, 14 April 2026

GenAI as co-author and, more importantly, as "Devil's Advocate"

In a companion post on context stacking, I came across an idea that stayed with me — and I wanted to explore it further.

This piece isn’t just about the final blog produced, but about the process behind creating it using generative AI (specifically, Claude.ai). Rather than using AI as a writing shortcut, I used it as a thinking partner — one that could challenge my assumptions, test my reasoning, and help strengthen the argument before anything was finalised.

What emerged was a structured workflow (shown below) that others can adopt when using AI to improve the rigour of their thinking, not just their output. And it all starts with setting the context and the audience, and telling the generative AI to pick it apart.


The process:

Before using generative AI, two drafts were produced; the second draft then went through the final process described by Claude.ai below.

Here's the workflow we followed:

1. Critical Reading and Initial Diagnosis

We started with a close reading of your original draft, analysing what was working well structurally and argumentatively, and identifying the key weaknesses an academic reviewer would likely challenge — particularly the flipped classroom compliance problem you raised directly.

2. Devil's Advocate Questioning — Round One

Rather than moving straight to revision, we used structured devil's advocate questioning across four pressure points: the transferability of the central claim, the wrong mental model problem, academic displacement, and the equity argument. Each question was designed to find where the argument was exposed rather than where it was strong.

3. Iterative Stress-Testing — Multiple Rounds

Crucially, we didn't accept your first answers. Each response you gave was pushed harder in the next round — your answers generated new, more specific questions. This produced several iterations where the argument either strengthened genuinely or revealed honest limitations. Notably, when you challenged my assertion about academic workload lacking evidential basis, that exchange itself modelled the critical thinking the piece advocates.

4. Position Clarification on Key Tensions

Through this process several important clarifications emerged that weren't in the original draft: the champion fragility problem, the coordinated misconception risk, lived experience as embodied presence rather than purely knowledge, and most significantly the honest narrowing of your target audience from "all academics" to "student-centred module leaders with genuine autonomy."

5. Honest Scoping Decision

The pivotal moment was agreeing that the piece was most honestly a proof of concept argument for early adopters rather than a universal prescription — and recognising that this resolved the central contradiction between your equity argument and your actual audience.

6. Structural Redesign Before Drafting

Before writing a single word of the revised piece, we agreed a revised section architecture that mapped the new argument logic — ensuring the structure served the argument rather than retrofitting the argument into the original structure.

7. Drafting with Referenced Integration

Only then did we draft, integrating Biggs and Tang and Freeman et al. as gaps the conversation had specifically identified, and producing a piece that honestly names its scope in the subtitle rather than burying the limitation.


The meta-point worth noting

This workflow is essentially the context stacking methodology your piece advocates — applied to academic writing rather than lecture preparation. We built a model, stress-tested it against challenging questions, identified where it broke down, refined it, and only then committed it to final form. The process mirrored the argument.


The bigger takeaway

This process highlights a more powerful way to use generative AI. Instead of treating it as a tool for faster writing, it can be used to improve how we think. By asking it to question, challenge, and probe our ideas, we can produce work that is more robust, more honest, and ultimately more useful. On a more personal note, it pushed me hard to examine what I was actually thinking and meaning; it didn't give me the answers.

The inverted questioning and the multiple rounds were the key strengths of the approach at each stage. I was asked some insightful but critical questions. I didn't want it to give me the answers, but to push me to address (and face) critical questions.


All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon

AI, the Flipped Classroom and a Possible Future of the Lecture

A proof of concept argument for student-centred module leaders

A tweet recently caught my attention https://x.com/ihtesham2005/status/2041576806810370553?s=20. It described an MIT student who had developed what he called “context stacking” — uploading lecture materials, readings and related papers into an AI tool before each class, then using carefully constructed prompts to build a mental model of the content before setting foot in the lecture hall. By the time he arrived, the professor wasn’t teaching him anything new. They were confirming, refining and occasionally surprising him. That surprise, he said, was the only thing he wrote down.

This is not simply pre-reading with extra steps. Using generative AI as an external thinking partner, this student was identifying gaps in his own understanding before the lecture began — doing what good tutors have always done, asking not “what do you know?” but “where does your understanding break down?” This maps directly onto the higher-order thinking that Bloom et al. (1956) place at the top of the learning hierarchy: analysis, evaluation, synthesis. Traditionally, students only reached those levels during revision or assessment. This approach gets them there before the lecture starts.

This is, in effect, an AI-enhanced flipped classroom. Bergmann and Sams (2012) argued that class time should be reserved for learning activities that most benefit from human interaction. If AI can now handle the initial construction of conceptual understanding, the classroom becomes available for exactly that. Biggs and Tang (2011), whose constructive alignment framework connects learning outcomes, activities and assessment into a coherent whole, would recognise the logic immediately: align the preparation to the outcome, and the lecture becomes the place where understanding is tested, not transmitted.

The Risk Inside the Opportunity

There is a genuine risk embedded in this approach. A student who arrives with a confidently pre-constructed mental model may filter everything the lecturer says through that model, accepting what confirms it and discarding what challenges it. In STEM subjects this can be particularly consequential: a subtly wrong understanding held with confidence may be harder to correct than no understanding at all.

Kahneman’s (2011) distinction between System 1 and System 2 thinking is instructive here. AI-assisted pre-learning risks producing fluent, fast System 1 confidence — the feeling of understanding — where slow, deliberate System 2 scrutiny is required. Bjork and Bjork (2011) identified “desirable difficulties” as central to deep learning: the productive struggle that counterintuitively strengthens long-term retention. A further risk, less discussed, is coordinated misconception at scale: if a whole cohort context-stacks using similar prompts, they may arrive sharing the same confident errors.

The methodological answer is not to abandon peer discussion but to structure it differently. Rather than asking groups to arrive at a shared model, the task should be to stress-test their models against challenging, preferably unseen questions: does the model still hold? A lecturer who deliberately unsettles a pre-formed mental model is not undermining a student’s preparation — they are completing it. The goal is not to arrive at the lecture with answers. It is to arrive with better questions, and with enough structure in one’s thinking to recognise, rather than deflect, the moment when an expert says something that doesn’t fit.

What This Means for the Academic

If the student’s job is to arrive with better questions, the academic’s job becomes something far more interesting than content delivery. The academic most at risk is not the expert practitioner who contextualises theory through their own research and professional experience. The academic most at risk is the one whose primary function is delivering established content — and students can already find that content elsewhere. Boyer (1990) argued that teaching and research are not in tension but expressions of the same scholarly identity. AI-assisted preparation, used well, restores academics to that identity: real-time sense-making, the ability to respond to where students actually are, and the capacity to generate connections no uploaded document anticipated.

Critically, lived experience remains beyond what AI can authentically replicate. When an academic says “in my own research I found…” or “working on project X, this is what happened…” they are doing something an AI cannot: demonstrating embodied disciplinary judgement, visibly comfortable with complexity and genuinely responsive in the moment. Students recognise this. It is not merely what the academic knows but the evident confidence with which they inhabit that knowledge that establishes authority. Research on active versus traditional lecturing consistently finds that student-centred instruction produces superior learning outcomes (Lasnier et al., 2022; Wieman, 2014). Chickering and Gamson’s (1987) pillars of quality higher education — contact with faculty, active learning and prompt feedback — describe exactly what this approach makes possible.

Who This Is For — and What That Demands

It is important to be honest about the scope of this argument. Context stacking as a structured pedagogical approach is most immediately applicable to module leaders who already have genuine autonomy over their formative activities, and who already lean toward student-centred practice. For those academics, the workload shift required is not radical: reflecting carefully on what formative questions best expose shallow understanding is good practice anyway. It is, as Biggs and Tang (2011) would frame it, a matter of aligning activities more deliberately to intended outcomes.

This is not, however, an argument for leaving the approach to individual discovery. That path privileges the already privileged: students with the digital literacy and self-regulation to experiment independently, and academics with the confidence and autonomy to innovate alone. Wingate (2006) argued that study skills should be embedded in the curriculum rather than treated as bolt-on extras. AI-assisted study methodology should now be considered foundational academic literacy, no different from library skills or academic writing. Without structured development, generative AI risks producing convergent homogenisation of thinking rather than deeper understanding (Kirkpatrick et al., 2025).

The Responsibility of Early Adopters

This is a proof of concept argument, not a universal prescription — and that distinction carries a specific obligation. Champions of new pedagogical approaches are structurally fragile: they leave, get promoted, or burn out, and the approach collapses with them unless it has been evaluated and embedded beyond individual enthusiasm. The responsibility on early adopters is therefore clear: document rigorously, evaluate honestly, and disseminate actively — through internal teaching and learning conferences, through peer networks, and through publication in practitioner journals.

The workload question in particular remains underresearched and deserves honest investigation rather than reassurance. Institutional buy-in is more likely to follow demonstrated outcomes than theoretical argument. Teaching-focused institutions, often more dependent on student income than research income, have strong structural incentives to get pedagogy right — and the evidence base for active, student-centred approaches is already compelling (Freeman et al., 2014). What is needed now is evidence that this specific approach works, for which students, under which conditions, and at what cost.

Conclusion

The MIT student did not use AI to avoid learning. He used it to arrive ready to learn at a level most students never reach until revision, if at all. The question for educators is not whether students will use these tools — they will — but whether those of us with the autonomy and inclination to respond will do so rigorously enough to build an evidence base that eventually moves the approach beyond the already-converted.

The irreplaceability of the human academic does not lie in knowing more than an AI. It lies in being present in the room, reading the moment, and asking the question that unsettles comfortable certainty and replaces it with something harder and more durable. That is not a role under threat from generative AI. It is a role that generative AI, used well by students, finally makes possible at scale — for those willing to meet them there.

References

Bergmann, J., & Sams, A. (2012). Flip your classroom: Reach every student in every class every day. International Society for Technology in Education.
 
Biggs, J., & Tang, C. (2011). Teaching for quality learning at university (4th ed.). Open University Press.
 
Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In M. A. Gernsbacher et al. (Eds.), Psychology and the real world (pp. 56-64). Worth Publishers.
 
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: Handbook 1: Cognitive domain. David McKay.
 
Boyer, E. L. (1990). Scholarship reconsidered: Priorities of the professoriate. Carnegie Foundation for the Advancement of Teaching.
 
Chickering, A. W., & Gamson, Z. F. (1987). Seven principles for good practice in undergraduate education. AAHE Bulletin, 39(7), 3-7.
 
Freeman, S., et al. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences, 111(23), 8410-8415.
https://doi.org/10.1073/pnas.1319030111
 
Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.
 
Kirkpatrick, J., et al. (2025). University students' perceptions of generative AI for critical thinking and creativity. Innovations in Education and Teaching International. 
https://doi.org/10.1080/14703297.2025.2600476
 
Lasnier, A., et al. (2022). Effect of active learning versus traditional lecturing on the learning achievement of college students in humanities and social sciences: A meta-analysis. Higher Education. 
https://doi.org/10.1007/s10734-022-00977-8
 
Wieman, C. (2014). Large-scale comparison of science teaching methods sends clear message. Proceedings of the National Academy of Sciences, 111(23), 8319-8320.
https://doi.org/10.1073/pnas.1407304111
 
Wingate, U. (2006). Doing away with 'study skills'. Teaching in Higher Education, 11(4), 457-469.
https://doi.org/10.1080/13562510600874268


Appendix:

CONTEXT STACKING

BUILD → TEST → FIX → STRESS → VALIDATE

1. Load Inputs (2 days before): Readings, slides, extra sources, problem sets, include notes from last week.

2. Map Ideas: Identify 5 core concepts and connections. Possible Prompt after stage 1: What are the 5 core concepts this week's content is built on, and how do they connect to what I studied last week?

3. Find Gaps: What must you understand to teach it?

Possible Prompt after stage 2: What would I need to genuinely understand about this material to teach it to someone with zero background?

4. Fix Gaps (90 min): Study only weak areas.

5. Think Like Examiner: What questions expose shallow understanding? Possible prompt: What question would expose someone who understands this topic only at a surface level? What question could a professor ask that exposes shallow understanding?

6. Lecture: Confirm understanding, note surprises only.

7. Lock It In: Reflect and rate confidence
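
For anyone who would rather script part of this loop than paste materials into the chat window each week, a minimal sketch using the Anthropic Python SDK is shown below. It covers only the "Find Gaps" step (stage 3); the model name, the folder layout and the plain-text handling of materials are all assumptions, not part of the original workflow.

# Minimal sketch only: scripting the "Find Gaps" step (stage 3) via the API.
# Assumes the week's materials are exported as plain-text files in a week7/
# folder and that ANTHROPIC_API_KEY is set in the environment.
from pathlib import Path

import anthropic

materials = "\n\n".join(p.read_text() for p in sorted(Path("week7").glob("*.txt")))

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name; use whichever model you have access to
    max_tokens=1500,
    messages=[{
        "role": "user",
        "content": (
            "Here are this week's lecture materials:\n\n"
            f"{materials}\n\n"
            "What would I need to genuinely understand about this material "
            "to teach it to someone with zero background?"
        ),
    }],
)
print(message.content[0].text)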



Note: This blog is to be read alongside the companion blog. In this post Claude.ai was used as a 'co-author', helping to phrase things but, most importantly (the focus of the companion blog), acting as a harsh critic, a 'Devil's Advocate'. The prompts used were specific on this: not just give me the answer, but stress-test it. In some respects this is the reverse of what generative AI tends to do, which is to give you things that confirm your thinking. See https://llmapplied.blogspot.com/2026/04/genai-as-co-author-and-more-importantly.html for more details on the process.


All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon

Friday, 10 April 2026

Starting a Literature Review with GenAI: A Supervisor’s Secret Weapon



If you supervise research students at undergraduate or postgraduate level, you are likely to be very familiar with the "blank stare"—that moment a student first confronts the sheer, overwhelming mountain of academic literature they are expected to read, synthesise, and critique. Information overload is a genuine academic pain point, often manifesting as a severe case of "blank page" syndrome.

As academics, we know that starting is half the battle. This is where Generative AI, like ChatGPT-4o, shines not as a tool to write the review for the student, but as a structural scaffold. Much like using AI as a mirror to transform vague student ideas, we can use GenAI to help students map the thematic landscape of a topic before they dive into deep reading. It breaks the ice, organises chaos into a digestible format, and gives them a structured starting point.

Here is a practical, step-by-step workflow you can share with your students to help them generate a foundational literature matrix.

The Workflow: Mapping the Landscape

The goal of this exercise isn't just to find papers; it’s to identify common features and themes across the literature. By forcing the AI to iterate and categorise, we teach the student how to look for cross-disciplinary themes rather than just reading papers in isolated silos.

Have your students start with this complex prompt, adjusting the topic to their specific research area (in this example, we use "VR in Higher Education"):

Prompt 1:

You will generate a search and produce a summary table of published papers on the following topic: "VR in Higher Education".
Iterate the following Step 1, Step 2, Step 3 and Step 4 five times.
Step 1. Search for 3 new papers relating to the topic and add them to the list of papers stored.
Step 2. Identify Common Features that apply to at least three papers and that were not included in the previous iteration. On each iteration all Common Features are maintained but can be revised.
Step 3. On each iteration, using the papers stored, revise the following table. The table has four parts: the Common Feature, a summary of the Common Feature, all stored papers that have the Common Feature, and all stored papers that don't match the Common Feature.
Step 4. Add the full reference for every paper to a Harvard-styled reference list.
Display the full table. Display the full reference list.

Why this works: The magic here lies in the iteration. The AI builds a comparative matrix, separating papers that share a theme (like "Challenges in Implementation") from those that don't. It immediately provides the student with a high-level, organised view of the current academic discourse.

Once the table is generated, the next step is translating that raw data into academic prose.

Prompt 2:

Using the table and reference list, analyse the results and summarise them with appropriate citations.

This generates a short, synthesised summary of the findings, helping the student see how an academic narrative is woven together from disparate sources.

Levelling Up: The Chain of Density (CoD)

Once students have the basic summary from Prompt 2, they shouldn't stop there. We want to push for richer, more academically dense writing. This is where you can introduce the Chain of Density (CoD) prompting technique.

Instead of accepting the first output, the CoD approach asks the AI to rewrite the summary multiple times, each time identifying missing "entities" (specific methodologies, nuanced findings, or theoretical frameworks) and weaving them into the text without increasing the word count. It forces the summary to become less generic and more informationally rich, mirroring the density of actual academic writing.
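
A possible follow-up prompt, in the spirit of the technique (adapt the numbers to taste):

Rewrite the summary five times. Each time, identify two or three specific entities (methodologies, findings, or theoretical frameworks) from the papers that are missing from the current version, and weave them into the rewrite without increasing the word count. Show each version and list the entities added.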

Ethics and Critical Assessment: The Reality Check

Before sending students off to generate literature matrices, we must establish a clear ethical boundary. GenAI is an assistant, not the primary researcher.

Academics and students alike must be acutely aware of AI's limitations—most notably, its tendency to hallucinate. AI models can, and will, invent realistic-sounding citations or confidently misrepresent a paper’s methodology. Therefore, this workflow is strictly a starting point.

Students must physically track down, read, and verify every single paper the AI may suggest. The AI's synthesis should be treated as a draft map of a new territory; you still have to walk the terrain yourself to verify the landmarks. Relying blindly on AI outputs without human verification is a fast track to academic misconduct.

Final Thoughts

Used thoughtfully, GenAI transforms the daunting initial stages of a literature review into an engaging, structured exercise. It empowers students to overcome the blank page and helps them think thematically from day one.


Further Reading

If you are looking to integrate more AI-assisted workflows into your research or teaching, check out these related posts:

All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon

Saturday, 4 April 2026

No Code, No Problem: How to Use ChatGPT to Compare Any Two Websites

System Overview (produced using ChatGPT)



There's a moment many of us have had: you're looking at a competitor's website, then back at your own, and you just know something's different — but you can't quite put your finger on what. Traditionally, getting a rigorous answer meant hiring a consultant, running expensive user research, or spending hours doing it manually. What if you could get sharp, structured, comparative analysis in under ten minutes — without writing a single line of code?

That's exactly what this project set out to prove.

It started with a specific problem: comparing course websites to understand how they stacked up against a competitor. The goal wasn't just a surface-level look — it was to understand how real people, with different needs and backgrounds, would actually experience each site. The solution turned out to be a structured ChatGPT workflow built entirely in standard chat, using nothing more than a sequence of carefully designed prompts.

The core idea is simple but powerful: instead of asking ChatGPT one big question and hoping for the best, you break the task into stages. Each stage builds context before the next one begins. By the time the actual analysis runs, ChatGPT isn't working in the dark — it has a detailed picture of both websites and a clear human framework to evaluate them against. The result feels less like a generic AI summary and more like a considered brief from someone who actually did the reading.

Here's how it works.

Step 1 — Feed in the websites

The first subprompt instructs ChatGPT to ask for two websites, one at a time:

"Ask the user to enter two websites to compare. Label them as website1 and website2 respectively. Ask each one as a separate prompt."

Entering each site separately is deliberate. It gives ChatGPT a moment to analyse each one individually before any comparison begins — and it does. After each URL is entered, it produces a quick overview of key characteristics and early signals about the site's purpose, tone, and structure. Think of it as ChatGPT doing its homework before the debate starts.

Step 2 — Define your personas

This is where the workflow gets interesting. Rather than comparing websites in the abstract, the approach anchors the analysis in real human perspectives. Three personas are entered one at a time:

"Then ask three new prompts for new personas to be entered by the user. These will be labelled as persona1, persona2 and persona3 — a new prompt per persona."

The personas used in testing were deliberately varied: a time-pressed, university-educated man in his forties; a semi-retired woman with a doctoral background who leans towards world news; and a recently graduated engineer in his early twenties who lives on his phone. After each persona is entered, ChatGPT expands it — making reasonable assumptions about behaviour, expectations, and priorities. In testing, these inferences were consistently sensible and added useful texture to what could otherwise be quite flat demographic descriptions.

This step is worth pausing on, because it's the secret ingredient. Personas transform the analysis from "which site is better?" to "better for whom?" — which is a much more useful question to answer.

Step 3 — Run the analysis

With two websites and three personas loaded into context, the final subprompt does the heavy lifting:

"Please compare and contrast the websites against the personas. For each persona give a score out of 100 for the following: Overall score and Usability. Also for each persona add a summary. While analysing it take a pessimistic view and suggest improvements. Critically review the marketing and offer, and compare against each other. Present these in a graphical way to aid understanding. The audience to view the results of the analysis is the web team for the two sites."

The output is genuinely impressive. ChatGPT produced an executive summary for each site covering strengths, weaknesses and risks, followed by scored comparisons per persona. It then offered a strategic comparison across dimensions like trust, speed, content depth and engagement — ending with one-line recommendations per site. All without a single spreadsheet, survey or agency brief.

One instruction in that final prompt is worth highlighting: "take a pessimistic view." This small addition makes a meaningful difference. Left to its own devices, ChatGPT tends towards balance and diplomacy. Nudging it towards scepticism pushes the output past polite generalities and into the kind of direct, critical feedback that's actually useful for a web team trying to improve.

What worked well

The staged approach is what makes this work. Each subprompt doesn't just collect information — it primes ChatGPT to think in a particular way before the next input arrives. By the time the comparison runs, the model has a rich, structured mental model of both sites and all three users. That's fundamentally different from dumping everything into a single prompt and hoping for coherence.

The persona framework also proved its value. It gave stakeholders a way into the results that felt human and relatable, rather than abstract. A web team looking at scores for a 22-year-old engineering graduate will instinctively know what to prioritise in a way that a generic usability score simply doesn't communicate.

What's next

The workflow held up well, but there's clear room to evolve. The visualisations produced were functional but basic — future iterations should push for richer, more interactive outputs that make the data easier to present to senior stakeholders. More ambitiously, the analysis could be tailored so each persona receives a version of the report written for them — not just used as a lens within a single document. Imagine handing a one-page summary to a time-poor marketing director versus a detailed breakdown to a UX designer; the underlying data is the same, the framing entirely different.

There's also an argument for making the workflow more dynamic. Rather than moving linearly through the stages, a more sophisticated version might pause after the initial website analysis to ask clarifying questions, or allow personas to be weighted differently depending on the strategic priority of each audience segment.

Areas to Improve

  • Better visualisations — move beyond basic outputs to richer, more interactive displays suited to senior stakeholders
  • Persona-tailored reports — deliver each persona a version of the analysis written for them, not just referenced within a single document
  • A more dynamic workflow — add clarifying questions mid-process and allow personas to be weighted by strategic priority

But as a starting point, this is a genuinely practical, no-code approach to competitive website analysis that any intermediate AI user can pick up today. The prompts are reusable, the structure adapts to almost any industry — from e-commerce to healthcare to financial services — and the whole thing runs in a standard ChatGPT session with no plugins, no integrations, and no specialist knowledge required.

Sometimes the most powerful tools are the ones hiding in plain sight. Have a go yourself and improve this; I'd be happy to see improvements via the comments.

Wednesday, 25 March 2026

AI as a Mirror: Transforming Vague Student Ideas into a More Rigorous Project Agreement





The Problem: The "Generic App" and the "Time Sink"

We’ve all been there: a student walks into a 1-to-1 with a vague desire to "do something with AI" or "build a fitness app." You spend 45 minutes trying to find a technical "hook" that justifies a Level 6 or Level 7 grade, only for the student to drift back into "CRUD app" territory by week three.

The Philosophy: AI as a Mirror

Instead of you doing the heavy lifting, this workflow uses AI as a Mirror. It reflects the student’s own skills and career goals back to them, but with the structural rigour of a virtual supervisory team. It’s not about the AI "giving" the idea; it’s about the AI forcing the student to defend and refine their own concepts until they hold water.

The Framework: 3 Months of Rigour

This prompt is specifically designed for intensive/conversion MSc or summer capstone projects. It assumes a tight 12-week implementation window. By forcing the AI to work within this constraint, we prevent the "I'm building the next Amazon" delusions and focus on a feasible, high-quality technical contribution. Adjusting the window to six months instead of three requires only a minor change to the prompt.

The Supervisor’s Facilitation Guide

To use this tool effectively in a session, remember that it is a tool, not a solution; it will not always be right. I suggest keeping these three "supervisory moves" in mind:

  1. The Technical "Meat": In Stage 2, don't let the student just pick an idea because it "looks cool." Look for the Technical Challenge or Research Question. If the AI suggests a "Security Dashboard," ask the student: "What is the specific investigative element here?"

  2. Lean into the Conflict: In Stage 4, when the "Expert Personas" disagree, that’s your teaching moment. Use that friction to explain Critical Evaluation. If Persona 2 (the Tech Lead) hates the stack and Persona 3 (the Academic) loves the value, ask the student to mediate.

  3. The Technical Sanity Check: Treat AI hallucinations as a pedagogical feature. Tell the student: "The AI suggested this framework—your first task is to find one piece of official documentation proving this is viable for our 3-month window."


Post-Session: From Chat to Agreement

Once the "stop it" command is issued, the work isn't done. The output should serve two purposes:

  1. The Literature Review Skeleton: Use the "Steps" and "Sources" provided to build the student's initial reading list.

  2. The Project Agreement: This output acts as an initial agreement. If the student wants to pivot in Week 8, you refer back to this document to remind them of the agreed scope and technical goals. If they want to pivot in week 1 or 2 then it can be revised.

A Note on Transparency: Encourage students to cite this process in their "Methodology" or "Reflective Practice" chapter. Documenting how they used AI to refine their scope is a great way to demonstrate professional AI literacy. With that in mind, the prompt below was itself refined using ChatGPT, with a few manual tweaks to correct it.


The Prompt

Follow this structured workflow exactly.

-------------------------

STAGE 1: PERSONA AND CONTEXT CREATION

-------------------------

 

Step 1: Create Persona1

- Ask the user to enter the details of Persona1, whose project this will be.

 

 

Step 2: Ask the user for the project area

Ask the user to describe:

- subject area or domain

- technologies of interest

- types of users involved

- preferred themes (e.g. AI, cybersecurity, web, data, accessibility, education, health, sustainability)

- anything to avoid

- project type (software, data-focused, research-led, or mixed)

- desired level of challenge

 

Then summarise the project context.

 

Step 3: Create Persona2. Ask the user to enter the details of this persona.

- A reviewer/adviser with a different perspective

- Include:


  - Expertise

  - What they care about most

  - Common concerns

  - What they consider a strong final-year project

 

Step 4: Create Persona3. Ask the user to enter the details of this persona.

- Another reviewer with a distinct perspective

- Include the same fields as Persona2, adding that this person is naturally pessimistic

 

-------------------------

STAGE 2: IDEA GENERATION

-------------------------

 

Using Persona1 and the project context, generate:

- 5 original project ideas

- Each must include:

  - Title

  - ~100-word summary

  - Why it matters to Persona1

 

Constraints:

- Suitable for UK final-year undergraduate Computing

- Achievable in 3 months

- Not overly broad

 

Then ask the user to choose one idea.

 

-------------------------

STAGE 3: PROPOSAL CREATION & REFINEMENT

-------------------------

 

Generate a proposal including:

- Title

- Summary (max 250 words)

- Aim

- Objectives

- Steps to achieve the goal in 3 months, including the need for a literature review

- Resources needed

- Useful sources of information

 

Then enter a refinement loop:

- Ask targeted questions (scope, users, tech, evaluation, risks, ethics)

- Update proposal after each answer

- Keep it realistic for 3 months

- Continue until the user types: stop it

 

-------------------------

STAGE 4: EXPERT REVIEW

-------------------------

 

After "stop it":

 

Simulate Persona1, Persona2, Persona3 reviewing the proposal.

 

For each stage of review:

- Provide each expert’s observations

- Suggested improvements

- Points of agreement/disagreement

- A shared refinement

 

Review across:

1. stakeholder fit

2. feasibility

3. academic value

4. technical suitability

5. risks and ethics

6. objectives and deliverables

7. resources and sources

 

Finish with:

- Final refined proposal with the following elements: Title

- Summary (max 250 words)

- Aim

- Objectives

- Steps to achieve the goal in 3 months, including the need for a literature review

- Resources needed

- Useful sources of information

- One action takeaway from each expert

 

-------------------------

RULES

-------------------------

- Keep everything feasible within 3 months

- Maintain UK university academic standards

- Ensure clarity and specificity

- Include evaluation considerations

- Avoid overly generic ideas

- Do NOT reveal hidden reasoning, only structured outputs

 

This prompt has itself gone through iterations: starting as a project idea generator for a fixed context, then becoming an idea generator for a specific student (Persona1), and finally a tool that tests the ideas with 'experts' and refines them.

Summary

From a Blank Page to a Stress-Tested Proposal

Starting a final-year project is often a battle against the "blank page" and the hidden risks that only emerge when it’s too late to change course. This framework acts as a Digital Co-Pilot for supervisors and students to use together, ensuring the first step of the academic journey is the right one.

The Methodology: Tree of Thoughts

Rather than providing a single, linear suggestion, this tool uses a Tree of Thoughts approach. It explores multiple branching paths for a project—evaluating different technologies, scopes, and domains—before pruning them down to the most viable candidate. This ensures the final proposal isn't just the first idea, but the best one.

The "Triple-Perspective" Committee

To achieve this, the Co-Pilot simulates a real-world project committee to provide a 360-degree view:

  1. The Student (Persona 1): Focuses on skill levels, career goals, and manageable workloads.

  2. The Academic (Persona 2): Ensures "academic weight," research depth, and alignment with university marking rubrics.

  3. The Pessimistic Engineer (Persona 3): The crucial Inverted AI perspective. This persona acts as the "Devil’s Advocate."

The Power of Pessimism (Inverted AI)

Standard AI is often "hallucinatorily optimistic," promising that complex features can be built in days. In an academic setting, optimism is a risk. By inverting the prompt through a pessimistic lens, we:

  • Identify "Project Killers": We find technical bottlenecks and ethical red flags before they become reality.

  • Aggressively Manage Scope: The pessimist cuts away "feature creep," leaving a lean, high-quality project that is actually achievable in three months.

  • Stress-Test the Logic: If an idea can survive the scrutiny of a skeptic, it is far more likely to survive a final viva or a professional technical review.

Your Collaborative Co-Pilot

This tool is designed for supervisors and students to sit down together. It bridges the gap between a student's ambition and the reality of a 12-week deadline. By the end of the session, the Co-Pilot provides a structured, "vetted" roadmap that has already survived its first round of critical feedback.


Summary for the User: Use this to transform "What should I do?" into "Here is exactly how I will succeed, and here is how I've mitigated the risks."


All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon

Tuesday, 24 March 2026

A Practical Guide to Building Lessons with AI (Real Savings, No Shortcuts)



There is no shortage of articles telling academics that Generative AI is going to transform education. It is, and it will continue to do so. However, many of these pieces are long on enthusiasm and short on detail. This is not one of those.

What follows is a practical account of using ChatGPT to build a real teaching session. I’ll cover what I did, what worked, what failed, and how long it actually took. No hype—just the reality of how it saved me time and how it could possibly do the same for you.

The Test Case

My subject was a four-hour session on Pytest in Django, aimed at final-year BSc Software Engineering students. These students have a basic grasp of Django but possess solid overall coding skills. The session was split into a one-hour lecture and three hours of hands-on practical work in VS Code.

The Strategy: Starting with the Prompt

The key to getting useful output is being specific upfront. Rather than simply asking ChatGPT to "create a lesson on Pytest," I provided a detailed prompt specifying the audience, topic, format, and—crucially—how I wanted the interaction to work. I wanted an iterative process where the AI asked me questions until I was satisfied before producing the final content. Here is the prompt I used:

"I want to create a four-hour teaching session — one hour lecture and three hours of practicals. Topic is Pytest in Django for a group of final-year BSc Software Engineering students. They have a basic understanding of Django.

I want this to be done in two parts: the lecture slides and then the practical teaching material using VS Code.

We will start with the slides. Please ask me questions until I type 'now stop,' at which point the slides should be produced. Then we will move to the practical material, again iterating through questions until I type 'done now'."

This two-part structure was deliberate. By separating the lecture from the lab, I ensured each section got the focus it deserved without the AI trying to do everything at once.

The "Thinking Partner" Approach

What worked best was the question-and-answer refinement loop. Instead of generating a wall of generic content, ChatGPT asked clarifying questions about learning objectives, the depth of detail required, and the specific tools students would use.

This is the part most guides skip: GenAI tools are far more effective as a thinking partner in the design phase than as a one-shot content generator. The questioning actually helped me think through what I did—and didn't—want to include, which ultimately helped me do a better job.

The Results: What was Produced?

  • The Lecture Slides: The initial output provided a logical structure: testing concepts, Pytest vs. Django’s built-in runner, fixtures, and mocking. However, it struggled to calibrate the depth. The first pass was pitched at beginners; it took a few rounds of the question loop to bring the content up to the level of final-year students.

  • The Practicals: ChatGPT produced a series of stepped exercises. The structure was a useful scaffold, but the exercises initially lacked context. They were "bare basics" and needed more "why" behind the "what" to be truly educational for these students.

The Reality Check: What I Changed

The code examples required the most intervention. While some were fine, others contained small but meaningful errors. These are the "silent killers" of a teaching session—errors that would waste ten or twenty minutes of lab time while students struggle to figure out why their environment isn't running.

The Rule: Treat every AI code example as untested until you have run it yourself. I rewrote several examples substantially and tweaked others.
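
To make that concrete, here is a sketch of the kind of small pytest-django check the practicals are built around. It is purely illustrative, not taken from the session materials, and it assumes pytest-django is installed and a URL pattern named "home" exists in the project:

import pytest
from django.urls import reverse


@pytest.mark.django_db
def test_homepage_returns_ok(client):
    # "client" is the Django test client fixture supplied by pytest-django
    response = client.get(reverse("home"))  # assumes a URL named "home"
    assert response.status_code == 200

Small as it is, this is exactly the sort of snippet where a wrong fixture name or a missing @pytest.mark.django_db marker can quietly eat twenty minutes of lab time.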

I also found the slide text a bit "flat." It was accurate but dry. I rewrote the explanatory paragraphs in my own voice to ensure the materials felt like they came from a human, not a manual.

The Bottom Line: How long did it take?

Building a session like this from scratch—slides, practicals, code examples, and timing—usually takes me six to eight hours.

Using this AI-assisted approach, the entire process took about 4 hours. That included the iterative questioning, reviewing the output, fixing the code, testing the code, and rewriting the text.

The time spent was cut by roughly 50%. However, the remaining time still required my full attention, and having a 'partner' asking meaningful questions helped as the activity changed. The saving is real; the shortcut is not.

Is it worth it?

Good For...
  • Structure: Getting a solid framework quickly.
  • Ideation: Prompting you to think of missed topics.
  • Mechanical Tasks: Saving time on slide building.
  • Scaffolding: Generating a base for exercises.

Not Good For...
  • Context: Understanding your specific students.
  • Subject Knowledge: It cannot replace your expertise.
  • Accuracy: Producing ready-to-use code.
  • Calibration: Getting the pacing right without your input.

Where to go from here

If you want to try this, start simple. Pick one session you are already planning. Write a prompt that specifies:

  1. Your audience and their level.

  2. The format you need.

  3. The interaction style (ask me questions first, output second).

Review everything with the same critical eye you’d apply to a textbook you’ve never used before. Fix what’s wrong, cut what doesn’t fit, and keep the AI asking questions until you’re happy.

The goal isn’t to hand your job over to an AI. It’s to spend less time on the mechanical parts of the job so you have more time for the parts that actually require your expertise. In my experience, that is a trade well worth making.



All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon

Wednesday, 18 March 2026

From Boring to Beautiful: How I Used Claude to Transform a Dash App in Minutes


I've been learning Python data visualisation, working through Murat Durmus's Hands-On Introduction to Essential Python Libraries and Frameworks alongside the official Dash tutorial. The resulting code was functional — a basic bar chart comparing data for San Francisco and Montréal — but it looked like exactly what it was: a beginner's first attempt. Plain white background, default colours, numbered axes, and a title that just said "Data Viz."

So I decided to run an experiment. Could Claude AI turn a scrappy 20-line script into something genuinely worth showing people?
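
For reference, here is a minimal sketch of roughly what the starting point looked like, adapted from the Dash tutorial's bar-chart example (the details of my actual script differed slightly):

from dash import Dash, dcc, html
import pandas as pd
import plotly.express as px

# Sample data in the style of the Dash tutorial: values for two cities
df = pd.DataFrame({
    "Fruit": ["Apples", "Oranges", "Bananas"] * 2,
    "Amount": [4, 1, 2, 2, 4, 5],
    "City": ["San Francisco"] * 3 + ["Montréal"] * 3,
})

fig = px.bar(df, x="Fruit", y="Amount", color="City", barmode="group")

app = Dash(__name__)
app.layout = html.Div([
    html.H1("Data Viz"),       # the plain title mentioned above
    dcc.Graph(figure=fig),     # default colours, white background
])

if __name__ == "__main__":
    app.run(debug=True)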


Before running the prompt


The First Prompt

I pasted the code into Claude.ai with a simple instruction: "Rewrite this following code to be graphically more interesting."

The result was striking. Claude switched to a dark "neon terminal" aesthetic — deep navy background, electric teal and magenta accents, and a stylish monospaced font. The bars got proper labels, the axes were cleaned up, and the whole thing felt intentional rather than accidental. It had gone from looking like homework to looking like a developer portfolio piece.


After the 1st prompt



Refining for a Real Audience

I pushed further. Same code, new prompt: "Rewrite this to be graphically more interesting for a general audience. Choose whatever works best for this audience."


After the second prompt



This time Claude made very different choices — and that's the interesting part. Recognising that a general audience needs warmth and clarity rather than technical cool, it switched to a bright, friendly design. Rounded bars in coral and teal, a clean white card layout, and a Nunito font that feels approachable rather than intimidating. It even added summary stat cards above the chart — showing the average and peak month for each city — so someone who doesn't want to "read" a chart can still instantly understand the data.

What I Noticed

The code grew substantially. My original 20 lines became well over 150 — defining colour palettes, layout styles, hover tooltips, and summary components. That might sound like more complexity, but it's actually the opposite: Claude generated the boilerplate so I didn't have to. The finished app is more readable for users, even if there's more code underneath.
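
To give a flavour of what that boilerplate looks like, here is a fragment in the spirit of Claude's first rewrite, continuing from the "before" sketch earlier in the post. The colours and font are placeholders, not the exact values Claude chose:

# Illustrative fragment only: restyling the existing figure from the earlier sketch
NEON = {"bg": "#0b132b", "teal": "#3de0c8", "magenta": "#ff3cac", "text": "#e0e6f0"}

fig.update_layout(
    plot_bgcolor=NEON["bg"],
    paper_bgcolor=NEON["bg"],
    font=dict(family="Courier New, monospace", color=NEON["text"]),
    title="Fruit Sales: San Francisco vs Montréal",
    legend_title_text="City",
)
fig.update_traces(marker_color=NEON["teal"], selector=dict(name="San Francisco"))
fig.update_traces(marker_color=NEON["magenta"], selector=dict(name="Montréal"))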

The bigger lesson? The prompt matters as much as the tool. "More interesting" and "more interesting for a general audience" produced completely different results — one optimised for aesthetics, one for usability.


 

Code based on dash.plotly.com/tutorial and Murat Durmus (2023), pages 143–145.

 

References

Anthropic. (2024). Claude AI [Large language model]. Retrieved from https://claude.ai

Durmus, M. (2023). Hands-on introduction to essential Python libraries and frameworks (pp. 143–145). Amazon KDP. Retrieved from https://www.amazon.com

Plotly Technologies Inc. (2024). Dash documentation: Tutorial. Retrieved from https://dash.plotly.com/tutorial

All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. Twitter @scottturneruon
