Same Prompt, Four AIs: Why the Answers Aren’t the Same

The differences aren’t just in the answers; they’re in the thinking

Generative AI tools are often discussed as if they were interchangeable: different interfaces delivering broadly similar outputs. However, when applied to complex intellectual tasks, meaningful differences begin to emerge.

To explore this, I ran the same academically rigorous prompt through four leading systems: Claude, ChatGPT, Google Gemini, and Copilot. The task required a full thematic analysis of a researcher’s career using the framework developed by Virginia Braun and Victoria Clarke. What followed was not simply variation in output, but variation in how each system approached the act of analysis itself.

Same Input, Different Interpretations

At a high level, the experiment is simple:

One prompt → Four models → Four distinct approaches

What changes is not the instruction, but how each system:

Interprets the task
Handles uncertainty
Applies methodology
Defines ...
Experiments with various forms of LLMs to improve productivity