Saturday, 21 February 2026

Reverse engineering a specification from a solution using GenAI: Part 1

Imagine buying complex furniture, but the instructions are a chaotic pile of sticky notes. This is often the daily reality for software engineers. Whether you are "vibe coding" a new feature or trying to connect two different web services, you often find yourself digging through messy, undocumented code just to understand how things talk to each other.

The OpenAPI Specification (OAS) is the "Universal Instruction Manual" that fixes this.

Why This Matters for Modern Engineering

In a world of AI agents like Gemini, ChatGPT, and Claude, having a machine-readable "map" of your code is a superpower.

  • For Humans: It provides a clear map of what a service does and what it needs.

  • For AI Agents: Tools like Cursor or Windsurf can use an OpenAPI spec to understand your project’s "intent" (the vibe) without getting lost in the syntax of your legacy files.

  • For Students: It allows you to take an old Python/Flask project and instantly generate interactive docs, automated tests, or even a brand-new frontend.

What Can You Do With It?

With an OpenAPI blueprint, developers can plug into tools that do the heavy lifting for them:

  • Create Visual Guides: Turn complex code into sleek, interactive websites where users can test the service with the click of a button.

  • Write Code Automatically: Instantly generate the "glue code" needed for apps or servers, saving hours of manual typing.

  • Automate Testing: Let tools read the blueprint to automatically double-check that the software works exactly as promised.

The Experiment: Can AI Work Backwards?

If we have existing software but no instruction manual, can we use Generative AI to "reverse-engineer" one just by looking at the code?

Step 1: The Raw Materials

We started with a zipped example of a simple Python Flask payroll system (and it is simple): https://github.com/scottturnercanterbury/musical-meme.git


Step 2: The "Vibe" Prompt

We loaded the zip file into the LLM with a direct instruction:

"Unpack this zip file into individual files. Produce an OpenAPI specification based on these files starting with app.py."

The AI successfully navigated the file structure, ignored the junk (like virtual environment folders), and generated a YAML file.


Step 3: The "Nudge" (The Reality Check)

Important: The AI rarely gets it 100% right on the first try. In our test, the initial output was missing key elements, specifically some route parameters and internal logic definitions.

We had to "nudge" the AI to look closer at the Flask routes to ensure every endpoint was captured. This is where your role as an engineer comes in: The AI handles the boilerplate; you handle the accuracy.
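Part of that accuracy check can be automated. The sketch below is an illustration under stated assumptions, not the method used in this experiment. It assumes the Flask object is called app, that routes are declared with @app.route and a literal path string, and it ignores blueprints and add_url_rule. The sample routes are hypothetical, not copied from the payroll project.

```python
import re

# Matches @app.route("/path", ...) or @app.route('/path'); assumes the Flask
# object is named `app` and the path is a plain string literal.
ROUTE_DECORATOR = re.compile(r"""@app\.route\(\s*["']([^"']+)["']""")
# Matches Flask URL variables such as <id> or <int:id>.
FLASK_VARIABLE = re.compile(r"<(?:[^:<>]+:)?([^<>]+)>")


def flask_routes(source: str) -> set[str]:
    """Extract route paths from Flask source, rewritten in OpenAPI {name} style."""
    return {
        FLASK_VARIABLE.sub(r"{\1}", path)
        for path in ROUTE_DECORATOR.findall(source)
    }


def missing_paths(source: str, spec: dict) -> set[str]:
    """Routes found in the code that the spec's `paths` object does not list."""
    return flask_routes(source) - set(spec.get("paths", {}))


# Hypothetical Flask source and a spec that forgot one route.
SAMPLE_SOURCE = '''
@app.route("/")
def index(): ...

@app.route("/employee/<int:employee_id>/edit", methods=["GET", "POST"])
def edit_employee(employee_id): ...
'''
SAMPLE_SPEC = {"paths": {"/": {"get": {}}}}

if __name__ == "__main__":
    print(sorted(missing_paths(SAMPLE_SOURCE, SAMPLE_SPEC)))
```

Anything the function reports is a candidate for the next nudge. It only compares paths, so missing methods or parameters still need a human eye.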


I added a further  

The "Aha!" Moment: Capturing the Intent

One of the most impressive parts of this experiment was how the AI identified the vibe of the project.

Even though modern development leans toward JSON APIs, this legacy project was a server-side HTML app. The AI correctly identified that the responses were text/html rather than application/json. It understood the original coder’s intent without being told.

JSON
{
  "openapi": "3.1.0",
  "info": {
    "title": "Musical Meme Payroll Management System",
    "description": "A simple Flask-based payroll application... using form submissions rather than JSON APIs."
  },
  "paths": {
    "/": {
      "get": {
        "summary": "List employees",
        "responses": {
          "200": {
            "description": "HTML page containing employee list",
            "content": {
              "text/html": { "schema": { "type": "string" } }
            }
          }
        }
      }
    }
  }
}

Figure 1: An extract showing the AI correctly identifying the HTML response "vibe".
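To spot-check the same property across a whole spec rather than one route, you can collect the media types declared for each response. This is a minimal sketch using only the standard library, with an abridged copy of the Figure 1 extract as input. The function name is illustrative.

```python
import json

# Keys of a path item that denote operations, as opposed to e.g. "parameters".
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Abridged copy of the Figure 1 extract (description omitted).
FIGURE_1_EXTRACT = """
{
  "openapi": "3.1.0",
  "info": {"title": "Musical Meme Payroll Management System"},
  "paths": {
    "/": {
      "get": {
        "summary": "List employees",
        "responses": {
          "200": {
            "description": "HTML page containing employee list",
            "content": {"text/html": {"schema": {"type": "string"}}}
          }
        }
      }
    }
  }
}
"""


def response_media_types(spec: dict) -> dict[tuple[str, str, str], list[str]]:
    """Map (METHOD, path, status) to the media types declared for that response."""
    found = {}
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys
            for status, response in operation.get("responses", {}).items():
                found[(method.upper(), path, status)] = sorted(response.get("content", {}))
    return found


if __name__ == "__main__":
    spec = json.loads(FIGURE_1_EXTRACT)
    for key, media_types in response_media_types(spec).items():
        print(key, media_types)
```

For a server-side HTML app like this one, any response that claims application/json would be worth a second look.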


Validation: Does it actually work?

We took the final JSON and loaded it into Swagger. After our manual "nudges" and tweaks, the specification passed validation perfectly. 

Here's how to get started:

  1. Go to https://swagger.io/
  2. Set up a new account if you don't have one.

We can then load the JSON (Figure 1 shows only an extract of the OpenAPI specification produced) and check whether it validates against the specification (hint: it did).


Figure 2: Swagger testing the specification



Figure 3: The schemas produced 
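Before pasting a spec into Swagger, a quick local sanity check can catch the most obvious omissions. The sketch below tests only a handful of required-field rules from OpenAPI 3.1 and is not a substitute for a real validator. The function name is illustrative.

```python
def basic_problems(spec: dict) -> list[str]:
    """Report a few obvious structural omissions in an OpenAPI 3.1 document."""
    problems = []
    if not isinstance(spec.get("openapi"), str):
        problems.append("missing 'openapi' version string")
    info = spec.get("info")
    if not isinstance(info, dict):
        problems.append("missing 'info' object")
    else:
        # Both fields are required inside the info object.
        for field in ("title", "version"):
            if field not in info:
                problems.append(f"missing 'info.{field}'")
    # OpenAPI 3.1 expects at least one of these top-level sections.
    if not any(key in spec for key in ("paths", "components", "webhooks")):
        problems.append("needs at least one of 'paths', 'components' or 'webhooks'")
    return problems


if __name__ == "__main__":
    # Shaped like the Figure 1 extract, which is abridged and shows no info.version.
    extract = {
        "openapi": "3.1.0",
        "info": {"title": "Musical Meme Payroll Management System"},
        "paths": {"/": {}},
    }
    print(basic_problems(extract))
```

Run against something shaped like the Figure 1 extract, it flags the absent `info.version`. That is expected for an abridged extract, but the full file needs that field to pass validation.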


🎓 Student Challenge: The 5-Minute Doc

Don't just take my word for it. Try this:

  1. Download the zip from the GitHub link above.

  2. Ask your favorite LLM to generate the OpenAPI spec.

  3. The Task: Use that generated spec to create a live documentation site using Redoc or Swagger UI in under 5 minutes.
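If you want a head start on the last step, here is one possible route using Redoc's standalone bundle. This is a sketch, not the only way to do it. It assumes you saved the generated spec as openapi.json in the same folder, and the script name make_docs.py is hypothetical.

```python
from html import escape

# Standalone bundle published on the Redoc project's CDN.
REDOC_BUNDLE = "https://cdn.redoc.ly/redoc/latest/bundles/redoc.standalone.js"


def redoc_page(spec_url: str, title: str = "API documentation") -> str:
    """Return a single HTML page that asks Redoc to render the spec at spec_url."""
    return f"""<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>{escape(title)}</title>
  </head>
  <body>
    <redoc spec-url="{escape(spec_url, quote=True)}"></redoc>
    <script src="{REDOC_BUNDLE}"></script>
  </body>
</html>
"""


if __name__ == "__main__":
    # Prints the page; redirect the output to a file to keep it.
    print(redoc_page("openapi.json"))
```

Save the output with `python make_docs.py > index.html`, then serve the folder with `python -m http.server 8000` and open http://localhost:8000/. Serving over HTTP matters because browsers typically refuse to fetch the spec from a page opened directly from disk.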

Conclusion: Reverse-engineering a spec isn't just possible—it's a gateway to modernising "messy" code so you can keep vibe coding without the technical debt.

Coming soon: Can we go the other way? Generating an entire working app from just the Specification.



All opinions in this blog are the Author's and should not in any way be seen as reflecting the views of any organisation the Author has any association with. 
