Case study
I built an autonomous regression-review workflow that compares pull request diffs against historical failure patterns, then posts evidence-backed feedback before risky changes reach production.
The project sits at the overlap of quality engineering and AI workflow design: use LLM reasoning where it adds leverage, but anchor decisions in concrete historical evidence rather than vague summarisation.
Context
Traditional code review is good at spotting style issues, obvious mistakes, and architectural concerns. It is much less reliable at noticing when a new change quietly resembles a past production failure, especially in large or fast-moving codebases.
Quality teams often have valuable defect history spread across Sentry, issue trackers, and commit logs, but that knowledge rarely shows up at the exact moment a risky pull request is under review.
The goal: turn historical bug knowledge into something reviewable and immediate, not something that stays buried in old tickets until after a regression has shipped.
My role
Workflow
1. Build the knowledge base
Past bugs, incidents, issue tickets, and blame history are collected and stored in a vector-backed retrieval layer so the system can surface similar historical failures for a new change.
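To make the retrieval layer concrete, here is a minimal in-memory sketch of the idea. It is illustrative only: the `FailureRecord` fields, the `KnowledgeBase` class, and the hashed bag-of-words `embed` function are assumptions made for this example, and a real deployment would likely swap in a learned embedding model and a proper vector database.

```python
import hashlib
import math
import re
from dataclasses import dataclass

DIM = 256  # toy embedding size; an assumption for this sketch


@dataclass
class FailureRecord:
    record_id: str  # hypothetical key, e.g. an incident or ticket ID
    source: str     # where the record came from, e.g. "sentry", "issue", "blame"
    text: str       # free-text description of the historical failure


def embed(text: str) -> list[float]:
    """Toy embedding: hash each token into one of DIM buckets, then L2-normalise."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class KnowledgeBase:
    """Stores records next to their vectors and ranks them by cosine similarity."""

    def __init__(self) -> None:
        self._rows: list[tuple[FailureRecord, list[float]]] = []

    def add(self, record: FailureRecord) -> None:
        self._rows.append((record, embed(record.text)))

    def search(self, query: str, k: int = 3) -> list[tuple[float, FailureRecord]]:
        q = embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), rec) for rec, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

Calling `search` with text derived from a new change returns the `k` closest historical records with their similarity scores, which is the evidence the later steps build on.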
The agent analyses changed files and code patterns, then retrieves the most relevant historical examples before generating any review output.
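One way to sketch the "analyse changed files" part of this step is to pull the added lines out of a diff and turn them into a retrieval query per file. The function names below are hypothetical, and the parser assumes git-style unified diffs (a `diff --git` line before each file); it is a simplification, not the project's actual analysis logic.

```python
from collections import defaultdict


def added_lines_by_file(diff_text: str) -> dict[str, list[str]]:
    """Map each changed file path to the non-empty lines the diff adds to it."""
    added: dict[str, list[str]] = defaultdict(list)
    current = None     # path of the file whose hunks are being read
    in_hunk = False    # True once an "@@" hunk header has been seen for that file
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            current, in_hunk = None, False
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:].strip()
            # Deleted files point at /dev/null and contribute no added lines.
            current = None if path == "/dev/null" else path.removeprefix("b/")
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current is not None and line.startswith("+"):
            text = line[1:].strip()
            if text:
                added[current].append(text)
    return dict(added)


def build_query(path: str, lines: list[str]) -> str:
    """Combine the file path and its added code into one retrieval query string."""
    return f"{path} " + " ".join(lines)
```

Including the file path in the query is a deliberate choice in this sketch: failures often cluster in the same modules, so the path itself is useful retrieval signal.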
GPT-4o acts as a reasoning layer, not as the source of truth. It compares the diff against the retrieved examples and flags likely regression patterns, each with supporting references.
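The "reasoning layer, not source of truth" principle can be enforced in code by only accepting findings that cite evidence the retriever actually returned. The sketch below shows that guard without calling any model API. The prompt wording, the JSON reply shape (`summary`, `references`), and both function names are assumptions for illustration.

```python
import json


def build_prompt(diff_text: str, examples: dict[str, str]) -> str:
    """Assemble a prompt that limits the model to the retrieved evidence."""
    evidence = "\n".join(f"[{ref}] {text}" for ref, text in examples.items())
    return (
        "Compare the diff with the historical failures listed below.\n"
        "Reply with a JSON list of objects with keys 'summary' and 'references'.\n"
        "Only reference IDs that appear in the evidence.\n\n"
        f"Evidence:\n{evidence}\n\nDiff:\n{diff_text}\n"
    )


def grounded_findings(raw_reply: str, examples: dict[str, str]) -> list[dict]:
    """Keep only findings whose references all point at retrieved examples."""
    try:
        findings = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # an unparseable reply yields no review output
    if not isinstance(findings, list):
        return []
    kept = []
    for finding in findings:
        if not isinstance(finding, dict):
            continue
        refs = finding.get("references")
        # Drop findings with no citations or with IDs the retriever never returned.
        if isinstance(refs, list) and refs and all(ref in examples for ref in refs):
            kept.append(finding)
    return kept
```

With this filter in place, a finding that cites nothing, or cites an ID the retriever never returned, does not reach the pull request.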
Findings are sent back to GitHub as structured review comments so the PR author gets signal in the same workflow where decisions are already happening.
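As a rough sketch of what a structured review could look like, the function below builds a request body in the shape accepted by GitHub's REST endpoint for creating a pull request review (`POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews`). The layout of each finding (`path`, `line`, `summary`, `references`) and the function name are assumptions, and sending the request is left out.

```python
import json


def build_review_payload(findings: list[dict], commit_sha: str) -> dict:
    """Turn grounded findings into a review body with one inline comment each."""
    comments = []
    for finding in findings:
        refs = ", ".join(finding["references"])
        comments.append({
            "path": finding["path"],   # file the comment is attached to
            "line": finding["line"],   # line number in the new version of the file
            "side": "RIGHT",           # comment on the changed (right-hand) side
            "body": f"{finding['summary']}\n\nSimilar past failures: {refs}",
        })
    return {
        "commit_id": commit_sha,
        "event": "COMMENT",            # leave feedback without approving or blocking
        "body": f"Regression review: {len(comments)} finding(s) backed by historical evidence.",
        "comments": comments,
    }
```

Using the `COMMENT` event keeps the agent advisory in this sketch: it adds signal to the review without approving or requesting changes on anyone's behalf.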
Key decisions
Impact
The value of this project is not just AI-assisted review. It demonstrates a way to turn QA knowledge into an active engineering system that improves decision quality earlier in the lifecycle, where prevention is cheaper than recovery.
Stack