Bhavishya Aggarwal

Posted on Dec 28 • Originally published at Medium on Dec 27

I Let an AI Pentester, Shannon, Loose on My Vulnerable Go App: Here’s What Happened

#security #anthropicclaude #cybersecurity #softwaredevelopment

Monday, 9 AM.

I’m shipping code. My team is shipping code. Claude Code makes it stupidly easy — we’re deploying new features multiple times a day. Security? That’s happening once a year with an external pentester who charges $10K and takes three weeks.

Then I get an email: “Hey, want to test Shannon? It’s an AI pentester that finds actual exploits, not just alerts.”

I’ve seen a lot of security tools. Most of them are noise machines: they flag thousands of issues, 90% of them false positives, and offer zero actionable insight.

But I’m curious. And I have a purposefully broken Go app sitting around (Vulnerability-goapp — basically OWASP Top 10: The App). Perfect test subject.

So I thought: Why not actually try this?

I set up Shannon on Thursday night. Gave it one job: break my vulnerable app. I wasn’t expecting much. Grabbed coffee and walked away.

90 minutes later, I came back to find complete account takeovers, SQL injection bypasses, XSS vectors stealing session cookies, and authorization flaws that let attackers modify any user’s profile.

Not theories. Not “potential issues.” Actual exploits. Copy-and-paste proof-of-concepts, showing exactly how to break the app.

That’s when I realized something: This isn’t just another security scanner.

The Problem: Why I Even Looked For This

Here’s the reality of modern software development: We’ve optimized the hell out of shipping code. CI/CD pipelines are tight. GitHub Copilot and Claude Code make it trivial to pump out features. Some teams deploy multiple times a day.

But security testing? It’s stuck in 2010.

Once a year, maybe twice. You call a pentesting firm, they come in for a week, charge you thousands, find issues three months later, and by then you’ve shipped 10 different versions of the vulnerable code.

364 days of no security testing.

That gap is real. And it’s dangerous.

Traditional pentesting doesn’t scale with continuous deployment. It’s async. It’s expensive. It’s slow. And by the time you get the report, the vulnerabilities may well have landed in production.

I started looking for something that could give me on-demand security testing. Something that doesn’t require a retainer or a waiting list. Something that actually works with the way modern teams build.

That’s when I found Shannon.

What I Did: Setting Up Shannon

Shannon is built by Keygraph HQ and positioned as an “autonomous AI pentester.” It’s open-source (AGPL-3.0), free to use, and runs in Docker.

The setup was straightforward:

Step 1: Clone and Build

git clone https://github.com/KeygraphHQ/shannon.git
cd shannon
docker build -t shannon:latest .

Took about 15 minutes on my machine. Nothing fancy.

Step 2: Prepare Your App

I had a vulnerable Go app already running locally on port 9090. Shannon needs:

Your application running and accessible
Your source code available for analysis
An API token from Anthropic (Claude’s API)

Step 3: Run The Pentest

export CLAUDE_CODE_OAUTH_TOKEN="your_token_here"

docker run --rm -it \
 --network host \
 --cap-add=NET_RAW \
 --cap-add=NET_ADMIN \
 -e CLAUDE_CODE_OAUTH_TOKEN="$CLAUDE_CODE_OAUTH_TOKEN" \
 -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
 -v "$(pwd)/repos:/app/repos" \
 -v "$(pwd)/configs:/app/configs" \
 shannon:latest \
 "http://localhost:9090" \
 "/app/repos/Vulnerability-goapp"

Hit Enter and wait while Shannon’s agents work through the running app and its source code.

The Target App: Vulnerability-goapp

Let me be clear about what I was testing. This isn’t some random project. Vulnerability-goapp is a Go-based web application intentionally built with OWASP Top 10 vulnerabilities. It includes:

User authentication and profiles
Posts and timeline features
File uploads
Admin panel
Search functionality
Database interactions

It’s written in Go, uses MySQL for storage, and serves HTML pages with session-based authentication.

The goal: See if Shannon could find exploitable vulnerabilities in real code.

What Shannon Found: The Real Story

Here’s what came back in the 90-minute report:

Authentication Vulnerabilities: Shannon discovered that session IDs are generated by base64-encoding email addresses. If you know someone’s email, you can forge their session cookie without a password, which means complete account takeover. Sessions never expire. Session cookies lack security flags (no HttpOnly, Secure, or SameSite). The default admin credentials (admin@admin.com/Qwerty1234) are hardcoded. Login endpoints have no rate limiting (30,000+ attempts per hour possible). Everything runs over HTTP, so credentials are transmitted in plaintext.
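To make the session flaw concrete, here is a minimal Go sketch of the pattern and one possible fix. The function names, the cookie name, and the choice of standard base64 encoding are assumptions for illustration. This is not the app’s actual code.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"net/http"
)

// weakSessionID mirrors the flaw described in the report: the session ID is
// only the base64 encoding of the email, so anyone who knows the email can
// compute it. (Hypothetical name; the exact encoding variant is assumed.)
func weakSessionID(email string) string {
	return base64.StdEncoding.EncodeToString([]byte(email))
}

// newSessionID returns 32 bytes from the OS random source, URL-safe encoded.
// The value carries no user information and cannot be derived from an email.
func newSessionID() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// sessionCookie sets the flags the report found missing and adds an expiry.
func sessionCookie(id string) *http.Cookie {
	return &http.Cookie{
		Name:     "SessionID", // assumed cookie name
		Value:    id,
		Path:     "/",
		MaxAge:   3600,                    // expire after one hour
		HttpOnly: true,                    // not readable from JavaScript
		Secure:   true,                    // only sent over HTTPS
		SameSite: http.SameSiteStrictMode, // not sent on cross-site requests
	}
}

func main() {
	// An attacker needs nothing but the victim's email to forge this value.
	fmt.Println("forgeable:", weakSessionID("victim@example.com"))

	id, err := newSessionID()
	if err != nil {
		panic(err)
	}
	fmt.Println("random:   ", id)
	fmt.Println("cookie:   ", sessionCookie(id).String())
}
```

The random ID only helps if the server keeps its own mapping from session ID to user and deletes entries when they expire.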

Authorization Failures: The app trusts a client-controlled UserID cookie for all access decisions. No ownership validation. An attacker can change any user’s password, modify their profile, view their private data, upload files to their account, and read their posts. All by manipulating a cookie. Seven different authorization flaws, all exploitable.
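A rough sketch of the missing ownership check, assuming a server-side session store keyed by session ID. The types and names here are hypothetical. The point is that the acting user comes from the server’s own session record, never from a client-supplied UserID cookie.

```go
package main

import (
	"errors"
	"fmt"
)

// sessionStore maps a session ID to the user ID it was issued for.
// (Hypothetical in-memory stand-in for whatever store the app would use.)
type sessionStore map[string]int

var (
	errUnauthenticated = errors.New("unauthenticated")
	errForbidden       = errors.New("forbidden")
)

// authorizeProfileEdit allows the edit only when the session's owner is the
// same user whose profile is being modified.
func authorizeProfileEdit(store sessionStore, sessionID string, targetUserID int) error {
	owner, ok := store[sessionID]
	if !ok {
		return errUnauthenticated
	}
	if owner != targetUserID {
		return errForbidden
	}
	return nil
}

func main() {
	store := sessionStore{"session-of-user-1": 1}

	// User 1 editing their own profile is allowed.
	fmt.Println(authorizeProfileEdit(store, "session-of-user-1", 1))
	// User 1 trying to edit user 2's profile is rejected.
	fmt.Println(authorizeProfileEdit(store, "session-of-user-1", 2))
	// An unknown session is rejected before any ownership check.
	fmt.Println(authorizeProfileEdit(store, "made-up", 1))
}
```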

Cross-Site Scripting (XSS): Shannon found and successfully exploited all 10 XSS vulnerabilities across the app. Reflected XSS at the root endpoint that steals session cookies. Stored XSS in timeline posts affecting all users. XSS in profile fields, usernames, search results, and file uploads. Root cause: the app uses Go’s text/template instead of html/template and never HTML-encodes user input. No CSP headers, no HttpOnly flags. Shannon demonstrated session hijacking via JavaScript: the attacker gets the victim’s cookies and can impersonate them.
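The root cause is easy to see side by side. This small sketch renders the same template with both packages. The template string and function names are made up for the example. Only the two standard library packages are real.

```go
package main

import (
	"bytes"
	"fmt"
	htmltemplate "html/template"
	texttemplate "text/template"
)

// page is an illustrative template, not one taken from the app.
const page = `<p>Hello, {{.}}</p>`

// renderText uses text/template, which inserts the value verbatim.
func renderText(input string) (string, error) {
	t, err := texttemplate.New("page").Parse(page)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	err = t.Execute(&buf, input)
	return buf.String(), err
}

// renderHTML uses html/template, which escapes the value for its HTML context.
func renderHTML(input string) (string, error) {
	t, err := htmltemplate.New("page").Parse(page)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	err = t.Execute(&buf, input)
	return buf.String(), err
}

func main() {
	payload := `<script>alert(1)</script>`

	unsafe, _ := renderText(payload)
	safe, _ := renderHTML(payload)

	fmt.Println(unsafe) // the script tag survives and would run in a browser
	fmt.Println(safe)   // the angle brackets are escaped to &lt; and &gt;
}
```

Because the two packages expose the same API, swapping the import is often most of the fix, although any template that deliberately outputs raw HTML needs a second look.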

SQL Injection: The admin authentication endpoint constructs SQL queries by string concatenation. A simple SQL injection payload bypasses login and grants instant admin access. The search endpoint has the same issue: SQL injection bypasses filters and exposes all posts, including “private” ones. Database root credentials are exposed in the vulnerable code.
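Here is a sketch of why concatenation is the problem. The table and column names are assumptions, and the sketch only builds query strings so it runs without a database.

```go
package main

import "fmt"

// unsafeLoginQuery builds the statement by concatenation, the pattern the
// report describes. Table and column names are hypothetical.
func unsafeLoginQuery(email, password string) string {
	return "SELECT id FROM admins WHERE email = '" + email +
		"' AND password = '" + password + "'"
}

// safeLoginQuery keeps the SQL text fixed and returns the user input as
// separate arguments, to be bound by the driver through ? placeholders.
func safeLoginQuery(email, password string) (string, []any) {
	return "SELECT id FROM admins WHERE email = ? AND password = ?",
		[]any{email, password}
}

func main() {
	// A classic payload: close the string, add a condition that is always
	// true, then comment out the rest of the statement.
	payload := "' OR '1'='1' -- "

	// The payload becomes part of the SQL itself.
	fmt.Println(unsafeLoginQuery(payload, "x"))

	// The payload stays data; the SQL text never changes.
	query, args := safeLoginQuery(payload, "x")
	fmt.Println(query, args)
}
```

With database/sql, the second form would be passed along as `db.QueryRow(query, args...)`. Password handling is simplified here. A real login would compare against a stored hash rather than match the password in SQL.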

Infrastructure Issues: Plain HTTP transport with no HTTPS. MySQL database exposed to the host network. No security headers. MySQL 5.6 (end-of-life since 2021). Hardcoded database credentials throughout the code.
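For the missing security headers, a small net/http middleware is one way to add them in a Go app. This is a sketch only: the header values are conservative defaults I picked for illustration, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders wraps a handler and sets a baseline of response headers
// before passing the request on.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Content-Security-Policy", "default-src 'self'")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		// Browsers only honor this header on HTTPS responses.
		h.Set("Strict-Transport-Security", "max-age=31536000")
		next.ServeHTTP(w, r)
	})
}

// demoHandler is a stand-in for the application's real routes.
func demoHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
}

// headerFor sends one in-memory GET request through h and returns the value
// of the named response header.
func headerFor(h http.Handler, name string) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h.ServeHTTP(rec, req)
	return rec.Header().Get(name)
}

func main() {
	app := securityHeaders(demoHandler())
	for _, name := range []string{
		"Content-Security-Policy",
		"X-Content-Type-Options",
		"X-Frame-Options",
		"Strict-Transport-Security",
	} {
		fmt.Printf("%s: %s\n", name, headerFor(app, name))
	}
}
```

A strict Content-Security-Policy like this one would likely break pages that rely on inline scripts, so the policy needs tuning per app.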

The Numbers

Total Vulnerabilities Found: 40+
Critical Severity: 15+
Successfully Exploited: 100% of tested categories
False Positives: Zero
Time to Complete Assessment: 90 minutes
Authentication Bypass: 10 seconds
Admin Access: 5 seconds
XSS Vectors: All 10 working

This wasn’t theoretical. Shannon provided actual working exploits for each vulnerability. No maybes. No “could be.” Real impact, reproducible steps, working PoCs.

My Honest Take: What Impressed Me

What I Loved:

Shannon actually works. I didn’t expect it to find this many vulnerabilities, and the quality was professional-grade. Every finding came with reproducible exploit code: no fluff, no false positives. The speed is insane: 90 minutes for a complete assessment, versus 3–4 weeks for traditional pentesting. It’s code-aware (it analyzes source AND runs dynamic tests, not just black-box scanning). And the setup is easy: Docker, one command, it runs.

Who Should Actually Use This?

DevOps Engineers: Automate security testing in your pipeline. Catch vulnerabilities before production. Generate compliance reports (SOC 2, HIPAA-ready). Reduce manual pentesting overhead.

Backend Developers: Test your Go, Python, Node, Java apps. Understand security from the developer’s perspective. Real PoCs that show you exactly what to fix. Fast feedback loop (90 minutes vs 3 weeks).

Security Teams: Support multiple teams without hiring more pentesters. Consistent, comparable assessments across projects. Evidence collection for compliance. Efficient use of limited security resources.

Startups: Can’t afford $10K pentests. Need continuous security testing. Deploy frequently, need to validate each deployment. Shannon Lite is free and open-source.

Anyone shipping code frequently who realizes their once-a-year pentest schedule doesn’t match their deployment frequency.

Final Thoughts

I went into this skeptical. “Another security scanner, probably overhyped,” I thought.

I came out genuinely impressed.

Shannon solves a real problem: the gap between continuous deployment and continuous security testing. You can’t wait a year between pentests if you’re shipping daily. Shannon gives you on-demand, autonomous penetration testing that actually works.

Is it perfect? No. But does it work? Absolutely.

For developers and DevOps engineers tired of the “we’ll do security later” cycle, or businesses that can’t afford traditional pentesting, Shannon is worth a serious look. “Test your code. Break it yourself before someone else does.”
