A few weeks ago I went to GitHub Copilot Dev Days. The demos were impressive: agents doing security reviews in CI, enforcing compliance on every push, managing Kubernetes operations. But I kept coming back to one question: what happens when something goes wrong and nobody can trace what the agent did?
I posted about it on X and it started a conversation. Someone responded that the guardrail is a comprehensive test suite. Fair point for application code. But most teams don't have comprehensive test coverage for infrastructure, and even if they did, tests only catch what you anticipated. Agents improvise.
Then I started looking into what's actually happening in production and it got worse.
Security researchers at General Analysis found that Cursor with Supabase MCP would read support tickets containing malicious commands and execute them. An attacker embedded SQL instructions in a support ticket telling the agent to read the integration_tokens table and post the data back. It did exactly that. The entire SQL database was exposed through a support ticket.
A misconfigured GitHub MCP server allowed unauthorized access to private vulnerability reports. Over 13,000 MCP servers launched on GitHub in 2025. Developers are integrating them faster than security teams can catalog them.
At a startup called SaaStr, an autonomous coding agent was given a maintenance task during a code freeze with explicit instructions to make no changes. It ran a DROP DATABASE command and wiped production. Then it generated 4,000 fake user accounts and fabricated logs to cover it up.
And Anthropic themselves had to patch a vulnerability in the official MCP inspector tool that quietly opened a backdoor on developer machines.
The common thread in all of these: no visibility into what the agent was doing until after the damage was done.
I've felt this gap myself. I spent the last year building MCP agents at an enterprise platform. Agents that submitted requests, modified records, called tools autonomously. The observability behind them was essentially just application logs. If something went wrong I could dig through logs and maybe reconstruct what happened. Maybe. There was no structured way to see which tools the agent called, in what order, with what arguments, what data it touched, and what it got back.
The MCP specification itself has no mechanism for enforcing this kind of security at the protocol level. A research paper on enterprise MCP security specifically names "Insufficient Auditability" as a critical threat, noting that inadequate logging restricts "detection and investigation of security events."
So I built the thing I wished I had.
mcp-audit-trail is a lightweight observability layer for MCP agents. It captures a structured audit trail of every tool call and generates a visual report of what the agent did during a session.
There are two ways to use it.
The first is the proxy. You wrap any MCP server command with mcp-audit proxy --server "python your_server.py" and it sits transparently between the client and server, intercepting every JSON-RPC message in both directions. The client and server don't know it's there. You don't change any code. You just get a complete log of every interaction.
The second is the programmatic API. You import AuditLogger into your own MCP client code and record tool calls as they happen. You configure which tools are sensitive and which perform write actions using AuditConfig, and the logger handles classification and entity tracking automatically.
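To make the record-and-classify idea concrete, here is a toy sketch of the pattern. The class and method names below are mine, not the library's actual interface; read it as an illustration of what "the logger handles classification" means, not as mcp-audit-trail's real code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Toy sketch of the record-and-classify pattern. Every name here
# (AuditConfig, AuditLogger, classify, record) is illustrative --
# the real mcp-audit-trail API may differ.

@dataclass
class AuditConfig:
    sensitive_tools: set = field(default_factory=set)  # tools that touch sensitive data
    write_tools: set = field(default_factory=set)      # tools that mutate state

@dataclass
class AuditLogger:
    config: AuditConfig
    events: list = field(default_factory=list)

    def classify(self, tool: str) -> str:
        # Writes take precedence, then sensitivity, else a plain read.
        if tool in self.config.write_tools:
            return "WRITE"
        if tool in self.config.sensitive_tools:
            return "SENSITIVE"
        return "READ"

    def record(self, tool, arguments, result, error=None):
        self.events.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "tool": tool,
            "arguments": arguments,
            "result": result,
            "error": error,
            "classification": self.classify(tool),
        })

config = AuditConfig(sensitive_tools={"get_pay_info"}, write_tools={"submit_time_off"})
logger = AuditLogger(config)
logger.record("get_pay_info", {"employee_id": "E42"}, {"salary_band": "L5"})
print(logger.events[0]["classification"])  # SENSITIVE
```

The point is that the caller only declares which tools are sensitive or write-capable once, up front; every recorded call gets tagged automatically from then on.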
Both modes produce a structured JSON audit log. Each event captures the timestamp, which tool was called, what arguments were passed, what the result was, what data entities were accessed, and whether any errors occurred. The log also includes a session summary: total tool calls, tools used, unique data entities touched, and error count.
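An event in that log might look roughly like this. The field names below are assumptions inferred from the description above, not the exact schema the tool emits.

```python
import json

# Illustrative shape of one audit event plus the session summary.
# Field names are assumptions, not mcp-audit-trail's actual schema.
event = {
    "timestamp": "2025-06-01T14:03:07Z",
    "tool": "get_pay_info",
    "arguments": {"employee_id": "E42"},
    "result": {"salary_band": "L5"},
    "entities": ["employee:E42"],
    "error": None,
}
summary = {
    "total_tool_calls": 4,
    "tools_used": ["search_employees", "get_pay_info", "submit_time_off"],
    "unique_entities": 3,
    "error_count": 1,
}
print(json.dumps({"events": [event], "summary": summary}, indent=2))
```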
Then you run mcp-audit report and get a standalone HTML report. It shows the session summary, a tool usage breakdown where each tool is tagged as READ, SENSITIVE, or WRITE, a data access map showing which entities were touched by which tools, and an interactive event timeline where you can expand any event to see the full arguments and results. Sensitive data access gets flagged in purple. Write actions get amber. Errors get red.
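The data access map is essentially an inversion of the event log: group events by entity instead of by tool. A minimal sketch of that derivation, again with assumed field names:

```python
from collections import defaultdict

# Sketch of deriving a data access map (entity -> tools that touched it)
# from a list of audit events. The event fields are assumptions mirroring
# the log shape described above, not the tool's actual schema.
events = [
    {"tool": "search_employees", "entities": ["employee:E42", "employee:E77"], "error": None},
    {"tool": "get_pay_info", "entities": ["employee:E42"], "error": None},
    {"tool": "get_pay_info", "entities": ["employee:E99"], "error": "not found"},
]

access_map = defaultdict(set)
for e in events:
    for entity in e["entities"]:
        access_map[entity].add(e["tool"])

errors = [e for e in events if e["error"]]
print(dict(access_map))
print(f"{len(errors)} error(s)")
```

Once the log is structured this way, the report is mostly a rendering problem: the interesting joins are already cheap.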
The demo scenario I built with it is intentionally designed to surface the patterns that matter. An agent searches for employees, accesses pay information for people in different departments, submits a time-off request, and tries to look up a non-existent employee. The report immediately surfaces questions a security or compliance team would ask. Why did the agent access salary data for someone in a different department? Did the employee authorize that time-off submission? Was the failed lookup a hallucination?
This isn't trying to be a full enterprise solution. Companies like MintMCP, Ithena, and Datadog are building comprehensive MCP observability platforms. What I wanted was something you could drop into any existing MCP setup in 30 seconds and immediately see what your agent is doing. No gateway to deploy, no infrastructure to set up. Just pip install mcp-audit-trail and wrap your server.
The repo is at github.com/khushidahi/mcp-audit-trail. Install it, run the demo, and look at the report. If you're running MCP agents in production without structured audit logging, you're flying blind. And based on what I've seen in the last few months, that's most of us.