Poisoning the Safety Net | Brooks McMillin

Talk at CackalackyCon 2026 on compromising the AI development toolchain itself. Teams are stacking defenses around LLM-generated code: pre-commit hooks, AI code review agents, AGENTS.md/CLAUDE.md context files, multi-layered CI. Those defenses work well against accidental LLM mistakes. They fare worse against intentional adversarial input. The talk walks through live attacks tuned against claude-sonnet-4-6 over the weeks leading up to the conference: poisoning context files with believable paperwork instead of directives, manipulating CI-integrated reviewers through the diff itself, weaponizing the trust boundary between developer and AI gate, and compromising the agent runtime through MCP servers and shared skills that sit outside the repo's review boundary.

Each attack is paired with the defense that actually holds: CODEOWNERS and integrity protection for context files, security invariant tests that assert properties instead of relying on review, separating critical security rules from LLM-accessible context, and tool/MCP allowlists with isolated runtimes. Every demo runs against a public monorepo (github.com/brooksmcmillin/infra) with real defensive tooling deployed.
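To make the "integrity protection for context files" defense concrete, here is a minimal sketch that pins SHA-256 digests for the context files and reports any that have drifted. It is an illustration under assumptions, not the tooling used in the talk or the repo: the names `CONTEXT_FILES`, `sha256_of`, and `find_drift`, and the idea of a pinned digest mapping, are all hypothetical.

```python
import hashlib
from pathlib import Path

# Hypothetical set of protected files. The file names come from the abstract;
# pinning exactly these two is an assumption made for illustration.
CONTEXT_FILES = ("AGENTS.md", "CLAUDE.md")


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(repo_root: Path, pinned: dict[str, str]) -> list[str]:
    """Return the context files whose state differs from the pinned digests.

    A file that exists but is not pinned, or is pinned but missing, also
    counts as drift, so a newly added context file cannot slip in unreviewed.
    """
    drifted = []
    for name in CONTEXT_FILES:
        path = repo_root / name
        expected = pinned.get(name)
        if not path.exists():
            # Missing is only a problem if a digest was pinned for it.
            if expected is not None:
                drifted.append(name)
            continue
        if expected is None or sha256_of(path) != expected:
            drifted.append(name)
    return drifted
```

A check like this would presumably run in CI and fail the build when `find_drift` returns a non-empty list. For it to hold, the pinned digests need to live somewhere the AI agent cannot modify, for example a path guarded by CODEOWNERS; otherwise an attacker who can edit the context file can update the pin in the same change.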