Introduction
My previous post on defense in depth for AI-assisted development covered what safety layers to use when coding with LLMs and why they matter. The response was great, but the most common question I got was: “How do I actually set all this up?”
Fair question. That post assumed you already had a working development environment with Claude Code, pre-commit, and GitHub Actions experience. This post assumes you have none of that. If you’ve never used Claude Code, never configured a pre-commit hook, and never written a GitHub Actions workflow, this guide will take you from zero to having every safety layer I described.
Note: The specific tools in this guide are for Python projects, but the architecture—pre-commit hooks, context files, CI review agents, branch protection—applies to any language. If you’re working in JavaScript, Go, or another ecosystem, swap the Python-specific tools for your equivalents and follow the same layered approach.
We’ll set up:
- Claude Code — the AI coding assistant that writes your code
- Pre-commit hooks — automated checks that block bad code before it’s committed
- CLAUDE.md — a context file that teaches Claude your project’s rules
- Local review agents — specialized AI agents that audit your codebase
- CI workflows — GitHub Actions that enforce checks before code merges
- Branch protection — GitHub settings that make CI checks mandatory
By the end, you’ll have the same multi-layer safety setup I use across my projects.
Prerequisites
Before we start, you’ll need:
- A GitHub account with a repository you want to protect (public repos are ideal for testing since they get unlimited GitHub Actions minutes)
- Python 3.10+ installed (python.org or your system package manager)
- uv for Python package management (install with `curl -LsSf https://astral.sh/uv/install.sh | sh`). You can use pip instead, but this guide defaults to uv.
- Git installed and configured with your GitHub credentials
- A Claude subscription — Claude Code requires a Pro, Max, or Team plan, or an Anthropic API key
That’s it. We’ll install everything else along the way.
Step 1: Install Claude Code
Claude Code is Anthropic’s CLI tool that lets you use Claude directly in your terminal to write, edit, and review code.
Install the CLI
The recommended installation method is the native binary:
Linux / macOS:
curl -fsSL https://claude.ai/install.sh | bash
Windows (PowerShell):
irm https://claude.ai/install.ps1 | iex
Alternatively, if you have Node.js 18+ installed, you can use npm:
npm install -g @anthropic-ai/claude-code
Authenticate
Open a terminal, navigate to your project directory, and run:
claude
On first launch, Claude Code will walk you through authentication. You’ll log in with your Anthropic account (the same one tied to your Claude Pro/Max subscription). If you’re using an API key instead, set it as an environment variable before launching:
export ANTHROPIC_API_KEY="your-key-here"
claude
Verify it works
Once authenticated, you should see Claude’s interactive prompt. Type a simple question to confirm everything is connected:
> What files are in this directory?
Claude will use its tools to list your files. If you see output, you’re good to go. Type /exit to leave the session for now.
Troubleshooting: If claude isn’t found after installation, close and reopen your terminal to refresh your PATH. If issues persist, run claude doctor for automated diagnostics.
Step 2: Set Up Pre-commit Hooks
Pre-commit hooks run automatically before every git commit, blocking code that fails checks. This is your first line of defense against LLM-generated mistakes.
Install the pre-commit framework
pre-commit is a Python tool that manages git hook scripts. Install it:
# With uv (recommended)
uv tool install pre-commit
# Or with pip
pip install pre-commit
# Or with Homebrew (macOS)
brew install pre-commit
Verify the installation:
pre-commit --version
Create the configuration file
In your project root, create a .pre-commit-config.yaml file. Here’s the configuration from my previous post, with comments explaining what each hook does:
repos:
# General file hygiene
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: trailing-whitespace # Removes trailing spaces
- id: end-of-file-fixer # Ensures files end with a newline
- id: check-yaml # Validates YAML syntax
- id: check-added-large-files # Blocks files > 500KB
- id: check-json # Validates JSON syntax
- id: check-toml # Validates TOML syntax
- id: check-merge-conflict # Catches leftover merge markers
- id: debug-statements # Catches leftover breakpoint()/pdb
# Secret detection - catches leaked credentials
- repo: https://github.com/Yelp/detect-secrets
rev: v1.5.0
hooks:
- id: detect-secrets
args: ['--baseline', '.secrets.baseline']
# Python linting and formatting
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.5
hooks:
- id: ruff # Linter (replaces flake8, isort, etc.)
args: [--fix, --exit-non-zero-on-fix]
- id: ruff-format # Formatter (replaces black)
# Type checking
- repo: local
hooks:
- id: pyright
name: pyright
entry: uv run pyright # Or just: pyright (if installed globally)
language: system
types: [python]
pass_filenames: false
# Security scanning
- repo: https://github.com/PyCQA/bandit
rev: 1.9.4
hooks:
- id: bandit
args: ['-c', 'pyproject.toml']
What each layer catches:
| Hook | What it catches | Example |
|---|---|---|
| detect-secrets | API keys, passwords, tokens in code | `password = "sk-ant-..."` |
| ruff | Style violations, unused imports, bad patterns | `import os` (unused) |
| ruff-format | Inconsistent formatting | Mixed tabs/spaces |
| pyright | Type errors, missing attributes | Calling `.foo()` on a `str` |
| bandit | Security anti-patterns | `eval()`, weak crypto, hardcoded passwords |
Initialize detect-secrets
The detect-secrets hook needs a baseline file so it knows which “secrets” are intentional (like test fixtures). First, install the CLI tool:
uv tool install detect-secrets
# Or with pip
pip install detect-secrets
Then generate the baseline:
detect-secrets scan > .secrets.baseline
Review the baseline file—it lists every string in your repo that currently looks like a secret. Entries in the baseline (like test API keys) are treated as known false positives and allowed through; any new secret introduced after the baseline is generated will be caught and blocked.
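For orientation, here's an abbreviated sketch of what a baseline entry looks like (real baselines also record hashed secrets, the scan filters used, and a timestamp; the filename and line number here are made up):

```json
{
  "version": "1.5.0",
  "plugins_used": [{ "name": "AWSKeyDetector" }, { "name": "KeywordDetector" }],
  "results": {
    "tests/conftest.py": [
      { "type": "Secret Keyword", "line_number": 12, "is_verified": false }
    ]
  }
}
```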
Configure bandit
Bandit needs a configuration section in your pyproject.toml. If you don’t have one yet, create it with uv init or just create an empty pyproject.toml file in your project root. Then add:
[tool.bandit]
exclude_dirs = ["tests"]
This tells bandit to skip test files, where patterns like assert and test credentials are expected.
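To make that concrete, here's a minimal example of the kind of code bandit flags (B324: MD5 used where a security-grade hash is expected) next to the version that passes. The function names are mine, for illustration:

```python
import hashlib


def weak_token_hash(token: str) -> str:
    # bandit flags this line (B324): MD5 is broken for security purposes
    return hashlib.md5(token.encode()).hexdigest()


def safe_token_hash(token: str) -> str:
    # Passes bandit: SHA-256 is an acceptable choice here
    return hashlib.sha256(token.encode()).hexdigest()
```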
Set up pyright (type checking)
Add pyright as a dev dependency:
uv add --dev pyright
If you’re not using uv, install pyright globally:
npm install -g pyright
And change the pre-commit hook entry to just pyright instead of uv run pyright.
Install the hooks
Now activate the hooks in your repository:
pre-commit install
This configures git to run your hooks before every commit. Test that everything works:
pre-commit run --all-files
You’ll see each hook run against your codebase. Don’t be alarmed if some hooks show “Failed” on the first run—hooks like trailing-whitespace and end-of-file-fixer auto-fix files for you. Just stage the fixed files with git add and run the hooks again. Once everything passes, your first safety layer is in place.
Tip: The hook versions in your config will get stale over time. Run pre-commit autoupdate periodically to bump them to the latest releases.
Optional: Add tests to pre-commit
For smaller projects, running tests on every commit is valuable. Add this to your .pre-commit-config.yaml:
- repo: local
hooks:
- id: pytest
name: pytest
entry: uv run pytest # Or just: pytest (if not using uv)
language: system
types: [python]
pass_filenames: false
stages: [pre-commit]
Trade-off: This makes every commit slower. For large test suites, skip this and rely on CI (Step 5) for test enforcement instead. My rule of thumb: if tests run in under 30 seconds, add them to pre-commit. Otherwise, CI only.
Step 3: Create a CLAUDE.md File
CLAUDE.md is a file in your repository root that tells Claude (and any AI tools) how your project works. It’s the single most effective way to prevent LLM mistakes because it provides context that the model wouldn’t otherwise have.
Without a CLAUDE.md, Claude will guess at your conventions. With one, it follows them.
Create the file
Create CLAUDE.md in your project root. Here’s a template based on what I use:
# Project Name
## Quick Commands
- **Run tests**: `uv run pytest`
- **Run linter**: `uv run ruff check .`
- **Run formatter**: `uv run ruff format .`
- **Run type checker**: `uv run pyright`
- **Run all pre-commit hooks**: `pre-commit run --all-files`
## Architecture
Brief description of the project structure:
- `src/` — Main application code
- `tests/` — Test files (mirror the src/ structure)
- `docs/` — Documentation
## Coding Standards
- Use type hints on all function signatures
- Use `str | None` syntax, not `Optional[str]`
- Use async/await for I/O operations
- Use `uv` for package management, not pip
## Security Requirements
- NEVER commit secrets, API keys, or credentials
- NEVER use `eval()` or `exec()` on user input
- NEVER disable authentication or CSRF protection
- Always use parameterized queries for database access
- Always validate and sanitize user input
## Common Patterns
### Error Handling
Use custom exception classes, not bare `except`:
```python
try:
    result = await service.get_item(item_id)
except ItemNotFoundError:
    raise HTTPException(status_code=404, detail="Item not found")
```

### API Responses
Always return consistent response shapes:

```python
# Success: {"data": {...}, "meta": {...}}
# Error: {"detail": {"code": "ERR_001", "message": "..."}}
```

## Anti-patterns (DO NOT do these)
- Do not remove existing security middleware to fix test failures
- Do not downgrade HTTPS to HTTP in configuration
- Do not truncate or rewrite entire files when making targeted edits
- Do not add `# pragma: allowlist secret` to bypass secret detection
What to include
Based on my experience, these sections prevent the most LLM mistakes:
1. **Quick commands** — Prevents Claude from inventing build/test commands that don't exist
2. **Architecture overview** — Enforces consistency in where code goes
3. **Security requirements** — Explicit "never do this" guardrails
4. **Common patterns** — Working examples Claude can follow
5. **Anti-patterns** — Things Claude should actively avoid
The file doesn't need to be perfect on day one. Start with quick commands and security requirements, then add patterns as you notice Claude making the same mistake twice.
Step 4: Set Up Local Review Agents
Pre-commit hooks catch syntactic problems. Review agents catch logic problems—the kind of mistakes where the code runs fine but does the wrong thing. Things like incomplete authorization checks, race conditions, or missing error handling.
I maintain a set of specialized agents in my claude-code-agents repo (https://github.com/brooksmcmillin/claude-code-agents). Each agent is a Markdown file with a system prompt optimized for one type of analysis.
Get the agents
Clone the agents repo:
git clone https://github.com/brooksmcmillin/claude-code-agents.git ~/claude-code-agents
This gives you five agents:
| Agent | What it checks |
|---|---|
| security-code-reviewer.md | Vulnerabilities, misconfigurations, insecure patterns |
| test-coverage-checker.md | Untested code paths in critical areas |
| code-optimizer.md | Complexity, duplication, dead code |
| dependency-auditor.md | CVEs, outdated packages, license issues |
| doc-auditor.md | Stale, missing, or inconsistent documentation |
Run an agent
From your project directory, run an agent using Claude Code’s --system-prompt flag to load the agent’s prompt from the Markdown file:
claude --system-prompt "$(cat ~/claude-code-agents/security-code-reviewer.md)" -p "Review this project"
The -p (print) flag runs Claude non-interactively—it analyzes your codebase using read-only tools (Read, Grep, Glob, Bash), prints a report of findings, and exits. It won’t make any changes—it only provides recommendations.
You can also target a specific directory:
claude --system-prompt "$(cat ~/claude-code-agents/security-code-reviewer.md)" -p "Review src/api"
Run multiple agents in parallel
The real power comes from running agents from within a Claude Code session. Start Claude Code in your project:
claude
Then ask it to run multiple agents at once:
> Run the security-code-reviewer and test-coverage-checker agents from
~/claude-code-agents against this project. Run them in parallel and
summarize the findings.
Claude will spawn the agents as parallel subprocesses, each with its own context window focused on its specialty. When they finish, Claude summarizes all findings and prioritizes what to fix.
Cost considerations
Running agents is token-intensive. A full run of all five agents can use 250k+ tokens. Be intentional about when you run them:
- Security agent: Run before any PR that touches authentication, authorization, or user input handling
- All agents: Run periodically (weekly or after major feature work) as a comprehensive audit
- Single agents: Run as needed when you suspect issues in a specific area
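If you find yourself running the same agents repeatedly, a small helper that builds the Step 4 invocations keeps them consistent. This sketch only constructs the shell command strings (it executes nothing; pass each one to your shell, or to subprocess, when you're ready):

```python
import shlex
from pathlib import Path

# Where the agents repo was cloned in the step above
AGENTS_DIR = Path.home() / "claude-code-agents"


def agent_command(agent_file: str, task: str = "Review this project") -> str:
    """Build the `claude --system-prompt "$(cat ...)" -p "..."` invocation."""
    prompt_path = AGENTS_DIR / agent_file
    # shlex.quote keeps tasks containing spaces safe to paste into a shell
    return f'claude --system-prompt "$(cat {prompt_path})" -p {shlex.quote(task)}'


agents = ["security-code-reviewer.md", "test-coverage-checker.md"]
commands = [agent_command(a) for a in agents]
```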
Step 5: Set Up CI Workflows
Pre-commit hooks can be skipped with git commit --no-verify. Review agent findings can be ignored. CI workflows are the enforced safety net—they block merges when checks fail.
Option A: Use reusable workflows (quickest setup)
I maintain reusable CI workflows in my workflows repo that you can reference directly. Create .github/workflows/ci.yml in your project:
name: CI
on:
push:
branches: [main]
pull_request:
jobs:
ci:
uses: brooksmcmillin/workflows/.github/workflows/python-ci.yml@main
with:
package-name: your-package-name
run-type-check: true
run-lint: true
run-tests: true
run-security: true
secrets:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
This single file gives you linting (ruff), type checking (pyright), tests (pytest with coverage), build verification, and security scanning (bandit)—all running in GitHub Actions on every PR.
Option B: Write your own CI workflow
If you prefer full control, here’s a standalone workflow that replicates the same checks. Create .github/workflows/ci.yml:
name: CI
on:
push:
branches: [main]
pull_request:
jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- run: uv sync
- run: uv run ruff check .
- run: uv run ruff format --check .
type-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- run: uv sync
- run: uv run pyright
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- run: uv sync
- run: uv run pytest --cov --cov-report=xml
security:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- run: uv sync
- run: uv run bandit -r src/ -c pyproject.toml
Add Claude Code review to PRs
This is the automated version of the review agents from Step 4. Claude reviews every PR and leaves comments—both a general code review and a security-focused review. Create .github/workflows/claude-review.yml:
name: Claude Review
on:
pull_request:
types: [opened, synchronize, reopened]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
code-review:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
issues: read
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
fetch-depth: 1
- name: Run Claude Code Review
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
allowed_bots: "dependabot[bot]"
prompt: |
You are a code reviewer. Review this pull request for
code quality, correctness, and maintainability.
Use the repository's CLAUDE.md for guidance on style
and conventions.
Focus on:
- Correctness: logic errors, edge cases, error handling
- Maintainability: readability, structure, naming
- Style: consistency with project patterns
Use `gh pr comment` with your Bash tool to leave your
review as a comment on the PR.
claude_args: >-
--allowed-tools
"Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh api:*)"
security-review:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
issues: read
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
fetch-depth: 1
- name: Run Claude Security Review
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
allowed_bots: "dependabot[bot]"
prompt: |
You are a security reviewer. Review this pull request
for security vulnerabilities and concerns.
Focus on:
- Injection vulnerabilities (SQL, command, XSS)
- Authentication and authorization issues
- Data exposure (secrets, PII in logs)
- Input validation at system boundaries
- Dependency vulnerabilities
- SSRF, path traversal, and OWASP Top 10 issues
Use `gh pr comment` with your Bash tool to leave your
security review as a comment on the PR.
claude_args: >-
--allowed-tools
"Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh api:*)"
How this works: The claude-code-action installs Claude Code inside the GitHub Actions runner and executes it with your prompt. Claude has full read access to the checked-out repository—it can read files, search code, and analyze the diff just like it would locally. The claude_args line restricts which write tools Claude can use (in this case, only gh commands for posting comments), preventing it from modifying code during review.
A few things to note about this workflow:
- synchronize trigger — Reviews re-run when new commits are pushed to the PR, not just when it's first opened. This catches issues introduced during iteration.
- concurrency with cancel-in-progress — If you push again while a review is running, the old one cancels instead of piling up.
- pull-requests: write — Required for Claude to post comments via `gh pr comment`.
- allowed_bots — Lets Claude review Dependabot PRs too.
Set up authentication for Claude Code review
You have three options:
Option 1: Use /install-github-app (easiest)
Inside a Claude Code session, run /install-github-app. This installs Anthropic’s GitHub app on your repository, which handles authentication automatically—no API keys or tokens to manage. If you use this option, remove the anthropic_api_key line from the workflow YAML above. This is the fastest way to get Claude reviewing PRs.
Option 2: Anthropic API key
- Go to console.anthropic.com and create an API key
- In your GitHub repo, go to Settings > Secrets and variables > Actions
- Click New repository secret
- Name: ANTHROPIC_API_KEY, Value: your API key
Option 3: Claude subscription OAuth token (avoids separate API costs)
If you have a Claude Pro or Max subscription:
- Run claude setup-token in your terminal
- Copy the token it generates
- Add it as a GitHub secret named CLAUDE_CODE_OAUTH_TOKEN
- In the workflow, replace the anthropic_api_key line with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
Optional: Add advanced security scanning
For production systems, layer additional scanning tools. Create .github/workflows/security.yml:
name: Security Scanning
on:
push:
branches: [main]
pull_request:
jobs:
semgrep:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Run Semgrep
uses: semgrep/semgrep-action@v1
with:
config: >-
p/security-audit
p/secrets
p/python
codeql:
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- uses: actions/checkout@v6
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: python
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
trivy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Run Trivy
uses: aquasecurity/trivy-action@master
with:
scan-type: 'fs'
scan-ref: '.'
What each tool adds:
| Tool | What it catches | Why it matters |
|---|---|---|
| Semgrep | Semantic patterns (SQL injection via f-strings, etc.) | Understands code meaning, not just text |
| CodeQL | Taint analysis (user input flowing to dangerous sinks) | Traces data flow across functions |
| Trivy | Known CVEs in dependencies, secrets in files | Broad vulnerability database |
These are all free for open-source projects and add significant coverage beyond bandit alone.
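The Semgrep row deserves a concrete example. bandit mostly pattern-matches on specific calls, while Semgrep's Python rules recognize that interpolating input into a query string enables injection. Here's the difference, using sqlite3 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")

user_id = "1 OR 1=1"  # hostile input

# Flagged by Semgrep: the f-string splices input directly into the SQL,
# so the attacker's "OR 1=1" becomes part of the query and matches every row
unsafe = conn.execute(f"SELECT name FROM users WHERE id = {user_id}").fetchall()

# Safe: a parameterized query treats the entire input as one literal value
safe = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchall()
```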
Step 6: Enable Branch Protection
The final step makes CI checks mandatory. Without branch protection, someone can push directly to main or use git commit --no-verify to skip pre-commit hooks entirely.
Set up branch protection rules
- Go to your GitHub repository
- Click Settings > Branches
- Under “Branch protection rules”, click Add branch protection rule (or Add classic branch protection rule)
- Set the branch name pattern to main
- Enable these settings:
- Require a pull request before merging — Prevents direct pushes to main
- Require status checks to pass before merging — This is the key setting
- Require branches to be up to date before merging — Ensures CI runs on the latest code
- In the status checks search box, add the checks from your CI workflow: lint, type-check, test, security
- Click Create (or Save changes)
Now no code can reach main without passing every check. This is the layer that can’t be bypassed.
Note: If your status checks don’t appear in the search box, they need to have run at least once. Push a PR to trigger the workflows, then come back and add them.
Putting It All Together
Here’s what happens now when you (or Claude) write code:
- You code with Claude — Claude follows your CLAUDE.md conventions
- You commit — Pre-commit hooks block secrets, type errors, style violations, and security anti-patterns before the commit is created
- You push a PR — CI workflows run linting, type checking, tests, and security scans. Two Claude agents review the PR in parallel—one for code quality, one for security—and leave comments.
- You merge — Branch protection ensures all checks passed. Code reaches main only after surviving every layer.
┌──────────────────────────────────────────────────────────────────┐
│ Code with Claude (guided by CLAUDE.md) │
└─────────────────────────────┬────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Pre-commit hooks (local) │
│ ├─ detect-secrets → blocks leaked credentials │
│ ├─ ruff → blocks style/lint violations │
│ ├─ pyright → blocks type errors │
│ └─ bandit → blocks security anti-patterns │
└─────────────────────────────┬────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ CI Workflows (GitHub Actions, on every PR) │
│ ├─ Lint + format check → enforces code quality │
│ ├─ Type checking → enforces type safety │
│ ├─ Tests + coverage → enforces correctness │
│ ├─ Security scanning → enforces secure patterns │
│ ├─ Claude code review → catches logic errors │
│ └─ Claude security review → catches auth/injection/crypto │
└─────────────────────────────┬────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Branch Protection │
│ └─ All checks must pass before merge → enforced safety net │
└──────────────────────────────────────────────────────────────────┘
Quick Reference: Files to Create
Here’s a checklist of every file this guide creates:
| File | Purpose |
|---|---|
| CLAUDE.md | Context and rules for Claude |
| .pre-commit-config.yaml | Pre-commit hook configuration |
| .secrets.baseline | Baseline for detect-secrets |
| pyproject.toml (bandit section) | Bandit configuration |
| .github/workflows/ci.yml | Lint, type check, test, security scan |
| .github/workflows/claude-review.yml | AI code review and security review on PRs |
| .github/workflows/security.yml | Semgrep, CodeQL, Trivy (optional) |
What This Costs
Time:
- Initial setup: ~30 minutes following this guide
- Per commit: ~30 seconds (pre-commit hooks)
- Per PR: ~3-5 minutes (CI pipeline)
Money:
- Claude Code: Included with Claude Pro ($20/month) or Max ($100/month) subscription. Pro has usage limits that heavy development can hit—Max is worth it if you’re using Claude Code and review agents heavily.
- GitHub Actions: Free for public repos; 2,000 minutes/month free for private repos
- Claude API for CI reviews: ~$0.10-0.50 per PR (or use your subscription OAuth token)
- All security scanning tools: Free and open-source
What you avoid:
- Leaked credentials in git history (expensive to remediate)
- Authentication bypasses shipping to production
- The time cost of manually reviewing every line of LLM-generated code
Conclusion
LLMs are powerful coding assistants, but they need guardrails. The setup in this guide—pre-commit hooks, CLAUDE.md context, local review agents, CI workflows, and branch protection—creates defense in depth that catches different categories of mistakes at different stages.
The key principle: no single layer is sufficient. Pre-commit hooks catch patterns but can be skipped. Review agents catch logic errors but can be ignored. CI workflows catch everything but only run once code is pushed. Branch protection makes CI mandatory. Together, they form a safety net that's hard to bypass accidentally.
Start here and iterate. You don’t need everything on day one:
- Start with pre-commit hooks — 10 minutes, immediate value on every commit
- Add CLAUDE.md — 5 minutes, dramatically improves Claude’s output quality
- Add CI workflows — 15 minutes, enforces checks on every PR
- Enable branch protection — 2 minutes, makes CI checks mandatory
- Add review agents — Use as needed for deeper analysis
Each layer you add reduces the chance of LLM-generated vulnerabilities reaching production. The cost is minimal. The insurance is significant.
And while this guide uses Python tools, the architecture transfers directly to other ecosystems. Swap ruff for ESLint, pyright for the TypeScript compiler, bandit for npm audit—the layered approach is the same. The Claude Code review workflows and branch protection rules work identically regardless of language.
This is a companion guide to Defense in Depth for AI-Assisted Development, which covers the reasoning behind each layer in more depth.