One of my favorite agent use-cases is code review... openai codex has /review built in, and claude code-review plugin (not the big fancy auto PR review feature) is fine. But for review I don't want claude to prompt me Yes/No 100x. And I never want to run it in "dangerous mode." For review, it is not making changes but just running sometimes complex bash queries of all things against the code as it looks for patterns. Sooo this is the perfect time to mount everything into a docker and let Claude run dangerously there.

The script to do this:

  1. Determines what to compare, (extant changes with upstream, or accepts a branch name, create a temporary git worktree for it)
  2. Spins up a Docker container with only that worktree mounted, read-only
  3. Runs: claude --dangerously-skip-permissions with the code-review prompt a. Either use the shorter prompt "review this please" b. or the longer skill, code-review, that spins up multiple subagents for more thorough review
  4. Store findings in pr-reviews/branch-name.md.
  5. Clean up the worktree on exit.

Here is the working version that I use multiple times a day now.

A next step, perhaps, is to orchestrate multiple runs for even more careful/thorough findings, and have some final agent aggregate the reports.

And a step before that even could be an initial agent that guesses the complexity of the changes and suggests or decides how many separate reviews to run!