Secret Detection Before Agent Execution Production Playbook: practical systems guide

A technical field guide to secret detection before agent execution, for teams building dependable AI coding workflows.

Every production agent system eventually commits a secret to git—API keys in test fixtures, database passwords in example configs, OAuth tokens in debug scripts. The blast radius depends entirely on how long it takes to catch them. A credential scanner that runs after your agent merges to main is a compliance checkbox; one that runs before execution starts is a kill switch.

This playbook walks through building secret detection that stops agent loops before they ship leaked credentials, not after. We'll cover scanning boundaries, pattern engines, false positive workflows, and how to wire detection into the Goatfied agent loop's constraint phase without slowing iteration velocity.

Why pre-execution vs. pre-commit

Most teams run secret scanners as GitHub Actions after push or as pre-commit hooks developers can skip with `--no-verify`. Agents don't have that social contract. An LLM editing files in a loop has no concept of "I know this looks like a key but it's actually a placeholder." It will happily generate realistic-looking AWS access keys in example code, commit them, open a PR, and move on.

Pre-execution scanning catches secrets before the agent loop writes anything. In Goatfied's architecture, this means running detection during the constraint phase: after the agent proposes a plan but before any file edits hit disk. If a secret appears in the planned changeset or already exists in the working tree, the loop halts and surfaces the finding to a human reviewer.

The operational difference is stark: pre-commit scanning tells you "we almost shipped this"; pre-execution scanning tells you "we never even tried."

Pattern engines: regex, entropy, and validated matches

Start with a tiered detection strategy. Tier one is high-confidence patterns—regexes for AWS keys (`AKIA[0-9A-Z]{16}`), GitHub PATs (`ghp_[a-zA-Z0-9]{36}`), Slack tokens, Stripe keys. These rarely false-positive and should hard-block execution.
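As a minimal sketch of tier one, the two patterns quoted above can be compiled into a small lookup table. The `find_known_secrets` helper, the labels, and the added `\b` word boundaries are illustrative assumptions, and a real table would carry many more providers.

```python
import re

# Illustrative table: only the two patterns quoted in the text are included.
# The \b word boundaries are an addition to reduce matches inside longer strings.
HIGH_CONFIDENCE_PATTERNS = [
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "aws_access_key_id"),
    (re.compile(r"\bghp_[a-zA-Z0-9]{36}\b"), "github_pat"),
]


def find_known_secrets(text):
    """Return one critical finding per pattern match, with a 1-based line number."""
    findings = []
    for pattern, label in HIGH_CONFIDENCE_PATTERNS:
        for match in pattern.finditer(text):
            # Count newlines before the match to recover its line number.
            line = text.count("\n", 0, match.start()) + 1
            findings.append({"type": label, "line": line, "severity": "critical"})
    return findings
```

Keeping the patterns in data rather than in control flow makes it easier to review additions and to unit test each pattern against known-good and known-bad samples.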

Tier two is entropy-based scanning. Any 32+ character alphanumeric string with high Shannon entropy deserves scrutiny. This catches generic API keys, base64-encoded credentials, and custom authentication tokens that don't follow public patterns. Set a threshold, typically 4.5 or more bits per character for base64-style strings, and treat hits as warnings that require human confirmation. Hex-only strings top out at 4 bits per character, so they need a lower threshold or a separate rule.
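The entropy check itself is short. The sketch below computes Shannon entropy over a token's character frequencies; the `extract_tokens` tokenizer and its character class are assumptions chosen for illustration, not a fixed part of any scanner.

```python
import math
import re
from collections import Counter


def shannon_entropy(token):
    """Shannon entropy of the token's character distribution, in bits per character."""
    if not token:
        return 0.0
    total = len(token)
    counts = Counter(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_tokens(text, min_length=32):
    """Return runs of base64-style characters that are at least min_length long."""
    # Assumed character class: alphanumerics plus common base64/URL-safe symbols.
    return re.findall(r"[A-Za-z0-9+/=_-]{%d,}" % min_length, text)
```

Note that a string of length n can never exceed log2(n) bits per character, so a 32-character token tops out at 5 bits. That is why a 4.5 threshold only fires on tokens that use most of their characters once.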

Tier three is validated matches: when you detect an AWS credential, make a lightweight call to `sts:GetCallerIdentity` to confirm it's real. This requires the secret access key as well as the access key ID, because an `AKIA...` ID cannot be validated on its own. GitHub tokens can be validated against the `/user` endpoint. Validation eliminates false positives from example code and placeholder values, but adds latency and requires careful rate limit handling.
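A hedged sketch of the GitHub check, using only the standard library. The function name, the three-way return value, and the injectable `opener` parameter are assumptions made so the logic can be tested without network access. Validation sends the candidate secret to an external service, so it should run only from trusted infrastructure.

```python
import urllib.error
import urllib.request

GITHUB_USER_URL = "https://api.github.com/user"


def validate_github_token(token, opener=urllib.request.urlopen, timeout=5):
    """Return True if the token is live, False if rejected, None if inconclusive."""
    request = urllib.request.Request(
        GITHUB_USER_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # 401 means the token was rejected; other codes (403, 429, 5xx)
        # may indicate rate limiting or outages, so treat them as unknown.
        return False if err.code == 401 else None
    except OSError:
        # Network failures and timeouts: the check could not be completed.
        return None
```

Returning `None` for inconclusive checks lets the caller keep the finding at its original severity instead of silently downgrading it when the validation service is unreachable.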

A practical implementation looks like this in pseudo-code:


```python
# HIGH_CONFIDENCE_PATTERNS, extract_tokens, shannon_entropy, VALIDATABLE_TYPES
# and validate_credential are assumed to be defined elsewhere.
def scan_before_execution(plan, workspace):
    findings = []
    diff = plan.full_diff

    # Tier 1: known patterns (compiled regexes paired with a label)
    for pattern, label in HIGH_CONFIDENCE_PATTERNS:
        for m in pattern.finditer(diff):
            # Regex matches carry offsets, not line numbers, so derive the line here
            line = diff.count("\n", 0, m.start()) + 1
            findings.append({"type": label, "line": line,
                             "value": m.group(0), "severity": "critical"})

    # Tier 2: entropy
    for token in extract_tokens(diff, min_length=32):
        if shannon_entropy(token) > 4.5:
            findings.append({"type": "high_entropy", "value": token, "severity": "warning"})

    # Tier 3: validate critical findings
    for f in findings:
        if f["type"] in VALIDATABLE_TYPES and validate_credential(f):
            f["severity"] = "critical_validated"

    return findings
```

Wire this into the constraint phase. If any critical findings surface, the agent loop stops and presents them in the PR description or terminal output for review.

Handling false positives without drowning signal

Entropy scanning generates noise—test fixtures with long random strings, base64-encoded images, minified JS bundles. You need an escape hatch that doesn't undermine the entire system.

Use inline annotations for known safe values:


```python
# goatfied:secret-safe
TEST_API_KEY = "sk_test_51HqJ..."
```

Your scanner parses these markers and skips annotated lines. This requires human judgment—the annotation is an assertion that "I reviewed this and it's safe"—but keeps the scanner strict by default.
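One way to honor the marker is to filter lines before they reach the pattern and entropy tiers. In this sketch, a marker on its own comment line exempts the next line and a trailing marker exempts the line it sits on. That behavior and the `lines_to_scan` name are assumptions, since the marker's exact semantics are not specified here.

```python
SAFE_MARKER = "goatfied:secret-safe"
COMMENT_PREFIXES = ("#", "//")


def lines_to_scan(text):
    """Yield (line_number, line) pairs, skipping lines covered by a safe marker."""
    skip_next = False
    for number, line in enumerate(text.splitlines(), start=1):
        if SAFE_MARKER in line:
            # A standalone comment marker covers the following line;
            # a trailing marker covers only the line it appears on.
            skip_next = line.strip().startswith(COMMENT_PREFIXES)
            continue
        if skip_next:
            skip_next = False
            continue
        yield number, line
```

Because each marker covers exactly one line, a stray annotation cannot exempt a whole file, which keeps the scanner strict by default.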

For repository-wide exceptions, maintain an allowlist file (`.goatfied/secret-allowlist.yml`) with SHA-256 hashes of known safe strings:


```yaml
allow:
  - hash: "a3c8f9..."  # example key in docs/quickstart.md
    reason: "public documentation placeholder"
  - hash: "7d2e1b..."  # test fixture
    reason: "hardcoded test data, no real credential"
```

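The scanner computes the SHA-256 hash of each detected secret and checks the allowlist before blocking. This makes exceptions auditable, because every allowed secret requires a documented reason, and it keeps the secret values themselves out of the allowlist file. Because a hash matches only the exact string, an exception does not silently carry over if the value changes.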
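The lookup can be sketched with `hashlib` alone. Here `allow_entries` stands in for the already-parsed `allow:` list (YAML loading is left out), full 64-character hex digests are assumed, and the helper names are hypothetical.

```python
import hashlib


def sha256_hex(value):
    """Hex-encoded SHA-256 digest of a string, using UTF-8 bytes."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def is_allowlisted(secret, allow_entries):
    """Return True only if the exact secret is listed with a non-empty reason."""
    digest = sha256_hex(secret)
    for entry in allow_entries:
        # An entry without a documented reason is ignored, so every
        # exception stays auditable.
        if entry.get("hash") == digest and entry.get("reason", "").strip():
            return True
    return False
```

Rejecting entries that lack a reason is a design choice in this sketch: it turns the "documented reason" rule from a convention into something the scanner enforces.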

Integrating with the Goatfied constraint phase

In Goatfied's agent loop, the constraint phase runs after planning but before execution. The agent proposes a set of file edits; constraints validate whether those edits are safe and reasonable. Secret detection fits naturally here.

Before the agent writes any files, run the scanner against:

- The full planned diff (new content being added)
- The current workspace state (secrets already present that might now be committed)
- Environment variable references (catching `process.env.API_KEY` patterns that leak structure)

If the scanner surfaces critical findings, inject them into the constraint report. The loop pauses and the user sees:


```text
❌ Constraint violation: secrets detected

2 potential secrets found in planned changes:
  • Line 47 of src/config.ts: AWS access key (AKIA...)
  • Line 103 of tests/fixtures.json: High entropy token (fj83hd...)

Review and annotate safe values or regenerate plan without credentials.
```

The agent doesn't proceed until a human clears the findings or adjusts the plan. This preserves Goatfied's compile-first, audit-friendly workflow: every constraint violation is logged, and no code moves forward without passing all gates.
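A report like the one shown above could be assembled from a list of findings roughly as follows. The finding fields (`path`, `line`, `label`, `value`, `severity`), the function name, and the warning-only header are assumptions for illustration and are not part of any Goatfied API.

```python
def build_constraint_report(findings):
    """Return (blocked, text); the loop is blocked only by critical findings."""
    if not findings:
        return False, ""
    blocked = any(f["severity"].startswith("critical") for f in findings)
    # The warning-only header is a placeholder chosen for this sketch.
    header = "❌ Constraint violation: secrets detected" if blocked else "Secret scan warnings"
    noun = "secret" if len(findings) == 1 else "secrets"
    lines = [header, "", f"{len(findings)} potential {noun} found in planned changes:"]
    for f in findings:
        # Show only a short prefix so the report never echoes the full secret.
        preview = f["value"][:4] + "..."
        lines.append(f"  • Line {f['line']} of {f['path']}: {f['label']} ({preview})")
    lines += ["", "Review and annotate safe values or regenerate plan without credentials."]
    return blocked, "\n".join(lines)
```

Truncating the value matters because constraint reports tend to end up in logs and PR descriptions, which would otherwise become a second place the secret leaks.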

Secrets already in the tree

Pre-execution scanning only helps if you also scan the current state. An agent editing an existing file with a committed secret will re-commit that secret unless you catch it.

On every agent loop start, run a full workspace scan and surface any findings as warnings. A pre-existing secret that the plan does not touch need not block the loop, since that damage is already done, but flagging it helps keep agents from propagating it into new files or PRs.
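A workspace scan can be as simple as walking the tree and applying the high-confidence patterns line by line. This sketch checks only the AWS key pattern and skips a couple of directories; the skip list and the `scan_workspace` name are assumptions.

```python
import os
import re

AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")
SKIP_DIRS = {".git", "node_modules"}  # assumed defaults; adjust per repository


def scan_workspace(root):
    """Return warning findings for AWS-style keys found in text files under root."""
    warnings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if AWS_KEY.search(line):
                            warnings.append({
                                "path": os.path.relpath(path, root),
                                "line": number,
                                "severity": "warning",
                            })
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file: skip it
    return warnings
```

For large repositories, a full walk on every loop start may be too slow; caching results by file hash is one possible refinement.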

For remediation, use automated rewriting: when the scanner detects a committed secret, it can propose a follow-up PR that replaces the literal value with an environment variable reference and adds a `.env.example` entry. The old value remains in git history, so the credential still needs to be rotated, but the rewrite turns a security incident into a configuration improvement.
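The rewrite step might look like the following for Python source files. It is a deliberately narrow sketch: it assumes the secret appears as a plain quoted string literal, and the `propose_env_rewrite` name and return shape are hypothetical.

```python
import re


def propose_env_rewrite(source, secret, env_name):
    """Return (new_source, env_example_line), or (source, None) if no literal is found."""
    for quote in ('"', "'"):
        literal = quote + secret + quote
        if literal in source:
            new_source = source.replace(literal, f'os.environ["{env_name}"]')
            # Add the import only when the module does not already import os.
            if not re.search(r"^\s*import os\b", new_source, flags=re.MULTILINE):
                new_source = "import os\n" + new_source
            # The .env.example entry names the variable but carries no value.
            return new_source, f"{env_name}="
    return source, None
```

Returning the proposed text instead of writing files keeps the rewrite reviewable: the caller can turn it into a diff for the follow-up PR.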

Operational telemetry

Track scanner performance in production:

- **Detection latency**: how long does constraint-phase scanning add to loop startup? Target sub-500ms for small diffs.
- **False positive rate**: what percentage of blocked loops were actually safe? Aim for <5%.
- **Secret prevalence**: how many secrets appear in agent-generated plans vs. human-authored code?

These metrics reveal whether your detection is effective or just friction. If false positives exceed 10%, raise the tier-two entropy threshold or expand the allowlist. If detection latency climbs above one second, move validation to async background checks.
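The thresholds above translate directly into a small check. In this sketch the `ScanStats` shape is hypothetical, and measuring latency at the 95th percentile (nearest-rank) is an assumption; the text does not say which statistic to use.

```python
from dataclasses import dataclass, field


@dataclass
class ScanStats:
    blocked_loops: int = 0
    blocked_but_safe: int = 0  # blocks a reviewer later judged to be safe
    latencies_ms: list = field(default_factory=list)


def false_positive_rate(stats):
    """Share of blocked loops that turned out to be safe (0.0 when nothing was blocked)."""
    if stats.blocked_loops == 0:
        return 0.0
    return stats.blocked_but_safe / stats.blocked_loops


def p95_latency_ms(stats):
    """Nearest-rank 95th percentile of recorded scan latencies."""
    if not stats.latencies_ms:
        return 0.0
    ordered = sorted(stats.latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def recommendations(stats):
    """Map the thresholds from the text to suggested actions."""
    actions = []
    if false_positive_rate(stats) > 0.10:
        actions.append("raise the tier-two entropy threshold or expand the allowlist")
    if p95_latency_ms(stats) > 1000:
        actions.append("move credential validation to async background checks")
    return actions
```

A percentile is used here because a mean can hide the slow validation calls that actually stall a loop.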
