AI Code Guard (Scanner Operations)

Status: Planned
Audience: Scanner operators, Security owners, CI maintainers
Related: docs/modules/policy/guides/ai-code-guard-policy.md, docs/benchmarks/ai-code-guard/README.md

AI Code Guard adds fast, deterministic checks for AI-assisted code changes so security, IP, and license issues are caught before release gates. The guard runs in Scanner, emits evidence for Policy, and can be surfaced via CLI or SCM annotations.

1) Checks

1.1 Secrets and unsafe patterns

Secret-leak detection uses the existing secrets analyzer ruleset and classifies each finding as new or pre-existing via hunk hashes.
Unsafe API detection reuses language capability scanners (eval/exec, SQL concat, weak crypto, process spawn).
Findings include file path, line range, and masked snippets (ASCII only).
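The classification and masking steps above could look roughly like the following. This is a minimal sketch, not the Scanner implementation: the function names, the `sha256:` ID prefix, the idea of a recorded set of base-ref hunk IDs, and the "keep the first four characters" masking rule are all assumptions made for illustration.

```python
def classify_finding(hunk_id: str, base_hunk_ids: set[str]) -> str:
    """Return "pre_existing" if the hunk ID is already known for the base ref.

    `base_hunk_ids` is a hypothetical set of hunk IDs recorded earlier.
    """
    return "pre_existing" if hunk_id in base_hunk_ids else "new"


def mask_snippet(line: str, start: int, end: int, keep: int = 4) -> str:
    """Mask line[start:end], keeping only the first `keep` characters visible."""
    secret = line[start:end]
    visible = secret[:keep]
    masked = visible + "*" * (len(secret) - len(visible))
    out = line[:start] + masked + line[end:]
    # Replace every non-ASCII character with "?" so the snippet is ASCII only.
    return out.encode("ascii", errors="replace").decode("ascii")


print(classify_finding("sha256:bb", {"sha256:aa"}))       # -> new
print(mask_snippet('key = "AKIA1234567890"', 7, 21))      # -> key = "AKIA**********"
```

Masking before serialization keeps the raw secret out of the evidence payload entirely, rather than relying on downstream consumers to redact it.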

1.2 Attribution and similarity

Changed hunks are normalized (line endings, whitespace, path separators) and hashed into deterministic hunk IDs.
Similarity is evaluated against allowlist and denylist corpora shipped with Offline Kit.
Hunks of unknown provenance that score above the review threshold require a justification.
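As an illustration of the normalize-then-hash step, here is a minimal sketch. The exact normalization rules, the choice of SHA-256, and the `sha256:` prefix are assumptions; the document only states that line endings, whitespace, and path separators are normalized.

```python
import hashlib


def normalize_hunk(path: str, lines: list[str]) -> str:
    """Produce a canonical text form of a changed hunk (assumed rules)."""
    norm_path = path.replace("\\", "/")  # unify path separators
    norm_lines = []
    for line in lines:
        # Unify line endings, then drop the trailing newline.
        line = line.replace("\r\n", "\n").replace("\r", "\n").rstrip("\n")
        # Collapse whitespace runs and trim both ends (assumption).
        norm_lines.append(" ".join(line.split()))
    return norm_path + "\n" + "\n".join(norm_lines)


def hunk_id(path: str, lines: list[str]) -> str:
    """Hash the normalized hunk into a deterministic ID."""
    digest = hashlib.sha256(normalize_hunk(path, lines).encode("utf-8"))
    return "sha256:" + digest.hexdigest()


# The same change checked out on Windows and on Linux yields the same ID.
assert hunk_id("src\\a.py", ["x  = 1\r\n"]) == hunk_id("src/a.py", ["x = 1\n"])
```

Collapsing whitespace makes the ID insensitive to reformatting, at the cost of treating indentation-only changes as identical; a real implementation may choose a stricter rule.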

1.3 License hygiene

Dependency diffs are derived from SBOM changes.
License evidence is mapped to allow/review/block verdicts using the policy matrix.
Snippets exceeding the line threshold require a provenance comment or waiver reference.
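The mapping from license evidence to verdicts can be sketched as a lookup against a matrix. The matrix contents, the "strictest verdict wins" rule, and the default of `review` for unlisted licenses are illustrative assumptions, not the shipped policy.

```python
# Hypothetical policy matrix: SPDX license ID -> verdict.
POLICY_MATRIX = {
    "MIT": "allow",
    "Apache-2.0": "allow",
    "LGPL-3.0-only": "review",
    "AGPL-3.0-only": "block",
}
SEVERITY = {"allow": 0, "review": 1, "block": 2}


def license_verdict(license_ids: list[str], default: str = "review") -> str:
    """Return the strictest verdict across all licenses found for a dependency."""
    verdicts = [POLICY_MATRIX.get(lic, default) for lic in license_ids]
    # No license evidence at all falls back to the default verdict.
    return max(verdicts, key=SEVERITY.__getitem__, default=default)


def summarize(dep_licenses: dict[str, list[str]]) -> dict[str, int]:
    """Count block/review verdicts over the dependencies added in the SBOM diff."""
    counts = {"block": 0, "review": 0}
    for name in sorted(dep_licenses):  # sorted for stable iteration order
        verdict = license_verdict(dep_licenses[name])
        if verdict in counts:
            counts[verdict] += 1
    return counts


print(summarize({"libfoo": ["MIT"], "libbar": ["LGPL-3.0-only"]}))
# -> {'block': 0, 'review': 1}
```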

2) Inputs and evidence

Inputs:

Base and head refs (or explicit diff).
Scanner findings (secrets and capabilities).
SBOM inventory and license evidence.
Allowlist and denylist corpora with pinned digests.
Guard policy config (.stellaops.yml).

Evidence output:

Deterministic JSON payload with hunk hashes, similarity scores, finding summaries, and rule versions.
DSSE-ready bundle for Attestor registration.

Example (abbreviated):

{
  "status": "review",
  "hunks": 4,
  "secrets": { "new": 1, "pre_existing": 0 },
  "unsafe_apis": 2,
  "similarity": { "max": 0.87, "denylist_hit": false },
  "licenses": { "block": 0, "review": 1 }
}

3) Determinism and offline posture

Hunks and findings are emitted in a stable order; all hashes are computed over canonical JSON encoded as UTF-8.
Similarity corpora are addressed by digest and packaged in Offline Kit bundles.
No network calls during evaluation; all inputs are local or provided by the caller.
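A minimal sketch of stable ordering plus canonical hashing follows. The sort key fields (`path`, `start_line`, `rule_id`) are hypothetical, and `json.dumps` with sorted keys and compact separators only approximates a formal canonicalization scheme such as RFC 8785; the scheme the guard actually uses is not specified here.

```python
import hashlib
import json


def canonical_json(payload) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace, as UTF-8."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def evidence_digest(findings: list[dict]) -> str:
    """Order findings deterministically, then hash the canonical payload."""
    ordered = sorted(
        findings,
        key=lambda f: (f["path"], f["start_line"], f["rule_id"]),
    )
    return hashlib.sha256(canonical_json({"findings": ordered})).hexdigest()


findings = [
    {"path": "b.py", "start_line": 1, "rule_id": "r1"},
    {"rule_id": "r2", "path": "a.py", "start_line": 9},
]
# Input order and key order do not affect the digest.
assert evidence_digest(findings) == evidence_digest(list(reversed(findings)))
```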

4) Integration points

Scanner WebService exposes guard run endpoints for CI and Console.
The CLI command stella guard run emits results in JSON, SARIF, and GitLab formats.
Integrations post SCM annotations and status checks when configured.
Attestor registers guard evidence as a predicate type for audit trails.

5) Overrides

Overrides are Policy-driven and require issue links plus expiry. The guard emits override metadata for audit trails; Policy decides whether to allow a time-boxed waiver.
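The "issue link plus expiry" requirement could be checked along these lines. This is a sketch under assumptions: the field names `issue_link` and `expires_at`, the ISO 8601 timestamp format, and the treatment of naive timestamps as UTC are all hypothetical. The current time is passed in explicitly so the check stays deterministic and replayable.

```python
from datetime import datetime, timezone


def override_is_valid(override: dict, now: datetime) -> bool:
    """Return True if the override has an issue link and has not yet expired.

    `now` must be a timezone-aware datetime supplied by the caller.
    """
    issue = override.get("issue_link") or ""
    expires = override.get("expires_at")
    if not issue.startswith(("http://", "https://")) or not expires:
        return False
    expiry = datetime.fromisoformat(expires)
    if expiry.tzinfo is None:
        # Assumption: timestamps without an offset are interpreted as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return now < expiry


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
waiver = {
    "issue_link": "https://example.invalid/issues/1",
    "expires_at": "2025-02-01T00:00:00+00:00",
}
print(override_is_valid(waiver, now))  # -> True
```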
