## 0) Scope at a glance

**Scan surfaces**

* **Images (static):** every file in every layer, plus Dockerfile metadata (ENV/ARG/LABEL, history).
* **Runtime (live containers):** env vars, process args, mounted volumes (e.g., `/run/secrets`), logs, selected files created at runtime.

**Detection methods**

1. **Deterministic patterns (regex)** for known secret types.
2. **Heuristics**: entropy scoring for unknown/random secrets.
3. **Contextual signals**: filename/path, key names, nearby keywords, file type hints.
4. **Structural checks**: e.g., JWT decodable, cloud key prefix/length.
5. **(Optional) Lightweight validation**: local checksum/format (no network calls by default).

**Reporting**

* JSON (and optionally SARIF) with: *where*, *what rule matched*, *snippet masked*, *confidence*, *severity*, *layer/container process*, and *remediation hint*.

---

## 1) Docker‑aware discovery workflow

### A. Images (static, pre‑runtime)

1. **Obtain filesystem + metadata**

   * Prefer the **API**: use the Docker Engine client (Docker.DotNet) and call `Images.SaveImageAsync` (the equivalent of `docker save`), then process the exported tar in memory as a stream.
   * Parse `manifest.json` and the image config JSON it references; capture:

     * `config.Env` (final env),
     * **history**/`created_by` for `ENV`/`ARG`/`RUN` strings,
     * labels.
2. **Scan every layer**

   * Stream‑extract each layer tar (e.g., SharpCompress).
   * Track **added/modified paths** per layer (so you can report: *layer N, file X*).
   * **Text‑only filter**: skip clearly binary files (e.g., sample N bytes; if >30% non‑printables, skip or downrank).
3. **File content & name/path analysis**

   * Apply **regex detectors** (Section 3) and **entropy** (Section 4).
   * Weigh findings with **context** (Section 5).
4. **Dockerfile/History checks**

   * Flag secrets in `ENV`/`ARG`/`RUN` strings (e.g., `ENV MYSQL_ROOT_PASSWORD=...`).
   * Flag **deleted‑later files** that were present in earlier layers (common leak).
   * Highlight missing `.dockerignore` patterns when suspicious files (.env, .pem, .tfstate) entered any layer.

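
The text‑only filter from step 2 could look roughly like the sketch below. `FileClassifier.LooksBinary` is a hypothetical helper name (only the `FileClassifier.cs` file name appears in this blueprint), and the rule "a NUL byte means binary" is an added assumption on top of the 30% ratio.

```csharp
using System;

public static class FileClassifier
{
    // Returns true when the sample looks binary: it contains a NUL byte, or more
    // than maxNonPrintableRatio of its bytes are control characters.
    // Bytes >= 0x80 are treated as printable so UTF-8 text is not penalised.
    public static bool LooksBinary(ReadOnlySpan<byte> sample, double maxNonPrintableRatio = 0.30)
    {
        if (sample.IsEmpty) return false;              // nothing to scan, treat as text

        int nonPrintable = 0;
        foreach (byte b in sample)
        {
            if (b == 0) return true;                   // NUL rarely appears in text files
            bool isWhitespace = b == (byte)'\t' || b == (byte)'\n' || b == (byte)'\r';
            if ((b < 0x20 && !isWhitespace) || b == 0x7F) nonPrintable++;
        }
        return nonPrintable > sample.Length * maxNonPrintableRatio;
    }
}
```

Pass it a small prefix of each tar entry (for example the first few KiB) rather than the whole file. One known limitation of this sketch: UTF‑16 text contains NUL bytes and would be classified as binary.
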
### B. Running containers (runtime)

1. **Enumerate** containers and **inspect**:

   * `InspectContainerAsync` → `Config.Env`, `HostConfig.Binds`, `Mounts`, image id.
2. **Env var scan**

   * Scan all `key=value` pairs with the same detectors (regex + entropy + context on the key name).
3. **Process args**

   * `docker top` or `/proc/<pid>/cmdline` via `Exec` → scan args for `--password=...`, `--api-key=...`.
4. **Mounted secret paths**

   * Default locations: `/run/secrets/*`, `/var/run/secrets/*`, K8s secret volumes, config maps that may contain creds.
   * Retrieve via `GetArchiveFromContainerAsync` and scan.
5. **Logs (optional but valuable)**

   * Attach/stream logs; scan lines for secret patterns; provide **live redaction** option.

> **Note**: Memory forensics is possible but heavy; treat as optional/IR-only.

---

## 2) High‑value filename/path heuristics (fast wins)

Run these **glob/name** checks before content scanning to prioritize files:

**Generic secret indicators**

```
**/*.env        **/.env*           **/*secret*.*      **/*secr*.*
**/*credential*.*                 **/*creds*.*       **/*passwd*
**/password*   **/*token*.*       **/*apikey*.*      **/*api_key*.*
**/*.pem       **/*.key           **/*.pfx           **/*.p12
**/*.jks       **/*.keystore      **/id_rsa          **/id_dsa
**/id_ecdsa    **/id_ed25519      **/private.pem     **/server.key
**/tls.key     **/jwt*.key
```

**Common app/config**

```
**/appsettings*.json              **/secrets*.json
**/application.{yml,yaml,properties}
**/application-*.{yml,yaml,properties}
**/config.yaml  **/settings.yml   **/settings.py
**/wp-config.php **/config.php     **/settings.php
**/nuget.config  **/settings.xml (Maven)  **/gradle.properties
**/docker-compose*.yml   **/compose*.yml
**/PublishProfiles/*.pubxml
```

**Cloud/CLI creds**

```
**/.aws/credentials  **/.aws/config
**/gcloud/application_default_credentials.json
**/.azure/**         **/doctl/config.yaml     **/.oci/config
**/.docker/config.json  **/.dockercfg
**/.npmrc  **/.yarnrc  **/.pypirc  **/.gem/credentials  **/.netrc
```

**Infra/IaC**

```
**/*.tfstate  **/*.tfvars*   **/kube/config  **/.kube/config  **/*kubeconfig*
**/service-account*.json     **/*-sa.json    **/*-key.json
```

**Orchestrator runtime**

```
/run/secrets/*     /var/run/secrets/*
```

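
A subset of the globs above can be approximated without a glob library. The following is a minimal sketch, not a full glob matcher: `PathHeuristics.IsSuspicious` and the four lookup tables are hypothetical names, and only a handful of the listed patterns are covered. Fragment checks such as `token` will also hit harmless names like `tokenizer.py`, so the result is meant for prioritisation, not as a finding on its own.

```csharp
using System;
using System.Collections.Generic;

public static class PathHeuristics
{
    static readonly HashSet<string> RiskyExtensions = new(StringComparer.OrdinalIgnoreCase)
        { ".env", ".pem", ".key", ".pfx", ".p12", ".jks", ".keystore", ".tfstate" };

    static readonly HashSet<string> RiskyFileNames = new(StringComparer.OrdinalIgnoreCase)
        { "id_rsa", "id_dsa", "id_ecdsa", "id_ed25519", ".npmrc", ".pypirc", ".netrc", ".dockercfg" };

    static readonly string[] RiskyPathSuffixes =
        { "/.aws/credentials", "/.docker/config.json", "/.kube/config" };

    static readonly string[] RiskyNameFragments =
        { "secret", "credential", "creds", "passwd", "password", "token", "apikey", "api_key", "kubeconfig" };

    public static bool IsSuspicious(string path)
    {
        // Layer tar entries usually have no leading slash; normalise so suffix checks work.
        string normalized = "/" + path.Replace('\\', '/').TrimStart('/');
        string name = normalized.Substring(normalized.LastIndexOf('/') + 1);
        int dot = name.LastIndexOf('.');
        string extension = dot >= 0 ? name.Substring(dot) : string.Empty;

        if (name.StartsWith(".env", StringComparison.OrdinalIgnoreCase)) return true;
        if (RiskyFileNames.Contains(name) || RiskyExtensions.Contains(extension)) return true;

        foreach (string suffix in RiskyPathSuffixes)
            if (normalized.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)) return true;

        foreach (string fragment in RiskyNameFragments)
            if (name.IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0) return true;

        return false;
    }
}
```
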
---

## 3) **Regex detector catalog** (battle‑tested patterns)

> Use `RegexOptions.Compiled`. Add `RegexOptions.IgnoreCase` only to keyword‑style patterns (e.g., `aws_secret_access_key`); token prefixes such as `AKIA` or `ghp_` are case‑sensitive, and matching them case‑insensitively adds false positives.
>
> Always **mask** values in reports (e.g., show first 4 + last 4 chars).

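
The masking rule (first 4 + last 4 characters) might be implemented as in this sketch. `Masking.Mask` is a hypothetical helper, and fully starring short values is an added assumption so that short passwords are not mostly revealed.

```csharp
using System;

public static class Masking
{
    // Keeps the first and last `visible` characters and replaces the middle with '*'.
    // Values of at most 3 * visible characters are starred completely.
    public static string Mask(string value, int visible = 4)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;
        if (value.Length <= visible * 3) return new string('*', value.Length);

        return value.Substring(0, visible)
             + new string('*', value.Length - visible * 2)
             + value.Substring(value.Length - visible);
    }
}
```

With the defaults, a 20‑character access key ID keeps its `AKIA` prefix and last four characters, while a 7‑character password is returned as seven asterisks.
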
### 3.1 Private keys / certificates

* **OpenSSH private key**
  `@"-----BEGIN OPENSSH PRIVATE KEY-----"`
* **Generic PEM private key**
  `@"-----BEGIN (?:RSA |DSA |EC |PGP )?PRIVATE KEY-----"`
* **PGP private key**
  `@"-----BEGIN PGP PRIVATE KEY BLOCK-----"`

> (Public keys/certificates are *not* secrets: `BEGIN PUBLIC KEY`, `BEGIN CERTIFICATE` → downrank/ignore.)

### 3.2 Cloud: AWS

* **Access Key ID**
  `@"\b(?:AKIA|ASIA|AGPA|AIDA|AROA|AIPA|ANPA)[A-Z0-9]{16}\b"`
* **Secret Access Key (context‑aided)**
  `@"(?<![A-Za-z0-9/+])[A-Za-z0-9/+]{40}(?![A-Za-z0-9/+=])"`
  *Lookarounds are used instead of `\b` because `\b` fails when the key starts or ends with `/` or `+`. Boost only if near `aws|secret|access[_-]?key|AWS_SECRET_ACCESS_KEY` within ~50 chars.*
* **Credentials file lines**

  * `@"aws_access_key_id\s*=\s*[A-Z0-9]{20}"`
  * `@"aws_secret_access_key\s*=\s*[A-Za-z0-9/\+=]{40}"`

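
The "context‑aided" rule for the secret access key can be sketched as follows: a 40‑character candidate is reported only when a keyword appears within a window of characters around it. `AwsSecretDetector.FindNearKeyword` is a hypothetical name, and the candidate itself is excluded from the keyword search (an added assumption) so that a random string containing `aws` cannot vouch for itself.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AwsSecretDetector
{
    // 40 base64-alphabet characters that are not part of a longer run.
    static readonly Regex Candidate = new(
        @"(?<![A-Za-z0-9/+])[A-Za-z0-9/+]{40}(?![A-Za-z0-9/+=])", RegexOptions.Compiled);

    static readonly Regex Keyword = new(
        @"aws|secret|access[_-]?key", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Yields only candidates that have a keyword within `window` characters before or after.
    public static IEnumerable<Match> FindNearKeyword(string text, int window = 50)
    {
        foreach (Match m in Candidate.Matches(text))
        {
            int matchEnd = m.Index + m.Length;
            int start = Math.Max(0, m.Index - window);
            int end = Math.Min(text.Length, matchEnd + window);

            string before = text.Substring(start, m.Index - start);
            string after = text.Substring(matchEnd, end - matchEnd);

            if (Keyword.IsMatch(before) || Keyword.IsMatch(after)) yield return m;
        }
    }
}
```

A line such as `AWS_SECRET_ACCESS_KEY=<40 chars>` is reported, while a bare 40‑character SHA‑1 hex digest with no keyword nearby is not.
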
### 3.3 Cloud: GCP / Google

* **API key**
  `@"\bAIza[0-9A-Za-z\-_]{35}\b"`
* **Service Account JSON** (two‑term signature)

  * `@"""type""\s*:\s*""service_account"""`
  * `@"""private_key""\s*:\s*""-----BEGIN PRIVATE KEY-----"`

### 3.4 Cloud: Azure

* **Storage connection string**
  `@"DefaultEndpointsProtocol=https;AccountName=[^;]+;AccountKey=[A-Za-z0-9\+/=]{88};EndpointSuffix=core\.windows\.net"`
* **SAS token (simplified)**
  `@"\bsv=\d{4}-\d{2}-\d{2}[^ ]*?&sig=[A-Za-z0-9%/\+=]{40,}\b"`

### 3.5 Dev platforms / SCM

* **GitHub PAT**
  `@"\bgh[prusoa]_[A-Za-z0-9]{36}\b"`
* **GitLab PAT**
  `@"\bglpat-[A-Za-z0-9\-_]{20,}\b"`
* **NPM token**

  * in `.npmrc`: `@"//registry\.npmjs\.org/:_authToken=\s*(npm_[A-Za-z0-9]{36})"`
  * raw form: `@"\bnpm_[A-Za-z0-9]{36}\b"`
* **PyPI token**
  `@"\bpypi-AgEIcHlwaS5vcmc[A-Za-z0-9\-_]{50,}\b"`

### 3.6 Messaging / SaaS

* **Slack tokens (broad)**
  `@"\bxox[a-z]-[A-Za-z0-9-]{8,}-[A-Za-z0-9-]{8,}-[A-Za-z0-9-]{8,}(?:-[A-Za-z0-9-]{8,})?\b"`
* **Stripe**
  `@"\bsk_(?:live|test)_[0-9a-zA-Z]{24,}\b"`  *(24 is a minimum; keys can be longer)*
* **SendGrid**
  `@"\bSG\.[A-Za-z0-9\-_]{16,32}\.[A-Za-z0-9\-_]{16,64}\b"`
* **Mailgun**
  `@"\bkey-[0-9a-zA-Z]{32}\b"`
* **Twilio**

  * SID: `@"\bAC[0-9a-f]{32}\b"`
  * Auth token (context aided): `@"\b[0-9a-f]{32}\b"` near `twilio|auth[_-]?token`
* **Discord bot**
  `@"\b[A-Za-z\d]{24}\.[A-Za-z\d\-_]{6}\.[A-Za-z\d\-_]{27}\b"`

### 3.7 Database / service connection strings

* **PostgreSQL**
  `@"\bpostgres(?:ql)?://[^:\s]+:[^@\s]+@[^/\s]+"`
* **MySQL**
  `@"\bmysql://[^:\s]+:[^@\s]+@[^/\s]+"`
* **MongoDB**
  `@"\bmongodb(?:\+srv)?://[^:\s]+:[^@\s]+@[^/\s]+"`
* **SQL Server (ADO.NET)**
  `@"\bData Source=[^;]+;Initial Catalog=[^;]+;User ID=[^;]+;Password=[^;]+;"`
* **Redis**
  `@"\bredis(?:s|\+ssl)?://[^:@\s]*:[^@\s]+@[^/\s]+"`  *(covers the `rediss://` TLS scheme; the password part is required, so URLs without credentials are not flagged)*
* **Basic auth in URL (generic)**
  `@"[a-zA-Z][a-zA-Z0-9+\-.]*://[^:/\s]+:[^@/\s]+@[^/\s]+"`

### 3.8 Docker / CLI auth artifacts

* **Docker config.json auth**
  `@"""auth""\s*:\s*""[A-Za-z0-9\+/=]{20,}"""`
* **.netrc auth**
  `@"(?mi)^machine\s+\S+\s+login\s+\S+\s+password\s+\S+"`

### 3.9 Tokens / JWT

* **JWT (structural)**
  `@"\beyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\b"`

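
The regex only checks the shape of a JWT. A structural check can then confirm that the header and payload really decode to JSON. The sketch below uses `System.Text.Json`; `JwtStructure.LooksLikeJwt` is a hypothetical name, requiring a string `alg` member in the header is an added assumption, and the signature is deliberately not verified.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class JwtStructure
{
    // True when the token has three dot-separated parts, the header decodes to a JSON
    // object with a string "alg" member, and the payload decodes to a JSON object.
    public static bool LooksLikeJwt(string token)
    {
        string[] parts = token.Split('.');
        if (parts.Length != 3) return false;
        if (!TryParseJsonObject(parts[0], out JsonElement header)) return false;
        if (!header.TryGetProperty("alg", out JsonElement alg) || alg.ValueKind != JsonValueKind.String)
            return false;
        return TryParseJsonObject(parts[1], out _);
    }

    static bool TryParseJsonObject(string segment, out JsonElement root)
    {
        root = default;
        if (!TryDecodeBase64Url(segment, out byte[] bytes)) return false;
        try
        {
            using JsonDocument doc = JsonDocument.Parse(Encoding.UTF8.GetString(bytes));
            if (doc.RootElement.ValueKind != JsonValueKind.Object) return false;
            root = doc.RootElement.Clone();   // clone so the element outlives the document
            return true;
        }
        catch (JsonException) { return false; }
    }

    static bool TryDecodeBase64Url(string segment, out byte[] bytes)
    {
        bytes = Array.Empty<byte>();
        string s = segment.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 1: return false;             // never a valid base64 length
            case 2: s += "=="; break;         // restore stripped padding
            case 3: s += "="; break;
        }
        try { bytes = Convert.FromBase64String(s); return true; }
        catch (FormatException) { return false; }
    }
}
```

A passing check fits the "+0.2 structural check" signal in the scoring section. A failing check suggests the regex hit was just a dotted base64‑looking string.
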
### 3.10 Build tools / package managers

* **NuGet (cleartext)**
  `@"<add\s+key=""ClearTextPassword""\s+value=""[^""]+"""`
  `@"<add\s+key=""Password""\s+value=""[^""]+"""`  *(base64 ‑ still secret)*
* **Maven settings.xml**
  `@"<server>\s*<id>[^<]+</id>\s*<username>[^<]+</username>\s*<password>[^<]+</password>"`
* **Gradle**
  `@"(?i)\bsigning\.password\s*=\s*.+"`

> Keep regexes modular; associate each with:
> `{ Id, Name, Pattern, Severity, Examples, RecommendedRemediation }`.

---

## 4) Entropy detector (catches “unknown” secrets)

**Why:** Many org‑specific tokens won’t match known regexes.

**Implementation**

* Extract candidate tokens by character class:

  * base64/base64url: `[A-Za-z0-9/_\-\+=]{20,}`
  * hex: `[A-Fa-f0-9]{32,}`
  * general mixed: `[A-Za-z0-9]{24,}`
* Compute **Shannon entropy** per candidate. Use **alphabet‑aware thresholds**:

  * **base64/url**: ≥ **4.0** bits/char & length ≥ 24
  * **hex**: ≥ **3.0** bits/char & length ≥ 32
  * **alnum**: ≥ **4.0** bits/char & length ≥ 24
* **Context boosts** (raise confidence) if **within 64 chars** of:
  `password|passwd|pwd|secret|token|apikey|api_key|api-key|client[_-]?secret|private[_-]?key|connectionstring|conn[_-]?str|bearer`
* **Context suppressors** (lower confidence/ignore):

  * File/path contains: `example|sample|test|fixture|dummy`
  * Surrounding line contains: `REDACTED|<redacted>|changeme`
  * Known non‑secret blocks: `BEGIN PUBLIC KEY`, `BEGIN CERTIFICATE`
* Cap **N findings per file** (e.g., 50) to avoid log floods.

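
The thresholds above can be sanity‑checked against two upper bounds on the empirical Shannon entropy of a single token. In the notation below (introduced here for illustration), $n$ is the token length, $A$ the alphabet, and $c_a$ the number of times symbol $a$ occurs in the token $x$:

$$
H(x) = -\sum_{a \in A} \frac{c_a}{n}\,\log_2\frac{c_a}{n} \;\le\; \min\bigl(\log_2 |A|,\ \log_2 n\bigr)
$$

$$
\log_2 16 = 4, \qquad \log_2 64 = 6, \qquad \log_2 62 \approx 5.95, \qquad \log_2 24 \approx 4.58, \qquad \log_2 32 = 5
$$

A 24‑character base64 or alnum token can therefore score at most about 4.58 bits/char, so the 4.0 threshold leaves only a modest margin at the minimum length. A 32‑character hex token tops out at 4.0 against its 3.0 threshold. Setting a threshold above these bounds would silently disable the detector for tokens of that length.
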
---

## 5) Scoring & de‑duping

Combine signals into a **confidence score** (sum the applicable weights, then clamp the result to the range 0 to 1):

* +0.9 Regex “hard” match (e.g., OpenSSH private key)
* +0.7 Regex “soft” match (e.g., AWS secret 40‑char near keyword)
* +0.4 Entropy pass
* +0.2 Suspicious filename/path
* -0.5 Suppressor keyword/file
* +0.2 Structural check passes (e.g., JWT decodes)

**Severity**

* **Critical**: private keys, cloud root creds, Docker auth, DB creds in URLs, verified JWT signing keys.
* **High**: API tokens (GitHub/GitLab/Slack/Stripe), secrets in ENV/ARG history.
* **Medium**: high‑entropy candidates with strong context.
* **Low**: weak context/entropy only, or likely sample values.

**De‑dupe** same value across files/layers/envs; keep a single canonical record with **occurrence list**.

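
De‑duping could be done by grouping raw hits on a hash of the secret value, which also avoids keeping the raw value as a dictionary key. This is a sketch under assumptions: `Occurrence`, `DedupedFinding` and `Deduper.Merge` are hypothetical types, and the first detector that saw a value is kept as the canonical one.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public sealed record Occurrence(string Type, string Where);

public sealed class DedupedFinding
{
    public string DetectorId { get; init; } = "";
    public string ValueHash { get; init; } = "";
    public List<Occurrence> Occurrences { get; } = new();
}

public static class Deduper
{
    // Groups raw hits by SHA-256 of the value; each group keeps every occurrence.
    public static IReadOnlyCollection<DedupedFinding> Merge(
        IEnumerable<(string DetectorId, string Value, Occurrence Occurrence)> raw)
    {
        var byHash = new Dictionary<string, DedupedFinding>();
        foreach (var (detectorId, value, occurrence) in raw)
        {
            string hash = Sha256Hex(value);
            if (!byHash.TryGetValue(hash, out var finding))
            {
                finding = new DedupedFinding { DetectorId = detectorId, ValueHash = hash };
                byHash[hash] = finding;
            }
            finding.Occurrences.Add(occurrence);
        }
        return byHash.Values;
    }

    static string Sha256Hex(string value)
    {
        using var sha = SHA256.Create();
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
        return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
    }
}
```
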
---

## 6) Docker‑specific checks you must implement

* **ENV/ARG leakage in history**
  Parse `config.History[].CreatedBy` or `docker history --no-trunc`.
  Flag any `ENV/ARG` with suspicious key names or values matching detectors.
* **Deleted‑later files**
  If a file existed in an earlier layer and got deleted later (common `.env` mishap), still flag it and report **layer** + **instruction** that introduced it.
* **`.dockerignore` advisory**
  If high‑risk files (.env, .pem, .tfstate, credentials) ended up in any layer, suggest `.dockerignore` entries that would have kept them out of the build context.

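
Deleted‑later files can be found from the layer tars alone, because a deletion is recorded as a whiteout entry: a file named `.wh.<name>` removes `<name>` from lower layers, and `.wh..wh..opq` hides everything a directory held in lower layers. The sketch below works on plain path lists, one list per layer in order; `WhiteoutTracker.FindDeletedLater` is a hypothetical name, and reading the tar entries is left to the caller.

```csharp
using System;
using System.Collections.Generic;

public static class WhiteoutTracker
{
    const string WhiteoutPrefix = ".wh.";
    const string OpaqueMarker = ".wh..wh..opq";

    // Returns every path that was written in one layer and removed by a later one.
    public static List<(string Path, int AddedInLayer, int DeletedInLayer)> FindDeletedLater(
        IReadOnlyList<IReadOnlyList<string>> layers)
    {
        var live = new Dictionary<string, int>(StringComparer.Ordinal);   // path -> layer that last wrote it
        var deleted = new List<(string Path, int AddedInLayer, int DeletedInLayer)>();

        for (int layer = 0; layer < layers.Count; layer++)
        {
            var additions = new List<string>();
            foreach (string entry in layers[layer])
            {
                string path = entry.Trim('/');
                int slash = path.LastIndexOf('/');
                string dir = path.Substring(0, slash + 1);    // "" or "a/b/"
                string name = path.Substring(slash + 1);

                if (name == OpaqueMarker)
                {
                    RemoveMatching(live, deleted, layer, p => p.StartsWith(dir, StringComparison.Ordinal));
                }
                else if (name.StartsWith(WhiteoutPrefix, StringComparison.Ordinal))
                {
                    string target = dir + name.Substring(WhiteoutPrefix.Length);
                    RemoveMatching(live, deleted, layer,
                        p => p == target || p.StartsWith(target + "/", StringComparison.Ordinal));
                }
                else additions.Add(path);
            }
            // Whiteouts only affect lower layers, so this layer's own files are added afterwards.
            foreach (string added in additions) live[added] = layer;
        }
        return deleted;
    }

    static void RemoveMatching(
        Dictionary<string, int> live,
        List<(string Path, int AddedInLayer, int DeletedInLayer)> deleted,
        int layer, Func<string, bool> match)
    {
        var hits = new List<string>();
        foreach (string p in live.Keys) if (match(p)) hits.Add(p);
        foreach (string p in hits) { deleted.Add((p, live[p], layer)); live.Remove(p); }
    }
}
```
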
---

## 7) Runtime inspection rules

* **Environment**

  * Scan all `Env` pairs; **boost** hits for keys containing:
    `PASSWORD|PASS|PWD|SECRET|TOKEN|KEY|CLIENT_SECRET|SAS|CONNECTIONSTRING`
* **Process args**

  * Flag `--password`, `--api-key`, `--token`, `--secret`, `--connection-string`.
* **Mounted secrets**

  * Enumerate `/run/secrets/*`, `/var/run/secrets/*` (Swarm/K8s).
  * Ensure permissions are restrictive; still **scan contents** (apps sometimes copy them elsewhere).
* **Logs**

  * Tail & scan. Provide **optional redaction** pipeline.

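
The environment and process‑arg rules above might be sketched as below. `RuntimeRules` and its two methods are hypothetical names. The key pattern is a substring match, so it also hits names like `KEYBOARD_LAYOUT`; that is why it should only boost a detector hit rather than count as a finding. Arguments read from `/proc/<pid>/cmdline` are NUL‑separated and need splitting on `'\0'` before being passed in.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RuntimeRules
{
    static readonly Regex SensitiveKey = new(
        @"PASSWORD|PASS|PWD|SECRET|TOKEN|KEY|CLIENT_SECRET|SAS|CONNECTIONSTRING",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    static readonly Regex SensitiveFlag = new(
        @"^(?<flag>--(?:password|api-key|token|secret|connection-string))(?:=(?<value>.*))?$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Takes "KEY=value" entries (the shape of Config.Env) and returns the keys worth boosting.
    public static IEnumerable<string> SensitiveEnvKeys(IEnumerable<string> env)
    {
        foreach (string pair in env)
        {
            int eq = pair.IndexOf('=');
            string key = eq >= 0 ? pair.Substring(0, eq) : pair;
            if (SensitiveKey.IsMatch(key)) yield return key;
        }
    }

    // Returns (flag, value) pairs, handling both "--flag=value" and "--flag value".
    public static IEnumerable<(string Flag, string Value)> SensitiveArgs(IReadOnlyList<string> args)
    {
        for (int i = 0; i < args.Count; i++)
        {
            Match m = SensitiveFlag.Match(args[i]);
            if (!m.Success) continue;

            if (m.Groups["value"].Success)
                yield return (m.Groups["flag"].Value, m.Groups["value"].Value);
            else if (i + 1 < args.Count)
                yield return (m.Groups["flag"].Value, args[++i]);   // value is the next argument
        }
    }
}
```
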
---

## 8) Reporting format (JSON)

Example JSON for one finding:

```json
{
  "detectorId": "aws.accessKeyId",
  "name": "AWS Access Key ID",
  "severity": "HIGH",
  "confidence": 0.92,
  "valueSample": "AKIA************WXYZ",
  "locations": [
    {
      "type": "image-layer-file",
      "image": "repo/app:1.4.2",
      "layerDigest": "sha256:...abc",
      "path": "/app/.env",
      "line": 12
    },
    {
      "type": "container-env",
      "containerId": "f3e9d...",
      "envKey": "AWS_ACCESS_KEY_ID"
    }
  ],
  "context": {
    "filePathScore": 0.2,
    "regexMatch": true,
    "entropy": null,
    "nearbyKeywords": ["AWS_ACCESS_KEY_ID"]
  },
  "remediation": "Remove from image; inject via secrets manager or runtime mount; rotate the key."
}
```

> Optionally also emit **SARIF** to plug into code‑scanning dashboards.

---

## 9) C# implementation sketch

### Project layout

```
SecretsScanner/
  Core/
    IDetector.cs                 // interface: Detect(stream|text, path, context) -> Findings
    RegexDetector.cs             // holds Pattern, Hints, Confidence rules
    EntropyDetector.cs           // Shannon entropy
    JwtDetector.cs               // structural decoding check
    FileClassifier.cs            // text/binary check, ext-based hints
    Scoring.cs                   // combine signals; severity
    PathsHeuristics.cs           // globs & filename rules
    ReportModel.cs               // JSON schema / SARIF
  Docker/
    ImageReader.cs               // reads image tars, layers via Docker.DotNet or stream
    HistoryParser.cs             // extracts ENV/ARG from history
    ContainerInspector.cs        // env, args, mounts, logs (Docker.DotNet)
  Catalog/
    RegexCatalog.cs              // patterns (section 3), per-detector metadata
    Keywords.cs                  // boost/suppress lists
  Cli/
    Program.cs                   // options: image, container, path; json output; fail-on
```

### C# snippets (illustrative)

**Regex catalog**

```csharp
using System.Text.RegularExpressions;

public static class RegexCatalog
{
    // Each rule: stable id, display name, compiled pattern, severity, remediation hint.
    public static readonly (string Id, string Name, Regex Rx, string Severity, string Hint)[] Rules =
    {
        ("pem.openssh", "OpenSSH Private Key",
            new Regex(@"-----BEGIN OPENSSH PRIVATE KEY-----", RegexOptions.Compiled),
            "CRITICAL", "Remove private keys from images; use mounts or vault."),
        ("pem.private", "PEM Private Key",
            new Regex(@"-----BEGIN (?:RSA |DSA |EC |PGP )?PRIVATE KEY-----", RegexOptions.Compiled),
            "CRITICAL", "Remove private keys; rotate credentials."),
        ("aws.akid", "AWS Access Key ID",
            new Regex(@"\b(?:AKIA|ASIA|AGPA|AIDA|AROA|AIPA|ANPA)[A-Z0-9]{16}\b", RegexOptions.Compiled),
            "HIGH", "Rotate; use IAM roles/STS; remove from code/config."),
        ("github.pat", "GitHub Personal Access Token",
            new Regex(@"\bgh[prusoa]_[A-Za-z0-9]{36}\b", RegexOptions.Compiled),
            "HIGH", "Revoke PAT; use fine-grained tokens; remove from image."),
        // ... add remaining patterns from Section 3
    };
}
```

**Entropy**

```csharp
using System;

public static class Entropy
{
    // Shannon entropy in bits per character, counting only characters found in `alphabet`.
    // Assumes an ASCII alphabet: characters above U+00FF are ignored to stay inside the table.
    public static double Shannon(ReadOnlySpan<char> s, ReadOnlySpan<char> alphabet)
    {
        Span<int> counts = stackalloc int[256];
        counts.Clear();   // stackalloc memory is not guaranteed to be zeroed
        int n = 0;
        foreach (var ch in s)
        {
            if (ch < 256 && alphabet.IndexOf(ch) >= 0) { counts[ch]++; n++; }
        }
        if (n == 0) return 0.0;
        double H = 0.0;
        for (int i = 0; i < counts.Length; i++)
        {
            if (counts[i] == 0) continue;
            double p = counts[i] / (double)n;
            H -= p * Math.Log(p, 2);
        }
        return H;
    }
}
```

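**Candidate extraction (simplified)**

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Kind is "b64" or "hex"; Line is kept so context scoring can inspect the surroundings.
public record Candidate(string Value, string Kind, string Line);

public static class CandidateExtractor
{
    static readonly Regex Base64Token = new(@"[A-Za-z0-9/_\-\+=]{20,}", RegexOptions.Compiled);
    static readonly Regex HexToken    = new(@"[A-Fa-f0-9]{32,}", RegexOptions.Compiled);

    // Hex characters are a subset of the base64 class, so a hex token is yielded
    // once per kind; de-dupe by value downstream.
    public static IEnumerable<Candidate> ExtractCandidates(string line)
    {
        foreach (Match m in Base64Token.Matches(line)) yield return new Candidate(m.Value, "b64", line);
        foreach (Match m in HexToken.Matches(line))    yield return new Candidate(m.Value, "hex", line);
    }
}
```
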
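**Scoring**

```csharp
using System;

// One flag per signal from Section 5.
public record DetectionSignals(bool RegexHard, bool RegexSoft, bool EntropyHit,
                               bool SuspiciousPath, bool StructuralOk, bool Suppressor);

public static class Scoring
{
    // Sums the Section 5 weights and clamps the result to [0, 1].
    public static double Score(DetectionSignals s)
    {
        double score = 0;
        if (s.RegexHard) score += 0.9;
        if (s.RegexSoft) score += 0.7;
        if (s.EntropyHit) score += 0.4;
        if (s.SuspiciousPath) score += 0.2;
        if (s.StructuralOk) score += 0.2;
        if (s.Suppressor) score -= 0.5;
        return Math.Clamp(score, 0, 1);
    }
}
```
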
**Docker (Docker.DotNet)**

Method names below reflect recent Docker.DotNet releases; verify them against the version you pin.

* Images: `Images.GetImageHistoryAsync`, `Images.SaveImageAsync` + tar unpack.
* Containers: `Containers.InspectContainerAsync`, `Exec.ExecCreateContainerAsync` + `Exec.StartAndAttachContainerExecAsync`, `Containers.GetArchiveFromContainerAsync`, `Containers.GetContainerLogsAsync`.

---

## 10) False‑positive control & hygiene

* **Ignore lists**: file globs (`test/**`, `**/*.example.*`), value lists (`REDACTED`, `example`, `dummy`, `changeme`).
* **Public materials**: downrank matches inside `BEGIN PUBLIC KEY`/`BEGIN CERTIFICATE`.
* **Thresholds**: tune entropy and minimum lengths to your codebase; keep per‑detector knobs in config.
* **Masking**: never print full values; keep secure logs.
* **Rate‑limits**: cap per‑file matches; cap per‑container to avoid spam.

---

## 11) CI/CD and policy

* **Build step**: after `docker build`, run image scan; **fail** on High/Critical (configurable).
* **Pre‑deploy**: scan the runtime environment (env vars, process args, mounts) in read‑only mode.
* **Baselining**: allow a first pass to **baseline known leftovers**, then block any **new** secrets.
* **Rotation**: auto‑emit per‑type remediation (e.g., rotate PAT, revoke AWS AK/SK, move to secret manager).

---

## 12) Optional enhancements

* **SBOM‑guided scanning**: use SBOM/file inventory to prioritize text/config assets; cache base layers.
* **JWT structural checks**: base64url‑decode header/payload; verify JSON; flag if plausible.
* **Checksum checks**: Luhn for CCNs (if in scope); simple format checks for cloud tokens.
* **Interactive audit**: CLI `--audit` mode to triage and write an “allowlist/baseline”.

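
If card numbers are in scope, the Luhn checksum mentioned above is cheap to run locally on any digit run a regex has already picked up. `Checksums.PassesLuhn` is a hypothetical helper; tolerating spaces and dashes and requiring at least 12 digits are assumptions made for this sketch.

```csharp
public static class Checksums
{
    // Luhn check: walking from the right, double every second digit, subtract 9 from
    // results above 9, and require the total to be divisible by 10.
    public static bool PassesLuhn(string candidate)
    {
        int sum = 0, digitCount = 0;
        bool doubleIt = false;

        for (int i = candidate.Length - 1; i >= 0; i--)
        {
            char c = candidate[i];
            if (c == ' ' || c == '-') continue;       // common separators
            if (c < '0' || c > '9') return false;     // anything else is not a card number

            int d = c - '0';
            if (doubleIt) { d *= 2; if (d > 9) d -= 9; }
            sum += d;
            doubleIt = !doubleIt;
            digitCount++;
        }
        return digitCount >= 12 && sum % 10 == 0;
    }
}
```
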
---

## 13) Minimal “first list” your dev can paste today

**Start with these detectors (high ROI):**

* PEM/OPENSSH private keys
* AWS AKID + secret (context‑aided)
* GitHub PAT, GitLab PAT, NPM, PyPI
* Slack, Stripe, SendGrid, Twilio
* Docker config `auth` field
* DB connection strings (Postgres/MySQL/Mongo/SQLServer)
* JWT
* `.aws/credentials`, `.npmrc`, `.docker/config.json`, `appsettings*.json`, `.env*`, `*.tfstate`, `*kubeconfig*` (path heuristics)
* Entropy (base64/hex/alnum) with context boosts/suppressors

That set covers the leak types most often found in images: private keys, cloud and SCM credentials, registry auth, and database connection strings.

---

### Final note

This blueprint keeps everything **offline** (no external calls), so it’s safe in CI and reproducible. If you later want to add **credential validation** (e.g., confirm an AWS key via STS), make it opt‑in and heavily rate‑limited.