# Entropy Analysis for Executable Layers

> **Status:** Draft – Sprint 186/209
> **Owners:** Scanner Guild · Policy Guild · UI Guild · Docs Guild

## 1. Overview

Entropy analysis highlights opaque regions inside container layers (packed binaries, stripped blobs, embedded firmware) so Stella Ops can prioritise artefacts that are hard to audit. The scanner computes per-file entropy metrics, reports opaque ratios per layer, and feeds penalties into the trust algebra.

## 2. Scanner pipeline (`SCAN-ENTROPY-186-011/012`)

* **Target files:** ELF, PE/COFF, Mach-O executables and large raw blobs (>16 KB). Archive formats (zip/tar) are unpacked by existing analyzers before entropy processing.
* **Section analysis:**
  * ELF – `.text`, `.rodata`, `.data`, custom sections.
  * PE – section table entries (`IMAGE_SECTION_HEADER`).
  * Mach-O – LC_SEGMENT/LC_SEGMENT_64 sections.
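* **Sliding window:** 4 KB window with a 1 KB stride. Each window is scored with the Shannon entropy of its byte distribution, where `p_i` is the fraction of bytes in the window with value `i`: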

  \[
  H = -\sum_{i=0}^{255} p_i \log_2 p_i
  \]

  Windows with `H ≥ 7.2` bits/byte are marked “opaque”.
* **Heuristics & hints:**
  * Flag entire files that have no symbols or whose debug info has been stripped.
  * Detect known packer section names (`.UPX*`, `.aspack`, etc.).
  * Record offsets, window sizes, and entropy values to support explainability.
* **Outputs:**
  * `entropy.report.json` (per-file details, windows, hints).
  * `layer_summary.json` (opaque byte ratios per layer and overall image).
  * Penalty score contributed to the trust algebra (`entropy_penalty`).
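
The sliding-window step above can be sketched as follows. This is a minimal illustration, not the scanner's implementation: the function names `shannon_entropy` and `opaque_windows` are hypothetical, and scoring inputs shorter than one window as a single short window is an assumption. The window size, stride, and threshold are the values stated above.

```python
import math
from collections import Counter

WINDOW = 4096            # 4 KB window (from this document)
STRIDE = 1024            # 1 KB stride (from this document)
OPAQUE_THRESHOLD = 7.2   # bits/byte (from this document)


def shannon_entropy(chunk: bytes) -> float:
    """Return the Shannon entropy of `chunk` in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    # H = sum(p_i * log2(1 / p_i)) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in Counter(chunk).values())


def opaque_windows(data: bytes, window: int = WINDOW, stride: int = STRIDE,
                   threshold: float = OPAQUE_THRESHOLD) -> list[dict]:
    """Return one record per window whose entropy is at or above `threshold`."""
    records = []
    # Assumption: inputs shorter than one window are scored as a single short window.
    last_offset = max(len(data) - window, 0)
    for offset in range(0, last_offset + 1, stride):
        chunk = data[offset:offset + window]
        h = shannon_entropy(chunk)
        if h >= threshold:
            records.append({"offset": offset, "length": len(chunk),
                            "entropy": round(h, 2)})
    return records
```

A window of uniformly distributed bytes scores the maximum of 8 bits/byte, while a run of identical bytes scores 0, so the `7.2` threshold only triggers on data that is close to random (compressed, encrypted, or packed).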

All JSON output is canonical (sorted keys, UTF-8) and included in DSSE attestations/replay bundles.
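
As a rough sketch of what "canonical" could mean here, the helper below serialises with sorted keys and UTF-8 encoding. The helper name `canonical_json` is hypothetical, and the compact separators are an assumption: this document only specifies sorted keys and UTF-8, not a full canonicalisation scheme.

```python
import json


def canonical_json(obj) -> bytes:
    """Serialise `obj` deterministically: sorted keys, UTF-8 output."""
    text = json.dumps(
        obj,
        sort_keys=True,          # stated in this document
        ensure_ascii=False,      # emit non-ASCII characters as UTF-8, not \u escapes
        separators=(",", ":"),   # assumption: no insignificant whitespace
    )
    return text.encode("utf-8")
```

Two logically equal reports then produce identical bytes regardless of key insertion order, which is what makes the digest inside an attestation reproducible.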

## 3. JSON Schemas

### 3.1 `entropy.report.json`

```jsonc
{
  "schema": "stellaops.entropy/report@1",
  "imageDigest": "sha256:…",
  "layerDigest": "sha256:…",
  "files": [
    {
      "path": "/opt/app/libblob.so",
      "size": 5242880,
      "opaqueBytes": 1342177,
      "opaqueRatio": 0.26,
      "flags": ["stripped", "section:.UPX0"],
      "windows": [
        { "offset": 0, "length": 4096, "entropy": 7.45 },
        { "offset": 1024, "length": 4096, "entropy": 7.38 }
      ]
    }
  ]
}
```
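
Because windows overlap (4 KB window, 1 KB stride), summing window lengths would count the same bytes several times. One plausible way to derive `opaqueBytes` is to take the union of the opaque windows, sketched below. This document does not state how the scanner actually derives the value, so treat the approach and the helper name `opaque_byte_count` as assumptions.

```python
def opaque_byte_count(windows: list[dict]) -> int:
    """Count bytes covered by at least one opaque window (interval union)."""
    total = 0
    cur_start = cur_end = None
    for w in sorted(windows, key=lambda w: w["offset"]):
        start, end = w["offset"], w["offset"] + w["length"]
        if cur_end is None or start > cur_end:
            # Gap before this window: close the previous run and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent window: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total
```

For the two windows listed in the example report (offsets 0 and 1024, each 4096 bytes long) the union covers 5120 bytes rather than 8192. Under this assumption, `opaqueRatio` would be that count divided by `size`.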

### 3.2 `layer_summary.json`

```jsonc
{
  "schema": "stellaops.entropy/layer-summary@1",
  "imageDigest": "sha256:…",
  "layers": [
    {
      "digest": "sha256:layer4…",
      "opaqueBytes": 2306867,
      "totalBytes": 10485760,
      "opaqueRatio": 0.22,
      "indicators": ["packed", "no-symbols"]
    }
  ],
  "imageOpaqueRatio": 0.18,
  "entropyPenalty": 0.12
}
```

## 4. Policy integration (`POLICY-RISK-90-001`)

* Policy Engine receives `entropy_penalty` and per-layer ratios via scan evidence.
* Default thresholds:
  * Block when `imageOpaqueRatio > 0.15` and provenance is unknown.
  * Warn when any executable has `opaqueRatio > 0.30`.
* Penalty weights are configurable per tenant. Policy explanations include:
  * Highest-entropy files and offsets.
  * Reason code (packed, no symbols, runtime reachable).
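
The default thresholds can be illustrated with a small decision function. This is a sketch only: the function name `evaluate_entropy_policy`, the verdict strings, and the rule that a block takes precedence over a warning are assumptions rather than the Policy Engine's actual behaviour.

```python
BLOCK_IMAGE_RATIO = 0.15  # default from this document
WARN_FILE_RATIO = 0.30    # default from this document


def evaluate_entropy_policy(image_opaque_ratio: float,
                            executable_ratios: list[float],
                            provenance_known: bool) -> str:
    """Return "block", "warn", or "pass" (hypothetical verdict names)."""
    # Block: the image is too opaque overall and nothing vouches for its origin.
    if image_opaque_ratio > BLOCK_IMAGE_RATIO and not provenance_known:
        return "block"
    # Warn: at least one executable is mostly opaque on its own.
    if any(ratio > WARN_FILE_RATIO for ratio in executable_ratios):
        return "warn"
    return "pass"
```

Both comparisons are strict, so an image sitting exactly at `0.15` or an executable exactly at `0.30` passes under the defaults.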

## 5. UI experience (`UI-ENTROPY-40-001/002`)

* **Heatmaps:** render per-window entropy along the file's byte offsets (green → red).
* **Layer donut:** show opaque % per layer with tooltips linking to file list.
* **“Why risky?” chips:** highlight triggers such as *Packed-like*, *Stripped*, *No symbols*.
* Policy banners explain configured thresholds and mitigation (add provenance, unpack, or accept risk).
* Provide direct download links to `entropy.report.json` for audits.

## 6. CLI / API hooks

* CLI – the `--entropy` option of `stella scan artifacts` prints the top opaque files and penalties.
* API – `GET /api/v1/scans/{id}/entropy` serves the summary and evidence references.
* Notify templates can include entropy penalties to escalate opaque images.

## 7. Trust algebra

The penalty is computed as:

\[
\text{entropyPenalty} = K \sum_{\text{layers}} \left( \frac{\text{opaqueBytes}}{\text{totalBytes}} \times \frac{\text{layerBytes}}{\text{imageBytes}} \right)
\]

* Default `K = 0.5`.
* Cap penalty at 0.3 to avoid over-weighting tiny blobs.
* Combine with other trust signals (reachability, provenance) to prioritise audits.
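
A minimal sketch of the formula and cap, assuming each layer record carries `opaqueBytes`, `totalBytes`, and a `layerBytes` field. `layerBytes` appears in the formula but not in the `layer_summary.json` example, so its presence in the record is an assumption, as is the function name `entropy_penalty`.

```python
def entropy_penalty(layers: list[dict], image_bytes: int,
                    k: float = 0.5, cap: float = 0.3) -> float:
    """Weighted opaque ratio across layers, scaled by K and capped."""
    if image_bytes <= 0:
        return 0.0
    weighted = sum(
        (layer["opaqueBytes"] / layer["totalBytes"])   # how opaque the layer is
        * (layer["layerBytes"] / image_bytes)          # how much of the image it is
        for layer in layers
        if layer["totalBytes"] > 0                     # skip empty layers
    )
    return min(k * weighted, cap)
```

For example, a half-opaque layer that makes up a quarter of the image contributes `0.5 × 0.5 × 0.25 = 0.0625` with the default `K`, while a fully opaque single-layer image would reach `0.5` and is capped at `0.3`.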

## 8. Implementation checklist

| Area | Task ID | Notes |
|------|---------|-------|
| Scanner analysis | `SCAN-ENTROPY-186-011` | Sliding window entropy & heuristics |
| Evidence output | `SCAN-ENTROPY-186-012` | JSON reports + DSSE |
| Policy integration | `POLICY-RISK-90-001` | Trust weight + explanations |
| UI | `UI-ENTROPY-40-001/002` | Visualisation & messaging |
| Docs | `DOCS-ENTROPY-70-004` | (this guide) |

Update this document as thresholds change or additional packer signatures are introduced.