# Explainability Schema
_Last updated: 2025-12-13. Owner: Policy Guild + Docs Guild._
This document defines the explainability schema addressing gaps EX1-EX10 from the November 2025 product findings. It specifies the canonical format for vulnerability verdict explanations, DSSE signing policy, CAS storage rules, and export/replay formats.
---
## 1. Overview
Explainability provides auditable, machine-readable rationale for every vulnerability verdict. Each explanation includes:
- **Decision chain:** Ordered list of rules/policies that contributed to the verdict
- **Evidence links:** References to graphs, runtime facts, VEX statements, and SBOM components
- **Confidence scores:** Per-rule and aggregate confidence values
- **Redaction metadata:** PII handling and data classification
---
## 2. Gap Resolutions
### EX1: Schema/Canonicalization + Hashes
**Explanation schema:**
```json
{
  "schema": "stellaops.explanation@v1",
  "explanation_id": "explain:sha256:{hex}",
  "finding_id": "P-7:S-42:pkg:maven/log4j@2.14.1:CVE-2021-44228",
  "verdict": {
    "status": "affected",
    "severity": {"normalized": "Critical", "score": 10.0},
    "confidence": 0.92
  },
  "decision_chain": [
    {
      "rule_id": "rule:reachability_gate",
      "rule_version": "1.0.0",
      "inputs": {
        "reachability.state": "CR",
        "reachability.confidence": 0.92
      },
      "output": {"allowed": true, "contribution": 0.4},
      "evidence_refs": ["cas://reachability/graphs/blake3:..."]
    },
    {
      "rule_id": "rule:severity_baseline",
      "rule_version": "1.0.0",
      "inputs": {
        "cvss_base": 10.0,
        "epss_percentile": 0.95
      },
      "output": {"severity": "Critical", "contribution": 0.6},
      "evidence_refs": ["cas://advisories/CVE-2021-44228.json"]
    }
  ],
  "aggregate_confidence": 0.88,
  "created_at": "2025-12-13T10:00:00Z",
  "policy_version": "sha256:...",
  "graph_revision_id": "rev:blake3:..."
}
```
**Canonicalization rules:**
1. JSON object keys are sorted alphabetically at all nesting levels
2. Entries in `decision_chain` keep rule execution order (they are not sorted)
3. `evidence_refs` arrays are sorted alphabetically
4. No insignificant whitespace; output is encoded as UTF-8
5. The hash is computed over the canonical JSON bytes: `sha256(canonical_json)`
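
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the Policy Engine implementation: the function names `canonicalize` and `canonical_hash` are hypothetical, and the sketch hashes whatever document it is given (how `explanation_id` itself is treated when computing the hash is not specified here).

```python
import hashlib
import json
from typing import Any


def canonicalize(explanation: dict[str, Any]) -> bytes:
    """Return canonical UTF-8 JSON bytes for an explanation document."""
    doc = json.loads(json.dumps(explanation))  # deep copy via JSON round-trip
    # Rule 2: decision_chain keeps execution order, so the list is left as-is.
    # Rule 3: each evidence_refs array is sorted alphabetically.
    for entry in doc.get("decision_chain", []):
        if "evidence_refs" in entry:
            entry["evidence_refs"] = sorted(entry["evidence_refs"])
    # Rules 1 and 4: sorted keys at every level, no whitespace, UTF-8 output.
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(explanation: dict[str, Any]) -> str:
    """Rule 5: sha256 over the canonical JSON bytes."""
    return "sha256:" + hashlib.sha256(canonicalize(explanation)).hexdigest()
```

Two documents that differ only in key order or in the order of `evidence_refs` hash identically, while swapping two `decision_chain` entries changes the hash, because execution order is part of the canonical form.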
### EX2: DSSE Predicate/Signing Policy
**DSSE predicate type:**
```
stella.ops/explanation@v1
```
**Signing policy:**
| Element | Required | Signer |
|---------|----------|--------|
| Explanation body | Yes | Policy Engine key |
| Graph DSSE reference | Yes (if reachability cited) | Scanner key |
| VEX DSSE reference | Yes (if VEX cited) | Policy Engine key |
**DSSE envelope structure:**
```json
{
  "payloadType": "application/vnd.stellaops.explanation+json",
  "payload": "<base64(canonical_explanation_json)>",
  "signatures": [
    {
      "keyid": "policy-engine-signing-2025",
      "sig": "base64:..."
    }
  ]
}
```
**Signing requirements:**
- All explanations must be signed before CAS storage
- Signing key must be registered in Authority key store
- Key rotation triggers re-signing of active explanations (configurable)
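
As an illustration of how such an envelope is assembled, the sketch below signs the DSSE pre-authentication encoding (PAE) of the payload type and canonical payload. HMAC-SHA256 with a shared secret is only a stand-in for the Policy Engine signing key, which this document does not describe; the function names are hypothetical, and the sketch emits plain base64 for `sig` since it is unclear whether the `base64:` prefix shown above is literal.

```python
import base64
import hashlib
import hmac

PAYLOAD_TYPE = "application/vnd.stellaops.explanation+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 <len> <type> <len> <payload>'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(canonical_json: bytes, keyid: str, secret: bytes) -> dict:
    """Build a DSSE envelope around the canonical explanation bytes."""
    # Assumption: HMAC-SHA256 replaces the real asymmetric signing key.
    sig = hmac.new(secret, pae(PAYLOAD_TYPE, canonical_json), hashlib.sha256).digest()
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(canonical_json).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: dict, secret: bytes) -> bool:
    """Recompute the signature over the PAE and compare in constant time."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(secret, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    return any(
        hmac.compare_digest(expected, base64.b64decode(s["sig"]))
        for s in envelope["signatures"]
    )
```

Signing the PAE rather than the raw payload binds the signature to the payload type, so an envelope cannot be reinterpreted under a different `payloadType` without invalidating it.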
### EX3: CAS Storage Rules for Evidence
**Storage layout:**
```
cas://explanations/
  {sha256}/                        # Explanation body
  {sha256}.dsse                    # DSSE envelope
  by-finding/{finding_id}/         # Index by finding
  by-policy/{policy_digest}/       # Index by policy version
  by-graph/{graph_revision_id}/    # Index by graph revision
```
**Storage rules:**
1. Explanations are immutable after signing
2. New verdicts create new explanation documents (no updates)
3. Previous explanations are retained per retention policy
4. Cross-references are validated at write time (referenced graphs and VEX documents must exist)
**Deduplication:**
- Identical canonical JSON produces identical hash
- CAS returns existing reference if content matches
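
The deduplication behavior can be pictured with a toy in-memory store. This is a sketch only: `InMemoryCas` is a hypothetical class, and the real CAS backend, its URI resolution, and its index paths are not modeled.

```python
import hashlib


class InMemoryCas:
    """Toy content-addressed store keyed by the sha256 of the stored bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, canonical_json: bytes) -> tuple[str, bool]:
        """Store content and return (uri, created).

        Identical content maps to the same URI; an existing object is
        never overwritten, which keeps stored explanations immutable.
        """
        digest = hashlib.sha256(canonical_json).hexdigest()
        uri = f"cas://explanations/{digest}"
        if digest in self._objects:
            return uri, False  # content already present: return existing reference
        self._objects[digest] = canonical_json
        return uri, True

    def get(self, uri: str) -> bytes:
        """Look up stored bytes by the digest at the end of the URI."""
        return self._objects[uri.rsplit("/", 1)[-1]]
```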
### EX4: Link to Decision/Policy and graph_revision_id
**Required links:**
```json
{
  "links": {
    "policy_version": "sha256:7e1d...",
    "policy_uri": "cas://policy/versions/sha256:7e1d...",
    "graph_revision_id": "rev:blake3:a1b2...",
    "graph_uri": "cas://reachability/revisions/blake3:a1b2...",
    "sbom_digest": "sha256:def4...",
    "sbom_uri": "cas://scanner-artifacts/sbom.cdx.json",
    "vex_digest": "sha256:e5f6...",
    "vex_uri": "cas://excititor/vex/openvex.json"
  }
}
```
**Validation:**
- All linked artifacts must exist at explanation creation time
- Links are verified during replay/audit
- Broken links cause replay verification failure
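
A link check along these lines might look like the following sketch. The helper name `find_broken_links` and the `exists` callback (standing in for a CAS lookup) are assumptions made for illustration; only the `*_uri` fields are resolved, and digest comparison is left out.

```python
from typing import Callable


def find_broken_links(links: dict[str, str], exists: Callable[[str], bool]) -> list[str]:
    """Return the names of `*_uri` links whose target cannot be resolved."""
    return sorted(
        name
        for name, value in links.items()
        if name.endswith("_uri") and not exists(value)
    )
```

An empty result means every linked artifact resolved; any returned name would cause creation (or replay verification) to fail.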
### EX5: Export/Replay Bundle Format
**Export bundle manifest:**
```json
{
  "schema": "stellaops.explanation.bundle@v1",
  "bundle_id": "bundle:explain:2025-12-13",
  "created_at": "2025-12-13T10:00:00Z",
  "explanations": [
    {
      "explanation_id": "explain:sha256:...",
      "finding_id": "...",
      "explanation_uri": "explanations/sha256:....json",
      "dsse_uri": "explanations/sha256:....dsse"
    }
  ],
  "dependencies": {
    "graphs": [
      {"revision_id": "rev:blake3:...", "uri": "graphs/blake3:....json"}
    ],
    "policies": [
      {"digest": "sha256:...", "uri": "policies/sha256:....json"}
    ],
    "vex_statements": [
      {"digest": "sha256:...", "uri": "vex/sha256:....json"}
    ]
  },
  "verification": {
    "bundle_hash": "sha256:...",
    "signature": "base64:...",
    "signed_by": "policy-engine-signing-2025"
  }
}
```
**Replay verification:**
```bash
stella explain verify --bundle ./explanation-bundle.tgz
# Output:
# Bundle: bundle:explain:2025-12-13
# Explanations: 42
# Dependencies: 5 graphs, 2 policies, 12 VEX
# Verifying explanations...
# Canonical hashes: 42/42 MATCH
# DSSE signatures: 42/42 VALID
# Dependency links: 42/42 RESOLVED
# Replay verification PASSED.
```
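
The "Canonical hashes" step of that output could be approximated as below. This sketch assumes that `explanation_id` equals `explain:sha256:` followed by the hex digest of the stored canonical bytes, and that bundle members are read through a caller-supplied `read_file` function; both are assumptions, and `check_canonical_hashes` is a hypothetical name rather than part of the `stella` CLI.

```python
import hashlib
from typing import Callable


def check_canonical_hashes(
    entries: list[dict], read_file: Callable[[str], bytes]
) -> tuple[int, int]:
    """Return (matched, total) for the `explanations` array of a bundle manifest."""
    matched = 0
    for entry in entries:
        # Read the stored explanation body from the bundle by its relative URI.
        body = read_file(entry["explanation_uri"])
        digest = "explain:sha256:" + hashlib.sha256(body).hexdigest()
        if digest == entry["explanation_id"]:
            matched += 1
    return matched, len(entries)
```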
### EX6: PII/Redaction Rules
**Redaction categories:**
| Category | Redaction | Example |
|----------|-----------|---------|
| User identifiers | Hash | `user:alice` -> `user:sha256:a1b2...` |
| IP addresses | Mask | `192.168.1.100` -> `192.168.x.x` |
| File paths | Normalize | `/home/alice/code/...` -> `{HOME}/code/...` |
| Email addresses | Hash | `alice@example.com` -> `email:sha256:...` |
| API keys/tokens | Omit | `Authorization: Bearer xxx` -> `[REDACTED]` |
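
A minimal sketch of the first four transformations follows. The function names are hypothetical, and details the table leaves open are assumptions: the hash input for user identifiers is taken to be the part after the `user:` prefix, only IPv4 is handled, and only a leading `/home/<user>` is normalized.

```python
import hashlib
import re


def _sha256_hex(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def redact_user(user_ref: str) -> str:
    """`user:alice` -> `user:sha256:<hex>` (hashes the part after the prefix)."""
    prefix, _, name = user_ref.partition(":")
    return f"{prefix}:sha256:{_sha256_hex(name)}"


def redact_email(address: str) -> str:
    """`alice@example.com` -> `email:sha256:<hex>`."""
    return f"email:sha256:{_sha256_hex(address)}"


def mask_ipv4(address: str) -> str:
    """Keep the first two octets and mask the rest: `192.168.1.100` -> `192.168.x.x`."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {address!r}")
    return ".".join(octets[:2] + ["x", "x"])


def normalize_path(path: str) -> str:
    """Replace a leading `/home/<user>` with the `{HOME}` placeholder."""
    return re.sub(r"^/home/[^/]+", "{HOME}", path)
```

Hashing keeps identifiers correlatable across explanations without exposing the original value, whereas tokens are omitted outright because even a hash of a secret is undesirable in an export.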
**Redaction metadata:**
```json
{
  "redaction": {
    "applied": true,
    "level": "standard",
    "fields_redacted": ["actor.email", "evidence.file_path"],
    "redaction_policy": "stellaops.redaction.standard@v1"
  }
}
```
**Export modes:**
- `--redacted` (default): Apply standard redaction
- `--full`: Include all data (requires `explain:export:full` scope)
- `--audit`: Include redaction audit trail
### EX7: Size Budgets
**Limits:**
| Element | Default Limit | Configurable |
|---------|--------------|--------------|
| Explanation body | 256 KB | Yes |
| Decision chain entries | 100 | Yes |
| Evidence refs per rule | 20 | Yes |
| Total evidence refs | 200 | Yes |
| Path entries | 50 | No |
**Truncation behavior:**
When limits are exceeded:
1. Log warning with truncation details
2. Add `truncation` metadata to explanation
3. Store full evidence in separate CAS object
4. Include `full_evidence_uri` reference
```json
{
  "truncation": {
    "applied": true,
    "elements_truncated": ["decision_chain", "evidence_refs"],
    "full_evidence_uri": "cas://explanations/full/sha256:..."
  }
}
```
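
The truncation steps can be sketched as follows for two of the limits (decision chain entries and evidence refs per rule). This is an illustration under assumptions: `apply_size_budget` is a hypothetical name, keeping the first N elements is a guess at which elements survive, and the 256 KB body limit, the total evidence ref limit, logging, and the CAS write of the full evidence are omitted.

```python
import copy

MAX_CHAIN_ENTRIES = 100   # default "Decision chain entries" limit
MAX_REFS_PER_RULE = 20    # default "Evidence refs per rule" limit


def apply_size_budget(explanation: dict, full_evidence_uri: str) -> dict:
    """Return a trimmed copy, adding `truncation` metadata only when something was cut."""
    doc = copy.deepcopy(explanation)
    truncated = []

    chain = doc.get("decision_chain", [])
    if len(chain) > MAX_CHAIN_ENTRIES:
        chain = chain[:MAX_CHAIN_ENTRIES]
        doc["decision_chain"] = chain
        truncated.append("decision_chain")

    refs_cut = False
    for entry in chain:
        refs = entry.get("evidence_refs", [])
        if len(refs) > MAX_REFS_PER_RULE:
            entry["evidence_refs"] = refs[:MAX_REFS_PER_RULE]
            refs_cut = True
    if refs_cut:
        truncated.append("evidence_refs")

    if truncated:
        doc["truncation"] = {
            "applied": True,
            "elements_truncated": truncated,
            "full_evidence_uri": full_evidence_uri,
        }
    return doc
```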
### EX8: Versioning
**Schema versioning:**
- Schema version in `schema` field: `stellaops.explanation@v1`
- Breaking changes increment major version
- Minor changes (additive fields) use v1.x
- Backward compatibility maintained for 2 major versions
**Migration support:**
```bash
stella explain migrate --from v1 --to v2 --input ./explanations/
# Output:
# Migrating 1000 explanations from v1 to v2...
# Migrated: 998
# Skipped (already v2): 2
# Migration complete.
```
**Version compatibility matrix:**
| API Version | Schema v1 | Schema v2 |
|-------------|-----------|-----------|
| 1.0.x | Full | N/A |
| 1.1.x | Full | Full |
| 2.0.x | Read-only | Full |
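
For client code, the matrix can be encoded as a simple lookup. The sketch below is illustrative only: `schema_major` and `support_level` are hypothetical helpers, and accepting an optional minor suffix such as `@v1.2` is an assumption based on the "v1.x" wording above.

```python
import re
from typing import Optional

# Support level per (API release line, schema major version), from the matrix above.
COMPATIBILITY = {
    ("1.0", 1): "full",
    ("1.1", 1): "full",
    ("1.1", 2): "full",
    ("2.0", 1): "read-only",
    ("2.0", 2): "full",
}


def schema_major(schema: str) -> int:
    """Extract the major version from e.g. `stellaops.explanation@v1`."""
    match = re.fullmatch(r"stellaops\.explanation@v(\d+)(?:\.\d+)?", schema)
    if match is None:
        raise ValueError(f"unrecognized schema identifier: {schema!r}")
    return int(match.group(1))


def support_level(api_version: str, schema: str) -> Optional[str]:
    """Return "full", "read-only", or None when the combination is unsupported."""
    release_line = ".".join(api_version.split(".")[:2])
    return COMPATIBILITY.get((release_line, schema_major(schema)))
```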
### EX9: Golden Fixtures/Tests
**Test fixture location:**
```
tests/Explanation/
  fixtures/
    simple-affected.json
    simple-not-affected.json
    with-reachability-evidence.json
    multi-rule-chain.json
    truncated-evidence.json
    redacted-pii.json
  golden/
    simple-affected.golden.json
    simple-affected.golden.dsse
datasets/explanations/
  schema/
    explanation.schema.json
  samples/
    log4j-affected/
      explanation.json
      expected-hash.txt
```
**Test categories:**
1. **Canonicalization tests:** Verify hash stability across JSON reordering
2. **DSSE signing tests:** Verify signature creation and verification
3. **Redaction tests:** Verify PII handling
4. **Truncation tests:** Verify size budget enforcement
5. **Replay tests:** Verify bundle export/import cycle
6. **Migration tests:** Verify version upgrade paths
**CI integration:**
```yaml
# .gitea/workflows/explanation-tests.yml
explanation-tests:
  runs-on: ubuntu-latest
  steps:
    - name: Run explanation tests
      run: dotnet test src/Policy/__Tests/StellaOps.Policy.Explanation.Tests
    - name: Verify golden fixtures
      run: scripts/verify-golden-fixtures.sh tests/Explanation/golden/
```
### EX10: Determinism Guarantees
**Determinism requirements:**
1. Same inputs produce identical `explanation_id` hash
2. Decision chain ordering is stable (execution order)
3. Evidence refs sorted alphabetically
4. Timestamps use UTC ISO-8601 with millisecond precision
5. Floating-point values rounded to 6 decimal places
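
Requirements 4 and 5 are the ones most easily broken by default serializers, so a small normalization pass before canonicalization helps. The following sketch uses hypothetical helper names; note that `isoformat(timespec="milliseconds")` truncates rather than rounds sub-millisecond digits, and whether the real generator truncates or rounds is not stated here.

```python
from datetime import datetime, timezone


def format_timestamp(moment: datetime) -> str:
    """UTC ISO-8601 with millisecond precision, e.g. 2025-12-13T10:00:00.000Z."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; pass an aware datetime")
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def round_floats(value):
    """Recursively round every float in a JSON-like structure to 6 decimal places."""
    if isinstance(value, float):
        return round(value, 6)
    if isinstance(value, dict):
        return {key: round_floats(item) for key, item in value.items()}
    if isinstance(value, list):
        return [round_floats(item) for item in value]
    return value
```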
**Verification:**
```bash
# Run twice with same inputs, verify identical hashes
stella explain generate --finding "..." --output a.json
stella explain generate --finding "..." --output b.json
diff a.json b.json # Should be empty
# Or use built-in verify
stella explain verify-determinism --finding "..." --iterations 3
```
---
## 3. API Reference
### 3.1 Generate Explanation
```http
POST /api/policy/findings/{findingId}/explain
Authorization: Bearer <token>
Content-Type: application/json

{
  "mode": "full",
  "include_evidence": true,
  "redaction_level": "standard"
}
```
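
From a client, the request above could be assembled with the Python standard library as sketched below. The base URL is a placeholder, `build_explain_request` is a hypothetical helper, and percent-encoding the finding ID in the path is an assumption (the ID contains `:`, `/`, and `@`); the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_explain_request(base_url: str, finding_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for generating an explanation."""
    # Assumption: the finding ID is percent-encoded so it fits in one path segment.
    encoded_id = urllib.parse.quote(finding_id, safe="")
    url = f"{base_url.rstrip('/')}/api/policy/findings/{encoded_id}/explain"
    body = json.dumps(
        {"mode": "full", "include_evidence": True, "redaction_level": "standard"}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response shape is not documented in this section, so it is not parsed here.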
### 3.2 Get Explanation
```http
GET /api/explanations/{explanationId}
Authorization: Bearer <token>
Accept: application/json
```
### 3.3 Export Explanation Bundle
```http
POST /api/explanations/export
Authorization: Bearer <token>
Content-Type: application/json

{
  "finding_ids": ["...", "..."],
  "include_dependencies": true,
  "redaction_level": "standard"
}
```
### 3.4 Verify Explanation
```http
POST /api/explanations/{explanationId}/verify
Authorization: Bearer <token>
```
---
## 4. CLI Reference
```bash
# Generate explanation for a finding
stella explain generate --finding "P-7:S-42:pkg:maven/log4j@2.14.1:CVE-2021-44228"
# Export explanation bundle
stella explain export --findings ./finding-ids.txt --output ./bundle.tgz
# Verify explanation
stella explain verify --explanation ./explanation.json --dsse ./explanation.dsse
# Verify bundle
stella explain verify --bundle ./bundle.tgz
# Check determinism
stella explain verify-determinism --finding "..." --iterations 5
```
---
## 5. Related Documentation
- [Function-Level Evidence](./function-level-evidence.md) - Evidence chain guide
- [Graph Revision Schema](./graph-revision-schema.md) - Graph versioning
- [Policy API](../api/policy.md) - Policy Engine REST API
- [DSSE Predicates](../modules/attestor/architecture.md) - Signing specifications
---
_Last updated: 2025-12-13. See Sprint 0401 EXPLAIN-GAPS-401-064 for change history._