Below are implementation-grade guidelines for Stella Ops Product Managers (PMs) and Development Managers (Eng Managers / Tech Leads) for two tightly coupled capabilities:

1. **Exception management as auditable objects** (not suppression files)
2. **Audit packs** (exportable, verifiable evidence bundles for releases and environments)

The intent is to make these capabilities:

* operationally useful (reduce friction in CI/CD and runtime governance),
* defensible in audits (tamper-evident, attributable, time-bounded), and
* consistent with Stella Ops’ positioning around determinism, evidence, and replayability.

---

# 1. Shared objectives and boundaries

## 1.1 Objectives

These two capabilities must jointly ensure that:

* **Risk decisions are explicit**: Every “ignore/suppress/waive” is a governed decision with an owner and expiry.
* **Decisions are replayable**: If an auditor asks “why did you ship this on date X?”, Stella Ops can reproduce the decision using the same policy + evidence + knowledge snapshot.
* **Decisions are exportable and verifiable**: Audit packs include the minimum necessary artifacts and a manifest that allows independent verification of integrity and completeness.
* **Operational friction is reduced**: Teams can ship safely with controlled exceptions, rather than ad-hoc suppressions, while retaining accountability.

## 1.2 Out of scope (explicitly)

Avoid scope creep early. The following are out of scope for v1 unless mandated by a target customer:

* Full GRC mapping to specific frameworks (you can *support evidence*; don’t claim compliance).
* Fully automated approvals based on HR org charts.
* Multi-year archival systems (start with retention, export, and immutable event logs).
* A “ticketing system replacement.” Integrate with ticketing; don’t rebuild it.

---

# 2. Shared design principles (non-negotiables)

These principles apply to both Exception Objects and Audit Packs:

1. **Attribution**: every action has an authenticated actor identity (human or service), a timestamp, and a reason.
2. **Immutability of history**: edits are new versions/events; never rewrite history in place.
3. **Least privilege scope**: exceptions must be as narrow as possible (artifact digest over tag; component purl over “any”; environment constraints).
4. **Time-bounded risk**: exceptions must expire. “Permanent ignore” is a governance smell.
5. **Deterministic evaluation**: given the same policy + snapshot + exceptions + inputs, the outcome is stable and reproducible.
6. **Separation of concerns**:
   * Exception store = governed decisions.
   * Scanner = evidence producer.
   * Policy engine = deterministic evaluator.
   * Audit packer = exporter/assembler/verifier.

---

# 3. Exception management as auditable objects

## 3.1 What an “Exception Object” is

An Exception Object is a structured, versioned record that modifies evaluation behavior *in a controlled manner*, while leaving the underlying findings intact.

It is not:

* a local `.ignore` file,
* a hidden suppression rule,
* a UI-only toggle,
* a vendor-specific “ignore list” with no audit trail.

### Exception types you should support (minimum set)

PMs should start with these canonical types:

1. **Vulnerability exception**
   * suppress/waive a specific vulnerability finding (e.g., CVE/CWE) under defined scope.
2. **Policy exception**
   * allow a policy rule to be bypassed under defined scope (e.g., “allow unsigned artifact for dev namespace”).
3. **Unknown-state exception** (if Stella Ops models unknowns)
   * allow a release despite unresolved unknowns, with explicit risk acceptance.
4. **Component exception**
   * allow/deny a component/package/version across a domain, again with explicit scope and expiry.

## 3.2 Required fields and schema guidelines

PMs: mandate these fields. Eng: enforce them at the API and storage level.

### Required fields (v1)

* **exception_id** (stable identifier)
* **version** (monotonic; or event-sourced)
* **status**: proposed | approved | active | expired | revoked
* **owner** (accountable person/team)
* **requester** (who initiated)
* **approver(s)** (who approved; may be empty for dev environments depending on policy)
* **created_at / updated_at / approved_at / expires_at**
* **scope** (see below)
* **reason_code** (taxonomy)
* **rationale** (free text, required)
* **evidence_refs** (optional in v1 but strongly recommended)
* **risk_acceptance** (explicit boolean or structured “risk accepted” block)
* **links** (ticket ID, PR, incident, vendor advisory reference) – optional but useful
* **audit_log_refs** (implicit if event-sourced)

### Scope model (critical to defensibility)

Scope must be structured and narrowable. Provide scope dimensions such as:

* **Artifact scope**: image digest, SBOM digest, build provenance digest (preferred). Avoid tags as primary scope unless paired with immutability constraints.
* **Component scope**: purl + version range + ecosystem
* **Vulnerability scope**: CVE ID(s), GHSA, internal ID; optionally path/function/symbol constraints
* **Environment scope**: cluster/namespace, runtime env (dev/stage/prod), repository, project, tenant
* **Time scope**: expires_at (required), optional “valid_from”

PM guideline: the default UI and API should encourage digest-based scope and warn on broad scopes.
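To make the field list and scope model above concrete, here is a minimal sketch of an Exception Object as Python dataclasses. Field and status names mirror the lists in 3.2; the class names, type choices, and the `is_applicable` helper are illustrative assumptions, not a prescribed Stella Ops schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Status(str, Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"


@dataclass(frozen=True)
class Scope:
    """Structured, narrowable scope; digest-based artifact scope is preferred over tags."""
    artifact_digests: tuple[str, ...] = ()      # image / SBOM / provenance digests
    component_purls: tuple[str, ...] = ()       # purl + version range + ecosystem
    vulnerability_ids: tuple[str, ...] = ()     # CVE / GHSA / internal IDs
    environments: tuple[str, ...] = ()          # e.g. "prod/cluster-a/ns-payments"
    valid_from: Optional[datetime] = None


@dataclass(frozen=True)
class ExceptionObject:
    exception_id: str
    version: int                                # monotonic; every change is a new version
    status: Status
    owner: str
    requester: str
    approvers: tuple[str, ...]                  # may be empty for dev scopes, per policy
    created_at: datetime
    updated_at: datetime
    expires_at: datetime                        # mandatory: "permanent ignore" is not representable
    scope: Scope
    reason_code: str                            # from the reason-code taxonomy (section 3.3)
    rationale: str                              # free text, required
    risk_accepted: bool = False
    evidence_refs: tuple[str, ...] = ()         # references with integrity checks (hash/digest)
    links: tuple[str, ...] = ()                 # ticket / PR / incident / advisory references
    approved_at: Optional[datetime] = None

    def is_applicable(self, now: datetime) -> bool:
        """Expired or not-yet-valid exceptions must stop applying automatically (see 3.5)."""
        if self.status is not Status.ACTIVE or now >= self.expires_at:
            return False
        return self.scope.valid_from is None or now >= self.scope.valid_from
```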
## 3.3 Reason codes (taxonomy)

Reason codes are a moat because they enable governance analytics and policy automation.

Minimum suggested taxonomy:

* **FALSE_POSITIVE** (with evidence expectations)
* **NOT_REACHABLE** (reachability proof preferred)
* **NOT_AFFECTED** (VEX-backed preferred)
* **BACKPORT_FIXED** (package/distro evidence preferred)
* **COMPENSATING_CONTROL** (link to control evidence)
* **RISK_ACCEPTED** (explicit sign-off)
* **TEMPORARY_WORKAROUND** (link to mitigation plan)
* **VENDOR_PENDING** (under investigation)
* **BUSINESS_EXCEPTION** (rare; requires stronger approval)

PM guideline: reason codes must be selectable and reportable; do not allow “Other” as the default.

## 3.4 Evidence attachments

Exceptions should evolve from “justification-only” to “justification + evidence.”

Evidence references can point to:

* VEX statements (OpenVEX/CycloneDX VEX)
* reachability proof fragments (call-path subgraph, symbol references)
* distro advisories / patch references
* internal change tickets / mitigation PRs
* runtime mitigations

Eng guideline: store evidence as references with integrity checks (hash/digest). For v2+, store evidence bundles as content-addressed blobs.

## 3.5 Lifecycle and workflows

### Lifecycle states and transitions

* **Proposed** → **Approved** → **Active** → (**Expired** or **Revoked**)
* **Renewal** should create a **new version** (never extend an old record silently).

### Approvals

PM guideline:

* Support at least two approval modes:
  1. **Self-approved** (allowed only for dev/experimental scopes)
  2. **Two-person review** (required for prod or broad scope)

Eng guideline:

* Enforce approval rules via policy config (not hard-coded).
* Record every approval action with actor identity and timestamp.

### Expiry enforcement

Non-negotiable:

* Expired exceptions must stop applying automatically.
* Renewals require an explicit action and a new audit trail.

## 3.6 Evaluation semantics (how exceptions affect results)

This is where most products become non-auditable. You need deterministic, explicit rules.

PM guideline: define precedence clearly:

* The policy engine evaluates baseline findings → applies exceptions → produces a verdict.
* Exceptions never delete underlying findings; they alter the *decision outcome* and annotate the reasoning.

Eng guideline: exception application must be:

* **Deterministic** (stable ordering rules)
* **Transparent** (the verdict includes “exception applied: exception_id, reason_code, scope match explanation”)
* **Scoped** (the match explanation must state which scope dimensions matched)
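The following sketch illustrates these semantics, reusing the `ExceptionObject`/`Scope` shapes sketched in 3.2: findings are never deleted, exceptions are applied in a stable order, and every applied exception is reported with the scope dimensions that matched. The finding shape, function names, and the deliberately simplified pass/fail rule are assumptions for illustration, not the actual policy-engine contract.

```python
from datetime import datetime, timezone


def matching_dimensions(exc, finding) -> list[str]:
    """Return the scope dimensions that matched; an empty list means the exception does not apply.
    Only three dimensions are checked here; a real matcher also covers component, time, etc."""
    matched = []
    if exc.scope.artifact_digests:
        if finding["artifact_digest"] not in exc.scope.artifact_digests:
            return []
        matched.append("artifact")
    if exc.scope.vulnerability_ids:
        if finding["vuln_id"] not in exc.scope.vulnerability_ids:
            return []
        matched.append("vulnerability")
    if exc.scope.environments:
        if finding["environment"] not in exc.scope.environments:
            return []
        matched.append("environment")
    return matched


def evaluate(findings, exceptions, now=None):
    """Baseline findings -> apply exceptions -> verdict. Findings are annotated, never removed."""
    now = now or datetime.now(timezone.utc)
    # Stable ordering: results must not depend on insertion order or transient feed order.
    ordered = sorted(exceptions, key=lambda e: (e.exception_id, e.version))
    results = []
    for finding in sorted(findings, key=lambda f: (f["artifact_digest"], f["vuln_id"])):
        applied = []
        for exc in ordered:
            if not exc.is_applicable(now):
                continue  # expired/revoked exceptions never apply
            dims = matching_dimensions(exc, finding)
            if dims:
                applied.append({
                    "exception_id": exc.exception_id,
                    "reason_code": exc.reason_code,
                    "matched_scope": dims,      # transparent scope-match explanation
                })
        results.append({
            "finding": finding,                 # underlying finding preserved intact
            "suppressed": bool(applied),
            "exceptions_applied": applied,
        })
    # Simplified rule for illustration: a real policy also applies severity thresholds etc.
    verdict = "pass" if all(r["suppressed"] for r in results) else "fail"
    return {"verdict": verdict, "results": results}
```

Because ordering is explicit and expired exceptions are filtered before matching, the same inputs always yield the same annotated results, which is what makes the verdict replayable.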
## 3.7 Auditability requirements

Exception management must be audit-ready by construction. Minimum requirements:

* **Append-only event log** for create/approve/revoke/expire/renew actions
* **Versioning**: every change results in a new version or event
* **Tamper-evidence**: hash chain events or sign event batches
* **Retention**: define retention policy and export strategy

PM guideline: auditors will ask “who approved,” “why,” “when,” “what scope,” and “what changed since.” Design the UX and exports to answer those in minutes.
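One way to meet the tamper-evidence requirement above is a hash chain over the append-only event log, as in the sketch below. It is a minimal illustration under stated assumptions (persistence, signing of event batches, and key management are omitted); the event fields follow this section, while the function names are invented for the example.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical(event: dict) -> bytes:
    """Deterministic serialization: sorted keys, no whitespace variance."""
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode()


def append_event(log: list[dict], action: str, actor: str, exception_id: str, details: dict) -> dict:
    """Append a create/approve/revoke/expire/renew event, chained to the previous event's hash."""
    prev_hash = log[-1]["event_hash"] if log else "0" * 64
    event = {
        "sequence": len(log),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,                 # create | approve | revoke | expire | renew
        "actor": actor,                   # authenticated identity, human or service
        "exception_id": exception_id,
        "details": details,
        "prev_hash": prev_hash,
    }
    event["event_hash"] = hashlib.sha256(canonical(event)).hexdigest()
    log.append(event)
    return event


def verify_chain(log: list[dict]) -> bool:
    """Recompute the chain; any in-place edit or deletion breaks verification."""
    prev_hash = "0" * 64
    for event in log:
        body = {k: v for k, v in event.items() if k != "event_hash"}
        if body["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(canonical(body)).hexdigest() != event["event_hash"]:
            return False
        prev_hash = event["event_hash"]
    return True
```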
## 3.8 UX guidelines

Key UX flows:

* **Create exception from a finding** (pre-fill CVE/component/artifact scope)
* **Preview impact** (“this will suppress 37 findings across 12 images; are you sure?”)
* **Expiry visibility** (countdown, alerts, renewal prompts)
* **Audit trail view** (who did what, with diffs between versions)
* **Search and filters** by owner, reason, expiry window, scope breadth, environment

UX anti-patterns to forbid:

* “Ignore all vulnerabilities in this image” with one click
* Silent suppressions without owner/expiry
* Exceptions created without linking to scope and reason

## 3.9 Product acceptance criteria (PM-owned)

A feature is not “done” until:

* Every exception has owner, expiry, reason code, scope.
* Exception history is immutable and exportable.
* Policy outcomes show applied exceptions and why.
* Expiry is enforced automatically.
* A user can answer “What exceptions were active for this release?” within 2 minutes.

---

# 4. Audit packs

## 4.1 What an audit pack is

An Audit Pack is a **portable, verifiable bundle** that answers:

* What was evaluated? (artifacts, versions, identities)
* Under what policies? (policy version/config)
* Using what knowledge state? (vuln DB snapshot, VEX inputs)
* What exceptions were applied? (IDs, owners, rationales)
* What was the decision and why? (verdict + evidence pointers)
* What changed since the last release? (optional diff summary)

PM guideline: treat the Audit Pack as a product deliverable, not an export button.

## 4.2 Pack structure (recommended)

Use a predictable, documented layout. Example:

* `manifest.json`
  * pack_id, generated_at, generator_version
  * hashes/digests of every included file
  * signing info (optional in v1; recommended soon)
* `inputs/`
  * artifact identifiers (digests), repo references (optional)
  * SBOM(s) (CycloneDX/SPDX)
* `vex/`
  * VEX docs used + any VEX produced
* `policy/`
  * policy bundle used (versioned)
  * evaluation settings
* `exceptions/`
  * all exceptions relevant to the evaluated scope
  * plus event logs / versions
* `findings/`
  * normalized findings list
  * reachability evidence fragments if applicable
* `verdict/`
  * final decision object
  * explanation summary
  * signed attestation (if supported)
* `diff/` (optional)
  * delta from prior baseline (what changed materially)

## 4.3 Formats: human and machine

You need both:

* **Machine-readable** (JSON + standard SBOM/VEX formats) for verification and automation
* **Human-readable** summary (HTML or PDF) for auditors and leadership

PM guideline: machine artifacts are the source of truth. Human docs are derived views.

Eng guideline:

* Ensure the pack can be generated **offline**.
* Ensure deterministic outputs where feasible (stable ordering, consistent serialization).

## 4.4 Integrity and verification

At minimum:

* `manifest.json` includes a digest for each file.
* Provide a `stella verify-pack` CLI that checks:
  * manifest integrity
  * file hashes
  * schema versions
  * optional signature verification

For v2:

* Sign the manifest (and/or the verdict) using your standard attestation mechanism.
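The sketch below makes these integrity requirements concrete: it builds a `manifest.json` with a digest per file and then re-verifies a pack directory against it, roughly the integrity-and-completeness portion of what a `stella verify-pack` CLI would check (schema and signature validation are omitted). Helper names and any manifest keys beyond those listed in 4.2 are assumptions for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def file_digest(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def build_manifest(pack_dir: Path, pack_id: str, generator_version: str) -> dict:
    """Digest every file in the pack (except the manifest itself), with stable ordering."""
    files = {
        str(p.relative_to(pack_dir)): file_digest(p)
        for p in sorted(pack_dir.rglob("*"))
        if p.is_file() and p.name != "manifest.json"
    }
    manifest = {
        "pack_id": pack_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "generator_version": generator_version,
        "schema_version": "1.0",        # assumed key; pack schema must be versioned (4.8)
        "files": files,
    }
    (pack_dir / "manifest.json").write_text(json.dumps(manifest, sort_keys=True, indent=2))
    return manifest


def verify_pack(pack_dir: Path) -> list[str]:
    """Return a list of integrity errors; an empty list means the pack verifies."""
    errors = []
    manifest = json.loads((pack_dir / "manifest.json").read_text())
    for rel_path, expected in manifest["files"].items():
        path = pack_dir / rel_path
        if not path.is_file():
            errors.append(f"missing file: {rel_path}")
        elif file_digest(path) != expected:
            errors.append(f"digest mismatch: {rel_path}")
    # Completeness: files present on disk but absent from the manifest are also flagged.
    listed = set(manifest["files"])
    for p in sorted(pack_dir.rglob("*")):
        if p.is_file() and p.name != "manifest.json" and str(p.relative_to(pack_dir)) not in listed:
            errors.append(f"unlisted file: {p.relative_to(pack_dir)}")
    return errors
```

Sorting file paths and serializing with sorted keys keeps the manifest deterministic apart from the `generated_at` timestamp, in line with the “deterministic outputs where feasible” guideline in 4.3.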
## 4.5 Confidentiality and redaction

Audit packs often include sensitive data (paths, internal package names, repo URLs).

PM guideline:

* Provide **redaction profiles**:
  * external auditor pack (minimal identifiers)
  * internal audit pack (full detail)
* Provide encryption options (password/recipient keys) if packs leave the environment.

Eng guideline:

* Redaction must be deterministic and declarative (policy-based).
* Pack generation must not leak secrets from raw scan logs.

## 4.6 Pack generation workflow

Key product flows:

* Generate a pack for:
  * a specific artifact digest
  * a release (set of digests)
  * an environment snapshot (e.g., cluster inventory)
  * a date range (for an audit period)
* Trigger sources:
  * UI
  * API
  * CI pipeline step

Engineering:

* Treat pack generation as an async job (queue + status endpoint).
* Cache pack components when inputs are identical (avoid repeated work).

## 4.7 What must be included (minimum viable audit pack)

PMs should enforce that v1 includes:

* Artifact identity
* SBOM(s) or component inventory
* Findings list (normalized)
* Policy bundle reference + policy content
* Exceptions applied (full object + version info)
* Final verdict + explanation summary
* Integrity manifest with file hashes

Add these when available (v1.5+):

* VEX inputs and outputs
* Knowledge snapshot references
* Reachability evidence fragments
* Diff summary vs prior release

## 4.8 Product acceptance criteria (PM-owned)

Audit Packs are not “done” until:

* A third party can validate that the pack contents haven’t been altered (hash verification).
* The pack answers “why did this pass/fail?” including exceptions applied.
* Packs can be generated without external network calls (air-gap friendly).
* Packs support redaction profiles.
* The pack schema is versioned and backward compatible.

---

# 5. Cross-cutting: roles, responsibilities, and delivery checkpoints

## 5.1 Responsibilities

**Product Manager**

* Define exception types and required fields
* Define reason code taxonomy and governance policies
* Define approval rules by environment and scope breadth
* Define audit pack templates, profiles, and export targets
* Own acceptance criteria and audit usability testing

**Development Manager / Tech Lead**

* Own the event model (immutability, versioning, retention)
* Own policy evaluation semantics and determinism guarantees
* Own integrity and signing design (manifest hashes, optional signatures)
* Own performance and scalability targets (pack generation and query latency)
* Own secure storage and access controls (RBAC, tenant isolation)

## 5.2 Deliverables checklist (for each capability)

For “Exception Objects”:

* PRD + threat model (abuse cases: blanket waivers, privilege escalation)
* Schema spec + versioning policy
* API endpoints + RBAC model
* UI flows + audit trail UI
* Policy engine semantics + test vectors
* Metrics dashboards

For “Audit Packs”:

* Pack schema spec + folder layout
* Manifest + hash verification rules
* Generator service + async job API
* Redaction profiles + tests
* Verifier CLI + documentation
* Performance benchmarks + caching strategy

---

# 6. Common failure modes to actively prevent

1. **Exceptions become suppressions again**
   If you allow exceptions without expiry/owner or without an audit trail, you’ve rebuilt “ignore lists.”
2. **Over-broad scopes by default**
   If “all repos/all images” is easy, you will accumulate permanent waivers and lose credibility.
3. **No deterministic semantics**
   If the same artifact can pass/fail depending on evaluation order or transient feed updates, auditors will distrust outputs.
4. **Audit packs that are reports, not evidence**
   A PDF without machine-verifiable artifacts is not an audit pack; it’s a slide.
5. **No renewal discipline**
   If renewals are frictionless and don’t require re-justification, exceptions never die.

---

# 7. Recommended phased rollout (to manage build cost)

**Phase 1: Governance basics**

* Exception object schema + lifecycle + expiry enforcement
* Create-from-finding UX
* Audit pack v1 (SBOM/inventory + findings + policy + exceptions + manifest)

**Phase 2: Evidence binding**

* Evidence refs on exceptions (VEX, reachability fragments)
* Pack includes VEX inputs/outputs and knowledge snapshot identifiers

**Phase 3: Verifiable trust**

* Signed verdicts and/or signed pack manifests
* Verifier tooling and deterministic replay hooks

---

Two follow-on artifacts will make these guidelines immediately executable:

1. A concise **PRD template** (sections + required decisions) for Exceptions and Audit Packs
2. A **technical spec outline** (schema definitions, endpoints, state machines, and acceptance test vectors)