Files
git.stella-ops.org/src/StellaOps.Excititor.Attestation/EXCITITOR-ATTEST-01-003-plan.md
Vladimir Moushkov 224c76c276
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
feat(rust): Implement RustCargoLockParser and RustFingerprintScanner
- Added RustCargoLockParser to parse Cargo.lock files and extract package information.
- Introduced RustFingerprintScanner to scan for Rust fingerprint records in JSON files.
- Created test fixtures for Rust language analysis, including Cargo.lock and fingerprint JSON files.
- Developed tests for RustLanguageAnalyzer to ensure deterministic output based on provided fixtures.
- Added expected output files for both simple and signed Rust applications.
2025-10-22 18:11:01 +03:00

158 lines
12 KiB
Markdown

# EXCITITOR-ATTEST-01-003 - Verification & Observability Plan
- **Date:** 2025-10-19
- **Status:** In progress (2025-10-22)
- **Owner:** Team Excititor Attestation
- **Related tasks:** EXCITITOR-ATTEST-01-003 (Wave 0), EXCITITOR-WEB-01-003/004, EXCITITOR-WORKER-01-003
- **Prerequisites satisfied:** EXCITITOR-ATTEST-01-002 (Rekor v2 client integration)
## 1. Objectives
1. Provide deterministic attestation verification helpers consumable by Excititor WebService (`/excititor/verify`, `/excititor/export*`) and Worker re-verification loops.
2. Surface structured diagnostics for success, soft failures, and hard failures (signature mismatch, Rekor gaps, artifact digest drift).
3. Emit observability signals (logs, metrics, optional tracing) that can run offline and degrade gracefully when transparency services are unreachable.
4. Add regression tests (unit + integration) covering positive path, negative path, and offline fallback scenarios.
## 2. Deliverables
- `IVexAttestationVerifier` abstraction + `VexAttestationVerifier` implementation inside `StellaOps.Excititor.Attestation`, encapsulating DSSE validation, predicate checks, artifact digest confirmation, Rekor inclusion verification, and deterministic diagnostics.
- DI wiring (extension method) for registering verifier + instrumentation dependencies alongside the existing signer/rekor client.
- Shared `VexAttestationDiagnostics` record describing normalized diagnostic keys consumed by Worker/WebService logging.
- Metrics utility (`AttestationMetrics`) exposing counters/histograms via `System.Diagnostics.Metrics`, exported under `StellaOps.Excititor.Attestation` meter.
- Activity source (`AttestationActivitySource`) for optional tracing spans around sign/verify operations.
- Documentation updates (`EXCITITOR-ATTEST-01-003-plan.md`, `TASKS.md` notes) describing instrumentation + test expectations.
- Test coverage in `StellaOps.Excititor.Attestation.Tests` (unit) and scaffolding notes for WebService/Worker integration tests.
## 3. Verification Flow
### 3.1 Inputs
- `VexAttestationRequest` from Core (contains export identifiers, artifact digest, metadata, source providers).
- Optional Rekor reference from previous signing (`VexAttestationMetadata.Rekor`).
- Configured policies (tolerated clock skew, Rekor verification toggle, offline mode flag, maximum metadata drift).
### 3.2 Steps
1. **Envelope decode** - retrieve DSSE envelope + predicate from storage (Worker) or request payload (WebService), canonicalize JSON, compute digest, compare with metadata `envelopeDigest`.
2. **Subject validation** - ensure subject digest matches exported artifact digest (algorithm & value) and export identifier matches `request.ExportId`.
3. **Signature verification** - delegate to signer/verifier abstraction (cosign/x509) using configured trust anchors; record `signature_state` diagnostic (verified, skipped_offline, failed).
4. **Provenance checks** - confirm predicate type (`https://stella-ops.org/attestations/vex-export`) and metadata shape; enforce deterministic timestamp tolerance.
5. **Transparency log** - if Rekor reference present and verification enabled, call `ITransparencyLogClient.VerifyAsync` with retry/backoff budget; support offline bypass with diagnostic `rekor_state=unreachable`.
6. **Result aggregation** - produce `VexAttestationVerification` containing `IsValid` flag and diagnostics map (includes `failure_reason` when invalid).
### 3.3 Failure Categories & Handling
| Category | Detection | Handling |
|---|---|---|
| Signature mismatch | Signer verification failure or subject digest mismatch | Mark invalid, emit warning log, increment `verify.failed` counter with `reason=signature_mismatch`. |
| Rekor absence/stale | Rekor verify returns false | Mark invalid unless offline mode configured; log with correlation ID; `reason=rekor_missing`. |
| Predicate schema drift | Predicate type or required fields missing | Mark invalid, include `reason=predicate_invalid`. |
| Time skew | `signedAt` older than policy threshold | Mark invalid (hard) or warn (soft) per options; include `reason=stale_attestation`. |
| Unexpected metadata | Unknown export format, provider mismatch | Mark invalid; `reason=metadata_mismatch`. |
| Offline Rekor | HTTP client throws | Mark soft failure if `AllowOfflineTransparency` true; degrade metrics with `rekor_state=offline`. |
## 4. Observability
### 4.1 Metrics (Meter name: `StellaOps.Excititor.Attestation`)
| Metric | Type | Dimensions | Description |
|---|---|---|---|
| `stellaops.excititor.attestation.verify.total` | Counter<long> | `result` (`success`/`failure`/`soft_failure`), `component` (`webservice`/`worker`), `reverify` (`true`/`false`) | Counts verification attempts. |
| `stellaops.excititor.attestation.verify.duration.ms` | Histogram<double> | `component`, `result` | Measures end-to-end verification latency. |
| `stellaops.excititor.attestation.verify.rekor.calls` | Counter<long> | `result` (`verified`/`unreachable`/`skipped`) | Rekor verification outcomes. |
| `stellaops.excititor.attestation.verify.cache.hit` | Counter<long> | `hit` (`true`/`false`) | Tracks reuse of cached verification results (Worker loop). |
Metrics must register via static helper using `Meter` and support offline operation (no exporter dependency). Histogram records double milliseconds; use `Stopwatch.GetElapsedTime` for monotonic timing.
### 4.2 Logging
- Use structured logs (`ILogger<VexAttestationVerifier>`) with event IDs: `AttestationVerified` (Information), `AttestationVerificationFailed` (Warning), `AttestationVerificationError` (Error).
- Include correlation ID (`request.QuerySignature.Value`), `exportId`, `envelopeDigest`, `rekorLocation`, `reason`, and `durationMs`.
- Avoid logging private keys or full envelope; log envelope digest only. For debug builds, gate optional envelope JSON behind `LogLevel.Trace` and configuration flag.
### 4.3 Tracing
- Activity source name `StellaOps.Excititor.Attestation` with spans `attestation.verify` (parent from WebService request or Worker job) including tags: `stellaops.export_id`, `stellaops.result`, `stellaops.rekor.state`.
- Propagate Activity through Rekor client via `HttpClient` instrumentation (auto instrumentation available).
## 5. Integration Points
### 5.1 WebService
- Inject `IVexAttestationVerifier` into export endpoints and `/excititor/verify` handler.
- Persist verification result diagnostics alongside response payload for deterministic clients.
- Return HTTP 200 with `{ valid: true }` when verified; 409 for invalid attestation with diagnostics JSON; 503 when Rekor unreachable and offline override disabled.
- Add caching for idempotent verification (e.g., by envelope digest) to reduce Rekor calls and surface via metrics.
### 5.2 Worker
- Schedule background job (`EXCITITOR-WORKER-01-003`) to re-verify stored attestations on TTL (default 12h) using new verifier; on failure, flag export for re-sign and notify via event bus (future task).
- Emit logs/metrics with `component=worker`; include job IDs and next scheduled run.
- Provide cancellation-aware loops (respect `CancellationToken`) and deterministic order (sorted by export id).
### 5.3 Storage / Cache Hooks
- Store latest verification status and diagnostics in attestation metadata collection (Mongo) keyed by `envelopeDigest` + `artifactDigest` to avoid duplicate work.
- Expose read API (via WebService) for clients to fetch last verification timestamp + result.
## 6. Test Strategy
### 6.1 Unit Tests (`StellaOps.Excititor.Attestation.Tests`)
- `VexAttestationVerifierTests.VerifyAsync_Succeeds_WhenSignatureAndRekorValid` - uses fake signer/verifier + in-memory Rekor client returning success.
- `...ReturnsSoftFailure_WhenRekorOfflineAndAllowed` - ensure `IsValid=true`, diagnostic `rekor_state=offline`, metric increments `result=soft_failure`.
- `...Fails_WhenDigestMismatch` - ensures invalid result, log entry recorded, metrics increment `result=failure` with `reason=signature_mismatch`.
- `...Fails_WhenPredicateTypeUnexpected` - invalid with `reason=predicate_invalid`.
- `...RespectsCancellation` - cancellation token triggered before Rekor call results in `OperationCanceledException` and no metrics increments beyond started attempt.
### 6.2 WebService Integration Tests (`StellaOps.Excititor.WebService.Tests`)
- `VerifyEndpoint_Returns200_OnValidAttestation` - mocks verifier to return success, asserts response payload, metrics stub invoked.
- `VerifyEndpoint_Returns409_OnInvalid` - invalid diag forwarded, ensures logging occurs.
- `ExportEndpoint_IncludesVerificationDiagnostics` - ensures signed export responses include last verification metadata.
### 6.3 Worker Tests (`StellaOps.Excititor.Worker.Tests`)
- `ReverificationJob_RequeuesOnFailure` - invalid result triggers requeue/backoff.
- `ReverificationJob_PersistsStatusAndMetrics` - success path updates repository & metrics.
### 6.4 Determinism/Regression
- Golden test verifying that identical inputs produce identical diagnostics dictionaries (sorted keys).
- Ensure metrics dimensions remain stable via snapshot test (e.g., capturing tags in fake meter listener).
## 7. Implementation Sequencing
1. Introduce verifier abstraction + implementation with basic tests (signature + Rekor success/failure).
2. Add observability helpers (metrics, activity, logging) and wire into verifier; extend tests to assert instrumentation (using in-memory listener/log sink).
3. Update WebService DI/service layer to use verifier; craft endpoint integration tests.
4. Update Worker scheduling code to call verifier & emit metrics.
5. Wire persistence/caching and document configuration knobs (retry, offline, TTL).
6. Finalize documentation (architecture updates, runbook entries) before closing task.
## 8. Configuration Defaults
- `AttestationVerificationOptions` (new): `RequireRekor=true`, `AllowOfflineTransparency=false`, `MaxClockSkew=PT5M`, `ReverifyInterval=PT12H`, `CacheWindow=PT1H`.
- Options bind from configuration section `Excititor:Attestation` across WebService/Worker; offline kit ships defaults.
## 9. Open Questions
- Should verification gracefully accept legacy predicate types (pre-1.0) or hard fail? (Proposed: allow via allowlist with warning diagnostics.)
- Do we need cross-module eventing when verification fails (e.g., notify Export module) or is logging sufficient in Wave 0? (Proposed: log + metrics now, escalate in later wave.)
- Confirm whether Worker re-verification writes to Mongo or triggers Export module to re-sign artifacts automatically; placeholder: record status + timestamp only.
## 10. Acceptance Criteria
- Plan approved by Attestation + WebService + Worker leads.
- Metrics/logging names peer-reviewed to avoid collisions.
- Test backlog items entered into respective `TASKS.md` once implementation starts.
- Documentation (this plan) linked from `TASKS.md` notes for discoverability.
## 11. 2025-10-22 Progress Notes
- Implemented `IVexAttestationVerifier`/`VexAttestationVerifier` with structural validation (subject/predicate checks, digest comparison, Rekor probes) and diagnostics map.
- Added `VexAttestationVerificationOptions` (RequireTransparencyLog, AllowOfflineTransparency, MaxClockSkew) and wired configuration through WebService DI.
- Created `VexAttestationMetrics` (`excititor.attestation.verify_total`, `excititor.attestation.verify_duration_seconds`) and hooked into verification flow with component/rekor tags.
- `VexAttestationClient.VerifyAsync` now delegates to the verifier; DI registers metrics + verifier via `AddVexAttestation`.
- Added unit coverage in `VexAttestationVerifierTests` (happy path, digest mismatch, offline Rekor) and updated client/export/webservice stubs to new verification signature.