feat: Initialize Zastava Webhook service with TLS and Authority authentication

- Added Program.cs to set up the web application with Serilog for logging, health check endpoints, and a placeholder admission endpoint.
- Configured Kestrel server to use TLS 1.3 and handle client certificates appropriately.
- Created StellaOps.Zastava.Webhook.csproj with necessary dependencies including Serilog and Polly.
- Documented tasks in TASKS.md for the Zastava Webhook project, outlining current work and exit criteria for each task.
		| @@ -0,0 +1,149 @@ | ||||
| # EXCITITOR-ATTEST-01-003 - Verification & Observability Plan | ||||
|  | ||||
| - **Date:** 2025-10-19 | ||||
| - **Status:** Draft | ||||
| - **Owner:** Team Excititor Attestation | ||||
| - **Related tasks:** EXCITITOR-ATTEST-01-003 (Wave 0), EXCITITOR-WEB-01-003/004, EXCITITOR-WORKER-01-003 | ||||
| - **Prerequisites satisfied:** EXCITITOR-ATTEST-01-002 (Rekor v2 client integration) | ||||
|  | ||||
| ## 1. Objectives | ||||
|  | ||||
| 1. Provide deterministic attestation verification helpers consumable by Excititor WebService (`/excititor/verify`, `/excititor/export*`) and Worker re-verification loops. | ||||
| 2. Surface structured diagnostics for success, soft failures, and hard failures (signature mismatch, Rekor gaps, artifact digest drift). | ||||
| 3. Emit observability signals (logs, metrics, optional tracing) that can run offline and degrade gracefully when transparency services are unreachable. | ||||
| 4. Add regression tests (unit + integration) covering positive path, negative path, and offline fallback scenarios. | ||||
|  | ||||
| ## 2. Deliverables | ||||
|  | ||||
| - `IVexAttestationVerifier` abstraction + `VexAttestationVerifier` implementation inside `StellaOps.Excititor.Attestation`, encapsulating DSSE validation, predicate checks, artifact digest confirmation, Rekor inclusion verification, and deterministic diagnostics. | ||||
| - DI wiring (extension method) for registering the verifier + instrumentation dependencies alongside the existing signer/Rekor client. | ||||
| - Shared `VexAttestationDiagnostics` record describing normalized diagnostic keys consumed by Worker/WebService logging. | ||||
| - Metrics utility (`AttestationMetrics`) exposing counters/histograms via `System.Diagnostics.Metrics`, exported under `StellaOps.Excititor.Attestation` meter. | ||||
| - Activity source (`AttestationActivitySource`) for optional tracing spans around sign/verify operations. | ||||
| - Documentation updates (`EXCITITOR-ATTEST-01-003-plan.md`, `TASKS.md` notes) describing instrumentation + test expectations. | ||||
| - Test coverage in `StellaOps.Excititor.Attestation.Tests` (unit) and scaffolding notes for WebService/Worker integration tests. | ||||
|  | ||||
| ## 3. Verification Flow | ||||
|  | ||||
| ### 3.1 Inputs | ||||
|  | ||||
| - `VexAttestationRequest` from Core (contains export identifiers, artifact digest, metadata, source providers). | ||||
| - Optional Rekor reference from previous signing (`VexAttestationMetadata.Rekor`). | ||||
| - Configured policies (tolerated clock skew, Rekor verification toggle, offline mode flag, maximum metadata drift). | ||||
|  | ||||
| ### 3.2 Steps | ||||
|  | ||||
| 1. **Envelope decode** - retrieve DSSE envelope + predicate from storage (Worker) or request payload (WebService), canonicalize JSON, compute digest, compare with metadata `envelopeDigest`. | ||||
| 2. **Subject validation** - ensure subject digest matches exported artifact digest (algorithm & value) and export identifier matches `request.ExportId`. | ||||
| 3. **Signature verification** - delegate to signer/verifier abstraction (cosign/x509) using configured trust anchors; record `signature_state` diagnostic (verified, skipped_offline, failed). | ||||
| 4. **Provenance checks** - confirm predicate type (`https://stella-ops.org/attestations/vex-export`) and metadata shape; enforce deterministic timestamp tolerance. | ||||
| 5. **Transparency log** - if Rekor reference present and verification enabled, call `ITransparencyLogClient.VerifyAsync` with retry/backoff budget; support offline bypass with diagnostic `rekor_state=unreachable`. | ||||
| 6. **Result aggregation** - produce `VexAttestationVerification` containing `IsValid` flag and diagnostics map (includes `failure_reason` when invalid). | ||||
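The six steps above can be sketched as a single function. This is a minimal, language-neutral illustration in Python, not the planned .NET implementation: `verify_attestation`, `canonical_digest`, and the dictionary field names are hypothetical, canonical JSON is approximated with sorted keys and compact separators, the envelope is modelled as a plain dictionary rather than a real DSSE structure, and signature and Rekor checks are injected as callables. Timestamp tolerance and the offline bypass are left out for brevity.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

PREDICATE_TYPE = "https://stella-ops.org/attestations/vex-export"


@dataclass
class Verification:
    is_valid: bool
    diagnostics: Dict[str, str] = field(default_factory=dict)


def canonical_digest(envelope: dict) -> str:
    """Hash an approximated canonical form (sorted keys, compact separators)."""
    canonical = json.dumps(envelope, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_attestation(
    envelope: dict,
    metadata: dict,
    request: dict,
    verify_signature: Callable[[dict], bool],
    verify_rekor: Optional[Callable[[str], bool]] = None,
) -> Verification:
    diagnostics: Dict[str, str] = {}

    def finish(reason: Optional[str] = None) -> Verification:
        # Step 6: aggregate into a result with key-sorted diagnostics.
        if reason is not None:
            diagnostics["failure_reason"] = reason
        return Verification(reason is None, dict(sorted(diagnostics.items())))

    # Step 1: compare the recomputed envelope digest with the stored one.
    # The reason used here is an assumption; the plan names none for this case.
    if canonical_digest(envelope) != metadata["envelopeDigest"]:
        return finish("metadata_mismatch")

    # Step 2: subject digest and export identifier must match the request.
    subject = envelope["subject"]
    if (subject["digest"] != request["artifactDigest"]
            or subject["exportId"] != request["exportId"]):
        return finish("signature_mismatch")

    # Step 3: delegate signature verification and record its state.
    if not verify_signature(envelope):
        diagnostics["signature_state"] = "failed"
        return finish("signature_mismatch")
    diagnostics["signature_state"] = "verified"

    # Step 4: confirm the predicate type.
    if envelope.get("predicateType") != PREDICATE_TYPE:
        return finish("predicate_invalid")

    # Step 5: check the transparency log only when a reference exists
    # and a Rekor verifier was supplied (verification enabled).
    rekor_ref = metadata.get("rekor")
    if rekor_ref is not None and verify_rekor is not None:
        if not verify_rekor(rekor_ref):
            diagnostics["rekor_state"] = "missing"
            return finish("rekor_missing")
        diagnostics["rekor_state"] = "verified"
    else:
        diagnostics["rekor_state"] = "skipped"

    return finish()
```

Returning early on the first failed step keeps `failure_reason` unambiguous; a real verifier might instead collect every failure before aggregating.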
|  | ||||
| ### 3.3 Failure Categories & Handling | ||||
|  | ||||
| | Category | Detection | Handling | | ||||
| |---|---|---| | ||||
| | Signature mismatch | Signer verification failure or subject digest mismatch | Mark invalid, emit warning log, increment the `verify.total` counter with `result=failure` and `reason=signature_mismatch`. | | ||||
| | Rekor absence/stale | Rekor verify returns false | Mark invalid unless offline mode configured; log with correlation ID; `reason=rekor_missing`. | | ||||
| | Predicate schema drift | Predicate type or required fields missing | Mark invalid, include `reason=predicate_invalid`. | | ||||
| | Time skew | `signedAt` older than policy threshold | Mark invalid (hard) or warn (soft) per options; include `reason=stale_attestation`. | | ||||
| | Unexpected metadata | Unknown export format, provider mismatch | Mark invalid; `reason=metadata_mismatch`. | | ||||
| | Offline Rekor | HTTP client throws | Mark soft failure if `AllowOfflineTransparency` true; degrade metrics with `rekor_state=offline`. | | ||||
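The Rekor and time-skew rows of the table can be illustrated with a small classifier. This Python sketch is an assumption-laden illustration rather than the planned .NET code: `Outcome`, `check_rekor`, and `check_age` are hypothetical names, `OSError` stands in for "HTTP client throws", a single `allow_offline` flag is assumed to cover both "offline mode" and `AllowOfflineTransparency`, and the `rekor_unreachable` reason used when the offline override is disabled is invented here because the table does not name one.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass(frozen=True)
class Outcome:
    is_valid: bool
    severity: str  # "ok", "soft", or "hard"
    reason: Optional[str] = None
    rekor_state: Optional[str] = None


def check_rekor(
    verify: Callable[[], bool],
    allow_offline: bool,
    attempts: int = 3,
    base_delay: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Outcome:
    """Call verify() within a retry budget and classify the result."""
    budget = max(1, attempts)
    for attempt in range(budget):
        try:
            found = verify()
        except OSError:
            # Transport failure: back off exponentially, then retry.
            if attempt + 1 < budget:
                sleep(base_delay * (2 ** attempt))
            continue
        if found:
            return Outcome(True, "ok", rekor_state="verified")
        # Rekor answered but has no (or a stale) entry.
        if allow_offline:
            return Outcome(True, "soft", "rekor_missing", "missing")
        return Outcome(False, "hard", "rekor_missing", "missing")
    # Retry budget exhausted without a response.
    if allow_offline:
        return Outcome(True, "soft", rekor_state="offline")
    return Outcome(False, "hard", "rekor_unreachable", "offline")


def check_age(signed_at: datetime, now: datetime, max_age: timedelta,
              hard: bool = True) -> Outcome:
    """Flag attestations whose signedAt is older than the policy threshold."""
    if now - signed_at <= max_age:
        return Outcome(True, "ok")
    if hard:
        return Outcome(False, "hard", "stale_attestation")
    return Outcome(True, "soft", "stale_attestation")
```

Injecting `sleep` keeps the backoff testable without real delays; the actual retry policy and budget are configuration concerns the plan leaves open.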
|  | ||||
| ## 4. Observability | ||||
|  | ||||
| ### 4.1 Metrics (Meter name: `StellaOps.Excititor.Attestation`) | ||||
|  | ||||
| | Metric | Type | Dimensions | Description | | ||||
| |---|---|---|---| | ||||
| | `stellaops.excititor.attestation.verify.total` | `Counter<long>` | `result` (`success`/`failure`/`soft_failure`), `component` (`webservice`/`worker`), `reverify` (`true`/`false`) | Counts verification attempts. | | ||||
| | `stellaops.excititor.attestation.verify.duration.ms` | `Histogram<double>` | `component`, `result` | Measures end-to-end verification latency. | | ||||
| | `stellaops.excititor.attestation.verify.rekor.calls` | `Counter<long>` | `result` (`verified`/`unreachable`/`skipped`) | Rekor verification outcomes. | | ||||
| | `stellaops.excititor.attestation.verify.cache.hit` | `Counter<long>` | `hit` (`true`/`false`) | Tracks reuse of cached verification results (Worker loop). | | ||||
|  | ||||
| Metrics must be registered through a static helper built on `Meter` and must work offline (no exporter dependency). The histogram records milliseconds as `double`; use `Stopwatch.GetTimestamp` together with `Stopwatch.GetElapsedTime` for monotonic timing. | ||||
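To make the dimension shapes concrete, the following Python sketch records the attempt counter and the latency histogram into an in-memory recorder. It is only an analogue of the planned `System.Diagnostics.Metrics` helper: `InMemoryMetrics` and `timed_verify` are hypothetical, and `time.perf_counter_ns` plays the role of the monotonic stopwatch.

```python
import time
from collections import Counter

VERIFY_TOTAL = "stellaops.excititor.attestation.verify.total"
VERIFY_DURATION_MS = "stellaops.excititor.attestation.verify.duration.ms"


class InMemoryMetrics:
    """Exporter-free recorder keyed by metric name plus sorted tags."""

    def __init__(self):
        self.counters = Counter()
        self.histograms = {}

    @staticmethod
    def _key(name, tags):
        return (name, tuple(sorted(tags.items())))

    def add(self, name, value=1, **tags):
        self.counters[self._key(name, tags)] += value

    def record(self, name, value, **tags):
        self.histograms.setdefault(self._key(name, tags), []).append(float(value))


def timed_verify(metrics, verify, component, reverify=False):
    """Run verify() and record one attempt plus its latency in milliseconds.

    verify() is expected to return "success", "failure", or "soft_failure".
    """
    start = time.perf_counter_ns()  # monotonic, unaffected by wall-clock changes
    result = verify()
    elapsed_ms = (time.perf_counter_ns() - start) / 1_000_000
    metrics.add(VERIFY_TOTAL, result=result, component=component,
                reverify=str(reverify).lower())
    metrics.record(VERIFY_DURATION_MS, elapsed_ms,
                   component=component, result=result)
    return result
```

Sorting the tags inside the key means two call sites that pass the same dimensions in a different order still land in the same series.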
|  | ||||
| ### 4.2 Logging | ||||
|  | ||||
| - Use structured logs (`ILogger<VexAttestationVerifier>`) with event IDs: `AttestationVerified` (Information), `AttestationVerificationFailed` (Warning), `AttestationVerificationError` (Error). | ||||
| - Include correlation ID (`request.QuerySignature.Value`), `exportId`, `envelopeDigest`, `rekorLocation`, `reason`, and `durationMs`. | ||||
| - Avoid logging private keys or the full envelope; log the envelope digest only. For debug builds, gate optional envelope JSON behind `LogLevel.Trace` and a configuration flag. | ||||
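The field allowlist and the trace gate described above can be sketched as follows. This Python illustration is hypothetical throughout: `log_fields` and `ALLOWED_FIELDS` are invented names, and because Python's `logging` module has no Trace level, the constant `TRACE = 5` is a stand-in for `LogLevel.Trace`.

```python
import json
import logging

# Hypothetical stand-in for LogLevel.Trace; sits below logging.DEBUG (10).
TRACE = 5

# Fields the plan lists for structured verification logs.
ALLOWED_FIELDS = (
    "correlationId",
    "exportId",
    "envelopeDigest",
    "rekorLocation",
    "reason",
    "durationMs",
)


def log_fields(context, logger_level, envelope_logging_enabled=False):
    """Build the structured payload for one verification log entry."""
    # Copy only allowlisted keys, so secrets in the context never leak.
    fields = {key: context[key] for key in ALLOWED_FIELDS if key in context}
    # Full envelope JSON requires BOTH the trace level and the explicit flag.
    if (envelope_logging_enabled
            and logger_level <= TRACE
            and "envelope" in context):
        fields["envelope"] = json.dumps(context["envelope"], sort_keys=True)
    return fields
```

An allowlist fails closed: a field added to the context later stays out of the logs until someone deliberately lists it.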
|  | ||||
| ### 4.3 Tracing | ||||
|  | ||||
| - Activity source name `StellaOps.Excititor.Attestation` with spans `attestation.verify` (parent from WebService request or Worker job) including tags: `stellaops.export_id`, `stellaops.result`, `stellaops.rekor.state`. | ||||
| - Propagate the Activity through the Rekor client via `HttpClient` instrumentation (auto-instrumentation is available). | ||||
|  | ||||
| ## 5. Integration Points | ||||
|  | ||||
| ### 5.1 WebService | ||||
|  | ||||
| - Inject `IVexAttestationVerifier` into export endpoints and `/excititor/verify` handler. | ||||
| - Persist verification result diagnostics alongside response payload for deterministic clients. | ||||
| - Return HTTP 200 with `{ valid: true }` when verified; 409 for invalid attestation with diagnostics JSON; 503 when Rekor unreachable and offline override disabled. | ||||
| - Add caching for idempotent verification (e.g., by envelope digest) to reduce Rekor calls and surface via metrics. | ||||
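The status-code contract above reduces to a small mapping. The sketch below is an illustrative Python rendering, not the endpoint code: `to_http_response` is a hypothetical helper, and the body returned with 503 is an assumption because the plan specifies only the status code for that case.

```python
def to_http_response(is_valid, diagnostics, rekor_unreachable=False,
                     allow_offline=False):
    """Map a verification outcome to (status_code, json_body)."""
    # Rekor down and no offline override: the service cannot decide.
    if rekor_unreachable and not allow_offline:
        # Body shape is assumed; only the 503 status is specified.
        return 503, {"valid": False, "diagnostics": diagnostics}
    if is_valid:
        return 200, {"valid": True}
    # Invalid attestation: conflict, with diagnostics for the client.
    return 409, {"valid": False, "diagnostics": diagnostics}
```

Checking the unreachable case first matters: without it, a verifier that reports "invalid" merely because Rekor timed out would surface as 409 and look like a genuine attestation failure.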
|  | ||||
| ### 5.2 Worker | ||||
|  | ||||
| - Schedule background job (`EXCITITOR-WORKER-01-003`) to re-verify stored attestations on TTL (default 12h) using new verifier; on failure, flag export for re-sign and notify via event bus (future task). | ||||
| - Emit logs/metrics with `component=worker`; include job IDs and next scheduled run. | ||||
| - Provide cancellation-aware loops (respect `CancellationToken`) and deterministic order (sorted by export id). | ||||
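A single pass of the re-verification loop might look like the sketch below. It is a Python illustration under stated assumptions: `reverify_due` and the `lastVerifiedAt` field are hypothetical, a `threading.Event` stands in for `CancellationToken`, and the loop stops quietly on cancellation where .NET code would typically throw `OperationCanceledException`.

```python
import threading
from datetime import datetime, timedelta, timezone


def reverify_due(attestations, verify, now, ttl=timedelta(hours=12), cancel=None):
    """Re-verify attestations whose last check is older than ttl.

    Items are visited in export-id order so repeated runs are deterministic.
    Returns a dict of exportId -> verify() result for the items processed.
    """
    if cancel is None:
        cancel = threading.Event()
    results = {}
    for attestation in sorted(attestations, key=lambda a: a["exportId"]):
        # Check for cancellation before starting each unit of work.
        if cancel.is_set():
            break
        last = attestation.get("lastVerifiedAt")
        if last is not None and now - last < ttl:
            continue  # still fresh, nothing to do
        results[attestation["exportId"]] = verify(attestation)
    return results
```

Passing `now` in, rather than reading the clock inside the loop, keeps the TTL decision reproducible in tests.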
|  | ||||
| ### 5.3 Storage / Cache Hooks | ||||
|  | ||||
| - Store latest verification status and diagnostics in attestation metadata collection (Mongo) keyed by `envelopeDigest` + `artifactDigest` to avoid duplicate work. | ||||
| - Expose read API (via WebService) for clients to fetch last verification timestamp + result. | ||||
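The composite key can be illustrated with an in-memory stand-in for the metadata collection. This Python sketch is hypothetical: `VerificationCache` is an invented name, the real store is described as Mongo, and using a one-hour window for expiry is an assumption about how `CacheWindow` would apply.

```python
from datetime import datetime, timedelta, timezone


class VerificationCache:
    """Latest verification result per (envelopeDigest, artifactDigest)."""

    def __init__(self, window=timedelta(hours=1)):
        self._window = window
        self._entries = {}

    def put(self, envelope_digest, artifact_digest, result, now):
        # Later writes for the same key replace the earlier result.
        self._entries[(envelope_digest, artifact_digest)] = (now, result)

    def get(self, envelope_digest, artifact_digest, now):
        key = (envelope_digest, artifact_digest)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if now - stored_at > self._window:
            # Expired: drop the entry so the caller re-verifies.
            del self._entries[key]
            return None
        return result
```

Keying on both digests means a re-signed envelope over the same artifact, or the same envelope paired with a different artifact, never reuses a stale result.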
|  | ||||
| ## 6. Test Strategy | ||||
|  | ||||
| ### 6.1 Unit Tests (`StellaOps.Excititor.Attestation.Tests`) | ||||
|  | ||||
| - `VexAttestationVerifierTests.VerifyAsync_Succeeds_WhenSignatureAndRekorValid` - uses fake signer/verifier + in-memory Rekor client returning success. | ||||
| - `...ReturnsSoftFailure_WhenRekorOfflineAndAllowed` - ensures `IsValid=true`, diagnostic `rekor_state=offline`, and a metric increment with `result=soft_failure`. | ||||
| - `...Fails_WhenDigestMismatch` - ensures invalid result, log entry recorded, metrics increment `result=failure` with `reason=signature_mismatch`. | ||||
| - `...Fails_WhenPredicateTypeUnexpected` - invalid with `reason=predicate_invalid`. | ||||
| - `...RespectsCancellation` - a cancellation token triggered before the Rekor call results in `OperationCanceledException` and no metric increments beyond the started attempt. | ||||
|  | ||||
| ### 6.2 WebService Integration Tests (`StellaOps.Excititor.WebService.Tests`) | ||||
|  | ||||
| - `VerifyEndpoint_Returns200_OnValidAttestation` - mocks verifier to return success, asserts response payload, metrics stub invoked. | ||||
| - `VerifyEndpoint_Returns409_OnInvalid` - forwards the invalid-attestation diagnostics and ensures logging occurs. | ||||
| - `ExportEndpoint_IncludesVerificationDiagnostics` - ensures signed export responses include last verification metadata. | ||||
|  | ||||
| ### 6.3 Worker Tests (`StellaOps.Excititor.Worker.Tests`) | ||||
|  | ||||
| - `ReverificationJob_RequeuesOnFailure` - invalid result triggers requeue/backoff. | ||||
| - `ReverificationJob_PersistsStatusAndMetrics` - success path updates repository & metrics. | ||||
|  | ||||
| ### 6.4 Determinism/Regression | ||||
|  | ||||
| - Golden test verifying that identical inputs produce identical diagnostics dictionaries (sorted keys). | ||||
| - Ensure metrics dimensions remain stable via snapshot test (e.g., capturing tags in fake meter listener). | ||||
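A golden test for diagnostics determinism could take the following shape. This is a Python sketch with hypothetical names (`canonical_diagnostics`, `GOLDEN`); the planned tests would live in the .NET test projects, and the serialization format of the golden value is assumed.

```python
import json

# Golden value: keys sorted, compact separators.
GOLDEN = '{"rekor_state":"verified","signature_state":"verified"}'


def canonical_diagnostics(diagnostics):
    """Serialize a diagnostics map so key insertion order cannot matter."""
    return json.dumps(diagnostics, sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False)


def test_diagnostics_are_order_independent():
    first = {"signature_state": "verified", "rekor_state": "verified"}
    second = {"rekor_state": "verified", "signature_state": "verified"}
    # Both insertion orders must serialize to the same golden string.
    assert canonical_diagnostics(first) == GOLDEN
    assert canonical_diagnostics(second) == GOLDEN
```

Comparing against a stored string, rather than comparing two live dictionaries, is what catches accidental renames of diagnostic keys.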
|  | ||||
| ## 7. Implementation Sequencing | ||||
|  | ||||
| 1. Introduce verifier abstraction + implementation with basic tests (signature + Rekor success/failure). | ||||
| 2. Add observability helpers (metrics, activity, logging) and wire into verifier; extend tests to assert instrumentation (using in-memory listener/log sink). | ||||
| 3. Update WebService DI/service layer to use verifier; craft endpoint integration tests. | ||||
| 4. Update Worker scheduling code to call verifier & emit metrics. | ||||
| 5. Wire persistence/caching and document configuration knobs (retry, offline, TTL). | ||||
| 6. Finalize documentation (architecture updates, runbook entries) before closing task. | ||||
|  | ||||
| ## 8. Configuration Defaults | ||||
|  | ||||
| - `AttestationVerificationOptions` (new): `RequireRekor=true`, `AllowOfflineTransparency=false`, `MaxClockSkew=PT5M`, `ReverifyInterval=PT12H`, `CacheWindow=PT1H`. | ||||
| - Options bind from configuration section `Excititor:Attestation` across WebService/Worker; offline kit ships defaults. | ||||
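The defaults above can be written out as a typed options object. This Python sketch is illustrative only: the class mirrors the option names from the plan, but `parse_duration`, `from_config`, and the flat dictionary standing in for the `Excititor:Attestation` section are assumptions, and the parser handles just the hour/minute/second subset of ISO 8601 durations used in the listed defaults.

```python
import re
from dataclasses import dataclass
from datetime import timedelta

_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text):
    """Parse durations such as PT5M or PT12H (H/M/S components only)."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


@dataclass(frozen=True)
class AttestationVerificationOptions:
    require_rekor: bool = True
    allow_offline_transparency: bool = False
    max_clock_skew: timedelta = timedelta(minutes=5)      # PT5M
    reverify_interval: timedelta = timedelta(hours=12)    # PT12H
    cache_window: timedelta = timedelta(hours=1)          # PT1H

    @classmethod
    def from_config(cls, section):
        """Overlay values from a configuration section onto the defaults."""
        defaults = cls()

        def duration(key, fallback):
            return parse_duration(section[key]) if key in section else fallback

        return cls(
            require_rekor=section.get("RequireRekor", defaults.require_rekor),
            allow_offline_transparency=section.get(
                "AllowOfflineTransparency", defaults.allow_offline_transparency),
            max_clock_skew=duration("MaxClockSkew", defaults.max_clock_skew),
            reverify_interval=duration("ReverifyInterval", defaults.reverify_interval),
            cache_window=duration("CacheWindow", defaults.cache_window),
        )
```

For example, an offline kit that sets only `AllowOfflineTransparency` keeps every other default, including the 12-hour re-verification interval.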
|  | ||||
| ## 9. Open Questions | ||||
|  | ||||
| - Should verification gracefully accept legacy predicate types (pre-1.0) or hard fail? (Proposed: allow via allowlist with warning diagnostics.) | ||||
| - Do we need cross-module eventing when verification fails (e.g., notify Export module) or is logging sufficient in Wave 0? (Proposed: log + metrics now, escalate in later wave.) | ||||
| - Confirm whether Worker re-verification writes to Mongo or triggers Export module to re-sign artifacts automatically; placeholder: record status + timestamp only. | ||||
|  | ||||
| ## 10. Acceptance Criteria | ||||
|  | ||||
| - Plan approved by Attestation + WebService + Worker leads. | ||||
| - Metrics/logging names peer-reviewed to avoid collisions. | ||||
| - Test backlog items entered into respective `TASKS.md` once implementation starts. | ||||
| - Documentation (this plan) linked from `TASKS.md` notes for discoverability. | ||||