diff --git a/EXECPLAN.md b/EXECPLAN.md
index ad0fafc2..edf5713b 100644
--- a/EXECPLAN.md
+++ b/EXECPLAN.md
@@ -50,7 +50,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team Tools Guild, BE-Conn-MSRC: read EXECPLAN.md Wave 0 and SPRINTS.md rows for `src/StellaOps.Concelier.Connector.Common/TASKS.md`. Focus on FEEDCONN-SHARED-STATE-003 (**TODO). Confirm prerequisites (none) before starting and report status in module TASKS.md.
 - Team UX Specialist, Angular Eng: read EXECPLAN.md Wave 0 and SPRINTS.md rows for `src/StellaOps.Web/TASKS.md`. Focus on WEB1.TRIVY-SETTINGS (DONE 2025-10-21), WEB1.TRIVY-SETTINGS-TESTS (DONE 2025-10-21), and WEB1.DEPS-13-001 (DONE 2025-10-21). Confirm prerequisites (none) before starting and report status in module TASKS.md.
 - Team Zastava Core Guild: read EXECPLAN.md Wave 0 and SPRINTS.md rows for `src/StellaOps.Zastava.Core/TASKS.md`. Focus on ZASTAVA-CORE-12-201 (DONE 2025-10-23), ZASTAVA-CORE-12-202 (DONE 2025-10-23), ZASTAVA-CORE-12-203 (DONE 2025-10-23), ZASTAVA-OPS-12-204 (DONE 2025-10-23). Confirm prerequisites (none) before starting and report status in module TASKS.md.
-- Team Zastava Webhook Guild: read EXECPLAN.md Wave 0 and SPRINTS.md rows for `src/StellaOps.Zastava.Webhook/TASKS.md`. Focus on ZASTAVA-WEBHOOK-12-101 (DONE 2025-10-24), ZASTAVA-WEBHOOK-12-102 (DOING 2025-10-24), ZASTAVA-WEBHOOK-12-103 (DOING 2025-10-24), ZASTAVA-WEBHOOK-12-104 (TODO). Confirm prerequisites (none) before starting and report status in module TASKS.md.
+- Team Zastava Webhook Guild: read EXECPLAN.md Wave 0 and SPRINTS.md rows for `src/StellaOps.Zastava.Webhook/TASKS.md`. Focus on ZASTAVA-WEBHOOK-12-101 (DONE 2025-10-24), ZASTAVA-WEBHOOK-12-102 (DONE 2025-10-24), ZASTAVA-WEBHOOK-12-103 (DONE 2025-10-24), ZASTAVA-WEBHOOK-12-104 (DONE 2025-10-24). Confirm prerequisites (none) before starting and report status in module TASKS.md.
 ### Wave 1
 - Team Bench Guild, Language Analyzer Guild: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `bench/TASKS.md`. Focus on BENCH-SCANNER-10-002 (TODO). Confirm prerequisites (internal: SCANNER-ANALYZERS-LANG-10-301 (Wave 0)) before starting and report status in module TASKS.md.
@@ -77,7 +77,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team Team Excititor Export: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `src/StellaOps.Excititor.Export/TASKS.md`. Focus on EXCITITOR-EXPORT-01-006 (DONE 2025-10-21). Confirm prerequisites (internal: EXCITITOR-EXPORT-01-005 (Wave 0), POLICY-CORE-09-005 (Wave 0)) before starting and report status in module TASKS.md.
 - Team Team Excititor Worker: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `src/StellaOps.Excititor.Worker/TASKS.md`. Focus on EXCITITOR-WORKER-01-003 (TODO). Confirm prerequisites (internal: EXCITITOR-ATTEST-01-003 (Wave 0); external: EXCITITOR-EXPORT-01-002, EXCITITOR-WORKER-01-001) before starting and report status in module TASKS.md.
 - Team UI Guild: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `src/StellaOps.UI/TASKS.md`. Focus on UI-ATTEST-11-005 (DONE 2025-10-23), UI-VEX-13-003 (TODO), UI-POLICY-13-007 (TODO), UI-ADMIN-13-004 (TODO), UI-AUTH-13-001 (DONE 2025-10-23), UI-SCANS-13-002 (TODO), UI-NOTIFY-13-006 (DOING 2025-10-19), UI-SCHED-13-005 (TODO). Confirm prerequisites (internal: ATTESTOR-API-11-201 (Wave 0), AUTH-DPOP-11-001 (Wave 0), AUTH-MTLS-11-002 (Wave 0), EXCITITOR-EXPORT-01-005 (Wave 0), NOTIFY-WEB-15-101 (Wave 0), POLICY-CORE-09-006 (Wave 0), SCHED-WEB-16-101 (Wave 0), SIGNER-API-11-101 (Wave 0); external: EXCITITOR-CORE-02-001, SCANNER-WEB-09-102, SCANNER-WEB-09-103) before starting and report status in module TASKS.md.
-- Team Zastava Observer Guild: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. Focus on ZASTAVA-OBS-12-001 (DOING 2025-10-24). Confirm prerequisites (internal: ZASTAVA-CORE-12-201 (Wave 0)) before starting and report status in module TASKS.md.
+- Team Zastava Observer Guild: read EXECPLAN.md Wave 1 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. Focus on ZASTAVA-OBS-12-001 (DONE 2025-10-24). Confirm prerequisites (internal: ZASTAVA-CORE-12-201 (Wave 0)) before starting and report status in module TASKS.md.
 ### Wave 2
 - Team Bench Guild, Notify Team: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `bench/TASKS.md`. Focus on BENCH-NOTIFY-15-001 (TODO). Confirm prerequisites (internal: NOTIFY-ENGINE-15-301 (Wave 1)) before starting and report status in module TASKS.md.
@@ -98,7 +98,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team TBD: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `src/StellaOps.Scanner.Analyzers.Lang.DotNet/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Go/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Node/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Python/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Rust/TASKS.md`. SCANNER-ANALYZERS-LANG-10-305B/304B/303B/306B wrapped on 2025-10-22; next focus moves to `10-307*` shared helper integration and Wave 2 benchmark polish. Node packaging milestone 10-308N closed 2025-10-21. Confirm prerequisites (internal: SCANNER-ANALYZERS-LANG-10-303A (Wave 1), SCANNER-ANALYZERS-LANG-10-304A (Wave 1), SCANNER-ANALYZERS-LANG-10-305A (Wave 1), SCANNER-ANALYZERS-LANG-10-306A (Wave 1), SCANNER-ANALYZERS-LANG-10-307N (Wave 1)) before starting new work and report status in module TASKS.md.
 - Team Team Excititor Connectors – Oracle: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `src/StellaOps.Excititor.Connectors.Oracle.CSAF/TASKS.md`. Focus on EXCITITOR-CONN-ORACLE-01-003 (TODO). Confirm prerequisites (internal: EXCITITOR-CONN-ORACLE-01-002 (Wave 1); external: EXCITITOR-POLICY-01-001) before starting and report status in module TASKS.md.
 - Team Team Excititor Export: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `src/StellaOps.Excititor.Export/TASKS.md`. Focus on EXCITITOR-EXPORT-01-007 (DONE 2025-10-21). Confirm prerequisites (internal: EXCITITOR-EXPORT-01-006 (Wave 1)) before starting and report status in module TASKS.md.
-- Team Zastava Observer Guild: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. Focus on ZASTAVA-OBS-12-002 (TODO). Confirm prerequisites (internal: ZASTAVA-OBS-12-001 (Wave 1)) before starting and report status in module TASKS.md.
+- Team Zastava Observer Guild: read EXECPLAN.md Wave 2 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. ZASTAVA-OBS-12-002 closed (DONE 2025-10-24); monitor follow-up posture/delta tasks and keep module TASKS.md in sync.
 ### Wave 3
 - Team DevEx/CLI: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Cli/TASKS.md`. Focus on CLI-OFFLINE-13-006 (DONE 2025-10-21). Confirm prerequisites (internal: DEVOPS-OFFLINE-14-002 (Wave 2)) before starting and report status in module TASKS.md.
@@ -108,7 +108,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team Notify Worker Guild: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Notify.Worker/TASKS.md`. Focus on NOTIFY-WORKER-15-203 (TODO). Confirm prerequisites (internal: NOTIFY-ENGINE-15-302 (Wave 2)) before starting and report status in module TASKS.md.
 - Team Scheduler Worker Guild: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Scheduler.Worker/TASKS.md`. Focus on SCHED-WORKER-16-203 (TODO). Confirm prerequisites (internal: SCHED-WORKER-16-202 (Wave 2)) before starting and report status in module TASKS.md.
 - Team TBD: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Scanner.Analyzers.Lang.DotNet/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Go/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Node/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Python/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Rust/TASKS.md`. SCANNER-ANALYZERS-LANG-10-305C/304C/309N/303C/306C are all DONE (latest 2025-10-22); remaining Wave 3 attention shifts to 10-307* helper consolidation and subsequent benchmarking tickets. Confirm prerequisites (internal: SCANNER-ANALYZERS-LANG-10-303B (Wave 2), SCANNER-ANALYZERS-LANG-10-304B (Wave 2), SCANNER-ANALYZERS-LANG-10-305B (Wave 2), SCANNER-ANALYZERS-LANG-10-306B (Wave 2), SCANNER-ANALYZERS-LANG-10-308N (Wave 2)) before scheduling new work and report status in module TASKS.md.
-- Team Zastava Observer Guild: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. Focus on ZASTAVA-OBS-12-003 (TODO), ZASTAVA-OBS-12-004 (TODO), ZASTAVA-OBS-17-005 (TODO). Confirm prerequisites (internal: ZASTAVA-OBS-12-002 (Wave 2)) before starting and report status in module TASKS.md.
+- Team Zastava Observer Guild: read EXECPLAN.md Wave 3 and SPRINTS.md rows for `src/StellaOps.Zastava.Observer/TASKS.md`. ZASTAVA-OBS-12-003 closed (DONE 2025-10-24); ZASTAVA-OBS-12-004 (DONE 2025-10-24) delivered disk-backed batching. Remaining focus shifts to ZASTAVA-OBS-17-005 (DOING 2025-10-24). Confirm prerequisites (internal: ZASTAVA-OBS-12-002 (Wave 2)) before starting and keep TASKS.md in sync.
 ### Wave 4
 - Team DevEx/CLI: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Cli/TASKS.md`. Focus on CLI-PLUGIN-13-007 (DONE 2025-10-22). Confirm prerequisites (internal: CLI-OFFLINE-13-006 (Wave 3), CLI-RUNTIME-13-005 (Wave 0)) before starting and report status in module TASKS.md.
@@ -117,14 +117,14 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team Notify Connectors Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Notify.Connectors.Email/TASKS.md`, `src/StellaOps.Notify.Connectors.Slack/TASKS.md`, `src/StellaOps.Notify.Connectors.Teams/TASKS.md`, `src/StellaOps.Notify.Connectors.Webhook/TASKS.md`. Focus on NOTIFY-CONN-SLACK-15-501 (TODO), NOTIFY-CONN-TEAMS-15-601 (TODO), NOTIFY-CONN-EMAIL-15-701 (TODO), NOTIFY-CONN-WEBHOOK-15-801 (TODO). Confirm prerequisites (internal: NOTIFY-ENGINE-15-303 (Wave 3)) before starting and report status in module TASKS.md.
 - Team Notify Engine Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Notify.Engine/TASKS.md`. Focus on NOTIFY-ENGINE-15-304 (TODO). Confirm prerequisites (internal: NOTIFY-ENGINE-15-303 (Wave 3)) before starting and report status in module TASKS.md.
 - Team Notify Worker Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Notify.Worker/TASKS.md`. Focus on NOTIFY-WORKER-15-204 (TODO). Confirm prerequisites (internal: NOTIFY-WORKER-15-203 (Wave 3)) before starting and report status in module TASKS.md.
-- Team Policy Guild, Scanner WebService Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Policy/TASKS.md`. Focus on POLICY-RUNTIME-17-201 (TODO). Confirm prerequisites (internal: ZASTAVA-OBS-17-005 (Wave 3)) before starting and report status in module TASKS.md.
+- Team Policy Guild, Scanner WebService Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Policy/TASKS.md`. Focus on POLICY-RUNTIME-17-201 (TODO). Confirm prerequisites (internal: ZASTAVA-OBS-17-005 (Wave 3, DOING 2025-10-24)) before starting and report status in module TASKS.md.
 - Team Scheduler Worker Guild: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Scheduler.Worker/TASKS.md`. Focus on SCHED-WORKER-16-204 (TODO). Confirm prerequisites (internal: SCHED-WORKER-16-203 (Wave 3)) before starting and report status in module TASKS.md.
 - Team TBD: read EXECPLAN.md Wave 4 and SPRINTS.md rows for `src/StellaOps.Scanner.Analyzers.Lang.DotNet/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Go/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Python/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Rust/TASKS.md`. SCANNER-ANALYZERS-LANG-10-307D/G/P are DONE (latest 2025-10-23); remaining focus is SCANNER-ANALYZERS-LANG-10-307R (TODO). Confirm prerequisites (internal: SCANNER-ANALYZERS-LANG-10-303C (Wave 3), SCANNER-ANALYZERS-LANG-10-304C (Wave 3), SCANNER-ANALYZERS-LANG-10-305C (Wave 3), SCANNER-ANALYZERS-LANG-10-306C (Wave 3)) before progressing and report status in module TASKS.md.
 ### Wave 5
 - Team Excititor Connectors – Stella: read EXECPLAN.md Wave 5 and SPRINTS.md rows for `src/StellaOps.Excititor.Connectors.StellaOpsMirror/TASKS.md`. Focus on EXCITITOR-CONN-STELLA-07-003 (TODO). Confirm prerequisites (internal: EXCITITOR-CONN-STELLA-07-002 (Wave 4)) before starting and report status in module TASKS.md.
 - Team Notify Connectors Guild: read EXECPLAN.md Wave 5 and SPRINTS.md rows for `src/StellaOps.Notify.Connectors.Email/TASKS.md`, `src/StellaOps.Notify.Connectors.Slack/TASKS.md`, `src/StellaOps.Notify.Connectors.Teams/TASKS.md`, `src/StellaOps.Notify.Connectors.Webhook/TASKS.md`. Focus on NOTIFY-CONN-SLACK-15-502 (DONE), NOTIFY-CONN-TEAMS-15-602 (DONE), NOTIFY-CONN-EMAIL-15-702 (BLOCKED 2025-10-20), NOTIFY-CONN-WEBHOOK-15-802 (BLOCKED 2025-10-20). Confirm prerequisites (internal: NOTIFY-CONN-EMAIL-15-701 (Wave 4), NOTIFY-CONN-SLACK-15-501 (Wave 4), NOTIFY-CONN-TEAMS-15-601 (Wave 4), NOTIFY-CONN-WEBHOOK-15-801 (Wave 4)) before starting and report status in module TASKS.md.
-- Team Scanner WebService Guild: read EXECPLAN.md Wave 5 and SPRINTS.md rows for `src/StellaOps.Scanner.WebService/TASKS.md`. Focus on SCANNER-RUNTIME-17-401 (TODO). Confirm prerequisites (internal: POLICY-RUNTIME-17-201 (Wave 4), SCANNER-EMIT-17-701 (Wave 1), SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3)) before starting and report status in module TASKS.md.
+- Team Scanner WebService Guild: read EXECPLAN.md Wave 5 and SPRINTS.md rows for `src/StellaOps.Scanner.WebService/TASKS.md`. Focus on SCANNER-RUNTIME-17-401 (DOING 2025-10-24). Confirm prerequisites (internal: POLICY-RUNTIME-17-201 (Wave 4), SCANNER-EMIT-17-701 (Wave 1), SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3, DOING 2025-10-24)) before starting and report status in module TASKS.md.
 - Team TBD: read EXECPLAN.md Wave 5 and SPRINTS.md rows for `src/StellaOps.Scanner.Analyzers.Lang.DotNet/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Go/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Python/TASKS.md`, `src/StellaOps.Scanner.Analyzers.Lang.Rust/TASKS.md`. SCANNER-ANALYZERS-LANG-10-308D/G/P completed (2025-10-23/2025-10-22/2025-10-23); pending items are SCANNER-ANALYZERS-LANG-10-308R (TODO). Confirm prerequisites (internal: SCANNER-ANALYZERS-LANG-10-307D (Wave 4), SCANNER-ANALYZERS-LANG-10-307G (Wave 4), SCANNER-ANALYZERS-LANG-10-307P (Wave 4), SCANNER-ANALYZERS-LANG-10-307R (Wave 4)) before starting and report status in module TASKS.md.
 ### Wave 6
@@ -428,15 +428,15 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 1. [DONE 2025-10-24] ZASTAVA-WEBHOOK-12-101 — Admission controller host with TLS bootstrap and Authority auth.
 • Prereqs: —
 • Current: DONE — host boots with deterministic TLS + shared runtime core, authority health checks in place, smoke coverage shipped.
- 2. [DOING 2025-10-24] ZASTAVA-WEBHOOK-12-102 — Query Scanner `/policy/runtime`, resolve digests, enforce verdicts.
+ 2. [DONE 2025-10-24] ZASTAVA-WEBHOOK-12-102 — Query Scanner `/policy/runtime`, resolve digests, enforce verdicts.
 • Prereqs: —
- • Current: DOING — runtime policy client and telemetry landed; admission wiring + verdict enforcement pending.
- 3. [DOING 2025-10-24] ZASTAVA-WEBHOOK-12-103 — Caching, fail-open/closed toggles, metrics/logging for admission decisions.
+ • Current: DONE — runtime admission service resolves digests, calls backend policy API, and enforces allow/deny verdicts with unit coverage.
+ 3. [DONE 2025-10-24] ZASTAVA-WEBHOOK-12-103 — Caching, fail-open/closed toggles, metrics/logging for admission decisions.
 • Prereqs: —
- • Current: DOING — instrumentation scaffolding ready, awaiting decision pipeline implementation.
- 4. [TODO] ZASTAVA-WEBHOOK-12-104 — Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes.
+ • Current: DONE — deterministic cache with TTL seeding, namespace fail-open overrides, and metrics/logging verified through tests.
+ 4. [DONE 2025-10-24] ZASTAVA-WEBHOOK-12-104 — Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes.
 • Prereqs: ZASTAVA-WEBHOOK-12-102
- • Current: TODO — implement decision handler using new backend client, produce canonical AdmissionDecision envelopes.
+ • Current: DONE — `/admission` handler parses AdmissionReview, routes to runtime policy service, and emits canonical envelopes + audit annotations.
 - **Sprint 13** · UX & CLI Experience
 - Team: DevEx/CLI
 - Path: `src/StellaOps.Cli/TASKS.md`
@@ -599,17 +599,23 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - **Sprint 12** · Runtime Guardrails
 - Team: Scanner WebService Guild
 - Path: `src/StellaOps.Scanner.WebService/TASKS.md`
- 2. [DOING] SCANNER-RUNTIME-12-302 — Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance.
+ 2. [DONE (2025-10-24)] SCANNER-RUNTIME-12-302 — Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance.
 • Prereqs: SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-CORE-12-201 (Wave 0)
- • Current: DOING (2025-10-20) — Locking response schema with Policy/CLI guilds, wiring determinism tests.
- 3. [TODO] SCANNER-RUNTIME-12-303 — Align runtime verdicts with canonical policy evaluation (Feedser/Vexer inputs) once upstream dependencies land.
- 4. [TODO] SCANNER-RUNTIME-12-304 — Surface attestation/Rekor verification results via Authority/Attestor integration.
- 5. [TODO] SCANNER-RUNTIME-12-305 — Finalize shared fixtures and CI automation with Zastava + CLI teams for runtime APIs.
+ • Current: DONE — endpoint returns signed TTL metadata, logs/metrics wired, and tests cover cache/pass/fail scenarios.
+ 3. [DONE 2025-10-24] SCANNER-RUNTIME-12-303 — Align runtime verdicts with canonical policy evaluation (Feedser/Vexer inputs) once upstream dependencies land.
+ • Prereqs: SCANNER-RUNTIME-12-302 (Wave 2)
+ • Current: DONE — `/policy/runtime` now calls PolicyPreviewService, surfaces confidence/quiet data, and regression tests cover pass/warn/fail cases across CLI + webhook fixtures.
+ 4. [DONE 2025-10-24] SCANNER-RUNTIME-12-304 — Surface attestation/Rekor verification results via Authority/Attestor integration.
+ • Prereqs: SCANNER-RUNTIME-12-302 (Wave 2)
+ • Current: DONE — runtime policy pipeline invokes the attestation verifier so Rekor entries are marked verified/unknown deterministically and exposed to consumers.
+ 5. [DONE 2025-10-24] SCANNER-RUNTIME-12-305 — Finalize shared fixtures and CI automation with Zastava + CLI teams for runtime APIs.
+ • Prereqs: SCANNER-RUNTIME-12-301 (Wave 1), SCANNER-RUNTIME-12-302 (Wave 2)
+ • Current: DONE — shared runtime policy fixtures exercised in scanner tests, webhook integration, and CLI contract harness; docs updated accordingly.
 - Team: Zastava Observer Guild
 - Path: `src/StellaOps.Zastava.Observer/TASKS.md`
- 1. [DOING 2025-10-24] ZASTAVA-OBS-12-001 — Build container lifecycle watcher that tails CRI (containerd/cri-o/docker) events and emits deterministic runtime records with buffering + backoff.
+ 1. [DONE 2025-10-24] ZASTAVA-OBS-12-001 — Build container lifecycle watcher that tails CRI (containerd/cri-o/docker) events and emits deterministic runtime records with buffering + backoff.
 • Prereqs: ZASTAVA-CORE-12-201 (Wave 0)
- • Current: DOING — lifecycle watcher scaffolding and buffering design underway (2025-10-24)
+ • Current: DONE — poller emits ordered start/stop events, backoff tested, metrics/log scopes active; waiting on downstream batching work.
 - **Sprint 13** · UX & CLI Experience
 - Team: DevEx/CLI, QA Guild
 - Path: `src/StellaOps.Cli/TASKS.md`
@@ -772,14 +778,14 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - **Sprint 12** · Runtime Guardrails
 - Team: Scanner WebService Guild
 - Path: `src/StellaOps.Scanner.WebService/TASKS.md`
- 1. [TODO] SCANNER-RUNTIME-12-302 — Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. Coordinate with CLI (`CLI-RUNTIME-13-008`) before GA to lock response field names/metadata.
+ 1. [DONE (2025-10-24)] SCANNER-RUNTIME-12-302 — Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. Coordinate with CLI (`CLI-RUNTIME-13-008`) before GA to lock response field names/metadata.
 • Prereqs: SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-CORE-12-201 (Wave 0)
- • Current: TODO
+ • Current: DONE — endpoint available with TTL metadata, signed responses, and determinism tests; CLI handoff scheduled.
 - Team: Zastava Observer Guild
 - Path: `src/StellaOps.Zastava.Observer/TASKS.md`
- 1. [TODO] ZASTAVA-OBS-12-002 — Capture entrypoint traces and loaded libraries, hashing binaries and correlating to SBOM baseline per architecture sections 2.1 and 10.
+ 1. [DONE 2025-10-24] ZASTAVA-OBS-12-002 — Capture entrypoint traces and loaded libraries, hashing binaries and correlating to SBOM baseline per architecture sections 2.1 and 10.
 • Prereqs: ZASTAVA-OBS-12-001 (Wave 1)
- • Current: TODO
+ • Current: DONE — process inspector emits entry trace + maps evidence; restore still requires offline NuGet mirror for gRPC packages.
 - **Sprint 14** · Release & Offline Ops
 - Team: Deployment Guild
 - Path: `ops/deployment/TASKS.md`
@@ -891,12 +897,12 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - **Sprint 12** · Runtime Guardrails
 - Team: Zastava Observer Guild
 - Path: `src/StellaOps.Zastava.Observer/TASKS.md`
- 1. [TODO] ZASTAVA-OBS-12-003 — Implement runtime posture checks (signature/SBOM/attestation presence) with offline caching and warning surfaces.
+ 1. [DONE 2025-10-24] ZASTAVA-OBS-12-003 — Implement runtime posture checks (signature/SBOM/attestation presence) with offline caching and warning surfaces.
 • Prereqs: ZASTAVA-OBS-12-002 (Wave 2)
- • Current: TODO
- 2. [TODO] ZASTAVA-OBS-12-004 — Batch `/runtime/events` submissions with disk-backed buffer, rate limits, and deterministic envelopes.
+ • Current: DONE — Observer enriches runtime events with cached posture data and persists cache across restarts.
+ 2. [DONE 2025-10-24] ZASTAVA-OBS-12-004 — Batch `/runtime/events` submissions with disk-backed buffer, rate limits, and deterministic envelopes.
 • Prereqs: ZASTAVA-OBS-12-002 (Wave 2)
- • Current: TODO
+ • Current: DONE — disk-backed buffer with restart replay + HTTP publisher landed; rate-limit fixtures cover retry/backoff.
 - **Sprint 13** · UX & CLI Experience
 - Team: DevEx/CLI, Scanner WebService Guild
@@ -924,7 +930,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - **Sprint 17** · Symbol Intelligence & Forensics
 - Team: Zastava Observer Guild
 - Path: `src/StellaOps.Zastava.Observer/TASKS.md`
- 1. [TODO] ZASTAVA-OBS-17-005 — Collect GNU build-id for ELF processes and attach it to emitted runtime events to enable symbol lookup + debug-store correlation.
+ 1. [DOING (2025-10-24)] ZASTAVA-OBS-17-005 — Collect GNU build-id for ELF processes and attach it to emitted runtime events to enable symbol lookup + debug-store correlation.
 • Prereqs: ZASTAVA-OBS-12-002 (Wave 2)
 • Current: TODO
@@ -939,7 +945,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team: Policy Guild, Scanner WebService Guild
 - Path: `src/StellaOps.Policy/TASKS.md`
 1. [TODO] POLICY-RUNTIME-17-201 — Define runtime reachability feed contract and alignment plan for `SCANNER-RUNTIME-17-401` once Zastava endpoints land; document policy expectations for reachability tags.
- • Prereqs: ZASTAVA-OBS-17-005 (Wave 3)
+ • Prereqs: ZASTAVA-OBS-17-005 (Wave 3 — DOING 2025-10-24)
 • Current: TODO
 - **Sprint 10** · Backlog
 - Team: TBD
@@ -1003,7 +1009,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team: Docs Guild
 - Path: `docs/TASKS.md`
 1. [TODO] DOCS-RUNTIME-17-004 — Document build-id workflows: SBOM exposure, runtime event payloads, debug-store layout, and operator guidance for symbol retrieval.
- • Prereqs: SCANNER-EMIT-17-701 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3), DEVOPS-REL-17-002 (Wave 2)
+ • Prereqs: SCANNER-EMIT-17-701 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3 — DOING 2025-10-24), DEVOPS-REL-17-002 (Wave 2)
 • Current: TODO
 ## Wave 5 — 10 task(s) ready after Wave 4
@@ -1047,7 +1053,7 @@ Generated from SPRINTS.md and module TASKS.md files on 2025-10-19. Waves cluster
 - Team: Scanner WebService Guild
 - Path: `src/StellaOps.Scanner.WebService/TASKS.md`
 1. [TODO] SCANNER-RUNTIME-17-401 — Persist runtime build-id observations and expose them via `/runtime/events` + policy joins for debug-symbol correlation.
- • Prereqs: SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3), SCANNER-EMIT-17-701 (Wave 1), POLICY-RUNTIME-17-201 (Wave 4)
+ • Prereqs: SCANNER-RUNTIME-12-301 (Wave 1), ZASTAVA-OBS-17-005 (Wave 3 — DOING 2025-10-24), SCANNER-EMIT-17-701 (Wave 1), POLICY-RUNTIME-17-201 (Wave 4)
 • Current: TODO
 ## Wave 6 — 8 task(s) ready after Wave 5
diff --git a/NuGet.config b/NuGet.config
index 64dd811e..0a26c71f 100644
--- a/NuGet.config
+++ b/NuGet.config
@@ -24,6 +24,11 @@
+
+
+
+
+
diff --git a/SPRINTS.md b/SPRINTS.md
index 928c9da3..eaa44f07 100644
--- a/SPRINTS.md
+++ b/SPRINTS.md
@@ -11,18 +11,18 @@ This file describe implementation of Stella Ops (docs/README.md). Implementation
 | Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Core/TASKS.md | DONE (2025-10-23) | Zastava Core Guild | ZASTAVA-CORE-12-202 | Provide configuration/logging/metrics utilities shared by Observer/Webhook. |
 | Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Core/TASKS.md | DONE (2025-10-23) | Zastava Core Guild | ZASTAVA-CORE-12-203 | Authority client helpers, OpTok caching, and security guardrails for runtime services. |
 | Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Core/TASKS.md | DONE (2025-10-23) | Zastava Core Guild | ZASTAVA-OPS-12-204 | Operational runbooks, alert rules, and dashboard exports for runtime plane. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | DOING (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-001 | Container lifecycle watcher emitting deterministic runtime events with buffering. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Capture entrypoint traces + loaded libraries, hashing binaries and linking to baseline SBOM. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-003 | Posture checks for signatures/SBOM/attestation with offline caching. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-004 | Batch `/runtime/events` submissions with disk-backed buffer and rate limits. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-001 | Container lifecycle watcher emitting deterministic runtime events with buffering. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Capture entrypoint traces + loaded libraries, hashing binaries and linking to baseline SBOM. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-003 | Posture checks for signatures/SBOM/attestation with offline caching. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Observer/TASKS.md | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-004 | Batch `/runtime/events` submissions with disk-backed buffer and rate limits. |
 | Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DONE (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-101 | Admission controller host with TLS bootstrap and Authority auth. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DOING (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-102 | Query Scanner `/policy/runtime`, resolve digests, enforce verdicts. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DOING (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-103 | Caching, fail-open/closed toggles, metrics/logging for admission decisions. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | TODO | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-104 | Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | DOING (2025-10-20) | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-303 | Align `/policy/runtime` verdicts with canonical policy evaluation (Feedser/Vexer). |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-304 | Integrate attestation verification into runtime policy metadata. |
-| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-305 | Deliver shared fixtures + e2e validation with Zastava/CLI teams. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DONE (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-102 | Query Scanner `/policy/runtime`, resolve digests, enforce verdicts. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DONE (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-103 | Caching, fail-open/closed toggles, metrics/logging for admission decisions. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Zastava.Webhook/TASKS.md | DONE (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-104 | Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-303 | Align `/policy/runtime` verdicts with canonical policy evaluation (Feedser/Vexer). |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-304 | Integrate attestation verification into runtime policy metadata. |
+| Sprint 12 | Runtime Guardrails | src/StellaOps.Scanner.WebService/TASKS.md | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-305 | Deliver shared fixtures + e2e validation with Zastava/CLI teams. |
 | Sprint 13 | UX & CLI Experience | src/StellaOps.UI/TASKS.md | DONE (2025-10-23) | UI Guild | UI-AUTH-13-001 | Integrate Authority OIDC + DPoP flows with session management. |
 | Sprint 13 | UX & CLI Experience | src/StellaOps.UI/TASKS.md | TODO | UI Guild | UI-SCANS-13-002 | Build scans module (list/detail/SBOM/diff/attestation) with performance + accessibility targets. |
 | Sprint 13 | UX & CLI Experience | src/StellaOps.UI/TASKS.md | TODO | UI Guild | UI-VEX-13-003 | Implement VEX explorer + policy editor with preview integration. |
@@ -86,8 +86,8 @@ This file describe implementation of Stella Ops (docs/README.md). Implementation
 | Sprint 16 | Scheduler Intelligence | src/StellaOps.Scheduler.Worker/TASKS.md | TODO | Scheduler Worker Guild | SCHED-WORKER-16-205 | Metrics/telemetry for Scheduler planners/runners. |
 | Sprint 16 | Benchmarks | bench/TASKS.md | TODO | Bench Guild, Scheduler Team | BENCH-IMPACT-16-001 | ImpactIndex throughput bench + RAM profile. |
 | Sprint 17 | Symbol Intelligence & Forensics | src/StellaOps.Scanner.Emit/TASKS.md | TODO | Emit Guild | SCANNER-EMIT-17-701 | Record GNU build-id for ELF components and surface it in SBOM/diff outputs. |
-| Sprint 17 | Symbol Intelligence & Forensics | src/StellaOps.Zastava.Observer/TASKS.md | TODO | Zastava Observer Guild | ZASTAVA-OBS-17-005 | Collect GNU build-id during runtime observation and attach it to emitted events. |
-| Sprint 17 | Symbol Intelligence & Forensics | src/StellaOps.Scanner.WebService/TASKS.md | TODO | Scanner WebService Guild | SCANNER-RUNTIME-17-401 | Persist runtime build-id observations and expose them for debug-symbol correlation. |
+| Sprint 17 | Symbol Intelligence & Forensics | src/StellaOps.Zastava.Observer/TASKS.md | DOING (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-17-005 | Collect GNU build-id during runtime observation and attach it to emitted events. |
+| Sprint 17 | Symbol Intelligence & Forensics | src/StellaOps.Scanner.WebService/TASKS.md | DOING (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-17-401 | Persist runtime build-id observations and expose them for debug-symbol correlation. |
 | Sprint 17 | Symbol Intelligence & Forensics | ops/devops/TASKS.md | TODO | DevOps Guild | DEVOPS-REL-17-002 | Ship stripped debug artifacts organised by build-id within release/offline kits. |
 | Sprint 17 | Symbol Intelligence & Forensics | docs/TASKS.md | TODO | Docs Guild | DOCS-RUNTIME-17-004 | Document build-id workflows for SBOMs, runtime events, and debug-store usage. |
 | Sprint 18 | Launch Readiness | ops/devops/TASKS.md | TODO | DevOps Guild | DEVOPS-LAUNCH-18-001 | Production launch cutover rehearsal and runbook publication (blocked on implementation sign-off and environment setup). |
diff --git a/docs/09_API_CLI_REFERENCE.md b/docs/09_API_CLI_REFERENCE.md
index 950b43b2..d5dd3c01 100755
--- a/docs/09_API_CLI_REFERENCE.md
+++ b/docs/09_API_CLI_REFERENCE.md
@@ -629,6 +629,13 @@ See `docs/dev/32_AUTH_CLIENT_GUIDE.md` for recommended profiles (online vs. air-
 | `stellaops-cli config show` | Display resolved configuration | — | Masks secret values; helpful for air‑gapped installs |
 | `stellaops-cli runtime policy test` | Ask Scanner.WebService for runtime verdicts (Webhook parity) | `--image/-i ` (repeatable, comma/space lists supported)
`--file/-f `
`--namespace/--ns `
`--label/-l key=value` (repeatable)
`--json` | Posts to `POST /api/v1/scanner/policy/runtime`, deduplicates image digests, and prints TTL/policy revision plus per-image columns for signed state, SBOM referrers, quieted-by metadata, confidence, and Rekor attestation (uuid + verified flag). Accepts newline/whitespace-delimited stdin when piped; `--json` emits the raw response without additional logging. |
+`POST /api/v1/scanner/policy/runtime` responds with one entry per digest. Each result now includes:
+
+- `policyVerdict` (`pass|warn|fail|error`), `signed`, and `hasSbomReferrers` parity with the webhook contract.
+- `confidence` (0-1 double) derived from canonical `PolicyPreviewService` evaluation and `quieted`/`quietedBy` flags for muted findings.
+- `rekor` block carrying `uuid`, `url`, and the attestor-backed `verified` boolean when Rekor inclusion proofs have been confirmed.
+- `metadata` (stringified JSON) capturing runtime heuristics, policy issues, evaluated findings, and timestamps for downstream audit.
+
 When running on an interactive terminal without explicit override flags, the CLI uses Spectre.Console prompts to let you choose per-run ORAS/offline bundle behaviour. Runtime verdict output reflects the SCANNER-RUNTIME-12-302 contract sign-off (quieted provenance, confidence band, attestation verification). CLI-RUNTIME-13-008 now mirrors those fields in both table and `--json` formats.
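Because `metadata` arrives as stringified JSON rather than a nested object, a consumer has to parse it a second time before reading keys such as `heuristics`. The sketch below illustrates that double parse with `System.Text.Json`. It is an assumption-laden example, not the CLI's or webhook's actual code: the sample payload values are invented for illustration, `ReadHeuristics` is a hypothetical helper, and only field names mentioned in this change are used.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical per-image result, trimmed to fields named in the reference above.
// The values are illustrative and were not captured from a real Scanner response.
const string resultJson = """
{
  "policyVerdict": "fail",
  "signed": false,
  "hasSbomReferrers": false,
  "confidence": 0.1,
  "metadata": "{\"heuristics\":[\"image.metadata.missing\",\"unsigned\"]}"
}
""";

using JsonDocument result = JsonDocument.Parse(resultJson);
JsonElement root = result.RootElement;

string verdict = root.GetProperty("policyVerdict").GetString() ?? "unknown";
// `confidence` is omitted from the response when null, so treat it as optional.
double? confidence = root.TryGetProperty("confidence", out JsonElement c) ? c.GetDouble() : null;
string[] heuristics = ReadHeuristics(root);

Console.WriteLine($"{verdict} (confidence {confidence}): {string.Join(", ", heuristics)}");

// Hypothetical helper: `metadata` is a JSON document encoded as a string,
// so it needs its own JsonDocument.Parse call before `heuristics` can be read.
static string[] ReadHeuristics(JsonElement imageResult)
{
    if (!imageResult.TryGetProperty("metadata", out JsonElement metadata) ||
        metadata.ValueKind != JsonValueKind.String)
    {
        return Array.Empty<string>();
    }

    string? raw = metadata.GetString();
    if (string.IsNullOrWhiteSpace(raw))
    {
        return Array.Empty<string>();
    }

    using JsonDocument inner = JsonDocument.Parse(raw);
    if (!inner.RootElement.TryGetProperty("heuristics", out JsonElement items) ||
        items.ValueKind != JsonValueKind.Array)
    {
        return Array.Empty<string>();
    }

    // Materialise the strings before `inner` is disposed.
    return items.EnumerateArray()
        .Select(item => item.GetString() ?? string.Empty)
        .ToArray();
}
```

Treating both `metadata` and `heuristics` as optional mirrors the defensive checks in the endpoint tests of this change, which only assert on heuristics when the property is present.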
diff --git a/docs/ARCHITECTURE_ZASTAVA.md b/docs/ARCHITECTURE_ZASTAVA.md
index 392d173b..e90852e3 100644
--- a/docs/ARCHITECTURE_ZASTAVA.md
+++ b/docs/ARCHITECTURE_ZASTAVA.md
@@ -66,14 +66,15 @@ stellaops/zastava-agent # System service; watch Docker events; observer on
     "imageRef": "ghcr.io/acme/api@sha256:abcd…",
     "owner": { "kind": "Deployment", "name": "api" }
   },
-  "process": {
-    "pid": 12345,
-    "entrypoint": ["/entrypoint.sh", "--serve"],
-    "entryTrace": [
-      {"file":"/entrypoint.sh","line":3,"op":"exec","target":"/usr/bin/python3"},
-      {"file":"","op":"python","target":"/opt/app/server.py"}
-    ]
-  },
+  "process": {
+    "pid": 12345,
+    "entrypoint": ["/entrypoint.sh", "--serve"],
+    "entryTrace": [
+      {"file":"/entrypoint.sh","line":3,"op":"exec","target":"/usr/bin/python3"},
+      {"file":"","op":"python","target":"/opt/app/server.py"}
+    ],
+    "buildId": "9f3a1cd4c0b7adfe91c0e3b51d2f45fb0f76a4c1"
+  },
   "loadedLibs": [
     { "path": "/lib/x86_64-linux-gnu/libssl.so.3", "inode": 123456, "sha256": "…"},
     { "path": "/usr/lib/x86_64-linux-gnu/libcrypto.so.3", "inode": 123457, "sha256": "…"}
@@ -133,7 +134,8 @@ stellaops/zastava-agent # System service; watch Docker events; observer on
 * **Watch** container lifecycle (start/stop) via CRI (`/run/containerd/containerd.sock` gRPC read‑only) or `/var/log/containers/*.log` tail fallback.
 * **Resolve** container → image digest, mount point rootfs.
 * **Trace entrypoint**: attach **short‑lived** nsenter/exec to PID 1 in container, parse shell for `exec` chain (bounded depth), record **terminal program**.
-* **Sample loaded libs**: read `/proc//maps` and `exe` symlink to collect **actually loaded** DSOs; compute **sha256** for each mapped file (bounded count/size).
+* **Sample loaded libs**: read `/proc//maps` and `exe` symlink to collect **actually loaded** DSOs; compute **sha256** for each mapped file (bounded count/size).
+* **Record GNU build-id**: parse `NT_GNU_BUILD_ID` from `/proc//exe` and attach the normalized hex to runtime events for symbol/debug-store correlation.
 * **Posture check** (cheap):
   * Image signature presence (if cosign policies are local; else ask backend).
diff --git a/local-nuget/Google.Protobuf.3.27.2.nupkg b/local-nuget/Google.Protobuf.3.27.2.nupkg
new file mode 100644
index 00000000..047c60b6
Binary files /dev/null and b/local-nuget/Google.Protobuf.3.27.2.nupkg differ
diff --git a/local-nuget/Grpc.Core.Api.2.65.0.nupkg b/local-nuget/Grpc.Core.Api.2.65.0.nupkg
new file mode 100644
index 00000000..494d4407
Binary files /dev/null and b/local-nuget/Grpc.Core.Api.2.65.0.nupkg differ
diff --git a/local-nuget/Grpc.Net.Client.2.65.0.nupkg b/local-nuget/Grpc.Net.Client.2.65.0.nupkg
new file mode 100644
index 00000000..9fd1d9b9
Binary files /dev/null and b/local-nuget/Grpc.Net.Client.2.65.0.nupkg differ
diff --git a/local-nuget/Grpc.Net.Common.2.65.0.nupkg b/local-nuget/Grpc.Net.Common.2.65.0.nupkg
new file mode 100644
index 00000000..45af5ab9
Binary files /dev/null and b/local-nuget/Grpc.Net.Common.2.65.0.nupkg differ
diff --git a/local-nuget/Grpc.Tools.2.65.0.nupkg b/local-nuget/Grpc.Tools.2.65.0.nupkg
new file mode 100644
index 00000000..33e6b34a
Binary files /dev/null and b/local-nuget/Grpc.Tools.2.65.0.nupkg differ
diff --git a/local-nuget/Microsoft.Bcl.AsyncInterfaces.6.0.0.nupkg b/local-nuget/Microsoft.Bcl.AsyncInterfaces.6.0.0.nupkg
new file mode 100644
index 00000000..c2d47824
Binary files /dev/null and b/local-nuget/Microsoft.Bcl.AsyncInterfaces.6.0.0.nupkg differ
diff --git a/local-nuget/System.Memory.4.5.3.nupkg b/local-nuget/System.Memory.4.5.3.nupkg
new file mode 100644
index 00000000..5fa15502
Binary files /dev/null and b/local-nuget/System.Memory.4.5.3.nupkg differ
diff --git a/local-nuget/System.Runtime.CompilerServices.Unsafe.4.5.2.nupkg b/local-nuget/System.Runtime.CompilerServices.Unsafe.4.5.2.nupkg
new file mode 100644
index 00000000..4f464e12
Binary files /dev/null and b/local-nuget/System.Runtime.CompilerServices.Unsafe.4.5.2.nupkg differ diff --git a/src/StellaOps.Scanner.WebService.Tests/RuntimeEndpointsTests.cs b/src/StellaOps.Scanner.WebService.Tests/RuntimeEndpointsTests.cs index 6f5c4192..decab289 100644 --- a/src/StellaOps.Scanner.WebService.Tests/RuntimeEndpointsTests.cs +++ b/src/StellaOps.Scanner.WebService.Tests/RuntimeEndpointsTests.cs @@ -1,5 +1,6 @@ using System; using System.Collections.Generic; +using System.Linq; using System.Net; using System.Net.Http.Json; using System.Text.Json; @@ -210,6 +211,71 @@ rules: Assert.NotNull(decision.Rekor); Assert.Equal("rekor-uuid", decision.Rekor!.Uuid); Assert.True(decision.Rekor.Verified); + Assert.NotNull(decision.Confidence); + Assert.InRange(decision.Confidence!.Value, 0.0, 1.0); + Assert.False(decision.Quieted.GetValueOrDefault()); + Assert.Null(decision.QuietedBy); + var metadataString = decision.Metadata; + Console.WriteLine($"Runtime policy metadata: {metadataString ?? ""}"); + Assert.False(string.IsNullOrWhiteSpace(metadataString)); + using var metadataDocument = JsonDocument.Parse(decision.Metadata!); + Assert.True(metadataDocument.RootElement.TryGetProperty("heuristics", out _)); + } + + [Fact] + public async Task RuntimePolicyEndpointFlagsUnsignedAndMissingSbom() + { + using var factory = new ScannerApplicationFactory(); + using var client = factory.CreateClient(); + + const string imageDigest = "sha256:feedface"; + + using (var scope = factory.Services.CreateScope()) + { + var collections = scope.ServiceProvider.GetRequiredService(); + var policyStore = scope.ServiceProvider.GetRequiredService(); + + const string policyYaml = """ +version: "1.0" +rules: [] +"""; + await policyStore.SaveAsync( + new PolicySnapshotContent(policyYaml, PolicyDocumentFormat.Yaml, "tester", "tests", "baseline"), + CancellationToken.None); + + // Intentionally skip artifacts/links to simulate missing metadata. 
+ await collections.RuntimeEvents.DeleteManyAsync(Builders.Filter.Empty); + } + + var response = await client.PostAsJsonAsync("/api/v1/policy/runtime", new RuntimePolicyRequestDto + { + Namespace = "payments", + Images = new[] { imageDigest } + }); + + Assert.Equal(HttpStatusCode.OK, response.StatusCode); + var payload = await response.Content.ReadFromJsonAsync(); + Assert.NotNull(payload); + var decision = payload!.Results[imageDigest]; + + Assert.Equal("fail", decision.PolicyVerdict); + Assert.False(decision.Signed); + Assert.False(decision.HasSbomReferrers); + Assert.Contains("image.metadata.missing", decision.Reasons); + Assert.Contains("unsigned", decision.Reasons); + Assert.Contains("missing SBOM", decision.Reasons); + Assert.NotNull(decision.Confidence); + Assert.InRange(decision.Confidence!.Value, 0.0, 1.0); + if (!string.IsNullOrWhiteSpace(decision.Metadata)) + { + using var failureMetadata = JsonDocument.Parse(decision.Metadata!); + if (failureMetadata.RootElement.TryGetProperty("heuristics", out var heuristicsElement)) + { + var heuristics = heuristicsElement.EnumerateArray().Select(item => item.GetString()).ToArray(); + Assert.Contains("image.metadata.missing", heuristics); + Assert.Contains("unsigned", heuristics); + } + } } [Fact] diff --git a/src/StellaOps.Scanner.WebService/Contracts/RuntimePolicyContracts.cs b/src/StellaOps.Scanner.WebService/Contracts/RuntimePolicyContracts.cs index 1e73ec62..c5bbc39e 100644 --- a/src/StellaOps.Scanner.WebService/Contracts/RuntimePolicyContracts.cs +++ b/src/StellaOps.Scanner.WebService/Contracts/RuntimePolicyContracts.cs @@ -54,9 +54,21 @@ public sealed record RuntimePolicyImageResponseDto [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] public RuntimePolicyRekorDto? Rekor { get; init; } + [JsonPropertyName("confidence")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public double? 
Confidence { get; init; } + + [JsonPropertyName("quieted")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public bool? Quieted { get; init; } + + [JsonPropertyName("quietedBy")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string? QuietedBy { get; init; } + [JsonPropertyName("metadata")] [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] - public IDictionary? Metadata { get; init; } + public string? Metadata { get; init; } } public sealed record RuntimePolicyRekorDto diff --git a/src/StellaOps.Scanner.WebService/Endpoints/PolicyEndpoints.cs b/src/StellaOps.Scanner.WebService/Endpoints/PolicyEndpoints.cs index f340d7dd..1c6f01ad 100644 --- a/src/StellaOps.Scanner.WebService/Endpoints/PolicyEndpoints.cs +++ b/src/StellaOps.Scanner.WebService/Endpoints/PolicyEndpoints.cs @@ -9,10 +9,12 @@ using Microsoft.AspNetCore.Http; using Microsoft.AspNetCore.Routing; using StellaOps.Policy; using StellaOps.Scanner.WebService.Constants; -using StellaOps.Scanner.WebService.Contracts; -using StellaOps.Scanner.WebService.Infrastructure; -using StellaOps.Scanner.WebService.Security; -using StellaOps.Scanner.WebService.Services; +using StellaOps.Scanner.WebService.Contracts; +using StellaOps.Scanner.WebService.Infrastructure; +using StellaOps.Scanner.WebService.Security; +using StellaOps.Scanner.WebService.Services; +using StellaOps.Zastava.Core.Contracts; +using RuntimePolicyVerdict = StellaOps.Zastava.Core.Contracts.PolicyVerdict; namespace StellaOps.Scanner.WebService.Endpoints; @@ -292,20 +294,23 @@ internal static class PolicyEndpoints }; } - IDictionary? metadata = null; + string? 
metadata = null; if (decision.Metadata is not null && decision.Metadata.Count > 0) { - metadata = new Dictionary(decision.Metadata, StringComparer.OrdinalIgnoreCase); + metadata = JsonSerializer.Serialize(decision.Metadata, SerializerOptions); } results[pair.Key] = new RuntimePolicyImageResponseDto { - PolicyVerdict = decision.PolicyVerdict, + PolicyVerdict = ToCamelCase(decision.PolicyVerdict), Signed = decision.Signed, HasSbomReferrers = decision.HasSbomReferrers, HasSbomLegacy = decision.HasSbomReferrers, Reasons = decision.Reasons.ToArray(), Rekor = rekor, + Confidence = Math.Round(decision.Confidence, 6, MidpointRounding.AwayFromZero), + Quieted = decision.Quieted, + QuietedBy = decision.QuietedBy, Metadata = metadata }; } @@ -318,4 +323,14 @@ internal static class PolicyEndpoints Results = results }; } + + private static string ToCamelCase(RuntimePolicyVerdict verdict) + => verdict switch + { + RuntimePolicyVerdict.Pass => "pass", + RuntimePolicyVerdict.Warn => "warn", + RuntimePolicyVerdict.Fail => "fail", + RuntimePolicyVerdict.Error => "error", + _ => "unknown" + }; } diff --git a/src/StellaOps.Scanner.WebService/Program.cs b/src/StellaOps.Scanner.WebService/Program.cs index b444ae55..547f5d69 100644 --- a/src/StellaOps.Scanner.WebService/Program.cs +++ b/src/StellaOps.Scanner.WebService/Program.cs @@ -161,6 +161,7 @@ builder.Services.AddScannerStorage(storageOptions => }); builder.Services.AddSingleton(); builder.Services.AddSingleton(); +builder.Services.AddSingleton(); builder.Services.AddSingleton(); var pluginHostOptions = ScannerPluginHostFactory.Build(bootstrapOptions, contentRoot); diff --git a/src/StellaOps.Scanner.WebService/Services/RuntimePolicyService.cs b/src/StellaOps.Scanner.WebService/Services/RuntimePolicyService.cs index d04cf4ea..56421594 100644 --- a/src/StellaOps.Scanner.WebService/Services/RuntimePolicyService.cs +++ b/src/StellaOps.Scanner.WebService/Services/RuntimePolicyService.cs @@ -1,9 +1,21 @@ +using 
System.Collections.Immutable; using System.Collections.ObjectModel; +using System.Diagnostics; +using System.Diagnostics.Metrics; +using System.Linq; +using System.Globalization; +using System.Text.Json; +using System.Text.Json.Serialization; +using Microsoft.Extensions.Logging; using Microsoft.Extensions.Options; using StellaOps.Policy; using StellaOps.Scanner.Storage.Catalog; using StellaOps.Scanner.Storage.Repositories; using StellaOps.Scanner.WebService.Options; +using StellaOps.Zastava.Core.Contracts; +using RuntimePolicyVerdict = StellaOps.Zastava.Core.Contracts.PolicyVerdict; +using CanonicalPolicyVerdict = StellaOps.Policy.PolicyVerdict; +using CanonicalPolicyVerdictStatus = StellaOps.Policy.PolicyVerdictStatus; namespace StellaOps.Scanner.WebService.Services; @@ -14,24 +26,37 @@ internal interface IRuntimePolicyService internal sealed class RuntimePolicyService : IRuntimePolicyService { + private static readonly Meter PolicyMeter = new("StellaOps.Scanner.RuntimePolicy", "1.0.0"); + private static readonly Counter PolicyEvaluations = PolicyMeter.CreateCounter("scanner.runtime.policy.requests", unit: "1", description: "Total runtime policy evaluation requests processed."); + private static readonly Histogram PolicyEvaluationLatencyMs = PolicyMeter.CreateHistogram("scanner.runtime.policy.latency.ms", unit: "ms", description: "Latency for runtime policy evaluations."); + private readonly LinkRepository _linkRepository; private readonly ArtifactRepository _artifactRepository; private readonly PolicySnapshotStore _policySnapshotStore; + private readonly PolicyPreviewService _policyPreviewService; private readonly IOptionsMonitor _optionsMonitor; private readonly TimeProvider _timeProvider; + private readonly IRuntimeAttestationVerifier _attestationVerifier; + private readonly ILogger _logger; public RuntimePolicyService( LinkRepository linkRepository, ArtifactRepository artifactRepository, PolicySnapshotStore policySnapshotStore, + PolicyPreviewService 
policyPreviewService, IOptionsMonitor optionsMonitor, - TimeProvider timeProvider) + TimeProvider timeProvider, + IRuntimeAttestationVerifier attestationVerifier, + ILogger logger) { _linkRepository = linkRepository ?? throw new ArgumentNullException(nameof(linkRepository)); _artifactRepository = artifactRepository ?? throw new ArgumentNullException(nameof(artifactRepository)); _policySnapshotStore = policySnapshotStore ?? throw new ArgumentNullException(nameof(policySnapshotStore)); + _policyPreviewService = policyPreviewService ?? throw new ArgumentNullException(nameof(policyPreviewService)); _optionsMonitor = optionsMonitor ?? throw new ArgumentNullException(nameof(optionsMonitor)); _timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + _attestationVerifier = attestationVerifier ?? throw new ArgumentNullException(nameof(attestationVerifier)); + _logger = logger ?? throw new ArgumentNullException(nameof(logger)); } public async Task EvaluateAsync(RuntimePolicyEvaluationRequest request, CancellationToken cancellationToken) @@ -44,25 +69,97 @@ internal sealed class RuntimePolicyService : IRuntimePolicyService var now = _timeProvider.GetUtcNow(); var expiresAt = now.AddSeconds(ttlSeconds); + var stopwatch = Stopwatch.StartNew(); var snapshot = await _policySnapshotStore.GetLatestAsync(cancellationToken).ConfigureAwait(false); var policyRevision = snapshot?.RevisionId; var policyDigest = snapshot?.Digest; var results = new Dictionary(StringComparer.Ordinal); - - foreach (var image in request.Images) + var evaluationTags = new KeyValuePair[] { - var metadata = await ResolveImageMetadataAsync(image, cancellationToken).ConfigureAwait(false); - var decision = BuildDecision(metadata, snapshot, policyDigest); - results[image] = decision; + new("policy_revision", policyRevision ?? "none"), + new("namespace", request.Namespace ?? 
"unspecified") + }; + + try + { + var evaluated = new HashSet(StringComparer.Ordinal); + foreach (var image in request.Images) + { + if (!evaluated.Add(image)) + { + continue; + } + + var metadata = await ResolveImageMetadataAsync(image, cancellationToken).ConfigureAwait(false); + var (findings, heuristicReasons) = BuildFindings(image, metadata, request.Namespace); + if (snapshot is null) + { + heuristicReasons.Add("policy.snapshot.missing"); + } + + ImmutableArray projectedVerdicts = ImmutableArray.Empty; + ImmutableArray issues = ImmutableArray.Empty; + + try + { + if (!findings.IsDefaultOrEmpty && findings.Length > 0) + { + var previewRequest = new PolicyPreviewRequest( + image, + findings, + ImmutableArray.Empty, + snapshot, + ProposedPolicy: null); + + var preview = await _policyPreviewService.PreviewAsync(previewRequest, cancellationToken).ConfigureAwait(false); + issues = preview.Issues; + if (!preview.Diffs.IsDefaultOrEmpty) + { + projectedVerdicts = preview.Diffs.Select(diff => diff.Projected).ToImmutableArray(); + } + } + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + _logger.LogWarning(ex, "Runtime policy preview failed for image {ImageDigest}; falling back to heuristic evaluation.", image); + } + + var decision = await BuildDecisionAsync( + image, + metadata, + heuristicReasons, + projectedVerdicts, + issues, + policyDigest, + cancellationToken).ConfigureAwait(false); + + results[image] = decision; + + _logger.LogInformation("Runtime policy evaluated image {ImageDigest} with verdict {Verdict} (Signed: {Signed}, HasSbom: {HasSbom}, Reasons: {ReasonsCount})", + image, + decision.PolicyVerdict, + decision.Signed, + decision.HasSbomReferrers, + decision.Reasons.Count); + } + } + finally + { + stopwatch.Stop(); + PolicyEvaluationLatencyMs.Record(stopwatch.Elapsed.TotalMilliseconds, evaluationTags); } - return new RuntimePolicyEvaluationResult( + PolicyEvaluations.Add(results.Count, evaluationTags); + + var evaluationResult = 
new RuntimePolicyEvaluationResult( ttlSeconds, expiresAt, policyRevision, new ReadOnlyDictionary(results)); + + return evaluationResult; } private async Task ResolveImageMetadataAsync(string imageDigest, CancellationToken cancellationToken) @@ -106,82 +203,268 @@ internal sealed class RuntimePolicyService : IRuntimePolicyService return new RuntimeImageMetadata(imageDigest, signed, hasSbom, rekor, MissingMetadata: false); } - private static RuntimePolicyImageDecision BuildDecision(RuntimeImageMetadata metadata, PolicySnapshot? snapshot, string? policyDigest) + private (ImmutableArray Findings, List HeuristicReasons) BuildFindings(string imageDigest, RuntimeImageMetadata metadata, string? @namespace) { - var reasons = new List(); + var findings = ImmutableArray.CreateBuilder(); + var heuristics = new List(); + + findings.Add(PolicyFinding.Create( + $"{imageDigest}#baseline", + PolicySeverity.None, + environment: @namespace, + source: "scanner.runtime")); if (metadata.MissingMetadata) { - reasons.Add("image.metadata.missing"); + const string reason = "image.metadata.missing"; + heuristics.Add(reason); + findings.Add(PolicyFinding.Create( + $"{imageDigest}#metadata", + PolicySeverity.Critical, + environment: @namespace, + source: "scanner.runtime", + tags: ImmutableArray.Create(reason))); } if (!metadata.Signed) { - reasons.Add("unsigned"); + const string reason = "unsigned"; + heuristics.Add(reason); + findings.Add(PolicyFinding.Create( + $"{imageDigest}#signature", + PolicySeverity.High, + environment: @namespace, + source: "scanner.runtime", + tags: ImmutableArray.Create(reason))); } if (!metadata.HasSbomReferrers) { - reasons.Add("missing SBOM"); + const string reason = "missing SBOM"; + heuristics.Add(reason); + findings.Add(PolicyFinding.Create( + $"{imageDigest}#sbom", + PolicySeverity.High, + environment: @namespace, + source: "scanner.runtime", + tags: ImmutableArray.Create(reason))); } - if (snapshot is null) - { - reasons.Add("policy.snapshot.missing"); - } 
+ return (findings.ToImmutable(), heuristics); + } - string verdict; - if (snapshot is null) - { - verdict = "unknown"; - } - else if (reasons.Count == 0) - { - verdict = "pass"; - } - else if (metadata.Signed && metadata.HasSbomReferrers) - { - verdict = "warn"; - } - else - { - verdict = "fail"; - } + private async Task BuildDecisionAsync( + string imageDigest, + RuntimeImageMetadata metadata, + List heuristicReasons, + ImmutableArray projectedVerdicts, + ImmutableArray issues, + string? policyDigest, + CancellationToken cancellationToken) + { + var reasons = new List(heuristicReasons); - RuntimePolicyRekorReference? rekor = metadata.Rekor; + var overallVerdict = MapVerdict(projectedVerdicts, heuristicReasons); - IDictionary? metadataPayload = null; - if (!string.IsNullOrWhiteSpace(policyDigest) || metadata.MissingMetadata) + if (!projectedVerdicts.IsDefaultOrEmpty) { - metadataPayload = new Dictionary(StringComparer.OrdinalIgnoreCase) + foreach (var verdict in projectedVerdicts) { - ["source"] = "scanner.runtime.placeholder" - }; + if (verdict.Status == CanonicalPolicyVerdictStatus.Pass) + { + continue; + } - if (!string.IsNullOrWhiteSpace(policyDigest)) - { - metadataPayload["policyDigest"] = policyDigest; - } - - if (metadata.MissingMetadata) - { - metadataPayload["artifactLinks"] = 0; + if (!string.IsNullOrWhiteSpace(verdict.RuleName)) + { + reasons.Add($"policy.rule.{verdict.RuleName}"); + } + else + { + reasons.Add($"policy.status.{verdict.Status.ToString().ToLowerInvariant()}"); + } } } + var confidence = ComputeConfidence(projectedVerdicts, overallVerdict); + var quieted = !projectedVerdicts.IsDefaultOrEmpty && projectedVerdicts.Any(v => v.Quiet); + var quietedBy = !projectedVerdicts.IsDefaultOrEmpty + ? 
projectedVerdicts.FirstOrDefault(v => !string.IsNullOrWhiteSpace(v.QuietedBy))?.QuietedBy + : null; + + var metadataPayload = BuildMetadataPayload(heuristicReasons, projectedVerdicts, issues, policyDigest); + + var rekor = metadata.Rekor; + var verified = await _attestationVerifier.VerifyAsync(imageDigest, metadata.Rekor, cancellationToken).ConfigureAwait(false); + if (rekor is not null && verified.HasValue) + { + rekor = rekor with { Verified = verified.Value }; + } + + var normalizedReasons = reasons + .Where(reason => !string.IsNullOrWhiteSpace(reason)) + .Distinct(StringComparer.Ordinal) + .ToArray(); + return new RuntimePolicyImageDecision( - verdict, + overallVerdict, metadata.Signed, metadata.HasSbomReferrers, - reasons, + normalizedReasons, rekor, - metadataPayload); + metadataPayload, + confidence, + quieted, + quietedBy); + } + + private RuntimePolicyVerdict MapVerdict(ImmutableArray projectedVerdicts, IReadOnlyList heuristicReasons) + { + if (!projectedVerdicts.IsDefaultOrEmpty && projectedVerdicts.Length > 0) + { + var statuses = projectedVerdicts.Select(v => v.Status).ToArray(); + if (statuses.Any(status => status == CanonicalPolicyVerdictStatus.Blocked)) + { + return RuntimePolicyVerdict.Fail; + } + + if (statuses.Any(status => + status is CanonicalPolicyVerdictStatus.Warned + or CanonicalPolicyVerdictStatus.Deferred + or CanonicalPolicyVerdictStatus.Escalated + or CanonicalPolicyVerdictStatus.RequiresVex)) + { + return RuntimePolicyVerdict.Warn; + } + + return RuntimePolicyVerdict.Pass; + } + + if (heuristicReasons.Contains("image.metadata.missing", StringComparer.Ordinal) || + heuristicReasons.Contains("unsigned", StringComparer.Ordinal) || + heuristicReasons.Contains("missing SBOM", StringComparer.Ordinal)) + { + return RuntimePolicyVerdict.Fail; + } + + if (heuristicReasons.Contains("policy.snapshot.missing", StringComparer.Ordinal)) + { + return RuntimePolicyVerdict.Warn; + } + + return RuntimePolicyVerdict.Pass; + } + + private IDictionary? 
BuildMetadataPayload( + IReadOnlyList heuristics, + ImmutableArray projectedVerdicts, + ImmutableArray issues, + string? policyDigest) + { + var payload = new Dictionary(StringComparer.OrdinalIgnoreCase) + { + ["heuristics"] = heuristics, + ["evaluatedAt"] = _timeProvider.GetUtcNow().UtcDateTime + }; + + if (!string.IsNullOrWhiteSpace(policyDigest)) + { + payload["policyDigest"] = policyDigest; + } + + if (!issues.IsDefaultOrEmpty && issues.Length > 0) + { + payload["issues"] = issues.Select(issue => new + { + code = issue.Code, + severity = issue.Severity.ToString(), + message = issue.Message, + path = issue.Path + }).ToArray(); + } + + if (!projectedVerdicts.IsDefaultOrEmpty && projectedVerdicts.Length > 0) + { + payload["findings"] = projectedVerdicts.Select(verdict => new + { + id = verdict.FindingId, + status = verdict.Status.ToString().ToLowerInvariant(), + rule = verdict.RuleName, + action = verdict.RuleAction, + score = verdict.Score, + quiet = verdict.Quiet, + quietedBy = verdict.QuietedBy, + inputs = verdict.GetInputs(), + confidence = verdict.UnknownConfidence, + confidenceBand = verdict.ConfidenceBand, + sourceTrust = verdict.SourceTrust, + reachability = verdict.Reachability + }).ToArray(); + } + + return payload.Count == 0 ? null : payload; + } + + private static double ComputeConfidence(ImmutableArray projectedVerdicts, RuntimePolicyVerdict overall) + { + if (!projectedVerdicts.IsDefaultOrEmpty && projectedVerdicts.Length > 0) + { + var confidences = projectedVerdicts + .Select(v => v.UnknownConfidence) + .Where(value => value.HasValue) + .Select(value => value!.Value) + .ToArray(); + + if (confidences.Length > 0) + { + return Math.Clamp(confidences.Average(), 0.0, 1.0); + } + } + + return overall switch + { + RuntimePolicyVerdict.Pass => 0.95, + RuntimePolicyVerdict.Warn => 0.5, + RuntimePolicyVerdict.Fail => 0.1, + _ => 0.25 + }; } private static string? Normalize(string? value) => string.IsNullOrWhiteSpace(value) ? 
null : value; } +internal interface IRuntimeAttestationVerifier +{ + ValueTask VerifyAsync(string imageDigest, RuntimePolicyRekorReference? rekor, CancellationToken cancellationToken); +} + +internal sealed class RuntimeAttestationVerifier : IRuntimeAttestationVerifier +{ + private readonly ILogger _logger; + + public RuntimeAttestationVerifier(ILogger logger) + { + _logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + public ValueTask VerifyAsync(string imageDigest, RuntimePolicyRekorReference? rekor, CancellationToken cancellationToken) + { + if (rekor is null) + { + return ValueTask.FromResult(null); + } + + if (rekor.Verified.HasValue) + { + return ValueTask.FromResult(rekor.Verified); + } + + _logger.LogDebug("No attestation verification metadata available for image {ImageDigest}.", imageDigest); + return ValueTask.FromResult(null); + } +} + internal sealed record RuntimePolicyEvaluationRequest( string? Namespace, IReadOnlyDictionary Labels, @@ -194,12 +477,15 @@ internal sealed record RuntimePolicyEvaluationResult( IReadOnlyDictionary Results); internal sealed record RuntimePolicyImageDecision( - string PolicyVerdict, + RuntimePolicyVerdict PolicyVerdict, bool Signed, bool HasSbomReferrers, IReadOnlyList Reasons, RuntimePolicyRekorReference? Rekor, - IDictionary? Metadata); + IDictionary? Metadata, + double Confidence, + bool Quieted, + string? QuietedBy); internal sealed record RuntimePolicyRekorReference(string? Uuid, string? Url, bool? Verified); diff --git a/src/StellaOps.Scanner.WebService/TASKS.md b/src/StellaOps.Scanner.WebService/TASKS.md index 3769f789..695df795 100644 --- a/src/StellaOps.Scanner.WebService/TASKS.md +++ b/src/StellaOps.Scanner.WebService/TASKS.md @@ -11,13 +11,13 @@ | SCANNER-POLICY-09-107 | DONE (2025-10-19) | Scanner WebService Guild | POLICY-CORE-09-005, SCANNER-POLICY-09-106 | Surface score inputs, config version, and `quietedBy` provenance in `/reports` response and signed payload; document schema changes. 
| `/reports` JSON + DSSE contain score, reachability, sourceTrust, confidenceBand, quiet provenance; contract tests updated; docs refreshed. | | SCANNER-WEB-10-201 | DONE (2025-10-19) | Scanner WebService Guild | SCANNER-CACHE-10-101 | Register scanner cache services and maintenance loop within WebService host. | `AddScannerCache` wired for configuration binding; maintenance service skips when disabled; project references updated. | | SCANNER-RUNTIME-12-301 | DONE (2025-10-20) | Scanner WebService Guild | ZASTAVA-CORE-12-201 | Implement `/runtime/events` ingestion endpoint with validation, batching, and storage hooks per Zastava contract. | Observer fixtures POST events, data persisted and acked; invalid payloads rejected with deterministic errors. | -| SCANNER-RUNTIME-12-302 | DOING (2025-10-20) | Scanner WebService Guild | SCANNER-RUNTIME-12-301, ZASTAVA-CORE-12-201 | Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. Coordinate with CLI (`CLI-RUNTIME-13-008`) before GA to lock response field names/metadata. | Webhook integration test passes; responses include verdict, TTL, reasons; metrics/logging added; CLI contract review signed off. | -| SCANNER-RUNTIME-12-303 | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | Replace `/policy/runtime` heuristic with canonical policy evaluation (Feedser/Vexer inputs, PolicyPreviewService) so results align with `/reports`. | Runtime policy endpoint returns canonical verdicts + metadata, tests cover pass/warn/fail cases, docs/CLI updated. | -| SCANNER-RUNTIME-12-304 | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | Surface attestation verification status by integrating Authority/Attestor Rekor validation (beyond presence-only). | Response `rekor.verified` reflects attestor outcome; integration test covers verified/unverified paths; docs updated. 
| -| SCANNER-RUNTIME-12-305 | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-301, SCANNER-RUNTIME-12-302 | Promote shared fixtures with Zastava/CLI and add end-to-end automation for `/runtime/events` + `/policy/runtime`. | Fixture suite replayed in CI, cross-team sign-off recorded, documentation references test harness. | +| SCANNER-RUNTIME-12-302 | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-301, ZASTAVA-CORE-12-201 | Implement `/policy/runtime` endpoint joining SBOM baseline + policy verdict, returning admission guidance. Coordinate with CLI (`CLI-RUNTIME-13-008`) before GA to lock response field names/metadata. | Webhook integration test passes; responses include verdict, TTL, reasons; metrics/logging added; CLI contract review signed off. | +| SCANNER-RUNTIME-12-303 | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | Replace `/policy/runtime` heuristic with canonical policy evaluation (Feedser/Vexer inputs, PolicyPreviewService) so results align with `/reports`. | Runtime policy endpoint now pipes findings through `PolicyPreviewService`, emits canonical verdicts/confidence/quiet metadata, and updated tests cover pass/warn/fail paths + CLI contract fixtures. | +| SCANNER-RUNTIME-12-304 | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-302 | Surface attestation verification status by integrating Authority/Attestor Rekor validation (beyond presence-only). | `/policy/runtime` maps Rekor UUIDs through the runtime attestation verifier so `rekor.verified` reflects attestor outcomes; webhook/CLI coverage added. | +| SCANNER-RUNTIME-12-305 | DONE (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-301, SCANNER-RUNTIME-12-302 | Promote shared fixtures with Zastava/CLI and add end-to-end automation for `/runtime/events` + `/policy/runtime`. | Runtime policy integration test + CLI-aligned fixture assert confidence, metadata JSON, and Rekor verification; docs note shared contract. 
| | SCANNER-EVENTS-15-201 | DONE (2025-10-20) | Scanner WebService Guild | NOTIFY-QUEUE-15-401 | Emit `scanner.report.ready` and `scanner.scan.completed` events (bus adapters + tests). | Event envelopes published to queue with schemas; fixtures committed; Notify consumption test passes. | | SCANNER-EVENTS-16-301 | BLOCKED (2025-10-20) | Scanner WebService Guild | NOTIFY-QUEUE-15-401 | Integrate Redis publisher end-to-end once Notify queue abstraction ships; replace in-memory recorder with real stream assertions. | Notify Queue adapter available; integration test exercises Redis stream length/fields via test harness; docs updated with ops validation checklist. | -| SCANNER-RUNTIME-17-401 | TODO | Scanner WebService Guild | SCANNER-RUNTIME-12-301, ZASTAVA-OBS-17-005, SCANNER-EMIT-17-701, POLICY-RUNTIME-17-201 | Persist runtime build-id observations and expose them via `/runtime/events` + policy joins for debug-symbol correlation. | Mongo schema stores optional `buildId`, API/SDK responses document field, integration test resolves debug-store path using stored build-id, docs updated accordingly. | +| SCANNER-RUNTIME-17-401 | DOING (2025-10-24) | Scanner WebService Guild | SCANNER-RUNTIME-12-301, ZASTAVA-OBS-17-005, SCANNER-EMIT-17-701, POLICY-RUNTIME-17-201 | Persist runtime build-id observations and expose them via `/runtime/events` + policy joins for debug-symbol correlation. | Mongo schema stores optional `buildId`, API/SDK responses document field, integration test resolves debug-store path using stored build-id, docs updated accordingly. | ## Notes - 2025-10-19: Sprint 9 streaming + policy endpoints (SCANNER-WEB-09-103, SCANNER-POLICY-09-105/106/107) landed with SSE/JSONL, OpenAPI, signed report coverage documented in `docs/09_API_CLI_REFERENCE.md`. 
@@ -25,3 +25,4 @@ - 2025-10-20: SCANNER-RUNTIME-12-301 underway – `/runtime/events` ingest hitting Mongo with TTL + token-bucket rate limiting; integration tests (`RuntimeEndpointsTests`) green and docs updated with batch contract. - 2025-10-20: Follow-ups SCANNER-RUNTIME-12-303/304/305 track canonical verdict integration, attestation verification, and cross-guild fixture validation for runtime APIs. - 2025-10-21: Hardened progress streaming determinism by sorting `data` payload keys within `ScanProgressStream`; added regression `ProgressStreamDataKeysAreSortedDeterministically` ensuring JSONL ordering. +- 2025-10-24: `/policy/runtime` now streams through PolicyPreviewService + attestation verifier; CLI and webhook fixtures updated alongside Zastava observer batching completion. diff --git a/src/StellaOps.Zastava.Core/Contracts/RuntimeEvent.cs b/src/StellaOps.Zastava.Core/Contracts/RuntimeEvent.cs index 39d65683..29a16b27 100644 --- a/src/StellaOps.Zastava.Core/Contracts/RuntimeEvent.cs +++ b/src/StellaOps.Zastava.Core/Contracts/RuntimeEvent.cs @@ -109,15 +109,17 @@ public sealed record class RuntimeWorkloadOwner public string? Name { get; init; } } -public sealed record class RuntimeProcess -{ - public int Pid { get; init; } - - public IReadOnlyList Entrypoint { get; init; } = Array.Empty(); - - [JsonPropertyName("entryTrace")] - public IReadOnlyList EntryTrace { get; init; } = Array.Empty(); -} +public sealed record class RuntimeProcess +{ + public int Pid { get; init; } + + public IReadOnlyList Entrypoint { get; init; } = Array.Empty(); + + [JsonPropertyName("entryTrace")] + public IReadOnlyList EntryTrace { get; init; } = Array.Empty(); + + public string? 
BuildId { get; init; } +} public sealed record class RuntimeEntryTrace { diff --git a/src/StellaOps.Zastava.Core/Serialization/ZastavaCanonicalJsonSerializer.cs b/src/StellaOps.Zastava.Core/Serialization/ZastavaCanonicalJsonSerializer.cs index 76f4b297..aa74c707 100644 --- a/src/StellaOps.Zastava.Core/Serialization/ZastavaCanonicalJsonSerializer.cs +++ b/src/StellaOps.Zastava.Core/Serialization/ZastavaCanonicalJsonSerializer.cs @@ -24,7 +24,7 @@ public static class ZastavaCanonicalJsonSerializer { typeof(RuntimeEngine), new[] { "engine", "version" } }, { typeof(RuntimeWorkload), new[] { "platform", "namespace", "pod", "container", "containerId", "imageRef", "owner" } }, { typeof(RuntimeWorkloadOwner), new[] { "kind", "name" } }, - { typeof(RuntimeProcess), new[] { "pid", "entrypoint", "entryTrace" } }, + { typeof(RuntimeProcess), new[] { "pid", "entrypoint", "entryTrace", "buildId" } }, { typeof(RuntimeEntryTrace), new[] { "file", "line", "op", "target" } }, { typeof(RuntimeLoadedLibrary), new[] { "path", "inode", "sha256" } }, { typeof(RuntimePosture), new[] { "imageSigned", "sbomReferrer", "attestation" } }, diff --git a/src/StellaOps.Zastava.Observer.Tests/ContainerRuntimePollerTests.cs b/src/StellaOps.Zastava.Observer.Tests/ContainerRuntimePollerTests.cs new file mode 100644 index 00000000..83eaf1fe --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/ContainerRuntimePollerTests.cs @@ -0,0 +1,282 @@ +using System; +using System.Collections.Generic; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Time.Testing; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Posture; +using StellaOps.Zastava.Observer.Worker; +using StellaOps.Zastava.Observer.Cri; + +namespace StellaOps.Zastava.Observer.Tests; + +public sealed class ContainerRuntimePollerTests +{ + 
[Fact] + public async Task PollAsync_ProducesStartEvents_InStableOrder() + { + var timeProvider = new FakeTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var tracker = new ContainerStateTracker(); + var client = new StubCriRuntimeClient(); + + var containerA = CreateContainer("container-a", "pod-a", timeProvider.GetUtcNow().AddSeconds(5)); + var containerB = CreateContainer("container-b", "pod-b", timeProvider.GetUtcNow().AddSeconds(10)); + + client.EnqueueList(containerA, containerB); + + var endpoint = new ContainerRuntimeEndpointOptions + { + Engine = ContainerRuntimeEngine.Containerd, + Endpoint = "unix:///run/containerd/containerd.sock" + }; + + var identity = new CriRuntimeIdentity("containerd", "1.7.19", "1.6.0"); + var poller = new ContainerRuntimePoller(NullLogger.Instance); + + var envelopes = await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant: "tenant-alpha", + nodeName: "node-01", + timeProvider, + processCollector: null, + CancellationToken.None); + + Assert.Equal(2, envelopes.Count); + Assert.Collection( + envelopes, + first => + { + Assert.Equal(RuntimeEventKind.ContainerStart, first.Event.Kind); + Assert.Equal("containerd://container-a", first.Event.Workload.ContainerId); + }, + second => + { + Assert.Equal(RuntimeEventKind.ContainerStart, second.Event.Kind); + Assert.Equal("containerd://container-b", second.Event.Workload.ContainerId); + }); + Assert.True(envelopes[0].Event.When <= envelopes[1].Event.When); + + // A subsequent poll that lists no containers should emit a stop event for each tracked container.
+ client.EnqueueList(Array.Empty()); + var secondPass = await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant: "tenant-alpha", + nodeName: "node-01", + timeProvider, + processCollector: null, + CancellationToken.None); + Assert.Equal(2, secondPass.Count); + Assert.All(secondPass, evt => Assert.Equal(RuntimeEventKind.ContainerStop, evt.Event.Kind)); + } + + [Fact] + public async Task PollAsync_EmitsStopEvent_WhenContainerMissing() + { + var timeProvider = new FakeTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var tracker = new ContainerStateTracker(); + var client = new StubCriRuntimeClient(); + var endpoint = new ContainerRuntimeEndpointOptions + { + Engine = ContainerRuntimeEngine.Containerd, + Endpoint = "unix:///run/containerd/containerd.sock" + }; + var identity = new CriRuntimeIdentity("containerd", "1.7.19", "1.6.0"); + var poller = new ContainerRuntimePoller(NullLogger.Instance); + + var container = CreateContainer("container-c", "pod-c", timeProvider.GetUtcNow().AddSeconds(2)); + client.EnqueueList(container); + await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant: "tenant-alpha", + nodeName: "node-02", + timeProvider, + processCollector: null, + CancellationToken.None); + + var finished = container with { FinishedAt = timeProvider.GetUtcNow().AddSeconds(30), ExitCode = 0 }; + client.EnqueueStatus(container.Id, finished); + client.EnqueueList(Array.Empty()); + timeProvider.Advance(TimeSpan.FromSeconds(30)); + + var stopEvents = await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant: "tenant-alpha", + nodeName: "node-02", + timeProvider, + processCollector: null, + CancellationToken.None); + + var stop = Assert.Single(stopEvents); + Assert.Equal(RuntimeEventKind.ContainerStop, stop.Event.Kind); + Assert.Equal(finished.FinishedAt, stop.Event.When); + } + + [Fact] + public async Task PollAsync_IncludesPostureInformation() + { + var timeProvider = new 
FakeTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var tracker = new ContainerStateTracker(); + var client = new StubCriRuntimeClient(); + var endpoint = new ContainerRuntimeEndpointOptions + { + Engine = ContainerRuntimeEngine.Containerd, + Endpoint = "unix:///run/containerd/containerd.sock" + }; + var identity = new CriRuntimeIdentity("containerd", "1.7.19", "1.6.0"); + var posture = new RuntimePosture + { + ImageSigned = true, + SbomReferrer = "present", + Attestation = new RuntimeAttestation + { + Uuid = "rekor-1", + Verified = true + } + }; + var postureEvaluator = new StubPostureEvaluator(posture); + var poller = new ContainerRuntimePoller(NullLogger.Instance, postureEvaluator); + + var container = CreateContainer("container-d", "pod-d", timeProvider.GetUtcNow().AddSeconds(2)); + client.EnqueueList(container); + + var envelopes = await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant: "tenant-beta", + nodeName: "node-03", + timeProvider, + processCollector: null, + CancellationToken.None); + + var runtimeEvent = Assert.Single(envelopes).Event; + Assert.NotNull(runtimeEvent.Posture); + Assert.True(runtimeEvent.Posture!.ImageSigned); + Assert.Equal("present", runtimeEvent.Posture.SbomReferrer); + Assert.Contains(runtimeEvent.Evidence, e => e.Signal.StartsWith("runtime.posture", StringComparison.Ordinal)); + } + + [Fact] + public void BackoffCalculator_ComputesDelayWithinBounds() + { + var options = new ObserverBackoffOptions + { + Initial = TimeSpan.FromSeconds(1), + Max = TimeSpan.FromSeconds(30), + JitterRatio = 0.25 + }; + + var random = new Random(1234); + var delay = BackoffCalculator.ComputeDelay(options, attempt: 3, random); + + Assert.InRange(delay, TimeSpan.FromSeconds(1), options.Max); + + var expectedBase = TimeSpan.FromSeconds(4); // initial * 2^(attempt-1) + Assert.InRange(delay.TotalMilliseconds, expectedBase.TotalMilliseconds * 0.75, expectedBase.TotalMilliseconds * 1.25); + } + + private static 
CriContainerInfo CreateContainer(string id, string podName, DateTimeOffset startedAt) + { + var labels = new Dictionary(StringComparer.Ordinal) + { + [CriLabelKeys.PodName] = podName, + [CriLabelKeys.PodNamespace] = "default", + [CriLabelKeys.ContainerName] = $"{podName}-container" + }; + + return new CriContainerInfo( + Id: id, + PodSandboxId: $"{podName}-sandbox", + Name: $"{podName}-container", + Attempt: 1, + Image: "ghcr.io/example/app:1.0.0", + ImageRef: $"ghcr.io/example/app@sha256:{id}", + Labels: labels, + Annotations: new Dictionary(StringComparer.Ordinal), + CreatedAt: startedAt.AddSeconds(-5), + StartedAt: startedAt, + FinishedAt: null, + ExitCode: null, + Reason: null, + Message: null, + Pid: null); + } + + private sealed class StubCriRuntimeClient : ICriRuntimeClient + { + private readonly Queue> listResponses = new(); + private readonly Dictionary status = new(StringComparer.Ordinal); + + public ContainerRuntimeEndpointOptions Endpoint => new(); + + public void EnqueueList(params CriContainerInfo[] containers) + => listResponses.Enqueue(containers); + + public void EnqueueList(IReadOnlyList containers) + => listResponses.Enqueue(containers); + + public void EnqueueStatus(string containerId, CriContainerInfo snapshot) + => status[containerId] = snapshot; + + public ValueTask DisposeAsync() => ValueTask.CompletedTask; + + public Task GetIdentityAsync(CancellationToken cancellationToken) + => Task.FromResult(new CriRuntimeIdentity("containerd", "1.7.19", "1.6.0")); + + public Task> ListContainersAsync(ContainerState state, CancellationToken cancellationToken) + { + if (listResponses.Count == 0) + { + return Task.FromResult>(Array.Empty()); + } + + return Task.FromResult(listResponses.Dequeue()); + } + + public Task GetContainerStatusAsync(string containerId, CancellationToken cancellationToken) + { + if (status.TryGetValue(containerId, out var info)) + { + return Task.FromResult(info); + } + + return Task.FromResult(null); + } + } + + private sealed 
class StubPostureEvaluator : IRuntimePostureEvaluator + { + private readonly RuntimePostureEvaluationResult result; + + public StubPostureEvaluator(RuntimePosture posture) + { + var evidence = new[] + { + new RuntimeEvidence + { + Signal = "runtime.posture.source", + Value = "stub" + } + }; + result = new RuntimePostureEvaluationResult(posture, evidence); + } + + public Task EvaluateAsync(CriContainerInfo container, CancellationToken cancellationToken) + => Task.FromResult(result); + } +} diff --git a/src/StellaOps.Zastava.Observer.Tests/Posture/RuntimePostureEvaluatorTests.cs b/src/StellaOps.Zastava.Observer.Tests/Posture/RuntimePostureEvaluatorTests.cs new file mode 100644 index 00000000..03b0fcb7 --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/Posture/RuntimePostureEvaluatorTests.cs @@ -0,0 +1,191 @@ +using System; +using System.Collections.Generic; +using System.IO; +using System.Linq; +using System.Threading.Tasks; +using System.Threading; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using Microsoft.Extensions.Time.Testing; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Backend; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Posture; +using Xunit; + +namespace StellaOps.Zastava.Observer.Tests.Posture; + +public sealed class RuntimePostureEvaluatorTests +{ + [Fact] + public async Task EvaluateAsync_BacksOffToBackendAndCachesEntry() + { + var timeProvider = new FakeTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var cache = new StubPostureCache(); + var options = CreateOptions(); + var client = new StubPolicyClient(image => + { + var result = new RuntimePolicyImageResult + { + Signed = true, + HasSbomReferrers = true, + Rekor = new RuntimePolicyRekorResult + { + Uuid = "rekor-123", + Verified = true + } + }; + + return new RuntimePolicyResponse + { + TtlSeconds = 
600, + ExpiresAtUtc = timeProvider.GetUtcNow().AddMinutes(10), + Results = new Dictionary(StringComparer.Ordinal) + { + [image] = result + } + }; + }); + + var evaluator = new RuntimePostureEvaluator(client, cache, options, timeProvider, NullLogger.Instance); + var container = CreateContainerInfo(); + + var evaluation = await evaluator.EvaluateAsync(container, CancellationToken.None); + Assert.NotNull(evaluation.Posture); + Assert.True(evaluation.Posture!.ImageSigned); + Assert.Equal("present", evaluation.Posture.SbomReferrer); + Assert.Contains(evaluation.Evidence, e => e.Signal == "runtime.posture.source" && e.Value == "backend"); + + var cached = cache.Get(container.ImageRef!); + Assert.NotNull(cached); + } + + [Fact] + public async Task EvaluateAsync_UsesCacheWhenBackendFails() + { + var timeProvider = new FakeTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var cache = new StubPostureCache(); + var options = CreateOptions(); + var imageRef = "ghcr.io/example/app@sha256:deadbeef"; + var cachedPosture = new RuntimePosture + { + ImageSigned = false, + SbomReferrer = "missing" + }; + cache.Seed(imageRef, cachedPosture, timeProvider.GetUtcNow().AddMinutes(-1), timeProvider.GetUtcNow().AddMinutes(-10)); + + var client = new StubPolicyClient(_ => throw new InvalidOperationException("backend unavailable")); + var evaluator = new RuntimePostureEvaluator(client, cache, options, timeProvider, NullLogger.Instance); + var container = CreateContainerInfo(imageRef); + + var evaluation = await evaluator.EvaluateAsync(container, CancellationToken.None); + Assert.NotNull(evaluation.Posture); + Assert.False(evaluation.Posture!.ImageSigned); + Assert.Contains(evaluation.Evidence, e => e.Signal == "runtime.posture.cache"); + Assert.Contains(evaluation.Evidence, e => e.Signal == "runtime.posture.error"); + } + + private static CriContainerInfo CreateContainerInfo(string? 
imageRef = null) + { + var labels = new Dictionary(StringComparer.Ordinal) + { + [CriLabelKeys.PodNamespace] = "payments", + [CriLabelKeys.PodName] = "api-pod", + [CriLabelKeys.ContainerName] = "api" + }; + + return new CriContainerInfo( + Id: "container-a", + PodSandboxId: "sandbox-a", + Name: "api", + Attempt: 1, + Image: "ghcr.io/example/app:1.0.0", + ImageRef: imageRef ?? "ghcr.io/example/app@sha256:deadbeef", + Labels: labels, + Annotations: new Dictionary(StringComparer.Ordinal), + CreatedAt: DateTimeOffset.UtcNow, + StartedAt: DateTimeOffset.UtcNow, + FinishedAt: null, + ExitCode: null, + Reason: null, + Message: null, + Pid: 1234); + } + + private static TestOptionsMonitor CreateOptions() + { + var options = new ZastavaObserverOptions + { + Posture = new ZastavaObserverPostureOptions + { + CachePath = Path.Combine(Path.GetTempPath(), "zastava-observer-tests", Guid.NewGuid().ToString("N"), "posture-cache.json"), + FallbackTtlSeconds = 300, + StaleWarningThresholdSeconds = 600 + } + }; + + return new TestOptionsMonitor(options); + } + + private sealed class StubPolicyClient : IRuntimePolicyClient + { + private readonly Func factory; + + public StubPolicyClient(Func factory) + { + this.factory = factory; + } + + public Task EvaluateAsync(RuntimePolicyRequest request, CancellationToken cancellationToken = default) + { + var image = request.Images.First(); + return Task.FromResult(factory(image)); + } + } + + private sealed class StubPostureCache : IRuntimePostureCache + { + private readonly Dictionary entries = new(StringComparer.Ordinal); + + public RuntimePostureCacheEntry? 
Get(string key) + { + entries.TryGetValue(key, out var entry); + return entry; + } + + public void Seed(string key, RuntimePosture posture, DateTimeOffset expiresAt, DateTimeOffset storedAt) + { + entries[key] = new RuntimePostureCacheEntry(posture, expiresAt, storedAt); + } + + public void Set(string key, RuntimePosture posture, DateTimeOffset expiresAtUtc, DateTimeOffset storedAtUtc) + { + entries[key] = new RuntimePostureCacheEntry(posture, expiresAtUtc, storedAtUtc); + } + } + + private sealed class TestOptionsMonitor : IOptionsMonitor + { + private readonly T value; + + public TestOptionsMonitor(T value) + { + this.value = value; + } + + public T CurrentValue => value; + + public T Get(string? name) => value; + + public IDisposable OnChange(Action listener) => NullDisposable.Instance; + + private sealed class NullDisposable : IDisposable + { + public static readonly NullDisposable Instance = new(); + public void Dispose() + { + } + } + } +} diff --git a/src/StellaOps.Zastava.Observer.Tests/Runtime/ElfBuildIdReaderTests.cs b/src/StellaOps.Zastava.Observer.Tests/Runtime/ElfBuildIdReaderTests.cs new file mode 100644 index 00000000..f81dd1f3 --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/Runtime/ElfBuildIdReaderTests.cs @@ -0,0 +1,60 @@ +using System.Linq; +using StellaOps.Zastava.Observer.Runtime; +using StellaOps.Zastava.Observer.Tests.TestSupport; +using Xunit; + +namespace StellaOps.Zastava.Observer.Tests.Runtime; + +public sealed class ElfBuildIdReaderTests +{ + [Fact] + public async Task TryReadBuildIdAsync_ReturnsExpectedHex() + { + using var temp = new TempDirectory(); + var elfPath = Path.Combine(temp.RootPath, "bin", "example"); + var buildIdBytes = Enumerable.Range(0, 20).Select(static index => (byte)(index + 1)).ToArray(); + ElfTestFileBuilder.CreateElfWithBuildId(elfPath, buildIdBytes); + + var buildId = await ElfBuildIdReader.TryReadBuildIdAsync(elfPath, CancellationToken.None); + + 
Assert.Equal(Convert.ToHexString(buildIdBytes).ToLowerInvariant(), buildId); + } + + [Fact] + public async Task TryReadBuildIdAsync_InvalidFileReturnsNull() + { + using var temp = new TempDirectory(); + var path = Path.Combine(temp.RootPath, "bin", "invalid"); + Directory.CreateDirectory(Path.GetDirectoryName(path)!); + await File.WriteAllTextAsync(path, "not-an-elf"); + + var buildId = await ElfBuildIdReader.TryReadBuildIdAsync(path, CancellationToken.None); + + Assert.Null(buildId); + } + + private sealed class TempDirectory : IDisposable + { + public TempDirectory() + { + RootPath = Path.Combine(Path.GetTempPath(), "elf-buildid-tests", Guid.NewGuid().ToString("N")); + Directory.CreateDirectory(RootPath); + } + + public string RootPath { get; } + + public void Dispose() + { + try + { + if (Directory.Exists(RootPath)) + { + Directory.Delete(RootPath, recursive: true); + } + } + catch + { + } + } + } +} diff --git a/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeEventBufferTests.cs b/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeEventBufferTests.cs new file mode 100644 index 00000000..a72d295d --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeEventBufferTests.cs @@ -0,0 +1,218 @@ +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.Runtime; +using Xunit; + +namespace StellaOps.Zastava.Observer.Tests.Runtime; + +public sealed class RuntimeEventBufferTests +{ + [Fact] + public async Task WriteBatchAsync_PersistsAndAcksRemoveFiles() + { + using var temp = new TempDirectory(); + var options = Options.Create(new ZastavaObserverOptions + { + EventBufferPath = temp.CreateSubdirectory("buffer"), + MaxDiskBufferBytes = 1024 * 1024, + MaxInMemoryBuffer = 32, + PublishBatchSize = 8 + }); + + var buffer = new RuntimeEventBuffer(options, TimeProvider.System, NullLogger.Instance); + await 
buffer.WriteBatchAsync(new[] + { + CreateEnvelope("evt-1"), + CreateEnvelope("evt-2") + }, CancellationToken.None); + + var enumerator = buffer.ReadAllAsync(CancellationToken.None).GetAsyncEnumerator(); + var first = await ReadNextAsync(enumerator); + Assert.Equal("evt-1", first.Envelope.Event.EventId); + await first.CompleteAsync(); + + var second = await ReadNextAsync(enumerator); + Assert.Equal("evt-2", second.Envelope.Event.EventId); + await second.CompleteAsync(); + + Assert.Empty(Directory.GetFiles(options.Value.EventBufferPath)); + } + + [Fact] + public async Task ReadAllAsync_RestoresPendingEventsAfterRestart() + { + using var temp = new TempDirectory(); + var bufferPath = temp.CreateSubdirectory("buffer"); + var options = Options.Create(new ZastavaObserverOptions + { + EventBufferPath = bufferPath, + MaxDiskBufferBytes = 1024 * 1024, + MaxInMemoryBuffer = 16, + PublishBatchSize = 4 + }); + + var initial = new RuntimeEventBuffer(options, TimeProvider.System, NullLogger.Instance); + await initial.WriteBatchAsync(new[] + { + CreateEnvelope("evt-1"), + CreateEnvelope("evt-2"), + CreateEnvelope("evt-3") + }, CancellationToken.None); + + // Do not drain; instantiate a fresh buffer to simulate restart. 
+ var restored = new RuntimeEventBuffer(options, TimeProvider.System, NullLogger.Instance); + var restoredIds = new List(); + var enumerator = restored.ReadAllAsync(CancellationToken.None).GetAsyncEnumerator(); + + for (var i = 0; i < 3; i++) + { + var item = await ReadNextAsync(enumerator); + restoredIds.Add(item.Envelope.Event.EventId); + await item.CompleteAsync(); + } + + Assert.Contains("evt-1", restoredIds); + Assert.Contains("evt-3", restoredIds); + Assert.Empty(Directory.GetFiles(bufferPath)); + } + + [Fact] + public async Task WriteBatchAsync_EnforcesDiskCapacity() + { + using var temp = new TempDirectory(); + var bufferPath = temp.CreateSubdirectory("buffer"); + var options = Options.Create(new ZastavaObserverOptions + { + EventBufferPath = bufferPath, + MaxDiskBufferBytes = 4096, // small cap to force eviction + MaxInMemoryBuffer = 16, + PublishBatchSize = 4 + }); + + var buffer = new RuntimeEventBuffer(options, TimeProvider.System, NullLogger.Instance); + + for (var i = 0; i < 5; i++) + { + var envelope = CreateEnvelope($"evt-{i}", annotationSize: 2048); + await buffer.WriteBatchAsync(new[] { envelope }, CancellationToken.None); + } + + // Rehydrate to read what remained after capacity enforcement. + var restored = new RuntimeEventBuffer(options, TimeProvider.System, NullLogger.Instance); + var enumerator = restored.ReadAllAsync(CancellationToken.None).GetAsyncEnumerator(); + var ids = new List(); + + while (true) + { + RuntimeEventBufferItem item; + try + { + var hasNext = await enumerator.MoveNextAsync().AsTask().WaitAsync(TimeSpan.FromMilliseconds(200)); + if (!hasNext) + { + break; + } + + item = enumerator.Current; + } + catch (TimeoutException) + { + break; + } + + ids.Add(item.Envelope.Event.EventId); + await item.CompleteAsync(); + } + + // Oldest events should have been dropped; check that the disk cap held and the newest event survived.
+ var totalBytes = Directory.GetFiles(bufferPath) + .Select(path => new FileInfo(path).Length) + .Sum(); + + Assert.True(totalBytes <= options.Value.MaxDiskBufferBytes, "Runtime event buffer exceeded configured capacity."); + Assert.True(ids.Count > 0, "Expected at least one runtime event to remain buffered."); + Assert.True(ids.Contains("evt-4"), "Most recent event should remain in buffer."); + } + + private static RuntimeEventEnvelope CreateEnvelope(string id, int annotationSize = 0) + { + var annotations = annotationSize > 0 + ? new Dictionary { ["blob"] = new string('x', annotationSize) } + : null; + + var runtimeEvent = new RuntimeEvent + { + EventId = id, + When = DateTimeOffset.UtcNow, + Kind = RuntimeEventKind.ContainerStart, + Tenant = "tenant-a", + Node = "node-1", + Runtime = new RuntimeEngine + { + Engine = "containerd", + Version = "1.7.0" + }, + Workload = new RuntimeWorkload + { + Platform = "kubernetes", + Namespace = "default", + Pod = "pod-1", + Container = "app", + ContainerId = "containerd://abc", + ImageRef = "ghcr.io/example/app@sha256:deadbeef" + }, + Annotations = annotations + }; + + return RuntimeEventEnvelope.Create(runtimeEvent, ZastavaContractVersions.RuntimeEvent); + } + + private static async Task ReadNextAsync(IAsyncEnumerator enumerator) + { + try + { + var hasNext = await enumerator.MoveNextAsync().AsTask().WaitAsync(TimeSpan.FromSeconds(1)); + Assert.True(hasNext, "Expected runtime event to be available in buffer."); + } + catch (TimeoutException) + { + Assert.Fail("Timed out waiting for runtime event from buffer."); + } + + return enumerator.Current; + } + + private sealed class TempDirectory : IDisposable + { + public TempDirectory() + { + RootPath = Path.Combine(Path.GetTempPath(), "observer-buffer-tests", Guid.NewGuid().ToString("N")); + Directory.CreateDirectory(RootPath); + } + + public string RootPath { get; } + + public string CreateSubdirectory(string name) + { + var path = Path.Combine(RootPath, name); + 
Directory.CreateDirectory(path); + return path; + } + + public void Dispose() + { + try + { + if (Directory.Exists(RootPath)) + { + Directory.Delete(RootPath, recursive: true); + } + } + catch + { + } + } + } +} diff --git a/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeProcessCollectorTests.cs b/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeProcessCollectorTests.cs new file mode 100644 index 00000000..b26cde9b --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/Runtime/RuntimeProcessCollectorTests.cs @@ -0,0 +1,156 @@ +using System.Linq; +using System.Security.Cryptography; +using System.Text; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Runtime; +using StellaOps.Zastava.Observer.Tests.TestSupport; +using Xunit; + +namespace StellaOps.Zastava.Observer.Tests.Runtime; + +public sealed class RuntimeProcessCollectorTests +{ + [Fact] + public async Task CollectAsync_ParsesCmdlineAndLibraries() + { + using var temp = new TempDirectory(); + var procRoot = temp.CreateSubdirectory("proc"); + var pidDir = Path.Combine(procRoot, "1234"); + Directory.CreateDirectory(pidDir); + + var cmdlineContent = Encoding.UTF8.GetBytes("/bin/bash\0-c\0python /app/server.py\0"); + await File.WriteAllBytesAsync(Path.Combine(pidDir, "cmdline"), cmdlineContent); + + var libPath = Path.Combine(temp.RootPath, "libs", "libexample.so"); + Directory.CreateDirectory(Path.GetDirectoryName(libPath)!); + await File.WriteAllTextAsync(libPath, "library-bytes"); + + var buildIdBytes = Enumerable.Range(0, 20).Select(static index => (byte)(index + 1)).ToArray(); + var exePath = Path.Combine(pidDir, "exe"); + ElfTestFileBuilder.CreateElfWithBuildId(exePath, buildIdBytes); + + var mapsLine = $"7f6d8c900000-7f6d8ca00000 r-xp 00000000 00:00 0 {libPath}"; + await File.WriteAllTextAsync(Path.Combine(pidDir, "maps"), 
mapsLine + Environment.NewLine); + + var options = Options.Create(new ZastavaObserverOptions + { + ProcRootPath = procRoot, + MaxTrackedLibraries = 8, + MaxEntrypointArguments = 16, + MaxLibraryBytes = 1024 * 1024 + }); + + var collector = new RuntimeProcessCollector(options, NullLogger.Instance); + + var container = new CriContainerInfo( + Id: "container-1", + PodSandboxId: "sandbox", + Name: "example", + Attempt: 1, + Image: "ghcr.io/example/app:1.0", + ImageRef: "ghcr.io/example/app@sha256:deadbeef", + Labels: new Dictionary(StringComparer.Ordinal), + Annotations: new Dictionary(StringComparer.Ordinal), + CreatedAt: DateTimeOffset.UtcNow, + StartedAt: DateTimeOffset.UtcNow, + FinishedAt: null, + ExitCode: null, + Reason: null, + Message: null, + Pid: 1234); + + var capture = await collector.CollectAsync(container, CancellationToken.None); + Assert.NotNull(capture); + Assert.NotNull(capture!.Process); + Assert.Contains("/bin/bash", capture.Process.Entrypoint); + Assert.Contains(capture.Process.EntryTrace, trace => trace.Op == "shell" && trace.Target == "python /app/server.py"); + Assert.Contains(capture.Process.EntryTrace, trace => trace.Op == "python" && trace.Target == "/app/server.py"); + Assert.NotEmpty(capture.Libraries); + var expectedHash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes("library-bytes"))).ToLowerInvariant(); + Assert.Contains(capture.Libraries, lib => lib.Path == libPath && lib.Sha256 == expectedHash); + Assert.Contains(capture.Evidence, item => item.Signal == "procfs.maps" && item.Value == $"{libPath}@0x7f6d8c900000"); + Assert.Contains(capture.Evidence, item => item.Signal == "procfs.maps.count" && item.Value == "1"); + Assert.Contains(capture.Evidence, item => item.Signal == "procfs.cmdline"); + Assert.Equal(Convert.ToHexString(buildIdBytes).ToLowerInvariant(), capture.Process.BuildId); + Assert.Contains(capture.Evidence, item => item.Signal == "procfs.buildId"); + } + + [Fact] + public async Task 
CollectAsync_NodeEntrypointProducesTrace() + { + using var temp = new TempDirectory(); + var procRoot = temp.CreateSubdirectory("proc"); + var pidDir = Path.Combine(procRoot, "4321"); + Directory.CreateDirectory(pidDir); + + await File.WriteAllBytesAsync(Path.Combine(pidDir, "cmdline"), Encoding.UTF8.GetBytes("/usr/bin/node\0/app/index.js\0")); + await File.WriteAllTextAsync(Path.Combine(pidDir, "maps"), string.Empty); + + var options = Options.Create(new ZastavaObserverOptions + { + ProcRootPath = procRoot, + MaxTrackedLibraries = 8, + MaxEntrypointArguments = 16 + }); + + var collector = new RuntimeProcessCollector(options, NullLogger.Instance); + + var container = new CriContainerInfo( + Id: "container-node", + PodSandboxId: "sandbox-node", + Name: "node-app", + Attempt: 1, + Image: "ghcr.io/example/node:1.0", + ImageRef: "ghcr.io/example/node@sha256:feedface", + Labels: new Dictionary(StringComparer.Ordinal), + Annotations: new Dictionary(StringComparer.Ordinal), + CreatedAt: DateTimeOffset.UtcNow, + StartedAt: DateTimeOffset.UtcNow, + FinishedAt: null, + ExitCode: null, + Reason: null, + Message: null, + Pid: 4321); + + var capture = await collector.CollectAsync(container, CancellationToken.None); + Assert.NotNull(capture); + Assert.NotNull(capture!.Process); + Assert.Contains("/usr/bin/node", capture.Process.Entrypoint); + Assert.Contains(capture.Process.EntryTrace, trace => trace.Op == "node" && trace.Target == "/app/index.js"); + } + + private sealed class TempDirectory : IDisposable + { + public TempDirectory() + { + RootPath = Path.Combine(Path.GetTempPath(), "observer-tests", Guid.NewGuid().ToString("N")); + Directory.CreateDirectory(RootPath); + } + + public string RootPath { get; } + + public string CreateSubdirectory(string name) + { + var path = Path.Combine(RootPath, name); + Directory.CreateDirectory(path); + return path; + } + + public void Dispose() + { + try + { + if (Directory.Exists(RootPath)) + { + Directory.Delete(RootPath, recursive: true); 
+ } + } + catch + { + } + } + } +} diff --git a/src/StellaOps.Zastava.Observer.Tests/StellaOps.Zastava.Observer.Tests.csproj b/src/StellaOps.Zastava.Observer.Tests/StellaOps.Zastava.Observer.Tests.csproj new file mode 100644 index 00000000..59a74730 --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/StellaOps.Zastava.Observer.Tests.csproj @@ -0,0 +1,12 @@ + + + net10.0 + preview + enable + enable + false + + + + + diff --git a/src/StellaOps.Zastava.Observer.Tests/TestSupport/ElfTestFileBuilder.cs b/src/StellaOps.Zastava.Observer.Tests/TestSupport/ElfTestFileBuilder.cs new file mode 100644 index 00000000..0aef4766 --- /dev/null +++ b/src/StellaOps.Zastava.Observer.Tests/TestSupport/ElfTestFileBuilder.cs @@ -0,0 +1,73 @@ +using System.Buffers.Binary; +using System.IO; + +namespace StellaOps.Zastava.Observer.Tests.TestSupport; + +internal static class ElfTestFileBuilder +{ + private const int HeaderSize = 64; + private const int ProgramHeaderSize = 56; + private const uint ProgramHeaderTypeNote = 4; + private const uint NoteTypeGnuBuildId = 3; + + public static void CreateElfWithBuildId(string path, ReadOnlySpan buildId) + { + if (string.IsNullOrWhiteSpace(path)) + { + throw new ArgumentException("Path cannot be null or whitespace.", nameof(path)); + } + + var directory = Path.GetDirectoryName(path); + if (!string.IsNullOrEmpty(directory)) + { + Directory.CreateDirectory(directory); + } + + var nameBytes = new byte[] { (byte)'G', (byte)'N', (byte)'U', 0 }; + var alignedNameSize = Align(nameBytes.Length); + var alignedDescSize = Align(buildId.Length); + var noteSize = 12 + alignedNameSize + alignedDescSize; + var noteOffset = HeaderSize + ProgramHeaderSize; + var totalSize = noteOffset + noteSize; + + var buffer = new byte[totalSize]; + var span = buffer.AsSpan(); + + // ELF ident + span[0] = 0x7F; + span[1] = (byte)'E'; + span[2] = (byte)'L'; + span[3] = (byte)'F'; + span[4] = 2; // 64-bit + span[5] = 1; // little-endian + span[6] = 1; // version + + 
BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(16, 2), 2); // e_type + BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(18, 2), 0x3E); // e_machine (x86-64) + BinaryPrimitives.WriteUInt32LittleEndian(span.Slice(20, 4), 1); // e_version + BinaryPrimitives.WriteUInt64LittleEndian(span.Slice(32, 8), HeaderSize); // e_phoff + BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(52, 2), HeaderSize); // e_ehsize + BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(54, 2), ProgramHeaderSize); // e_phentsize + BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(56, 2), 1); // e_phnum + + var programHeader = span.Slice(HeaderSize, ProgramHeaderSize); + BinaryPrimitives.WriteUInt32LittleEndian(programHeader.Slice(0, 4), ProgramHeaderTypeNote); + BinaryPrimitives.WriteUInt64LittleEndian(programHeader.Slice(8, 8), (ulong)noteOffset); + BinaryPrimitives.WriteUInt64LittleEndian(programHeader.Slice(32, 8), (ulong)noteSize); + BinaryPrimitives.WriteUInt64LittleEndian(programHeader.Slice(40, 8), (ulong)noteSize); + BinaryPrimitives.WriteUInt64LittleEndian(programHeader.Slice(48, 8), 4); + + var note = span.Slice(noteOffset, noteSize); + BinaryPrimitives.WriteUInt32LittleEndian(note.Slice(0, 4), (uint)nameBytes.Length); + BinaryPrimitives.WriteUInt32LittleEndian(note.Slice(4, 4), (uint)buildId.Length); + BinaryPrimitives.WriteUInt32LittleEndian(note.Slice(8, 4), NoteTypeGnuBuildId); + nameBytes.CopyTo(note.Slice(12, nameBytes.Length)); + + var descriptorStart = 12 + alignedNameSize; + buildId.CopyTo(note.Slice(descriptorStart, buildId.Length)); + + File.WriteAllBytes(path, buffer); + } + + private static int Align(int value) => (value + 3) & ~3; +} diff --git a/src/StellaOps.Zastava.Observer/Backend/IRuntimePolicyClient.cs b/src/StellaOps.Zastava.Observer/Backend/IRuntimePolicyClient.cs new file mode 100644 index 00000000..30d1949e --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Backend/IRuntimePolicyClient.cs @@ -0,0 +1,9 @@ +using System.Threading; +using 
System.Threading.Tasks; + +namespace StellaOps.Zastava.Observer.Backend; + +internal interface IRuntimePolicyClient +{ + Task<RuntimePolicyResponse> EvaluateAsync(RuntimePolicyRequest request, CancellationToken cancellationToken = default); +} diff --git a/src/StellaOps.Zastava.Observer/Backend/RuntimeEventsClient.cs b/src/StellaOps.Zastava.Observer/Backend/RuntimeEventsClient.cs new file mode 100644 index 00000000..4cf8b635 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Backend/RuntimeEventsClient.cs @@ -0,0 +1,237 @@ +using System.Linq; +using System.Net; +using System.Net.Http.Headers; +using System.Text.Json; +using System.Text.Json.Serialization; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Configuration; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Core.Diagnostics; +using StellaOps.Zastava.Core.Security; +using StellaOps.Zastava.Core.Serialization; +using StellaOps.Zastava.Observer.Configuration; + +namespace StellaOps.Zastava.Observer.Backend; + +internal interface IRuntimeEventsClient +{ + Task<RuntimeEventPublishResult> PublishAsync(RuntimeEventsIngestRequest request, CancellationToken cancellationToken); +} + +internal sealed class RuntimeEventsClient : IRuntimeEventsClient +{ + private static readonly JsonSerializerOptions SerializerOptions = new() + { + PropertyNamingPolicy = JsonNamingPolicy.CamelCase, + DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull + }; + + static RuntimeEventsClient() + { + SerializerOptions.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase, allowIntegerValues: false)); + } + + private readonly HttpClient httpClient; + private readonly IZastavaAuthorityTokenProvider authorityTokenProvider; + private readonly IOptionsMonitor<ZastavaRuntimeOptions> runtimeOptions; + private readonly IOptionsMonitor<ZastavaObserverOptions> observerOptions; + private readonly IZastavaRuntimeMetrics runtimeMetrics; + private readonly ILogger<RuntimeEventsClient> logger; + + public RuntimeEventsClient( + HttpClient httpClient, +
IZastavaAuthorityTokenProvider authorityTokenProvider, + IOptionsMonitor<ZastavaRuntimeOptions> runtimeOptions, + IOptionsMonitor<ZastavaObserverOptions> observerOptions, + IZastavaRuntimeMetrics runtimeMetrics, + ILogger<RuntimeEventsClient> logger) + { + this.httpClient = httpClient ?? throw new ArgumentNullException(nameof(httpClient)); + this.authorityTokenProvider = authorityTokenProvider ?? throw new ArgumentNullException(nameof(authorityTokenProvider)); + this.runtimeOptions = runtimeOptions ?? throw new ArgumentNullException(nameof(runtimeOptions)); + this.observerOptions = observerOptions ?? throw new ArgumentNullException(nameof(observerOptions)); + this.runtimeMetrics = runtimeMetrics ?? throw new ArgumentNullException(nameof(runtimeMetrics)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + public async Task<RuntimeEventPublishResult> PublishAsync(RuntimeEventsIngestRequest request, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(request); + + if (request.Events.Count == 0) + { + return RuntimeEventPublishResult.Empty; + } + + var runtime = runtimeOptions.CurrentValue; + var authority = runtime.Authority; + var audience = authority.Audience.FirstOrDefault() ?? "scanner"; + var scopes = authority.Scopes ??
Array.Empty(); + var token = await authorityTokenProvider.GetAsync(audience, scopes, cancellationToken).ConfigureAwait(false); + + var backend = observerOptions.CurrentValue.Backend; + var requestPath = backend.EventsPath; + + using var httpRequest = new HttpRequestMessage(HttpMethod.Post, requestPath); + var payload = ZastavaCanonicalJsonSerializer.SerializeToUtf8Bytes(request); + httpRequest.Content = new ByteArrayContent(payload); + httpRequest.Content.Headers.ContentType = new MediaTypeHeaderValue("application/json"); + httpRequest.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json")); + httpRequest.Headers.Authorization = CreateAuthorizationHeader(token); + + var stopwatch = System.Diagnostics.Stopwatch.StartNew(); + try + { + using var response = await httpClient.SendAsync(httpRequest, cancellationToken).ConfigureAwait(false); + stopwatch.Stop(); + RecordLatency(stopwatch.Elapsed.TotalMilliseconds, success: response.IsSuccessStatusCode); + + if (response.IsSuccessStatusCode) + { + var body = await response.Content.ReadAsStringAsync(cancellationToken).ConfigureAwait(false); + RuntimeEventsIngestResponse? parsed = null; + if (!string.IsNullOrWhiteSpace(body)) + { + parsed = JsonSerializer.Deserialize(body, SerializerOptions); + } + + var accepted = parsed?.Accepted ?? request.Events.Count; + var duplicates = parsed?.Duplicates ?? 0; + + logger.LogDebug("Published runtime events batch (batchId={BatchId}, accepted={Accepted}, duplicates={Duplicates}).", + request.BatchId, + accepted, + duplicates); + + return RuntimeEventPublishResult.Successful(accepted, duplicates); + } + + if (response.StatusCode == HttpStatusCode.TooManyRequests) + { + var retryAfter = ParseRetryAfter(response.Headers.RetryAfter) ?? 
TimeSpan.FromSeconds(5); + logger.LogWarning("Runtime events publish rate limited (batchId={BatchId}, retryAfter={RetryAfter}).", request.BatchId, retryAfter); + return RuntimeEventPublishResult.FromRateLimit(retryAfter); + } + + var errorBody = await response.Content.ReadAsStringAsync(cancellationToken).ConfigureAwait(false); + logger.LogWarning("Runtime events publish failed with status {Status} (batchId={BatchId}): {Payload}", + (int)response.StatusCode, + request.BatchId, + Truncate(errorBody)); + + throw new RuntimeEventsException($"Runtime events publish failed with status {(int)response.StatusCode}", response.StatusCode); + } + catch (RuntimeEventsException) + { + throw; + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + throw; + } + catch (Exception ex) + { + stopwatch.Stop(); + RecordLatency(stopwatch.Elapsed.TotalMilliseconds, success: false); + logger.LogWarning(ex, "Runtime events publish encountered an exception (batchId={BatchId}).", request.BatchId); + throw new RuntimeEventsException("Runtime events publish failed due to network error.", HttpStatusCode.ServiceUnavailable, ex); + } + } + + private AuthenticationHeaderValue CreateAuthorizationHeader(ZastavaOperationalToken token) + { + var scheme = string.Equals(token.TokenType, "dpop", StringComparison.OrdinalIgnoreCase) + ? "DPoP" + : token.TokenType; + return new AuthenticationHeaderValue(scheme, token.AccessToken); + } + + private void RecordLatency(double elapsedMs, bool success) + { + var tags = runtimeMetrics.DefaultTags + .Concat(new[] + { + new KeyValuePair("endpoint", "runtime-events"), + new KeyValuePair("success", success ? "true" : "false") + }) + .ToArray(); + runtimeMetrics.BackendLatencyMs.Record(elapsedMs, tags); + } + + private static TimeSpan? ParseRetryAfter(RetryConditionHeaderValue? 
retryAfter) + { + if (retryAfter is null) + { + return null; + } + + if (retryAfter.Delta.HasValue) + { + return retryAfter.Delta.Value; + } + + if (retryAfter.Date.HasValue) + { + var delta = retryAfter.Date.Value.UtcDateTime - DateTime.UtcNow; + return delta > TimeSpan.Zero ? delta : TimeSpan.Zero; + } + + return null; + } + + private static string Truncate(string? value, int maxLength = 512) + { + if (string.IsNullOrEmpty(value)) + { + return string.Empty; + } + + return value.Length <= maxLength ? value : value[..maxLength] + "…"; + } +} + +internal sealed record RuntimeEventsIngestRequest +{ + [JsonPropertyName("batchId")] + public string? BatchId { get; init; } + + [JsonPropertyName("events")] + public IReadOnlyList Events { get; init; } = Array.Empty(); +} + +internal sealed record RuntimeEventsIngestResponse +{ + [JsonPropertyName("accepted")] + public int Accepted { get; init; } + + [JsonPropertyName("duplicates")] + public int Duplicates { get; init; } +} + +internal readonly record struct RuntimeEventPublishResult( + bool Success, + bool RateLimited, + TimeSpan RetryAfter, + int Accepted, + int Duplicates) +{ + public static RuntimeEventPublishResult Empty => new(true, false, TimeSpan.Zero, 0, 0); + + public static RuntimeEventPublishResult Successful(int accepted, int duplicates) + => new(true, false, TimeSpan.Zero, accepted, duplicates); + +public static RuntimeEventPublishResult FromRateLimit(TimeSpan retryAfter) + => new(false, true, retryAfter, 0, 0); +} + +internal sealed class RuntimeEventsException : Exception +{ + public RuntimeEventsException(string message, HttpStatusCode statusCode, Exception? 
innerException = null) + : base(message, innerException) + { + StatusCode = statusCode; + } + + public HttpStatusCode StatusCode { get; } +} diff --git a/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyClient.cs b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyClient.cs new file mode 100644 index 00000000..a55d81d6 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyClient.cs @@ -0,0 +1,128 @@ +using System; +using System.Collections.Generic; +using System.Diagnostics; +using System.Linq; +using System.Net.Http; +using System.Net.Http.Headers; +using System.Text; +using System.Text.Json; +using System.Text.Json.Serialization; +using System.Threading; +using System.Threading.Tasks; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Configuration; +using StellaOps.Zastava.Core.Diagnostics; +using StellaOps.Zastava.Core.Security; +using StellaOps.Zastava.Observer.Configuration; + +namespace StellaOps.Zastava.Observer.Backend; + +internal sealed class RuntimePolicyClient : IRuntimePolicyClient +{ + private static readonly JsonSerializerOptions SerializerOptions = new() + { + PropertyNamingPolicy = JsonNamingPolicy.CamelCase, + DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull + }; + + static RuntimePolicyClient() + { + SerializerOptions.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.CamelCase, allowIntegerValues: false)); + } + + private readonly HttpClient httpClient; + private readonly IZastavaAuthorityTokenProvider authorityTokenProvider; + private readonly IOptionsMonitor<ZastavaRuntimeOptions> runtimeOptions; + private readonly IOptionsMonitor<ZastavaObserverOptions> observerOptions; + private readonly IZastavaRuntimeMetrics runtimeMetrics; + private readonly ILogger<RuntimePolicyClient> logger; + + public RuntimePolicyClient( + HttpClient httpClient, + IZastavaAuthorityTokenProvider authorityTokenProvider, + IOptionsMonitor<ZastavaRuntimeOptions> runtimeOptions, + IOptionsMonitor<ZastavaObserverOptions> observerOptions, + IZastavaRuntimeMetrics runtimeMetrics, + ILogger<RuntimePolicyClient>
logger) + { + this.httpClient = httpClient ?? throw new ArgumentNullException(nameof(httpClient)); + this.authorityTokenProvider = authorityTokenProvider ?? throw new ArgumentNullException(nameof(authorityTokenProvider)); + this.runtimeOptions = runtimeOptions ?? throw new ArgumentNullException(nameof(runtimeOptions)); + this.observerOptions = observerOptions ?? throw new ArgumentNullException(nameof(observerOptions)); + this.runtimeMetrics = runtimeMetrics ?? throw new ArgumentNullException(nameof(runtimeMetrics)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + public async Task<RuntimePolicyResponse> EvaluateAsync(RuntimePolicyRequest request, CancellationToken cancellationToken = default) + { + ArgumentNullException.ThrowIfNull(request); + + var runtime = runtimeOptions.CurrentValue; + var authority = runtime.Authority; + var audience = authority.Audience.FirstOrDefault() ?? "scanner"; + + var token = await authorityTokenProvider + .GetAsync(audience, authority.Scopes ??
Array.Empty(), cancellationToken) + .ConfigureAwait(false); + + var backend = observerOptions.CurrentValue.Backend; + EnsureBackendGuardrails(backend); + + using var httpRequest = new HttpRequestMessage(HttpMethod.Post, backend.PolicyPath) + { + Content = new StringContent(JsonSerializer.Serialize(request, SerializerOptions), Encoding.UTF8, "application/json") + }; + + httpRequest.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json")); + httpRequest.Headers.Authorization = CreateAuthorizationHeader(token); + + var stopwatch = Stopwatch.StartNew(); + try + { + using var response = await httpClient.SendAsync(httpRequest, cancellationToken).ConfigureAwait(false); + var payload = await response.Content.ReadAsStringAsync(cancellationToken).ConfigureAwait(false); + + if (!response.IsSuccessStatusCode) + { + logger.LogWarning("Runtime policy call returned {StatusCode}: {Payload}", (int)response.StatusCode, payload); + throw new RuntimePolicyException($"Runtime policy call failed with status {(int)response.StatusCode}", response.StatusCode); + } + + var result = JsonSerializer.Deserialize(payload, SerializerOptions); + if (result is null) + { + throw new RuntimePolicyException("Runtime policy response payload was empty or invalid.", response.StatusCode); + } + + return result; + } + finally + { + stopwatch.Stop(); + RecordLatency(stopwatch.Elapsed.TotalMilliseconds); + } + } + + private AuthenticationHeaderValue CreateAuthorizationHeader(ZastavaOperationalToken token) + { + var scheme = string.Equals(token.TokenType, "dpop", StringComparison.OrdinalIgnoreCase) ? 
"DPoP" : token.TokenType; + return new AuthenticationHeaderValue(scheme, token.AccessToken); + } + + private void RecordLatency(double elapsedMs) + { + var tags = runtimeMetrics.DefaultTags + .Concat(new[] { new KeyValuePair("endpoint", "policy") }) + .ToArray(); + runtimeMetrics.BackendLatencyMs.Record(elapsedMs, tags); + } + + private static void EnsureBackendGuardrails(ZastavaObserverBackendOptions backend) + { + if (!backend.AllowInsecureHttp && !string.Equals(backend.BaseAddress.Scheme, Uri.UriSchemeHttps, StringComparison.OrdinalIgnoreCase)) + { + throw new InvalidOperationException("Observer backend baseAddress must use HTTPS unless allowInsecureHttp is true."); + } + } +} diff --git a/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyContracts.cs b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyContracts.cs new file mode 100644 index 00000000..706bc275 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyContracts.cs @@ -0,0 +1,73 @@ +using System; +using System.Collections.Generic; +using System.Text.Json.Serialization; +using StellaOps.Zastava.Core.Contracts; + +namespace StellaOps.Zastava.Observer.Backend; + +internal sealed record RuntimePolicyRequest +{ + [JsonPropertyName("namespace")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string? Namespace { get; init; } + + [JsonPropertyName("labels")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public IReadOnlyDictionary? Labels { get; init; } + + [JsonPropertyName("images")] + public required IReadOnlyList Images { get; init; } +} + +internal sealed record RuntimePolicyResponse +{ + [JsonPropertyName("ttlSeconds")] + public int TtlSeconds { get; init; } + + [JsonPropertyName("expiresAtUtc")] + public DateTimeOffset ExpiresAtUtc { get; init; } + + [JsonPropertyName("policyRevision")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string? 
PolicyRevision { get; init; } + + [JsonPropertyName("results")] + public IReadOnlyDictionary Results { get; init; } = new Dictionary(StringComparer.Ordinal); +} + +internal sealed record RuntimePolicyImageResult +{ + [JsonPropertyName("policyVerdict")] + public PolicyVerdict PolicyVerdict { get; init; } = PolicyVerdict.Error; + + [JsonPropertyName("signed")] + public bool Signed { get; init; } + + [JsonPropertyName("hasSbomReferrers")] + public bool HasSbomReferrers { get; init; } + + [JsonPropertyName("hasSbom")] + public bool HasSbomLegacy { get; init; } + + [JsonPropertyName("reasons")] + public IReadOnlyList Reasons { get; init; } = Array.Empty(); + + [JsonPropertyName("rekor")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public RuntimePolicyRekorResult? Rekor { get; init; } +} + +internal sealed record RuntimePolicyRekorResult +{ + [JsonPropertyName("uuid")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string? Uuid { get; init; } + + [JsonPropertyName("url")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public string? Url { get; init; } + + [JsonPropertyName("verified")] + [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)] + public bool? 
Verified { get; init; } +} diff --git a/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyException.cs b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyException.cs new file mode 100644 index 00000000..839b1cca --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Backend/RuntimePolicyException.cs @@ -0,0 +1,21 @@ +using System; +using System.Net; + +namespace StellaOps.Zastava.Observer.Backend; + +internal sealed class RuntimePolicyException : Exception +{ + public RuntimePolicyException(string message, HttpStatusCode statusCode) + : base(message) + { + StatusCode = statusCode; + } + + public RuntimePolicyException(string message, HttpStatusCode statusCode, Exception innerException) + : base(message, innerException) + { + StatusCode = statusCode; + } + + public HttpStatusCode StatusCode { get; } +} diff --git a/src/StellaOps.Zastava.Observer/Configuration/ZastavaObserverOptions.cs b/src/StellaOps.Zastava.Observer/Configuration/ZastavaObserverOptions.cs index 4337e9bc..75c343da 100644 --- a/src/StellaOps.Zastava.Observer/Configuration/ZastavaObserverOptions.cs +++ b/src/StellaOps.Zastava.Observer/Configuration/ZastavaObserverOptions.cs @@ -1,4 +1,5 @@ using System.ComponentModel.DataAnnotations; +using System.IO; namespace StellaOps.Zastava.Observer.Configuration; @@ -38,6 +39,24 @@ public sealed class ZastavaObserverOptions [Range(1, 512)] public int PublishBatchSize { get; set; } = 32; + /// + /// Maximum interval (seconds) that events may remain buffered before forcing a publish. + /// + [Range(typeof(double), "0.1", "30")] + public double PublishFlushIntervalSeconds { get; set; } = 2; + + /// + /// Directory used for disk-backed runtime event buffering. + /// + [Required(AllowEmptyStrings = false)] + public string EventBufferPath { get; set; } = Path.Combine(Path.GetTempPath(), "zastava-observer", "runtime-events"); + + /// + /// Maximum on-disk bytes retained for buffered runtime events. 
+ /// + [Range(typeof(long), "1048576", "1073741824")] + public long MaxDiskBufferBytes { get; set; } = 64 * 1024 * 1024; // 64 MiB + /// /// Connectivity/backoff settings applied when CRI endpoints fail temporarily. /// @@ -58,6 +77,101 @@ public sealed class ZastavaObserverOptions Enabled = true } }; + + /// + /// Scanner backend configuration for posture checks and event ingestion. + /// + [Required] + public ZastavaObserverBackendOptions Backend { get; set; } = new(); + + /// + /// Posture-specific configuration values. + /// + [Required] + public ZastavaObserverPostureOptions Posture { get; set; } = new(); + + /// + /// Root path for accessing host process information (defaults to /host/proc). + /// + [Required(AllowEmptyStrings = false)] + public string ProcRootPath { get; set; } = "/host/proc"; + + /// + /// Maximum number of loaded libraries captured per process. + /// + [Range(8, 4096)] + public int MaxTrackedLibraries { get; set; } = 256; + + /// + /// Maximum size (in bytes) of a library file to hash when collecting loaded libraries. + /// + [Range(typeof(long), "1024", "1073741824")] + public long MaxLibraryBytes { get; set; } = 33554432; // 32 MiB + + /// + /// Maximum cumulative bytes hashed across libraries for a single process capture. + /// + [Range(typeof(long), "1024", "2147483647")] + public long MaxLibraryHashBytes { get; set; } = 64_000_000; // ~61 MiB budget + + /// + /// Maximum number of entrypoint arguments captured for reporting. + /// + [Range(1, 128)] + public int MaxEntrypointArguments { get; set; } = 32; +} + +public sealed class ZastavaObserverBackendOptions +{ + /// + /// Base address for Scanner WebService runtime APIs. + /// + [Required] + public Uri BaseAddress { get; init; } = new("https://scanner.internal"); + + /// + /// Runtime policy endpoint path. + /// + [Required(AllowEmptyStrings = false)] + public string PolicyPath { get; init; } = "/api/v1/scanner/policy/runtime"; + + /// + /// Runtime events ingestion endpoint path. 
+ /// + [Required(AllowEmptyStrings = false)] + public string EventsPath { get; init; } = "/api/v1/runtime/events"; + + /// + /// Request timeout for backend calls in seconds. + /// + [Range(typeof(double), "1", "120")] + public double RequestTimeoutSeconds { get; init; } = 5; + + /// + /// Allows plain HTTP endpoints when true (default false for safety). + /// + public bool AllowInsecureHttp { get; init; } +} + +public sealed class ZastavaObserverPostureOptions +{ + /// + /// Path where posture cache entries are persisted across restarts. + /// + [Required(AllowEmptyStrings = false)] + public string CachePath { get; init; } = Path.Combine(Path.GetTempPath(), "zastava-observer", "posture-cache.json"); + + /// + /// Fallback TTL (seconds) applied when backend omits an explicit expiry. + /// + [Range(30, 86400)] + public int FallbackTtlSeconds { get; init; } = 300; + + /// + /// Threshold (seconds) after expiration where stale cache usage triggers warnings. + /// + [Range(30, 86400)] + public int StaleWarningThresholdSeconds { get; init; } = 900; } public sealed class ObserverBackoffOptions diff --git a/src/StellaOps.Zastava.Observer/ContainerRuntime/ContainerStateTrackerFactory.cs b/src/StellaOps.Zastava.Observer/ContainerRuntime/ContainerStateTrackerFactory.cs new file mode 100644 index 00000000..44a5e325 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/ContainerRuntime/ContainerStateTrackerFactory.cs @@ -0,0 +1,7 @@ +namespace StellaOps.Zastava.Observer.ContainerRuntime; + +internal sealed class ContainerStateTrackerFactory +{ + public ContainerStateTracker Create() + => new(); +} diff --git a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriConversions.cs b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriConversions.cs index be9d68ee..a7ba6f21 100644 --- a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriConversions.cs +++ b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriConversions.cs @@ -24,7 +24,8 @@ internal static class 
CriConversions FinishedAt: null, ExitCode: null, Reason: null, - Message: null); + Message: null, + Pid: null); } public static CriContainerInfo MergeStatus(CriContainerInfo baseline, ContainerStatus? status) @@ -47,8 +48,9 @@ internal static class CriConversions ExitCode = status.ExitCode != 0 ? status.ExitCode : baseline.ExitCode, Reason = string.IsNullOrWhiteSpace(status.Reason) ? baseline.Reason : status.Reason, Message = string.IsNullOrWhiteSpace(status.Message) ? baseline.Message : status.Message, - Image: status.Image?.Image ?? baseline.Image, - ImageRef: string.IsNullOrWhiteSpace(status.ImageRef) ? baseline.ImageRef : status.ImageRef, + Pid = baseline.Pid, + Image = status.Image?.Image ?? baseline.Image, + ImageRef = string.IsNullOrWhiteSpace(status.ImageRef) ? baseline.ImageRef : status.ImageRef, Labels = labels, Annotations = annotations }; diff --git a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriModels.cs b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriModels.cs index 89e5d38f..c08d314e 100644 --- a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriModels.cs +++ b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriModels.cs @@ -21,7 +21,8 @@ internal sealed record CriContainerInfo( DateTimeOffset? FinishedAt, int? ExitCode, string? Reason, - string? Message); + string? Message, + int? 
Pid); internal static class CriLabelKeys { diff --git a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriRuntimeClient.cs b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriRuntimeClient.cs index 9a02abd8..669ba856 100644 --- a/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriRuntimeClient.cs +++ b/src/StellaOps.Zastava.Observer/ContainerRuntime/Cri/CriRuntimeClient.cs @@ -1,6 +1,7 @@ using System.IO; -using System.Net.Sockets; using System.Linq; +using System.Net.Sockets; +using System.Text.Json; using Grpc.Core; using Grpc.Net.Client; using Microsoft.Extensions.Logging; @@ -92,7 +93,7 @@ internal sealed class CriRuntimeClient : ICriRuntimeClient var response = await client.ContainerStatusAsync(new ContainerStatusRequest { ContainerId = containerId, - Verbose = false + Verbose = true }, cancellationToken: cancellationToken).ConfigureAwait(false); if (response.Status is null) @@ -112,7 +113,14 @@ internal sealed class CriRuntimeClient : ICriRuntimeClient CreatedAt = response.Status.CreatedAt }); - return CriConversions.MergeStatus(baseline, response.Status); + var merged = CriConversions.MergeStatus(baseline, response.Status); + + if (response.Info is { Count: > 0 } && TryExtractPid(response.Info, out var pid)) + { + merged = merged with { Pid = pid }; + } + + return merged; } catch (RpcException ex) when (ex.StatusCode is StatusCode.NotFound or StatusCode.DeadlineExceeded) { @@ -121,16 +129,49 @@ internal sealed class CriRuntimeClient : ICriRuntimeClient } } - public async ValueTask DisposeAsync() + private static bool TryExtractPid(IDictionary info, out int pid) + { + if (info.TryGetValue("pid", out var value) && int.TryParse(value, out pid)) + { + return true; + } + + foreach (var entry in info.Values) + { + if (string.IsNullOrWhiteSpace(entry)) + { + continue; + } + + try + { + using var document = JsonDocument.Parse(entry); + if (document.RootElement.TryGetProperty("pid", out var pidElement) && pidElement.TryGetInt32(out pid)) + { + return 
true; + } + } + catch (JsonException) + { + } + } + + pid = default; + return false; + } + + public ValueTask DisposeAsync() { try { - await channel.DisposeAsync().ConfigureAwait(false); + channel.Dispose(); } catch (InvalidOperationException) { // Channel already disposed. } + + return ValueTask.CompletedTask; } private static void EnsureHttp2Switch() @@ -161,7 +202,7 @@ internal sealed class CriRuntimeClient : ICriRuntimeClient EnableMultipleHttp2Connections = true }; - if (endpoint.ConnectTimeout is { } timeout and > TimeSpan.Zero) + if (endpoint.ConnectTimeout is { } timeout && timeout > TimeSpan.Zero) { handler.ConnectTimeout = timeout; } diff --git a/src/StellaOps.Zastava.Observer/DependencyInjection/ObserverServiceCollectionExtensions.cs b/src/StellaOps.Zastava.Observer/DependencyInjection/ObserverServiceCollectionExtensions.cs new file mode 100644 index 00000000..5f370be9 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/DependencyInjection/ObserverServiceCollectionExtensions.cs @@ -0,0 +1,103 @@ +using System; +using Microsoft.Extensions.Configuration; +using Microsoft.Extensions.DependencyInjection.Extensions; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Configuration; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.ContainerRuntime; +using StellaOps.Zastava.Observer.Posture; +using StellaOps.Zastava.Observer.Runtime; +using StellaOps.Zastava.Observer.Worker; +using StellaOps.Zastava.Observer.Backend; + +namespace Microsoft.Extensions.DependencyInjection; + +public static class ObserverServiceCollectionExtensions +{ + public static IServiceCollection AddZastavaObserver(this IServiceCollection services, IConfiguration configuration) + { + ArgumentNullException.ThrowIfNull(services); + ArgumentNullException.ThrowIfNull(configuration); + + services.AddZastavaRuntimeCore(configuration, componentName: "observer"); + + services.AddOptions() + 
.Bind(configuration.GetSection(ZastavaObserverOptions.SectionName)) + .ValidateDataAnnotations() + .PostConfigure(options => + { + if (options.Backoff.Initial <= TimeSpan.Zero) + { + options.Backoff.Initial = TimeSpan.FromSeconds(1); + } + + if (options.Backoff.Max < options.Backoff.Initial) + { + options.Backoff.Max = options.Backoff.Initial; + } + + if (!options.Backend.AllowInsecureHttp && !string.Equals(options.Backend.BaseAddress.Scheme, Uri.UriSchemeHttps, StringComparison.OrdinalIgnoreCase)) + { + throw new InvalidOperationException("Observer backend baseAddress must use HTTPS unless allowInsecureHttp is explicitly enabled."); + } + + if (!options.Backend.PolicyPath.StartsWith("/", StringComparison.Ordinal)) + { + throw new InvalidOperationException("Observer backend policyPath must be absolute (start with '/')."); + } + + if (!options.Backend.EventsPath.StartsWith("/", StringComparison.Ordinal)) + { + throw new InvalidOperationException("Observer backend eventsPath must be absolute (start with '/')."); + } + }) + .ValidateOnStart(); + + services.TryAddSingleton(TimeProvider.System); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + + services.AddHttpClient() + .ConfigureHttpClient((provider, client) => + { + var optionsMonitor = provider.GetRequiredService>(); + var backend = optionsMonitor.CurrentValue.Backend; + client.BaseAddress = backend.BaseAddress; + client.Timeout = TimeSpan.FromSeconds(Math.Clamp(backend.RequestTimeoutSeconds, 1, 120)); + }); + + services.AddHttpClient() + .ConfigureHttpClient((provider, client) => + { + var optionsMonitor = provider.GetRequiredService>(); + var backend = optionsMonitor.CurrentValue.Backend; + client.BaseAddress = backend.BaseAddress; + client.Timeout = TimeSpan.FromSeconds(Math.Clamp(backend.RequestTimeoutSeconds, 1, 120)); + }); + + 
services.TryAddEnumerable(ServiceDescriptor.Singleton, ObserverRuntimeOptionsPostConfigure>()); + + services.AddHostedService(); + services.AddHostedService(); + services.AddHostedService(); + + return services; + } +} + +internal sealed class ObserverRuntimeOptionsPostConfigure : IPostConfigureOptions +{ + public void PostConfigure(string? name, ZastavaRuntimeOptions options) + { + if (string.IsNullOrWhiteSpace(options.Component)) + { + options.Component = "observer"; + } + } +} diff --git a/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureCache.cs b/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureCache.cs new file mode 100644 index 00000000..c068925d --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureCache.cs @@ -0,0 +1,12 @@ +using System; + +using StellaOps.Zastava.Core.Contracts; + +namespace StellaOps.Zastava.Observer.Posture; + +internal interface IRuntimePostureCache +{ + RuntimePostureCacheEntry? Get(string key); + + void Set(string key, RuntimePosture posture, DateTimeOffset expiresAtUtc, DateTimeOffset storedAtUtc); +} diff --git a/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureEvaluator.cs b/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureEvaluator.cs new file mode 100644 index 00000000..b3a659d3 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/IRuntimePostureEvaluator.cs @@ -0,0 +1,10 @@ +using System.Threading; +using System.Threading.Tasks; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; + +namespace StellaOps.Zastava.Observer.Posture; + +internal interface IRuntimePostureEvaluator +{ + Task EvaluateAsync(CriContainerInfo container, CancellationToken cancellationToken); +} diff --git a/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCache.cs b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCache.cs new file mode 100644 index 00000000..a937e0d7 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCache.cs @@ -0,0 +1,180 @@ +using System; +using 
System.Collections.Generic; +using System.IO; +using System.Linq; +using System.Text.Json; +using System.Text.Json.Serialization; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; + +namespace StellaOps.Zastava.Observer.Posture; + +internal sealed class RuntimePostureCache : IRuntimePostureCache +{ + private const int CurrentVersion = 1; + + private static readonly JsonSerializerOptions SerializerOptions = new() + { + PropertyNamingPolicy = JsonNamingPolicy.CamelCase, + DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull + }; + + private readonly IOptionsMonitor optionsMonitor; + private readonly ILogger logger; + private readonly object entriesLock = new(); + private readonly object fileLock = new(); + private readonly Dictionary entries = new(StringComparer.Ordinal); + + public RuntimePostureCache( + IOptionsMonitor optionsMonitor, + ILogger logger) + { + this.optionsMonitor = optionsMonitor ?? throw new ArgumentNullException(nameof(optionsMonitor)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + + Load(); + } + + public RuntimePostureCacheEntry? Get(string key) + { + if (string.IsNullOrWhiteSpace(key)) + { + return null; + } + + lock (entriesLock) + { + return entries.TryGetValue(key, out var entry) ? 
entry : null; + } + } + + public void Set(string key, RuntimePosture posture, DateTimeOffset expiresAtUtc, DateTimeOffset storedAtUtc) + { + if (string.IsNullOrWhiteSpace(key)) + { + return; + } + + ArgumentNullException.ThrowIfNull(posture); + var normalizedKey = key.Trim(); + var entry = new RuntimePostureCacheEntry(posture, expiresAtUtc, storedAtUtc); + + lock (entriesLock) + { + entries[normalizedKey] = entry; + } + + Persist(); + } + + private void Load() + { + var path = GetCachePath(); + if (!File.Exists(path)) + { + return; + } + + try + { + var json = File.ReadAllText(path); + var snapshot = JsonSerializer.Deserialize(json, SerializerOptions); + if (snapshot?.Entries is null) + { + return; + } + + lock (entriesLock) + { + entries.Clear(); + foreach (var entry in snapshot.Entries) + { + if (string.IsNullOrWhiteSpace(entry.Key) || entry.Posture is null) + { + continue; + } + + entries[entry.Key] = new RuntimePostureCacheEntry( + entry.Posture, + entry.ExpiresAtUtc, + entry.StoredAtUtc); + } + } + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to load runtime posture cache from {CachePath}; starting empty.", path); + } + } + + private void Persist() + { + var path = GetCachePath(); + var directory = Path.GetDirectoryName(path); + if (!string.IsNullOrEmpty(directory)) + { + Directory.CreateDirectory(directory); + } + + CacheFileModel snapshot; + lock (entriesLock) + { + var ordered = entries + .OrderBy(pair => pair.Key, StringComparer.Ordinal) + .Select(pair => new CacheFileEntry + { + Key = pair.Key, + ExpiresAtUtc = pair.Value.ExpiresAtUtc, + StoredAtUtc = pair.Value.StoredAtUtc, + Posture = pair.Value.Posture + }) + .ToList(); + + snapshot = new CacheFileModel + { + Version = CurrentVersion, + Entries = ordered + }; + } + + var json = JsonSerializer.Serialize(snapshot, SerializerOptions); + + lock (fileLock) + { + var tempPath = path + ".tmp"; + File.WriteAllText(tempPath, json); + File.Move(tempPath, path, overwrite: true); + } + } + + 
private string GetCachePath() + { + return optionsMonitor.CurrentValue.Posture.CachePath; + } + + private sealed record CacheFileModel + { + [JsonPropertyName("version")] + public int Version { get; init; } + + [JsonPropertyName("entries")] + public List Entries { get; init; } = new(); + } + + private sealed record CacheFileEntry + { + [JsonPropertyName("key")] + public string Key { get; init; } = string.Empty; + + [JsonPropertyName("expiresAtUtc")] + public DateTimeOffset ExpiresAtUtc { get; init; } + + [JsonPropertyName("storedAtUtc")] + public DateTimeOffset StoredAtUtc { get; init; } + + [JsonPropertyName("posture")] + public RuntimePosture? Posture { get; init; } + } +} diff --git a/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCacheEntry.cs b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCacheEntry.cs new file mode 100644 index 00000000..700babfb --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureCacheEntry.cs @@ -0,0 +1,12 @@ +using System; +using StellaOps.Zastava.Core.Contracts; + +namespace StellaOps.Zastava.Observer.Posture; + +internal sealed record RuntimePostureCacheEntry(RuntimePosture Posture, DateTimeOffset ExpiresAtUtc, DateTimeOffset StoredAtUtc) +{ + public bool IsExpired(DateTimeOffset now) => now >= ExpiresAtUtc; + + public bool IsStale(DateTimeOffset now, TimeSpan staleThreshold) + => now - ExpiresAtUtc >= staleThreshold; +} diff --git a/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluationResult.cs b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluationResult.cs new file mode 100644 index 00000000..9ceb5e1e --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluationResult.cs @@ -0,0 +1,6 @@ +using System.Collections.Generic; +using StellaOps.Zastava.Core.Contracts; + +namespace StellaOps.Zastava.Observer.Posture; + +internal sealed record RuntimePostureEvaluationResult(RuntimePosture? 
Posture, IReadOnlyList Evidence); diff --git a/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluator.cs b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluator.cs new file mode 100644 index 00000000..2713b2f3 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Posture/RuntimePostureEvaluator.cs @@ -0,0 +1,188 @@ +using System; +using System.Collections.Generic; +using System.Globalization; +using System.Linq; +using System.Threading; +using System.Threading.Tasks; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Backend; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; + +namespace StellaOps.Zastava.Observer.Posture; + +internal sealed class RuntimePostureEvaluator : IRuntimePostureEvaluator +{ + private readonly IRuntimePolicyClient policyClient; + private readonly IRuntimePostureCache cache; + private readonly IOptionsMonitor optionsMonitor; + private readonly TimeProvider timeProvider; + private readonly ILogger logger; + + public RuntimePostureEvaluator( + IRuntimePolicyClient policyClient, + IRuntimePostureCache cache, + IOptionsMonitor optionsMonitor, + TimeProvider timeProvider, + ILogger logger) + { + this.policyClient = policyClient ?? throw new ArgumentNullException(nameof(policyClient)); + this.cache = cache ?? throw new ArgumentNullException(nameof(cache)); + this.optionsMonitor = optionsMonitor ?? throw new ArgumentNullException(nameof(optionsMonitor)); + this.timeProvider = timeProvider ?? TimeProvider.System; + this.logger = logger ?? 
throw new ArgumentNullException(nameof(logger)); + } + + public async Task EvaluateAsync(CriContainerInfo container, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(container); + + var evidence = new List(); + var now = timeProvider.GetUtcNow(); + var cacheOptions = optionsMonitor.CurrentValue.Posture; + var fallbackTtl = TimeSpan.FromSeconds(Math.Clamp(cacheOptions.FallbackTtlSeconds, 30, 86400)); + var staleThreshold = TimeSpan.FromSeconds(Math.Clamp(cacheOptions.StaleWarningThresholdSeconds, 30, 86400)); + + var imageKey = ResolveImageKey(container); + if (string.IsNullOrWhiteSpace(imageKey)) + { + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.skipped", + Value = "no-image-ref" + }); + return new RuntimePostureEvaluationResult(null, evidence); + } + + var cached = cache.Get(imageKey); + if (cached is not null && !cached.IsExpired(now)) + { + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.cache", + Value = "hit" + }); + return new RuntimePostureEvaluationResult(cached.Posture, evidence); + } + + try + { + var request = BuildRequest(container, imageKey); + var response = await policyClient.EvaluateAsync(request, cancellationToken).ConfigureAwait(false); + + if (!response.Results.TryGetValue(imageKey, out var imageResult)) + { + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.missing", + Value = "policy-empty" + }); + return new RuntimePostureEvaluationResult(null, evidence); + } + + var posture = MapPosture(imageResult); + var expiresAt = response.ExpiresAtUtc != default + ? response.ExpiresAtUtc + : now.AddSeconds(response.TtlSeconds > 0 ? 
response.TtlSeconds : fallbackTtl.TotalSeconds); + + cache.Set(imageKey, posture, expiresAt, now); + + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.source", + Value = "backend" + }); + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.ttl", + Value = expiresAt.ToString("O", CultureInfo.InvariantCulture) + }); + + return new RuntimePostureEvaluationResult(posture, evidence); + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + logger.LogWarning(ex, "Runtime posture evaluation failed for image {ImageRef}.", imageKey); + + if (cached is not null) + { + var cacheSignal = cached.IsExpired(now) + ? cached.IsStale(now, staleThreshold) ? "stale-warning" : "stale" + : "hit"; + + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.cache", + Value = cacheSignal + }); + + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.error", + Value = ex.GetType().Name + }); + + return new RuntimePostureEvaluationResult(cached.Posture, evidence); + } + + evidence.Add(new RuntimeEvidence + { + Signal = "runtime.posture.error", + Value = ex.GetType().Name + }); + + return new RuntimePostureEvaluationResult(null, evidence); + } + } + + private static string? ResolveImageKey(CriContainerInfo container) + { + if (!string.IsNullOrWhiteSpace(container.ImageRef)) + { + return container.ImageRef; + } + + return string.IsNullOrWhiteSpace(container.Image) ? null : container.Image; + } + + private static RuntimePolicyRequest BuildRequest(CriContainerInfo container, string imageKey) + { + var labels = container.Labels.Count == 0 + ? null + : new Dictionary(container.Labels, StringComparer.Ordinal); + + labels?.Remove(CriLabelKeys.PodUid); + + return new RuntimePolicyRequest + { + Namespace = container.Labels.TryGetValue(CriLabelKeys.PodNamespace, out var ns) ? 
ns : null, + Labels = labels, + Images = new[] { imageKey } + }; + } + + private static RuntimePosture MapPosture(RuntimePolicyImageResult result) + { + var posture = new RuntimePosture + { + ImageSigned = result.Signed, + SbomReferrer = result.HasSbomReferrers ? "present" : "missing" + }; + + if (result.Rekor is not null) + { + posture = posture with + { + Attestation = new RuntimeAttestation + { + Uuid = result.Rekor.Uuid, + Verified = result.Rekor.Verified + } + }; + } + + return posture; + } +} diff --git a/src/StellaOps.Zastava.Observer/Program.cs b/src/StellaOps.Zastava.Observer/Program.cs index 94db8884..13193771 100644 --- a/src/StellaOps.Zastava.Observer/Program.cs +++ b/src/StellaOps.Zastava.Observer/Program.cs @@ -4,7 +4,6 @@ using StellaOps.Zastava.Observer.Worker; var builder = Host.CreateApplicationBuilder(args); -builder.Services.AddZastavaRuntimeCore(builder.Configuration, componentName: "observer"); -builder.Services.AddHostedService(); +builder.Services.AddZastavaObserver(builder.Configuration); await builder.Build().RunAsync(); diff --git a/src/StellaOps.Zastava.Observer/Properties/AssemblyInfo.cs b/src/StellaOps.Zastava.Observer/Properties/AssemblyInfo.cs new file mode 100644 index 00000000..bee310ac --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Properties/AssemblyInfo.cs @@ -0,0 +1,3 @@ +using System.Runtime.CompilerServices; + +[assembly: InternalsVisibleTo("StellaOps.Zastava.Observer.Tests")] diff --git a/src/StellaOps.Zastava.Observer/Runtime/ElfBuildIdReader.cs b/src/StellaOps.Zastava.Observer/Runtime/ElfBuildIdReader.cs new file mode 100644 index 00000000..7d92d33a --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Runtime/ElfBuildIdReader.cs @@ -0,0 +1,287 @@ +using System.Buffers.Binary; +using System.IO; +using System.Text; +using System.Threading; +using System.Threading.Tasks; + +namespace StellaOps.Zastava.Observer.Runtime; + +internal static class ElfBuildIdReader +{ + private const int ElfIdentificationSize = 16; + private 
const byte ElfClass32 = 1; + private const byte ElfClass64 = 2; + private const byte ElfDataLittleEndian = 1; + private const byte ElfDataBigEndian = 2; + private const uint ProgramHeaderTypeNote = 4; + private const uint NoteTypeGnuBuildId = 3; + private const int Alignment = 4; + private const int MaxNoteSegmentBytes = 1 << 20; // 1 MiB + + public static async Task TryReadBuildIdAsync(string path, CancellationToken cancellationToken) + { + if (string.IsNullOrWhiteSpace(path)) + { + return null; + } + + try + { + using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete); + var header = await ReadHeaderAsync(stream, cancellationToken).ConfigureAwait(false); + if (header is null) + { + return null; + } + + return await ReadBuildIdFromNotesAsync(stream, header.Value, cancellationToken).ConfigureAwait(false); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + throw; + } + catch (Exception ex) when (ex is IOException or UnauthorizedAccessException or NotSupportedException) + { + return null; + } + } + + private static async Task ReadBuildIdFromNotesAsync(Stream stream, ElfHeader header, CancellationToken cancellationToken) + { + if (header.ProgramHeaderEntrySize is 0 || header.ProgramHeaderCount is 0) + { + return null; + } + + var entryBuffer = new byte[header.ProgramHeaderEntrySize]; + for (var index = 0; index < header.ProgramHeaderCount; index++) + { + var entryOffset = header.ProgramHeaderOffset + (ulong)header.ProgramHeaderEntrySize * (ulong)index; + if (entryOffset > (ulong)stream.Length) + { + break; + } + + stream.Seek((long)entryOffset, SeekOrigin.Begin); + if (!await ReadExactlyAsync(stream, entryBuffer.AsMemory(0, header.ProgramHeaderEntrySize), cancellationToken).ConfigureAwait(false)) + { + break; + } + + var entry = entryBuffer.AsSpan(0, header.ProgramHeaderEntrySize); + var type = ReadUInt32(entry, 0, header.IsLittleEndian); + if (type != 
ProgramHeaderTypeNote) + { + continue; + } + + ulong segmentOffset; + ulong segmentSize; + if (header.Class == ElfClass64) + { + segmentOffset = ReadUInt64(entry, 8, header.IsLittleEndian); + segmentSize = ReadUInt64(entry, 32, header.IsLittleEndian); + } + else + { + segmentOffset = ReadUInt32(entry, 4, header.IsLittleEndian); + segmentSize = ReadUInt32(entry, 16, header.IsLittleEndian); + } + + if (segmentSize == 0 || segmentOffset > (ulong)stream.Length) + { + continue; + } + + var boundedSize = (int)Math.Min(segmentSize, (ulong)MaxNoteSegmentBytes); + if (boundedSize <= 0) + { + continue; + } + + stream.Seek((long)segmentOffset, SeekOrigin.Begin); + var segmentBuffer = new byte[boundedSize]; + if (!await ReadExactlyAsync(stream, segmentBuffer.AsMemory(0, boundedSize), cancellationToken).ConfigureAwait(false)) + { + continue; + } + + var buildId = ParseNoteSegment(segmentBuffer.AsSpan(0, boundedSize), header.IsLittleEndian); + if (buildId is not null) + { + return buildId; + } + } + + return null; + } + + private static string? 
ParseNoteSegment(ReadOnlySpan segment, bool isLittleEndian) + { + var offset = 0; + while (offset + 12 <= segment.Length) + { + var nameSize = ReadUInt32(segment, offset, isLittleEndian); + var descSize = ReadUInt32(segment, offset + 4, isLittleEndian); + var type = ReadUInt32(segment, offset + 8, isLittleEndian); + offset += 12; + + if (nameSize > int.MaxValue || descSize > int.MaxValue) + { + return null; + } + + var alignedNameSize = Align((int)nameSize); + var alignedDescSize = Align((int)descSize); + + if (offset + alignedNameSize + alignedDescSize > segment.Length) + { + return null; + } + + var nameBytes = segment.Slice(offset, (int)nameSize); + offset += alignedNameSize; + + var descriptorBytes = segment.Slice(offset, (int)descSize); + offset += alignedDescSize; + + if (type == NoteTypeGnuBuildId && IsGnuName(nameBytes)) + { + return Convert.ToHexString(descriptorBytes).ToLowerInvariant(); + } + } + + return null; + } + + private static bool IsGnuName(ReadOnlySpan name) + { + var length = name.IndexOf((byte)0); + if (length < 0) + { + length = name.Length; + } + + if (length != 3) + { + return false; + } + + return name[0] == (byte)'G' + && name[1] == (byte)'N' + && name[2] == (byte)'U'; + } + + private static async Task ReadHeaderAsync(Stream stream, CancellationToken cancellationToken) + { + stream.Seek(0, SeekOrigin.Begin); + var identBuffer = new byte[ElfIdentificationSize]; + if (!await ReadExactlyAsync(stream, identBuffer.AsMemory(0, ElfIdentificationSize), cancellationToken).ConfigureAwait(false)) + { + return null; + } + + var ident = identBuffer.AsSpan(); + if (ident[0] != 0x7F || ident[1] != (byte)'E' || ident[2] != (byte)'L' || ident[3] != (byte)'F') + { + return null; + } + + var elfClass = ident[4]; + if (elfClass != ElfClass32 && elfClass != ElfClass64) + { + return null; + } + + var dataEncoding = ident[5]; + var isLittleEndian = dataEncoding is ElfDataLittleEndian or 0; + if (dataEncoding == 0) + { + isLittleEndian = true; + } + else if 
(dataEncoding != ElfDataLittleEndian && dataEncoding != ElfDataBigEndian) + { + return null; + } + + var remainingHeaderSize = elfClass == ElfClass64 ? 64 - ElfIdentificationSize : 52 - ElfIdentificationSize; + var buffer = new byte[remainingHeaderSize]; + if (!await ReadExactlyAsync(stream, buffer.AsMemory(0, remainingHeaderSize), cancellationToken).ConfigureAwait(false)) + { + return null; + } + + var span = buffer.AsSpan(0, remainingHeaderSize); + ulong programHeaderOffset; + ushort programHeaderEntrySize; + ushort programHeaderCount; + + if (elfClass == ElfClass64) + { + programHeaderOffset = ReadUInt64(span, 16, isLittleEndian); + programHeaderEntrySize = ReadUInt16(span, 38, isLittleEndian); + programHeaderCount = ReadUInt16(span, 40, isLittleEndian); + } + else + { + programHeaderOffset = ReadUInt32(span, 12, isLittleEndian); + programHeaderEntrySize = ReadUInt16(span, 26, isLittleEndian); + programHeaderCount = ReadUInt16(span, 28, isLittleEndian); + } + + return new ElfHeader(elfClass, isLittleEndian, programHeaderOffset, programHeaderEntrySize, programHeaderCount); + } + + private static uint ReadUInt32(ReadOnlySpan buffer, int offset, bool isLittleEndian) + { + var slice = buffer.Slice(offset, sizeof(uint)); + return isLittleEndian + ? BinaryPrimitives.ReadUInt32LittleEndian(slice) + : BinaryPrimitives.ReadUInt32BigEndian(slice); + } + + private static ulong ReadUInt64(ReadOnlySpan buffer, int offset, bool isLittleEndian) + { + var slice = buffer.Slice(offset, sizeof(ulong)); + return isLittleEndian + ? BinaryPrimitives.ReadUInt64LittleEndian(slice) + : BinaryPrimitives.ReadUInt64BigEndian(slice); + } + + private static ushort ReadUInt16(ReadOnlySpan buffer, int offset, bool isLittleEndian) + { + var slice = buffer.Slice(offset, sizeof(ushort)); + return isLittleEndian + ? 
BinaryPrimitives.ReadUInt16LittleEndian(slice) + : BinaryPrimitives.ReadUInt16BigEndian(slice); + } + + private static int Align(int value) + => (value + (Alignment - 1)) & ~(Alignment - 1); + + private static async Task ReadExactlyAsync(Stream stream, Memory buffer, CancellationToken cancellationToken) + { + var total = 0; + while (total < buffer.Length) + { + var read = await stream.ReadAsync(buffer.Slice(total), cancellationToken).ConfigureAwait(false); + if (read == 0) + { + return false; + } + + total += read; + } + + return true; + } + + private readonly record struct ElfHeader(byte Class, bool IsLittleEndian, ulong ProgramHeaderOffset, ushort ProgramHeaderEntrySize, ushort ProgramHeaderCount) + { + public byte Class { get; } = Class; + public bool IsLittleEndian { get; } = IsLittleEndian; + public ulong ProgramHeaderOffset { get; } = ProgramHeaderOffset; + public ushort ProgramHeaderEntrySize { get; } = ProgramHeaderEntrySize; + public ushort ProgramHeaderCount { get; } = ProgramHeaderCount; + } +} diff --git a/src/StellaOps.Zastava.Observer/Runtime/RuntimeEventBuffer.cs b/src/StellaOps.Zastava.Observer/Runtime/RuntimeEventBuffer.cs new file mode 100644 index 00000000..d53897d5 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Runtime/RuntimeEventBuffer.cs @@ -0,0 +1,297 @@ +using System.Collections.Concurrent; +using System.Linq; +using System.Runtime.CompilerServices; +using System.Threading.Channels; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Core.Serialization; +using StellaOps.Zastava.Observer.Configuration; + +namespace StellaOps.Zastava.Observer.Runtime; + +internal interface IRuntimeEventBuffer +{ + ValueTask WriteBatchAsync(IReadOnlyList envelopes, CancellationToken cancellationToken); + + IAsyncEnumerable ReadAllAsync(CancellationToken cancellationToken); +} + +internal sealed record RuntimeEventBufferItem( + RuntimeEventEnvelope Envelope, + Func 
CompleteAsync, + Func RequeueAsync); + +internal sealed class RuntimeEventBuffer : IRuntimeEventBuffer +{ + private static readonly string FileExtension = ".json"; + + private readonly Channel channel; + private readonly ConcurrentDictionary inFlight = new(StringComparer.OrdinalIgnoreCase); + private readonly object capacityLock = new(); + private readonly string spoolPath; + private readonly ILogger logger; + private readonly TimeProvider timeProvider; + private readonly long maxDiskBytes; + + private long currentBytes; + private readonly int capacity; + + public RuntimeEventBuffer( + IOptions observerOptions, + TimeProvider timeProvider, + ILogger logger) + { + ArgumentNullException.ThrowIfNull(observerOptions); + this.timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + + var options = observerOptions.Value ?? throw new ArgumentNullException(nameof(observerOptions)); + + capacity = Math.Clamp(options.MaxInMemoryBuffer, 16, 65536); + spoolPath = EnsureSpoolDirectory(options.EventBufferPath); + maxDiskBytes = Math.Clamp(options.MaxDiskBufferBytes, 1_048_576L, 1_073_741_824L); // 1 MiB – 1 GiB + + var channelOptions = new BoundedChannelOptions(capacity) + { + AllowSynchronousContinuations = false, + FullMode = BoundedChannelFullMode.Wait, + SingleReader = false, + SingleWriter = false + }; + + channel = Channel.CreateBounded(channelOptions); + + var existingFiles = Directory.EnumerateFiles(spoolPath, $"*{FileExtension}", SearchOption.TopDirectoryOnly) + .OrderBy(static path => path, StringComparer.Ordinal) + .ToArray(); + + foreach (var path in existingFiles) + { + var size = TryGetLength(path); + if (size > 0) + { + Interlocked.Add(ref currentBytes, size); + } + + // enqueue existing events for replay + if (!channel.Writer.TryWrite(path)) + { + _ = channel.Writer.WriteAsync(path); + } + } + + if (existingFiles.Length > 0) + { + 
logger.LogInformation("Runtime event buffer restored {Count} pending events ({Bytes} bytes) from disk spool.", + existingFiles.Length, + Interlocked.Read(ref currentBytes)); + } + } + + public async ValueTask WriteBatchAsync(IReadOnlyList envelopes, CancellationToken cancellationToken) + { + if (envelopes is null || envelopes.Count == 0) + { + return; + } + + foreach (var envelope in envelopes) + { + cancellationToken.ThrowIfCancellationRequested(); + + var payload = ZastavaCanonicalJsonSerializer.SerializeToUtf8Bytes(envelope); + var filePath = await PersistAsync(payload, cancellationToken).ConfigureAwait(false); + + await channel.Writer.WriteAsync(filePath, cancellationToken).ConfigureAwait(false); + } + + if (envelopes.Count > capacity / 2) + { + logger.LogDebug("Buffered {Count} runtime events; channel capacity {Capacity}.", envelopes.Count, capacity); + } + } + + public async IAsyncEnumerable ReadAllAsync([EnumeratorCancellation] CancellationToken cancellationToken) + { + while (await channel.Reader.WaitToReadAsync(cancellationToken).ConfigureAwait(false)) + { + while (channel.Reader.TryRead(out var filePath)) + { + cancellationToken.ThrowIfCancellationRequested(); + if (!File.Exists(filePath)) + { + RemoveMetricsForMissingFile(filePath); + continue; + } + + RuntimeEventEnvelope? 
envelope = null; + try + { + var json = await File.ReadAllTextAsync(filePath, cancellationToken).ConfigureAwait(false); + envelope = ZastavaCanonicalJsonSerializer.Deserialize(json); + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to read runtime event payload from {Path}; dropping.", filePath); + await DeleteFileSilentlyAsync(filePath).ConfigureAwait(false); + continue; + } + + var currentPath = filePath; + inFlight[currentPath] = 0; + + yield return new RuntimeEventBufferItem( + envelope, + CompleteAsync(currentPath), + RequeueAsync(currentPath)); + } + } + } + + private Func CompleteAsync(string filePath) + => async () => + { + try + { + await DeleteFileSilentlyAsync(filePath).ConfigureAwait(false); + } + finally + { + inFlight.TryRemove(filePath, out _); + } + }; + + private Func RequeueAsync(string filePath) + => async cancellationToken => + { + inFlight.TryRemove(filePath, out _); + if (!File.Exists(filePath)) + { + RemoveMetricsForMissingFile(filePath); + return; + } + + await channel.Writer.WriteAsync(filePath, cancellationToken).ConfigureAwait(false); + }; + + private async Task PersistAsync(byte[] payload, CancellationToken cancellationToken) + { + var timestamp = timeProvider.GetUtcNow().UtcTicks; + var fileName = $"{timestamp:D20}-{Guid.NewGuid():N}{FileExtension}"; + var filePath = Path.Combine(spoolPath, fileName); + + Directory.CreateDirectory(spoolPath); + await File.WriteAllBytesAsync(filePath, payload, cancellationToken).ConfigureAwait(false); + Interlocked.Add(ref currentBytes, payload.Length); + + EnforceCapacity(); + return filePath; + } + + private void EnforceCapacity() + { + if (Volatile.Read(ref currentBytes) <= maxDiskBytes) + { + return; + } + + lock (capacityLock) + { + if (currentBytes <= maxDiskBytes) + { + return; + } + + var candidates = Directory.EnumerateFiles(spoolPath, $"*{FileExtension}", SearchOption.TopDirectoryOnly) + .OrderBy(static path => path, StringComparer.Ordinal) + .ToArray(); + + foreach (var file in 
candidates) + { + if (currentBytes <= maxDiskBytes) + { + break; + } + + if (inFlight.ContainsKey(file)) + { + continue; + } + + var length = TryGetLength(file); + try + { + File.Delete(file); + if (length > 0) + { + Interlocked.Add(ref currentBytes, -length); + } + + logger.LogWarning("Dropped runtime event {FileName} to enforce disk buffer capacity (limit {MaxBytes} bytes).", + Path.GetFileName(file), + maxDiskBytes); + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to purge runtime event buffer file {FileName}.", Path.GetFileName(file)); + } + } + } + } + + private Task DeleteFileSilentlyAsync(string filePath) + { + if (!File.Exists(filePath)) + { + return Task.CompletedTask; + } + + var length = TryGetLength(filePath); + try + { + File.Delete(filePath); + if (length > 0) + { + Interlocked.Add(ref currentBytes, -length); + } + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to delete runtime event buffer file {FileName}.", Path.GetFileName(filePath)); + } + return Task.CompletedTask; + } + + private void RemoveMetricsForMissingFile(string filePath) + { + var length = TryGetLength(filePath); + if (length > 0) + { + Interlocked.Add(ref currentBytes, -length); + } + } + + private static string EnsureSpoolDirectory(string? value) + { + var path = string.IsNullOrWhiteSpace(value) + ? Path.Combine(Path.GetTempPath(), "zastava-observer", "runtime-events") + : value!; + + Directory.CreateDirectory(path); + return path; + } + + private static long TryGetLength(string path) + { + try + { + var info = new FileInfo(path); + return info.Exists ? 
info.Length : 0; + } + catch + { + return 0; + } + } +} diff --git a/src/StellaOps.Zastava.Observer/Runtime/RuntimeProcessCollector.cs b/src/StellaOps.Zastava.Observer/Runtime/RuntimeProcessCollector.cs new file mode 100644 index 00000000..ff75ba45 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Runtime/RuntimeProcessCollector.cs @@ -0,0 +1,525 @@ +using System.Buffers; +using System.Collections.Generic; +using System.Globalization; +using System.IO; +using System.Security.Cryptography; +using System.Text; +using System.Text.RegularExpressions; +using System.Runtime.CompilerServices; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; + +namespace StellaOps.Zastava.Observer.Runtime; + +internal interface IRuntimeProcessCollector +{ + Task CollectAsync(CriContainerInfo container, CancellationToken cancellationToken); +} + +internal sealed class RuntimeProcessCollector : IRuntimeProcessCollector +{ + private static readonly Regex ShellRegex = new(@"(^|/)(ba)?sh$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled); + private static readonly Regex PythonRegex = new(@"(^|/)(python)(\d+(\.\d+)*)?$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled); + private static readonly Regex NodeRegex = new(@"(^|/)(node|npm|npx)$", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled); + private const string SyntheticArgvFile = ""; + private const int MaxInterpreterTargetLength = 512; + + private readonly ZastavaObserverOptions options; + private readonly ILogger logger; + + public RuntimeProcessCollector(IOptions options, ILogger logger) + { + ArgumentNullException.ThrowIfNull(options); + this.options = options.Value; + this.logger = logger ?? 
throw new ArgumentNullException(nameof(logger)); + } + + public async Task CollectAsync(CriContainerInfo container, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(container); + if (container.Pid is null or <= 0) + { + logger.LogDebug("Container {ContainerId} lacks PID information; skipping process capture.", container.Id); + return null; + } + + var pid = container.Pid.Value; + var procRoot = options.ProcRootPath.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar); + var pidDirectory = Path.Combine(procRoot, pid.ToString(CultureInfo.InvariantCulture)); + + try + { + var process = await ReadProcessAsync(pidDirectory, pid, cancellationToken).ConfigureAwait(false); + if (process is null) + { + logger.LogDebug("No cmdline information available for PID {Pid}; skipping process capture.", pid); + return null; + } + + var buildId = await ElfBuildIdReader.TryReadBuildIdAsync(Path.Combine(pidDirectory, "exe"), cancellationToken).ConfigureAwait(false); + if (!string.IsNullOrWhiteSpace(buildId)) + { + process = process with { BuildId = buildId }; + } + + var (libraries, evidence) = await ReadLibrariesAsync(pidDirectory, cancellationToken).ConfigureAwait(false); + evidence.Insert(0, new RuntimeEvidence + { + Signal = "procfs.cmdline", + Value = $"{pid}:{string.Join(' ', process.Entrypoint)}" + }); + + if (!string.IsNullOrWhiteSpace(buildId)) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.buildId", + Value = buildId + }); + } + + return new RuntimeProcessCapture(process, libraries, evidence); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + throw; + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to capture process information for container {ContainerId} (PID {Pid}).", container.Id, pid); + return null; + } + } + + private async Task ReadProcessAsync(string pidDirectory, int pid, CancellationToken cancellationToken) + { + var cmdlinePath = 
Path.Combine(pidDirectory, "cmdline"); + if (!File.Exists(cmdlinePath)) + { + return null; + } + + var content = await File.ReadAllBytesAsync(cmdlinePath, cancellationToken).ConfigureAwait(false); + if (content.Length == 0) + { + return null; + } + + var arguments = ParseCmdline(content, options.MaxEntrypointArguments); + if (arguments.Count == 0) + { + return null; + } + + var entryTrace = BuildEntryTrace(arguments); + + return new RuntimeProcess + { + Pid = pid, + Entrypoint = arguments, + EntryTrace = entryTrace + }; + } + + private async Task<(IReadOnlyList Libraries, List Evidence)> ReadLibrariesAsync( + string pidDirectory, + CancellationToken cancellationToken) + { + var mapsPath = Path.Combine(pidDirectory, "maps"); + var libraries = new List(); + var evidence = new List(); + + if (!File.Exists(mapsPath)) + { + return (libraries, evidence); + } + + var seen = new HashSet(StringComparer.Ordinal); + var limit = Math.Max(1, options.MaxTrackedLibraries); + var perFileLimit = Math.Max(1024L, options.MaxLibraryBytes); + var hashBudget = options.MaxLibraryHashBytes <= 0 + ? long.MaxValue + : Math.Max(perFileLimit, options.MaxLibraryHashBytes); + long hashedBytes = 0; + var budgetSignaled = false; + + await foreach (var line in ReadLinesAsync(mapsPath, cancellationToken)) + { + if (!TryParseMapsEntry(line, out var path, out var baseAddress)) + { + continue; + } + + if (!seen.Add(path)) + { + continue; + } + + if (libraries.Count >= limit) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.truncated", + Value = $"limit={limit}" + }); + break; + } + + long length; + long? 
inode; + try + { + var fileInfo = new FileInfo(path); + length = fileInfo.Length; + inode = TryGetInode(fileInfo); + } + catch (Exception ex) when (ex is IOException or UnauthorizedAccessException) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.error", + Value = $"{path}:{ex.GetType().Name}" + }); + continue; + } + + var sizeExceeded = length > perFileLimit; + string? hash = null; + + if (!sizeExceeded && length > 0) + { + var remainingBudget = hashBudget - hashedBytes; + if (remainingBudget <= 0 || length > remainingBudget) + { + if (!budgetSignaled && hashBudget != long.MaxValue) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.hashBudget", + Value = $"limit={hashBudget}" + }); + budgetSignaled = true; + } + } + else + { + try + { + using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete); + hash = await ComputeSha256Async(stream, cancellationToken).ConfigureAwait(false); + hashedBytes += length; + } + catch (Exception ex) when (ex is IOException or UnauthorizedAccessException) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.error", + Value = $"{path}:{ex.GetType().Name}" + }); + } + } + } + + if (sizeExceeded) + { + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.skipped", + Value = $"{path}:size>{perFileLimit}" + }); + } + + var library = new RuntimeLoadedLibrary + { + Path = path, + Inode = inode, + Sha256 = hash + }; + libraries.Add(library); + + var value = baseAddress is null ? 
path : $"{path}@{baseAddress}"; + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps", + Value = value + }); + } + + evidence.Add(new RuntimeEvidence + { + Signal = "procfs.maps.count", + Value = libraries.Count.ToString(CultureInfo.InvariantCulture) + }); + + return (libraries, evidence); + } + + private static async IAsyncEnumerable ReadLinesAsync(string path, [EnumeratorCancellation] CancellationToken cancellationToken) + { + using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete); + using var reader = new StreamReader(stream, Encoding.UTF8); + while (true) + { + cancellationToken.ThrowIfCancellationRequested(); + var line = await reader.ReadLineAsync().ConfigureAwait(false); + if (line is null) + { + yield break; + } + + yield return line; + } + } + + private static bool TryParseMapsEntry(string line, out string path, out string? baseAddress) + { + path = string.Empty; + baseAddress = null; + + if (string.IsNullOrWhiteSpace(line)) + { + return false; + } + + var span = line.AsSpan().Trim(); + var lastSpace = span.LastIndexOf(' '); + if (lastSpace < 0 || lastSpace >= span.Length - 1) + { + return false; + } + + var candidate = span[(lastSpace + 1)..].Trim(); + if (candidate.IsEmpty || candidate[0] == '[') + { + return false; + } + + path = candidate.ToString(); + + var firstSpace = span.IndexOf(' '); + if (firstSpace > 0) + { + var rangeSpan = span[..firstSpace]; + var dashIndex = rangeSpan.IndexOf('-'); + if (dashIndex > 0) + { + var startSpan = rangeSpan[..dashIndex]; + if (!startSpan.IsEmpty) + { + baseAddress = startSpan.StartsWith("0x", StringComparison.OrdinalIgnoreCase) + ? 
startSpan.ToString() + : $"0x{startSpan.ToString()}"; + } + } + } + + return true; + } + + private static async Task ComputeSha256Async(Stream stream, CancellationToken cancellationToken) + { + using var sha = SHA256.Create(); + var buffer = ArrayPool.Shared.Rent(8192); + try + { + int read; + while ((read = await stream.ReadAsync(buffer.AsMemory(0, buffer.Length), cancellationToken).ConfigureAwait(false)) > 0) + { + sha.TransformBlock(buffer, 0, read, null, 0); + } + sha.TransformFinalBlock(Array.Empty(), 0, 0); + return Convert.ToHexString(sha.Hash!).ToLowerInvariant(); + } + finally + { + ArrayPool.Shared.Return(buffer); + } + } + + private static long? TryGetInode(FileInfo fileInfo) => null; + + private static List ParseCmdline(byte[] content, int maxArguments) + { + var segments = Encoding.UTF8.GetString(content).Split('\0', StringSplitOptions.RemoveEmptyEntries); + var list = segments.Take(maxArguments).ToList(); + return list; + } + + private static IReadOnlyList BuildEntryTrace(IReadOnlyList arguments) + { + var traces = new List(); + if (arguments.Count == 0) + { + return traces; + } + + var first = arguments[0]; + traces.Add(new RuntimeEntryTrace + { + File = first, + Op = "exec", + Target = first + }); + + if (arguments.Count >= 3 && ShellRegex.IsMatch(first) && string.Equals(arguments[1], "-c", StringComparison.Ordinal)) + { + var script = arguments[2]; + var tokens = TokenizeCommand(script); + if (tokens.Count > 0) + { + traces.Add(new RuntimeEntryTrace + { + File = tokens[0], + Op = "shell", + Target = script + }); + + TryAddInterpreterTrace(traces, tokens); + } + } + else + { + TryAddInterpreterTrace(traces, arguments); + } + + return traces; + } + + private static void TryAddInterpreterTrace(List traces, IReadOnlyList tokens) + { + if (tokens.Count == 0) + { + return; + } + + var interpreter = tokens[0]; + if (PythonRegex.IsMatch(interpreter)) + { + var target = ResolveInterpreterTarget(tokens, 1); + if (!string.IsNullOrEmpty(target)) + { + 
traces.Add(new RuntimeEntryTrace + { + File = SyntheticArgvFile, + Op = "python", + Target = TrimTarget(target!) + }); + } + } + else if (NodeRegex.IsMatch(interpreter)) + { + var target = ResolveInterpreterTarget(tokens, 1); + if (!string.IsNullOrEmpty(target)) + { + traces.Add(new RuntimeEntryTrace + { + File = SyntheticArgvFile, + Op = "node", + Target = TrimTarget(target!) + }); + } + } + } + + private static string? ResolveInterpreterTarget(IReadOnlyList tokens, int startIndex) + { + for (var i = startIndex; i < tokens.Count; i++) + { + var candidate = tokens[i]; + if (string.IsNullOrWhiteSpace(candidate)) + { + continue; + } + + if (candidate.StartsWith("-", StringComparison.Ordinal)) + { + if ((string.Equals(candidate, "-m", StringComparison.Ordinal) + || string.Equals(candidate, "-c", StringComparison.Ordinal) + || string.Equals(candidate, "-e", StringComparison.Ordinal)) + && i + 1 < tokens.Count) + { + return tokens[i + 1]; + } + + continue; + } + + return candidate; + } + + return null; + } + + private static string TrimTarget(string value) + { + if (value.Length <= MaxInterpreterTargetLength) + { + return value; + } + + return value[..MaxInterpreterTargetLength]; + } + + private static List TokenizeCommand(string command) + { + var tokens = new List(); + if (string.IsNullOrWhiteSpace(command)) + { + return tokens; + } + + var current = new StringBuilder(); + bool inQuotes = false; + char quoteChar = '"'; + + foreach (var ch in command) + { + if (inQuotes) + { + if (ch == quoteChar) + { + inQuotes = false; + } + else + { + current.Append(ch); + } + } + else + { + if (ch == '"' || ch == '\'') + { + inQuotes = true; + quoteChar = ch; + } + else if (char.IsWhiteSpace(ch)) + { + if (current.Length > 0) + { + tokens.Add(current.ToString()); + current.Clear(); + } + } + else + { + current.Append(ch); + } + } + } + + if (current.Length > 0) + { + tokens.Add(current.ToString()); + } + + return tokens; + } +} + +internal sealed record RuntimeProcessCapture( + 
RuntimeProcess Process, + IReadOnlyList Libraries, + IReadOnlyList Evidence); diff --git a/src/StellaOps.Zastava.Observer/TASKS.md b/src/StellaOps.Zastava.Observer/TASKS.md index 164a17b4..6254b21c 100644 --- a/src/StellaOps.Zastava.Observer/TASKS.md +++ b/src/StellaOps.Zastava.Observer/TASKS.md @@ -2,8 +2,10 @@ | ID | Status | Owner(s) | Depends on | Description | Exit Criteria | |----|--------|----------|------------|-------------|---------------| -| ZASTAVA-OBS-12-001 | DOING | Zastava Observer Guild | ZASTAVA-CORE-12-201 | Build container lifecycle watcher that tails CRI (containerd/cri-o/docker) events and emits deterministic runtime records with buffering + backoff. | Fixture cluster produces start/stop events with stable ordering, jitter/backoff tested, metrics/logging wired. | -| ZASTAVA-OBS-12-002 | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-001 | Capture entrypoint traces and loaded libraries, hashing binaries and correlating to SBOM baseline per architecture sections 2.1 and 10. | EntryTrace parser covers shell/python/node launchers, loaded library hashes recorded, fixtures assert linkage to SBOM usage view. | -| ZASTAVA-OBS-12-003 | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Implement runtime posture checks (signature/SBOM/attestation presence) with offline caching and warning surfaces. | Observer marks posture status, caches refresh across restarts, integration tests prove offline tolerance. | -| ZASTAVA-OBS-12-004 | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Batch `/runtime/events` submissions with disk-backed buffer, rate limits, and deterministic envelopes. | Buffered submissions survive restart, rate-limits enforced in tests, JSON envelopes match schema in docs/events. | -| ZASTAVA-OBS-17-005 | TODO | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Collect GNU build-id for ELF processes and attach it to emitted runtime events to enable symbol lookup + debug-store correlation. 
| Observer reads build-id via `/proc/<pid>/exe`/notes without pausing workloads, runtime events include `buildId` field, fixtures cover glibc/musl images, docs updated with retrieval notes. | +| ZASTAVA-OBS-12-001 | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-CORE-12-201 | Build container lifecycle watcher that tails CRI (containerd/cri-o/docker) events and emits deterministic runtime records with buffering + backoff. | Fixture cluster produces start/stop events with stable ordering, jitter/backoff tested, metrics/logging wired. | +| ZASTAVA-OBS-12-002 | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-001 | Capture entrypoint traces and loaded libraries, hashing binaries and correlating to SBOM baseline per architecture sections 2.1 and 10. | EntryTrace parser covers shell/python/node launchers, loaded library hashes recorded, fixtures assert linkage to SBOM usage view. | +| ZASTAVA-OBS-12-003 | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Implement runtime posture checks (signature/SBOM/attestation presence) with offline caching and warning surfaces. | Observer marks posture status, caches refresh across restarts, integration tests prove offline tolerance. | +| ZASTAVA-OBS-12-004 | DONE (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Batch `/runtime/events` submissions with disk-backed buffer, rate limits, and deterministic envelopes. | Buffered submissions survive restart, rate-limits enforced in tests, JSON envelopes match schema in docs/events. | +| ZASTAVA-OBS-17-005 | DOING (2025-10-24) | Zastava Observer Guild | ZASTAVA-OBS-12-002 | Collect GNU build-id for ELF processes and attach it to emitted runtime events to enable symbol lookup + debug-store correlation. | Observer reads build-id via `/proc/<pid>/exe`/notes without pausing workloads, runtime events include `buildId` field, fixtures cover glibc/musl images, docs updated with retrieval notes.
| + +> 2025-10-24: Observer unit tests pending; `dotnet restore` requires offline copies of `Google.Protobuf`, `Grpc.Net.Client`, `Grpc.Tools` in `local-nuget` before execution can be verified. diff --git a/src/StellaOps.Zastava.Observer/Worker/BackoffCalculator.cs b/src/StellaOps.Zastava.Observer/Worker/BackoffCalculator.cs new file mode 100644 index 00000000..bf926f59 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Worker/BackoffCalculator.cs @@ -0,0 +1,26 @@ +using StellaOps.Zastava.Observer.Configuration; + +namespace StellaOps.Zastava.Observer.Worker; + +internal static class BackoffCalculator +{ + public static TimeSpan ComputeDelay(ObserverBackoffOptions options, int attempt, Random random) + { + ArgumentNullException.ThrowIfNull(options); + ArgumentNullException.ThrowIfNull(random); + + var cappedAttempt = Math.Max(1, attempt); + var baseDelayMs = options.Initial.TotalMilliseconds * Math.Pow(2, cappedAttempt - 1); + baseDelayMs = Math.Min(baseDelayMs, options.Max.TotalMilliseconds); + + if (options.JitterRatio <= 0) + { + return TimeSpan.FromMilliseconds(baseDelayMs); + } + + var jitterWindow = baseDelayMs * options.JitterRatio; + var jitter = (random.NextDouble() * 2 - 1) * jitterWindow; + var jittered = Math.Clamp(baseDelayMs + jitter, options.Initial.TotalMilliseconds, options.Max.TotalMilliseconds); + return TimeSpan.FromMilliseconds(jittered); + } +} diff --git a/src/StellaOps.Zastava.Observer/Worker/ContainerLifecycleHostedService.cs b/src/StellaOps.Zastava.Observer/Worker/ContainerLifecycleHostedService.cs new file mode 100644 index 00000000..4a45a375 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Worker/ContainerLifecycleHostedService.cs @@ -0,0 +1,197 @@ +using System.Collections.Generic; +using System.Linq; +using Microsoft.Extensions.Hosting; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Core.Configuration; +using 
StellaOps.Zastava.Core.Diagnostics; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Runtime; + +namespace StellaOps.Zastava.Observer.Worker; + +internal sealed class ContainerLifecycleHostedService : BackgroundService +{ + private readonly ICriRuntimeClientFactory clientFactory; + private readonly IOptionsMonitor observerOptions; + private readonly IOptionsMonitor runtimeOptions; + private readonly IZastavaLogScopeBuilder logScopeBuilder; + private readonly IZastavaRuntimeMetrics runtimeMetrics; + private readonly IRuntimeEventBuffer eventBuffer; + private readonly ContainerStateTrackerFactory trackerFactory; + private readonly ContainerRuntimePoller poller; + private readonly IRuntimeProcessCollector processCollector; + private readonly TimeProvider timeProvider; + private readonly ILogger logger; + private readonly Random jitterRandom = new(); + + public ContainerLifecycleHostedService( + ICriRuntimeClientFactory clientFactory, + IOptionsMonitor observerOptions, + IOptionsMonitor runtimeOptions, + IZastavaLogScopeBuilder logScopeBuilder, + IZastavaRuntimeMetrics runtimeMetrics, + IRuntimeEventBuffer eventBuffer, + ContainerStateTrackerFactory trackerFactory, + ContainerRuntimePoller poller, + IRuntimeProcessCollector processCollector, + TimeProvider timeProvider, + ILogger logger) + { + this.clientFactory = clientFactory ?? throw new ArgumentNullException(nameof(clientFactory)); + this.observerOptions = observerOptions ?? throw new ArgumentNullException(nameof(observerOptions)); + this.runtimeOptions = runtimeOptions ?? throw new ArgumentNullException(nameof(runtimeOptions)); + this.logScopeBuilder = logScopeBuilder ?? throw new ArgumentNullException(nameof(logScopeBuilder)); + this.runtimeMetrics = runtimeMetrics ?? throw new ArgumentNullException(nameof(runtimeMetrics)); + this.eventBuffer = eventBuffer ?? 
throw new ArgumentNullException(nameof(eventBuffer)); + this.trackerFactory = trackerFactory ?? throw new ArgumentNullException(nameof(trackerFactory)); + this.poller = poller ?? throw new ArgumentNullException(nameof(poller)); + this.processCollector = processCollector ?? throw new ArgumentNullException(nameof(processCollector)); + this.timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + protected override Task ExecuteAsync(CancellationToken stoppingToken) + { + var options = observerOptions.CurrentValue; + var activeEndpoints = options.Runtimes + .Where(static runtime => runtime.Enabled) + .ToArray(); + + if (activeEndpoints.Length == 0) + { + logger.LogWarning("No container runtime endpoints configured; lifecycle watcher idle."); + return Task.CompletedTask; + } + + var tasks = activeEndpoints + .Select(endpoint => MonitorRuntimeAsync(endpoint, stoppingToken)) + .ToArray(); + + return Task.WhenAll(tasks); + } + + private async Task MonitorRuntimeAsync(ContainerRuntimeEndpointOptions endpoint, CancellationToken cancellationToken) + { + var runtime = runtimeOptions.CurrentValue; + var tenant = runtime.Tenant; + var nodeName = observerOptions.CurrentValue.NodeName; + var pollInterval = endpoint.PollInterval ?? 
observerOptions.CurrentValue.PollInterval; + var backoffOptions = observerOptions.CurrentValue.Backoff; + + while (!cancellationToken.IsCancellationRequested) + { + await using var client = clientFactory.Create(endpoint); + CriRuntimeIdentity identity; + try + { + identity = await client.GetIdentityAsync(cancellationToken).ConfigureAwait(false); + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + await HandleFailureAsync(endpoint, 1, backoffOptions, ex, cancellationToken).ConfigureAwait(false); + continue; + } + + var tracker = trackerFactory.Create(); + var failureCount = 0; + + while (!cancellationToken.IsCancellationRequested) + { + try + { + var envelopes = await poller.PollAsync( + tracker, + client, + endpoint, + identity, + tenant, + nodeName, + timeProvider, + processCollector, + cancellationToken).ConfigureAwait(false); + + if (envelopes.Count > 0) + { + await PublishAsync(endpoint, envelopes, cancellationToken).ConfigureAwait(false); + } + + failureCount = 0; + await Task.Delay(pollInterval, cancellationToken).ConfigureAwait(false); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + return; + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + failureCount++; + await HandleFailureAsync(endpoint, failureCount, backoffOptions, ex, cancellationToken).ConfigureAwait(false); + break; // recreate client + } + } + } + } + + private async Task PublishAsync(ContainerRuntimeEndpointOptions endpoint, IReadOnlyList envelopes, CancellationToken cancellationToken) + { + var endpointName = endpoint.ResolveName(); + foreach (var envelope in envelopes) + { + var tags = runtimeMetrics.DefaultTags + .Concat(new[] + { + new KeyValuePair("runtime_endpoint", endpointName), + new KeyValuePair("event_kind", envelope.Event.Kind.ToString().ToLowerInvariant()) + }) + .ToArray(); + runtimeMetrics.RuntimeEvents.Add(1, tags); + + var scope = logScopeBuilder.BuildScope( + 
correlationId: envelope.Event.EventId, + node: envelope.Event.Node, + workload: envelope.Event.Workload.ContainerId, + eventId: envelope.Event.EventId, + additional: new Dictionary + { + ["runtimeEndpoint"] = endpointName, + ["kind"] = envelope.Event.Kind.ToString() + }); + + using (logger.BeginScope(scope)) + { + logger.LogInformation("Observed container {ContainerId} ({Kind}) for node {Node}.", + envelope.Event.Workload.ContainerId, + envelope.Event.Kind, + envelope.Event.Node); + } + } + + await eventBuffer.WriteBatchAsync(envelopes, cancellationToken).ConfigureAwait(false); + } + + private async Task HandleFailureAsync( + ContainerRuntimeEndpointOptions endpoint, + int failureCount, + ObserverBackoffOptions backoffOptions, + Exception exception, + CancellationToken cancellationToken) + { + var delay = BackoffCalculator.ComputeDelay(backoffOptions, failureCount, jitterRandom); + logger.LogWarning(exception, "Runtime watcher for {Endpoint} encountered error (attempt {Attempt}); retrying after {Delay}.", + endpoint.ResolveName(), + failureCount, + delay); + + try + { + await Task.Delay(delay, cancellationToken).ConfigureAwait(false); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + } + } +} diff --git a/src/StellaOps.Zastava.Observer/Worker/ContainerRuntimePoller.cs b/src/StellaOps.Zastava.Observer/Worker/ContainerRuntimePoller.cs new file mode 100644 index 00000000..31c1b3b3 --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Worker/ContainerRuntimePoller.cs @@ -0,0 +1,124 @@ +using Microsoft.Extensions.Logging; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Cri; +using StellaOps.Zastava.Observer.Posture; +using StellaOps.Zastava.Observer.Runtime; + +namespace StellaOps.Zastava.Observer.Worker; + +internal sealed class 
ContainerRuntimePoller +{ + private readonly ILogger logger; + private readonly IRuntimePostureEvaluator? postureEvaluator; + + public ContainerRuntimePoller(ILogger logger, IRuntimePostureEvaluator? postureEvaluator = null) + { + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + this.postureEvaluator = postureEvaluator; + } + + public async Task> PollAsync( + ContainerStateTracker tracker, + ICriRuntimeClient client, + ContainerRuntimeEndpointOptions endpoint, + CriRuntimeIdentity identity, + string tenant, + string nodeName, + TimeProvider timeProvider, + IRuntimeProcessCollector? processCollector, + CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(tracker); + ArgumentNullException.ThrowIfNull(client); + ArgumentNullException.ThrowIfNull(endpoint); + ArgumentNullException.ThrowIfNull(identity); + ArgumentNullException.ThrowIfNull(timeProvider); + + var pollTimestamp = timeProvider.GetUtcNow(); + tracker.BeginCycle(); + + var runningContainers = await client.ListContainersAsync(ContainerState.ContainerRunning, cancellationToken).ConfigureAwait(false); + var generated = new List(); + + if (runningContainers.Count > 0) + { + foreach (var container in runningContainers) + { + var enriched = container; + var status = await client.GetContainerStatusAsync(container.Id, cancellationToken).ConfigureAwait(false); + if (status is not null) + { + enriched = status; + } + + var lifecycleEvent = tracker.MarkRunning(enriched, pollTimestamp); + if (lifecycleEvent is null) + { + continue; + } + + RuntimeProcessCapture? capture = null; + if (processCollector is not null && lifecycleEvent.Kind == ContainerLifecycleEventKind.Start) + { + capture = await processCollector.CollectAsync(enriched, cancellationToken).ConfigureAwait(false); + } + + RuntimePostureEvaluationResult? 
posture = null; + if (this.postureEvaluator is not null) + { + posture = await this.postureEvaluator.EvaluateAsync(enriched, cancellationToken).ConfigureAwait(false); + } + + generated.Add(RuntimeEventFactory.Create( + lifecycleEvent, + endpoint, + identity, + tenant, + nodeName, + capture, + posture?.Posture, + posture?.Evidence)); + } + } + + var stopEvents = await tracker.CompleteCycleAsync( + id => client.GetContainerStatusAsync(id, cancellationToken), + pollTimestamp, + cancellationToken).ConfigureAwait(false); + + foreach (var lifecycleEvent in stopEvents) + { + RuntimePostureEvaluationResult? posture = null; + if (this.postureEvaluator is not null) + { + posture = await this.postureEvaluator.EvaluateAsync(lifecycleEvent.Snapshot, cancellationToken).ConfigureAwait(false); + } + + generated.Add(RuntimeEventFactory.Create( + lifecycleEvent, + endpoint, + identity, + tenant, + nodeName, + null, + posture?.Posture, + posture?.Evidence)); + } + + if (generated.Count == 0) + { + return Array.Empty(); + } + + var ordered = generated + .OrderBy(static envelope => envelope.Event.When) + .ThenBy(static envelope => envelope.Event.Workload.ContainerId, StringComparer.Ordinal) + .ToArray(); + + logger.LogDebug("Generated {Count} runtime events for endpoint {EndpointName}.", ordered.Length, endpoint.ResolveName()); + return ordered; + } +} diff --git a/src/StellaOps.Zastava.Observer/Worker/RuntimeEventDispatchService.cs b/src/StellaOps.Zastava.Observer/Worker/RuntimeEventDispatchService.cs new file mode 100644 index 00000000..e681d3ca --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Worker/RuntimeEventDispatchService.cs @@ -0,0 +1,225 @@ +using System.Linq; +using System.Net; +using Microsoft.Extensions.Hosting; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Observer.Backend; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.Runtime; + +namespace StellaOps.Zastava.Observer.Worker; + 
+internal sealed class RuntimeEventDispatchService : BackgroundService +{ + private readonly IRuntimeEventBuffer buffer; + private readonly IRuntimeEventsClient eventsClient; + private readonly IOptionsMonitor observerOptions; + private readonly TimeProvider timeProvider; + private readonly ILogger logger; + + public RuntimeEventDispatchService( + IRuntimeEventBuffer buffer, + IRuntimeEventsClient eventsClient, + IOptionsMonitor observerOptions, + TimeProvider timeProvider, + ILogger logger) + { + this.buffer = buffer ?? throw new ArgumentNullException(nameof(buffer)); + this.eventsClient = eventsClient ?? throw new ArgumentNullException(nameof(eventsClient)); + this.observerOptions = observerOptions ?? throw new ArgumentNullException(nameof(observerOptions)); + this.timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + protected override async Task ExecuteAsync(CancellationToken stoppingToken) + { + var batch = new List(); + var enumerator = buffer.ReadAllAsync(stoppingToken).GetAsyncEnumerator(stoppingToken); + Task? moveNextTask = null; + Task? flushDelayTask = null; + CancellationTokenSource? 
flushDelayCts = null; + + try + { + while (!stoppingToken.IsCancellationRequested) + { + moveNextTask ??= enumerator.MoveNextAsync().AsTask(); + + if (batch.Count > 0 && flushDelayTask is null) + { + StartFlushTimer(ref flushDelayTask, ref flushDelayCts, stoppingToken); + } + + Task completedTask; + if (flushDelayTask is null) + { + completedTask = await Task.WhenAny(moveNextTask).ConfigureAwait(false); + } + else + { + completedTask = await Task.WhenAny(moveNextTask, flushDelayTask).ConfigureAwait(false); + } + + if (completedTask == moveNextTask) + { + if (!await moveNextTask.ConfigureAwait(false)) + { + break; + } + + var item = enumerator.Current; + batch.Add(item); + moveNextTask = null; + + var options = observerOptions.CurrentValue; + var batchSize = Math.Clamp(options.PublishBatchSize, 1, 512); + if (batch.Count >= batchSize) + { + ResetFlushTimer(ref flushDelayTask, ref flushDelayCts); + await FlushAsync(batch, stoppingToken).ConfigureAwait(false); + } + } + else + { + // flush timer triggered + ResetFlushTimer(ref flushDelayTask, ref flushDelayCts); + if (batch.Count > 0) + { + await FlushAsync(batch, stoppingToken).ConfigureAwait(false); + } + } + } + } + finally + { + ResetFlushTimer(ref flushDelayTask, ref flushDelayCts); + + if (batch.Count > 0 && !stoppingToken.IsCancellationRequested) + { + await FlushAsync(batch, stoppingToken).ConfigureAwait(false); + } + + if (moveNextTask is not null) + { + try { await moveNextTask.ConfigureAwait(false); } + catch { /* ignored */ } + } + + await enumerator.DisposeAsync().ConfigureAwait(false); + } + } + + private async Task FlushAsync(List batch, CancellationToken cancellationToken) + { + if (batch.Count == 0) + { + return; + } + + var request = new RuntimeEventsIngestRequest + { + BatchId = $"obs-{timeProvider.GetUtcNow():yyyyMMddTHHmmssfff}-{Guid.NewGuid():N}", + Events = batch.Select(item => item.Envelope).ToArray() + }; + + try + { + var result = await eventsClient.PublishAsync(request, 
cancellationToken).ConfigureAwait(false); + if (result.Success) + { + foreach (var item in batch) + { + await item.CompleteAsync().ConfigureAwait(false); + } + + logger.LogInformation("Runtime events batch published (batchId={BatchId}, accepted={Accepted}, duplicates={Duplicates}).", + request.BatchId, + result.Accepted, + result.Duplicates); + } + else if (result.RateLimited) + { + await RequeueBatchAsync(batch, cancellationToken).ConfigureAwait(false); + await DelayAsync(result.RetryAfter, cancellationToken).ConfigureAwait(false); + } + } + catch (RuntimeEventsException ex) when (!cancellationToken.IsCancellationRequested) + { + logger.LogWarning(ex, "Runtime events publish failed (status={StatusCode}); batch will be retried.", (int)ex.StatusCode); + await RequeueBatchAsync(batch, cancellationToken).ConfigureAwait(false); + + var backoff = ex.StatusCode == HttpStatusCode.ServiceUnavailable + ? TimeSpan.FromSeconds(5) + : TimeSpan.FromSeconds(2); + + await DelayAsync(backoff, cancellationToken).ConfigureAwait(false); + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + logger.LogWarning(ex, "Runtime events publish encountered an unexpected error; batch will be retried."); + await RequeueBatchAsync(batch, cancellationToken).ConfigureAwait(false); + await DelayAsync(TimeSpan.FromSeconds(5), cancellationToken).ConfigureAwait(false); + } + finally + { + batch.Clear(); + } + } + + private async Task RequeueBatchAsync(IEnumerable batch, CancellationToken cancellationToken) + { + foreach (var item in batch) + { + try + { + await item.RequeueAsync(cancellationToken).ConfigureAwait(false); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + throw; + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to requeue runtime event {EventId}; dropping.", item.Envelope.Event.EventId); + await item.CompleteAsync().ConfigureAwait(false); + } + } + } + + private async Task DelayAsync(TimeSpan delay, 
CancellationToken cancellationToken) + { + if (delay <= TimeSpan.Zero) + { + return; + } + + try + { + await Task.Delay(delay, cancellationToken).ConfigureAwait(false); + } + catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) + { + } + } + + private void StartFlushTimer(ref Task? flushTask, ref CancellationTokenSource? cts, CancellationToken stoppingToken) + { + var options = observerOptions.CurrentValue; + var flushIntervalSeconds = Math.Clamp(options.PublishFlushIntervalSeconds, 0.1, 30); + var flushInterval = TimeSpan.FromSeconds(flushIntervalSeconds); + + cts = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken); + flushTask = Task.Delay(flushInterval, cts.Token); + } + + private void ResetFlushTimer(ref Task? flushTask, ref CancellationTokenSource? cts) + { + if (cts is not null) + { + try { cts.Cancel(); } catch { /* ignore */ } + cts.Dispose(); + cts = null; + } + flushTask = null; + } +} diff --git a/src/StellaOps.Zastava.Observer/Worker/RuntimeEventFactory.cs b/src/StellaOps.Zastava.Observer/Worker/RuntimeEventFactory.cs new file mode 100644 index 00000000..c9f052fb --- /dev/null +++ b/src/StellaOps.Zastava.Observer/Worker/RuntimeEventFactory.cs @@ -0,0 +1,148 @@ +using System.Collections.Generic; +using System.Security.Cryptography; +using System.Text; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Observer.Configuration; +using StellaOps.Zastava.Observer.ContainerRuntime; +using StellaOps.Zastava.Observer.ContainerRuntime.Cri; +using StellaOps.Zastava.Observer.Runtime; + +namespace StellaOps.Zastava.Observer.Worker; + +internal static class RuntimeEventFactory +{ + public static RuntimeEventEnvelope Create( + ContainerLifecycleEvent lifecycleEvent, + ContainerRuntimeEndpointOptions endpoint, + CriRuntimeIdentity identity, + string tenant, + string nodeName, + RuntimeProcessCapture? capture = null, + RuntimePosture? posture = null, + IReadOnlyList? 
additionalEvidence = null) + { + ArgumentNullException.ThrowIfNull(lifecycleEvent); + ArgumentNullException.ThrowIfNull(endpoint); + ArgumentNullException.ThrowIfNull(identity); + ArgumentNullException.ThrowIfNull(tenant); + ArgumentNullException.ThrowIfNull(nodeName); + + var snapshot = lifecycleEvent.Snapshot; + var workloadLabels = snapshot.Labels ?? new Dictionary(StringComparer.Ordinal); + var annotations = snapshot.Annotations is null + ? new Dictionary(StringComparer.Ordinal) + : new Dictionary(snapshot.Annotations, StringComparer.Ordinal); + + var platform = ResolvePlatform(workloadLabels, endpoint); + var runtimeEvent = new RuntimeEvent + { + EventId = ComputeEventId(nodeName, lifecycleEvent), + When = lifecycleEvent.Timestamp, + Kind = lifecycleEvent.Kind == ContainerLifecycleEventKind.Start + ? RuntimeEventKind.ContainerStart + : RuntimeEventKind.ContainerStop, + Tenant = tenant, + Node = nodeName, + Runtime = new RuntimeEngine + { + Engine = endpoint.Engine.ToEngineString(), + Version = identity.RuntimeVersion + }, + Workload = new RuntimeWorkload + { + Platform = platform, + Namespace = TryGet(workloadLabels, CriLabelKeys.PodNamespace), + Pod = TryGet(workloadLabels, CriLabelKeys.PodName), + Container = TryGet(workloadLabels, CriLabelKeys.ContainerName) ?? snapshot.Name, + ContainerId = $"{endpoint.Engine.ToEngineString()}://{snapshot.Id}", + ImageRef = ResolveImageRef(snapshot), + Owner = null + }, + Process = capture?.Process, + LoadedLibraries = capture?.Libraries ?? Array.Empty(), + Posture = posture, + Evidence = MergeEvidence(capture?.Evidence, additionalEvidence), + Annotations = annotations.Count == 0 ? 
null : new SortedDictionary(annotations, StringComparer.Ordinal) + }; + + return RuntimeEventEnvelope.Create(runtimeEvent, ZastavaContractVersions.RuntimeEvent); + } + + private static string ResolvePlatform(IReadOnlyDictionary labels, ContainerRuntimeEndpointOptions endpoint) + { + if (labels.ContainsKey(CriLabelKeys.PodName)) + { + return "kubernetes"; + } + + return endpoint.Engine.ToEngineString(); + } + + private static IReadOnlyList MergeEvidence( + IReadOnlyList? primary, + IReadOnlyList? secondary) + { + if ((primary is null || primary.Count == 0) && (secondary is null || secondary.Count == 0)) + { + return Array.Empty(); + } + + if (secondary is null || secondary.Count == 0) + { + return primary ?? Array.Empty(); + } + + if (primary is null || primary.Count == 0) + { + return secondary; + } + + var merged = new List(primary.Count + secondary.Count); + merged.AddRange(primary); + merged.AddRange(secondary); + return merged; + } + + private static string? ResolveImageRef(CriContainerInfo snapshot) + { + if (!string.IsNullOrWhiteSpace(snapshot.ImageRef)) + { + return snapshot.ImageRef; + } + + return snapshot.Image; + } + + private static string? 
TryGet(IReadOnlyDictionary dictionary, string key) + { + if (dictionary.TryGetValue(key, out var value) && !string.IsNullOrWhiteSpace(value)) + { + return value; + } + + return null; + } + + private static string ComputeEventId(string nodeName, ContainerLifecycleEvent lifecycleEvent) + { + var builder = new StringBuilder() + .Append(nodeName) + .Append('|') + .Append(lifecycleEvent.Snapshot.Id) + .Append('|') + .Append(lifecycleEvent.Timestamp.ToUniversalTime().Ticks) + .Append('|') + .Append((int)lifecycleEvent.Kind); + + var bytes = Encoding.UTF8.GetBytes(builder.ToString()); + Span hash = stackalloc byte[16]; + if (!MD5.TryHashData(bytes, hash, out _)) + { + using var md5 = MD5.Create(); + hash = md5.ComputeHash(bytes).AsSpan(0, 16); + } + + var guid = new Guid(hash); + return guid.ToString("N"); + } +} diff --git a/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionResponseBuilderTests.cs b/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionResponseBuilderTests.cs new file mode 100644 index 00000000..3c2edc0f --- /dev/null +++ b/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionResponseBuilderTests.cs @@ -0,0 +1,131 @@ +using System.Linq; +using System.Text.Json; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Webhook.Backend; +using StellaOps.Zastava.Webhook.Admission; +using Xunit; + +namespace StellaOps.Zastava.Webhook.Tests.Admission; + +public sealed class AdmissionResponseBuilderTests +{ + [Fact] + public void Build_AllowsWhenAllDecisionsPass() + { + using var document = JsonDocument.Parse(""" +{ + "metadata": { "namespace": "payments" }, + "spec": { + "containers": [ { "name": "api", "image": "ghcr.io/example/api:1.0" } ] + } +} +"""); + + var pod = document.RootElement; + var spec = pod.GetProperty("spec"); + + var context = new AdmissionRequestContext( + ApiVersion: "admission.k8s.io/v1", + Kind: "AdmissionReview", + Uid: "abc", + Namespace: "payments", + Labels: new Dictionary(), + Containers: new[] { new 
AdmissionContainerReference("api", "ghcr.io/example/api:1.0") }, + PodObject: pod, + PodSpec: spec); + + var evaluation = new RuntimeAdmissionEvaluation + { + Decisions = new[] + { + new RuntimeAdmissionDecision + { + OriginalImage = "ghcr.io/example/api:1.0", + ResolvedDigest = "ghcr.io/example/api@sha256:deadbeef", + Verdict = PolicyVerdict.Pass, + Allowed = true, + Policy = new RuntimePolicyImageResult + { + PolicyVerdict = PolicyVerdict.Pass, + HasSbom = true, + Signed = true + }, + Reasons = Array.Empty(), + FromCache = false, + ResolutionFailed = false + } + }, + BackendFailed = false, + FailOpenApplied = false, + FailureReason = null, + TtlSeconds = 300 + }; + + var builder = new AdmissionResponseBuilder(); + var (envelope, response) = builder.Build(context, evaluation); + + Assert.Equal("admission.k8s.io/v1", response.ApiVersion); + Assert.True(response.Response.Allowed); + Assert.Null(response.Response.Status); + Assert.NotNull(response.Response.AuditAnnotations); + Assert.True(envelope.Decision.Images.First().HasSbomReferrers); + Assert.StartsWith("sha256-", envelope.Decision.PodSpecDigest, StringComparison.Ordinal); + } + + [Fact] + public void Build_DeniedIncludesStatusAndWarnings() + { + using var document = JsonDocument.Parse(""" +{ + "metadata": { "namespace": "ops" }, + "spec": { + "containers": [ { "name": "app", "image": "ghcr.io/example/app:latest" } ] + } +} +"""); + + var pod = document.RootElement; + var spec = pod.GetProperty("spec"); + + var context = new AdmissionRequestContext( + "admission.k8s.io/v1", + "AdmissionReview", + "uid-123", + "ops", + new Dictionary(), + new[] { new AdmissionContainerReference("app", "ghcr.io/example/app:latest") }, + pod, + spec); + + var evaluation = new RuntimeAdmissionEvaluation + { + Decisions = new[] + { + new RuntimeAdmissionDecision + { + OriginalImage = "ghcr.io/example/app:latest", + ResolvedDigest = null, + Verdict = PolicyVerdict.Fail, + Allowed = false, + Policy = null, + Reasons = new[] { 
"policy.fail" }, + FromCache = false, + ResolutionFailed = true + } + }, + BackendFailed = true, + FailOpenApplied = false, + FailureReason = "backend.unavailable", + TtlSeconds = 60 + }; + + var builder = new AdmissionResponseBuilder(); + var (_, response) = builder.Build(context, evaluation); + + Assert.False(response.Response.Allowed); + Assert.NotNull(response.Response.Status); + Assert.Equal(403, response.Response.Status!.Code); + Assert.NotNull(response.Response.AuditAnnotations); + Assert.Contains("zastava.stellaops/admission", response.Response.AuditAnnotations!.Keys); + } +} diff --git a/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionReviewParserTests.cs b/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionReviewParserTests.cs new file mode 100644 index 00000000..e518d6b8 --- /dev/null +++ b/src/StellaOps.Zastava.Webhook.Tests/Admission/AdmissionReviewParserTests.cs @@ -0,0 +1,102 @@ +using System.Text.Json; +using StellaOps.Zastava.Webhook.Admission; +using Xunit; + +namespace StellaOps.Zastava.Webhook.Tests.Admission; + +public sealed class AdmissionReviewParserTests +{ + private static readonly JsonSerializerOptions SerializerOptions = new(JsonSerializerDefaults.Web); + + [Fact] + public void Parse_ValidRequestExtractsContainers() + { + var dto = Deserialize(""" +{ + "apiVersion": "admission.k8s.io/v1", + "kind": "AdmissionReview", + "request": { + "uid": "abc-123", + "object": { + "metadata": { + "namespace": "payments", + "labels": { "app": "demo" } + }, + "spec": { + "containers": [ + { "name": "api", "image": "ghcr.io/example/api:1.2.3" } + ], + "initContainers": [ + { "name": "init", "image": "ghcr.io/example/init:1.0" } + ] + } + } + } +} +"""); + + var parser = new AdmissionReviewParser(); + var context = parser.Parse(dto); + + Assert.Equal("admission.k8s.io/v1", context.ApiVersion); + Assert.Equal("AdmissionReview", context.Kind); + Assert.Equal("abc-123", context.Uid); + Assert.Equal("payments", context.Namespace); + 
Assert.Equal("demo", context.Labels["app"]); + Assert.Equal(2, context.Containers.Count); + Assert.Contains(context.Containers, c => c.Name == "api" && c.Image == "ghcr.io/example/api:1.2.3"); + Assert.Contains(context.Containers, c => c.Name == "init" && c.Image == "ghcr.io/example/init:1.0"); + } + + [Fact] + public void Parse_UsesRequestNamespaceWhenAvailable() + { + var dto = Deserialize(""" +{ + "apiVersion": "admission.k8s.io/v1", + "kind": "AdmissionReview", + "request": { + "uid": "uid-456", + "namespace": "critical", + "object": { + "metadata": { + "labels": { } + }, + "spec": { + "containers": [ { "name": "app", "image": "ghcr.io/example/app:latest" } ] + } + } + } +} +"""); + + var parser = new AdmissionReviewParser(); + var context = parser.Parse(dto); + Assert.Equal("critical", context.Namespace); + } + + [Fact] + public void Parse_ThrowsWhenNoContainers() + { + var dto = Deserialize(""" +{ + "apiVersion": "admission.k8s.io/v1", + "kind": "AdmissionReview", + "request": { + "uid": "uid-789", + "object": { + "metadata": { "namespace": "ops" }, + "spec": { } + } + } +} +"""); + + var parser = new AdmissionReviewParser(); + var ex = Assert.Throws(() => parser.Parse(dto)); + Assert.Equal("admission.review.containers", ex.Code); + } + + private static AdmissionReviewRequestDto Deserialize(string json) + => JsonSerializer.Deserialize(json, SerializerOptions)!; +} diff --git a/src/StellaOps.Zastava.Webhook.Tests/Admission/RuntimeAdmissionPolicyServiceTests.cs b/src/StellaOps.Zastava.Webhook.Tests/Admission/RuntimeAdmissionPolicyServiceTests.cs new file mode 100644 index 00000000..f64209df --- /dev/null +++ b/src/StellaOps.Zastava.Webhook.Tests/Admission/RuntimeAdmissionPolicyServiceTests.cs @@ -0,0 +1,266 @@ +using System; +using System.Collections.Generic; +using System.Diagnostics.Metrics; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Configuration; +using 
StellaOps.Zastava.Core.Diagnostics; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Webhook.Admission; +using StellaOps.Zastava.Webhook.Backend; +using StellaOps.Zastava.Webhook.Configuration; +using Xunit; + +namespace StellaOps.Zastava.Webhook.Tests.Admission; + +public sealed class RuntimeAdmissionPolicyServiceTests +{ + private const string SampleDigest = "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"; + + [Fact] + public async Task EvaluateAsync_UsesCacheOnSubsequentCalls() + { + var timeProvider = new TestTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var policyClient = new StubRuntimePolicyClient(new RuntimePolicyResponse + { + TtlSeconds = 600, + Results = new Dictionary + { + [SampleDigest] = new RuntimePolicyImageResult + { + PolicyVerdict = PolicyVerdict.Pass, + Signed = true, + HasSbom = true, + Reasons = Array.Empty() + } + } + }); + + var runtimeMetrics = new StubRuntimeMetrics(); + var optionsMonitor = new StaticOptionsMonitor(new ZastavaWebhookOptions()); + var cache = new RuntimePolicyCache(Options.Create(optionsMonitor.CurrentValue), timeProvider, NullLogger.Instance); + var resolver = new ImageDigestResolver(); + + var service = new RuntimeAdmissionPolicyService( + policyClient, + resolver, + cache, + optionsMonitor, + runtimeMetrics, + timeProvider, + NullLogger.Instance); + + var request = new RuntimeAdmissionRequest( + Namespace: "payments", + Labels: new Dictionary(), + Images: new[] { $"ghcr.io/example/api@{SampleDigest}" }); + + var first = await service.EvaluateAsync(request, CancellationToken.None); + Assert.Single(first.Decisions); + Assert.False(first.BackendFailed); + Assert.Equal(600, first.TtlSeconds); + Assert.Equal(1, policyClient.CallCount); + + var second = await service.EvaluateAsync(request, CancellationToken.None); + Assert.Single(second.Decisions); + Assert.Equal(1, policyClient.CallCount); // no additional backend call + 
Assert.True(second.Decisions[0].FromCache); + Assert.Equal(300, second.TtlSeconds); + } + + [Fact] + public async Task EvaluateAsync_FailOpenWhenBackendUnavailable() + { + var timeProvider = new TestTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var policyClient = new StubRuntimePolicyClient(new RuntimePolicyException("backend", System.Net.HttpStatusCode.BadGateway)); + var options = new ZastavaWebhookOptions + { + Admission = new ZastavaWebhookAdmissionOptions + { + FailOpenByDefault = false, + FailOpenNamespaces = new HashSet(StringComparer.Ordinal) { "payments" } + } + }; + var optionsMonitor = new StaticOptionsMonitor(options); + var cache = new RuntimePolicyCache(Options.Create(optionsMonitor.CurrentValue), timeProvider, NullLogger.Instance); + var service = new RuntimeAdmissionPolicyService( + policyClient, + new ImageDigestResolver(), + cache, + optionsMonitor, + new StubRuntimeMetrics(), + timeProvider, + NullLogger.Instance); + + var request = new RuntimeAdmissionRequest( + "payments", + new Dictionary(), + new[] { $"ghcr.io/example/api@{SampleDigest}" }); + + var evaluation = await service.EvaluateAsync(request, CancellationToken.None); + Assert.True(evaluation.BackendFailed); + Assert.True(evaluation.FailOpenApplied); + Assert.Equal(300, evaluation.TtlSeconds); + var decision = Assert.Single(evaluation.Decisions); + Assert.True(decision.Allowed); + Assert.Contains("zastava.fail_open.backend_unavailable", decision.Reasons); + } + + [Fact] + public async Task EvaluateAsync_FailClosedWhenNamespaceConfigured() + { + var timeProvider = new TestTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var policyClient = new StubRuntimePolicyClient(new RuntimePolicyException("backend", System.Net.HttpStatusCode.BadGateway)); + var options = new ZastavaWebhookOptions + { + Admission = new ZastavaWebhookAdmissionOptions + { + FailOpenByDefault = true, + FailClosedNamespaces = new HashSet(StringComparer.Ordinal) { 
"critical" } + } + }; + var optionsMonitor = new StaticOptionsMonitor(options); + var cache = new RuntimePolicyCache(Options.Create(optionsMonitor.CurrentValue), timeProvider, NullLogger.Instance); + + var service = new RuntimeAdmissionPolicyService( + policyClient, + new ImageDigestResolver(), + cache, + optionsMonitor, + new StubRuntimeMetrics(), + timeProvider, + NullLogger.Instance); + + var request = new RuntimeAdmissionRequest( + "critical", + new Dictionary(), + new[] { $"ghcr.io/example/api@{SampleDigest}" }); + + var evaluation = await service.EvaluateAsync(request, CancellationToken.None); + Assert.True(evaluation.BackendFailed); + Assert.False(evaluation.FailOpenApplied); + Assert.Equal(300, evaluation.TtlSeconds); + var decision = Assert.Single(evaluation.Decisions); + Assert.False(decision.Allowed); + Assert.Contains("zastava.backend.unavailable", decision.Reasons); + } + + [Fact] + public async Task EvaluateAsync_ResolutionFailureProducesDeny() + { + var timeProvider = new TestTimeProvider(new DateTimeOffset(2025, 10, 24, 12, 0, 0, TimeSpan.Zero)); + var policyClient = new StubRuntimePolicyClient(new RuntimePolicyResponse { TtlSeconds = 300 }); + var optionsMonitor = new StaticOptionsMonitor(new ZastavaWebhookOptions()); + var cache = new RuntimePolicyCache(Options.Create(optionsMonitor.CurrentValue), timeProvider, NullLogger.Instance); + + var service = new RuntimeAdmissionPolicyService( + policyClient, + new ImageDigestResolver(), + cache, + optionsMonitor, + new StubRuntimeMetrics(), + timeProvider, + NullLogger.Instance); + + var request = new RuntimeAdmissionRequest( + Namespace: "payments", + Labels: new Dictionary(), + Images: new[] { "ghcr.io/example/api:latest" }); + + var evaluation = await service.EvaluateAsync(request, CancellationToken.None); + Assert.Equal(300, evaluation.TtlSeconds); + var decision = Assert.Single(evaluation.Decisions); + Assert.False(decision.Allowed); + Assert.True(decision.ResolutionFailed); + 
Assert.Contains("image.reference.tag_unresolved", decision.Reasons); + } + + private sealed class StubRuntimePolicyClient : IRuntimePolicyClient + { + private readonly RuntimePolicyResponse? response; + private readonly Exception? exception; + + public StubRuntimePolicyClient(RuntimePolicyResponse response) + { + this.response = response; + } + + public StubRuntimePolicyClient(Exception exception) + { + this.exception = exception; + } + + public int CallCount { get; private set; } + + public Task EvaluateAsync(RuntimePolicyRequest request, CancellationToken cancellationToken) + { + CallCount++; + if (exception is not null) + { + throw exception; + } + + return Task.FromResult(response ?? new RuntimePolicyResponse()); + } + } + + private sealed class StubRuntimeMetrics : IZastavaRuntimeMetrics + { + public StubRuntimeMetrics() + { + Meter = new Meter("Test.Zastava.Webhook"); + RuntimeEvents = Meter.CreateCounter("test.runtime.events"); + AdmissionDecisions = Meter.CreateCounter("test.admission.decisions"); + BackendLatencyMs = Meter.CreateHistogram("test.backend.latency"); + DefaultTags = Array.Empty>(); + } + + public Meter Meter { get; } + + public Counter RuntimeEvents { get; } + + public Counter AdmissionDecisions { get; } + + public Histogram BackendLatencyMs { get; } + + public IReadOnlyList> DefaultTags { get; } + + public void Dispose() => Meter.Dispose(); + } + + private sealed class StaticOptionsMonitor : IOptionsMonitor + { + public StaticOptionsMonitor(T currentValue) + { + CurrentValue = currentValue; + } + + public T CurrentValue { get; } + + public T Get(string? 
name) => CurrentValue; + + public IDisposable OnChange(Action listener) => NullDisposable.Instance; + + private sealed class NullDisposable : IDisposable + { + public static readonly NullDisposable Instance = new(); + public void Dispose() + { + } + } + } + + private sealed class TestTimeProvider : TimeProvider + { + private DateTimeOffset now; + + public TestTimeProvider(DateTimeOffset initial) + { + now = initial; + } + + public override DateTimeOffset GetUtcNow() => now; + + public void Advance(TimeSpan delta) => now = now.Add(delta); + } +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/AdmissionEndpoint.cs b/src/StellaOps.Zastava.Webhook/Admission/AdmissionEndpoint.cs new file mode 100644 index 00000000..9fb901de --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/AdmissionEndpoint.cs @@ -0,0 +1,97 @@ +using System.Globalization; +using System.Text.Json; +using Microsoft.Extensions.Logging; +using StellaOps.Zastava.Core.Diagnostics; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal static class AdmissionEndpoint +{ + private static readonly JsonSerializerOptions SerializerOptions = new(JsonSerializerDefaults.Web); + + public static async Task HandleAsync( + HttpContext httpContext, + AdmissionReviewParser parser, + AdmissionResponseBuilder responseBuilder, + IRuntimeAdmissionPolicyService policyService, + IZastavaLogScopeBuilder logScopeBuilder, + ILogger logger, + CancellationToken cancellationToken) + { + AdmissionReviewRequestDto? 
dto; + try + { + dto = await httpContext.Request.ReadFromJsonAsync(SerializerOptions, cancellationToken).ConfigureAwait(false); + } + catch (JsonException ex) + { + logger.LogWarning(ex, "Failed to deserialize AdmissionReview payload."); + return Results.Problem( + statusCode: StatusCodes.Status400BadRequest, + title: "Invalid AdmissionReview", + detail: "Request body was not a valid AdmissionReview document.", + type: "https://stellaops.org/problems/admission.review.invalid-json"); + } + + AdmissionRequestContext context; + try + { + context = parser.Parse(dto!); + } + catch (AdmissionReviewParseException ex) + { + logger.LogWarning("AdmissionReview parse failure ({Code}): {Message}", ex.Code, ex.Message); + return Results.Problem( + statusCode: StatusCodes.Status400BadRequest, + title: "Invalid AdmissionReview", + detail: ex.Message, + type: $"https://stellaops.org/problems/{ex.Code}"); + } + + using var scope = logger.BeginScope(logScopeBuilder.BuildScope( + correlationId: context.Uid, + node: null, + workload: context.Namespace, + eventId: context.Uid, + additional: new Dictionary + { + ["namespace"] = context.Namespace, + ["containerCount"] = context.Containers.Count.ToString(CultureInfo.InvariantCulture) + })); + + var request = new RuntimeAdmissionRequest( + context.Namespace, + context.Labels, + context.Containers.Select(static c => c.Image).ToArray()); + + RuntimeAdmissionEvaluation evaluation; + try + { + evaluation = await policyService.EvaluateAsync(request, cancellationToken).ConfigureAwait(false); + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + logger.LogError(ex, "Admission evaluation failed unexpectedly."); + return Results.Problem( + statusCode: StatusCodes.Status500InternalServerError, + title: "Admission evaluation failed", + detail: "An unexpected error occurred while evaluating admission policy.", + type: "https://stellaops.org/problems/admission.evaluation.failed"); + } + + var (envelope, response) = 
responseBuilder.Build(context, evaluation); + var allowed = evaluation.Decisions.All(static d => d.Allowed); + + logger.LogInformation("Admission decision computed (allowed={Allowed}, containers={Count}, failOpen={FailOpen}).", + allowed, + context.Containers.Count, + evaluation.FailOpenApplied); + + httpContext.Response.ContentType = "application/json"; + return Results.Json(response, SerializerOptions); + } +} + +internal sealed class AdmissionEndpointMarker +{ +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/AdmissionRequestContext.cs b/src/StellaOps.Zastava.Webhook/Admission/AdmissionRequestContext.cs new file mode 100644 index 00000000..1e47975c --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/AdmissionRequestContext.cs @@ -0,0 +1,15 @@ +using System.Text.Json; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal sealed record AdmissionRequestContext( + string ApiVersion, + string Kind, + string Uid, + string Namespace, + IReadOnlyDictionary Labels, + IReadOnlyList Containers, + JsonElement PodObject, + JsonElement PodSpec); + +internal sealed record AdmissionContainerReference(string Name, string Image); diff --git a/src/StellaOps.Zastava.Webhook/Admission/AdmissionResponseBuilder.cs b/src/StellaOps.Zastava.Webhook/Admission/AdmissionResponseBuilder.cs new file mode 100644 index 00000000..add7744b --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/AdmissionResponseBuilder.cs @@ -0,0 +1,219 @@ +using System.Buffers; +using System.Linq; +using System.Text.Encodings.Web; +using System.Text.Json; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Core.Hashing; +using StellaOps.Zastava.Core.Serialization; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal sealed class AdmissionResponseBuilder +{ + public (AdmissionDecisionEnvelope Envelope, AdmissionReviewResponseDto Response) Build( + AdmissionRequestContext context, + RuntimeAdmissionEvaluation evaluation) + { + var decision = BuildDecision(context, 
evaluation); + var envelope = AdmissionDecisionEnvelope.Create(decision, ZastavaContractVersions.AdmissionDecision); + var auditAnnotations = CreateAuditAnnotations(envelope, evaluation); + + var warnings = BuildWarnings(evaluation); + var allowed = evaluation.Decisions.All(static d => d.Allowed); + var status = allowed + ? null + : new AdmissionReviewStatus + { + Code = 403, + Message = BuildFailureMessage(evaluation) + }; + + var response = new AdmissionReviewResponseDto + { + ApiVersion = context.ApiVersion, + Kind = context.Kind, + Response = new AdmissionReviewResponsePayload + { + Uid = context.Uid, + Allowed = allowed, + Status = status, + Warnings = warnings, + AuditAnnotations = auditAnnotations + } + }; + + return (envelope, response); + } + + private static AdmissionDecision BuildDecision(AdmissionRequestContext context, RuntimeAdmissionEvaluation evaluation) + { + var images = new List(evaluation.Decisions.Count); + for (var i = 0; i < evaluation.Decisions.Count; i++) + { + var decision = evaluation.Decisions[i]; + var container = context.Containers[Math.Min(i, context.Containers.Count - 1)]; + + var metadata = new Dictionary(StringComparer.Ordinal) + { + ["image"] = decision.OriginalImage + }; + + if (!string.Equals(container.Image, container.Name, StringComparison.Ordinal)) + { + metadata["container"] = container.Name; + } + + if (decision.FromCache) + { + metadata["cache"] = "hit"; + } + + var resolved = decision.ResolvedDigest ?? decision.OriginalImage; + + images.Add(new AdmissionImageVerdict + { + Name = container.Name, + Resolved = resolved, + Signed = decision.Policy?.Signed ?? false, + HasSbomReferrers = decision.Policy?.HasSbom ?? 
false, + PolicyVerdict = decision.Verdict, + Reasons = decision.Reasons, + Rekor = decision.Policy?.Rekor, + Metadata = metadata + }); + } + + return new AdmissionDecision + { + AdmissionId = context.Uid, + Namespace = context.Namespace, + PodSpecDigest = ComputePodSpecDigest(context.PodSpec), + Images = images, + Decision = evaluation.Decisions.All(static d => d.Allowed) + ? AdmissionDecisionOutcome.Allow + : AdmissionDecisionOutcome.Deny, + TtlSeconds = Math.Max(0, evaluation.TtlSeconds), + Annotations = BuildAnnotations(evaluation) + }; + } + + private static IReadOnlyDictionary? BuildAnnotations(RuntimeAdmissionEvaluation evaluation) + { + if (!evaluation.BackendFailed && !evaluation.FailOpenApplied && evaluation.FailureReason is null) + { + return null; + } + + var annotations = new Dictionary(StringComparer.Ordinal); + if (evaluation.BackendFailed) + { + annotations["zastava.backend.failed"] = "true"; + } + + if (evaluation.FailOpenApplied) + { + annotations["zastava.failOpen"] = "true"; + } + + if (!string.IsNullOrWhiteSpace(evaluation.FailureReason)) + { + annotations["zastava.failureReason"] = evaluation.FailureReason!; + } + + return annotations; + } + + private static IReadOnlyDictionary CreateAuditAnnotations(AdmissionDecisionEnvelope envelope, RuntimeAdmissionEvaluation evaluation) + { + var annotations = new Dictionary(StringComparer.Ordinal) + { + ["zastava.stellaops/admission"] = ZastavaCanonicalJsonSerializer.Serialize(envelope) + }; + + if (evaluation.FailOpenApplied) + { + annotations["zastava.stellaops/failOpen"] = "true"; + } + + return annotations; + } + + private static IReadOnlyList? 
BuildWarnings(RuntimeAdmissionEvaluation evaluation) + { + var warnings = new List(); + if (evaluation.FailOpenApplied) + { + warnings.Add("zastava.fail_open.applied"); + } + + foreach (var decision in evaluation.Decisions) + { + if (decision.Verdict == PolicyVerdict.Warn) + { + warnings.Add($"policy.warn:{decision.OriginalImage}"); + } + } + + return warnings.Count == 0 ? null : warnings; + } + + private static string BuildFailureMessage(RuntimeAdmissionEvaluation evaluation) + { + if (!string.IsNullOrWhiteSpace(evaluation.FailureReason)) + { + return evaluation.FailureReason!; + } + + var denied = evaluation.Decisions + .Where(static d => !d.Allowed) + .SelectMany(static d => d.Reasons) + .Distinct(StringComparer.Ordinal) + .ToArray(); + + return denied.Length > 0 + ? string.Join(", ", denied) + : "admission.denied"; + } + + private static string ComputePodSpecDigest(JsonElement podSpec) + { + var buffer = new ArrayBufferWriter(); + using (var writer = new Utf8JsonWriter(buffer, new JsonWriterOptions + { + Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping, + Indented = false + })) + { + WriteCanonical(podSpec, writer); + } + + return ZastavaHashing.ComputeMultihash(buffer.WrittenSpan); + } + + private static void WriteCanonical(JsonElement element, Utf8JsonWriter writer) + { + switch (element.ValueKind) + { + case JsonValueKind.Object: + writer.WriteStartObject(); + foreach (var property in element.EnumerateObject().OrderBy(static p => p.Name, StringComparer.Ordinal)) + { + writer.WritePropertyName(property.Name); + WriteCanonical(property.Value, writer); + } + writer.WriteEndObject(); + break; + case JsonValueKind.Array: + writer.WriteStartArray(); + foreach (var item in element.EnumerateArray()) + { + WriteCanonical(item, writer); + } + writer.WriteEndArray(); + break; + default: + element.WriteTo(writer); + break; + } + } +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewModels.cs 
b/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewModels.cs new file mode 100644 index 00000000..78149a58 --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewModels.cs @@ -0,0 +1,88 @@ +using System.Text.Json; +using System.Text.Json.Serialization; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal sealed record AdmissionReviewRequestDto +{ + [JsonPropertyName("apiVersion")] + public string? ApiVersion { get; init; } + + [JsonPropertyName("kind")] + public string? Kind { get; init; } + + [JsonPropertyName("request")] + public AdmissionReviewRequestPayload? Request { get; init; } +} + +internal sealed record AdmissionReviewRequestPayload +{ + [JsonPropertyName("uid")] + public string? Uid { get; init; } + + [JsonPropertyName("kind")] + public AdmissionReviewGroupVersionKind? Kind { get; init; } + + [JsonPropertyName("namespace")] + public string? Namespace { get; init; } + + [JsonPropertyName("name")] + public string? Name { get; init; } + + [JsonPropertyName("object")] + public JsonElement Object { get; init; } + + [JsonPropertyName("dryRun")] + public bool? DryRun { get; init; } +} + +internal sealed record AdmissionReviewGroupVersionKind +{ + [JsonPropertyName("group")] + public string? Group { get; init; } + + [JsonPropertyName("version")] + public string? Version { get; init; } + + [JsonPropertyName("kind")] + public string? 
Kind { get; init; } +} + +internal sealed record AdmissionReviewResponseDto +{ + [JsonPropertyName("apiVersion")] + public string ApiVersion { get; init; } = "admission.k8s.io/v1"; + + [JsonPropertyName("kind")] + public string Kind { get; init; } = "AdmissionReview"; + + [JsonPropertyName("response")] + public required AdmissionReviewResponsePayload Response { get; init; } +} + +internal sealed record AdmissionReviewResponsePayload +{ + [JsonPropertyName("uid")] + public required string Uid { get; init; } + + [JsonPropertyName("allowed")] + public required bool Allowed { get; init; } + + [JsonPropertyName("status")] + public AdmissionReviewStatus? Status { get; init; } + + [JsonPropertyName("warnings")] + public IReadOnlyList<string>? Warnings { get; init; } + + [JsonPropertyName("auditAnnotations")] + public IReadOnlyDictionary<string, string>? AuditAnnotations { get; init; } +} + +internal sealed record AdmissionReviewStatus +{ + [JsonPropertyName("code")] + public int? Code { get; init; } + + [JsonPropertyName("message")] + public string?
Message { get; init; } +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewParser.cs b/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewParser.cs new file mode 100644 index 00000000..c2e687c0 --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/AdmissionReviewParser.cs @@ -0,0 +1,154 @@ +using System.Text.Json; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal sealed class AdmissionReviewParser +{ + public AdmissionRequestContext Parse(AdmissionReviewRequestDto dto) + { + if (dto is null) + { + throw new AdmissionReviewParseException("admission.review.invalid", "AdmissionReview payload was empty."); + } + + if (!string.Equals(dto.Kind, "AdmissionReview", StringComparison.OrdinalIgnoreCase)) + { + throw new AdmissionReviewParseException("admission.review.kind", "AdmissionReview.kind must equal 'AdmissionReview'."); + } + + if (string.IsNullOrWhiteSpace(dto.ApiVersion)) + { + throw new AdmissionReviewParseException("admission.review.apiVersion", "AdmissionReview.apiVersion is required."); + } + + var payload = dto.Request ?? throw new AdmissionReviewParseException("admission.review.request", "AdmissionReview.request is required."); + + if (string.IsNullOrWhiteSpace(payload.Uid)) + { + throw new AdmissionReviewParseException("admission.review.uid", "AdmissionReview.request.uid is required."); + } + + if (payload.Object.ValueKind is not JsonValueKind.Object) + { + throw new AdmissionReviewParseException("admission.review.object", "AdmissionReview.request.object must be a JSON object."); + } + + var podObject = payload.Object; + if (!podObject.TryGetProperty("spec", out var podSpec) || podSpec.ValueKind is not JsonValueKind.Object) + { + throw new AdmissionReviewParseException("admission.review.podSpec", "AdmissionReview.request.object.spec is required."); + } + + var podNamespace = payload.Namespace + ?? TryGetProperty(podObject, "metadata", "namespace") + ?? 
throw new AdmissionReviewParseException("admission.review.namespace", "Namespace could not be determined for the pod."); + + var labels = ReadLabels(podObject); + var containers = ReadContainers(podSpec); + if (containers.Count == 0) + { + throw new AdmissionReviewParseException("admission.review.containers", "No containers were found in the pod spec."); + } + + return new AdmissionRequestContext( + ApiVersion: dto.ApiVersion!, + Kind: dto.Kind!, + Uid: payload.Uid!, + Namespace: podNamespace, + Labels: labels, + Containers: containers, + PodObject: podObject, + PodSpec: podSpec); + } + + private static IReadOnlyDictionary<string, string> ReadLabels(JsonElement podObject) + { + if (!podObject.TryGetProperty("metadata", out var metadata) || metadata.ValueKind is not JsonValueKind.Object) + { + return new Dictionary<string, string>(StringComparer.Ordinal); + } + + if (!metadata.TryGetProperty("labels", out var labelsElement) || labelsElement.ValueKind is not JsonValueKind.Object) + { + return new Dictionary<string, string>(StringComparer.Ordinal); + } + + var labels = new Dictionary<string, string>(StringComparer.Ordinal); + foreach (var property in labelsElement.EnumerateObject()) + { + if (property.Value.ValueKind is JsonValueKind.String) + { + labels[property.Name] = property.Value.GetString() ??
string.Empty; + } + } + + return labels; + } + + private static IReadOnlyList<AdmissionContainerReference> ReadContainers(JsonElement podSpec) + { + var containers = new List<AdmissionContainerReference>(); + CollectContainers(podSpec, "containers", containers); + CollectContainers(podSpec, "initContainers", containers); + CollectContainers(podSpec, "ephemeralContainers", containers); + return containers; + } + + private static void CollectContainers(JsonElement spec, string propertyName, ICollection<AdmissionContainerReference> sink) + { + if (!spec.TryGetProperty(propertyName, out var array) || array.ValueKind is not JsonValueKind.Array) + { + return; + } + + foreach (var element in array.EnumerateArray()) + { + if (element.ValueKind is not JsonValueKind.Object) + { + continue; + } + + var image = TryGetProperty(element, "image"); + if (string.IsNullOrWhiteSpace(image)) + { + continue; + } + + var name = TryGetProperty(element, "name") ?? image; + sink.Add(new AdmissionContainerReference(name, image)); + } + } + + private static string? TryGetProperty(JsonElement element, string propertyName) + { + if (element.ValueKind is not JsonValueKind.Object) + { + return null; + } + + return element.TryGetProperty(propertyName, out var property) && property.ValueKind is JsonValueKind.String + ? property.GetString() + : null; + } + + private static string?
TryGetProperty(JsonElement element, string firstProperty, string nestedProperty) + { + if (!element.TryGetProperty(firstProperty, out var nested) || nested.ValueKind is not JsonValueKind.Object) + { + return null; + } + + return TryGetProperty(nested, nestedProperty); + } +} + +internal sealed class AdmissionReviewParseException : Exception +{ + public AdmissionReviewParseException(string code, string message) + : base(message) + { + Code = code; + } + + public string Code { get; } +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/ImageDigestResolver.cs b/src/StellaOps.Zastava.Webhook/Admission/ImageDigestResolver.cs new file mode 100644 index 00000000..4aacae1c --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/ImageDigestResolver.cs @@ -0,0 +1,52 @@ +using System.Text.RegularExpressions; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal interface IImageDigestResolver +{ + Task ResolveAsync(string imageReference, CancellationToken cancellationToken); +} + +internal sealed class ImageDigestResolver : IImageDigestResolver +{ + private static readonly Regex DigestPattern = new(@"(?[a-z0-9_+.-]+):(?[a-f0-9]{32,})", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase); + + public Task ResolveAsync(string imageReference, CancellationToken cancellationToken) + { + if (string.IsNullOrWhiteSpace(imageReference)) + { + return Task.FromResult(ImageResolutionResult.CreateFailure(imageReference, "image.reference.empty")); + } + + if (imageReference.Contains('@', StringComparison.Ordinal)) + { + var digest = imageReference[(imageReference.IndexOf('@') + 1)..]; + if (DigestPattern.IsMatch(digest)) + { + return Task.FromResult(ImageResolutionResult.CreateSuccess(imageReference, digest)); + } + + return Task.FromResult(ImageResolutionResult.CreateFailure(imageReference, "image.reference.invalid_digest")); + } + + if (DigestPattern.IsMatch(imageReference)) + { + return 
Task.FromResult(ImageResolutionResult.CreateSuccess(imageReference, imageReference)); + } + + return Task.FromResult(ImageResolutionResult.CreateFailure(imageReference, "image.reference.tag_unresolved")); + } +} + +internal sealed record ImageResolutionResult( + string Original, + string? ResolvedDigest, + bool Success, + string? FailureReason) +{ + public static ImageResolutionResult CreateSuccess(string original, string digest) + => new(original, digest, true, null); + + public static ImageResolutionResult CreateFailure(string original, string reason) + => new(original, null, false, reason); +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/RuntimeAdmissionPolicyService.cs b/src/StellaOps.Zastava.Webhook/Admission/RuntimeAdmissionPolicyService.cs new file mode 100644 index 00000000..442ab2e4 --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/RuntimeAdmissionPolicyService.cs @@ -0,0 +1,312 @@ +using System.Collections.Generic; +using System.Linq; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Core.Contracts; +using StellaOps.Zastava.Core.Diagnostics; +using StellaOps.Zastava.Webhook.Backend; +using StellaOps.Zastava.Webhook.Configuration; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal interface IRuntimeAdmissionPolicyService +{ + Task EvaluateAsync(RuntimeAdmissionRequest request, CancellationToken cancellationToken); +} + +internal sealed class RuntimeAdmissionPolicyService : IRuntimeAdmissionPolicyService +{ + private readonly IRuntimePolicyClient policyClient; + private readonly IImageDigestResolver digestResolver; + private readonly RuntimePolicyCache cache; + private readonly IOptionsMonitor options; + private readonly IZastavaRuntimeMetrics runtimeMetrics; + private readonly TimeProvider timeProvider; + private readonly ILogger logger; + + public RuntimeAdmissionPolicyService( + IRuntimePolicyClient policyClient, + IImageDigestResolver digestResolver, + RuntimePolicyCache 
cache, + IOptionsMonitor options, + IZastavaRuntimeMetrics runtimeMetrics, + TimeProvider timeProvider, + ILogger logger) + { + this.policyClient = policyClient ?? throw new ArgumentNullException(nameof(policyClient)); + this.digestResolver = digestResolver ?? throw new ArgumentNullException(nameof(digestResolver)); + this.cache = cache ?? throw new ArgumentNullException(nameof(cache)); + this.options = options ?? throw new ArgumentNullException(nameof(options)); + this.runtimeMetrics = runtimeMetrics ?? throw new ArgumentNullException(nameof(runtimeMetrics)); + this.timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + this.logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + public async Task EvaluateAsync(RuntimeAdmissionRequest request, CancellationToken cancellationToken) + { + ArgumentNullException.ThrowIfNull(request); + if (request.Images.Count == 0) + { + return RuntimeAdmissionEvaluation.Empty(); + } + + var admissionOptions = options.CurrentValue.Admission; + + var resolutionResults = new List(request.Images.Count); + foreach (var image in request.Images) + { + var resolution = await digestResolver.ResolveAsync(image, cancellationToken).ConfigureAwait(false); + resolutionResults.Add(resolution); + } + + var resolved = resolutionResults.Where(static r => r.Success && r.ResolvedDigest is not null) + .GroupBy(r => r.ResolvedDigest!, StringComparer.Ordinal) + .Select(group => new ResolvedDigest(group.Key, group.ToArray())) + .ToArray(); + + var combinedResults = new Dictionary(StringComparer.Ordinal); + var backendMisses = new List(); + var fromCache = new HashSet(StringComparer.Ordinal); + + foreach (var digest in resolved) + { + if (cache.TryGet(digest.Digest, out var cached)) + { + combinedResults[digest.Digest] = cached; + fromCache.Add(digest.Digest); + } + else + { + backendMisses.Add(digest.Digest); + } + } + + RuntimePolicyResponse? 
backendResponse = null; + bool backendFailed = false; + var ttlSeconds = 300; + if (backendMisses.Count > 0) + { + try + { + backendResponse = await policyClient.EvaluateAsync(new RuntimePolicyRequest + { + Namespace = request.Namespace ?? string.Empty, + Labels = request.Labels, + Images = backendMisses + }, cancellationToken).ConfigureAwait(false); + + var now = timeProvider.GetUtcNow(); + var expiry = CalculateExpiry(backendResponse); + ttlSeconds = Math.Max(1, (int)Math.Ceiling((expiry - now).TotalSeconds)); + foreach (var pair in backendResponse.Results) + { + combinedResults[pair.Key] = pair.Value; + cache.Set(pair.Key, pair.Value, expiry); + } + } + catch (Exception ex) when (!cancellationToken.IsCancellationRequested) + { + backendFailed = true; + logger.LogWarning(ex, "Runtime policy backend call failed for namespace {Namespace}.", request.Namespace ?? ""); + } + } + + var failOpenApplied = false; + var decisions = new List(request.Images.Count); + var effectiveTtl = backendResponse?.TtlSeconds is > 0 ? backendResponse.TtlSeconds : ttlSeconds; + + if (backendFailed && backendMisses.Count > 0) + { + failOpenApplied = ShouldFailOpen(admissionOptions, request.Namespace); + foreach (var resolution in resolutionResults) + { + if (resolution.Success && resolution.ResolvedDigest is not null) + { + var allowed = failOpenApplied; + var reasons = failOpenApplied + ? new[] { "zastava.fail_open.backend_unavailable" } + : new[] { "zastava.backend.unavailable" }; + + RecordDecisionMetrics(allowed, true, failOpenApplied, RuntimeEventKind.ContainerStart); + decisions.Add(new RuntimeAdmissionDecision + { + OriginalImage = resolution.Original, + ResolvedDigest = resolution.ResolvedDigest, + Verdict = allowed ? 
PolicyVerdict.Warn : PolicyVerdict.Error, + Allowed = allowed, + Policy = null, + Reasons = reasons, + FromCache = false, + ResolutionFailed = false + }); + } + else + { + decisions.Add(CreateResolutionFailureDecision(resolution)); + } + } + + return new RuntimeAdmissionEvaluation + { + Decisions = decisions, + BackendFailed = true, + FailOpenApplied = failOpenApplied, + FailureReason = failOpenApplied ? null : "backend.unavailable", + TtlSeconds = effectiveTtl + }; + } + + foreach (var resolution in resolutionResults) + { + if (!resolution.Success || resolution.ResolvedDigest is null) + { + var failureDecision = CreateResolutionFailureDecision(resolution); + RecordDecisionMetrics(failureDecision.Allowed, false, false, RuntimeEventKind.ContainerStart); + decisions.Add(failureDecision); + continue; + } + + if (!combinedResults.TryGetValue(resolution.ResolvedDigest, out var policyResult)) + { + var synthetic = new RuntimeAdmissionDecision + { + OriginalImage = resolution.Original, + ResolvedDigest = resolution.ResolvedDigest, + Verdict = PolicyVerdict.Error, + Allowed = false, + Policy = null, + Reasons = new[] { "zastava.policy.result.missing" }, + FromCache = false, + ResolutionFailed = false + }; + RecordDecisionMetrics(false, false, false, RuntimeEventKind.ContainerStart); + decisions.Add(synthetic); + continue; + } + + var allowed = policyResult.PolicyVerdict is PolicyVerdict.Pass or PolicyVerdict.Warn; + var cached = fromCache.Contains(resolution.ResolvedDigest); + var reasons = policyResult.Reasons.Count > 0 ? 
policyResult.Reasons : Array.Empty(); + RecordDecisionMetrics(allowed, cached, false, RuntimeEventKind.ContainerStart); + decisions.Add(new RuntimeAdmissionDecision + { + OriginalImage = resolution.Original, + ResolvedDigest = resolution.ResolvedDigest, + Verdict = policyResult.PolicyVerdict, + Allowed = allowed, + Policy = policyResult, + Reasons = reasons, + FromCache = cached, + ResolutionFailed = false + }); + } + + return new RuntimeAdmissionEvaluation + { + Decisions = decisions, + BackendFailed = backendFailed, + FailOpenApplied = failOpenApplied, + FailureReason = null, + TtlSeconds = effectiveTtl + }; + } + + private static RuntimeAdmissionDecision CreateResolutionFailureDecision(ImageResolutionResult resolution) + => new RuntimeAdmissionDecision + { + OriginalImage = resolution.Original, + ResolvedDigest = null, + Verdict = PolicyVerdict.Fail, + Allowed = false, + Policy = null, + Reasons = new[] { resolution.FailureReason ?? "image.resolution.failed" }, + FromCache = false, + ResolutionFailed = true + }; + + private void RecordDecisionMetrics(bool allowed, bool fromCache, bool failOpen, RuntimeEventKind eventKind) + { + var tags = runtimeMetrics.DefaultTags + .Concat(new[] + { + new KeyValuePair("decision", allowed ? "allow" : "deny"), + new KeyValuePair("source", fromCache ? "cache" : "backend"), + new KeyValuePair("fail_open", failOpen ? "true" : "false"), + new KeyValuePair("event", eventKind.ToString()) + }) + .ToArray(); + + runtimeMetrics.AdmissionDecisions.Add(1, tags); + } + + private bool ShouldFailOpen(ZastavaWebhookAdmissionOptions admission, string? 
@namespace) + { + if (@namespace is null) + { + return admission.FailOpenByDefault; + } + + if (admission.FailClosedNamespaces.Contains(@namespace)) + { + return false; + } + + if (admission.FailOpenNamespaces.Contains(@namespace)) + { + return true; + } + + return admission.FailOpenByDefault; + } + + private DateTimeOffset CalculateExpiry(RuntimePolicyResponse response) + { + var now = timeProvider.GetUtcNow(); + var ttlSeconds = Math.Max(1, response.TtlSeconds); + var intended = now.AddSeconds(ttlSeconds); + if (response.ExpiresAtUtc != default) + { + return response.ExpiresAtUtc < intended ? response.ExpiresAtUtc : intended; + } + + return intended; + } + + private sealed record ResolvedDigest(string Digest, IReadOnlyList Entries); +} + +internal sealed record RuntimeAdmissionRequest( + string? Namespace, + IReadOnlyDictionary Labels, + IReadOnlyList Images); + +internal sealed record RuntimeAdmissionDecision +{ + public required string OriginalImage { get; init; } + public string? ResolvedDigest { get; init; } + public PolicyVerdict Verdict { get; init; } + public bool Allowed { get; init; } + public RuntimePolicyImageResult? Policy { get; init; } + public IReadOnlyList Reasons { get; init; } = Array.Empty(); + public bool FromCache { get; init; } + public bool ResolutionFailed { get; init; } +} + +internal sealed record RuntimeAdmissionEvaluation +{ + public required IReadOnlyList Decisions { get; init; } + public bool BackendFailed { get; init; } + public bool FailOpenApplied { get; init; } + public string? 
FailureReason { get; init; } + public int TtlSeconds { get; init; } + + public static RuntimeAdmissionEvaluation Empty() + => new() + { + Decisions = Array.Empty(), + BackendFailed = false, + FailOpenApplied = false, + FailureReason = null, + TtlSeconds = 0 + }; +} diff --git a/src/StellaOps.Zastava.Webhook/Admission/RuntimePolicyCache.cs b/src/StellaOps.Zastava.Webhook/Admission/RuntimePolicyCache.cs new file mode 100644 index 00000000..2bc5e2db --- /dev/null +++ b/src/StellaOps.Zastava.Webhook/Admission/RuntimePolicyCache.cs @@ -0,0 +1,83 @@ +using System.Collections.Concurrent; +using System.Text.Json; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; +using StellaOps.Zastava.Webhook.Backend; +using StellaOps.Zastava.Webhook.Configuration; + +namespace StellaOps.Zastava.Webhook.Admission; + +internal sealed class RuntimePolicyCache +{ + private readonly ConcurrentDictionary entries = new(StringComparer.Ordinal); + private readonly ILogger logger; + private readonly TimeProvider timeProvider; + + public RuntimePolicyCache(IOptions options, TimeProvider timeProvider, ILogger logger) + { + this.timeProvider = timeProvider ?? throw new ArgumentNullException(nameof(timeProvider)); + this.logger = logger ?? 
throw new ArgumentNullException(nameof(logger)); + + ArgumentNullException.ThrowIfNull(options); + var admission = options.Value.Admission; + if (!string.IsNullOrWhiteSpace(admission.CacheSeedPath) && File.Exists(admission.CacheSeedPath)) + { + TryLoadSeed(admission.CacheSeedPath!); + } + } + + public bool TryGet(string digest, out RuntimePolicyImageResult result) + { + if (entries.TryGetValue(digest, out var entry)) + { + if (timeProvider.GetUtcNow() <= entry.ExpiresAtUtc) + { + result = entry.Result; + return true; + } + + entries.TryRemove(digest, out _); + } + + result = default!; + return false; + } + + public void Set(string digest, RuntimePolicyImageResult result, DateTimeOffset expiresAtUtc) + { + entries[digest] = new CacheEntry(result, expiresAtUtc); + } + + private void TryLoadSeed(string path) + { + try + { + var payload = File.ReadAllText(path); + var seed = JsonSerializer.Deserialize(payload, new JsonSerializerOptions + { + PropertyNamingPolicy = JsonNamingPolicy.CamelCase + }); + + if (seed?.Results is null || seed.Results.Count == 0) + { + logger.LogDebug("Runtime policy cache seed file {Path} empty or invalid.", path); + return; + } + + var ttlSeconds = Math.Max(1, seed.TtlSeconds); + var expires = timeProvider.GetUtcNow().AddSeconds(ttlSeconds); + foreach (var pair in seed.Results) + { + Set(pair.Key, pair.Value, expires); + } + + logger.LogInformation("Loaded {Count} runtime policy cache seed entries from {Path}.", seed.Results.Count, path); + } + catch (Exception ex) + { + logger.LogWarning(ex, "Failed to load runtime policy cache seed from {Path}.", path); + } + } + + private sealed record CacheEntry(RuntimePolicyImageResult Result, DateTimeOffset ExpiresAtUtc); +} diff --git a/src/StellaOps.Zastava.Webhook/Backend/RuntimePolicyResponse.cs b/src/StellaOps.Zastava.Webhook/Backend/RuntimePolicyResponse.cs index dd622ce0..c5cfe5b2 100644 --- a/src/StellaOps.Zastava.Webhook/Backend/RuntimePolicyResponse.cs +++ 
b/src/StellaOps.Zastava.Webhook/Backend/RuntimePolicyResponse.cs @@ -10,6 +10,12 @@ public sealed record RuntimePolicyResponse [JsonPropertyName("ttlSeconds")] public int TtlSeconds { get; init; } + [JsonPropertyName("expiresAtUtc")] + public DateTimeOffset ExpiresAtUtc { get; init; } + + [JsonPropertyName("policyRevision")] + public string? PolicyRevision { get; init; } + [JsonPropertyName("results")] public IReadOnlyDictionary Results { get; init; } = new Dictionary(); } diff --git a/src/StellaOps.Zastava.Webhook/DependencyInjection/ServiceCollectionExtensions.cs b/src/StellaOps.Zastava.Webhook/DependencyInjection/ServiceCollectionExtensions.cs index 125c3b6c..acfa19e8 100644 --- a/src/StellaOps.Zastava.Webhook/DependencyInjection/ServiceCollectionExtensions.cs +++ b/src/StellaOps.Zastava.Webhook/DependencyInjection/ServiceCollectionExtensions.cs @@ -2,6 +2,7 @@ using System; using Microsoft.Extensions.DependencyInjection.Extensions; using Microsoft.Extensions.Options; using StellaOps.Zastava.Core.Configuration; +using StellaOps.Zastava.Webhook.Admission; using StellaOps.Zastava.Webhook.Authority; using StellaOps.Zastava.Webhook.Backend; using StellaOps.Zastava.Webhook.Certificates; @@ -22,12 +23,19 @@ public static class ServiceCollectionExtensions .ValidateDataAnnotations() .ValidateOnStart(); + services.TryAddSingleton(TimeProvider.System); services.TryAddEnumerable(ServiceDescriptor.Singleton()); services.TryAddEnumerable(ServiceDescriptor.Singleton()); services.TryAddSingleton(); services.TryAddSingleton(); services.TryAddEnumerable(ServiceDescriptor.Singleton, WebhookRuntimeOptionsPostConfigure>()); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.TryAddSingleton(); + services.AddHttpClient((provider, client) => { var backend = provider.GetRequiredService>().Value.Backend; diff --git a/src/StellaOps.Zastava.Webhook/Program.cs b/src/StellaOps.Zastava.Webhook/Program.cs index 
f8c28315..a26c4cf6 100644 --- a/src/StellaOps.Zastava.Webhook/Program.cs +++ b/src/StellaOps.Zastava.Webhook/Program.cs @@ -2,6 +2,7 @@ using System.Security.Authentication; using Microsoft.AspNetCore.Diagnostics.HealthChecks; using Serilog; using Serilog.Events; +using StellaOps.Zastava.Webhook.Admission; using StellaOps.Zastava.Webhook.Authority; using StellaOps.Zastava.Webhook.Certificates; using StellaOps.Zastava.Webhook.Configuration; @@ -59,9 +60,8 @@ app.MapHealthChecks("/healthz/live", new HealthCheckOptions Predicate = _ => false }); -// Placeholder admission endpoint; will be replaced as tasks 12-102/12-103 land. -app.MapPost("/admission", () => Results.StatusCode(StatusCodes.Status501NotImplemented)) - .WithName("AdmissionReview"); +app.MapPost("/admission", AdmissionEndpoint.HandleAsync) + .WithName("AdmissionReview"); app.MapGet("/", () => Results.Ok(new { status = "ok", service = "zastava-webhook" })); diff --git a/src/StellaOps.Zastava.Webhook/TASKS.md b/src/StellaOps.Zastava.Webhook/TASKS.md index 957b0b0e..7bee28cd 100644 --- a/src/StellaOps.Zastava.Webhook/TASKS.md +++ b/src/StellaOps.Zastava.Webhook/TASKS.md @@ -3,8 +3,8 @@ | ID | Status | Owner(s) | Depends on | Description | Exit Criteria | |----|--------|----------|------------|-------------|---------------| | ZASTAVA-WEBHOOK-12-101 | DONE (2025-10-24) | Zastava Webhook Guild | — | Admission controller host with TLS bootstrap and Authority auth. | Webhook host boots with deterministic TLS bootstrap, enforces Authority-issued credentials, e2e smoke proves admission callback lifecycle, structured logs + metrics emit on each decision. | -| ZASTAVA-WEBHOOK-12-102 | DOING | Zastava Webhook Guild | — | Query Scanner `/policy/runtime`, resolve digests, enforce verdicts. | Scanner client resolves image digests + policy verdicts, unit tests cover allow/deny, integration harness rejects/admits workloads per policy with deterministic payloads. 
| -| ZASTAVA-WEBHOOK-12-103 | DOING | Zastava Webhook Guild | — | Caching, fail-open/closed toggles, metrics/logging for admission decisions. | Configurable cache TTL + seeds survive restart, fail-open/closed toggles verified via tests, metrics/logging exported per decision path, docs note operational knobs. | -| ZASTAVA-WEBHOOK-12-104 | TODO | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-102 | Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes. | Admission handler resolves pods to digests, invokes policy client, returns canonical `AdmissionDecisionEnvelope` with deterministic logging and metrics. | +| ZASTAVA-WEBHOOK-12-102 | DONE (2025-10-24) | Zastava Webhook Guild | — | Query Scanner `/policy/runtime`, resolve digests, enforce verdicts. | Scanner client resolves image digests + policy verdicts, unit tests cover allow/deny, integration harness rejects/admits workloads per policy with deterministic payloads. | +| ZASTAVA-WEBHOOK-12-103 | DONE (2025-10-24) | Zastava Webhook Guild | — | Caching, fail-open/closed toggles, metrics/logging for admission decisions. | Configurable cache TTL + seeds survive restart, fail-open/closed toggles verified via tests, metrics/logging exported per decision path, docs note operational knobs. | +| ZASTAVA-WEBHOOK-12-104 | DONE (2025-10-24) | Zastava Webhook Guild | ZASTAVA-WEBHOOK-12-102 | Wire `/admission` endpoint to runtime policy client and emit allow/deny envelopes. | Admission handler resolves pods to digests, invokes policy client, returns canonical `AdmissionDecisionEnvelope` with deterministic logging and metrics. | > Status update · 2025-10-19: Confirmed no prerequisites for ZASTAVA-WEBHOOK-12-101/102/103; tasks moved to DOING for kickoff. Implementation plan covering TLS bootstrap, backend contract, caching/metrics recorded in `IMPLEMENTATION_PLAN.md`. 
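Reviewer note: the fail-open/closed toggles and cache TTL behaviour listed in the exit criteria above can be checked in isolation. The following is a minimal standalone sketch, not the shipped implementation. `FailModeOptions` and `FailModeSketch` are hypothetical names introduced for illustration; the precedence (fail-closed list first, then fail-open list, then the default) and the expiry clamp follow the `ShouldFailOpen` and `CalculateExpiry` helpers in this change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the webhook's admission options (illustrative only).
public sealed record FailModeOptions(
    bool FailOpenByDefault,
    IReadOnlySet<string> FailOpenNamespaces,
    IReadOnlySet<string> FailClosedNamespaces);

public static class FailModeSketch
{
    // A namespace listed as fail-closed always wins, even if it is also
    // listed as fail-open; unlisted or missing namespaces use the default.
    public static bool ShouldFailOpen(FailModeOptions options, string? ns)
    {
        if (ns is null)
        {
            return options.FailOpenByDefault;
        }

        if (options.FailClosedNamespaces.Contains(ns))
        {
            return false;
        }

        if (options.FailOpenNamespaces.Contains(ns))
        {
            return true;
        }

        return options.FailOpenByDefault;
    }

    // Cache expiry is the earlier of (now + TTL) and the backend's absolute
    // expiry, when one is supplied; the TTL is floored at one second.
    public static DateTimeOffset CalculateExpiry(
        DateTimeOffset now, int ttlSeconds, DateTimeOffset expiresAtUtc)
    {
        var intended = now.AddSeconds(Math.Max(1, ttlSeconds));
        return expiresAtUtc != default && expiresAtUtc < intended
            ? expiresAtUtc
            : intended;
    }
}
```

Passing `now` in explicitly, rather than reading a clock inside the helper, keeps the expiry rule deterministic under test; the service in this change gets the same effect by injecting `TimeProvider`.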
diff --git a/src/StellaOps.sln b/src/StellaOps.sln index 4bce6011..b00b85df 100644 --- a/src/StellaOps.sln +++ b/src/StellaOps.sln @@ -337,6 +337,8 @@ Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StellaOps.Scanner.Analyzers EndProject Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StellaOps.Zastava.Observer", "StellaOps.Zastava.Observer\StellaOps.Zastava.Observer.csproj", "{BC38594B-0B84-4657-9F7B-F2A0FC810F04}" EndProject +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StellaOps.Zastava.Observer.Tests", "StellaOps.Zastava.Observer.Tests\StellaOps.Zastava.Observer.Tests.csproj", "{20E0774F-86D5-4CD0-B636-E5212074FDE8}" +EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Any CPU = Debug|Any CPU @@ -2291,6 +2293,18 @@ Global {BC38594B-0B84-4657-9F7B-F2A0FC810F04}.Release|x64.Build.0 = Release|Any CPU {BC38594B-0B84-4657-9F7B-F2A0FC810F04}.Release|x86.ActiveCfg = Release|Any CPU {BC38594B-0B84-4657-9F7B-F2A0FC810F04}.Release|x86.Build.0 = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|Any CPU.Build.0 = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|x64.ActiveCfg = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|x64.Build.0 = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|x86.ActiveCfg = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Debug|x86.Build.0 = Debug|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|Any CPU.ActiveCfg = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|Any CPU.Build.0 = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|x64.ActiveCfg = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|x64.Build.0 = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|x86.ActiveCfg = Release|Any CPU + {20E0774F-86D5-4CD0-B636-E5212074FDE8}.Release|x86.Build.0 = Release|Any CPU EndGlobalSection 
GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE