feat: Implement Filesystem and MongoDB provenance writers for PackRun execution context
- Added `FilesystemPackRunProvenanceWriter` to write provenance manifests to the filesystem.
- Introduced `MongoPackRunArtifactReader` to read artifacts from MongoDB.
- Created `MongoPackRunProvenanceWriter` to store provenance manifests in MongoDB.
- Developed unit tests for the filesystem and MongoDB provenance writers.
- Established `ITimelineEventStore` and `ITimelineIngestionService` interfaces for timeline event handling.
- Implemented `TimelineIngestionService` to validate and persist timeline events with hashing.
- Created PostgreSQL schema and migration scripts for timeline indexing.
- Added dependency injection support for timeline indexer services.
- Developed tests for timeline ingestion and schema validation.
docs/modules/zastava/operations/dashboards/zastava-observability.json (new file, +6 lines)
@@ -0,0 +1,6 @@
{
  "_note": "Placeholder Grafana dashboard stub for Zastava. Replace panels when metrics endpoints are available; keep offline-import friendly.",
  "schemaVersion": 39,
  "title": "Zastava Observability (stub)",
  "panels": []
}
docs/modules/zastava/operations/observability.md (new file, +38 lines)
@@ -0,0 +1,38 @@
# Zastava observability runbook (stub · 2025-11-29 demo)

## Dashboards (offline import)
- Grafana JSON: `docs/modules/zastava/operations/dashboards/zastava-observability.json` (import locally; no external data sources assumed).
- Planned panels: admission decision rate, webhook latency p95/p99, cache freshness (Surface.FS), Surface.Env key misses, Secrets fetch failures, policy violation counts, and drift events.

## Key metrics
- `zastava_admission_latency_seconds_bucket{webhook}` — admission webhook latency.
- `zastava_admission_decisions_total{result}` — allow/deny counts.
- `zastava_surface_env_miss_total` — Surface.Env key misses.
- `zastava_surface_secrets_failures_total{reason}` — secret retrieval failures.
- `zastava_surface_fs_cache_freshness_seconds` — cache age vs Scanner surface metadata.
- `zastava_drift_events_total{type}` — drift detections by category.
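
Once a local Prometheus is scraping these series, the queries below give a quick read on the two headline signals. A minimal sketch, assuming Prometheus is reachable at `localhost:9090` (host and port are placeholders, not part of this runbook):

```bash
# Sketch: spot-check the headline series via the Prometheus HTTP API.
PROM=http://localhost:9090   # placeholder address

# p95 admission webhook latency over the last 5 minutes, per webhook.
curl -sG "$PROM/api/v1/query" --data-urlencode \
  'query=histogram_quantile(0.95, sum by (le, webhook) (rate(zastava_admission_latency_seconds_bucket[5m])))'

# Share of deny decisions over the last 10 minutes.
curl -sG "$PROM/api/v1/query" --data-urlencode \
  'query=sum(rate(zastava_admission_decisions_total{result="deny"}[10m])) / sum(rate(zastava_admission_decisions_total[10m]))'
```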

## Logs & traces
- Correlate by `correlationId`, `tenant`, `cluster`, and `admissionId`. Include `policyVersion`, `surfaceEnvProfile`, and `secretsProvider` fields.
- Traces are disabled by default for air-gapped deployments; enable them via `Telemetry:ExportEnabled=true` pointing at an on-prem collector.
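
For log correlation, something like the following works if the webhook emits one JSON object per line. The field names come from this page; the Deployment name `zastava-webhook` and the log shape are assumptions:

```bash
# Pull every log event for one admission decision by correlationId.
# Deployment name and JSON-per-line log format are assumptions.
kubectl logs deploy/zastava-webhook --since=1h \
  | jq -c 'select(.correlationId == "REPLACE-ME")
           | {tenant, cluster, admissionId, policyVersion, surfaceEnvProfile, secretsProvider}'
```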

## Health/diagnostics
- `/health/liveness` and `/health/readiness` (webhook + observer) check cache reachability, Secrets provider connectivity, and policy fetch.
- `/status` exposes build version, commit, and feature flags; verify against the offline bundle manifest.
- Cache probe: `GET /surface/fs/cache/status` returns freshness and hash for cached surfaces.
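
A minimal probe sequence for the endpoints above. The base URL is a placeholder (e.g. via a port-forward); the paths are the ones documented here:

```bash
# Placeholder base URL, e.g. after:
#   kubectl port-forward svc/zastava-webhook 8443:443   # service name is an assumption
WEBHOOK=https://localhost:8443

curl -sk "$WEBHOOK/health/liveness"
curl -sk "$WEBHOOK/health/readiness"

# Build version, commit, feature flags; compare with the offline bundle manifest.
curl -sk "$WEBHOOK/status"

# Freshness and hash for cached surfaces.
curl -sk "$WEBHOOK/surface/fs/cache/status"
```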

## Alert hints
- Admission latency p99 > 800ms.
- Deny rate spike > 5% over 10m without a policy change.
- Surface.Env miss rate > 1% or any Secrets failure over 10m.
- Cache freshness > 10m behind Scanner surface metadata.
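
One way to encode these hints as Prometheus alerting rules, sketched below. The thresholds come from the list above; alert names, the file name, and the exact PromQL shape are illustrative, and the "without policy change" and Surface.Env ratio conditions are omitted because they need series not listed here. `promtool check rules` validates the file offline.

```bash
# Sketch: encode the alert hints as Prometheus rules and lint them offline.
cat > zastava-alerts.yml <<'EOF'
groups:
  - name: zastava-stub
    rules:
      - alert: ZastavaAdmissionLatencyP99High
        expr: histogram_quantile(0.99, sum by (le) (rate(zastava_admission_latency_seconds_bucket[5m]))) > 0.8
        for: 5m
      - alert: ZastavaDenyRateSpike
        expr: >
          sum(rate(zastava_admission_decisions_total{result="deny"}[10m]))
            / sum(rate(zastava_admission_decisions_total[10m])) > 0.05
        for: 10m
      - alert: ZastavaSecretsFailures
        expr: sum(increase(zastava_surface_secrets_failures_total[10m])) > 0
      - alert: ZastavaCacheStale
        expr: max(zastava_surface_fs_cache_freshness_seconds) > 600
        for: 5m
EOF
promtool check rules zastava-alerts.yml
```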

## Offline verification steps
1) Import the Grafana JSON locally; point it at a Prometheus scrape labeled `zastava`.
2) Replay a sealed admission bundle and verify that `/status` and cache probe hashes match the manifest in the offline kit.
3) Run the webhook smoke test (`kubectl apply --dry-run=server -f samples/admission-request.yaml`) and confirm metrics increment locally (see the sketch below).
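
For step 3, a before/after scrape makes the increment visible without Grafana; the `/metrics` path and the address are assumptions about how the webhook is exposed:

```bash
# Snapshot the decision counters, fire the dry-run, and diff.
METRICS=https://localhost:8443/metrics   # placeholder address
curl -sk "$METRICS" | grep '^zastava_admission_decisions_total' > before.txt
kubectl apply --dry-run=server -f samples/admission-request.yaml
curl -sk "$METRICS" | grep '^zastava_admission_decisions_total' > after.txt
diff before.txt after.txt   # expect the matching counter to have incremented
```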

## Evidence locations
- Sprint tracker: `docs/implplan/SPRINT_0335_0001_0001_docs_modules_zastava.md`.
- Module docs: `README.md`, `architecture.md`, `implementation_plan.md`.
- Dashboard stub: `operations/dashboards/zastava-observability.json`.