feat: Initialize Zastava Webhook service with TLS and Authority authentication

- Added Program.cs to set up the web application with Serilog for logging, health check endpoints, and a placeholder admission endpoint.
- Configured Kestrel server to use TLS 1.3 and handle client certificates appropriately.
- Created StellaOps.Zastava.Webhook.csproj with necessary dependencies including Serilog and Polly.
- Documented tasks in TASKS.md for the Zastava Webhook project, outlining current work and exit criteria for each task.
This commit is contained in:
master
2025-10-19 18:36:22 +03:00
parent 2062da7a8b
commit d099a90f9b
966 changed files with 91038 additions and 1850 deletions


@@ -5,12 +5,12 @@
## 1. Overview
Authority plug-ins extend the **StellaOps Authority** service with custom identity providers, credential stores, and client-management logic. Unlike Concelier plug-ins (which ingest or export advisories), Authority plug-ins participate directly in authentication flows:
- **Use cases:** integrate corporate directories (LDAP/AD), delegate to external IDPs, enforce bespoke password/lockout policies, or add client provisioning automation.
- **Constraints:** plug-ins load only during service start (no hot-reload), must function without outbound internet access, and must emit deterministic results for identical configuration and input data.
- **Ship targets:** target the same .NET 10 preview as the host, honour offline-first requirements, and provide clear diagnostics so operators can triage issues from `/ready`.
- **Use cases:** integrate corporate directories (LDAP/AD)[^ldap-rfc], delegate to external IDPs, enforce bespoke password/lockout policies, or add client provisioning automation.
- **Constraints:** plug-ins load only during service start (no hot-reload), must function without outbound internet access, and must emit deterministic results for identical configuration input.
- **Ship targets:** build against the host's .NET 10 preview SDK, honour offline-first requirements, and surface actionable diagnostics so operators can triage issues from `/ready`.
## 2. Architecture Snapshot
Authority hosts follow a deterministic plug-in lifecycle. The flow below can be rendered as a sequence diagram in the final authored documentation, but all touchpoints are described here for offline viewers:
Authority hosts follow a deterministic plug-in lifecycle. The exported diagram (`docs/assets/authority/authority-plugin-lifecycle.svg`) mirrors the steps below; regenerate it from the Mermaid source if you update the flow.
1. **Configuration load:** `AuthorityPluginConfigurationLoader` resolves YAML manifests under `etc/authority.plugins/`.
2. **Assembly discovery:** the shared `PluginHost` scans `PluginBinaries/Authority` for `StellaOps.Authority.Plugin.*.dll` assemblies.
@@ -199,6 +199,8 @@ _Source:_ `docs/assets/authority/authority-rate-limit-flow.mmd`
- Document any external prerequisites (e.g., CA cert bundle) in your plug-in README.
- Update `etc/authority.plugins/<plugin>.yaml` samples and include deterministic SHA256 hashes for optional bootstrap payloads when distributing Offline Kit artefacts.
[^ldap-rfc]: Lightweight Directory Access Protocol (LDAPv3) specification — [RFC 4511](https://datatracker.ietf.org/doc/html/rfc4511).
## 12. Checklist & Handoff
- ✅ Capabilities declared and validated in automated tests.
- ✅ Bootstrap workflows documented (if `bootstrap` capability used) and repeatable.


@@ -20,8 +20,10 @@ dotnet publish src/StellaOps.Scanner.Sbomer.BuildXPlugin/StellaOps.Scanner.Sbome
-o out/buildx
```
- `out/buildx/` now contains `StellaOps.Scanner.Sbomer.BuildXPlugin.dll` and the manifest `stellaops.sbom-indexer.manifest.json`.
- `plugins/scanner/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin/` receives the same artefacts for release packaging.
- `out/buildx/` now contains `StellaOps.Scanner.Sbomer.BuildXPlugin.dll` and the manifest `stellaops.sbom-indexer.manifest.json`.
- `plugins/scanner/buildx/StellaOps.Scanner.Sbomer.BuildXPlugin/` receives the same artefacts for release packaging.
- The CI pipeline also tars and signs (SHA-256 manifest) the OS analyzer plug-ins located under
`plugins/scanner/analyzers/os/` so they ship alongside the BuildX generator artefacts.
## 3. Verify the CAS handshake


@@ -0,0 +1,108 @@
# Scanner Cache Configuration Guide
The scanner cache stores layer-level SBOM fragments and file content that can be reused across scans. This document explains how to configure and operate the cache subsystem introduced in Sprint 10 (Group SP10-G5).
## 1. Overview
- **Layer cache** persists SBOM fragments per layer digest under `<root>/layers/<digest>/` with deterministic metadata (`meta.json`).
- **File CAS** (content-addressable store) keeps deduplicated blobs (e.g., analyzer fixtures, imported SBOM layers) under `<root>/cas/<prefix>/<hash>/`.
- **Maintenance** runs via `ScannerCacheMaintenanceService`, evicting expired entries and compacting the cache to stay within size limits.
- **Metrics** emit on the `StellaOps.Scanner.Cache` meter with counters for hits, misses, evictions, and byte histograms.
- **Offline workflows** use the CAS import/export helpers to package cache warmups inside the Offline Kit.
## 2. Configuration keys (`scanner:cache`)
| Key | Default | Description |
| --- | --- | --- |
| `enabled` | `true` | Globally disable cache if `false`. |
| `rootPath` | `cache/scanner` | Base directory for cache data. Use an SSD-backed path for best warm-scan latency. |
| `layersDirectoryName` | `layers` | Subdirectory for layer cache entries. |
| `fileCasDirectoryName` | `cas` | Subdirectory for file CAS entries. |
| `layerTtl` | `45.00:00:00` | Time-to-live for layer cache entries (`TimeSpan`). `0` disables TTL eviction. |
| `fileTtl` | `30.00:00:00` | Time-to-live for CAS entries. `0` disables TTL eviction. |
| `maxBytes` | `5368709120` (5 GiB) | Hard cap for combined cache footprint. Compaction trims data back to `warmBytesThreshold`. |
| `warmBytesThreshold` | `maxBytes / 5` | Target size after compaction. |
| `coldBytesThreshold` | `maxBytes * 0.8` | Upper bound that triggers compaction. |
| `enableAutoEviction` | `true` | If `false`, callers must invoke `ILayerCacheStore.CompactAsync` / `IFileContentAddressableStore.CompactAsync` manually. |
| `maintenanceInterval` | `00:15:00` | Interval for the maintenance hosted service. |
| `enableFileCas` | `true` | Disable to prevent CAS usage (APIs throw on `PutAsync`). |
| `importDirectory` / `exportDirectory` | `null` | Optional defaults for offline import/export tooling. |
> **Tip:** configure `scanner:cache:rootPath` to a dedicated volume and mount it into worker containers when running in Kubernetes or Nomad.
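The relationship between the three size keys can be sketched as follows. This is a minimal illustration, not the shipped options type: `CacheSizeOptionsSketch` and `ShouldCompact` are hypothetical names, and the sketch assumes compaction starts once the footprint exceeds `coldBytesThreshold`.

```csharp
using System;

var options = new CacheSizeOptionsSketch();
Console.WriteLine($"warm={options.WarmBytesThreshold} cold={options.ColdBytesThreshold}");
Console.WriteLine(options.ShouldCompact(4_500_000_000));

// Hypothetical stand-in for the real options type; only the size keys are modelled.
public sealed record CacheSizeOptionsSketch(long MaxBytes = 5_368_709_120)
{
    // Documented default: maxBytes / 5 (the target size after compaction).
    public long WarmBytesThreshold { get; init; } = MaxBytes / 5;

    // Documented default: maxBytes * 0.8, written with integer arithmetic.
    public long ColdBytesThreshold { get; init; } = MaxBytes / 5 * 4;

    // Assumption: compaction is triggered when the footprint exceeds the cold threshold.
    public bool ShouldCompact(long currentBytes) => currentBytes > ColdBytesThreshold;
}
```

With the default 5 GiB cap this yields a 1 GiB warm target and a 4 GiB compaction trigger.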
## 3. Metrics
Instrumentation lives in `ScannerCacheMetrics` on meter `StellaOps.Scanner.Cache`.
| Instrument | Unit | Description |
| --- | --- | --- |
| `scanner.layer_cache_hits_total` | count | Layer cache hit counter. Tag: `layer`. |
| `scanner.layer_cache_misses_total` | count | Layer cache miss counter. Tag: `layer`. |
| `scanner.layer_cache_evictions_total` | count | Layer entries evicted due to TTL or compaction. Tag: `layer`. |
| `scanner.layer_cache_bytes` | bytes | Histogram of per-entry payload size when stored. |
| `scanner.file_cas_hits_total` | count | File CAS hit counter. Tag: `sha256`. |
| `scanner.file_cas_misses_total` | count | File CAS miss counter. Tag: `sha256`. |
| `scanner.file_cas_evictions_total` | count | CAS eviction counter. Tag: `sha256`. |
| `scanner.file_cas_bytes` | bytes | Histogram of CAS payload sizes on insert. |
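As a rough sketch of how instruments like these are created and observed with `System.Diagnostics.Metrics`: the meter and instrument names come from the table above, while the in-process `MeterListener`, the sample digest, and the recorded values are illustrative assumptions. A real host would attach an exporter instead of a listener.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

using var meter = new Meter("StellaOps.Scanner.Cache");
var layerHits = meter.CreateCounter<long>("scanner.layer_cache_hits_total", unit: "count");
var layerBytes = meter.CreateHistogram<long>("scanner.layer_cache_bytes", unit: "bytes");

// Collect measurements in-process so the sketch can run on its own.
var seen = new List<string>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "StellaOps.Scanner.Cache")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
    seen.Add($"{instrument.Name}={value}"));
listener.Start();

// Hypothetical layer digest used as the `layer` tag.
layerHits.Add(1, new KeyValuePair<string, object?>("layer", "sha256:abc"));
layerBytes.Record(4096);

foreach (var line in seen) Console.WriteLine(line);
```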
## 4. Import / Export workflow
1. **Export warm cache**
```bash
dotnet tool run stellaops-cache export --destination ./offline-kit/cache
```
Internally this calls `IFileContentAddressableStore.ExportAsync` which copies each CAS entry (metadata + `content.bin`).
2. **Import on air-gapped hosts**
```bash
dotnet tool run stellaops-cache import --source ./offline-kit/cache
```
The import API merges newer metadata and skips older snapshots automatically.
3. **Layer cache seeding**
Layer cache entries are deterministic and can be packaged the same way (copy `<root>/layers`). For now we keep seeding optional because layers are larger; follow-up tooling can compress directories as needed.
## 5. Hosted maintenance loop
`ScannerCacheMaintenanceService` runs as a background service within Scanner Worker or WebService hosts when `AddScannerCache` is registered. Behaviour:
- At startup it performs an immediate eviction/compaction run.
- Every `maintenanceInterval` it triggers:
  - `ILayerCacheStore.EvictExpiredAsync`
  - `ILayerCacheStore.CompactAsync`
  - `IFileContentAddressableStore.EvictExpiredAsync`
  - `IFileContentAddressableStore.CompactAsync`
- Failures are logged at `Error` with preserved stack traces; the next tick continues normally.
Set `enableAutoEviction=false` when hosting the cache inside ephemeral build pipelines that want to drive eviction explicitly.
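The loop behaviour described above (immediate run, then one run per interval, with failures logged and later ticks unaffected) might look roughly like this. `MaintenanceLoopSketch` is a hypothetical helper, not the real `ScannerCacheMaintenanceService`; it also assumes that a failing step does not prevent the remaining steps of the same tick from running.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(250));
var runs = 0;
await MaintenanceLoopSketch.RunAsync(
    new Func<CancellationToken, Task<int>>[] { _ => { runs++; return Task.FromResult(0); } },
    TimeSpan.FromMilliseconds(50),
    ex => Console.Error.WriteLine(ex),
    cts.Token);
Console.WriteLine($"maintenance runs: {runs}");

public static class MaintenanceLoopSketch
{
    // Runs every step once immediately, then once per interval until cancelled.
    public static async Task RunAsync(
        Func<CancellationToken, Task<int>>[] steps,
        TimeSpan interval,
        Action<Exception> logError,
        CancellationToken ct)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            do
            {
                foreach (var step in steps)
                {
                    try { await step(ct); }
                    catch (Exception ex) when (ex is not OperationCanceledException)
                    {
                        logError(ex); // log and carry on; the next tick still runs
                    }
                }
            }
            while (await timer.WaitForNextTickAsync(ct));
        }
        catch (OperationCanceledException)
        {
            // Host shutdown: leave the loop quietly.
        }
    }
}
```

In a host, the four `EvictExpiredAsync` / `CompactAsync` calls listed above would be passed as the `steps`.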
## 6. API surface summary
```csharp
public interface ILayerCacheStore
{
    ValueTask<LayerCacheEntry?> TryGetAsync(string layerDigest, CancellationToken ct = default);
    Task<LayerCacheEntry> PutAsync(LayerCachePutRequest request, CancellationToken ct = default);
    Task RemoveAsync(string layerDigest, CancellationToken ct = default);
    Task<int> EvictExpiredAsync(CancellationToken ct = default);
    Task<int> CompactAsync(CancellationToken ct = default);
    Task<Stream?> OpenArtifactAsync(string layerDigest, string artifactName, CancellationToken ct = default);
}

public interface IFileContentAddressableStore
{
    ValueTask<FileCasEntry?> TryGetAsync(string sha256, CancellationToken ct = default);
    Task<FileCasEntry> PutAsync(FileCasPutRequest request, CancellationToken ct = default);
    Task<bool> RemoveAsync(string sha256, CancellationToken ct = default);
    Task<int> EvictExpiredAsync(CancellationToken ct = default);
    Task<int> CompactAsync(CancellationToken ct = default);
    Task<int> ExportAsync(string destinationDirectory, CancellationToken ct = default);
    Task<int> ImportAsync(string sourceDirectory, CancellationToken ct = default);
}
```
Register both stores via `services.AddScannerCache(configuration);` in WebService or Worker hosts.
---
_Last updated: 2025-10-19_


@@ -0,0 +1,140 @@
# Authority DPoP & mTLS Implementation Plan (2025-10-19)
## Purpose
- Provide the implementation blueprint for AUTH-DPOP-11-001 and AUTH-MTLS-11-002.
- Unify sender-constraint validation across Authority, downstream services, and clients.
- Capture deterministic, testable steps that unblock UI/Signer guilds depending on DPoP/mTLS hardening.
## Scope
- Token endpoint validation, issuance, and storage changes inside `StellaOps.Authority`.
- Shared security primitives consumed by Authority, Scanner, Signer, CLI, and UI.
- Operator-facing configuration, auditing, and observability.
- Out of scope: PoE enforcement (Signer) and CLI/UI client UX; those teams consume the new capabilities.
## Design Summary
- Extract the existing Scanner `DpopProofValidator` stack into a shared `StellaOps.Auth.Security` library used by Authority and resource servers.
- Extend Authority configuration (`authority.yaml`) with strongly-typed `senderConstraints.dpop` and `senderConstraints.mtls` sections (these map to the sample already shown in the architecture doc).
- Require DPoP proofs on `/token` when the registered client policy is `senderConstraint=dpop`; bind issued access tokens via `cnf.jkt`.
- Introduce Authority-managed nonce issuance for “high value” audiences (default: `signer`, `attestor`) with Redis-backed persistence and deterministic auditing.
- Enable OAuth 2.0 mTLS (RFC 8705) by storing certificate bindings per client, requesting client certificates at TLS termination, and stamping `cnf.x5t#S256` into issued tokens plus introspection output.
- Surface structured logs and counters for both DPoP and mTLS flows; provide integration tests that cover success, replay, invalid proof, and certificate mismatch cases.
## AUTH-DPOP-11-001 — Proof Validation & Nonce Handling
**Shared validator**
- Move `DpopProofValidator`, option types, and replay cache interfaces from `StellaOps.Scanner.Core` into a new assembly `StellaOps.Auth.Security`.
- Provide pluggable caches: `InMemoryDpopReplayCache` (existing) and new `RedisDpopReplayCache` (leveraging the Authority Redis connection).
- Ensure the validator exposes the validated `SecurityKey`, `jti`, and `iat` so Authority can construct the `cnf` claim and compute nonce expiry.
**Configuration model**
- Extend `StellaOpsAuthorityOptions.Security` with a `SenderConstraints` property containing:
  - `Dpop` (`enabled`, `allowedAlgorithms`, `maxAgeSeconds`, `clockSkewSeconds`, `replayWindowSeconds`, `nonce` settings with `enabled`, `ttlSeconds`, `requiredAudiences`, `maxIssuancePerMinute`).
  - `Mtls` (`enabled`, `requireChainValidation`, `clientCaBundle`, `allowedSubjectPatterns`, `allowedSanTypes`).
- Bind from YAML (`authority.security.senderConstraints.*`) while preserving backwards compatibility (defaults keep both disabled).
**Token endpoint pipeline**
- Introduce a scoped OpenIddict handler `ValidateDpopProofHandler` inserted before `ValidateClientCredentialsHandler`.
- Determine the required sender constraint from client metadata:
  - Add `AuthorityClientMetadataKeys.SenderConstraint` storing `dpop` or `mtls`.
  - Optionally allow per-client overrides for nonce requirement.
- When `dpop` is required:
  - Read the `DPoP` header from the ASP.NET request, reject with `invalid_token` + `WWW-Authenticate: DPoP error="invalid_dpop_proof"` if absent.
  - Call the shared validator with method/URI. Enforce algorithm allowlist and `iat` window from options.
  - Persist the `jkt` thumbprint plus replay cache state in the OpenIddict transaction (`AuthorityOpenIddictConstants.DpopKeyThumbprintProperty`, `DpopIssuedAtProperty`).
  - When the requested audience intersects `SenderConstraints.Dpop.Nonce.RequiredAudiences`, require `nonce` in the proof; on first failure respond with HTTP 401, `error="use_dpop_nonce"`, and include `DPoP-Nonce` header (see nonce note below). Cache the rejection reason for audit logging.
**Nonce service**
- Add `IDpopNonceStore` with methods `IssueAsync(audience, clientId, jkt)` and `TryConsumeAsync(nonce, audience, clientId, jkt)`.
- Default implementation `RedisDpopNonceStore` storing SHA-256 hashes of nonces keyed by `audience:clientId:jkt`. TTL comes from `SenderConstraints.Dpop.Nonce.Ttl`.
- Create helper `DpopNonceIssuer` used by `ValidateDpopProofHandler` to issue nonces when missing/expired, enforcing issuance rate limits (per options) and tagging audit/log records.
- On successful validation (nonce supplied and consumed) stamp metadata into the transaction for auditing.
- Update `ClientCredentialsHandlers` to observe nonce enforcement: when a nonce challenge was sent, emit structured audit with `nonce_issued`, `audiences`, and `retry`.
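The issue/consume contract above can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions: `InMemoryNonceStoreSketch` is a hypothetical class, it is synchronous where the planned `IDpopNonceStore` is async, it keeps one outstanding nonce per `audience:clientId:jkt` key, and it chooses to invalidate the stored nonce on any consume attempt, matching or not.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;

var store = new InMemoryNonceStoreSketch(TimeSpan.FromSeconds(60));
var nonce = store.Issue("signer", "client-a", "thumbprint-1"); // hypothetical identifiers
Console.WriteLine(store.TryConsume(nonce, "signer", "client-a", "thumbprint-1")); // first use succeeds
Console.WriteLine(store.TryConsume(nonce, "signer", "client-a", "thumbprint-1")); // replay is rejected

public sealed class InMemoryNonceStoreSketch
{
    private readonly ConcurrentDictionary<string, (string Hash, DateTimeOffset Expires)> _entries = new();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTimeOffset> _now;

    public InMemoryNonceStoreSketch(TimeSpan ttl, Func<DateTimeOffset>? now = null)
    {
        _ttl = ttl;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public string Issue(string audience, string clientId, string jkt)
    {
        var nonce = Convert.ToHexString(RandomNumberGenerator.GetBytes(32));
        // Only the SHA-256 hash of the nonce is stored, never the nonce itself.
        _entries[Key(audience, clientId, jkt)] = (Hash(nonce), _now() + _ttl);
        return nonce;
    }

    public bool TryConsume(string nonce, string audience, string clientId, string jkt)
    {
        // Remove first so a nonce is consumed at most once, even under concurrency.
        if (!_entries.TryRemove(Key(audience, clientId, jkt), out var entry)) return false;
        if (entry.Expires <= _now()) return false;
        return CryptographicOperations.FixedTimeEquals(
            Encoding.ASCII.GetBytes(entry.Hash), Encoding.ASCII.GetBytes(Hash(nonce)));
    }

    private static string Key(string audience, string clientId, string jkt) =>
        $"{audience}:{clientId}:{jkt}";

    private static string Hash(string nonce) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(nonce)));
}
```

A Redis-backed store would presumably get the same single-use guarantee from an atomic get-and-delete, with the TTL enforced by key expiry.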
**Token issuance**
- In `HandleClientCredentialsHandler`, if the transaction contains a validated DPoP key:
  - Build `cnf.jkt` using thumbprint from validator.
  - Include `auth_time`/`dpop_jti` as needed for diagnostics.
  - Persist the thumbprint alongside token metadata in Mongo (extend `AuthorityTokenDocument` with `SenderConstraint`, `KeyThumbprint`, `Nonce` fields).
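For orientation, the `jkt` value is a JWK SHA-256 thumbprint as defined by RFC 7638. The sketch below computes it for a P-256 key; `JwkThumbprintSketch` is a hypothetical helper, and the plan itself expects the shared validator to supply the thumbprint rather than code like this.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Throwaway key, only so the sketch runs on its own.
using var key = ECDsa.Create(ECCurve.NamedCurves.nistP256);
var publicKey = key.ExportParameters(includePrivateParameters: false);
Console.WriteLine(JwkThumbprintSketch.ForP256(publicKey.Q.X!, publicKey.Q.Y!));

public static class JwkThumbprintSketch
{
    public static string ForP256(byte[] x, byte[] y)
    {
        // RFC 7638: required members only, in lexicographic order, with no whitespace.
        var canonical =
            $"{{\"crv\":\"P-256\",\"kty\":\"EC\",\"x\":\"{Base64Url(x)}\",\"y\":\"{Base64Url(y)}\"}}";
        return Base64Url(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
    }

    // Base64url without padding, as used throughout JOSE.
    public static string Base64Url(byte[] data) =>
        Convert.ToBase64String(data).TrimEnd('=').Replace('+', '-').Replace('/', '_');
}
```

The resulting string is what would be placed under `cnf.jkt` and compared against the key that signs later proofs.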
**Auditing & observability**
- Emit new audit events:
  - `authority.dpop.proof.validated` (success/failure, clientId, audience, thumbprint, nonce status, jti).
  - `authority.dpop.nonce.issued` and `authority.dpop.nonce.consumed`.
- Metrics (Prometheus style):
  - `authority_dpop_validations_total{result,reason}`.
  - `authority_dpop_nonce_issued_total{audience}` and `authority_dpop_nonce_fails_total{reason}`.
- Structured logs include `authority.sender_constraint=dpop`, `authority.dpop_thumbprint`, `authority.dpop_nonce`.
**Testing**
- Unit tests for the handler pipeline using fake OpenIddict transactions.
- Replay/nonce tests with in-memory and Redis stores.
- Integration tests in `StellaOps.Authority.Tests` covering:
  - Valid DPoP proof issuing `cnf.jkt`.
  - Missing header → challenge with nonce.
  - Replayed `jti` rejected.
  - Invalid nonce rejected even after issuance.
- Contract tests to ensure `/.well-known/openid-configuration` advertises `dpop_signing_alg_values_supported` and `dpop_nonce_supported` when enabled.
## AUTH-MTLS-11-002 — Certificate-Bound Tokens
**Configuration model**
- Reuse `SenderConstraints.Mtls` described above; include:
  - `enforceForAudiences` list (defaults `signer`, `attestor`, `scheduler`).
  - `certificateRotationGraceSeconds` for overlap.
  - `allowedClientCertificateAuthorities` absolute paths.
**Kestrel/TLS pipeline**
- Configure Kestrel with `ClientCertificateMode.AllowCertificate` globally and implement middleware that enforces certificate presence only when the resolved client requires mTLS.
- Add `IAuthorityClientCertificateValidator` that validates presented certificate chain, SANs (`dns`, `uri`, optional SPIFFE), and thumbprint matches one of the stored bindings.
- Cache validation results per connection id to avoid rehashing on every request.
**Client registration & storage**
- Extend `AuthorityClientDocument` with `List<AuthorityClientCertificateBinding>` containing:
  - `Thumbprint`, `SerialNumber`, `Subject`, `NotBefore`, `NotAfter`, `Sans`, `CreatedAt`, `UpdatedAt`, `Label`.
- Provide admin API mutations (`/admin/clients/{id}/certificates`) for ops tooling (deferred implementation but schema ready).
- Update plugin provisioning store (`StandardClientProvisioningStore`) to map descriptors with certificate bindings and `senderConstraint`.
- Persist binding state in Mongo migrations (index on `{clientId, thumbprint}`).
**Token issuance & introspection**
- Add a transaction property capturing the validated client certificate thumbprint.
- `HandleClientCredentialsHandler`:
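  - When mTLS is required, ensure certificate info is present; reject otherwise.
  - Stamp `cnf` claim: `principal.SetClaim("cnf", JsonSerializer.Serialize(new Dictionary<string, string> { ["x5t#S256"] = thumbprint }))` (an anonymous type cannot declare a member named `x5t#S256`, so a dictionary carries the key).
  - Store binding metadata in issued token document for audit.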
- Update `ValidateAccessTokenHandler` and introspection responses to surface `cnf.x5t#S256`.
- Ensure refresh tokens (if ever enabled) copy the binding data.
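As an illustration of the `cnf.x5t#S256` value: RFC 8705 defines it as the base64url-encoded SHA-256 digest of the DER-encoded client certificate. The sketch below uses a throwaway self-signed certificate, and `CertificateBindingSketch` is a hypothetical helper rather than part of the planned handlers.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using System.Text.Json;

// Throwaway self-signed certificate, only so the sketch runs on its own.
using var ecdsa = ECDsa.Create(ECCurve.NamedCurves.nistP256);
var request = new CertificateRequest("CN=sketch-client", ecdsa, HashAlgorithmName.SHA256);
using var cert = request.CreateSelfSigned(
    DateTimeOffset.UtcNow.AddMinutes(-5), DateTimeOffset.UtcNow.AddDays(1));

Console.WriteLine(CertificateBindingSketch.BuildCnf(cert));

public static class CertificateBindingSketch
{
    // Base64url (no padding) SHA-256 digest of the DER-encoded certificate.
    public static string Thumbprint(X509Certificate2 certificate) =>
        Convert.ToBase64String(SHA256.HashData(certificate.RawData))
            .TrimEnd('=').Replace('+', '-').Replace('/', '_');

    // A dictionary is used because "x5t#S256" is not a valid C# member name.
    public static string BuildCnf(X509Certificate2 certificate) =>
        JsonSerializer.Serialize(
            new Dictionary<string, string> { ["x5t#S256"] = Thumbprint(certificate) });
}
```

A resource server would recompute the same digest from the certificate presented on the TLS connection and compare it with the value carried in the token.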
**Auditing & observability**
- Audit events:
  - `authority.mtls.handshake` (success/failure, clientId, thumbprint, issuer, subject).
  - `authority.mtls.binding.missing` when a required client posts without a cert.
- Metrics:
  - `authority_mtls_handshakes_total{result}`.
  - `authority_mtls_certificate_rotations_total`.
- Logs include `authority.sender_constraint=mtls`, `authority.mtls_thumbprint`, `authority.mtls_subject`.
**Testing**
- Unit tests for certificate validation rules (SAN mismatches, expiry, CA trust).
- Integration tests running Kestrel with test certificates:
  - Successful token issuance with bound certificate.
  - Request without certificate → `invalid_client`.
  - Token introspection reveals `cnf.x5t#S256`.
  - Rotation scenario (old + new cert allowed during grace window).
## Implementation Checklist
**DPoP work-stream**
1. Extract shared validator into `StellaOps.Auth.Security`; update Scanner references.
2. Introduce configuration classes and bind from YAML/environment.
3. Implement nonce store (Redis + in-memory), handler integration, and OpenIddict transaction plumbing.
4. Stamp `cnf.jkt`, audit events, and metrics; update Mongo documents and migrations.
5. Extend docs: `docs/ARCHITECTURE_AUTHORITY.md`, `docs/security/audit-events.md`, `docs/security/rate-limits.md`, CLI/UI references.
**mTLS work-stream**
1. Extend client document/schema and provisioning stores with certificate bindings + sender constraint flag.
2. Configure Kestrel/middleware for optional client certificates and validation service.
3. Update token issuance/introspection to honour certificate bindings and emit `cnf.x5t#S256`.
4. Add auditing/metrics and integration tests (happy path + failure).
5. Refresh operator documentation (`docs/ops/authority-backup-restore.md`, `docs/ops/authority-monitoring.md`, sample `authority.yaml`) to cover certificate lifecycle.
Both streams should conclude with `dotnet test src/StellaOps.Authority.sln` and documentation cross-links so dependent guilds can unblock UI/Signer work.