# Surface.Env Design (Epic: SURFACE-SHARING)

> **Status:** Draft v1.0; aligns with tasks `SURFACE-ENV-01..05`, `SCANNER-ENV-01..03`, `ZASTAVA-ENV-01..02`, `OPS-ENV-01`.
>
> **Audience:** Scanner Worker/WebService engineers, Zastava engineers, DevOps/Ops teams.

## 1. Goals

Surface.Env centralises configuration discovery for every component that touches the shared Scanner “surface” (cache, manifests, secrets). The library replaces ad-hoc environment lookups with a deterministic, validated contract that:

1. Works identically across Scanner Worker, Scanner WebService, BuildX plug-ins, Zastava Observer/Webhook, and future consumers (Scheduler planners, CLI runners).
2. Supports both connected and air-gapped deployments with clear defaults.
3. Records configuration intent (tenant isolation, cache limits, TLS, feature flags) so Surface.Validation can enforce preconditions before any work executes.

## 2. Architecture Overview

```
+-----------------------+
| Host (Worker/WebSvc)  |
|  - IConfiguration     |
|  - ILogger            |
|                       |
|  +-----------------+  |
|  | SurfaceEnv      |  |   loads env vars / config file
|  |  - Provider     |--+------------------------------+
|  |  - Validators   |  |
|  +-----------------+  |
|         |             |
|         | IResolvedSurfaceConfiguration              |
|         v                                            v
| Surface.FS / Surface.Secrets / Surface.Validation consumers
+-------------------------------------------------------------
```

Surface.Env exposes `ISurfaceEnvironment` which returns an immutable `SurfaceEnvironmentSettings` record. Hosts call `SurfaceEnvBuilder.Build()` during startup, passing optional configuration overrides (for example, Helm chart values). The builder resolves environment variables, applies defaults, and executes Surface.Validation rules before handing settings to downstream services.

## 3. Configuration Schema

### 3.1 Common keys

| Variable | Description | Default | Notes |
|----------|-------------|---------|-------|
| `SCANNER_SURFACE_FS_ENDPOINT` | Base URI for Surface.FS / RustFS / S3-compatible store. | _required_ | Throws `SurfaceEnvironmentException` when `RequireSurfaceEndpoint = true`. When disabled (tests), builder falls back to `https://surface.invalid` so validation can fail fast. Also binds `Surface:Fs:Endpoint` from `IConfiguration`. |
| `SCANNER_SURFACE_FS_BUCKET` | Bucket/container used for manifests and artefacts. | `surface-cache` | Must be unique per tenant; validators enforce non-empty value. |
| `SCANNER_SURFACE_FS_REGION` | Optional region for S3-compatible stores. | `null` | Needed only when the backing store requires it (AWS/GCS). |
| `SCANNER_SURFACE_CACHE_ROOT` | Local directory for warm caches. | `/stellaops/surface` | Directory is created if missing. Override to `/var/lib/stellaops/surface` (or another fast SSD) in production. |
| `SCANNER_SURFACE_CACHE_QUOTA_MB` | Soft limit for on-disk cache usage. | `4096` | Enforced range 64–262144 MB; validation emits `SURFACE_ENV_CACHE_QUOTA_INVALID` outside the range. |
| `SCANNER_SURFACE_PREFETCH_ENABLED` | Enables manifest prefetch threads. | `false` | Workers honour this before analyzer execution. |
| `SCANNER_SURFACE_TENANT` | Tenant namespace used by cache + secret resolvers. | `TenantResolver(...)` or `"default"` | Default resolver may pull from Authority claims; you can override via env for multi-tenant pools. |
| `SCANNER_SURFACE_FEATURES` | Comma-separated feature switches. | `""` | Compared against `SurfaceEnvironmentOptions.KnownFeatureFlags`; unknown flags raise warnings. |
| `SCANNER_SURFACE_TLS_CERT_PATH` | Path to PEM/PKCS#12 file for client auth. | `null` | When present, `SurfaceEnvironmentBuilder` loads the certificate into `SurfaceTlsConfiguration`. |
| `SCANNER_SURFACE_TLS_KEY_PATH` | Optional private-key path when cert/key are stored separately. | `null` | Stored in `SurfaceTlsConfiguration` for hosts that need to hydrate the key themselves. |

### 3.2 Secrets provider keys

| Variable | Description | Notes |
|----------|-------------|-------|
| `SCANNER_SURFACE_SECRETS_PROVIDER` | Provider ID (`kubernetes`, `file`, `inline`, future back-ends). | Defaults to `kubernetes`; validators reject unknown values via `SURFACE_SECRET_PROVIDER_UNKNOWN`. |
| `SCANNER_SURFACE_SECRETS_ROOT` | Path or base namespace for the provider. | Required for the `file` provider (e.g., `/etc/stellaops/secrets`). |
| `SCANNER_SURFACE_SECRETS_NAMESPACE` | Kubernetes namespace used by the secrets provider. | Mandatory when `provider = kubernetes`. |
| `SCANNER_SURFACE_SECRETS_FALLBACK_PROVIDER` | Optional secondary provider ID. | Enables tiered lookups (e.g., `kubernetes` → `inline`) without changing code. |
| `SCANNER_SURFACE_SECRETS_ALLOW_INLINE` | Allows returning inline secrets (useful for tests). | Defaults to `false`; production deployments should keep this disabled. |
| `SCANNER_SURFACE_SECRETS_TENANT` | Tenant override for secret lookups. | Defaults to `SCANNER_SURFACE_TENANT` or the tenant resolver result. |

### 3.3 Component-specific prefixes

`SurfaceEnvironmentOptions.Prefixes` controls the order in which suffixes are probed. Every suffix listed above is combined with each prefix (e.g., `SCANNER_SURFACE_FS_ENDPOINT`, `ZASTAVA_SURFACE_FS_ENDPOINT`) and finally the bare suffix (`SURFACE_FS_ENDPOINT`). Configure prefixes per host so local overrides win but global scanner defaults remain available:

| Component | Suggested prefixes (first match wins) | Notes |
|-----------|---------------------------------------|-------|
| Scanner.Worker / WebService | `SCANNER` | Default – already added by `AddSurfaceEnvironment`. |
| Zastava Observer/Webhook (planned) | `ZASTAVA`, `SCANNER` | Call `options.AddPrefix("ZASTAVA")` before relying on `ZASTAVA_*` overrides. |
| Future CLI / BuildX plug-ins | `CLI`, `SCANNER` | Allows per-user overrides without breaking shared env files. |

This approach means operators can define a single env file (`SCANNER_*`) and only override the handful of settings that diverge for a specific component by introducing an additional prefix.

### 3.4 Configuration precedence

The builder resolves every suffix using the following precedence:

1. Environment variables using the configured prefixes (e.g., `ZASTAVA_SURFACE_FS_ENDPOINT`, then `SCANNER_SURFACE_FS_ENDPOINT`, then the bare `SURFACE_FS_ENDPOINT`).
2. Configuration values under the `Surface:*` section (for example `Surface:Fs:Endpoint`, `Surface:Cache:Root` in `appsettings.json` or Helm values).
3. Hard-coded defaults baked into `SurfaceEnvironmentBuilder` (temporary directory, `surface-cache` bucket, etc.).

`SurfaceEnvironmentOptions.RequireSurfaceEndpoint` controls whether a missing endpoint results in an exception (default: `true`). Other values fall back to the default listed in § 3.1/3.2 and are further validated by the Surface.Validation pipeline.

## 4. API Surface

```csharp
public interface ISurfaceEnvironment
{
    SurfaceEnvironmentSettings Settings { get; }
    IReadOnlyDictionary<string, string> RawVariables { get; }
}

public sealed record SurfaceEnvironmentSettings(
    Uri SurfaceFsEndpoint,
    string SurfaceFsBucket,
    string? SurfaceFsRegion,
    DirectoryInfo CacheRoot,
    int CacheQuotaMegabytes,
    bool PrefetchEnabled,
    IReadOnlyCollection<string> FeatureFlags,
    SurfaceSecretsConfiguration Secrets,
    string Tenant,
    SurfaceTlsConfiguration Tls)
{
    public DateTimeOffset CreatedAtUtc { get; init; }
}

public sealed record SurfaceSecretsConfiguration(
    string Provider,
    string Tenant,
    string? Root,
    string? Namespace,
    string? FallbackProvider,
    bool AllowInline);

public sealed record SurfaceTlsConfiguration(
    string? CertificatePath,
    string? PrivateKeyPath,
    X509Certificate2Collection? ClientCertificates);
```

`ISurfaceEnvironment.RawVariables` captures the exact env/config keys that produced the snapshot so operators can export them in diagnostics bundles.
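
The lookup order from § 3.4 can be illustrated with a minimal sketch. `SurfaceLookupSketch` and its `Resolve` method are hypothetical names introduced here for illustration; they are not part of Surface.Env, and the real `SurfaceEnvironmentBuilder` may differ (for example in how it treats blank values). The environment and configuration sources are passed in as delegates so the order can be exercised without touching the process environment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Surface.Env API): mirrors the § 3.4 precedence.
public static class SurfaceLookupSketch
{
    public static string Resolve(
        string suffix,                     // e.g. "SURFACE_FS_ENDPOINT"
        IReadOnlyList<string> prefixes,    // e.g. ["ZASTAVA", "SCANNER"], first match wins
        Func<string, string?> getEnv,      // a host could pass Environment.GetEnvironmentVariable
        Func<string, string?> getConfig,   // a host could read from IConfiguration here
        string configKey,                  // e.g. "Surface:Fs:Endpoint"
        string defaultValue)
    {
        // 1a. Prefixed environment variables, in the configured order.
        foreach (var prefix in prefixes)
        {
            var prefixed = getEnv($"{prefix}_{suffix}");
            // Assumption: blank values count as "not set".
            if (!string.IsNullOrWhiteSpace(prefixed)) return prefixed;
        }

        // 1b. The bare suffix is probed last among environment variables.
        var bare = getEnv(suffix);
        if (!string.IsNullOrWhiteSpace(bare)) return bare;

        // 2. Configuration values under the Surface:* section.
        var configured = getConfig(configKey);
        if (!string.IsNullOrWhiteSpace(configured)) return configured;

        // 3. Hard-coded default.
        return defaultValue;
    }
}
```

With prefixes `ZASTAVA`, `SCANNER` and both `ZASTAVA_SURFACE_FS_ENDPOINT` and `SCANNER_SURFACE_FS_ENDPOINT` set, this sketch returns the `ZASTAVA_*` value; removing it falls back to the `SCANNER_*` value, then to configuration, then to the default.
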
`SurfaceEnvironmentOptions` configures how the snapshot is built:

* `ComponentName` – used in logs/validation output.
* `Prefixes` – ordered list of env prefixes (see § 3.3). Defaults to `["SCANNER"]`.
* `RequireSurfaceEndpoint` – throw when no endpoint is provided (default `true`).
* `TenantResolver` – delegate invoked when `SCANNER_SURFACE_TENANT` is absent.
* `KnownFeatureFlags` – recognised feature switches; unexpected values raise warnings.

Example registration:

```csharp
builder.Services.AddSurfaceEnvironment(options =>
{
    options.ComponentName = "Scanner.Worker";
    options.AddPrefix("ZASTAVA"); // optional future override
    options.KnownFeatureFlags.Add("validation");
    options.TenantResolver = sp => sp.GetRequiredService().TenantId;
});
```

Consumers access `ISurfaceEnvironment.Settings` and pass the record into Surface.FS, Surface.Secrets, cache, and validation helpers. The interface memoises results so repeated access is cheap.

## 5. Validation

`SurfaceEnvironmentBuilder` only throws `SurfaceEnvironmentException` for malformed inputs (non-integer quota, invalid URI, missing required variable when `RequireSurfaceEndpoint = true`). The richer validation pipeline lives in `StellaOps.Scanner.Surface.Validation` and runs via `services.AddSurfaceValidation()`:

1. **SurfaceEndpointValidator** – checks for a non-placeholder endpoint and bucket (`SURFACE_ENV_MISSING_ENDPOINT`, `SURFACE_FS_BUCKET_MISSING`).
2. **SurfaceCacheValidator** – verifies the cache directory exists/is writable and that the quota is positive (`SURFACE_ENV_CACHE_DIR_UNWRITABLE`, `SURFACE_ENV_CACHE_QUOTA_INVALID`).
3. **SurfaceSecretsValidator** – validates provider names, required namespace/root fields, and tenant presence (`SURFACE_SECRET_PROVIDER_UNKNOWN`, `SURFACE_SECRET_CONFIGURATION_MISSING`, `SURFACE_ENV_TENANT_MISSING`).

Validators emit `SurfaceValidationIssue` instances with codes defined in `SurfaceValidationIssueCodes`.
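
As a rough illustration of what a validator of this kind does, the sketch below checks the cache quota against the 64–262144 MB range from § 3.1 and reports `SURFACE_ENV_CACHE_QUOTA_INVALID`. The types `SketchSeverity`, `SketchIssue`, and `CacheQuotaCheckSketch` are hypothetical stand-ins, not the real `SurfaceValidationIssue` API, and the choice of `Error` severity and the message wording are assumptions.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the real Surface.Validation types.
public enum SketchSeverity { Info, Warning, Error }

public sealed record SketchIssue(string Code, SketchSeverity Severity, string Message, string Hint);

public static class CacheQuotaCheckSketch
{
    // Range taken from the SCANNER_SURFACE_CACHE_QUOTA_MB row in § 3.1.
    public const int MinQuotaMb = 64;
    public const int MaxQuotaMb = 262144;

    public static IReadOnlyList<SketchIssue> Validate(int cacheQuotaMegabytes)
    {
        var issues = new List<SketchIssue>();

        if (cacheQuotaMegabytes < MinQuotaMb || cacheQuotaMegabytes > MaxQuotaMb)
        {
            issues.Add(new SketchIssue(
                "SURFACE_ENV_CACHE_QUOTA_INVALID",
                SketchSeverity.Error, // assumption: out-of-range quotas block startup
                $"Cache quota {cacheQuotaMegabytes} MB is outside {MinQuotaMb}-{MaxQuotaMb} MB.",
                "Set SCANNER_SURFACE_CACHE_QUOTA_MB to a value inside the supported range."));
        }

        // An empty list means the setting passed this check.
        return issues;
    }
}
```

Returning a list rather than throwing keeps the check composable: a runner can collect issues from several validators and let the reporter decide how to surface them.
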
`LoggingSurfaceValidationReporter` writes structured log entries (Info/Warning/Error) using the component name, issue code, and remediation hint. Hosts fail startup if any issue has `Error` severity; warnings allow startup but surface actionable hints.

## 6. Integration Guidance

- **Scanner Worker**: register `AddSurfaceEnvironment`, `AddSurfaceValidation`, `AddSurfaceFileCache`, and `AddSurfaceSecrets` before analyzer/services (see `src/Scanner/StellaOps.Scanner.Worker/Program.cs`). `SurfaceCacheOptionsConfigurator` already binds the cache root from `ISurfaceEnvironment`.
- **Scanner WebService**: identical wiring, plus `SurfacePointerService`/`ScannerSurfaceSecretConfigurator` reuse the resolved settings (`Program.cs` demonstrates the pattern).
- **Zastava Observer/Webhook**: will reuse the same helper once the service adds `AddSurfaceEnvironment(options => options.AddPrefix("ZASTAVA"))` so per-component overrides function without diverging defaults.
- **Scheduler / CLI / BuildX (future)**: treat `ISurfaceEnvironment` as read-only input; secret lookup, cache plumbing, and validation happen before any queue/enqueue work.

Readiness probes should invoke `ISurfaceValidatorRunner` (registered by `AddSurfaceValidation`) and fail the endpoint when any issue is returned. The Scanner Worker/WebService hosted services already run the validators on startup; other consumers should follow the same pattern.

### 6.1 Validation output

`LoggingSurfaceValidationReporter` produces log entries that include:

```
Surface validation issue for component Scanner.Worker: SURFACE_ENV_MISSING_ENDPOINT - Surface FS endpoint is missing or invalid. Hint: Set SCANNER_SURFACE_FS_ENDPOINT to the RustFS/S3 endpoint.
```

Treat `SurfaceValidationIssueCodes.*` with severity `Error` as hard blockers (readiness must fail). `Warning` entries flag configuration drift (for example, missing namespaces) but allow startup so staging/offline runs can proceed.
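
The severity rule above can be sketched as a small readiness gate. `ProbeSeverity`, `ProbeIssue`, and `ReadinessGateSketch` are hypothetical types used only for illustration; a real probe would obtain its issues from `ISurfaceValidatorRunner`, whose exact signature is not shown in this document.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for validation results.
public enum ProbeSeverity { Info, Warning, Error }

public sealed record ProbeIssue(string Code, ProbeSeverity Severity);

public static class ReadinessGateSketch
{
    // Ready unless at least one issue has Error severity; warnings do not block.
    public static bool IsReady(IEnumerable<ProbeIssue> issues) =>
        !issues.Any(issue => issue.Severity == ProbeSeverity.Error);

    // Distinct codes of the blocking issues, e.g. for a probe response body.
    public static IReadOnlyList<string> BlockingCodes(IEnumerable<ProbeIssue> issues) =>
        issues.Where(issue => issue.Severity == ProbeSeverity.Error)
              .Select(issue => issue.Code)
              .Distinct()
              .ToList();
}
```

A probe built this way reports ready for a warning-only result and not ready as soon as a single `Error` issue such as `SURFACE_ENV_MISSING_ENDPOINT` is present.
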
The codes appear in both the structured log state and the reporter payload, making it easy to alert on them.

## 7. Security & Observability

- Surface.Env never logs raw values; only suffix names and issue codes appear in logs. `RawVariables` is intended for diagnostics bundles and should be treated as sensitive metadata.
- TLS certificates are loaded into memory and not re-serialised; only the configured paths are exposed to downstream services.
- To emit metrics, register a custom `ISurfaceValidationReporter` (e.g., wrapping Prometheus counters) in addition to the logging reporter.

## 8. Offline & Air-Gap Support

- Defaults assume no public network access; point `SCANNER_SURFACE_FS_ENDPOINT` at an internal RustFS/S3 mirror.
- Offline bundles must capture an env file (Ops track this under the Offline Kit tasks) so operators can seed `SCANNER_*` values before first boot.
- Keep `docs/modules/devops/runbooks/zastava-deployment.md` in sync so Zastava deployments reuse the same env contract.

## 9. Testing Strategy

- Unit tests for each resolver/validator.
- Integration tests for Worker & Observer verifying that missing configuration causes deterministic failures.
- Golden tests for configuration precedence (component overrides, defaults).

## 10. Open Questions / Future Work

- Dynamic refresh of environment (watch ConfigMap) is out of scope for v1.
- Evaluate adding support for environment discovery via `IConfiguration` only (no env vars) for Windows service deployments.

## 11. References

- Surface.FS Design (`docs/modules/scanner/design/surface-fs.md`)
- Surface.Secrets Design (`docs/modules/scanner/design/surface-secrets.md`)
- Surface.Validation Design (`docs/modules/scanner/design/surface-validation.md`)
- AirGap mode overview (`docs/airgap/airgap-mode.md`)