
This commit is contained in:
StellaOps Bot
2025-11-28 09:40:40 +02:00
parent 1c6730a1d2
commit 05da719048
206 changed files with 34741 additions and 1751 deletions


@@ -0,0 +1,502 @@
# CLI Vulnerability Explorer Commands Reference
> **Audience:** DevEx engineers, security operators, and CI authors managing vulnerabilities through the `stella` CLI.
> **Scope:** Command synopsis, options, exit codes, and CI integration patterns for `stella vuln` commands as introduced in Sprint 205.
The Vulnerability Explorer CLI enables comprehensive vulnerability management, including listing, inspection, workflow operations, policy simulation, and export. All commands support multi-tenant environments and integrate with StellaOps Authority for authentication.
---
## 1. Prerequisites
- CLI version: `stella` >= 0.21.0 (Vulnerability Explorer feature gate enabled).
- Required scopes (DPoP-bound):
- `vuln:view` for listing and viewing vulnerabilities.
- `vuln:workflow` for workflow operations (assign, comment, accept-risk, etc.).
- `vuln:simulate` for policy simulation.
- `vuln:export` for exporting evidence bundles.
- `tenant:select` if using tenant switching.
- Connectivity: direct access to Backend APIs or configured backend URL.
- Environment: set `STELLAOPS_BACKEND_URL`, `STELLA_TENANT`, and authenticate via `stella auth login`.
---
## 2. `stella vuln list`
### 2.1 Synopsis
```bash
stella vuln list \
  [--vuln-id <id>] \
  [--severity critical|high|medium|low] \
  [--status open|triaged|accepted|fixed|risk_accepted] \
  [--purl <package-url>] \
  [--cpe <cpe>] \
  [--sbom-id <sbom-id>] \
  [--policy-id <policy-id>] \
  [--policy-version <version>] \
  [--group-by severity|status|sbom|policy] \
  [--limit <n>] [--offset <n>] [--cursor <token>] \
  [--tenant <tenant-id>] \
  [--json] [--csv] [--verbose]
```
### 2.2 Description
Lists vulnerabilities matching the specified filters with pagination support. Supports grouped summaries for reporting and machine-readable output for automation.
### 2.3 Options
| Option | Description |
|--------|-------------|
| `--vuln-id <id>` | Filter by vulnerability ID (e.g., CVE-2024-1234). |
| `--severity <level>` | Filter by severity (`critical`, `high`, `medium`, `low`); repeatable to match multiple levels. |
| `--status <status>` | Filter by workflow status. |
| `--purl <package-url>` | Filter by Package URL pattern. |
| `--cpe <cpe>` | Filter by CPE pattern. |
| `--sbom-id <sbom-id>` | Filter by SBOM identifier. |
| `--policy-id <policy-id>` | Filter by policy ID. |
| `--policy-version <version>` | Filter by policy version. |
| `--group-by <field>` | Group results by field (shows summary counts). |
| `--limit <n>` | Maximum results to return (default 50). |
| `--offset <n>` | Number of results to skip. |
| `--cursor <token>` | Pagination cursor from previous response. |
| `--tenant <tenant-id>` | Override tenant for multi-tenant deployments. |
| `--json` | Output as JSON for automation. |
| `--csv` | Output as CSV for spreadsheet import. |
| `--verbose` | Enable debug logging. |
### 2.4 Examples
List critical vulnerabilities:
```bash
stella vuln list --severity critical
```
Group by status for reporting:
```bash
stella vuln list --group-by status --json > status-summary.json
```
Export CSV for compliance audit:
```bash
stella vuln list --severity critical --severity high --csv > critical-vulns.csv
```
---
## 3. `stella vuln show`
### 3.1 Synopsis
```bash
stella vuln show <vulnerability-id> \
  [--tenant <tenant-id>] \
  [--json] [--verbose]
```
### 3.2 Description
Displays detailed information about a specific vulnerability including severity, affected packages, policy rationale, evidence, dependency paths, and workflow history.
### 3.3 Output Sections
- **Header:** Vulnerability ID, status, severity, VEX status, aliases, assignee, dates.
- **Description:** Full vulnerability description.
- **Affected Packages:** Table of affected packages with versions and fix status.
- **Policy Rationale:** Active policy rules and their evaluation results.
- **Evidence:** Timeline of evidence collected.
- **Dependency Paths:** Transitive dependency chains leading to vulnerability.
- **Workflow History:** Audit ledger of all workflow actions.
- **References:** Links to advisories, patches, and documentation.
### 3.4 Example
```bash
stella vuln show CVE-2024-1234 --json
```
---
## 4. Workflow Commands
All workflow commands support bulk operations via `--vuln-id` (repeatable) or filter options.
### 4.1 `stella vuln assign`
Assign vulnerabilities to a team member.
```bash
stella vuln assign <assignee> \
  [--vuln-id <id>]... \
  [--filter-severity <level>] \
  [--filter-status <status>] \
  [--filter-purl <pattern>] \
  [--filter-sbom <sbom-id>] \
  [--tenant <tenant-id>] \
  [--idempotency-key <key>] \
  [--json] [--verbose]
```
Example:
```bash
stella vuln assign security-team \
  --filter-severity critical \
  --filter-status open
```
### 4.2 `stella vuln comment`
Add a comment to vulnerabilities.
```bash
stella vuln comment "<text>" \
  --vuln-id CVE-2024-1234 \
  [--json]
```
### 4.3 `stella vuln accept-risk`
Accept risk for vulnerabilities with documented justification.
```bash
stella vuln accept-risk "<justification>" \
  --vuln-id CVE-2024-1234 \
  [--due-date 2025-12-31] \
  [--json]
```
### 4.4 `stella vuln verify-fix`
Mark vulnerabilities as fixed and verified.
```bash
stella vuln verify-fix <fix-version> \
  --vuln-id CVE-2024-1234 \
  [--json]
```
### 4.5 `stella vuln target-fix`
Set target fix version and due date.
```bash
stella vuln target-fix <version> \
  --vuln-id CVE-2024-1234 \
  [--due-date 2025-06-30] \
  [--json]
```
### 4.6 `stella vuln reopen`
Reopen previously closed vulnerabilities.
```bash
stella vuln reopen "<reason>" \
  --vuln-id CVE-2024-1234 \
  [--json]
```
### 4.7 Idempotency
All workflow commands support `--idempotency-key` for safe retries in CI pipelines:
```bash
stella vuln assign security-team \
  --vuln-id CVE-2024-1234 \
  --idempotency-key "assign-cve-2024-1234-$(date +%Y%m%d)"
```
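A date-suffixed key like the one above changes once per day; when retries must reuse the exact same key, deriving it from the request's inputs keeps it stable. A minimal sketch, assuming an illustrative `action:vuln:assignee:date` payload layout (this is a local convention, not a format the server requires):

```shell
# Build a deterministic idempotency key by hashing the action's inputs, so a
# retried request reuses the same key while changed inputs produce a new one.
# The payload layout below is illustrative, not a server-mandated format.
payload="assign:CVE-2024-1234:security-team:2025-11-28"
key=$(printf '%s' "$payload" | sha256sum | cut -c1-16)
echo "idempotency-key: $key"
```

The same payload always yields the same 16-hex-character key, which is what makes retried requests safe to replay.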
---
## 5. `stella vuln simulate`
### 5.1 Synopsis
```bash
stella vuln simulate \
  [--policy-id <id>] \
  [--policy-version <version>] \
  [--vex-override <vulnId>=<status>]... \
  [--severity-threshold <level>] \
  [--sbom-id <id>]... \
  [--markdown] \
  [--changed-only] \
  [--output <file>] \
  [--tenant <tenant-id>] \
  [--json] [--verbose]
```
### 5.2 Description
Simulates the impact of policy or VEX changes without modifying data. Produces delta summaries showing which vulnerabilities would change status, useful for policy review and CI gates.
### 5.3 Options
| Option | Description |
|--------|-------------|
| `--policy-id <id>` | Policy ID to simulate. |
| `--policy-version <version>` | Policy version to simulate against. |
| `--vex-override <vulnId>=<status>` | Override VEX status for simulation (repeatable). |
| `--severity-threshold <level>` | Minimum severity to include. |
| `--sbom-id <id>` | Limit simulation to specific SBOMs (repeatable). |
| `--markdown` | Include Markdown report for CI. |
| `--changed-only` | Only show items that would change. |
| `--output <file>` | Write Markdown report to file. |
| `--json` | Output full simulation results as JSON. |
### 5.4 Output
The command displays:
- **Summary Panel:** Total evaluated, changed, upgrades, downgrades.
- **Delta Table:** Before/after status comparison with UPGRADE/DOWNGRADE indicators.
- **Markdown Report:** Optional CI-friendly report.
### 5.5 CI Integration Example
```bash
# Run the simulation and fail the gate on a non-zero exit
stella vuln simulate \
  --policy-id prod-policy \
  --changed-only \
  --markdown \
  --output simulation-report.md

# Check exit code
if [ $? -ne 0 ]; then
  echo "Simulation found issues - see simulation-report.md"
  exit 1
fi
```
---
## 6. `stella vuln export`
### 6.1 Synopsis
```bash
stella vuln export \
  --output <file> \
  [--vuln-id <id>]... \
  [--sbom-id <id>]... \
  [--policy-id <id>] \
  [--format ndjson|json] \
  [--include-evidence] \
  [--include-ledger] \
  [--signed] \
  [--tenant <tenant-id>] \
  [--verbose]
```
### 6.2 Description
Exports vulnerability evidence bundles for compliance documentation, audits, or offline analysis. Bundles can be cryptographically signed for integrity verification.
### 6.3 Options
| Option | Description |
|--------|-------------|
| `--output <file>` | Output file path (required). |
| `--vuln-id <id>` | Vulnerability IDs to include (repeatable). |
| `--sbom-id <id>` | SBOM IDs to scope export (repeatable). |
| `--policy-id <id>` | Policy ID for filtering. |
| `--format <fmt>` | Output format: `ndjson` (default) or `json`. |
| `--include-evidence` | Include evidence data (default: true). |
| `--include-ledger` | Include workflow ledger (default: true). |
| `--signed` | Request signed bundle (default: true). |
### 6.4 Example
```bash
stella vuln export \
  --output compliance-bundle.ndjson \
  --sbom-id prod-app-sbom \
  --signed
```
---
## 7. `stella vuln export verify`
### 7.1 Synopsis
```bash
stella vuln export verify <file> \
  [--expected-digest <sha256:hex>] \
  [--public-key <key-file>] \
  [--verbose]
```
### 7.2 Description
Verifies the integrity and optional signature of an exported vulnerability bundle. Use this to validate bundles received from external sources or stored archives.
### 7.3 Example
```bash
stella vuln export verify compliance-bundle.ndjson \
  --expected-digest sha256:abc123... \
  --public-key /path/to/public.pem
```
---
## 8. Exit Codes
| Exit Code | Meaning |
|-----------|---------|
| `0` | Command completed successfully. |
| `1` | General error (see error message). |
| `130` | Operation cancelled by user (Ctrl+C). |
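Wrapper scripts that need to distinguish user cancellation from real failures can branch on these codes. A self-contained sketch, where `run_stella` is a placeholder standing in for an actual `stella vuln ...` invocation:

```shell
# Branch on the documented exit codes. run_stella is a stand-in for a real
# `stella vuln ...` call so this sketch runs anywhere; MOCK_RC simulates the
# exit status the real command would return.
run_stella() { return "${MOCK_RC:-0}"; }

run_stella
case $? in
  0)   echo "success" ;;
  130) echo "cancelled by user" ;;
  *)   echo "error" ;;
esac
```

In a real pipeline, `130` typically means an operator interrupted the run and the job should be retried rather than treated as a hard failure.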
---
## 9. Compliance Checklist
Use these commands to demonstrate vulnerability management compliance:
### 9.1 Vulnerability Inventory
```bash
# Generate complete vulnerability inventory
stella vuln list --json > inventory.json
# Summary by severity
stella vuln list --group-by severity --json > severity-summary.json
```
### 9.2 SLA Compliance
```bash
# Find critical vulns older than 30 days
stella vuln list \
  --severity critical \
  --status open \
  --json | jq '.items[] | select(.updatedAt < (now - 2592000 | todate))'
```
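The string comparison above is sound because ISO-8601 UTC timestamps sort lexicographically. The filter can be sanity-checked locally against sample data (the two records below are made up; the added `floor` guards against fractional `now` values on some jq builds):

```shell
# Verify the 30-day age filter against fabricated sample data. ISO-8601 UTC
# timestamps compare correctly as plain strings, so jq's `<` does the right thing.
cat > /tmp/sample-vulns.json <<'EOF'
{"items":[
  {"id":"CVE-2020-0001","updatedAt":"2020-01-01T00:00:00Z"},
  {"id":"CVE-2099-0002","updatedAt":"2099-01-01T00:00:00Z"}
]}
EOF
# Only the record last updated more than 30 days (2592000 s) ago should survive.
jq -r '.items[] | select(.updatedAt < (now - 2592000 | floor | todate)) | .id' \
  /tmp/sample-vulns.json
```

Run today, this prints only `CVE-2020-0001`.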
### 9.3 Risk Acceptance Audit
```bash
# Export all risk-accepted vulnerabilities with justifications
stella vuln list --status risk_accepted --json > risk-accepted.json
```
### 9.4 Evidence Bundle for Audit
```bash
# Export signed evidence bundle
stella vuln export \
  --output audit-$(date +%Y%m%d).ndjson \
  --signed
# Verify bundle integrity
stella vuln export verify audit-$(date +%Y%m%d).ndjson
```
---
## 10. CI Pipeline Snippets
### 10.1 GitHub Actions
```yaml
- name: Check Critical Vulnerabilities
  run: |
    count=$(stella vuln list --severity critical --status open --json | jq '.total')
    if [ "$count" -gt 0 ]; then
      echo "::error::Found $count critical open vulnerabilities"
      stella vuln list --severity critical --status open
      exit 1
    fi

- name: Policy Simulation Gate
  run: |
    stella vuln simulate \
      --policy-id ${{ env.POLICY_ID }} \
      --changed-only \
      --markdown \
      --output ${{ github.workspace }}/simulation.md
    cat ${{ github.workspace }}/simulation.md >> $GITHUB_STEP_SUMMARY
```
### 10.2 GitLab CI
```yaml
vuln-check:
  script:
    - stella auth login --token $STELLA_TOKEN
    - |
      if stella vuln list --severity critical --status open --json | jq -e '.total > 0'; then
        echo "Critical vulnerabilities found!"
        stella vuln list --severity critical --status open
        exit 1
      fi
  artifacts:
    reports:
      dotenv: vuln-status.env
```
### 10.3 Jenkins Pipeline
```groovy
stage('Vulnerability Check') {
    steps {
        sh '''
            stella vuln list \
                --severity critical \
                --severity high \
                --status open \
                --csv > vulns.csv
        '''
        archiveArtifacts artifacts: 'vulns.csv'
        script {
            def count = sh(
                script: "stella vuln list --severity critical --status open --json | jq '.total'",
                returnStdout: true
            ).trim().toInteger()
            if (count > 0) {
                error("Found ${count} critical vulnerabilities")
            }
        }
    }
}
```
---
## 11. Offline Operation
When operating in air-gapped environments:
1. Export vulnerability data before going offline:
```bash
stella vuln export --output vuln-bundle.ndjson --signed
```
2. Transfer bundle to air-gapped system.
3. Verify bundle integrity:
```bash
stella vuln export verify vuln-bundle.ndjson \
  --expected-digest sha256:...
```
For full offline kit support, see the [Offline Kit documentation](../../../24_OFFLINE_KIT.md).
---
## 12. Related Documentation
- [VEX Consensus CLI](./vex-cli.md) - VEX status management
- [Policy Simulation](../../policy/guides/simulation.md) - Policy testing
- [Authentication Guide](./auth-cli.md) - Token management
- [API Reference](../../../09_API_CLI_REFERENCE.md) - Full API documentation


@@ -0,0 +1,414 @@
# Surface.FS Workflow Guide
> **Version:** 1.0 (2025-11-28)
>
> **Audience:** Scanner Worker/WebService integrators, Zastava operators, Offline Kit builders
## Overview
Surface.FS provides a content-addressable storage layer for Scanner-derived artefacts. This guide covers the end-to-end workflow from artefact generation to consumption, including offline bundle handling.
## Workflow Stages
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Scanner Worker  │───▶│   Surface.FS    │───▶│   Consumers     │
│ - Scan image    │    │ - Store manifest│    │ - WebService    │
│ - Generate      │    │ - Store payload │    │ - Zastava       │
│   artefacts     │    │ - Local cache   │    │ - CLI           │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Generate:       │    │ Store:          │    │ Consume:        │
│ - Layer frags   │    │ - RustFS/S3     │    │ - Report API    │
│ - EntryTrace    │    │ - Local disk    │    │ - Drift detect  │
│ - SBOM frags    │    │ - Offline kit   │    │ - Rescan plan   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
## Stage 1: Artefact Generation (Scanner Worker)
### 1.1 Configure Surface.FS
```csharp
// In Scanner Worker startup
builder.Services.AddSurfaceFileCache();
builder.Services.AddSurfaceManifestStore();
```
Environment variables (see [Surface.Env guide](../design/surface-env.md)):
```bash
SCANNER_SURFACE_FS_ENDPOINT=http://rustfs:8080
SCANNER_SURFACE_FS_BUCKET=surface-cache
SCANNER_SURFACE_CACHE_ROOT=/var/lib/stellaops/surface
SCANNER_SURFACE_TENANT=default
```
### 1.2 Generate and Publish Artefacts
```csharp
public async Task<ScanResult> ExecuteScanAsync(ScanJob job, CancellationToken ct)
{
    // 1. Run analyzers to generate artefacts
    var layerFragments = await AnalyzeLayersAsync(job.Image, ct);
    var entryTrace = await AnalyzeEntryPointsAsync(job.Image, ct);
    var sbomFragments = await GenerateSbomAsync(job.Image, ct);

    // 2. Create manifest document
    var manifest = new SurfaceManifestDocument
    {
        Schema = "stellaops.surface.manifest@1",
        Tenant = _environment.Settings.Tenant,
        ImageDigest = job.Image.Digest,
        ScanId = job.Id,
        GeneratedAt = DateTimeOffset.UtcNow,
        Source = new SurfaceManifestSource
        {
            Component = "scanner.worker",
            Version = _version,
            WorkerInstance = Environment.MachineName,
            Attempt = job.Attempt
        },
        Artifacts = new List<SurfaceManifestArtifact>()
    };

    // 3. Add artefacts to manifest
    foreach (var fragment in layerFragments)
    {
        var payloadUri = await _manifestWriter.StorePayloadAsync(
            fragment.Content,
            "layer.fragments",
            ct);
        manifest.Artifacts.Add(new SurfaceManifestArtifact
        {
            Kind = "layer.fragments",
            Uri = payloadUri,
            Digest = fragment.Digest,
            MediaType = "application/vnd.stellaops.layer-fragments+json",
            Format = "json",
            SizeBytes = fragment.Content.Length
        });
    }

    // 4. Publish manifest
    var result = await _manifestWriter.PublishAsync(manifest, ct);
    _logger.LogInformation(
        "Published manifest {Digest} with {Count} artefacts",
        result.Digest,
        manifest.Artifacts.Count);

    return new ScanResult
    {
        ManifestUri = result.Uri,
        ManifestDigest = result.Digest
    };
}
```
### 1.3 Cache EntryTrace Results
```csharp
public async Task<EntryTraceGraph?> GetOrComputeEntryTraceAsync(
    ImageReference image,
    EntryTraceOptions options,
    CancellationToken ct)
{
    // Create deterministic cache key
    var cacheKey = new SurfaceCacheKey(
        @namespace: "entrytrace.graph",
        tenant: _environment.Settings.Tenant,
        digest: ComputeOptionsHash(options, image.Digest));

    // Try cache first
    var cached = await _cache.TryGetAsync<EntryTraceGraph>(cacheKey, ct);
    if (cached is not null)
    {
        _logger.LogDebug("EntryTrace cache hit for {Key}", cacheKey);
        return cached;
    }

    // Compute and cache
    var graph = await ComputeEntryTraceAsync(image, options, ct);
    await _cache.SetAsync(cacheKey, graph, ct);
    return graph;
}
```
## Stage 2: Storage (Surface.FS)
### 2.1 Manifest Storage Layout
```
<bucket>/
├── manifests/
│   └── <tenant>/
│       └── <digest[0..1]>/
│           └── <digest[2..3]>/
│               └── <digest>.json
└── payloads/
    └── <tenant>/
        └── <kind>/
            └── sha256/
                └── <digest[0..1]>/
                    └── <digest[2..3]>/
                        └── <digest>.json.zst
```
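The sharded object key for a payload can be derived mechanically from its content digest. A minimal sketch of the layout above, with illustrative tenant, kind, and digest values:

```shell
# Derive the sharded payload key from a content digest, mirroring the layout
# above. The tenant, kind, and digest values are illustrative placeholders.
tenant="acme"
kind="layer.fragments"
digest="abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789"
shard1=$(printf '%s' "$digest" | cut -c1-2)   # first digest byte, "ab"
shard2=$(printf '%s' "$digest" | cut -c3-4)   # second digest byte, "cd"
echo "payloads/${tenant}/${kind}/sha256/${shard1}/${shard2}/${digest}.json.zst"
```

Sharding on the first two digest bytes keeps any single directory from accumulating an unbounded number of objects.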
### 2.2 Local Cache Layout
```
<cache_root>/
├── manifests/          # Manifest JSON files
│   └── <tenant>/...
├── cache/              # Hot artefacts
│   └── <namespace>/
│       └── <tenant>/
│           └── <digest>
└── temp/               # In-progress writes
```
### 2.3 Manifest URI Format
```
cas://<bucket>/<prefix>/<tenant>/<digest[0..1]>/<digest[2..3]>/<digest>.json
```
Example:
```
cas://surface-cache/manifests/acme/ab/cd/abcdef0123456789...json
```
## Stage 3: Consumption
### 3.1 WebService API
```http
GET /api/v1/scans/{id}
```
Response includes Surface manifest pointer:
```json
{
  "id": "scan-1234",
  "status": "completed",
  "surface": {
    "manifestUri": "cas://surface-cache/manifests/acme/ab/cd/...",
    "manifestDigest": "sha256:abcdef...",
    "artifacts": [
      {
        "kind": "layer.fragments",
        "uri": "cas://surface-cache/payloads/acme/layer.fragments/...",
        "digest": "sha256:123456...",
        "mediaType": "application/vnd.stellaops.layer-fragments+json"
      }
    ]
  }
}
```
### 3.2 Zastava Drift Detection
```csharp
public async Task<DriftResult> DetectDriftAsync(
    string imageDigest,
    CancellationToken ct)
{
    // 1. Fetch baseline manifest
    var manifestUri = await _surfacePointerService.GetManifestUriAsync(imageDigest, ct);
    var manifest = await _manifestReader.TryGetByUriAsync(manifestUri, ct);
    if (manifest is null)
    {
        return DriftResult.NoBaseline();
    }

    // 2. Get EntryTrace artefact
    var entryTraceArtifact = manifest.Artifacts
        .FirstOrDefault(a => a.Kind == "entrytrace.graph");
    if (entryTraceArtifact is null)
    {
        return DriftResult.NoEntryTrace();
    }

    // 3. Compare with runtime
    var baseline = await _payloadStore.GetAsync<EntryTraceGraph>(
        entryTraceArtifact.Uri, ct);
    var runtime = await _runtimeCollector.CollectAsync(ct);
    return CompareGraphs(baseline, runtime);
}
```
### 3.3 Scheduler Rescan Planning
```csharp
public async Task<RescanPlan> CreateRescanPlanAsync(
    string imageDigest,
    CancellationToken ct)
{
    // 1. Read manifest to understand what was scanned
    var manifest = await _manifestReader.TryGetByDigestAsync(imageDigest, ct);
    if (manifest is null || IsExpired(manifest))
    {
        return RescanPlan.FullRescan();
    }

    // 2. Check for layer changes
    var layerArtifact = manifest.Artifacts
        .FirstOrDefault(a => a.Kind == "layer.fragments");
    if (layerArtifact is not null)
    {
        var layers = await _payloadStore.GetAsync<LayerFragments>(
            layerArtifact.Uri, ct);
        var changedLayers = await DetectChangedLayersAsync(layers, ct);
        if (changedLayers.Any())
        {
            return RescanPlan.IncrementalRescan(changedLayers);
        }
    }

    return RescanPlan.NoRescanNeeded();
}
```
## Offline Kit Workflow
### Export (Online Environment)
```bash
# 1. Build offline kit with Surface manifests
python ops/offline-kit/build_offline_kit.py \
--version 2025.10.0 \
--include-surface-manifests \
--output-dir out/offline-kit
# 2. Kit structure includes:
# offline/
# surface/
# manifests/
# <tenant>/<digest[0..1]>/<digest[2..3]>/<digest>.json
# payloads/
# <tenant>/<kind>/sha256/<digest[0..1]>/<digest[2..3]>/<digest>.json.zst
# manifest-index.json
```
### Import (Air-Gapped Environment)
```csharp
public async Task ImportOfflineKitAsync(
    string kitPath,
    CancellationToken ct)
{
    var surfacePath = Path.Combine(kitPath, "surface");
    var indexPath = Path.Combine(surfacePath, "manifest-index.json");
    var index = await LoadIndexAsync(indexPath, ct);

    foreach (var entry in index.Manifests)
    {
        // 1. Load and verify manifest
        var manifestPath = Path.Combine(surfacePath, entry.RelativePath);
        var manifest = await LoadManifestAsync(manifestPath, ct);

        // 2. Verify digest
        var computedDigest = ComputeDigest(manifest);
        if (computedDigest != entry.Digest)
        {
            throw new InvalidOperationException(
                $"Manifest digest mismatch: expected {entry.Digest}, got {computedDigest}");
        }

        // 3. Import via Surface.FS API
        await _manifestWriter.PublishAsync(manifest, ct);
        _logger.LogInformation(
            "Imported manifest {Digest} for image {Image}",
            entry.Digest,
            manifest.ImageDigest);
    }

    // 4. Import payloads
    foreach (var payload in index.Payloads)
    {
        var payloadPath = Path.Combine(surfacePath, payload.RelativePath);
        await _payloadStore.ImportAsync(payloadPath, payload.Uri, ct);
    }
}
```
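The digest check in step 2 can also be run out-of-band, e.g. in a transfer-verification script before invoking the importer. A minimal sketch with a throwaway file (the file and its "expected" digest are fabricated on the spot; in practice the expected value comes from the `manifest-index.json` entry):

```shell
# Verify a manifest file's sha256 digest before import. The file and its
# expected digest are created on the fly here for illustration; a real check
# would read "expected" from the corresponding manifest-index.json entry.
manifest=/tmp/example-manifest.json
printf '{"schema":"stellaops.surface.manifest@1"}' > "$manifest"
expected="sha256:$(sha256sum "$manifest" | cut -d' ' -f1)"

actual="sha256:$(sha256sum "$manifest" | cut -d' ' -f1)"
if [ "$expected" != "$actual" ]; then
  echo "digest mismatch: expected $expected, got $actual" >&2
  exit 1
fi
echo "digest OK"
```

The `sha256:` prefix matches the digest convention used by the manifests shown earlier in this guide.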
### Offline Operation
Once imported, Surface.FS consumers operate normally:
```csharp
// Same code works online and offline
var manifest = await _manifestReader.TryGetByUriAsync(manifestUri, ct);
var payload = await _payloadStore.GetAsync(artifact.Uri, ct);
```
## Configuration Reference
### SurfaceManifestStoreOptions
| Option | Default | Description |
|--------|---------|-------------|
| `Bucket` | `surface-cache` | Object store bucket |
| `ManifestPrefix` | `manifests` | Prefix for manifest objects |
| `PayloadPrefix` | `payloads` | Prefix for payload objects |
| `LocalManifestRoot` | `<cache>/manifests` | Local manifest directory |
### SurfaceCacheOptions
| Option | Default | Description |
|--------|---------|-------------|
| `Root` | `<temp>/stellaops/surface` | Cache root directory |
| `QuotaMegabytes` | `4096` | Cache size limit |
| `EvictionThreshold` | `0.9` | Trigger eviction at 90% quota |
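With the defaults above, eviction starts once the cache holds roughly 4096 × 0.9 ≈ 3686 MB. A quick check of that arithmetic:

```shell
# Eviction trigger point in MB for the default quota and threshold.
awk -v quota=4096 -v threshold=0.9 'BEGIN { printf "evict at %.0f MB\n", quota * threshold }'
```

This prints `evict at 3686 MB`; tune `QuotaMegabytes` rather than `EvictionThreshold` when sizing for a host.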
## Metrics
| Metric | Labels | Description |
|--------|--------|-------------|
| `surface_manifest_published_total` | `tenant`, `kind` | Manifests published |
| `surface_manifest_cache_hit_total` | `namespace`, `tenant` | Cache hits |
| `surface_manifest_publish_duration_ms` | `tenant` | Publish latency |
| `surface_payload_persisted_total` | `kind` | Payloads stored |
## Troubleshooting
### Manifest Not Found
1. Check tenant matches between writer and reader
2. Verify Surface.FS endpoint is reachable
3. Check bucket permissions
4. Review `surface_manifest_published_total` metric
### Cache Miss Despite Expected Hit
1. Verify cache key components match (namespace, tenant, digest)
2. Check cache quota - eviction may have occurred
3. Review `surface_manifest_cache_hit_total` metric
### Offline Import Failures
1. Verify manifest digest matches index
2. Check file permissions on import path
3. Ensure Surface.FS endpoint is writable
4. Review import logs for specific errors
## References
- [Surface.FS Design](../design/surface-fs.md)
- [Surface.Env Design](../design/surface-env.md)
- [Surface.Validation Guide](./surface-validation-extensibility.md)
- [Offline Kit Documentation](../../../../24_OFFLINE_KIT.md)


@@ -0,0 +1,455 @@
# Surface.Validation Extensibility Guide
> **Version:** 1.0 (2025-11-28)
>
> **Audience:** Scanner Worker/WebService integrators, custom analyzer developers, Zastava contributors
## Overview
Surface.Validation provides a pluggable validator framework for ensuring configuration and data preconditions before performing scanner work. This guide covers how to extend the validation system with custom validators, customize reporting, and integrate validation into your components.
## Quick Start
### Basic Registration
```csharp
// In Program.cs or your DI configuration
builder.Services.AddSurfaceValidation();
```
This registers the default validators:
- `SurfaceEndpointValidator` - Validates Surface.FS endpoint and bucket
- `SurfaceCacheValidator` - Validates cache directory writability and quota
- `SurfaceSecretsValidator` - Validates secrets provider configuration
### Adding Custom Validators
```csharp
builder.Services.AddSurfaceValidation(validation =>
{
    validation.AddValidator<MyCustomValidator>();
    validation.AddValidator<AnotherValidator>();
});
```
## Writing Custom Validators
### Validator Interface
Implement `ISurfaceValidator` to create a custom validator:
```csharp
public interface ISurfaceValidator
{
    ValueTask<SurfaceValidationResult> ValidateAsync(
        SurfaceValidationContext context,
        CancellationToken cancellationToken = default);
}
```
### Example: Registry Credentials Validator
```csharp
public sealed class RegistryCredentialsValidator : ISurfaceValidator
{
    private readonly IHttpClientFactory _httpClientFactory;

    public RegistryCredentialsValidator(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async ValueTask<SurfaceValidationResult> ValidateAsync(
        SurfaceValidationContext context,
        CancellationToken cancellationToken = default)
    {
        ArgumentNullException.ThrowIfNull(context);
        var issues = new List<SurfaceValidationIssue>();

        // Access secrets configuration from context
        var secrets = context.Environment.Secrets;
        if (secrets.Provider == "file" && string.IsNullOrEmpty(secrets.Root))
        {
            issues.Add(SurfaceValidationIssue.Error(
                "REGISTRY_SECRETS_ROOT_MISSING",
                "Registry secrets root path is not configured.",
                "Set SCANNER_SURFACE_SECRETS_ROOT to the secrets directory."));
        }

        // Access custom properties passed during validation
        if (context.Properties.TryGetValue("registryEndpoint", out var endpoint))
        {
            var reachable = await CheckEndpointAsync(endpoint?.ToString(), cancellationToken);
            if (!reachable)
            {
                issues.Add(SurfaceValidationIssue.Warning(
                    "REGISTRY_ENDPOINT_UNREACHABLE",
                    $"Registry endpoint {endpoint} is not reachable.",
                    "Verify network connectivity to the container registry."));
            }
        }

        return issues.Count == 0
            ? SurfaceValidationResult.Success()
            : SurfaceValidationResult.FromIssues(issues);
    }

    private async Task<bool> CheckEndpointAsync(string? endpoint, CancellationToken ct)
    {
        if (string.IsNullOrEmpty(endpoint)) return true;
        try
        {
            var client = _httpClientFactory.CreateClient();
            client.Timeout = TimeSpan.FromMilliseconds(500); // Keep validations fast
            var response = await client.GetAsync(endpoint, ct);
            return response.IsSuccessStatusCode;
        }
        catch
        {
            return false;
        }
    }
}
```
### Best Practices for Validators
1. **Keep validations fast** - Target < 500ms per validator to avoid blocking startup
2. **Use appropriate severity levels**:
- `Error` - Fatal misconfiguration that prevents operation
- `Warning` - Suboptimal configuration that may cause issues
- `Info` - Informational notices
3. **Provide actionable hints** - Include remediation steps in the hint parameter
4. **Access services via context** - Use `context.Services.GetService<T>()` for DI
5. **Check cancellation tokens** - Honor cancellation for async operations
## Validation Context
### Creating Context with Properties
```csharp
var context = SurfaceValidationContext.Create(
    serviceProvider,
    componentName: "Scanner.Worker",
    environment: surfaceEnvironment,
    properties: new Dictionary<string, object?>
    {
        ["jobId"] = currentJob.Id,
        ["imageDigest"] = image.Digest,
        ["configPath"] = "/etc/scanner/config.yaml"
    });
```
### Accessing Context in Validators
```csharp
public ValueTask<SurfaceValidationResult> ValidateAsync(
    SurfaceValidationContext context,
    CancellationToken cancellationToken = default)
{
    // Access environment settings
    var endpoint = context.Environment.SurfaceFsEndpoint;
    var bucket = context.Environment.SurfaceFsBucket;
    var tenant = context.Environment.Tenant;

    // Access custom properties
    if (context.Properties.TryGetValue("imageDigest", out var digest))
    {
        // Validate specific to this image
    }

    // Access DI services
    var logger = context.Services.GetService<ILogger<MyValidator>>();

    // ... (validation logic elided)
    return ValueTask.FromResult(SurfaceValidationResult.Success());
}
```
## Running Validators
### Using the Validator Runner
```csharp
public class MyService
{
    private readonly ISurfaceValidatorRunner _runner;
    private readonly ISurfaceEnvironment _environment;
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<MyService> _logger;

    public MyService(
        ISurfaceValidatorRunner runner,
        ISurfaceEnvironment environment,
        IServiceProvider serviceProvider,
        ILogger<MyService> logger)
    {
        _runner = runner;
        _environment = environment;
        _serviceProvider = serviceProvider;
        _logger = logger;
    }

    public async Task ExecuteAsync(CancellationToken ct)
    {
        var context = SurfaceValidationContext.Create(
            _serviceProvider,
            "MyService",
            _environment.Settings);

        // Option 1: Get results and handle manually
        var result = await _runner.RunAllAsync(context, ct);
        if (!result.IsSuccess)
        {
            foreach (var issue in result.Issues.Where(i => i.Severity == SurfaceValidationSeverity.Error))
            {
                _logger.LogError("Validation failed: {Code} - {Message}", issue.Code, issue.Message);
            }
            return;
        }

        // Option 2: Throw on failure (respects options)
        await _runner.EnsureAsync(context, ct);

        // Continue with work...
    }
}
```
## Custom Reporting
### Implementing a Reporter
```csharp
public sealed class MetricsSurfaceValidationReporter : ISurfaceValidationReporter
{
    private readonly IMetricsFactory _metrics;

    public MetricsSurfaceValidationReporter(IMetricsFactory metrics)
    {
        _metrics = metrics;
    }

    public void Report(SurfaceValidationContext context, SurfaceValidationResult result)
    {
        var counter = _metrics.CreateCounter<long>("surface_validation_issues_total");
        foreach (var issue in result.Issues)
        {
            counter.Add(1, new KeyValuePair<string, object?>[]
            {
                new("code", issue.Code),
                new("severity", issue.Severity.ToString().ToLowerInvariant()),
                new("component", context.ComponentName)
            });
        }
    }
}
```
### Registering Custom Reporters
```csharp
// Replace the default reporter
builder.Services.AddSingleton<ISurfaceValidationReporter, MetricsSurfaceValidationReporter>();

// Or add alongside the default (composite pattern, e.g. via Scrutor's Decorate)
builder.Services.Decorate<ISurfaceValidationReporter>((inner, sp) =>
    new CompositeSurfaceValidationReporter(
        inner,
        sp.GetRequiredService<MetricsSurfaceValidationReporter>()));
```
## Configuration Options
### SurfaceValidationOptions
| Option | Default | Description |
|--------|---------|-------------|
| `ThrowOnFailure` | `true` | Whether `EnsureAsync()` throws on validation failure |
| `ContinueOnError` | `false` | Whether to continue running validators after first error |
Configure via `IConfiguration`:
```json
{
  "Surface": {
    "Validation": {
      "ThrowOnFailure": true,
      "ContinueOnError": false
    }
  }
}
```
Or programmatically:
```csharp
builder.Services.Configure<SurfaceValidationOptions>(options =>
{
    options.ThrowOnFailure = true;
    options.ContinueOnError = true; // Useful for diagnostics
});
```
## Issue Codes
### Standard Codes
| Code | Severity | Validator |
|------|----------|-----------|
| `SURFACE_ENV_MISSING_ENDPOINT` | Error | SurfaceEndpointValidator |
| `SURFACE_FS_BUCKET_MISSING` | Error | SurfaceEndpointValidator |
| `SURFACE_ENV_CACHE_DIR_UNWRITABLE` | Error | SurfaceCacheValidator |
| `SURFACE_ENV_CACHE_QUOTA_INVALID` | Error | SurfaceCacheValidator |
| `SURFACE_SECRET_PROVIDER_UNKNOWN` | Error | SurfaceSecretsValidator |
| `SURFACE_SECRET_CONFIGURATION_MISSING` | Error | SurfaceSecretsValidator |
| `SURFACE_ENV_TENANT_MISSING` | Error | SurfaceSecretsValidator |
### Custom Issue Codes
Follow the naming convention: `<SUBSYSTEM>_<COMPONENT>_<ISSUE>`
```csharp
public static class MyValidationCodes
{
    public const string RegistrySecretsRootMissing = "REGISTRY_SECRETS_ROOT_MISSING";
    public const string RegistryEndpointUnreachable = "REGISTRY_ENDPOINT_UNREACHABLE";
    public const string CacheWarmupFailed = "CACHE_WARMUP_FAILED";
}
```
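The convention can also be enforced mechanically in a lint step. The regex below is one reasonable encoding of `<SUBSYSTEM>_<COMPONENT>_<ISSUE>` (at least three uppercase segments), not an official schema:

```shell
# Check candidate issue codes against the SUBSYSTEM_COMPONENT_ISSUE convention:
# uppercase alphanumeric segments joined by underscores, three segments or more.
# This regex is an illustrative encoding of the convention, not an official schema.
code_ok() {
  printf '%s\n' "$1" | grep -Eq '^[A-Z][A-Z0-9]*(_[A-Z][A-Z0-9]*){2,}$'
}

code_ok "REGISTRY_SECRETS_ROOT_MISSING" && echo "accepted"
code_ok "registryEndpointUnreachable" || echo "rejected"
```

Running the two checks prints `accepted` then `rejected`, since the second candidate uses camelCase rather than the underscore convention.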
## Integration Examples
### Scanner Worker Startup
```csharp
// In hosted service
public async Task StartAsync(CancellationToken ct)
{
    var context = SurfaceValidationContext.Create(
        _services,
        "Scanner.Worker",
        _surfaceEnv.Settings);

    try
    {
        await _validatorRunner.EnsureAsync(context, ct);
        _logger.LogInformation("Surface validation passed");
    }
    catch (SurfaceValidationException ex)
    {
        _logger.LogCritical(ex, "Surface validation failed; worker cannot start");
        throw;
    }
}
```
### Per-Scan Validation
```csharp
public async Task<ScanResult> ScanImageAsync(ImageReference image, CancellationToken ct)
{
    var context = SurfaceValidationContext.Create(
        _services,
        "Scanner.Analyzer",
        _surfaceEnv.Settings,
        new Dictionary<string, object?>
        {
            ["imageDigest"] = image.Digest,
            ["imageReference"] = image.Reference
        });

    var result = await _validatorRunner.RunAllAsync(context, ct);
    if (result.HasErrors)
    {
        return ScanResult.Failed(result.Issues.Select(i => i.Message));
    }

    // Proceed with scan...
}
```
### Zastava Webhook Readiness
```csharp
app.MapGet("/readyz", async (ISurfaceValidatorRunner runner, ISurfaceEnvironment env) =>
{
    var context = SurfaceValidationContext.Create(
        app.Services,
        "Zastava.Webhook",
        env.Settings);

    var result = await runner.RunAllAsync(context);
    if (!result.IsSuccess)
    {
        return Results.Json(new
        {
            status = "unhealthy",
            issues = result.Issues.Select(i => new { i.Code, i.Message, i.Hint })
        }, statusCode: 503);
    }

    return Results.Ok(new { status = "healthy" });
});
```
## Testing Validators
### Unit Testing
```csharp
[Fact]
public async Task Validator_MissingEndpoint_ReturnsError()
{
    // Arrange
    var settings = new SurfaceEnvironmentSettings(
        SurfaceFsEndpoint: new Uri("https://surface.invalid"),
        SurfaceFsBucket: "",
        // ... other settings
    );
    var context = SurfaceValidationContext.Create(
        new ServiceCollection().BuildServiceProvider(),
        "Test",
        settings);
    var validator = new SurfaceEndpointValidator();

    // Act
    var result = await validator.ValidateAsync(context);

    // Assert
    Assert.False(result.IsSuccess);
    Assert.Contains(result.Issues, i => i.Code == SurfaceValidationIssueCodes.SurfaceEndpointMissing);
}
```
### Integration Testing
```csharp
[Fact]
public async Task ValidationRunner_AllValidatorsExecute()
{
    // Arrange
    var services = new ServiceCollection();
    services.AddSurfaceValidation(builder =>
    {
        builder.AddValidator<TestValidator1>();
        builder.AddValidator<TestValidator2>();
    });
    var provider = services.BuildServiceProvider();
    var runner = provider.GetRequiredService<ISurfaceValidatorRunner>();
    var context = SurfaceValidationContext.Create(
        provider,
        "IntegrationTest",
        CreateValidSettings());

    // Act
    var result = await runner.RunAllAsync(context);

    // Assert
    Assert.True(result.IsSuccess);
}
```
## References
- [Surface.Validation Design](../design/surface-validation.md)
- [Surface.Env Design](../design/surface-env.md)
- [Surface.Secrets Schema](../design/surface-secrets-schema.md)