# SPRINT_1229_001_BE: SBOM Sources Foundation

## Executive Summary

This sprint establishes the **backend foundation** for unified SBOM source management across all scanner types: Zastava (registry webhooks), Docker Scanner (direct image scans), CLI Scanner (external submissions), and Git/Sources Scanner (repository scans).

**Working Directory:** `src/Scanner/`, `src/Orchestrator/`, `src/SbomService/`

**Module:** BE (Backend)

**Dependencies:** Orchestrator Source model, Authority (credentials), Scanner WebService

---

## Problem Statement

Currently, StellaOps has fragmented source management:

- **Orchestrator Sources** - General job producers (advisory, vex, sbom types)
- **Concelier Connectors** - Advisory-specific with heartbeat/command protocol
- **SBOM Provenance** - Attribution only (tool, version, CI context)
- **Scanner Jobs** - No source configuration, just ad-hoc submissions
- **Zastava** - Webhook receiver with no UI-based configuration

**Gap:** No unified way to configure, manage, and monitor SBOM ingestion sources.


---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            SBOM Sources Manager                             │
│                                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │   Zastava   │  │   Docker    │  │     CLI     │  │     Git/Sources     │ │
│  │  (Registry  │  │   (Direct   │  │  (External  │  │     (Repository     │ │
│  │  Webhooks)  │  │   Image)    │  │ Submission) │  │       Scans)        │ │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘ │
│         │                │                │                    │            │
│         └────────────────┴────────────────┴────────────────────┘            │
│                                     │                                       │
│                           ┌─────────▼─────────┐                             │
│                           │    SbomSource     │                             │
│                           │   Domain Model    │                             │
│                           └─────────┬─────────┘                             │
│                                     │                                       │
│          ┌──────────────────────────┼──────────────────────────┐            │
│          │                          │                          │            │
│   ┌──────▼──────┐  ┌────────────────▼────────────────┐  ┌──────▼──────┐     │
│   │ Credential  │  │          Configuration          │  │   Status    │     │
│   │    Vault    │  │         (Type-specific)         │  │  Tracking   │     │
│   │  (AuthRef)  │  │                                 │  │  & History  │     │
│   └─────────────┘  └─────────────────────────────────┘  └─────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                            Scan Trigger Service                             │
│                                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐     │
│  │   Webhook   │  │  Scheduled  │  │  On-Demand  │  │      Event      │     │
│  │   Handler   │  │   (Cron)    │  │  (Manual)   │  │   (Git Push)    │     │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                   Scanner → SBOM Service → Lineage Ledger                   │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Source Type Specifications

### 1. Zastava (Registry Webhook)

**Trigger:** Push webhooks from container registries

**Configuration:**

```typescript
interface ZastavaSourceConfig {
  registryType: 'dockerhub' | 'harbor' | 'quay' | 'ecr' | 'gcr' | 'acr' | 'ghcr' | 'generic';
  registryUrl: string;
  webhookPath: string;            // Generated: /api/v1/webhooks/zastava/{sourceId}
  webhookSecret: string;          // AuthRef, not inline
  filters: {
    repositories: string[];       // Glob patterns: ["myorg/*", "prod-*"]
    tags: string[];               // Glob patterns: ["v*", "latest", "!*-dev"]
    excludeRepositories?: string[];
    excludeTags?: string[];
  };
  scanOptions: {
    analyzers: string[];          // ["os", "lang.node", "lang.python"]
    enableReachability: boolean;
    enableVexLookup: boolean;
  };
}
```

**Credentials (AuthRef):**

- `registry.{sourceId}.username`
- `registry.{sourceId}.password` or `.token`
- `registry.{sourceId}.webhookSecret`

### 2. Docker Scanner (Direct Image)

**Trigger:** Scheduled (cron) or on-demand

**Configuration:**

```typescript
interface DockerSourceConfig {
  registryUrl: string;
  images: ImageSpec[];
  schedule?: {
    cron: string;                 // "0 2 * * *" (daily at 2am)
    timezone: string;             // "UTC"
  };
  scanOptions: {
    analyzers: string[];
    enableReachability: boolean;
    enableVexLookup: boolean;
    platforms?: string[];         // ["linux/amd64", "linux/arm64"]
  };
}

interface ImageSpec {
  reference: string;              // "nginx:latest" or "myrepo/app:v1.2.3"
  tagPatterns?: string[];         // Scan matching tags: ["v*", "release-*"]
  digestPin?: boolean;            // Pin to specific digest after first scan
}
```

**Credentials (AuthRef):**

- `registry.{sourceId}.username`
- `registry.{sourceId}.password`

### 3. CLI Scanner (External Submission)

**Trigger:** External CLI invocations with API token

**Configuration:**

```typescript
interface CliSourceConfig {
  allowedTools: string[];         // ["stella-cli", "trivy", "syft"]
  allowedCiSystems?: string[];    // ["github-actions", "gitlab-ci", "jenkins"]
  validation: {
    requireSignedSbom: boolean;
    allowedSigners?: string[];    // Public key fingerprints
    maxSbomSizeBytes: number;
    allowedFormats: ('spdx-json' | 'cyclonedx-json' | 'cyclonedx-xml')[];
  };
  attribution: {
    requireBuildId: boolean;
    requireRepository: boolean;
    requireCommitSha: boolean;
  };
}
```

**Credentials (AuthRef):**

- `cli.{sourceId}.apiToken` - Token for CLI authentication
- Scopes: `sbom:upload`, `scan:trigger`

### 4. Git/Sources Scanner (Repository)

**Trigger:** Webhook (push/PR), scheduled, or on-demand

**Configuration:**

```typescript
interface GitSourceConfig {
  provider: 'github' | 'gitlab' | 'bitbucket' | 'azure-devops' | 'gitea';
  repositoryUrl: string;
  branches: {
    include: string[];            // ["main", "release/*"]
    exclude?: string[];           // ["feature/*", "wip/*"]
  };
  triggers: {
    onPush: boolean;
    onPullRequest: boolean;
    onTag: boolean;
    tagPatterns?: string[];       // ["v*", "release-*"]
    scheduled?: {
      cron: string;
      timezone: string;
    };
  };
  scanOptions: {
    analyzers: string[];
    scanPaths?: string[];         // [".", "services/*"]
    excludePaths?: string[];      // ["vendor/", "node_modules/"]
    enableLockfileOnly: boolean;
    enableReachability: boolean;
  };
  webhookConfig?: {
    webhookPath: string;          // Generated
    webhookSecret: string;        // AuthRef
  };
}
```

**Credentials (AuthRef):**

- `git.{sourceId}.token` - PAT or OAuth token
- `git.{sourceId}.sshKey` - SSH private key (optional)
- `git.{sourceId}.webhookSecret`

---

## Domain Model

### SbomSource Entity

**File:** `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSource.cs`

```csharp
public sealed class SbomSource
{
    public Guid SourceId { get; init; }
    public string TenantId { get; init; } = null!;
    public string Name { get; init; } = null!;
    public string? Description { get; set; }
    public SbomSourceType SourceType { get; init; }
    public SbomSourceStatus Status { get; private set; }

    // Type-specific configuration (JSON)
    public JsonDocument Configuration { get; set; } = null!;

    // Credential reference (NOT the actual secret)
    public string? AuthRef { get; set; }

    // Webhook endpoint (generated for webhook-based sources)
    public string? WebhookEndpoint { get; private set; }
    public string? WebhookSecretRef { get; private set; }

    // Scheduling
    public string? CronSchedule { get; set; }
    public string? CronTimezone { get; set; }
    public DateTimeOffset? NextScheduledRun { get; private set; }

    // Status tracking
    public DateTimeOffset? LastRunAt { get; private set; }
    public SbomSourceRunStatus? LastRunStatus { get; private set; }
    public string? LastRunError { get; private set; }
    public int ConsecutiveFailures { get; private set; }

    // Pause/Resume
    public bool Paused { get; private set; }
    public string? PauseReason { get; private set; }
    public string? PauseTicket { get; private set; }
    public DateTimeOffset? PausedAt { get; private set; }
    public string? PausedBy { get; private set; }

    // Rate limiting
    public int? MaxScansPerHour { get; set; }
    public int? CurrentHourScans { get; private set; }
    public DateTimeOffset? HourWindowStart { get; private set; }

    // Audit
    public DateTimeOffset CreatedAt { get; init; }
    public string CreatedBy { get; init; } = null!;
    public DateTimeOffset UpdatedAt { get; private set; }
    public string UpdatedBy { get; private set; } = null!;

    // Tags for organization
    public List<string> Tags { get; set; } = [];

    // Metadata (custom key-value pairs)
    public Dictionary<string, string> Metadata { get; set; } = [];
}

public enum SbomSourceType
{
    Zastava,        // Registry webhook
    Docker,         // Direct image scan
    Cli,            // External CLI submission
    Git             // Git repository
}

public enum SbomSourceStatus
{
    Active,         // Operational
    Paused,         // Manually paused
    Error,          // Last run failed
    Disabled,       // Administratively disabled
    Pending         // Awaiting first run / validation
}

public enum SbomSourceRunStatus
{
    Succeeded,
    Failed,
    PartialSuccess, // Some items succeeded, some failed
    Skipped,        // No matching items
    Cancelled
}
```

### SbomSourceRun Entity (History)

**File:** `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSourceRun.cs`

```csharp
public sealed class SbomSourceRun
{
    public Guid RunId { get; init; }
    public Guid SourceId { get; init; }
    public string TenantId { get; init; } = null!;

    public SbomSourceRunTrigger Trigger { get; init; }
    public string? TriggerDetails { get; init; }    // Webhook payload digest, cron expression, etc.

    public SbomSourceRunStatus Status { get; private set; }
    public DateTimeOffset StartedAt { get; init; }
    public DateTimeOffset? CompletedAt { get; private set; }
    public long DurationMs => CompletedAt.HasValue
        ? (long)(CompletedAt.Value - StartedAt).TotalMilliseconds
        : 0;

    // Results
    public int ItemsDiscovered { get; private set; }
    public int ItemsScanned { get; private set; }
    public int ItemsSucceeded { get; private set; }
    public int ItemsFailed { get; private set; }
    public int ItemsSkipped { get; private set; }

    // Scan job references
    public List<Guid> ScanJobIds { get; init; } = [];

    // Error tracking
    public string? ErrorMessage { get; private set; }
    public string? ErrorStackTrace { get; private set; }

    // Correlation
    public string CorrelationId { get; init; } = null!;
}

public enum SbomSourceRunTrigger
{
    Scheduled,      // Cron-based
    Webhook,        // Registry push, git push
    Manual,         // User-initiated
    Backfill,       // Historical scan
    Retry           // Retry of failed run
}
```

---

## Task Breakdown

### T1: Domain Models & Contracts (DOING)

**Files to Create:**

- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSource.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSourceRun.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Contracts/SbomSourceContracts.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Contracts/SourceTypeConfigs.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/StellaOps.Scanner.Sources.csproj`

**Deliverables:**

- Domain entities with validation
- Configuration DTOs per source type
- Request/Response contracts for API

---

### T2: Repository & Persistence (TODO)

**Files to Create:**

- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/ISbomSourceRepository.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/SbomSourceRepository.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/SbomSourceRunRepository.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/Migrations/*.cs`

**Deliverables:**

- PostgreSQL persistence layer
- EF Core migrations for schema
- Query methods: list, get, create, update, delete
- Run history queries with pagination

---

### T3: Source Service & Business Logic (TODO)

**Files to Create:**

- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/ISbomSourceService.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SbomSourceService.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SourceConfigValidator.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SourceConnectionTester.cs`

**Deliverables:**

- CRUD operations with validation
- Configuration validation per source type
- Connection testing (registry auth, git auth)
- Pause/resume with audit trail
- Webhook endpoint generation

---

### T4: Credential Integration (TODO)

**Files to Create:**

- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/ISourceCredentialStore.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/AuthorityCredentialStore.cs`
- `src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/SourceCredentialModels.cs`

**Deliverables:**

- AuthRef pattern implementation
- Credential CRUD (store, retrieve, rotate)
- Integration with Authority service
- Secure credential handling (never log, never expose)

---

### T5: REST API Endpoints (TODO)

**Files to Create:**

- `src/Scanner/StellaOps.Scanner.WebService/Endpoints/SourceEndpoints.cs`
- `src/Scanner/StellaOps.Scanner.WebService/Endpoints/SourceRunEndpoints.cs`

**API Design:**

```
# Source Management
GET    /api/v1/sources                          # List sources (paginated, filtered)
POST   /api/v1/sources                          # Create source
GET    /api/v1/sources/{sourceId}               # Get source details
PUT    /api/v1/sources/{sourceId}               # Update source
DELETE /api/v1/sources/{sourceId}               # Delete source

# Source Actions
POST   /api/v1/sources/{sourceId}/test          # Test connection
POST   /api/v1/sources/{sourceId}/trigger       # Trigger manual scan
POST   /api/v1/sources/{sourceId}/pause         # Pause source
POST   /api/v1/sources/{sourceId}/resume        # Resume source

# Source Runs (History)
GET    /api/v1/sources/{sourceId}/runs          # List runs (paginated)
GET    /api/v1/sources/{sourceId}/runs/{runId}  # Get run details

# Webhook Endpoints (registered dynamically)
POST   /api/v1/webhooks/zastava/{sourceId}      # Registry webhook
POST   /api/v1/webhooks/git/{sourceId}          # Git webhook
```

**Authorization Scopes:**

- `sources:read` - List, get sources
- `sources:write` - Create, update, delete
- `sources:trigger` - Manual trigger
- `sources:admin` - Pause, resume, delete

---

### T6: Unit Tests (TODO)

**Files to Create:**

- `src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Domain/SbomSourceTests.cs`
- `src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Services/SbomSourceServiceTests.cs`
- `src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Services/SourceConfigValidatorTests.cs`
- `src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Persistence/SbomSourceRepositoryTests.cs`

---

## Database Schema

```sql
-- src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/Migrations/

CREATE TABLE scanner.sbom_sources (
    source_id            UUID PRIMARY KEY,
    tenant_id            TEXT NOT NULL,
    name                 TEXT NOT NULL,
    description          TEXT,
    source_type          TEXT NOT NULL,  -- 'zastava', 'docker', 'cli', 'git'
    status               TEXT NOT NULL DEFAULT 'pending',
    configuration        JSONB NOT NULL,
    auth_ref             TEXT,
    webhook_endpoint     TEXT,
    webhook_secret_ref   TEXT,
    cron_schedule        TEXT,
    cron_timezone        TEXT DEFAULT 'UTC',
    next_scheduled_run   TIMESTAMPTZ,
    last_run_at          TIMESTAMPTZ,
    last_run_status      TEXT,
    last_run_error       TEXT,
    consecutive_failures INT DEFAULT 0,
    paused               BOOLEAN DEFAULT FALSE,
    pause_reason         TEXT,
    pause_ticket         TEXT,
    paused_at            TIMESTAMPTZ,
    paused_by            TEXT,
    max_scans_per_hour   INT,
    current_hour_scans   INT DEFAULT 0,
    hour_window_start    TIMESTAMPTZ,
    created_at           TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_by           TEXT NOT NULL,
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_by           TEXT NOT NULL,
    tags                 TEXT[] DEFAULT '{}',
    metadata             JSONB DEFAULT '{}',

    CONSTRAINT uq_source_tenant_name UNIQUE (tenant_id, name)
);

CREATE INDEX idx_sources_tenant ON scanner.sbom_sources(tenant_id);
CREATE INDEX idx_sources_type ON scanner.sbom_sources(source_type);
CREATE INDEX idx_sources_status ON scanner.sbom_sources(status);
CREATE INDEX idx_sources_next_run ON scanner.sbom_sources(next_scheduled_run)
    WHERE next_scheduled_run IS NOT NULL;

CREATE TABLE scanner.sbom_source_runs (
    run_id            UUID PRIMARY KEY,
    source_id         UUID NOT NULL,
    tenant_id         TEXT NOT NULL,
    trigger           TEXT NOT NULL,  -- 'scheduled', 'webhook', 'manual', 'backfill', 'retry'
    trigger_details   TEXT,
    status            TEXT NOT NULL DEFAULT 'running',
    started_at        TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at      TIMESTAMPTZ,
    items_discovered  INT DEFAULT 0,
    items_scanned     INT DEFAULT 0,
    items_succeeded   INT DEFAULT 0,
    items_failed      INT DEFAULT 0,
    items_skipped     INT DEFAULT 0,
    scan_job_ids      UUID[] DEFAULT '{}',
    error_message     TEXT,
    error_stack_trace TEXT,
    correlation_id    TEXT NOT NULL,

    CONSTRAINT fk_run_source FOREIGN KEY (source_id)
        REFERENCES scanner.sbom_sources(source_id) ON DELETE CASCADE
);

CREATE INDEX idx_runs_source ON scanner.sbom_source_runs(source_id);
CREATE INDEX idx_runs_started ON scanner.sbom_source_runs(started_at DESC);
CREATE INDEX idx_runs_correlation ON scanner.sbom_source_runs(correlation_id);
```

---

## Delivery Tracker

| Task | Status | Assignee | Notes |
|------|--------|----------|-------|
| T1: Domain Models | DOING | | |
| T2: Repository & Persistence | TODO | | |
| T3: Source Service | TODO | | |
| T4: Credential Integration | TODO | | |
| T5: REST API Endpoints | TODO | | |
| T6: Unit Tests | TODO | | |

---

## Decisions & Risks

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Source library location | `StellaOps.Scanner.Sources` | Co-located with scanner, but separate library for clean separation |
| Configuration storage | JSONB in PostgreSQL | Flexible per-type config without schema changes |
| Credential pattern | AuthRef (reference) | Security: credentials never in source config, always in vault |
| Webhook endpoint | Dynamic generation | Per-source endpoints for isolation and revocation |

| Risk | Mitigation |
|------|------------|
| Credential exposure | AuthRef pattern, audit logging, never log credentials |
| Webhook secret leakage | Hashed comparison, rotate-on-demand |
| Configuration drift | Version tracking in metadata, audit trail |

---

## Next Sprint

**SPRINT_1229_002_BE_sbom-sources-triggers** - Trigger service implementation:

- Scheduler integration for cron-based sources
- Webhook handlers for Zastava and Git
- Manual trigger API
- Retry logic for failed runs
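
The webhook handlers planned for the next sprint will need to decide whether an incoming push matches a source's `filters` (the glob patterns in `ZastavaSourceConfig`). The sketch below illustrates one possible interpretation and is not the agreed design. The helper names `globToRegExp`, `matchesPatterns`, and `shouldScan` are hypothetical. It assumes `*` is the only wildcard and matches any run of characters (including `/`), that a leading `!` negates a pattern, and that an empty include list matches nothing.

```typescript
// Shape of the `filters` object from ZastavaSourceConfig.
interface SourceFilters {
  repositories: string[];
  tags: string[];
  excludeRepositories?: string[];
  excludeTags?: string[];
}

// Hypothetical helper: turn a glob with "*" wildcards into an anchored RegExp.
// Assumption: "*" matches any run of characters, including "/".
function globToRegExp(glob: string): RegExp {
  // Escape every regex metacharacter except "*", then expand "*" to ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// A value passes when at least one positive pattern matches and no negated
// ("!"-prefixed) pattern or explicit exclude pattern matches.
function matchesPatterns(value: string, patterns: string[], excludes: string[] = []): boolean {
  const includes = patterns.filter((p) => !p.startsWith("!"));
  const negated = patterns.filter((p) => p.startsWith("!")).map((p) => p.slice(1));
  const isIncluded = includes.some((p) => globToRegExp(p).test(value));
  const isExcluded = [...negated, ...excludes].some((p) => globToRegExp(p).test(value));
  return isIncluded && !isExcluded;
}

// Both the repository and the tag must pass their respective filters.
function shouldScan(filters: SourceFilters, repository: string, tag: string): boolean {
  return (
    matchesPatterns(repository, filters.repositories, filters.excludeRepositories) &&
    matchesPatterns(tag, filters.tags, filters.excludeTags)
  );
}

// Usage with the example patterns from the configuration comments.
const exampleFilters: SourceFilters = {
  repositories: ["myorg/*", "prod-*"],
  tags: ["v*", "latest", "!*-dev"],
};
console.log(shouldScan(exampleFilters, "myorg/api", "v1.2.3"));     // true
console.log(shouldScan(exampleFilters, "myorg/api", "v1.2.3-dev")); // false
```

Keeping the negated patterns and the explicit `exclude*` lists in one exclusion set means both styles behave identically, which may be simpler to validate in `SourceConfigValidator` than supporting two separate code paths.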