SPRINT_1229_001_BE: SBOM Sources Foundation

Executive Summary

This sprint establishes the backend foundation for unified SBOM source management across all scanner types: Zastava (registry webhooks), Docker Scanner (direct image scans), CLI Scanner (external submissions), and Git/Sources Scanner (repository scans).

Working Directory: src/Scanner/, src/Orchestrator/, src/SbomService/
Module: BE (Backend)
Dependencies: Orchestrator Source model, Authority (credentials), Scanner WebService


Problem Statement

Currently, StellaOps has fragmented source management:

  • Orchestrator Sources - General job producers (advisory, vex, sbom types)
  • Concelier Connectors - Advisory-specific with heartbeat/command protocol
  • SBOM Provenance - Attribution only (tool, version, CI context)
  • Scanner Jobs - No source configuration, just ad-hoc submissions
  • Zastava - Webhook receiver with no UI-based configuration

Gap: No unified way to configure, manage, and monitor SBOM ingestion sources.


Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                        SBOM Sources Manager                                  │
│                                                                             │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐   │
│  │   Zastava   │ │   Docker    │ │     CLI     │ │    Git/Sources      │   │
│  │  (Registry  │ │  (Direct    │ │  (External  │ │   (Repository       │   │
│  │  Webhooks)  │ │   Image)    │ │ Submission) │ │      Scans)         │   │
│  └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────────┬──────────┘   │
│         │               │               │                    │              │
│         └───────────────┴───────────────┴────────────────────┘              │
│                                    │                                        │
│                          ┌─────────▼─────────┐                              │
│                          │   SbomSource      │                              │
│                          │   Domain Model    │                              │
│                          └─────────┬─────────┘                              │
│                                    │                                        │
│         ┌──────────────────────────┼──────────────────────────┐            │
│         │                          │                          │            │
│  ┌──────▼──────┐  ┌────────────────▼────────────────┐  ┌──────▼──────┐    │
│  │  Credential │  │        Configuration            │  │   Status    │    │
│  │    Vault    │  │        (Type-specific)          │  │  Tracking   │    │
│  │  (AuthRef)  │  │                                 │  │  & History  │    │
│  └─────────────┘  └─────────────────────────────────┘  └─────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Scan Trigger Service                               │
│                                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐    │
│  │  Webhook    │  │  Scheduled  │  │  On-Demand  │  │    Event        │    │
│  │  Handler    │  │   (Cron)    │  │   (Manual)  │  │   (Git Push)    │    │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────────┘    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                Scanner → SBOM Service → Lineage Ledger                      │
└─────────────────────────────────────────────────────────────────────────────┘

Source Type Specifications

1. Zastava (Registry Webhook)

Trigger: Push webhooks from container registries

Configuration:

interface ZastavaSourceConfig {
  registryType: 'dockerhub' | 'harbor' | 'quay' | 'ecr' | 'gcr' | 'acr' | 'ghcr' | 'generic';
  registryUrl: string;
  webhookPath: string;         // Generated: /api/v1/webhooks/zastava/{sourceId}
  webhookSecret: string;       // AuthRef, not inline

  filters: {
    repositories: string[];    // Glob patterns: ["myorg/*", "prod-*"]
    tags: string[];            // Glob patterns: ["v*", "latest", "!*-dev"]
    excludeRepositories?: string[];
    excludeTags?: string[];
  };

  scanOptions: {
    analyzers: string[];       // ["os", "lang.node", "lang.python"]
    enableReachability: boolean;
    enableVexLookup: boolean;
  };
}
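The filter fields above could be evaluated roughly as follows. This is a minimal sketch under stated assumptions: `*` matches any run of characters (including `/`), a leading `!` in `repositories` or `tags` negates a pattern, and the `exclude*` lists are merged with those negated patterns. The names `globToRegExp`, `matchesPatterns`, and `shouldScan` are hypothetical and not part of the sprint's contracts.

```typescript
// Sketch only: the matching semantics are assumptions, not confirmed by the spec.
interface ZastavaFilters {
  repositories: string[];
  tags: string[];
  excludeRepositories?: string[];
  excludeTags?: string[];
}

// Hypothetical helper: escape regex metacharacters, then turn "*" into ".*".
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// A value passes when at least one positive pattern matches and no
// negated ("!pattern") or explicitly excluded pattern matches.
function matchesPatterns(value: string, patterns: string[], excludes: string[] = []): boolean {
  const positives = patterns.filter((p) => !p.startsWith("!"));
  const negatives = patterns
    .filter((p) => p.startsWith("!"))
    .map((p) => p.slice(1))
    .concat(excludes);
  const included = positives.some((p) => globToRegExp(p).test(value));
  const excluded = negatives.some((p) => globToRegExp(p).test(value));
  return included && !excluded;
}

// Decide whether a pushed image (repository + tag) should be scanned.
function shouldScan(filters: ZastavaFilters, repository: string, tag: string): boolean {
  return (
    matchesPatterns(repository, filters.repositories, filters.excludeRepositories) &&
    matchesPatterns(tag, filters.tags, filters.excludeTags)
  );
}
```

With `repositories: ["myorg/*", "prod-*"]` and `tags: ["v*", "latest", "!*-dev"]`, a push of `myorg/api:v1.2.3` would pass while `myorg/api:v1.2.3-dev` would be filtered out.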

Credentials (AuthRef):

  • registry.{sourceId}.username
  • registry.{sourceId}.password or .token
  • registry.{sourceId}.webhookSecret

2. Docker Scanner (Direct Image)

Trigger: Scheduled (cron) or on-demand

Configuration:

interface DockerSourceConfig {
  registryUrl: string;
  images: ImageSpec[];

  schedule?: {
    cron: string;              // "0 2 * * *" (daily at 2am)
    timezone: string;          // "UTC"
  };

  scanOptions: {
    analyzers: string[];
    enableReachability: boolean;
    enableVexLookup: boolean;
    platforms?: string[];      // ["linux/amd64", "linux/arm64"]
  };
}

interface ImageSpec {
  reference: string;           // "nginx:latest" or "myrepo/app:v1.2.3"
  tagPatterns?: string[];      // Scan matching tags: ["v*", "release-*"]
  digestPin?: boolean;         // Pin to specific digest after first scan
}

Credentials (AuthRef):

  • registry.{sourceId}.username
  • registry.{sourceId}.password
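To illustrate how an `ImageSpec` might expand into concrete scan targets, here is a hedged sketch. It assumes that when `tagPatterns` is present the tag in `reference` is ignored and every available tag matching a pattern is scanned; the spec does not state this. `splitReference`, `expandImageSpec`, and the `availableTags` input (for example from a registry tag listing) are hypothetical. Digest references (`repo@sha256:...`) are deliberately not handled.

```typescript
// Sketch only: expansion semantics are an assumption, not the sprint's defined behavior.
interface ImageSpec {
  reference: string;
  tagPatterns?: string[];
  digestPin?: boolean;
}

// Hypothetical helper: "*" matches any run of characters.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Split "repo:tag" into its parts. A colon before the last "/" belongs to a
// registry port (e.g. "localhost:5000/app"), so only a colon after it is a tag.
function splitReference(reference: string): { repository: string; tag: string } {
  const lastSlash = reference.lastIndexOf("/");
  const lastColon = reference.lastIndexOf(":");
  if (lastColon > lastSlash) {
    return { repository: reference.slice(0, lastColon), tag: reference.slice(lastColon + 1) };
  }
  return { repository: reference, tag: "latest" };
}

// Produce the list of image references to scan for one ImageSpec.
function expandImageSpec(spec: ImageSpec, availableTags: string[]): string[] {
  const { repository, tag } = splitReference(spec.reference);
  if (!spec.tagPatterns || spec.tagPatterns.length === 0) {
    return [`${repository}:${tag}`];
  }
  const regexes = spec.tagPatterns.map((p) => globToRegExp(p));
  return availableTags
    .filter((t) => regexes.some((r) => r.test(t)))
    .map((t) => `${repository}:${t}`);
}
```

Keeping the expansion separate from the registry call makes it straightforward to unit test without network access.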

3. CLI Scanner (External Submission)

Trigger: External CLI invocations with API token

Configuration:

interface CliSourceConfig {
  allowedTools: string[];      // ["stella-cli", "trivy", "syft"]
  allowedCiSystems?: string[]; // ["github-actions", "gitlab-ci", "jenkins"]

  validation: {
    requireSignedSbom: boolean;
    allowedSigners?: string[]; // Public key fingerprints
    maxSbomSizeBytes: number;
    allowedFormats: ('spdx-json' | 'cyclonedx-json' | 'cyclonedx-xml')[];
  };

  attribution: {
    requireBuildId: boolean;
    requireRepository: boolean;
    requireCommitSha: boolean;
  };
}

Credentials (AuthRef):

  • cli.{sourceId}.apiToken - Token for CLI authentication
  • Scopes: sbom:upload, scan:trigger
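The validation and attribution rules in `CliSourceConfig` could be enforced along these lines. This is a sketch, not the service's implementation: the `CliSubmission` shape and the `validateSubmission` function are hypothetical, and the rule that a configured `allowedCiSystems` list requires the submission to name a CI system is an assumption.

```typescript
type SbomFormat = "spdx-json" | "cyclonedx-json" | "cyclonedx-xml";

interface CliSourceConfig {
  allowedTools: string[];
  allowedCiSystems?: string[];
  validation: {
    requireSignedSbom: boolean;
    allowedSigners?: string[];
    maxSbomSizeBytes: number;
    allowedFormats: SbomFormat[];
  };
  attribution: {
    requireBuildId: boolean;
    requireRepository: boolean;
    requireCommitSha: boolean;
  };
}

// Hypothetical description of an incoming upload; not defined in the sprint.
interface CliSubmission {
  tool: string;
  ciSystem?: string;
  format: SbomFormat;
  sizeBytes: number;
  signerFingerprint?: string;
  buildId?: string;
  repository?: string;
  commitSha?: string;
}

// Returns a list of human-readable violations; an empty list means accepted.
function validateSubmission(config: CliSourceConfig, s: CliSubmission): string[] {
  const errors: string[] = [];
  if (!config.allowedTools.includes(s.tool)) errors.push(`tool not allowed: ${s.tool}`);
  if (config.allowedCiSystems && (!s.ciSystem || !config.allowedCiSystems.includes(s.ciSystem))) {
    errors.push("CI system missing or not allowed");
  }
  const v = config.validation;
  if (!v.allowedFormats.includes(s.format)) errors.push(`format not allowed: ${s.format}`);
  if (s.sizeBytes > v.maxSbomSizeBytes) errors.push("SBOM exceeds maxSbomSizeBytes");
  if (v.requireSignedSbom) {
    if (!s.signerFingerprint) {
      errors.push("signed SBOM required");
    } else if (v.allowedSigners && !v.allowedSigners.includes(s.signerFingerprint)) {
      errors.push("signer not allowed");
    }
  }
  const a = config.attribution;
  if (a.requireBuildId && !s.buildId) errors.push("buildId required");
  if (a.requireRepository && !s.repository) errors.push("repository required");
  if (a.requireCommitSha && !s.commitSha) errors.push("commitSha required");
  return errors;
}
```

Collecting all violations instead of failing on the first one lets the CLI report everything a caller must fix in a single round trip.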

4. Git/Sources Scanner (Repository)

Trigger: Webhook (push/PR), scheduled, or on-demand

Configuration:

interface GitSourceConfig {
  provider: 'github' | 'gitlab' | 'bitbucket' | 'azure-devops' | 'gitea';
  repositoryUrl: string;

  branches: {
    include: string[];         // ["main", "release/*"]
    exclude?: string[];        // ["feature/*", "wip/*"]
  };

  triggers: {
    onPush: boolean;
    onPullRequest: boolean;
    onTag: boolean;
    tagPatterns?: string[];    // ["v*", "release-*"]
    scheduled?: {
      cron: string;
      timezone: string;
    };
  };

  scanOptions: {
    analyzers: string[];
    scanPaths?: string[];      // [".", "services/*"]
    excludePaths?: string[];   // ["vendor/", "node_modules/"]
    enableLockfileOnly: boolean;
    enableReachability: boolean;
  };

  webhookConfig?: {
    webhookPath: string;       // Generated
    webhookSecret: string;     // AuthRef
  };
}

Credentials (AuthRef):

  • git.{sourceId}.token - PAT or OAuth token
  • git.{sourceId}.sshKey - SSH private key (optional)
  • git.{sourceId}.webhookSecret
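As an illustration of how the `branches` and `triggers` settings might combine when a Git webhook arrives, consider the sketch below. The `GitEvent` shape and `shouldTriggerScan` are hypothetical; treating the pull request's target branch as the branch to filter on, and treating an absent `tagPatterns` as "any tag", are assumptions.

```typescript
// Subset of GitSourceConfig relevant to trigger decisions.
interface GitTriggerSettings {
  branches: { include: string[]; exclude?: string[] };
  triggers: { onPush: boolean; onPullRequest: boolean; onTag: boolean; tagPatterns?: string[] };
}

// Hypothetical normalized webhook event; real provider payloads differ.
type GitEvent =
  | { kind: "push"; branch: string }
  | { kind: "pullRequest"; targetBranch: string }
  | { kind: "tag"; tag: string };

// Hypothetical helper: "*" matches any run of characters.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matchesAny(value: string, patterns: string[] = []): boolean {
  return patterns.some((p) => globToRegExp(p).test(value));
}

// A branch is in scope when it matches an include pattern and no exclude pattern.
function branchAllowed(s: GitTriggerSettings, branch: string): boolean {
  return matchesAny(branch, s.branches.include) && !matchesAny(branch, s.branches.exclude);
}

function shouldTriggerScan(s: GitTriggerSettings, e: GitEvent): boolean {
  switch (e.kind) {
    case "push":
      return s.triggers.onPush && branchAllowed(s, e.branch);
    case "pullRequest":
      return s.triggers.onPullRequest && branchAllowed(s, e.targetBranch);
    case "tag": {
      const patterns = s.triggers.tagPatterns;
      const tagOk = !patterns || patterns.length === 0 || matchesAny(e.tag, patterns);
      return s.triggers.onTag && tagOk;
    }
  }
}
```

Normalizing provider payloads into a small event union first keeps the decision logic identical across GitHub, GitLab, and the other listed providers.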

Domain Model

SbomSource Entity

File: src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSource.cs

public sealed class SbomSource
{
    public Guid SourceId { get; init; }
    public string TenantId { get; init; } = null!;
    public string Name { get; init; } = null!;
    public string? Description { get; set; }

    public SbomSourceType SourceType { get; init; }
    public SbomSourceStatus Status { get; private set; }

    // Type-specific configuration (JSON)
    public JsonDocument Configuration { get; set; } = null!;

    // Credential reference (NOT the actual secret)
    public string? AuthRef { get; set; }

    // Webhook endpoint (generated for webhook-based sources)
    public string? WebhookEndpoint { get; private set; }
    public string? WebhookSecretRef { get; private set; }

    // Scheduling
    public string? CronSchedule { get; set; }
    public string? CronTimezone { get; set; }
    public DateTimeOffset? NextScheduledRun { get; private set; }

    // Status tracking
    public DateTimeOffset? LastRunAt { get; private set; }
    public SbomSourceRunStatus? LastRunStatus { get; private set; }
    public string? LastRunError { get; private set; }
    public int ConsecutiveFailures { get; private set; }

    // Pause/Resume
    public bool Paused { get; private set; }
    public string? PauseReason { get; private set; }
    public string? PauseTicket { get; private set; }
    public DateTimeOffset? PausedAt { get; private set; }
    public string? PausedBy { get; private set; }

    // Rate limiting
    public int? MaxScansPerHour { get; set; }
    public int? CurrentHourScans { get; private set; }
    public DateTimeOffset? HourWindowStart { get; private set; }

    // Audit
    public DateTimeOffset CreatedAt { get; init; }
    public string CreatedBy { get; init; } = null!;
    public DateTimeOffset UpdatedAt { get; private set; }
    public string UpdatedBy { get; private set; } = null!;

    // Tags for organization
    public List<string> Tags { get; set; } = [];

    // Metadata (custom key-value pairs)
    public Dictionary<string, string> Metadata { get; set; } = [];
}

public enum SbomSourceType
{
    Zastava,       // Registry webhook
    Docker,        // Direct image scan
    Cli,           // External CLI submission
    Git            // Git repository
}

public enum SbomSourceStatus
{
    Active,        // Operational
    Paused,        // Manually paused
    Error,         // Last run failed
    Disabled,      // Administratively disabled
    Pending        // Awaiting first run / validation
}

public enum SbomSourceRunStatus
{
    Running,         // In progress (matches the 'running' default in sbom_source_runs.status)
    Succeeded,
    Failed,
    PartialSuccess,  // Some items succeeded, some failed
    Skipped,         // No matching items
    Cancelled
}
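The rate-limiting fields on `SbomSource` (`MaxScansPerHour`, `CurrentHourScans`, `HourWindowStart`) suggest a fixed-window counter. The entity itself is C#; the sketch below uses TypeScript only to stay consistent with the configuration examples, and the fixed-window interpretation, along with the names `RateLimitState` and `tryConsumeScan`, is an assumption.

```typescript
// Mirrors the three rate-limiting properties of SbomSource (camelCased).
interface RateLimitState {
  maxScansPerHour: number | null; // null = unlimited (assumption)
  currentHourScans: number;
  hourWindowStart: Date | null;
}

const HOUR_MS = 60 * 60 * 1000;

// Fixed-window limiter: returns true and counts the scan when allowed,
// returns false when the current one-hour window is exhausted.
function tryConsumeScan(state: RateLimitState, now: Date): boolean {
  if (state.maxScansPerHour === null) {
    return true;
  }
  const windowExpired =
    state.hourWindowStart === null ||
    now.getTime() - state.hourWindowStart.getTime() >= HOUR_MS;
  if (windowExpired) {
    // Start a fresh window at the time of this request.
    state.hourWindowStart = now;
    state.currentHourScans = 0;
  }
  if (state.currentHourScans >= state.maxScansPerHour) {
    return false;
  }
  state.currentHourScans += 1;
  return true;
}
```

A fixed window is simple to persist in a single row, at the cost of allowing short bursts across a window boundary; a sliding window would smooth that out if it matters.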

SbomSourceRun Entity (History)

File: src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSourceRun.cs

public sealed class SbomSourceRun
{
    public Guid RunId { get; init; }
    public Guid SourceId { get; init; }
    public string TenantId { get; init; } = null!;

    public SbomSourceRunTrigger Trigger { get; init; }
    public string? TriggerDetails { get; init; }  // Webhook payload digest, cron expression, etc.

    public SbomSourceRunStatus Status { get; private set; }

    public DateTimeOffset StartedAt { get; init; }
    public DateTimeOffset? CompletedAt { get; private set; }
    public long DurationMs => CompletedAt.HasValue
        ? (long)(CompletedAt.Value - StartedAt).TotalMilliseconds
        : 0;

    // Results
    public int ItemsDiscovered { get; private set; }
    public int ItemsScanned { get; private set; }
    public int ItemsSucceeded { get; private set; }
    public int ItemsFailed { get; private set; }
    public int ItemsSkipped { get; private set; }

    // Scan job references
    public List<Guid> ScanJobIds { get; init; } = [];

    // Error tracking
    public string? ErrorMessage { get; private set; }
    public string? ErrorStackTrace { get; private set; }

    // Correlation
    public string CorrelationId { get; init; } = null!;
}

public enum SbomSourceRunTrigger
{
    Scheduled,       // Cron-based
    Webhook,         // Registry push, git push
    Manual,          // User-initiated
    Backfill,        // Historical scan
    Retry            // Retry of failed run
}
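One way to derive a run's final status from its item counters is sketched below, again in TypeScript for consistency with the configuration examples. The mapping is an assumption inferred from the enum comments ("Some items succeeded, some failed", "No matching items"); `deriveRunStatus` and `RunCounts` are hypothetical names.

```typescript
type FinalRunStatus = "Succeeded" | "Failed" | "PartialSuccess" | "Skipped" | "Cancelled";

// Mirrors the result counters of SbomSourceRun (camelCased).
interface RunCounts {
  itemsSucceeded: number;
  itemsFailed: number;
}

// Map counters to a terminal status once a run has finished.
function deriveRunStatus(counts: RunCounts, cancelled = false): FinalRunStatus {
  if (cancelled) return "Cancelled";
  // Nothing succeeded or failed: no items matched, or all were skipped.
  if (counts.itemsSucceeded === 0 && counts.itemsFailed === 0) return "Skipped";
  if (counts.itemsFailed === 0) return "Succeeded";
  if (counts.itemsSucceeded === 0) return "Failed";
  return "PartialSuccess";
}

// Same rule as the DurationMs property: 0 while the run has not completed.
function durationMs(startedAt: Date, completedAt: Date | null): number {
  return completedAt ? completedAt.getTime() - startedAt.getTime() : 0;
}
```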

Task Breakdown

T1: Domain Models & Contracts (DOING)

Files to Create:

  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSource.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Domain/SbomSourceRun.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Contracts/SbomSourceContracts.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Contracts/SourceTypeConfigs.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/StellaOps.Scanner.Sources.csproj

Deliverables:

  • Domain entities with validation
  • Configuration DTOs per source type
  • Request/Response contracts for API

T2: Repository & Persistence (TODO)

Files to Create:

  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/ISbomSourceRepository.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/SbomSourceRepository.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/SbomSourceRunRepository.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/Migrations/*.cs

Deliverables:

  • PostgreSQL persistence layer
  • EF Core migrations for schema
  • Query methods: list, get, create, update, delete
  • Run history queries with pagination

T3: Source Service & Business Logic (TODO)

Files to Create:

  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/ISbomSourceService.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SbomSourceService.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SourceConfigValidator.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Services/SourceConnectionTester.cs

Deliverables:

  • CRUD operations with validation
  • Configuration validation per source type
  • Connection testing (registry auth, git auth)
  • Pause/resume with audit trail
  • Webhook endpoint generation
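Two of the deliverables above, webhook endpoint generation and pause/resume with an audit trail, can be sketched as follows. The paths come from the API design in this document; everything else is an assumption: the `SourceState` shape, requiring a non-empty pause reason, and returning to `Active` on resume are illustrative choices, not confirmed behavior.

```typescript
type SbomSourceType = "Zastava" | "Docker" | "Cli" | "Git";
type SbomSourceStatus = "Active" | "Paused" | "Error" | "Disabled" | "Pending";

// Subset of SbomSource used by this sketch (camelCased).
interface SourceState {
  sourceId: string;
  sourceType: SbomSourceType;
  status: SbomSourceStatus;
  paused: boolean;
  pauseReason: string | null;
  pauseTicket: string | null;
  pausedAt: Date | null;
  pausedBy: string | null;
  updatedAt: Date;
  updatedBy: string;
}

// Only webhook-driven source types get an endpoint; others return null.
function webhookEndpointFor(type: SbomSourceType, sourceId: string): string | null {
  switch (type) {
    case "Zastava":
      return `/api/v1/webhooks/zastava/${sourceId}`;
    case "Git":
      return `/api/v1/webhooks/git/${sourceId}`;
    default:
      return null;
  }
}

// Returns an updated copy that records who paused the source, when, and why.
function pause(s: SourceState, reason: string, actor: string, now: Date, ticket?: string): SourceState {
  if (reason.trim() === "") throw new Error("pause reason is required");
  return {
    ...s,
    status: "Paused",
    paused: true,
    pauseReason: reason,
    pauseTicket: ticket ?? null,
    pausedAt: now,
    pausedBy: actor,
    updatedAt: now,
    updatedBy: actor,
  };
}

// Clears the pause fields and marks the source active again.
function resume(s: SourceState, actor: string, now: Date): SourceState {
  return {
    ...s,
    status: "Active",
    paused: false,
    pauseReason: null,
    pauseTicket: null,
    pausedAt: null,
    pausedBy: null,
    updatedAt: now,
    updatedBy: actor,
  };
}
```

Returning copies instead of mutating in place keeps the before and after states available for an audit log entry.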

T4: Credential Integration (TODO)

Files to Create:

  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/ISourceCredentialStore.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/AuthorityCredentialStore.cs
  • src/Scanner/__Libraries/StellaOps.Scanner.Sources/Credentials/SourceCredentialModels.cs

Deliverables:

  • AuthRef pattern implementation
  • Credential CRUD (store, retrieve, rotate)
  • Integration with Authority service
  • Secure credential handling (never log, never expose)

T5: REST API Endpoints (TODO)

Files to Create:

  • src/Scanner/StellaOps.Scanner.WebService/Endpoints/SourceEndpoints.cs
  • src/Scanner/StellaOps.Scanner.WebService/Endpoints/SourceRunEndpoints.cs

API Design:

# Source Management
GET    /api/v1/sources                    # List sources (paginated, filtered)
POST   /api/v1/sources                    # Create source
GET    /api/v1/sources/{sourceId}         # Get source details
PUT    /api/v1/sources/{sourceId}         # Update source
DELETE /api/v1/sources/{sourceId}         # Delete source

# Source Actions
POST   /api/v1/sources/{sourceId}/test    # Test connection
POST   /api/v1/sources/{sourceId}/trigger # Trigger manual scan
POST   /api/v1/sources/{sourceId}/pause   # Pause source
POST   /api/v1/sources/{sourceId}/resume  # Resume source

# Source Runs (History)
GET    /api/v1/sources/{sourceId}/runs    # List runs (paginated)
GET    /api/v1/sources/{sourceId}/runs/{runId}  # Get run details

# Webhook Endpoints (registered dynamically)
POST   /api/v1/webhooks/zastava/{sourceId}  # Registry webhook
POST   /api/v1/webhooks/git/{sourceId}      # Git webhook

Authorization Scopes:

  • sources:read - List, get sources
  • sources:write - Create, update, delete
  • sources:trigger - Manual trigger
  • sources:admin - Pause, resume, delete
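The scope list above can be read as a lookup from action to accepted scopes. The sketch below is illustrative only: `SourceAction`, `acceptedScopes`, and `isAuthorized` are hypothetical names, and because delete is listed under both `sources:write` and `sources:admin`, the sketch accepts either scope for it, which is an assumption the final design should settle.

```typescript
type SourceAction =
  | "list" | "get" | "create" | "update" | "delete"
  | "trigger" | "pause" | "resume";

// For each action, any one of the listed scopes is sufficient.
const acceptedScopes: Record<SourceAction, string[]> = {
  list: ["sources:read"],
  get: ["sources:read"],
  create: ["sources:write"],
  update: ["sources:write"],
  delete: ["sources:write", "sources:admin"], // listed under both scopes (assumption: either works)
  trigger: ["sources:trigger"],
  pause: ["sources:admin"],
  resume: ["sources:admin"],
};

// True when the caller's granted scopes include at least one accepted scope.
function isAuthorized(action: SourceAction, grantedScopes: string[]): boolean {
  return acceptedScopes[action].some((scope) => grantedScopes.includes(scope));
}
```

For example, a token holding only `sources:read` could list and get sources but could not trigger a scan or pause a source.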

T6: Unit Tests (TODO)

Files to Create:

  • src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Domain/SbomSourceTests.cs
  • src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Services/SbomSourceServiceTests.cs
  • src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Services/SourceConfigValidatorTests.cs
  • src/Scanner/__Tests/StellaOps.Scanner.Sources.Tests/Persistence/SbomSourceRepositoryTests.cs

Database Schema

-- src/Scanner/__Libraries/StellaOps.Scanner.Sources/Persistence/Migrations/

CREATE TABLE scanner.sbom_sources (
    source_id       UUID PRIMARY KEY,
    tenant_id       TEXT NOT NULL,
    name            TEXT NOT NULL,
    description     TEXT,
    source_type     TEXT NOT NULL,  -- 'zastava', 'docker', 'cli', 'git'
    status          TEXT NOT NULL DEFAULT 'pending',
    configuration   JSONB NOT NULL,
    auth_ref        TEXT,
    webhook_endpoint TEXT,
    webhook_secret_ref TEXT,
    cron_schedule   TEXT,
    cron_timezone   TEXT DEFAULT 'UTC',
    next_scheduled_run TIMESTAMPTZ,
    last_run_at     TIMESTAMPTZ,
    last_run_status TEXT,
    last_run_error  TEXT,
    consecutive_failures INT DEFAULT 0,
    paused          BOOLEAN DEFAULT FALSE,
    pause_reason    TEXT,
    pause_ticket    TEXT,
    paused_at       TIMESTAMPTZ,
    paused_by       TEXT,
    max_scans_per_hour INT,
    current_hour_scans INT DEFAULT 0,
    hour_window_start TIMESTAMPTZ,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_by      TEXT NOT NULL,
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_by      TEXT NOT NULL,
    tags            TEXT[] DEFAULT '{}',
    metadata        JSONB DEFAULT '{}',

    CONSTRAINT uq_source_tenant_name UNIQUE (tenant_id, name)
);

CREATE INDEX idx_sources_tenant ON scanner.sbom_sources(tenant_id);
CREATE INDEX idx_sources_type ON scanner.sbom_sources(source_type);
CREATE INDEX idx_sources_status ON scanner.sbom_sources(status);
CREATE INDEX idx_sources_next_run ON scanner.sbom_sources(next_scheduled_run)
    WHERE next_scheduled_run IS NOT NULL;

CREATE TABLE scanner.sbom_source_runs (
    run_id          UUID PRIMARY KEY,
    source_id       UUID NOT NULL,  -- FK declared once via fk_run_source below
    tenant_id       TEXT NOT NULL,
    trigger         TEXT NOT NULL,  -- 'scheduled', 'webhook', 'manual', 'backfill', 'retry'
    trigger_details TEXT,
    status          TEXT NOT NULL DEFAULT 'running',
    started_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at    TIMESTAMPTZ,
    items_discovered INT DEFAULT 0,
    items_scanned   INT DEFAULT 0,
    items_succeeded INT DEFAULT 0,
    items_failed    INT DEFAULT 0,
    items_skipped   INT DEFAULT 0,
    scan_job_ids    UUID[] DEFAULT '{}',
    error_message   TEXT,
    error_stack_trace TEXT,
    correlation_id  TEXT NOT NULL,

    CONSTRAINT fk_run_source FOREIGN KEY (source_id)
        REFERENCES scanner.sbom_sources(source_id) ON DELETE CASCADE
);

CREATE INDEX idx_runs_source ON scanner.sbom_source_runs(source_id);
CREATE INDEX idx_runs_started ON scanner.sbom_source_runs(started_at DESC);
CREATE INDEX idx_runs_correlation ON scanner.sbom_source_runs(correlation_id);

Delivery Tracker

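| Task | Status | Assignee | Notes |
|------|--------|----------|-------|
| T1: Domain Models | DOING | | |
| T2: Repository & Persistence | TODO | | |
| T3: Source Service | TODO | | |
| T4: Credential Integration | TODO | | |
| T5: REST API Endpoints | TODO | | |
| T6: Unit Tests | TODO | | |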

Decisions & Risks

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Source library location | StellaOps.Scanner.Sources | Co-located with scanner, but separate library for clean separation |
| Configuration storage | JSONB in PostgreSQL | Flexible per-type config without schema changes |
| Credential pattern | AuthRef (reference) | Security: credentials never in source config, always in vault |
| Webhook endpoint | Dynamic generation | Per-source endpoints for isolation and revocation |

| Risk | Mitigation |
|------|------------|
| Credential exposure | AuthRef pattern, audit logging, never log credentials |
| Webhook secret leakage | Hashed comparison, rotate-on-demand |
| Configuration drift | Version tracking in metadata, audit trail |

Next Sprint

SPRINT_1229_002_BE_sbom-sources-triggers - Trigger service implementation:

  • Scheduler integration for cron-based sources
  • Webhook handlers for Zastava and Git
  • Manual trigger API
  • Retry logic for failed runs