git.stella-ops.org/docs/modules/concelier/sbom-learning-api.md
2026-01-22 19:08:46 +02:00

SBOM Learning API

Per SPRINT_8200_0013_0003.

Overview

The SBOM Learning API enables Concelier to learn which advisories are relevant to your organization by registering SBOMs from scanned images. When an SBOM is registered, Concelier matches its components against the canonical advisory database and updates interest scores accordingly.

SBOM Extraction

Concelier normalizes incoming CycloneDX 1.7 and SPDX 3.0.1 documents into the internal ParsedSbom model for matching and downstream analysis.

Current extraction coverage (SPRINT_20260119_015):

  • Document metadata: format, specVersion, serialNumber, created, name, profiles, sbomType, namespace/imports
  • Components: bomRef, type, name, version, purl, cpe, hashes (including SPDX verifiedUsing), license IDs/expressions, license text (base64 decode), external references, properties, scope/modified, supplier/manufacturer, evidence, pedigree, cryptoProperties, modelCard (CycloneDX), swid (CycloneDX), SPDX AI model parameters, SPDX dataset metadata, SPDX file/snippet properties
  • Licensing: SPDX Licensing profile elements (listed/custom licenses, license additions, AND/OR/WITH/or-later operators), with OSI/FSF flags and deprecated IDs captured
  • Dependencies: component dependency edges (CycloneDX dependencies, SPDX relationships; DependencyOf is inverted to DependsOn)
  • Vulnerabilities: CycloneDX embedded vulnerabilities (ratings, affects, VEX analysis), SPDX Security profile vulnerabilities + VEX assessments
  • Services: endpoints, authentication, crossesTrustBoundary, data flows, licenses, external references (CycloneDX)
  • Formulation: components, workflows, tasks, properties (CycloneDX)
  • Declarations/definitions: attestations, affirmations, standards, signatures (CycloneDX)
  • Compositions/annotations (CycloneDX)
  • Build metadata: buildId, buildType, timestamps, config source, environment, parameters (SPDX)
  • Document properties
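
The DependencyOf inversion listed under Dependencies can be sketched as follows. This is a minimal illustration, not Concelier's parser: the `(source, relationship, target)` tuple shape and the `normalize_edges` name are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (source, relationship, target); hypothetical shape


def normalize_edges(edges: Iterable[Edge]) -> List[Tuple[str, str]]:
    """Return (dependent, dependency) pairs, so every edge reads as DependsOn."""
    normalized = []
    for source, relationship, target in edges:
        if relationship == "DependsOn":
            normalized.append((source, target))
        elif relationship == "DependencyOf":
            # "A DependencyOf B" means B depends on A, so swap the endpoints.
            normalized.append((target, source))
        # Any other relationship type is not a dependency edge and is skipped.
    return normalized


edges = [
    ("app", "DependsOn", "express"),
    ("lodash", "DependencyOf", "app"),
    ("app", "Contains", "README"),
]
print(normalize_edges(edges))
```

After normalization both edges point from `app` to its dependencies, so downstream code only has to handle one edge direction.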

Notes:

  • License expressions can be validated against embedded SPDX license/exception lists via ILicenseExpressionValidator.
  • Matching currently uses PURL and CPE; additional fields are stored for downstream consumers.

VEX consumption

When an SBOM's vulnerabilities carry embedded VEX analysis, Concelier uses those statements to filter or annotate advisory matches. Matches covered by a NotAffected statement are filtered out when the policy allows it (filterNotAffected). Trust evaluation checks statement timestamps, signatures (when provided), and the justification requirements for NotAffected claims.

The policy (YAML or JSON) is loaded from the file referenced by Concelier:VexConsumption:PolicyPath:

vexConsumptionPolicy:
  trustEmbeddedVex: true
  minimumTrustLevel: Unverified
  filterNotAffected: true

  signatureRequirements:
    requireSignedVex: false
    trustedSigners:
      - "https://example.com/keys/vex-signer"

  timestampRequirements:
    maxAgeHours: 720
    requireTimestamp: true

  conflictResolution:
    strategy: mostRecent
    logConflicts: true

  mergePolicy:
    mode: union
    externalSources:
      - type: repository
        url: "https://vex.example.com/api"

  justificationRequirements:
    requireJustificationForNotAffected: true
    acceptedJustifications:
      - component_not_present
      - vulnerable_code_not_present
      - vulnerable_code_not_in_execute_path
      - inline_mitigations_already_exist

Reports are emitted via VexConsumptionReporter in JSON, SARIF, and text formats. Runtime overrides can be supplied via Concelier:VexConsumption (Enabled, IgnoreVex, PolicyPath, TrustEmbeddedVex, MinimumTrustLevel, FilterNotAffected, ExternalVexSources).
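
To make the interaction between filterNotAffected, the timestamp requirements, and the justification requirements concrete, here is a rough sketch of the decision for a single statement. It is an assumption-laden illustration rather than Concelier's evaluator: the flattened `POLICY` dict, the `can_filter` function, and the order of the checks are all hypothetical, and signature verification is left out entirely.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Flattened copy of the policy values shown above (hypothetical structure).
POLICY = {
    "filterNotAffected": True,
    "maxAgeHours": 720,
    "requireTimestamp": True,
    "requireJustificationForNotAffected": True,
    "acceptedJustifications": {
        "component_not_present",
        "vulnerable_code_not_present",
        "vulnerable_code_not_in_execute_path",
        "inline_mitigations_already_exist",
    },
}


def can_filter(status: str, justification: Optional[str],
               timestamp: Optional[datetime], now: datetime,
               policy: dict = POLICY) -> bool:
    """Return True if a match covered by this VEX statement may be filtered out."""
    if status != "NotAffected" or not policy["filterNotAffected"]:
        return False
    if timestamp is None:
        if policy["requireTimestamp"]:
            return False
    elif now - timestamp > timedelta(hours=policy["maxAgeHours"]):
        return False  # statement is older than the policy allows
    if policy["requireJustificationForNotAffected"]:
        return justification in policy["acceptedJustifications"]
    return True


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
fresh = now - timedelta(days=1)
stale = now - timedelta(days=31)  # 744 hours, beyond maxAgeHours
print(can_filter("NotAffected", "component_not_present", fresh, now))
print(can_filter("NotAffected", "component_not_present", stale, now))
```

A statement that fails any check is not discarded in this sketch; it simply does not suppress the match, which keeps the advisory visible.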

Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                           SBOM Learning Flow                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│    ┌─────────┐    scan     ┌─────────┐    SBOM    ┌───────────┐            │
│    │  Image  │ ──────────► │ Scanner │ ─────────► │ Concelier │            │
│    │         │             │         │            │           │            │
│    └─────────┘             └─────────┘            └─────┬─────┘            │
│                                                         │                   │
│                                                         ▼                   │
│                                              ┌─────────────────────┐       │
│                                              │  SBOM Registration  │       │
│                                              │  ┌───────────────┐  │       │
│                                              │  │ Extract PURLs │  │       │
│                                              │  └───────┬───────┘  │       │
│                                              │          │          │       │
│                                              │          ▼          │       │
│                                              │  ┌───────────────┐  │       │
│                                              │  │ Match Advs    │  │       │
│                                              │  └───────┬───────┘  │       │
│                                              │          │          │       │
│                                              │          ▼          │       │
│                                              │  ┌───────────────┐  │       │
│                                              │  │ Update Scores │  │       │
│                                              │  └───────────────┘  │       │
│                                              └─────────────────────┘       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

API Endpoints

Register SBOM

POST /api/v1/learn/sbom
Content-Type: application/vnd.cyclonedx+json

or

POST /api/v1/learn/sbom
Content-Type: application/spdx+json

Request Body: CycloneDX or SPDX SBOM document

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| artifact_id | string | required | Image digest or artifact identifier |
| update_scores | bool | true | Trigger immediate score recalculation |
| include_reachability | bool | true | Include reachability data in matching |

Response:

{
  "sbom_id": "uuid",
  "sbom_digest": "sha256:abc123...",
  "artifact_id": "sha256:image...",
  "component_count": 234,
  "matched_advisories": 15,
  "scores_updated": true,
  "registered_at": "2025-01-15T10:30:00Z"
}
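
A registration call can be assembled with the Python standard library as sketched below. The host name is a placeholder, authentication is omitted, and serializing booleans as `true`/`false` in the query string is an assumption; only the path, content types, and parameter names come from this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

CONTENT_TYPES = {
    "cyclonedx": "application/vnd.cyclonedx+json",
    "spdx": "application/spdx+json",
}


def build_register_request(base_url, sbom, sbom_format, artifact_id,
                           update_scores=True, include_reachability=True):
    """Build (but do not send) the POST request that registers an SBOM."""
    query = urlencode({
        "artifact_id": artifact_id,
        "update_scores": str(update_scores).lower(),
        "include_reachability": str(include_reachability).lower(),
    })
    return Request(
        url=f"{base_url}/api/v1/learn/sbom?{query}",
        data=json.dumps(sbom).encode("utf-8"),
        headers={"Content-Type": CONTENT_TYPES[sbom_format]},
        method="POST",
    )


req = build_register_request(
    "https://concelier.example.internal",  # hypothetical host
    {"bomFormat": "CycloneDX", "specVersion": "1.7", "components": []},
    "cyclonedx",
    "sha256:image",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the colon in the artifact digest is percent-encoded by `urlencode`.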

Get Affected Advisories

GET /api/v1/sboms/{digest}/affected

Response:

{
  "sbom_digest": "sha256:abc123...",
  "artifact_id": "sha256:image...",
  "matched_advisories": [
    {
      "canonical_id": "uuid",
      "cve": "CVE-2024-1234",
      "severity": "high",
      "interest_score": 0.85,
      "matched_component": "pkg:npm/express@4.17.1",
      "is_reachable": true
    },
    {
      "canonical_id": "uuid",
      "cve": "CVE-2024-5678",
      "severity": "medium",
      "interest_score": 0.65,
      "matched_component": "pkg:npm/lodash@4.17.20",
      "is_reachable": false
    }
  ],
  "total_count": 15,
  "last_matched_at": "2025-01-15T10:30:00Z"
}
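
One way a client might consume this response is to keep only the advisories worth acting on. The sketch below assumes a four-step severity scale (only "high" and "medium" appear on this page) and uses a hypothetical helper name, `blocking_advisories`.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]  # assumed scale


def blocking_advisories(response, min_severity="high", reachable_only=True):
    """Return matched advisories at or above min_severity, highest interest first."""
    threshold = SEVERITY_ORDER.index(min_severity)
    selected = [
        adv for adv in response["matched_advisories"]
        if SEVERITY_ORDER.index(adv["severity"]) >= threshold
        and (adv["is_reachable"] or not reachable_only)
    ]
    return sorted(selected, key=lambda adv: adv["interest_score"], reverse=True)


response = {
    "matched_advisories": [
        {"cve": "CVE-2024-1234", "severity": "high",
         "interest_score": 0.85, "is_reachable": True},
        {"cve": "CVE-2024-5678", "severity": "medium",
         "interest_score": 0.65, "is_reachable": False},
    ]
}
print([adv["cve"] for adv in blocking_advisories(response)])
```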

List Registered SBOMs

GET /api/v1/sboms

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| artifact_id | string | null | Filter by artifact |
| since | datetime | null | Only SBOMs registered after this time |
| limit | int | 100 | Max results |
| cursor | string | null | Pagination cursor |

Response:

{
  "sboms": [
    {
      "id": "uuid",
      "artifact_id": "sha256:image...",
      "sbom_digest": "sha256:abc123...",
      "sbom_format": "cyclonedx",
      "component_count": 234,
      "matched_advisory_count": 15,
      "registered_at": "2025-01-15T10:30:00Z"
    }
  ],
  "total_count": 42,
  "next_cursor": "cursor..."
}
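
Cursor pagination over this endpoint can be wrapped in a generator. The sketch assumes that a missing or null `next_cursor` marks the last page, which this page does not state explicitly; `fetch_page` is a hypothetical callable that performs the actual GET for a given cursor.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_sboms(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every SBOM entry, following next_cursor until it runs out."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["sboms"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


# Stand-in for the HTTP call: two canned pages keyed by cursor.
PAGES = {
    None: {"sboms": [{"id": "a"}, {"id": "b"}], "next_cursor": "page-2"},
    "page-2": {"sboms": [{"id": "c"}], "next_cursor": None},
}

print([sbom["id"] for sbom in iter_sboms(PAGES.__getitem__)])
```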

Unregister SBOM

DELETE /api/v1/sboms/{digest}

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| update_scores | bool | true | Recalculate scores after removal |

Matching Algorithm

PURL Matching

  1. Exact Match: pkg:npm/express@4.17.1 matches advisories affecting exactly that version
  2. Range Match: the component version is compared against the semantic version ranges in the advisory's affects_key
  3. Namespace Normalization: scoped names such as @scope/pkg are normalized before comparison
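
The three steps can be illustrated with a small matcher. Treat it as a sketch under several assumptions: the advisory dict shape (`type`, `name`, and either `version` or `introduced`/`fixed`) is invented for the example and is not the affects_key format, ranges are half-open, versions are plain dotted integers, and normalization is limited to percent-decoding and lowercasing.

```python
import re
from urllib.parse import unquote


def parse_purl(purl):
    """Split 'pkg:type/name@version' into (type, name, version)."""
    match = re.match(r"pkg:([^/]+)/(.+)@([^?#]+)", purl)
    if match is None:
        raise ValueError(f"unsupported purl: {purl}")
    ptype, name, version = match.groups()
    # Namespace normalization: decode %40 to @ and lowercase the name.
    return ptype.lower(), unquote(name).lower(), version


def parse_version(version):
    """Turn '4.17.1' into (4, 17, 1); pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def matches(purl, advisory):
    ptype, name, version = parse_purl(purl)
    if (ptype, name) != (advisory["type"], advisory["name"]):
        return False
    if "version" in advisory:  # exact match
        return version == advisory["version"]
    # Range match on the half-open interval [introduced, fixed).
    return (parse_version(advisory["introduced"])
            <= parse_version(version)
            < parse_version(advisory["fixed"]))


exact = {"type": "npm", "name": "express", "version": "4.17.1"}
ranged = {"type": "npm", "name": "lodash", "introduced": "4.0.0", "fixed": "4.17.21"}
scoped = {"type": "npm", "name": "@angular/core", "introduced": "16.0.0", "fixed": "16.1.0"}

print(matches("pkg:npm/express@4.17.1", exact))
print(matches("pkg:npm/lodash@4.17.20", ranged))
print(matches("pkg:npm/%40Angular/core@16.0.0", scoped))
```

Real ecosystems have their own version ordering rules, so a production matcher would need a per-ecosystem comparator instead of `parse_version`.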

CPE Matching

For OS packages (rpm, deb):

  1. Extract CPE from SBOM
  2. Match against advisory CPE patterns
  3. Apply distro-specific version logic (NEVRA/EVR)
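
Step 2 can be approximated by comparing CPE 2.3 formatted strings field by field, with `*` in the advisory pattern acting as a wildcard. This is a simplified sketch: it ignores escaped colons and partial wildcards, and it does not attempt the distro-specific NEVRA/EVR comparison of step 3. The function name `cpe_matches` is hypothetical.

```python
def cpe_matches(candidate: str, pattern: str) -> bool:
    """Return True if every field of pattern is '*' or equals the candidate field."""
    c_fields = candidate.lower().split(":")
    p_fields = pattern.lower().split(":")
    if len(c_fields) != 13 or len(p_fields) != 13:
        raise ValueError("expected a 13-field CPE 2.3 formatted string")
    return all(p in ("*", c) for c, p in zip(c_fields, p_fields))


component = "cpe:2.3:a:openssl:openssl:1.1.1k:*:*:*:*:*:*:*"
any_version = "cpe:2.3:a:openssl:openssl:*:*:*:*:*:*:*:*"
other_version = "cpe:2.3:a:openssl:openssl:3.0.0:*:*:*:*:*:*:*"

print(cpe_matches(component, any_version))
print(cpe_matches(component, other_version))
```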

Reachability Integration

When include_reachability=true:

  1. Query Scanner call graph data for the matched components
  2. Set is_reachable when a call path exists from an entry point to the component
  3. Factor the result into the interest score calculation
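
The reachability check in step 2 amounts to a graph traversal. The sketch below uses breadth-first search over a call graph given as an adjacency dict; the graph shape, the symbol names, and the mapping from components to symbols are assumptions for illustration and do not describe the Scanner's actual data model.

```python
from collections import deque


def reachable_from(call_graph, entry_points):
    """Return every node reachable from the entry points (breadth-first)."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        for callee in call_graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


call_graph = {
    "main": ["express.handle"],
    "express.handle": ["qs.parse"],
    "lodash.template": [],  # present in the image but never called
}
component_symbols = {
    "pkg:npm/express@4.17.1": ["express.handle"],
    "pkg:npm/lodash@4.17.20": ["lodash.template"],
}

reachable = reachable_from(call_graph, ["main"])
is_reachable = {
    purl: any(symbol in reachable for symbol in symbols)
    for purl, symbols in component_symbols.items()
}
print(is_reachable)
```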

Events

SbomLearned

Published when an SBOM is registered:

{
  "event_type": "sbom_learned",
  "sbom_id": "uuid",
  "sbom_digest": "sha256:...",
  "artifact_id": "sha256:...",
  "component_count": 234,
  "matched_advisory_count": 15,
  "timestamp": "2025-01-15T10:30:00Z"
}

ScoresUpdated

Published after a batch score update:

{
  "event_type": "scores_updated",
  "trigger": "sbom_registration",
  "sbom_digest": "sha256:...",
  "advisories_updated": 15,
  "timestamp": "2025-01-15T10:30:05Z"
}
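
A consumer can branch on `event_type` to handle both payloads. The sketch below only formats a log line; how events are delivered (queue, webhook, or otherwise) is not covered on this page, and `summarize_event` is a hypothetical helper.

```python
import json


def summarize_event(raw: str) -> str:
    """Turn a raw event payload into a one-line summary."""
    event = json.loads(raw)
    kind = event["event_type"]
    if kind == "sbom_learned":
        return (f"SBOM {event['sbom_digest']}: {event['component_count']} components, "
                f"{event['matched_advisory_count']} matched advisories")
    if kind == "scores_updated":
        return (f"{event['advisories_updated']} advisory scores updated "
                f"(trigger: {event['trigger']})")
    raise ValueError(f"unknown event_type: {kind}")


learned = json.dumps({
    "event_type": "sbom_learned",
    "sbom_digest": "sha256:abc",
    "component_count": 234,
    "matched_advisory_count": 15,
})
updated = json.dumps({
    "event_type": "scores_updated",
    "trigger": "sbom_registration",
    "advisories_updated": 15,
})
print(summarize_event(learned))
print(summarize_event(updated))
```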

Auto-Learning

Concelier can subscribe to Scanner events so that SBOMs are registered automatically when a scan completes.

Configuration

SbomIntegration:
  AutoLearn:
    Enabled: true
    SubscribeToScanEvents: true
    EventSource: "scanner:scan_completed"

  Matching:
    EnablePurl: true
    EnableCpe: true
    IncludeReachability: true

  ScoreUpdate:
    BatchSize: 1000
    DelaySeconds: 5  # Debounce rapid updates
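
To show what BatchSize and DelaySeconds might mean in practice, the sketch below collects advisory IDs, waits for a quiet period, and then splits the pending set into batches. The trailing-edge debounce behaviour and both helper names are assumptions; the page only says that rapid updates are debounced.

```python
from typing import Iterator, List, Sequence


def batches(ids: Sequence[str], batch_size: int = 1000) -> Iterator[Sequence[str]]:
    """Split ids into consecutive chunks of at most batch_size."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


class Debouncer:
    """Hold pending advisory IDs until delay_seconds pass without a new event."""

    def __init__(self, delay_seconds: float = 5.0):
        self.delay = delay_seconds
        self.pending = set()
        self.last_event = None

    def add(self, advisory_id: str, now: float) -> None:
        self.pending.add(advisory_id)
        self.last_event = now

    def flush_if_quiet(self, now: float) -> List[str]:
        """Return the pending IDs once the quiet period has elapsed, else []."""
        if self.last_event is None or now - self.last_event < self.delay:
            return []
        due = sorted(self.pending)
        self.pending.clear()
        self.last_event = None
        return due


debouncer = Debouncer(delay_seconds=5)
debouncer.add("adv-1", now=0.0)
debouncer.add("adv-2", now=3.0)
early = debouncer.flush_if_quiet(now=6.0)  # only 3 s since the last event
late = debouncer.flush_if_quiet(now=8.0)   # 5 s of quiet have passed
print(early, late)
```

Timestamps are passed in explicitly so the behaviour can be exercised without sleeping; a real worker would read a monotonic clock instead.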

Event Handler

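// Automatic registration on scan completion
public class ScanCompletedHandler : IEventHandler<ScanCompletedEvent>
{
    // ISbomService is an assumed interface name for the injected SBOM service.
    private readonly ISbomService _sbomService;

    public ScanCompletedHandler(ISbomService sbomService)
    {
        _sbomService = sbomService;
    }

    public async Task HandleAsync(ScanCompletedEvent evt, CancellationToken ct)
    {
        // Register the SBOM produced by the completed scan.
        await _sbomService.LearnFromScanAsync(
            artifactId: evt.ImageDigest,
            sbomDigest: evt.SbomDigest,
            sbomContent: evt.SbomContent,
            cancellationToken: ct);
    }
}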

CLI Commands

# Register SBOM from file
stella learn sbom --file ./sbom.json --artifact sha256:image...

# Register from stdin
cat sbom.json | stella learn sbom --artifact sha256:image...

# List affected advisories
stella sbom affected sha256:sbomdigest...

# List registered SBOMs
stella sbom list --limit 20

# Unregister SBOM
stella sbom unregister sha256:sbomdigest...

Integration Examples

CI/CD Pipeline

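# Example GitHub Actions workflow
# Assumes an earlier step with id `build` that outputs the image digest and a
# step with id `sbom` that outputs the SBOM digest; neither is shown here.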
- name: Scan image
  run: stella scan image myapp:latest -o sbom.json

- name: Register SBOM
  run: stella learn sbom --file sbom.json --artifact ${{ steps.build.outputs.digest }}

- name: Check for critical advisories
  run: |
    AFFECTED=$(stella sbom affected ${{ steps.sbom.outputs.digest }} --severity critical --count)
    if [ "$AFFECTED" -gt 0 ]; then
      echo "::error::Found $AFFECTED critical advisories"
      exit 1
    fi

Programmatic Registration

// Register SBOM from code
var result = await sbomService.RegisterSbomAsync(
    artifactId: imageDigest,
    sbomContent: sbomJson,
    format: SbomFormat.CycloneDX,
    options: new RegistrationOptions
    {
        UpdateScores = true,
        IncludeReachability = true
    },
    cancellationToken);

// Get affected advisories
var affected = await sbomService.GetAffectedAdvisoriesAsync(
    sbomDigest: result.SbomDigest,
    cancellationToken);

Database Schema

CREATE TABLE vuln.sbom_registry (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    digest TEXT NOT NULL,
    format TEXT NOT NULL CHECK (format IN ('cyclonedx', 'spdx')),
    spec_version TEXT NOT NULL,
    primary_name TEXT,
    primary_version TEXT,
    component_count INT NOT NULL DEFAULT 0,
    affected_count INT NOT NULL DEFAULT 0,
    source TEXT NOT NULL,
    tenant_id TEXT,
    registered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_matched_at TIMESTAMPTZ,
    CONSTRAINT uq_sbom_registry_digest UNIQUE (digest)
);

CREATE TABLE vuln.sbom_canonical_match (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sbom_id UUID NOT NULL REFERENCES vuln.sbom_registry(id),
    canonical_id UUID NOT NULL REFERENCES vuln.advisory_canonical(id),
    purl TEXT NOT NULL,
    match_method TEXT NOT NULL,
    confidence NUMERIC(3,2) NOT NULL DEFAULT 1.0,
    is_reachable BOOLEAN NOT NULL DEFAULT false,
    is_deployed BOOLEAN NOT NULL DEFAULT false,
    matched_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uq_sbom_canonical_match UNIQUE (sbom_id, canonical_id, purl)
);

CREATE TABLE concelier.sbom_documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    serial_number TEXT NOT NULL,
    artifact_digest TEXT,
    format TEXT NOT NULL CHECK (format IN ('cyclonedx', 'spdx')),
    spec_version TEXT NOT NULL,
    component_count INT NOT NULL DEFAULT 0,
    service_count INT NOT NULL DEFAULT 0,
    vulnerability_count INT NOT NULL DEFAULT 0,
    has_crypto BOOLEAN NOT NULL DEFAULT false,
    has_services BOOLEAN NOT NULL DEFAULT false,
    has_vulnerabilities BOOLEAN NOT NULL DEFAULT false,
    license_ids TEXT[] NOT NULL DEFAULT '{}',
    license_expressions TEXT[] NOT NULL DEFAULT '{}',
    sbom_json JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uq_concelier_sbom_serial UNIQUE (serial_number),
    CONSTRAINT uq_concelier_sbom_artifact UNIQUE (artifact_digest)
);
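
The unique constraint on (sbom_id, canonical_id, purl) makes it possible to record matches idempotently, so re-registering the same SBOM does not duplicate rows. The sketch below demonstrates the idea with an in-memory SQLite table that mirrors only a few columns of vuln.sbom_canonical_match; it is an illustration, not the service's data access code. On PostgreSQL the corresponding statement would be INSERT ... ON CONFLICT ON CONSTRAINT uq_sbom_canonical_match DO NOTHING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sbom_canonical_match (
        sbom_id TEXT NOT NULL,
        canonical_id TEXT NOT NULL,
        purl TEXT NOT NULL,
        match_method TEXT NOT NULL,
        UNIQUE (sbom_id, canonical_id, purl)
    )
""")


def record_match(sbom_id, canonical_id, purl, method):
    """Insert a match row; return False if the same match already exists."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO sbom_canonical_match "
        "(sbom_id, canonical_id, purl, match_method) VALUES (?, ?, ?, ?)",
        (sbom_id, canonical_id, purl, method),
    )
    return cursor.rowcount == 1


first = record_match("sbom-1", "adv-1", "pkg:npm/express@4.17.1", "purl_exact")
second = record_match("sbom-1", "adv-1", "pkg:npm/express@4.17.1", "purl_exact")
row_count = conn.execute("SELECT COUNT(*) FROM sbom_canonical_match").fetchone()[0]
print(first, second, row_count)
```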