component_architecture_concelier.md — StellaOps Concelier (2025Q4)

Scope. Implementation-ready architecture for Concelier: the vulnerability ingest/normalize/merge/export subsystem that produces deterministic advisory data for the Scanner + Policy + Excititor pipeline. Covers domain model, connectors, merge rules, storage schema, exports, APIs, performance, security, and test matrices.


0) Mission & boundaries

Mission. Acquire authoritative vulnerability advisories (vendor PSIRTs, distros, OSS ecosystems, CERTs), normalize them into a canonical model, reconcile aliases and version ranges, and export deterministic artifacts (JSON, Trivy DB) for fast backend joins.

Boundaries.

  • Concelier does not sign with private keys. When attestation is required, the export artifact is handed to the Signer/Attestor pipeline (out-of-process).
  • Concelier does not decide PASS/FAIL; it provides data to the Policy engine.
  • Online operation is allowlist-only; air-gapped deployments use the Offline Kit.

1) Topology & processes

Process shape: single ASP.NET Core service StellaOps.Concelier.WebService hosting:

  • Scheduler with distributed locks (Mongo-backed).
  • Connectors (fetch/parse/map).
  • Merger (canonical record assembly + precedence).
  • Exporters (JSON, Trivy DB).
  • Minimal REST for health/status/trigger/export.

Scale: HA by running N replicas; locks prevent overlapping jobs per source/exporter.


2) Canonical domain model

Stored in MongoDB (database concelier), serialized with a canonical JSON writer (stable order, camelCase, normalized timestamps).

2.1 Core entities

Advisory

advisoryId          // internal GUID
advisoryKey         // stable string key (e.g., CVE-2025-12345 or vendor ID)
title               // short title (best-of from sources)
summary             // normalized summary (English; i18n optional)
published           // earliest source timestamp
modified            // latest source timestamp
severity            // normalized {none, low, medium, high, critical}
cvss                // {v2?, v3?, v4?} objects (vector, baseScore, severity, source)
exploitKnown        // bool (e.g., KEV/active exploitation flags)
references[]        // typed links (advisory, kb, patch, vendor, exploit, blog)
sources[]           // provenance for traceability (doc digests, URIs)

Alias

advisoryId
scheme              // CVE, GHSA, RHSA, DSA, USN, MSRC, etc.
value               // e.g., "CVE-2025-12345"

Affected

advisoryId
productKey          // canonical product identity (see 2.2)
rangeKind           // semver | evr | nvra | apk | rpm | deb | generic | exact
introduced?         // string (format depends on rangeKind)
fixed?              // string (format depends on rangeKind)
lastKnownSafe?      // optional explicit safe floor
arch?               // arch or platform qualifier if source declares (x86_64, aarch64)
distro?             // distro qualifier when applicable (rhel:9, debian:12, alpine:3.19)
ecosystem?          // npm|pypi|maven|nuget|golang|…
notes?              // normalized notes per source

Reference

advisoryId
url
kind                // advisory | patch | kb | exploit | mitigation | blog | cvrf | csaf
sourceTag           // e.g., vendor/redhat, distro/debian, oss/ghsa

MergeEvent

advisoryKey
beforeHash          // canonical JSON hash before merge
afterHash           // canonical JSON hash after merge
mergedAt
inputs[]            // source doc digests that contributed

AdvisoryStatement (event log)

statementId         // GUID (immutable)
vulnerabilityKey    // canonical advisory key (e.g., CVE-2025-12345)
advisoryKey         // merge snapshot advisory key (may reference variant)
statementHash       // canonical hash of advisory payload
asOf                // timestamp of snapshot (UTC)
recordedAt          // persistence timestamp (UTC)
inputDocuments[]    // document IDs contributing to the snapshot
payload             // canonical advisory document (BSON / canonical JSON)

AdvisoryConflict

conflictId          // GUID
vulnerabilityKey    // canonical advisory key
conflictHash        // deterministic hash of conflict payload
asOf                // timestamp aligned with originating statement set
recordedAt          // persistence timestamp
statementIds[]      // related advisoryStatement identifiers
details             // structured conflict explanation / merge reasoning
  • AdvisoryEventLog (Concelier.Core) provides the public API for appending immutable statements/conflicts and querying replay history. Inputs are normalized by trimming and lower-casing vulnerabilityKey, serializing advisories with CanonicalJsonSerializer, and computing SHA-256 hashes (statementHash, conflictHash) over the canonical JSON payloads. Consumers can replay by key with an optional asOf filter to obtain deterministic snapshots ordered by asOf then recordedAt.
  • Conflict explainers are serialized as deterministic MergeConflictExplainerPayload records (type, reason, source ranks, winning values); replay clients can parse the payload to render human-readable rationales without re-computing precedence.
  • Concelier.WebService exposes the immutable log via GET /concelier/advisories/{vulnerabilityKey}/replay[?asOf=UTC_ISO8601], returning the latest statements (with hex-encoded hashes) and any conflict explanations for downstream exporters and APIs.
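
The hashing and replay behaviour described above can be sketched as follows. This is a minimal Python illustration, not the Concelier.Core implementation: `canonical_json` is a stand-in for CanonicalJsonSerializer (whose exact rules are not spelled out here), and the in-memory list stands in for the Mongo-backed log.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def canonical_json(payload: dict) -> bytes:
    """Assumed stand-in for CanonicalJsonSerializer: sorted keys, compact, UTF-8."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def statement_hash(payload: dict) -> str:
    """Hex-encoded SHA-256 over the canonical JSON payload."""
    return hashlib.sha256(canonical_json(payload)).hexdigest()


def normalize_key(vulnerability_key: str) -> str:
    """Trim and lower-case the key, as described for AdvisoryEventLog inputs."""
    return vulnerability_key.strip().lower()


def replay(statements: list, vulnerability_key: str,
           as_of: Optional[datetime] = None) -> list:
    """Statements for one key, optionally up to as_of, ordered by asOf then recordedAt."""
    key = normalize_key(vulnerability_key)
    selected = [
        s for s in statements
        if s["vulnerabilityKey"] == key and (as_of is None or s["asOf"] <= as_of)
    ]
    return sorted(selected, key=lambda s: (s["asOf"], s["recordedAt"]))
```

Because the hash is taken over the canonical bytes, two payloads that differ only in key order produce the same statementHash, which is what makes replayed snapshots comparable across runs.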

AdvisoryObservation (new in Sprint 24)

observationId       // deterministic id: {tenant}:{source}:{upstreamId}:{revision}
tenant              // issuing tenant (lower-case)
source{vendor,stream,api,collectorVersion}
upstream{
    upstreamId, documentVersion, contentHash,
    fetchedAt, receivedAt, signature{present,format,keyId,signature}}
content{format,specVersion,raw,metadata}
linkset{aliases[], purls[], cpes[], references[{type,url}]}
createdAt           // when Concelier recorded the observation
attributes          // optional provenance metadata (e.g., batch, connector)

The observation is an immutable projection of the raw ingestion document (post provenance validation, pre-merge) that powers Link-Not-Merge overlays and Vuln Explorer. Observations live in the advisory_observations collection, keyed by tenant + upstream identity. linkset provides normalized aliases/PURLs/CPEs that downstream services (Graph/Vuln Explorer) join against without triggering merge logic. Concelier.Core exposes strongly-typed models (AdvisoryObservation, AdvisoryObservationLinkset, etc.) and a Mongo-backed store for filtered queries by tenant/alias; this keeps overlay consumers read-only while preserving AOC guarantees.
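
To make the deterministic id and the read-only alias lookup concrete, here is a small Python sketch. It assumes the `{source}` segment of observationId is the source vendor; the document does not say which source field is used. The `Observation` class and `by_alias` helper are illustrative names, not the Concelier.Core types.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Observation:
    """Immutable observation; only the fields needed for the sketch are modelled."""
    tenant: str
    source_vendor: str   # assumption: stands in for the {source} id segment
    upstream_id: str
    revision: str
    aliases: tuple = ()

    @property
    def observation_id(self) -> str:
        # {tenant}:{source}:{upstreamId}:{revision}
        return f"{self.tenant}:{self.source_vendor}:{self.upstream_id}:{self.revision}"


def make_observation(tenant, source_vendor, upstream_id, revision, aliases=()):
    """Lower-case the tenant and normalize the alias set before freezing."""
    return Observation(tenant.strip().lower(), source_vendor, upstream_id,
                       revision, tuple(sorted(set(aliases))))


def by_alias(observations, tenant, alias):
    """Read-only filter by tenant and alias; no merge logic is involved."""
    tenant = tenant.strip().lower()
    return [o for o in observations if o.tenant == tenant and alias in o.aliases]
```

Freezing the dataclass mirrors the "immutable projection" property: any attempt to mutate an observation after creation raises `FrozenInstanceError`.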

ExportState

exportKind          // json | trivydb
baseExportId?       // last full baseline
baseDigest?         // digest of last full baseline
lastFullDigest?     // digest of last full export
lastDeltaDigest?    // digest of last delta export
cursor              // per-kind incremental cursor
files[]             // last manifest snapshot (path → sha256)

2.2 Product identity (productKey)

  • Primary: purl (Package URL).
  • OS packages: RPM (NEVRA→purl:rpm), DEB (dpkg→purl:deb), APK (apk→purl:alpine), with EVR/NVRA preserved.
  • Secondary: cpe retained for compatibility; advisory records may carry both.
  • Image/platform: oci:<registry>/<repo>@<digest> for image-level advisories (rare).
  • Unmappable: if a source is non-deterministic, keep native string under productKey="native:<provider>:<id>" and mark non-joinable.
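
As an illustration of the NEVRA→purl:rpm mapping and the native fallback, the following Python sketch parses a `name-[epoch:]version-release.arch` string. The namespace argument and the omission of purl percent-encoding are simplifying assumptions; the function names are hypothetical.

```python
def nevra_to_purl(nevra: str, namespace: str) -> str:
    """Map 'name-[epoch:]version-release.arch' to a pkg:rpm purl (no percent-encoding)."""
    rest, arch = nevra.rsplit(".", 1)        # arch is the last dot-separated segment
    rest, release = rest.rsplit("-", 1)      # release follows the last hyphen
    name, version = rest.rsplit("-", 1)      # version follows the next-to-last hyphen
    epoch = None
    if ":" in version:
        epoch, version = version.split(":", 1)
    qualifiers = {"arch": arch}
    if epoch and epoch != "0":
        qualifiers["epoch"] = epoch          # the EVR epoch is preserved as a qualifier
    query = "&".join(f"{k}={v}" for k, v in sorted(qualifiers.items()))
    return f"pkg:rpm/{namespace}/{name}@{version}-{release}?{query}"


def native_key(provider: str, native_id: str) -> str:
    """Fallback productKey for identities that cannot be mapped to a purl."""
    return f"native:{provider}:{native_id}"
```

For example, `nevra_to_purl("openssl-1:3.0.7-24.el9.x86_64", "redhat")` yields `pkg:rpm/redhat/openssl@3.0.7-24.el9?arch=x86_64&epoch=1`.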

3) Source families & precedence

3.1 Families

  • Vendor PSIRTs: Microsoft, Oracle, Cisco, Adobe, Apple, VMware, Chromium…
  • Linux distros: Red Hat, SUSE, Ubuntu, Debian, Alpine…
  • OSS ecosystems: OSV, GHSA (GitHub Security Advisories), PyPI, npm, Maven, NuGet, Go.
  • CERTs / national CSIRTs: CISA (KEV, ICS), JVN, ACSC, CCCS, KISA, CERT-FR/BUND, etc.

3.2 Precedence (when claims conflict)

  1. Vendor PSIRT (authoritative for their product).
  2. Distro (authoritative for packages they ship, including backports).
  3. Ecosystem (OSV/GHSA) for library semantics.
  4. CERTs/aggregators for enrichment (KEV/known exploited).

Precedence affects Affected ranges and fixed info; severity is normalized to the maximum credible severity unless policy overrides. Conflicts are retained with source provenance.


4) Connectors & normalization

4.1 Connector contract

public interface IFeedConnector {
  string SourceName { get; }
  Task FetchAsync(IServiceProvider sp, CancellationToken ct);   // -> document collection
  Task ParseAsync(IServiceProvider sp, CancellationToken ct);   // -> dto collection (validated)
  Task MapAsync(IServiceProvider sp, CancellationToken ct);     // -> advisory/alias/affected/reference
}
  • Fetch: windowed (cursor), conditional GET (ETag/Last-Modified), retry/backoff, rate limiting.
  • Parse: schema validation (JSON Schema, XSD/CSAF), content type checks; write DTO with normalized casing.
  • Map: build canonical records; all outputs carry provenance (doc digest, URI, anchors).
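
The conditional-GET part of the Fetch stage can be sketched in Python as below. It reuses the etag, lastModified and sha256 fields of the document record; the function names and the three result labels are illustrative assumptions, and the actual HTTP call is left out.

```python
import hashlib


def conditional_headers(document: dict) -> dict:
    """Build If-None-Match / If-Modified-Since from a previously stored document record."""
    headers = {}
    if document.get("etag"):
        headers["If-None-Match"] = document["etag"]
    if document.get("lastModified"):
        headers["If-Modified-Since"] = document["lastModified"]
    return headers


def classify_fetch(status: int, body: bytes, document: dict) -> str:
    """Decide whether a response needs to be re-parsed.

    304 means the server confirmed nothing changed; otherwise compare the
    SHA-256 of the body with the stored digest to catch servers that ignore
    validators but return identical bytes.
    """
    if status == 304:
        return "not-modified"
    digest = hashlib.sha256(body).hexdigest()
    return "unchanged" if digest == document.get("sha256") else "changed"
```

Only a "changed" result would need to flow on to Parse and Map; the other two outcomes leave the stored document untouched.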

4.2 Version range normalization

  • SemVer ecosystems (npm, pypi, maven, nuget, golang): normalize to introduced/fixed semver ranges (operators such as ~, ^, <, and >= are canonicalized to intervals).
  • RPM EVR: epoch:version-release with rpmvercmp semantics; store raw EVR strings and also computed order keys for query.
  • DEB: dpkg version comparison semantics mirrored; store computed keys.
  • APK: Alpine version semantics; compute order keys.
  • Generic: if provider uses text, retain raw; do not invent ranges.
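
A minimal Python sketch of canonicalizing SemVer operators to introduced/fixed intervals follows. It assumes npm-style semantics for `^` and `~`, handles only full `major.minor.patch` versions, and treats the exclusive upper bound as `fixed`; anything it cannot express exactly is rejected rather than guessed, in the spirit of "do not invent ranges".

```python
import re

_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def _parse(version: str) -> tuple:
    match = _VERSION.match(version.strip())
    if not match:
        raise ValueError(f"unsupported version: {version!r}")
    return tuple(int(part) for part in match.groups())


def _fmt(parts: tuple) -> str:
    return ".".join(str(p) for p in parts)


def to_interval(spec: str) -> dict:
    """Canonicalize one operator expression to {'introduced', 'fixed'} (fixed is exclusive)."""
    spec = spec.strip()
    if spec.startswith("<="):
        # An inclusive upper bound has no exact exclusive 'fixed' version.
        raise ValueError(f"cannot derive an exclusive bound from {spec!r}")
    if spec.startswith(">="):
        return {"introduced": _fmt(_parse(spec[2:])), "fixed": None}
    if spec.startswith("<"):
        return {"introduced": None, "fixed": _fmt(_parse(spec[1:]))}
    if spec.startswith("^"):
        major, minor, patch = _parse(spec[1:])
        if major > 0:
            upper = (major + 1, 0, 0)
        elif minor > 0:
            upper = (0, minor + 1, 0)
        else:
            upper = (0, 0, patch + 1)
        return {"introduced": _fmt((major, minor, patch)), "fixed": _fmt(upper)}
    if spec.startswith("~"):
        major, minor, patch = _parse(spec[1:])
        return {"introduced": _fmt((major, minor, patch)), "fixed": _fmt((major, minor + 1, 0))}
    raise ValueError(f"unsupported range expression: {spec!r}")
```

For instance, `to_interval("^1.2.3")` gives `{"introduced": "1.2.3", "fixed": "2.0.0"}`, while `to_interval("^0.2.3")` stops at `0.3.0` because a leading zero major narrows the caret range.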

4.3 Severity & CVSS

  • Normalize CVSS v2/v3/v4 where available (vector, baseScore, severity).
  • If multiple CVSS sources exist, track them all; effective severity defaults to max by policy (configurable).
  • ExploitKnown toggled by KEV and equivalent sources; store evidence (source, date).
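
The "track all, default to max" rule can be illustrated with a short Python sketch. The severity ordering comes from the Advisory model; the interpretation of the 'vendorPreferred' / 'distroPreferred' policies (restrict to that family's entries when any exist, then take the maximum) is an assumption, as is matching families by the `vendor/` and `distro/` prefixes of the source tag.

```python
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]

# Assumed mapping from policy name to source-tag prefix (e.g. vendor/redhat).
PREFERRED_PREFIX = {"vendorPreferred": "vendor/", "distroPreferred": "distro/"}


def effective_severity(cvss_entries: list, policy: str = "max"):
    """Pick one effective severity while the caller keeps every CVSS entry.

    Each entry is a dict with at least 'source' and 'severity'.
    Returns None when no entry is available.
    """
    if not cvss_entries:
        return None
    candidates = cvss_entries
    prefix = PREFERRED_PREFIX.get(policy)
    if prefix is not None:
        preferred = [e for e in cvss_entries if e["source"].startswith(prefix)]
        if preferred:
            candidates = preferred   # fall back to all entries if the family is absent
    return max((e["severity"] for e in candidates), key=SEVERITY_ORDER.index)
```

The input list is never modified, so all CVSS sets remain available for provenance regardless of which one drives the effective value.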

5) Merge engine

5.1 Keying & identity

  • Identity graph: CVE is primary node; vendor/distro IDs resolved via Alias edges (from connectors and Concelier's alias tables).
  • advisoryKey is the canonical primary key (CVE if present, else vendor/distro key).

5.2 Merge algorithm (deterministic)

  1. Gather all rows for advisoryKey (across sources).

  2. Select title/summary by precedence source (vendor>distro>ecosystem>cert).

  3. Union aliases (dedupe by scheme+value).

  4. Merge Affected with rules:

    • Prefer vendor ranges for vendor products; prefer distro for distro-shipped packages.
    • If both exist for same productKey, keep both; mark sourceTag and precedence so Policy can decide.
    • Never collapse range semantics across different families (e.g., rpm EVR vs semver).
  5. CVSS/severity: record all CVSS sets; compute effectiveSeverity = max (unless policy override).

  6. References: union with type precedence (advisory > patch > kb > exploit > blog); dedupe by URL; preserve sourceTag.

  7. Produce canonical JSON; compute afterHash; store MergeEvent with inputs and hashes.

The merge is pure given inputs. Any change in inputs or precedence matrices changes the hash predictably.
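
The purity claim can be demonstrated with a reduced Python sketch of steps 2, 3, 4, 6 and 7. The per-row `family` field, the row shape, and the `merge_advisory` name are assumptions for illustration; CVSS handling and the MergeEvent write are omitted.

```python
import hashlib
import json

SOURCE_RANK = {"vendor": 0, "distro": 1, "ecosystem": 2, "cert": 3}
REFERENCE_RANK = {"advisory": 0, "patch": 1, "kb": 2, "exploit": 3, "blog": 4}


def _ref_rank(ref: dict) -> int:
    return REFERENCE_RANK.get(ref["kind"], len(REFERENCE_RANK))


def merge_advisory(advisory_key: str, rows: list):
    """Merge per-source rows into one record; returns (merged, afterHash)."""
    if not rows:
        raise ValueError("at least one source row is required")
    # Sort once so that the result does not depend on input order.
    ranked = sorted(rows, key=lambda r: (SOURCE_RANK[r["family"]], r["sourceTag"]))
    best = ranked[0]  # highest-precedence source supplies title/summary

    alias_pairs = {(a["scheme"], a["value"]) for r in ranked for a in r.get("aliases", [])}
    aliases = [{"scheme": s, "value": v} for s, v in sorted(alias_pairs)]

    # Keep every Affected claim, tagged with its source and precedence.
    affected = [
        {**a, "sourceTag": r["sourceTag"], "precedence": SOURCE_RANK[r["family"]]}
        for r in ranked for a in r.get("affected", [])
    ]
    affected.sort(key=lambda a: json.dumps(a, sort_keys=True))

    # Dedupe references by URL, keeping the highest-precedence kind.
    references = {}
    for r in ranked:
        for ref in r.get("references", []):
            candidate = {**ref, "sourceTag": r["sourceTag"]}
            current = references.get(ref["url"])
            if current is None or _ref_rank(candidate) < _ref_rank(current):
                references[ref["url"]] = candidate

    merged = {
        "advisoryKey": advisory_key,
        "title": best.get("title"),
        "summary": best.get("summary"),
        "aliases": aliases,
        "affected": affected,
        "references": [references[url] for url in sorted(references)],
    }
    canonical = json.dumps(merged, sort_keys=True, separators=(",", ":"))
    return merged, hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Feeding the same rows in any order yields the same hash, while changing `SOURCE_RANK` or any input row changes it, which is the property the MergeEvent hashes rely on.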


6) Storage schema (MongoDB)

Collections & indexes

  • source {_id, type, baseUrl, enabled, notes}

  • source_state {sourceName(unique), enabled, cursor, lastSuccess, backoffUntil, paceOverrides}

  • document {_id, sourceName, uri, fetchedAt, sha256, contentType, status, metadata, gridFsId?, etag?, lastModified?}

    • Index: {sourceName:1, uri:1} unique, {fetchedAt:-1}
  • dto {_id, sourceName, documentId, schemaVer, payload, validatedAt}

    • Index: {sourceName:1, documentId:1}
  • advisory {_id, advisoryKey, title, summary, published, modified, severity, cvss, exploitKnown, sources[]}

    • Index: {advisoryKey:1} unique, {modified:-1}, {severity:1}, text index (title, summary)
  • alias {advisoryId, scheme, value}

    • Index: {scheme:1,value:1}, {advisoryId:1}
  • affected {advisoryId, productKey, rangeKind, introduced?, fixed?, arch?, distro?, ecosystem?}

    • Index: {productKey:1}, {advisoryId:1}, {productKey:1, rangeKind:1}
  • reference {advisoryId, url, kind, sourceTag}

    • Index: {advisoryId:1}, {kind:1}
  • merge_event {advisoryKey, beforeHash, afterHash, mergedAt, inputs[]}

    • Index: {advisoryKey:1, mergedAt:-1}
  • export_state {_id(exportKind), baseExportId?, baseDigest?, lastFullDigest?, lastDeltaDigest?, cursor, files[]}

  • locks {_id(jobKey), holder, acquiredAt, heartbeatAt, leaseMs, ttlAt} (TTL cleans dead locks)

  • jobs {_id, type, args, state, startedAt, heartbeatAt, endedAt, error}

GridFS buckets: fs.documents for raw payloads.
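
The lease semantics of the locks collection can be sketched in Python with an in-memory dict standing in for MongoDB. This is only an illustration of the field usage (holder, acquiredAt, heartbeatAt, leaseMs, ttlAt); a real implementation would need an atomic conditional upsert on the server, and the `try_acquire` name and job key format are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def try_acquire(locks: dict, job_key: str, holder: str, now: datetime,
                lease_ms: int = 30_000) -> bool:
    """Acquire or renew the lease for job_key; return False if another holder is live."""
    current = locks.get(job_key)
    if current is not None and current["holder"] != holder:
        lease_end = current["heartbeatAt"] + timedelta(milliseconds=current["leaseMs"])
        if now < lease_end:
            return False  # another replica still holds a live lease
    same_holder = current is not None and current["holder"] == holder
    locks[job_key] = {
        "holder": holder,
        "acquiredAt": current["acquiredAt"] if same_holder else now,
        "heartbeatAt": now,                                  # renewals act as heartbeats
        "leaseMs": lease_ms,
        "ttlAt": now + timedelta(milliseconds=lease_ms),     # TTL index target
    }
    return True
```

A replica that stops heartbeating loses the lock once `heartbeatAt + leaseMs` has passed, so a crashed job does not block its source or exporter indefinitely.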


7) Exporters

7.1 Deterministic JSON (vuln-list style)

  • Folder structure mirroring /<scheme>/<first-two>/<rest>/… with one JSON per advisory; deterministic ordering, stable timestamps, normalized whitespace.
  • manifest.json lists all files with SHA-256 and a top-level export digest.
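
A manifest of this shape can be sketched in Python as follows. How the top-level export digest is actually derived is not stated here; the sketch assumes it is the SHA-256 of the sorted `path:sha256` lines, and the manifest field names are illustrative.

```python
import hashlib


def build_manifest(files: dict) -> dict:
    """Build a manifest from {relative_path: content_bytes}.

    Entries are sorted by path so that the manifest, and the digest derived
    from it, do not depend on filesystem or dict iteration order.
    """
    entries = [
        {"path": path, "sha256": hashlib.sha256(files[path]).hexdigest()}
        for path in sorted(files)
    ]
    # Assumption: the export digest covers the ordered (path, sha256) pairs.
    digest_input = "\n".join(f"{e['path']}:{e['sha256']}" for e in entries)
    export_digest = hashlib.sha256(digest_input.encode("utf-8")).hexdigest()
    return {"files": entries, "digest": f"sha256:{export_digest}"}
```

Two exports built from the same file contents produce identical manifests, and changing a single byte in any file changes the top-level digest.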

7.2 Trivy DB exporter

  • Builds Bolt DB archives compatible with Trivy; supports full and delta modes.

  • In delta, unchanged blobs are reused from the base; metadata captures:

    {
      "mode": "delta|full",
      "baseExportId": "...",
      "baseManifestDigest": "sha256:...",
      "changed": ["path1", "path2"],
      "removed": ["path3"]
    }
    
  • Optional ORAS push (OCI layout) for registries.

  • Offline kit bundles include Trivy DB + JSON tree + export manifest.

  • Mirror-ready bundles: when concelier.trivy.mirror defines domains, the exporter emits mirror/index.json plus per-domain manifest.json, metadata.json, and db.tar.gz files with SHA-256 digests so Concelier mirrors can expose domain-scoped download endpoints.

  • Concelier.WebService serves /concelier/exports/index.json and /concelier/exports/mirror/{domain}/… directly from the export tree with short cache budgets (index: 60 s, bundles: 300 s, immutable) and per-domain rate limiting; the endpoints honour StellaOps Authority or CIDR bypass lists depending on mirror topology.
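
The delta metadata shown for the Trivy DB exporter can be derived by comparing two manifest snapshots. The Python sketch below assumes each snapshot is a `path → sha256` mapping (as in ExportState.files[]) and that newly added paths are reported under "changed"; the function name is hypothetical.

```python
def compute_delta(base_files: dict, current_files: dict,
                  base_export_id: str, base_manifest_digest: str) -> dict:
    """Compare two {path: sha256} snapshots and emit delta metadata."""
    changed = sorted(
        path for path, sha in current_files.items() if base_files.get(path) != sha
    )
    removed = sorted(path for path in base_files if path not in current_files)
    return {
        "mode": "delta",
        "baseExportId": base_export_id,
        "baseManifestDigest": base_manifest_digest,
        "changed": changed,   # new or modified paths; everything else is reused from the base
        "removed": removed,
    }
```

Paths that appear in neither list are the unchanged blobs that a delta export can reuse from the base.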

7.3 Handoff to Signer/Attestor (optional)

  • On export completion, if attest: true is set in job args, Concelier posts the artifact metadata to Signer/Attestor; Concelier itself does not hold signing keys.
  • Export record stores returned { uuid, index, url } from Rekor v2.

8) REST APIs

All under /api/v1/concelier.

Health & status

GET  /healthz | /readyz
GET  /status                              → sources, last runs, export cursors

Sources & jobs

GET  /sources                              → list of configured sources
POST /sources/{name}/trigger               → { jobId }
POST /sources/{name}/pause | /resume       → toggle
GET  /jobs/{id}                            → job status

Exports

POST /exports/json   { full?:bool, force?:bool, attest?:bool } → { exportId, digest, rekor? }
POST /exports/trivy  { full?:bool, force?:bool, publish?:bool, attest?:bool } → { exportId, digest, rekor? }
GET  /exports/{id}   → export metadata (kind, digest, createdAt, rekor?)
GET  /concelier/exports/index.json        → mirror index describing available domains/bundles
GET  /concelier/exports/mirror/{domain}/manifest.json
GET  /concelier/exports/mirror/{domain}/bundle.json
GET  /concelier/exports/mirror/{domain}/bundle.json.jws

Search (operator debugging)

GET  /advisories/{key}
GET  /advisories?scheme=CVE&value=CVE-2025-12345
GET  /affected?productKey=pkg:rpm/openssl&limit=100

AuthN/Z: Authority tokens (OpTok) with roles: concelier.read, concelier.admin, concelier.export.


9) Configuration (YAML)

concelier:
  mongo: { uri: "mongodb://mongo/concelier" }
  s3:
    endpoint: "http://minio:9000"
    bucket: "stellaops-concelier"
  scheduler:
    windowSeconds: 30
    maxParallelSources: 4
  sources:
    - name: redhat
      kind: csaf
      baseUrl: https://access.redhat.com/security/data/csaf/v2/
      signature: { type: pgp, keys: [ "…redhat PGP…" ] }
      enabled: true
      windowDays: 7
    - name: suse
      kind: csaf
      baseUrl: https://ftp.suse.com/pub/projects/security/csaf/
      signature: { type: pgp, keys: [ "…suse PGP…" ] }
    - name: ubuntu
      kind: usn-json
      baseUrl: https://ubuntu.com/security/notices.json
      signature: { type: none }
    - name: osv
      kind: osv
      baseUrl: https://api.osv.dev/v1/
      signature: { type: none }
    - name: ghsa
      kind: ghsa
      baseUrl: https://api.github.com/graphql
      auth: { tokenRef: "env:GITHUB_TOKEN" }
  exporters:
    json:
      enabled: true
      output: s3://stellaops-concelier/json/
    trivy:
      enabled: true
      mode: full
      output: s3://stellaops-concelier/trivy/
      oras:
        enabled: false
        repo: ghcr.io/org/concelier
  precedence:
    vendorWinsOverDistro: true
    distroWinsOverOsv: true
  severity:
    policy: max    # or 'vendorPreferred' / 'distroPreferred'

10) Security & compliance

  • Outbound allowlist per connector (domains, protocols); proxy support; TLS pinning where possible.
  • Signature verification for raw docs (PGP/cosign/x509) with results stored in document.metadata.sig. Docs failing verification may still be ingested but flagged; merge can downweight or ignore them by config.
  • No secrets in logs; auth material via env: or mounted files; HTTP redaction of Authorization headers.
  • Multi-tenant: per-tenant DBs or prefixes; per-tenant S3 prefixes; tenant-scoped API tokens.
  • Determinism: canonical JSON writer; export digests stable across runs given same inputs.

11) Performance targets & scale

  • Ingest: ≥ 5k documents/min on 4 cores (CSAF/OpenVEX/JSON).
  • Normalize/map: ≥ 50k Affected rows/min on 4 cores.
  • Merge: ≤ 10 ms P95 per advisory at steady-state updates.
  • Export: 1M advisories JSON in ≤ 90s (streamed, zstd), Trivy DB in ≤ 60s on 8 cores.
  • Memory: hard cap per job; chunked streaming writers; backpressure to avoid GC spikes.

Scale pattern: add Concelier replicas; Mongo scaling via indices and read/write concerns; GridFS only for oversized docs.


12) Observability

  • Metrics

    • concelier.fetch.docs_total{source}
    • concelier.fetch.bytes_total{source}
    • concelier.parse.failures_total{source}
    • concelier.map.affected_total{source}
    • concelier.merge.changed_total
    • concelier.export.bytes{kind}
    • concelier.export.duration_seconds{kind}
  • Tracing around fetch/parse/map/merge/export.

  • Logs: structured with source, uri, docDigest, advisoryKey, exportId.


13) Testing matrix

  • Connectors: fixture suites for each provider/format (happy path; malformed; signature fail).
  • Version semantics: EVR vs dpkg vs semver edge cases (epoch bumps, tilde versions, prereleases).
  • Merge: conflicting sources (vendor vs distro vs OSV); verify precedence & dual retention.
  • Export determinism: byte-for-byte stable outputs across runs; digest equality.
  • Performance: soak tests with 1M advisories; cap memory; verify backpressure.
  • API: pagination, filters, RBAC, error envelopes (RFC 7807).
  • Offline kit: bundle build & import correctness.

14) Failure modes & recovery

  • Source outages: scheduler backs off with exponential delay; source_state.backoffUntil; alerts on staleness.
  • Schema drifts: parse stage marks DTO invalid; job fails with clear diagnostics; connector version flags track supported schema ranges.
  • Partial exports: exporters write to temp prefix; manifest commit is atomic; only then move to final prefix and update export_state.
  • Resume: all stages idempotent; source_state.cursor supports window resume.
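
The source-outage behaviour can be sketched in Python as below. The base delay, the cap, and the consecutive-failure counter are illustrative assumptions (source_state as listed carries backoffUntil but no failure count), so treat the numbers as placeholders.

```python
from datetime import datetime, timedelta, timezone


def next_backoff_until(now: datetime, consecutive_failures: int,
                       base_seconds: int = 30, cap_seconds: int = 3600) -> datetime:
    """Exponential delay: base * 2^(failures - 1), capped at cap_seconds."""
    exponent = max(consecutive_failures, 1) - 1
    delay = min(cap_seconds, base_seconds * 2 ** exponent)
    return now + timedelta(seconds=delay)


def is_due(source_state: dict, now: datetime) -> bool:
    """A source may run when it is enabled and its backoff window has passed."""
    if not source_state.get("enabled", False):
        return False
    backoff_until = source_state.get("backoffUntil")
    return backoff_until is None or now >= backoff_until
```

With these defaults the delay grows 30 s, 60 s, 120 s and so on up to one hour; a successful run would clear backoffUntil so the source returns to its normal window.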

15) Operator runbook (quick)

  • Trigger all sources: POST /api/v1/concelier/sources/*/trigger
  • Force full export JSON: POST /api/v1/concelier/exports/json { "full": true, "force": true }
  • Force Trivy DB delta publish: POST /api/v1/concelier/exports/trivy { "full": false, "publish": true }
  • Inspect advisory: GET /api/v1/concelier/advisories?scheme=CVE&value=CVE-2025-12345
  • Pause noisy source: POST /api/v1/concelier/sources/osv/pause

16) Rollout plan

  1. MVP: Red Hat (CSAF), SUSE (CSAF), Ubuntu (USN JSON), OSV; JSON export.
  2. Add: GHSA GraphQL, Debian (DSA HTML/JSON), Alpine secdb; Trivy DB export.
  3. Attestation handoff: integrate with Signer/Attestor (optional).
  4. Scale & diagnostics: provider dashboards, staleness alerts, export cache reuse.
  5. Offline kit: end-to-end verified bundles for air-gap.