consolidation of some of the modules, localization fixes, product advisories work, qa work

This commit is contained in:
master
2026-03-05 03:54:22 +02:00
parent 7bafcc3eef
commit 8e1cb9448d
3878 changed files with 72600 additions and 46861 deletions

View File

@@ -0,0 +1,232 @@
# VEX Observation Model (`vex_observations`)
> Authored 2025-11-14 for Sprint 120 (`EXCITITOR-LNM-21-001`). This document is the canonical schema description for Excititor's immutable observation records. It unblocks downstream documentation tasks (`DOCS-LNM-22-002`) and aligns the WebService/Worker data structures with PostgreSQL persistence.
Excititor ingests heterogeneous VEX statements, normalizes them under the Aggregation-Only Contract (AOC), and persists each normalized statement as a **VEX observation**. These observations are the source of truth for:
- Advisory AI citation APIs (`/v1/vex/observations/{vulnerabilityId}/{productKey}`)
- Graph/Vuln Explorer overlays (batch observation APIs)
- Evidence Locker + portable bundle manifests
- Policy Engine materialization and audit trails
All observation documents are immutable. New information creates a new observation record linked by `observationId`; supersedence happens through Graph/Lens layers, not by mutating this collection.
## Storage & routing
| Aspect | Value |
| --- | --- |
| Table | `vex_observations` (PostgreSQL) |
| Upstream generator | `VexObservationProjectionService` (WebService) and Worker normalization pipeline |
| Primary key | `{tenant, observationId}` |
| Required indexes | `{tenant, vulnerabilityId}`, `{tenant, productKey}`, `{tenant, document.digest}`, `{tenant, providerId, status}` |
| Source of truth for | `/v1/vex/observations`, Graph batch APIs, Excititor → Evidence Locker replication |
## Canonical document shape
```jsonc
{
"tenant": "default",
"observationId": "vex:obs:sha256:...",
"vulnerabilityId": "CVE-2024-12345",
"productKey": "pkg:maven/org.example/app@1.2.3",
"providerId": "ubuntu-csaf",
"status": "affected", // matches VexClaimStatus enum
"justification": {
"type": "component_not_present",
"reason": "Package not shipped in this profile",
"detail": "Binary not in base image"
},
"detail": "Free-form vendor detail",
"confidence": {
"score": 0.9,
"level": "high",
"method": "vendor"
},
"signals": {
"severity": {
"scheme": "cvss3.1",
"score": 7.8,
"label": "High",
"vector": "CVSS:3.1/..."
},
"kev": true,
"epss": 0.77
},
"scope": {
"key": "pkg:deb/ubuntu/apache2@2.4.58-1",
"purls": [
"pkg:deb/ubuntu/apache2@2.4.58-1",
"pkg:docker/example/app@sha256:..."
],
"cpes": ["cpe:2.3:a:apache:http_server:2.4.58:*:*:*:*:*:*:*"]
},
"anchors": [
"#/statements/0/justification",
"#/statements/0/detail"
],
"document": {
"format": "csaf",
"digest": "sha256:abc123...",
"revision": "2024-10-22T09:00:00Z",
"sourceUri": "https://ubuntu.com/security/notices/USN-0000-1",
"signature": {
"type": "cosign",
"issuer": "https://token.actions.githubusercontent.com",
"keyId": "ubuntu-vex-prod",
"verifiedAt": "2024-10-22T09:01:00Z",
"transparencyLogReference": "rekor://UUID",
"trust": {
"tenantId": "default",
"issuerId": "ubuntu",
"effectiveWeight": 0.9,
"tenantOverrideApplied": false,
"retrievedAtUtc": "2024-10-22T09:00:30Z"
}
}
},
"aoc": {
"guardVersion": "2024.10.0",
"violations": [], // non-empty -> stored + surfaced
"ingestedAt": "2024-10-22T09:00:05Z",
"retrievedAt": "2024-10-22T08:59:59Z"
},
"metadata": {
"provider-hint": "Mainline feed",
"source-channel": "mirror"
}
}
```
### Field notes
- **`tenant`** logical tenant resolved by WebService based on headers or default configuration.
- **`observationId`** deterministic hash (sha256) over `{tenant, vulnerabilityId, productKey, providerId, statementDigest}`. Never reused.
- **`status` + `justification`** follow the OpenVEX semantics enforced by `StellaOps.Excititor.Core.VexClaim`.
- **`scope`** includes canonical `key` plus normalized PURLs/CPES; deterministic ordering.
- **`anchors`** optional JSON-pointer hints pointing to the source document sections; stored as trimmed strings.
- **`document.signature`** mirrors `VexSignatureMetadata`; empty if upstream feed lacks signatures.
- **`aoc.violations`** stored if the guard detected non-fatal issues; fatal issues never create an observation.
- **`metadata`** reserved for deterministic provider hints; keys follow `vex.*` prefix guidance.
## Determinism & AOC guarantees
1. **Write-once** once inserted, observation documents never change. New evidence creates a new `observationId`.
2. **Sorted collections** arrays (`anchors`, `purls`, `cpes`) are sorted lexicographically before persistence.
3. **Guard metadata** `aoc.guardVersion` records the guard library version (`docs/aoc/guard-library.md`), enabling audits.
4. **Signatures** only verification metadata proven by the Worker is stored; WebService never recomputes trust.
5. **Time normalization** all timestamps stored as UTC ISO-8601 strings (PostgreSQL `timestamptz`).
## API mapping
| API | Source fields | Notes |
| --- | --- | --- |
| `GET /vex/observations` | `tenant`, `vulnerabilityId`, `productKey`, `providerId` | List observations with filters. Implemented in `ObservationEndpoints.cs`. |
| `GET /vex/observations/{observationId}` | `tenant`, `observationId` | Get single observation by ID with full detail. |
| `GET /vex/observations/count` | `tenant` | Count all observations for tenant. |
| `/v1/vex/observations/{vuln}/{product}` | `tenant`, `vulnerabilityId`, `productKey`, `scope`, `statements[]` | Response uses `VexObservationProjectionService` to render `statements`, `document`, and `signature` fields. |
| `/vex/aoc/verify` | `document.digest`, `providerId`, `aoc` | Replays guard validation for recent digests; guard violations here align with `aoc.violations`. |
| Evidence batch API (Graph) | `statements[]`, `scope`, `signals`, `anchors` | Format optimized for overlays; reduces `document` to digest/URI. |
## Related work
- `EXCITITOR-GRAPH-24-*` relies on this schema to build overlays.
- `DOCS-LNM-22-002` (Link-Not-Merge documentation) references this file.
- `EXCITITOR-ATTEST-73-*` uses `document.digest` + `signature` to embed provenance in attestation payloads.
---
## Rekor Transparency Log Linkage
**Sprint Reference**: `SPRINT_20260117_002_EXCITITOR_vex_rekor_linkage`
VEX observations can be attested to the Sigstore Rekor transparency log, providing an immutable, publicly verifiable record of when each observation was recorded. This supports:
- **Auditability**: Independent verification that an observation existed at a specific time
- **Non-repudiation**: Cryptographic proof of observation provenance
- **Supply chain compliance**: Evidence for regulatory and security requirements
- **Offline verification**: Stored inclusion proofs enable air-gapped verification
### Rekor Linkage Fields
The following fields are added to `vex_observations` when an observation is attested:
| Field | Type | Description |
|-------|------|-------------|
| `rekor_uuid` | TEXT | Rekor entry UUID (64-char hex) |
| `rekor_log_index` | BIGINT | Monotonically increasing log position |
| `rekor_integrated_time` | TIMESTAMPTZ | When entry was integrated into log |
| `rekor_log_url` | TEXT | Rekor server URL where submitted |
| `rekor_inclusion_proof` | JSONB | RFC 6962 inclusion proof for offline verification |
| `rekor_linked_at` | TIMESTAMPTZ | When linkage was recorded locally |
### Schema Extension
```sql
-- V20260117__vex_rekor_linkage.sql
ALTER TABLE excititor.vex_observations
ADD COLUMN IF NOT EXISTS rekor_uuid TEXT,
ADD COLUMN IF NOT EXISTS rekor_log_index BIGINT,
ADD COLUMN IF NOT EXISTS rekor_integrated_time TIMESTAMPTZ,
ADD COLUMN IF NOT EXISTS rekor_log_url TEXT,
ADD COLUMN IF NOT EXISTS rekor_inclusion_proof JSONB,
ADD COLUMN IF NOT EXISTS rekor_linked_at TIMESTAMPTZ;
-- Indexes for Rekor queries
CREATE INDEX idx_vex_observations_rekor_uuid
ON excititor.vex_observations(rekor_uuid)
WHERE rekor_uuid IS NOT NULL;
CREATE INDEX idx_vex_observations_pending_rekor
ON excititor.vex_observations(created_at)
WHERE rekor_uuid IS NULL;
```
### API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/attestations/rekor/observations/{id}` | POST | Attest observation to Rekor |
| `/attestations/rekor/observations/batch` | POST | Batch attestation |
| `/attestations/rekor/observations/{id}/verify` | GET | Verify Rekor linkage |
| `/attestations/rekor/pending` | GET | List observations pending attestation |
### CLI Commands
```bash
# Show observation with Rekor details
stella vex observation show <id> --show-rekor
# Attest an observation to Rekor
stella vex observation attest <id> [--rekor-url URL]
# Verify Rekor linkage
stella vex observation verify-rekor <id> [--offline]
# List pending attestations
stella vex observation list-pending
```
### Inclusion Proof Structure
```jsonc
{
"treeSize": 1234567,
"rootHash": "base64-encoded-root-hash",
"logIndex": 12345,
"hashes": [
"base64-hash-1",
"base64-hash-2",
"base64-hash-3"
]
}
```
### Verification Modes
| Mode | Network | Use Case |
|------|---------|----------|
| Online | Required | Full verification against live Rekor |
| Offline | Not required | Verify using stored inclusion proof |
Offline mode uses the stored `rekor_inclusion_proof` to verify the Merkle path locally. This is essential for air-gapped environments.