docs consoliation work

This commit is contained in:
StellaOps Bot
2025-12-24 14:19:46 +02:00
parent 40362de568
commit 5540ce9430
58 changed files with 2671 additions and 1751 deletions

View File

@@ -1,14 +1,14 @@
# StellaOps Concelier
Concelier ingests signed advisories from dozens of sources and converts them into immutable observations plus linksets under the Aggregation-Only Contract (AOC).
## Responsibilities
# StellaOps Concelier
Concelier ingests signed advisories from dozens of sources and converts them into immutable observations plus linksets under the Aggregation-Only Contract (AOC).
## Responsibilities
- Fetch and normalise vulnerability advisories via restart-time connectors.
- Persist observations and correlation linksets without precedence decisions.
- Emit deterministic exports (JSON, Trivy DB) for downstream policy evaluation.
- Coordinate offline/air-gap updates via Offline Kit bundles.
- Serve paragraph-anchored advisory chunks for Advisory AI consumers without breaking the Aggregation-Only Contract.
## Key components
- `StellaOps.Concelier.WebService` orchestration host.
- Connector libraries under `StellaOps.Concelier.Connector.*`.
@@ -16,12 +16,12 @@ Concelier ingests signed advisories from dozens of sources and converts them int
## Recent updates
- **2025-11-07:** Paragraph-anchored `/advisories/{advisoryKey}/chunks` endpoint shipped for Advisory AI paragraph retrieval. Details and rollout notes live in [`../../updates/2025-11-07-concelier-advisory-chunks.md`](../../updates/2025-11-07-concelier-advisory-chunks.md).
## Integrations & dependencies
- MongoDB for canonical observations and schedules.
- Policy Engine / Export Center / CLI for evidence consumption.
- Notify and UI for advisory deltas.
## Integrations & dependencies
- PostgreSQL (schema `vuln`) for canonical observations and schedules.
- Policy Engine / Export Center / CLI for evidence consumption.
- Notify and UI for advisory deltas.
## Operational notes
- Connector runbooks in ./operations/connectors/.
- Mirror operations for Offline Kit parity.
@@ -38,6 +38,6 @@ Concelier ingests signed advisories from dozens of sources and converts them int
- DOCS-LNM-22-001, DOCS-LNM-22-007 in ../../TASKS.md.
- Connector-specific TODOs in `src/Concelier/**/TASKS.md`.
## Epic alignment
- **Epic 1 AOC enforcement:** uphold raw observation invariants, provenance requirements, linkset-only enrichment, and AOC verifier guardrails across every connector.
- **Epic 10 Export Center:** expose deterministic advisory exports and metadata required by JSON/Trivy/mirror bundles.
## Epic alignment
- **Epic 1 AOC enforcement:** uphold raw observation invariants, provenance requirements, linkset-only enrichment, and AOC verifier guardrails across every connector.
- **Epic 10 Export Center:** expose deterministic advisory exports and metadata required by JSON/Trivy/mirror bundles.

View File

@@ -45,7 +45,7 @@ Expect all logs at `Information`. Ensure OTEL exporters include the scope `Stell
- `eventId=1002` with `reason="equal_rank"` - indicates precedence table gaps; page merge owners.
- `eventId=1002` with `reason="mismatch"` - severity disagreement; open connector bug if sustained.
3. **Job health**
- `stella db merge` exit code `1` signifies unresolved conflicts. Pipe to automation that captures logs and notifies #concelier-ops.
- `stella db merge` exit code `1` signifies unresolved conflicts. Pipe to automation that captures logs and notifies #concelier-ops.
### Threshold updates (2025-10-12)
@@ -58,7 +58,7 @@ Expect all logs at `Information`. Ensure OTEL exporters include the scope `Stell
## 4. Triage Workflow
1. **Confirm job context**
- `stella db merge` (CLI) or `POST /jobs/merge:reconcile` (API) to rehydrate the merge job. Use `--verbose` to stream structured logs during triage.
- `stella db merge` (CLI) or `POST /jobs/merge:reconcile` (API) to rehydrate the merge job. Use `--verbose` to stream structured logs during triage.
2. **Inspect metrics**
- Correlate spikes in `concelier.merge.conflicts` with `primary_source`/`suppressed_source` tags from `concelier.merge.overrides`.
3. **Pull structured logs**
@@ -94,7 +94,7 @@ Expect all logs at `Information`. Ensure OTEL exporters include the scope `Stell
## 6. Resolution Playbook
1. **Connector data fix**
- Re-run the offending connector stages (`stella db fetch --source ghsa --stage map` etc.).
- Re-run the offending connector stages (`stella db fetch --source ghsa --stage map` etc.).
- Once fixed, rerun merge and verify `decisionReason` reflects `freshness` or `precedence` as expected.
2. **Temporary precedence override**
- Edit `etc/concelier.yaml`:
@@ -131,7 +131,7 @@ Expect all logs at `Information`. Ensure OTEL exporters include the scope `Stell
- Canonical conflict rules: `src/DEDUP_CONFLICTS_RESOLUTION_ALGO.md`.
- Merge engine internals: `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/AdvisoryPrecedenceMerger.cs`.
- Metrics definitions: `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/AdvisoryMergeService.cs` (identity conflicts) and `AdvisoryPrecedenceMerger`.
- Storage audit trail: `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/MergeEventWriter.cs`, `src/Concelier/__Libraries/StellaOps.Concelier.Storage.Mongo/MergeEvents`.
- Storage audit trail: `src/Concelier/__Libraries/StellaOps.Concelier.Merge/Services/MergeEventWriter.cs`, `src/Concelier/__Libraries/StellaOps.Concelier.Storage/MergeEvents`.
Keep this runbook synchronized with future sprint notes and update alert thresholds as baseline volumes change.

View File

@@ -25,13 +25,13 @@ concelier:
## 2. Staging Smoke Test
1. Deploy the configuration and restart the Concelier workers to ensure the Apple connector options are bound.
2. Trigger a full connector cycle:
- CLI: run `stella db fetch --source vndr-apple --stage fetch`, then `--stage parse`, then `--stage map`.
- REST: `POST /jobs/run { "kind": "source:vndr-apple:fetch", "chain": ["source:vndr-apple:parse", "source:vndr-apple:map"] }`
3. Validate metrics exported under meter `StellaOps.Concelier.Connector.Vndr.Apple`:
- `apple.fetch.items` (documents fetched)
- `apple.fetch.failures`
1. Deploy the configuration and restart the Concelier workers to ensure the Apple connector options are bound.
2. Trigger a full connector cycle:
- CLI: run `stella db fetch --source vndr-apple --stage fetch`, then `--stage parse`, then `--stage map`.
- REST: `POST /jobs/run { "kind": "source:vndr-apple:fetch", "chain": ["source:vndr-apple:parse", "source:vndr-apple:map"] }`
3. Validate metrics exported under meter `StellaOps.Concelier.Connector.Vndr.Apple`:
- `apple.fetch.items` (documents fetched)
- `apple.fetch.failures`
- `apple.fetch.unchanged`
- `apple.parse.failures`
- `apple.map.affected.count` (histogram of affected package counts)
@@ -42,10 +42,10 @@ concelier:
- `Apple software index fetch … processed=X newDocuments=Y`
- `Apple advisory parse complete … aliases=… affected=…`
- `Mapped Apple advisory … pendingMappings=0`
6. Confirm MongoDB state:
- `raw_documents` store contains the HT article HTML with metadata (`apple.articleId`, `apple.postingDate`).
- `dtos` store has `schemaVersion="apple.security.update.v1"`.
- `advisories` collection includes keys `HTxxxxxx` with normalized SemVer rules.
6. Confirm PostgreSQL state (schema `vuln`):
- `raw_documents` table contains the HT article HTML with metadata (`apple.articleId`, `apple.postingDate`).
- `dtos` table has `schemaVersion="apple.security.update.v1"`.
- `advisories` table includes keys `HTxxxxxx` with normalized SemVer rules.
- `source_states` entry for `apple` shows a recent `cursor.lastPosted`.
## 3. Production Monitoring

View File

@@ -96,7 +96,7 @@ curl -s -b cookies.txt \
Iterate `page` until the response `content` array is empty. Pages 09 currently cover 2014→present. Persist JSON responses (plus SHA256) for Offline Kit parity.
> **Shortcut** run `python src/Tools/certbund_offline_snapshot.py --output seed-data/cert-bund`
> **Shortcut** run `python src/Tools/certbund_offline_snapshot.py --output seed-data/cert-bund`
> to bootstrap the session, capture the paginated search responses, and regenerate
> the manifest/checksum files automatically. Supply `--cookie-file` and `--xsrf-token`
> if the portal requires a browser-derived session (see options via `--help`).
@@ -104,7 +104,7 @@ Iterate `page` until the response `content` array is empty. Pages 09 currentl
### 3.3 Export bundles
```bash
python src/Tools/certbund_offline_snapshot.py \
python src/Tools/certbund_offline_snapshot.py \
--output seed-data/cert-bund \
--start-year 2014 \
--end-year "$(date -u +%Y)"
@@ -124,7 +124,7 @@ operating offline.
### 3.4 Connector-driven catch-up
1. Temporarily raise `maxAdvisoriesPerFetch` (e.g. 150) and reduce `requestDelay`.
2. Run `stella db fetch --source cert-bund --stage fetch`, then `--stage parse`, then `--stage map` until the fetch log reports `enqueued=0`.
2. Run `stella db fetch --source cert-bund --stage fetch`, then `--stage parse`, then `--stage map` until the fetch log reports `enqueued=0`.
3. Restore defaults and capture the cursor snapshot for audit.
---
@@ -142,5 +142,5 @@ operating offline.
1. Observe `certbund.feed.fetch.success` and `certbund.detail.fetch.success` increments after runs; `certbund.feed.coverage.days` should hover near the observed RSS window.
2. Ensure summary logs report `truncated=false` in steady state—`true` indicates the fetch cap was hit.
3. During backfills, watch `certbund.feed.enqueued.count` trend to zero.
4. Spot-check stored advisories in Mongo to confirm `language="de"` and reference URLs match the portal detail endpoint.
4. Spot-check stored advisories in PostgreSQL (schema `vuln`) to confirm `language="de"` and reference URLs match the portal detail endpoint.
5. For Offline Kit exports, validate SHA256 hashes before distribution.

View File

@@ -34,7 +34,7 @@ concelier:
1. Deploy the updated configuration and restart the Concelier service so the connector picks up the credentials.
2. Trigger one end-to-end cycle:
- Concelier CLI: run `stella db fetch --source cve --stage fetch`, then `--stage parse`, then `--stage map`.
- Concelier CLI: run `stella db fetch --source cve --stage fetch`, then `--stage parse`, then `--stage map`.
- REST fallback: `POST /jobs/run { "kind": "source:cve:fetch", "chain": ["source:cve:parse", "source:cve:map"] }`
3. Observe the following metrics (exported via OTEL meter `StellaOps.Concelier.Connector.Cve`):
- `cve.fetch.attempts`, `cve.fetch.success`, `cve.fetch.documents`, `cve.fetch.failures`, `cve.fetch.unchanged`
@@ -42,7 +42,7 @@ concelier:
- `cve.map.success`
4. Verify Prometheus shows matching `concelier.source.http.requests_total{concelier_source="cve"}` deltas (list vs detail phases) while `concelier.source.http.failures_total{concelier_source="cve"}` stays flat.
5. Confirm the info-level summary log `CVEs fetch window … pages=X detailDocuments=Y detailFailures=Z` appears once per fetch run and shows `detailFailures=0`.
6. Verify the MongoDB advisory store contains fresh CVE advisories (`advisoryKey` prefix `cve/`) and that the source cursor (`source_states` collection) advanced.
6. Verify the PostgreSQL advisory store (schema `vuln`) contains fresh CVE advisories (`advisoryKey` prefix `cve/`) and that the source cursor (`source_states` table) advanced.
### 1.3 Production Monitoring
@@ -107,7 +107,7 @@ Treat repeated schema failures or growing anomaly counts as an upstream regressi
1. Deploy the configuration and restart Concelier.
2. Trigger a pipeline run:
- CLI: run `stella db fetch --source kev --stage fetch`, then `--stage parse`, then `--stage map`.
- CLI: run `stella db fetch --source kev --stage fetch`, then `--stage parse`, then `--stage map`.
- REST: `POST /jobs/run { "kind": "source:kev:fetch", "chain": ["source:kev:parse", "source:kev:map"] }`
3. Verify the metrics exposed by meter `StellaOps.Concelier.Connector.Kev`:
- `kev.fetch.attempts`, `kev.fetch.success`, `kev.fetch.unchanged`, `kev.fetch.failures`
@@ -115,7 +115,7 @@ Treat repeated schema failures or growing anomaly counts as an upstream regressi
- `kev.map.advisories` (tag `catalogVersion`)
4. Confirm `concelier.source.http.requests_total{concelier_source="kev"}` increments once per fetch and that the paired `concelier.source.http.failures_total` stays flat (zero increase).
5. Inspect the info logs `Fetched KEV catalog document … pendingDocuments=…` and `Parsed KEV catalog document … entries=…`—they should appear exactly once per run and `Mapped X/Y… skipped=0` should match the `kev.map.advisories` delta.
6. Confirm MongoDB documents exist for the catalog JSON (`raw_documents` & `dtos`) and that advisories with prefix `kev/` are written.
6. Confirm PostgreSQL records exist for the catalog JSON (`raw_documents` and `dtos` tables in schema `vuln`) and that advisories with prefix `kev/` are written.
### 2.4 Production Monitoring

View File

@@ -25,7 +25,7 @@ concelier:
1. Restart the Concelier workers so the KISA options bind.
2. Run a full connector cycle:
- CLI: run `stella db fetch --source kisa --stage fetch`, then `--stage parse`, then `--stage map`.
- CLI: run `stella db fetch --source kisa --stage fetch`, then `--stage parse`, then `--stage map`.
- REST: `POST /jobs/run { "kind": "source:kisa:fetch", "chain": ["source:kisa:parse", "source:kisa:map"] }`
3. Confirm telemetry (Meter `StellaOps.Concelier.Connector.Kisa`):
- `kisa.feed.success`, `kisa.feed.items`
@@ -38,10 +38,10 @@ concelier:
- `KISA fetched detail for {Idx} … category={Category}`
- `KISA mapped advisory {AdvisoryId} (severity={Severity})`
- Absence of warnings such as `document missing GridFS payload`.
5. Validate MongoDB state:
- `raw_documents.metadata` has `kisa.idx`, `kisa.category`, `kisa.title`.
- DTO store contains `schemaVersion="kisa.detail.v1"`.
- Advisories include aliases (`IDX`, CVE) and `language="ko"`.
5. Validate PostgreSQL state (schema `vuln`):
- `raw_documents` table metadata has `kisa.idx`, `kisa.category`, `kisa.title`.
- `dtos` table contains `schemaVersion="kisa.detail.v1"`.
- `advisories` table includes aliases (`IDX`, CVE) and `language="ko"`.
- `source_states` entry for `kisa` shows recent `cursor.lastFetchAt`.
## 3. Production Monitoring

View File

@@ -41,7 +41,7 @@ concelier:
### 4.1 State seeding helper
Use `src/Tools/SourceStateSeeder` to queue historical advisories (detail JSON + optional CVRF artefacts) for replay without manual Mongo edits. Example seed file:
Use `src/Tools/SourceStateSeeder` to queue historical advisories (detail JSON + optional CVRF artefacts) for replay without manual Mongo edits. Example seed file:
```json
{
@@ -71,9 +71,9 @@ Use `src/Tools/SourceStateSeeder` to queue historical advisories (detail JSON +
Run the helper:
```bash
dotnet run --project src/Tools/SourceStateSeeder -- \
--connection-string "mongodb://localhost:27017" \
--database concelier \
dotnet run --project src/Tools/SourceStateSeeder -- \
--connection-string "Host=localhost;Database=stellaops;Username=stella;Password=..." \
--schema vuln \
--input seeds/msrc-backfill.json
```

View File

@@ -1,19 +1,19 @@
# Observation Event Transport (advisory.observation.updated@1)
Purpose: document how to emit `advisory.observation.updated@1` events via Mongo outbox with optional NATS JetStream transport.
Purpose: document how to emit `advisory.observation.updated@1` events via PostgreSQL outbox with optional NATS JetStream transport.
## Configuration (appsettings.yaml / config)
```yaml
advisoryObservationEvents:
enabled: false # set true to publish beyond Mongo outbox
transport: "mongo" # "mongo" (no-op publisher) or "nats"
enabled: false # set true to publish beyond PostgreSQL outbox
transport: "postgres" # "postgres" (no-op publisher) or "nats"
natsUrl: "nats://127.0.0.1:4222"
subject: "concelier.advisory.observation.updated.v1"
deadLetterSubject: "concelier.advisory.observation.updated.dead.v1"
stream: "CONCELIER_OBS"
```
Defaults: disabled, transport `mongo`; subject/stream as above.
Defaults: disabled, transport `postgres`; subject/stream as above.
## Flow
1) Observation sink writes event to `advisory_observation_events` (idempotent on `observationHash`).
@@ -24,7 +24,7 @@ Defaults: disabled, transport `mongo`; subject/stream as above.
- Ensure NATS JetStream is reachable before enabling `transport: nats` to avoid retry noise.
- Stream is auto-created if missing with current subject; size capped at 512 KiB per message.
- Dead-letter subject reserved; not yet wired—keep for future schema validation failures.
- Backlog monitoring: count documents in `advisory_observation_events` with `publishedAt: null`.
- Backlog monitoring: count rows in `advisory_observation_events` table with `published_at IS NULL`.
## Testing
- Without NATS: leave `enabled=false`; app continues writing outbox only.