Align AOC tasks for Excititor and Concelier

This commit is contained in:
master
2025-10-31 18:50:15 +02:00
committed by root
parent 9e6d9fbae8
commit 8da4e12a90
334 changed files with 35528 additions and 34546 deletions

View File

@@ -1,59 +1,59 @@
# FEEDCONN-CERTCC-02-009 VINCE Detail & Map Reintegration Plan
- **Author:** BE-Conn-CERTCC (current on-call)
- **Date:** 2025-10-11
- **Scope:** Restore VINCE detail parsing and canonical mapping in Concelier without destabilising downstream Merge/Export pipelines.
## 1. Current State Snapshot (2025-10-11)
- ✅ Fetch pipeline, VINCE summary planner, and detail queue are live; documents land with `DocumentStatuses.PendingParse`.
- ✅ DTO aggregate (`CertCcNoteDto`) plus mapper emit vendor-centric `normalizedVersions` (`scheme=certcc.vendor`) and provenance aligned with `src/Concelier/__Libraries/StellaOps.Concelier.Models/PROVENANCE_GUIDELINES.md`.
- ✅ Regression coverage exists for fetch/parse/map flows (`CertCcConnectorSnapshotTests`), but snapshot regeneration is gated on harness refresh (FEEDCONN-CERTCC-02-007) and QA handoff (FEEDCONN-CERTCC-02-008).
- ⚠️ Parse/map jobs are not scheduled; production still operates in fetch-only mode.
- ⚠️ Downstream Merge team is finalising normalized range ingestion per `src/FASTER_MODELING_AND_NORMALIZATION.md`; we must avoid publishing canonical records until they certify compatibility.
## 2. Required Dependencies & Coordinated Tasks
| Dependency | Owner(s) | Blocking Condition | Handshake |
|------------|----------|--------------------|-----------|
| FEEDCONN-CERTCC-02-004 (Canonical mapping & range primitives hardening) | BE-Conn-CERTCC + Models | Ensure mapper emits deterministic `normalizedVersions` array and provenance field masks | Daily sync with Models/Merge leads; share fixture diff before each enablement phase |
| FEEDCONN-CERTCC-02-007 (Connector test harness remediation) | BE-Conn-CERTCC, QA | Restore `AddSourceCommon` harness + canned VINCE fixtures so we can shadow-run parse/map | Required before Phase 1 |
| FEEDCONN-CERTCC-02-008 (Snapshot coverage handoff) | QA | Snapshot refresh process green to surface regressions | Required before Phase 2 |
| FEEDCONN-CERTCC-02-010 (Partial-detail graceful degradation) | BE-Conn-CERTCC | Resiliency for missing VINCE endpoints to avoid job wedging after reintegration | Should land before Phase 2 cutover |
## 3. Phased Rollout Plan
| Phase | Window (UTC) | Actions | Success Signals | Rollback |
|-------|--------------|---------|-----------------|----------|
| **0 Pre-flight validation** | 2025-10-11 → 2025-10-12 | • Finish FEEDCONN-CERTCC-02-007 harness fixes and regenerate fixtures.<br>• Run `dotnet test src/Concelier/__Tests/StellaOps.Concelier.Connector.CertCc.Tests` with `UPDATE_CERTCC_FIXTURES=0` to confirm deterministic baselines.<br>• Generate sample advisory batch (`dotnet test … --filter SnapshotSmoke`) and deliver JSON diff to Merge for schema verification (`normalizedVersions[].scheme == certcc.vendor`, provenance masks populated). | • Harness tests green locally and in CI.<br>• Merge sign-off that sample advisories conform to `FASTER_MODELING_AND_NORMALIZATION.md`. | N/A (no production enablement yet). |
| **1 Shadow parse/map in staging** | Target start 2025-10-13 | • Register `source:cert-cc:parse` and `source:cert-cc:map` jobs, but gate them behind new config flag `concelier:sources:cert-cc:enableDetailMapping` (default `false`).<br>• Deploy (restart required for options rebinding), enable flag, and point connector at staging Mongo with isolated collection (`advisories_certcc_shadow`).<br>• Run connector for ≥2 cycles; compare advisory counts vs. fetch-only baseline and validate `concelier.range.primitives` metrics include `scheme=certcc.vendor`. | • No uncaught exceptions in staging logs.<br>• Shadow advisories match expected vendor counts (±5%).<br>`certcc.summary.fetch.*` + new `certcc.map.duration.ms` metrics stable. | Disable flag; staging returns to fetch-only. No production impact. |
| **2 Controlled production enablement** | Target start 2025-10-14 | • Redeploy production with flag enabled, start with job concurrency `1`, and reduce `MaxNotesPerFetch` to 5 for first 24h.<br>• Observe metrics dashboards hourly (fetch/map latency, pending queues, Mongo write throughput).<br>• QA to replay latest snapshots and confirm no deterministic drift.<br>• Publish advisory sample (top 10 changed docs) to Merge Slack channel for validation. | • Pending parse/mapping queues drain within expected SLA (<30min).<br>• No increase in merge dedupe anomalies.<br>• Mongo writes stay within 10% of baseline. | Toggle flag off, re-run fetch-only. Clear `pendingMappings` via connector cursor reset if stuck. |
| **3 Full production & cleanup** | Target start 2025-10-15 | • Restore `MaxNotesPerFetch` to configured default (20).<br>• Remove temporary throttles and leave flag enabled by default.<br>• Update `README.md` rollout notes; close FEEDCONN-CERTCC-02-009.<br>• Kick off post-merge audit with Merge to ensure new advisories dedupe with other sources. | • Stable operations for ≥48h, no degradation alerts.<br>• Merge confirms conflict resolver behaviour unchanged. | If regression detected, revert to Phase2 state or disable jobs; retain plan for reuse. |
## 4. Monitoring & Validation Checklist
- Dashboards: `certcc.*` meters (plan, summary fetch, detail fetch) plus `concelier.range.primitives` with tag `scheme=certcc.vendor`.
- Logs: ensure Parse/Map jobs emit `correlationId` aligned with fetch events for traceability.
- Data QA: run `src/Tools/dump_advisory` against two VINCE notes (one multi-vendor, one single-vendor) every phase to spot-check normalized versions ordering and provenance.
- Storage: verify Mongo TTL/size for `raw_documents` and `dtos`—detail payload volume increases by ~3× when mapping resumes.
## 5. Rollback / Contingency Playbook
1. Disable `concelier:sources:cert-cc:enableDetailMapping` flag (and optionally set `MaxNotesPerFetch=0` for a single cycle) to halt new detail ingestion.
2. Run connector once to update cursor; verify `pendingMappings` drains.
3. If advisories already persisted, coordinate with Merge to soft-delete affected `certcc/*` advisories by advisory key hash (no schema rollback required).
4. Re-run Phase1 shadow validation before retrying.
## 6. Communication Cadence
- Daily check-in with Models/Merge leads (09:30 EDT) to surface normalizedVersions/provenance diffs.
- Post-phase reports in `#concelier-certcc` Slack channel summarising metrics, advisory counts, and outstanding issues.
- Escalate blockers >12h via Runbook SEV-3 path and annotate `TASKS.md`.
## 7. Open Questions / Next Actions
- [ ] Confirm whether Merge requires additional provenance field masks before Phase2 (waiting on feedback from 2025-10-11 sample).
- [ ] Decide if CSAF endpoint ingestion (optional) should piggyback on Phase3 or stay deferred.
- [ ] Validate that FEEDCONN-CERTCC-02-010 coverage handles mixed 200/404 VINCE endpoints during partial outages.
Once Dependencies (Section2) are cleared and Phase3 completes, update `src/Concelier/StellaOps.Concelier.PluginBinaries/StellaOps.Concelier.Connector.CertCc/TASKS.md` and close FEEDCONN-CERTCC-02-009.
# FEEDCONN-CERTCC-02-009 VINCE Detail & Map Reintegration Plan
- **Author:** BE-Conn-CERTCC (current on-call)
- **Date:** 2025-10-11
- **Scope:** Restore VINCE detail parsing and canonical mapping in Concelier without destabilising downstream Merge/Export pipelines.
## 1. Current State Snapshot (2025-10-11)
- ✅ Fetch pipeline, VINCE summary planner, and detail queue are live; documents land with `DocumentStatuses.PendingParse`.
- ✅ DTO aggregate (`CertCcNoteDto`) plus mapper emit vendor-centric `normalizedVersions` (`scheme=certcc.vendor`) and provenance aligned with `src/Concelier/__Libraries/StellaOps.Concelier.Models/PROVENANCE_GUIDELINES.md`.
- ✅ Regression coverage exists for fetch/parse/map flows (`CertCcConnectorSnapshotTests`), but snapshot regeneration is gated on harness refresh (FEEDCONN-CERTCC-02-007) and QA handoff (FEEDCONN-CERTCC-02-008).
- ⚠️ Parse/map jobs are not scheduled; production still operates in fetch-only mode.
- ⚠️ Downstream Merge team is finalising normalized range ingestion per `src/FASTER_MODELING_AND_NORMALIZATION.md`; we must avoid publishing canonical records until they certify compatibility.
## 2. Required Dependencies & Coordinated Tasks
| Dependency | Owner(s) | Blocking Condition | Handshake |
|------------|----------|--------------------|-----------|
| FEEDCONN-CERTCC-02-004 (Canonical mapping & range primitives hardening) | BE-Conn-CERTCC + Models | Ensure mapper emits deterministic `normalizedVersions` array and provenance field masks | Daily sync with Models/Merge leads; share fixture diff before each enablement phase |
| FEEDCONN-CERTCC-02-007 (Connector test harness remediation) | BE-Conn-CERTCC, QA | Restore `AddSourceCommon` harness + canned VINCE fixtures so we can shadow-run parse/map | Required before Phase 1 |
| FEEDCONN-CERTCC-02-008 (Snapshot coverage handoff) | QA | Snapshot refresh process green to surface regressions | Required before Phase 2 |
| FEEDCONN-CERTCC-02-010 (Partial-detail graceful degradation) | BE-Conn-CERTCC | Resiliency for missing VINCE endpoints to avoid job wedging after reintegration | Should land before Phase 2 cutover |
## 3. Phased Rollout Plan
| Phase | Window (UTC) | Actions | Success Signals | Rollback |
|-------|--------------|---------|-----------------|----------|
| **0 Pre-flight validation** | 2025-10-11 → 2025-10-12 | • Finish FEEDCONN-CERTCC-02-007 harness fixes and regenerate fixtures.<br>• Run `dotnet test src/Concelier/__Tests/StellaOps.Concelier.Connector.CertCc.Tests` with `UPDATE_CERTCC_FIXTURES=0` to confirm deterministic baselines.<br>• Generate sample advisory batch (`dotnet test … --filter SnapshotSmoke`) and deliver JSON diff to Merge for schema verification (`normalizedVersions[].scheme == certcc.vendor`, provenance masks populated). | • Harness tests green locally and in CI.<br>• Merge sign-off that sample advisories conform to `FASTER_MODELING_AND_NORMALIZATION.md`. | N/A (no production enablement yet). |
| **1 Shadow parse/map in staging** | Target start 2025-10-13 | • Register `source:cert-cc:parse` and `source:cert-cc:map` jobs, but gate them behind new config flag `concelier:sources:cert-cc:enableDetailMapping` (default `false`).<br>• Deploy (restart required for options rebinding), enable flag, and point connector at staging Mongo with isolated collection (`advisories_certcc_shadow`).<br>• Run connector for ≥2 cycles; compare advisory counts vs. fetch-only baseline and validate `concelier.range.primitives` metrics include `scheme=certcc.vendor`. | • No uncaught exceptions in staging logs.<br>• Shadow advisories match expected vendor counts (±5%).<br>`certcc.summary.fetch.*` + new `certcc.map.duration.ms` metrics stable. | Disable flag; staging returns to fetch-only. No production impact. |
| **2 Controlled production enablement** | Target start 2025-10-14 | • Redeploy production with flag enabled, start with job concurrency `1`, and reduce `MaxNotesPerFetch` to 5 for first 24h.<br>• Observe metrics dashboards hourly (fetch/map latency, pending queues, Mongo write throughput).<br>• QA to replay latest snapshots and confirm no deterministic drift.<br>• Publish advisory sample (top 10 changed docs) to Merge Slack channel for validation. | • Pending parse/mapping queues drain within expected SLA (<30min).<br>• No increase in merge dedupe anomalies.<br>• Mongo writes stay within 10% of baseline. | Toggle flag off, re-run fetch-only. Clear `pendingMappings` via connector cursor reset if stuck. |
| **3 Full production & cleanup** | Target start 2025-10-15 | • Restore `MaxNotesPerFetch` to configured default (20).<br>• Remove temporary throttles and leave flag enabled by default.<br>• Update `README.md` rollout notes; close FEEDCONN-CERTCC-02-009.<br>• Kick off post-merge audit with Merge to ensure new advisories dedupe with other sources. | • Stable operations for ≥48h, no degradation alerts.<br>• Merge confirms conflict resolver behaviour unchanged. | If regression detected, revert to Phase2 state or disable jobs; retain plan for reuse. |
## 4. Monitoring & Validation Checklist
- Dashboards: `certcc.*` meters (plan, summary fetch, detail fetch) plus `concelier.range.primitives` with tag `scheme=certcc.vendor`.
- Logs: ensure Parse/Map jobs emit `correlationId` aligned with fetch events for traceability.
- Data QA: run `src/Tools/dump_advisory` against two VINCE notes (one multi-vendor, one single-vendor) every phase to spot-check normalized versions ordering and provenance.
- Storage: verify Mongo TTL/size for `raw_documents` and `dtos`—detail payload volume increases by ~3× when mapping resumes.
## 5. Rollback / Contingency Playbook
1. Disable `concelier:sources:cert-cc:enableDetailMapping` flag (and optionally set `MaxNotesPerFetch=0` for a single cycle) to halt new detail ingestion.
2. Run connector once to update cursor; verify `pendingMappings` drains.
3. If advisories already persisted, coordinate with Merge to soft-delete affected `certcc/*` advisories by advisory key hash (no schema rollback required).
4. Re-run Phase1 shadow validation before retrying.
## 6. Communication Cadence
- Daily check-in with Models/Merge leads (09:30 EDT) to surface normalizedVersions/provenance diffs.
- Post-phase reports in `#concelier-certcc` Slack channel summarising metrics, advisory counts, and outstanding issues.
- Escalate blockers >12h via Runbook SEV-3 path and annotate `TASKS.md`.
## 7. Open Questions / Next Actions
- [ ] Confirm whether Merge requires additional provenance field masks before Phase2 (waiting on feedback from 2025-10-11 sample).
- [ ] Decide if CSAF endpoint ingestion (optional) should piggyback on Phase3 or stay deferred.
- [ ] Validate that FEEDCONN-CERTCC-02-010 coverage handles mixed 200/404 VINCE endpoints during partial outages.
Once Dependencies (Section2) are cleared and Phase3 completes, update `src/Concelier/StellaOps.Concelier.PluginBinaries/StellaOps.Concelier.Connector.CertCc/TASKS.md` and close FEEDCONN-CERTCC-02-009.