Align AOC tasks for Excititor and Concelier
# FEEDCONN-CERTCC-02-009 – VINCE Detail & Map Reintegration Plan
- **Author:** BE-Conn-CERTCC (current on-call)
- **Date:** 2025-10-11
- **Scope:** Restore VINCE detail parsing and canonical mapping in Concelier without destabilising downstream Merge/Export pipelines.

## 1. Current State Snapshot (2025-10-11)

- ✅ Fetch pipeline, VINCE summary planner, and detail queue are live; documents land with `DocumentStatuses.PendingParse`.
- ✅ DTO aggregate (`CertCcNoteDto`) plus mapper emit vendor-centric `normalizedVersions` (`scheme=certcc.vendor`) and provenance aligned with `src/Concelier/__Libraries/StellaOps.Concelier.Models/PROVENANCE_GUIDELINES.md`.
- ✅ Regression coverage exists for fetch/parse/map flows (`CertCcConnectorSnapshotTests`), but snapshot regeneration is gated on harness refresh (FEEDCONN-CERTCC-02-007) and QA handoff (FEEDCONN-CERTCC-02-008).
- ⚠️ Parse/map jobs are not scheduled; production still operates in fetch-only mode.
- ⚠️ Downstream Merge team is finalising normalized range ingestion per `src/FASTER_MODELING_AND_NORMALIZATION.md`; we must avoid publishing canonical records until they certify compatibility.

## 2. Required Dependencies & Coordinated Tasks

| Dependency | Owner(s) | Blocking Condition | Handshake |
|------------|----------|--------------------|-----------|
| FEEDCONN-CERTCC-02-004 (Canonical mapping & range primitives hardening) | BE-Conn-CERTCC + Models | Ensure mapper emits deterministic `normalizedVersions` array and provenance field masks | Daily sync with Models/Merge leads; share fixture diff before each enablement phase |
| FEEDCONN-CERTCC-02-007 (Connector test harness remediation) | BE-Conn-CERTCC, QA | Restore `AddSourceCommon` harness + canned VINCE fixtures so we can shadow-run parse/map | Required before Phase 1 |
| FEEDCONN-CERTCC-02-008 (Snapshot coverage handoff) | QA | Snapshot refresh process green to surface regressions | Required before Phase 2 |
| FEEDCONN-CERTCC-02-010 (Partial-detail graceful degradation) | BE-Conn-CERTCC | Resiliency for missing VINCE endpoints to avoid job wedging after reintegration | Should land before Phase 2 cutover |
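
The Handshake column implies an ordering that can be checked mechanically before each phase. The following is a minimal sketch, assuming the two "Required before" rows are hard gates and the "Should land before" row is only a warning. The `HARD_GATES` and `SOFT_GATES` maps and the `phase_readiness` helper are hypothetical names for illustration and are not part of Concelier. FEEDCONN-CERTCC-02-004 is left out because its handshake is an ongoing sync, not a completion gate.

```python
# Hypothetical gate maps derived from the Handshake column of the table above.
# Keys are rollout phase numbers; values are task IDs that must be complete.
HARD_GATES = {
    1: {"FEEDCONN-CERTCC-02-007"},  # "Required before Phase 1"
    2: {"FEEDCONN-CERTCC-02-008"},  # "Required before Phase 2"
}
SOFT_GATES = {
    2: {"FEEDCONN-CERTCC-02-010"},  # "Should land before Phase 2 cutover"
}


def _cumulative(gates, phase):
    """Collect every task gated at or before the given phase."""
    required = set()
    for gate_phase, tasks in gates.items():
        if gate_phase <= phase:
            required |= tasks
    return required


def phase_readiness(phase, completed):
    """Return (blockers, warnings) as sorted lists of outstanding task IDs."""
    done = set(completed)
    blockers = sorted(_cumulative(HARD_GATES, phase) - done)
    warnings = sorted(_cumulative(SOFT_GATES, phase) - done)
    return blockers, warnings
```

For example, `phase_readiness(2, ["FEEDCONN-CERTCC-02-007"])` reports FEEDCONN-CERTCC-02-008 as a blocker and FEEDCONN-CERTCC-02-010 as a warning.
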

## 3. Phased Rollout Plan

| Phase | Window (UTC) | Actions | Success Signals | Rollback |
|-------|--------------|---------|-----------------|----------|
| **0 – Pre-flight validation** | 2025-10-11 → 2025-10-12 | • Finish FEEDCONN-CERTCC-02-007 harness fixes and regenerate fixtures.<br>• Run `dotnet test src/Concelier/__Tests/StellaOps.Concelier.Connector.CertCc.Tests` with `UPDATE_CERTCC_FIXTURES=0` to confirm deterministic baselines.<br>• Generate sample advisory batch (`dotnet test … --filter SnapshotSmoke`) and deliver JSON diff to Merge for schema verification (`normalizedVersions[].scheme == certcc.vendor`, provenance masks populated). | • Harness tests green locally and in CI.<br>• Merge sign-off that sample advisories conform to `FASTER_MODELING_AND_NORMALIZATION.md`. | N/A (no production enablement yet). |
| **1 – Shadow parse/map in staging** | Target start 2025-10-13 | • Register `source:cert-cc:parse` and `source:cert-cc:map` jobs, but gate them behind new config flag `concelier:sources:cert-cc:enableDetailMapping` (default `false`).<br>• Deploy (restart required for options rebinding), enable flag, and point connector at staging Mongo with isolated collection (`advisories_certcc_shadow`).<br>• Run connector for ≥2 cycles; compare advisory counts vs. fetch-only baseline and validate `concelier.range.primitives` metrics include `scheme=certcc.vendor`. | • No uncaught exceptions in staging logs.<br>• Shadow advisories match expected vendor counts (±5%).<br>• `certcc.summary.fetch.*` + new `certcc.map.duration.ms` metrics stable. | Disable flag; staging returns to fetch-only. No production impact. |
| **2 – Controlled production enablement** | Target start 2025-10-14 | • Redeploy production with flag enabled, start with job concurrency `1`, and reduce `MaxNotesPerFetch` to 5 for first 24 h.<br>• Observe metrics dashboards hourly (fetch/map latency, pending queues, Mongo write throughput).<br>• QA to replay latest snapshots and confirm no deterministic drift.<br>• Publish advisory sample (top 10 changed docs) to Merge Slack channel for validation. | • Pending parse/mapping queues drain within expected SLA (<30 min).<br>• No increase in merge dedupe anomalies.<br>• Mongo writes stay within 10% of baseline. | Toggle flag off, re-run fetch-only. Clear `pendingMappings` via connector cursor reset if stuck. |
| **3 – Full production & cleanup** | Target start 2025-10-15 | • Restore `MaxNotesPerFetch` to configured default (20).<br>• Remove temporary throttles and leave flag enabled by default.<br>• Update `README.md` rollout notes; close FEEDCONN-CERTCC-02-009.<br>• Kick off post-merge audit with Merge to ensure new advisories dedupe with other sources. | • Stable operations for ≥48 h, no degradation alerts.<br>• Merge confirms conflict resolver behaviour unchanged. | If regression detected, revert to Phase 2 state or disable jobs; retain plan for reuse. |
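
The numeric success signals for Phases 1 and 2 can be expressed as simple threshold checks. This sketch takes its thresholds from the table above (±5% on shadow counts, a drain time under 30 minutes, Mongo writes within 10% of baseline). It assumes "within 10%" means a symmetric band around the baseline. The function names and the way the inputs are gathered are hypothetical; the real values would come from the dashboards.

```python
def within_tolerance(observed, baseline, tolerance):
    """True when observed deviates from baseline by at most `tolerance` (a fraction)."""
    if baseline == 0:
        # No meaningful relative band around zero; require an exact match.
        return observed == 0
    return abs(observed - baseline) <= tolerance * abs(baseline)


def phase1_counts_ok(shadow_count, expected_count):
    """Phase 1: shadow advisories match expected vendor counts (±5%)."""
    return within_tolerance(shadow_count, expected_count, 0.05)


def phase2_signals_ok(queue_drain_minutes, mongo_writes, mongo_baseline):
    """Phase 2: queues drain in under 30 min and Mongo writes stay within 10%."""
    drains_in_time = queue_drain_minutes < 30
    writes_in_band = within_tolerance(mongo_writes, mongo_baseline, 0.10)
    return drains_in_time and writes_in_band
```

A failing check at either phase corresponds to the Rollback column for that row: disable the flag and return to fetch-only.
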

## 4. Monitoring & Validation Checklist

- Dashboards: `certcc.*` meters (plan, summary fetch, detail fetch) plus `concelier.range.primitives` with tag `scheme=certcc.vendor`.
- Logs: ensure Parse/Map jobs emit `correlationId` aligned with fetch events for traceability.
- Data QA: run `src/Tools/dump_advisory` against two VINCE notes (one multi-vendor, one single-vendor) in every phase to spot-check `normalizedVersions` ordering and provenance.
- Storage: verify Mongo TTL/size for `raw_documents` and `dtos`—detail payload volume increases by ~3× when mapping resumes.
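
The Data QA spot-check can be partly automated once the advisory dumps are available as JSON. The sketch below assumes each `normalizedVersions` entry is a JSON object with a `scheme` key and that two dumps of the same advisory should be identical apart from object key order. The actual output shape of `dump_advisory` is not specified here, so the helper names and the input structure are assumptions.

```python
import json

EXPECTED_SCHEME = "certcc.vendor"


def scheme_violations(normalized_versions):
    """Return the indexes of entries whose scheme is not certcc.vendor."""
    return [
        index
        for index, entry in enumerate(normalized_versions)
        if entry.get("scheme") != EXPECTED_SCHEME
    ]


def _canonical(document):
    # Sort object keys so only values and list order affect the comparison.
    return json.dumps(document, sort_keys=True, separators=(",", ":"))


def dumps_match(first_dump, second_dump):
    """True when two dumps agree; list order is significant, key order is not."""
    return _canonical(first_dump) == _canonical(second_dump)
```

Running `dumps_match` on dumps taken in consecutive phases flags reordered `normalizedVersions` arrays, which is the kind of deterministic drift the checklist is meant to catch.
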

## 5. Rollback / Contingency Playbook

1. Disable `concelier:sources:cert-cc:enableDetailMapping` flag (and optionally set `MaxNotesPerFetch=0` for a single cycle) to halt new detail ingestion.
2. Run connector once to update cursor; verify `pendingMappings` drains.
3. If advisories already persisted, coordinate with Merge to soft-delete affected `certcc/*` advisories by advisory key hash (no schema rollback required).
4. Re-run Phase 1 shadow validation before retrying.
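
The playbook above can be captured as an ordered checklist so that the conditional step is not missed under pressure. This is a sketch only: `rollback_plan` is a hypothetical helper, the override keys are written as they appear in this plan (the full configuration path for `MaxNotesPerFetch` is not given here), and nothing in it touches a real deployment.

```python
DETAIL_MAPPING_FLAG = "concelier:sources:cert-cc:enableDetailMapping"


def rollback_plan(advisories_persisted, pause_detail_fetch=False):
    """Return (overrides, steps) describing the rollback in playbook order."""
    # Step 1: disable the flag; zeroing the batch size is optional.
    overrides = {DETAIL_MAPPING_FLAG: False}
    if pause_detail_fetch:
        overrides["MaxNotesPerFetch"] = 0  # intended for a single cycle only

    steps = [
        f"apply overrides: {overrides}",
        "run connector once; verify pendingMappings drains",
    ]
    # Step 3 applies only when canonical advisories were already written.
    if advisories_persisted:
        steps.append(
            "coordinate with Merge: soft-delete certcc/* advisories by advisory key hash"
        )
    steps.append("re-run Phase 1 shadow validation before retrying")
    return overrides, steps
```

Calling `rollback_plan(advisories_persisted=True, pause_detail_fetch=True)` yields all four steps; with both arguments false, the soft-delete step is omitted.
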

## 6. Communication Cadence

- Daily check-in with Models/Merge leads (09:30 EDT, 13:30 UTC) to surface `normalizedVersions`/provenance diffs.
- Post-phase reports in `#concelier-certcc` Slack channel summarising metrics, advisory counts, and outstanding issues.
- Escalate blockers >12 h via Runbook SEV-3 path and annotate `TASKS.md`.

## 7. Open Questions / Next Actions

- [ ] Confirm whether Merge requires additional provenance field masks before Phase 2 (waiting on feedback from 2025-10-11 sample).
- [ ] Decide if CSAF endpoint ingestion (optional) should piggyback on Phase 3 or stay deferred.
- [ ] Validate that FEEDCONN-CERTCC-02-010 coverage handles mixed 200/404 VINCE endpoints during partial outages.

Once Dependencies (Section 2) are cleared and Phase 3 completes, update `src/Concelier/StellaOps.Concelier.PluginBinaries/StellaOps.Concelier.Connector.CertCc/TASKS.md` and close FEEDCONN-CERTCC-02-009.
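
For the open question on mixed 200/404 responses, one way to make sure the FEEDCONN-CERTCC-02-010 tests cover every partial-outage combination is to enumerate the status matrix. This is a sketch under assumptions: the endpoint labels passed in are placeholders chosen by the caller, not confirmed VINCE route names, and `mixed_status_cases` is a hypothetical helper.

```python
from itertools import product


def mixed_status_cases(endpoints, statuses=(200, 404)):
    """Enumerate endpoint-to-status maps containing at least two distinct statuses."""
    cases = []
    for combo in product(statuses, repeat=len(endpoints)):
        # Skip uniform rows (all 200 or all 404); only mixed outcomes are wanted.
        if len(set(combo)) > 1:
            cases.append(dict(zip(endpoints, combo)))
    return cases
```

With three placeholder endpoints this produces six cases (2³ combinations minus the two uniform ones), each of which can drive one parameterised test.
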