2025-10-05 00:35:42 +03:00
parent 6abb751ce8
commit b0c3fa10fb
3 changed files with 0 additions and 1084 deletions

@@ -1,718 +0,0 @@
# AGENTS.md — Feedser (StellaOps)
> YOU ARE: a senior C#/.NET 10 (Preview 7, SDK 10.0.100-preview.7.25380.108) engineer-agent tasked with building **Feedser**, a CLI that fetches, normalizes, reconciles, and packages *primary, non-aggregated* vulnerability intelligence into a single **feed-merge** database and exports a **Trivy-compatible** DB (OCI artifact) for offline/self-hosted scanning.
> MODE: factual, deterministic, test-first, with strict provenance and reproducibility.
> BOUNDARIES: implement architecture & code in this repo only. No secret exfiltration. Default offline.
---
## 0) StellaOps — condensed context
**StellaOps** is a container & infra security platform built for sovereign/offline operation. Key storylines: **ΔSBOM warm path**, nightly rechecks, policy-as-code, **signed artifacts**, optional **AI** remediation, and regional/air-gapped operation.
**Feedser** is foundational: it powers the scanner by producing a unified, deduped, explainable vulnerability database and a **self-hosted Trivy DB**.
**Sibling components (stable contracts, no code here):**
- **Scanner** (`stellaops.module.scanning`) — consumes Trivy-compatible DB → findings + SBOM digests.
- **Policy Engine**, **Signed Artifacts Service** (cosign), **AIRE** (AI suggestions), **SecretsScanner**, **MailDaemon**, **Offline Kit**, **RU/EEU adapters** (CryptoPro TLS, RU cert chains), **UI Shell**.
---
## 1) Problem statement
1) **Fetch** authoritative *primary* sources (global + regional + PSIRT + distro + CERTs + ICS).
2) **Parse & Normalize** to a **UnifiedVuln** model.
3) **Reconcile/Deduplicate** deterministically across sources with precedence rules.
4) **Persist** into **feed-merge DB** with both **bootstrap-from-scratch** and **incremental refresh**.
5) **Package & Publish**:
   - **Trivy DB (v2) OCI artifact** for scanners (`--db-repository`),
   - optional **vuln-list-shaped JSON** tree (to reuse `trivy-db` builder),
   - optional **signed offline bundle**.

Non-goals v0: building a new scanner or a custom Java DB; we only ensure Scanner can target our self-hosted DB.
---
## 2) High-level architecture
```
[Connectors] ──► [Source DTO validation] ──► [Normalizer → UnifiedVuln]
   CVE/NVD, GHSA/OSV, JVN, CERT/CC, CISA KEV, KISA, CERT-In, ANSSI (CERT-FR),
   BSI (CERT-Bund WID), ACSC, CCCS, RU: BDU + NKCKI, Vendor PSIRTs (MSRC, Cisco,
   Oracle CPU, Adobe APSB, Apple, Chromium, VMware), Distros (Red Hat, Ubuntu,
   Debian, SUSE), ICS (CISA ICS, Kaspersky ICS-CERT)
                                  │
                                  ▼
                      [Merge/Reconcile Engine]
        (aliases, precedence, ranges, KEV flags, PSIRT flags)
                                  │
                                  ▼
                  [FeedMerge DB (SQLite→Postgres)]
                                  │
               ┌──────────────────┴──────────────────┐
               ▼                                     ▼
   [Export: vuln-list JSON]               [Packager: Trivy DB v2]
               │                                     │
             (CI)                       [ORAS push / offline tar]
```
**Principles**
- Determinism (same inputs → same outputs, hashed) and provenance per field.
- OVAL (vendor/distro) **overrides** generic ranges for OS packages.
- Regional feeds **enrich** rather than blindly override unless they carry stronger package-level truth.
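
The determinism principle can be illustrated with a small hashing sketch. It assumes that "canonical" means object keys sorted in ordinal order with array order preserved, which the document does not spell out; `CanonicalJson`, `Canonicalize`, and `Sha256Hex` are hypothetical names, not part of any existing Feedser code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

// Hypothetical helper: one possible canonical form for hashing merged advisories.
public static class CanonicalJson
{
    // Rebuilds the tree with object keys in ordinal order; arrays keep their order.
    public static JsonNode? Canonicalize(JsonNode? node)
    {
        switch (node)
        {
            case JsonObject obj:
                return new JsonObject(obj
                    .OrderBy(p => p.Key, StringComparer.Ordinal)
                    .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value))));
            case JsonArray arr:
                return new JsonArray(arr.Select(item => Canonicalize(item)).ToArray());
            case null:
                return null;
            default:
                // Leaf value: re-parse so the copy has no parent and can be re-attached.
                return JsonNode.Parse(node.ToJsonString());
        }
    }

    // Lowercase hex SHA-256 over the compact UTF-8 serialization of the canonical tree.
    public static string Sha256Hex(JsonNode? node)
    {
        string canonical = Canonicalize(node)?.ToJsonString() ?? "null";
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Two documents that differ only in key order hash to the same value under this scheme. Number spelling is not normalized here, so `1` and `1.0` would still hash differently.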
---
## 3) Repository layout (create exactly)
```
src/Feedser/
  Feedser.Cli/                 # .NET 10 preview console (System.CommandLine)
  Feedser.Core/                # domain model & orchestration
  Feedser.Storage/             # EF Core migrations (SQLite dev/CI; Postgres prod)
  Feedser.Connectors/
    Common/                    # HTTP, pagination, ETag, backoff, schema validators
    Cve/                       # CVE registry (id+refs)
    Nvd/                       # NVD API v2 windows
    Ghsa/                      # GHSA REST/GraphQL
    Osm.Osv/                   # OSV API
    Jvn/                       # MyJVN (JVNRSS/VULDEF)
    CertCc/                    # CERT/CC Vulnerability Notes
    Kev/                       # CISA Known Exploited
    Kr.Kisa/                   # KISA/KrCERT advisories
    In.CertIn/                 # CERT-In advisories
    Fr.CertFr/                 # ANSSI CERT-FR avis/alertes
    De.CertBund/               # BSI CERT-Bund WID
    Au.Acsc/                   # ACSC advisories
    Ca.Cccs/                   # CCCS advisories
    Ru.Bdu/                    # FSTEC BDU (HTML→schema; LLM fallback gated)
    Ru.Nkcki/                  # NKCKI bulletins (HTML/PDF→text)
    Vndr.Msrc/                 # MSRC CVRF
    Vndr.Cisco/                # Cisco PSIRT openVuln
    Vndr.Oracle/               # Oracle CPU/advisories
    Vndr.Adobe/                # Adobe APSB/APA
    Vndr.Apple/                # Apple HT201222 feed
    Vndr.Chromium/             # Chrome Releases security posts
    Vndr.Vmware/               # VMSA (Broadcom portal)
    Distro.RedHat/             # Red Hat Security Data API + OVAL
    Distro.Ubuntu/             # USN + Security API
    Distro.Debian/             # Debian Security Tracker JSON
    Distro.Suse/               # SUSE Update Advisories
    Ics.Cisa/                  # CISA ICS advisories (ICSA-*)
    Ics.Kaspersky/             # Kaspersky ICS-CERT advisories
  Feedser.Merge/               # dedupe/aliases/precedence/version-ranges
  Feedser.Export.VulnList/     # optional vuln-list JSON renderer
  Feedser.Packagers.TrivyDb/   # db.tar.gz + metadata.json + ORAS push
  Feedser.Signing/             # cosign integration
  Feedser.Tests/
etc/
  feedser.yaml                 # config template (extended, see §11)
schemas/                       # JSON Schema/XSD for inputs & internal payloads
samples/                       # golden fixtures per source
```
---
## 4) Unified data model (relational + evented)
**Storage default**: **SQLite** (dev/CI), **Postgres** (prod). EF Core migrations. Dapper for hot paths if needed.
**Tables (no change from v1 + PSIRT/CERT flags)**
- `source(id, name, type, base_url, auth_mode, notes)`
- `watermark(source_id, cursor, updated_at)` ← **incremental windows per source**
- `document(id, source_id, uri, fetched_at, content_sha256, content_type, status, raw_blob?, metadata_json)`
- `advisory(id, advisory_key, title, summary, lang, published, modified,
severity_cvss_v3?, severity_cvss_v4?, vendor_severity?,
exploit_known bool)`
- `alias(advisory_id, scheme, value)` — **schemes** include: CVE, GHSA, OSV, JVN, BDU, VU (CERT/CC), MSRC, CISCO-SA, ORACLE-CPU, APSB/APA, APPLE-HT, CHROMIUM-POST, VMSA, RHSA, USN, DSA, SUSE-SU, ICSA, CWE, CPE, PURL, etc.
- `affected(advisory_id, platform, name, version_range, cpe?, purl?, fixed_by?, introduced_version?)`
- `reference(advisory_id, url, kind, source_tag)` — kind examples: advisory, patch, bulletin, kb, blog, vendor, exploit
- `provenance(advisory_id, document_id, extraction, confidence, fields_mask)`
- `kev_flag(advisory_id, kev_id, added_date, due_date?)`
- `ru_flags(advisory_id, bdu_id?, nkcki_ids_json?, ru_severity?, notes?)`
- `jp_flags(advisory_id, jvndb_id?, jvn_category?, vendor_status?)`
- `psirt_flags(advisory_id, vendor, advisory_id_text, product_tags_json?)`
- `merge_event(id, advisory_key, before_hash, after_hash, merged_at)`
**Indexes**: unique(advisory_key); index(scheme,value); index(platform,name); index(published); index(modified).
### 4.1) Alternate storage (MongoDB) — mapping
If a **MongoDB** deployment is preferred, mirror the relational shape **as collections** with analogous names (`source`, `watermark`, `document`, `advisory`, `alias`, `affected`, `reference`, `provenance`, `kev_flag`, `ru_flags`, `jp_flags`, `psirt_flags`, `merge_event`).
- Keep **advisory documents** flat and **embed** `aliases[]`, `affected[]`, `references[]` when practical; store **provenance** entries as embedded or sidecar collection depending on document growth.
- Maintain **deterministic canonical JSON** for merges; hash stored in `merge_event`.
- Incremental refreshes rely on the same **per-source watermarks**.
---
## 5) Source connectors — contracts & incremental strategy
**Common interface**
```csharp
using System.Threading;
using System.Threading.Tasks;

// Three stages per connector: fetch raw documents, parse them into DTOs, map DTOs to tables.
public interface IFeedConnector
{
    string SourceName { get; }
    Task FetchAsync(FeedserContext db, CancellationToken ct); // populate document rows
    Task ParseAsync(FeedserContext db, CancellationToken ct); // document -> DTOs (validated)
    Task MapAsync(FeedserContext db, CancellationToken ct);   // DTOs -> UnifiedVuln tables + provenance
}
```
### 5.1 Registries & cross-ecosystem
* **CVE (cve.org)** — *identifier registry*. Fetch for alias cross-checks; minimal fields only. Watermark by last seen ID/time.
* **NVD API v2** — sliding **modified windows** (e.g., 6-12h) with backoff and pagination. Persist CVSS/CWE/CPE as aliases; capture change history if present. Watermark = last successful `modified_end`.
* **GHSA** — **REST** “global security advisories” + **GraphQL** for richer fields; **note**: `cvss` → `cvss_severities` deprecation → map accordingly. Watermark by updated timestamp/ID cursor.
* **OSV** — fetch per ecosystem or time range; map PURL + SemVer ranges.
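
The sliding modified-window strategy for NVD can be sketched as a planner that walks from the stored watermark up to the current time in fixed-size steps. This is a minimal illustration only: `ModifiedWindows.Plan` is a hypothetical name, and the sketch does not cover request parameters, pagination, or backoff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical planner: splits [watermark, now) into consecutive windows of at most `size`.
public static class ModifiedWindows
{
    public static IEnumerable<(DateTimeOffset Start, DateTimeOffset End)> Plan(
        DateTimeOffset watermark, DateTimeOffset now, TimeSpan size)
    {
        if (size <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(size));

        var start = watermark;
        while (start < now)
        {
            var end = start + size;
            if (end > now) end = now;   // the last window is clipped to "now"
            yield return (start, end);
            start = end;                // next window begins where this one ended
        }
    }
}
```

A connector would fetch each window in order and advance the watermark to a window's `End` only after that window has been fetched and persisted, so that a failed run resumes from the last completed window.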
### 5.2 National CERTs (incremental via RSS/API/pages)
* **CERT/CC Vulnerability Notes** — scrape/archive pages (VU#), and/or GitHub data archive when suitable. Watermark by VU publish date/ID.
* **JVN / MyJVN (Japan)** — **MyJVN API**: JVNRSS (overview) + VULDEF (detail). Watermark by `dateFirstPublished`/`dateLastUpdated`. Map **JVNDB** IDs, CVE aliases, vendor status.
* **RU-CERT** — advisory/news portal; treat as **enrichment references** (aliases+refs), not a primary package range source. Watermark by post date.
* **KISA (KrCERT/KR-CERT)** — advisories/notices portal. Watermark by advisory date/ID.
* **CERT-In (India)** — **CIAD** advisories via portal pages; Watermark by advisory code/date.
* **ANSSI/CERT-FR** — *avis/alertes* RSS and list pages; Watermark by advisory ID/date.
* **BSI CERT-Bund (WID)** — “Technische Sicherheitshinweise” pages/feeds; Watermark by bulletin ID/date.
* **ACSC (Australia)** — alerts/advisories; Watermark by publish date/slug.
* **CCCS (Canada)** — advisories page; Watermark by date/slug.
### 5.3 Russia-specific
* **FSTEC BDU** — **hybrid**: primary **HTML parser** → validate against our **internal XML schema**; if validation fails → **LLM extraction fallback** (strictly gated; see §7). Also support **bulk DB ingests** if official XML/Excel exports are available in the environment. Watermark by BDU ID/date.
* **NKCKI** — bulletins list (HTML/PDF). Extract structured fields via PDF→text pipeline + post-validation. Watermark by bulletin ID/date.
### 5.4 Vendor PSIRTs (canonical)
* **MSRC** — **CVRF API** monthly and per-advisory endpoints. Watermark by month + last modified. Alias: `MSRC:<YYYY-MMM>`; references to KBs/CVEs.
* **Cisco PSIRT (openVuln API)** — REST; filter by last published/updated. Alias: `CISCO-SA:<slug>`; map fixed releases.
* **Oracle CPU / Security Alerts** — quarterly schedule (3rd Tue of Jan/Apr/Jul/Oct). Scrape CPU pages and advisories. Alias: `CPU:<YYYY-QQ>`; link per-product CVEs. Watermark by CPU cycle.
* **Adobe APSB/APA** — advisory index pages + product feeds. Alias: `APSB-YYYY-XX`.
* **Apple** — **HT201222/“About Apple security releases”** index page(s). Alias: `APPLE-HT:HT201222:<yyyy-mm-dd>` + per-product pages.
* **Google Chromium** — **Chrome Releases** blog “Stable Channel Update” posts with security fix lists. Alias: `CHROMIUM-POST:<date>`.
* **VMware (VMSA)** — Broadcom support portal VMSA pages; parse ID + affected products + CVEs. Alias: `VMSA-YYYY-XXXX`.
### 5.5 Linux distributions
* **Red Hat Security Data API** (CSAF/OVAL/CVE); plus OVAL content. **Precedence** for OS packages. Watermark via API `last_modified`/etag. Alias: `RHSA-YYYY:NNNN`.
* **Ubuntu USN** — USN list + **Security API**; Watermark by USN ID/date. Alias: `USN-####-#`.
* **Debian Security Tracker** — JSON dataset for CVE↔package↔suite; Watermark by file etag/commit. Alias: `DSA-####-#` (when present).
* **SUSE** — security/update advisories pages; Watermark by SUSE-SU ID/date. Alias: `SUSE-SU-YYYY:NNNN`.
### 5.6 Specialized / ICS
* **CISA ICS advisories (ICSA)** — list feeds; Watermark by ICSA ID. Alias: `ICSA-YY-###-##`.
* **Kaspersky ICS-CERT** — advisories list; Watermark by advisory ID/date; treat as authoritative vendor ICS source for impacted OT products.
### 5.7 Exploitation & enrichment
* **CISA KEV** — JSON catalog; set exploitation flag (`exploit_known=true`), store `kev_id`, `added_date`, `due_date`.
---
## 6) Normalization details
**UnifiedVuln JSON (internal canonical)**
```json
{
  "advisory_key": "CVE-2025-12345",
  "ids": { "cve": "CVE-2025-12345", "ghsa": "GHSA-xxxx", "bdu": "BDU:2025-06025", "jvndb": "JVNDB-2025-000123", "msrc": "2025-Jan" },
  "titles": [{ "text": "Buffer overflow in foo()", "lang": "en" }],
  "summary": { "text": "...", "lang": "en" },
  "published": "2025-06-21T12:00:00Z",
  "modified": "2025-07-03T09:00:00Z",
  "severity": {
    "cvss_v3": { "base": 9.8, "vector": "CVSS:3.1/..." },
    "cvss_v4": null,
    "vendor": "Critical"
  },
  "affected": [
    {
      "platform": "os-distro",
      "name": "ubuntu:20.04",
      "cpe": "cpe:/o:canonical:ubuntu_linux:20.04",
      "version_range": "pkg:deb/ubuntu/foo<1.2.3-0ubuntu0.20.04.1",
      "fixed_by": "1.2.3-0ubuntu0.20.04.1"
    }
  ],
  "references": [
    { "url": "https://msrc.microsoft.com/update-guide", "kind": "advisory", "source": "MSRC" }
  ],
  "exploitation": { "cisa_kev": true, "nkcki": false },
  "provenance": [
    { "source": "RedHat", "document": "https://...", "method": "parser", "confidence": 1.0 }
  ],
  "psirt": [{ "vendor": "Cisco", "advisory": "cisco-sa-..." }]
}
```
**Ranges**
* **OS packages**: distro semantics (Debian **EVR**, RPM **NEVRA**). Prefer OVAL/PSIRT source whenever available.
* **Language ecosystems**: **SemVer** ranges with **PURL** coordinates; use OSV/GHSA fields for introduced/fixed events.
* **Severity**: keep **all** CVSS sources; compute a max/consensus for display but preserve originals.
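
The severity rule (keep every CVSS source, derive one value for display) might look like the sketch below. It implements only the "max" variant with a deterministic tie-break on source name; `CvssEntry` and `SeverityView` are hypothetical types, and a real consensus rule could well differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for one CVSS score as reported by one source.
public sealed record CvssEntry(string Source, string Version, double BaseScore);

public static class SeverityView
{
    // Picks the highest base score for display; the input collection is left untouched,
    // so all original scores remain available. Ties are broken by source name (ordinal)
    // to keep the result deterministic.
    public static CvssEntry? DisplayMax(IReadOnlyCollection<CvssEntry> all) =>
        all.OrderByDescending(e => e.BaseScore)
           .ThenBy(e => e.Source, StringComparer.Ordinal)
           .FirstOrDefault();
}
```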
---
## 7) FSTEC BDU hybrid extraction (HTML→schema with gated LLM fallback)
1. **HTML parser** extracts into `BduHtmlExtract`.
2. Validate against **internal XML schema** (XSD). Rules: `bdu_id` format `^BDU:\d{4}-\d{5}$`; CVE regex; date parse; severity enumeration.
3. On validation failure: run **LLM extraction** (temperature 0) to the same JSON Schema; accept **only** if post-validation passes and `confidence ≥ minConfidence`. Mark `provenance.method = "llm"`.
4. Keep audit logs locally; default **offline model** in sovereign builds.
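
The validation rules in step 2 can be approximated in code as follows. Only the `bdu_id` pattern comes from the text above; the CVE pattern, the severity values, and the `BduExtract` record shape are assumptions made for illustration, and the real rules would live in the internal XSD.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical flattened view of a parsed BDU page.
public sealed record BduExtract(string BduId, IReadOnlyList<string> Cves, string Published, string Severity);

public static class BduRules
{
    // Pattern taken from the rules above.
    private static readonly Regex BduIdPattern = new(@"^BDU:\d{4}-\d{5}$", RegexOptions.CultureInvariant);
    // Assumed CVE pattern: four-digit year, then four or more sequence digits.
    private static readonly Regex CvePattern = new(@"^CVE-\d{4}-\d{4,}$", RegexOptions.CultureInvariant);
    // Assumed severity enumeration.
    private static readonly HashSet<string> Severities =
        new(StringComparer.OrdinalIgnoreCase) { "Low", "Medium", "High", "Critical" };

    // Returns the names of the fields that failed; an empty list means the extract is valid.
    public static IReadOnlyList<string> Validate(BduExtract e)
    {
        var errors = new List<string>();
        if (!BduIdPattern.IsMatch(e.BduId)) errors.Add("bdu_id");
        foreach (var cve in e.Cves)
            if (!CvePattern.IsMatch(cve)) errors.Add($"cve:{cve}");
        if (!DateTimeOffset.TryParse(e.Published, CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal, out _)) errors.Add("published");
        if (!Severities.Contains(e.Severity)) errors.Add("severity");
        return errors;
    }
}
```

Running the same validator on both the parser output and the LLM output keeps the acceptance gate identical for the two extraction paths.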
---
## 8) Merge & reconciliation (deterministic)
* **Identity**: prefer **CVE**; fallback to other keys (BDU/JVN/GHSA/MSRC/CISCO-SA/VMSA/USN/DSA/SUSE-SU/ICSA). Canonical `advisory_key`.
* **Aliases**: store all cross-refs (CVE, GHSA, OSV, JVN, BDU, MSRC, CISCO-SA, ORACLE-CPU, APSB, APPLE-HT, CHROMIUM-POST, VMSA, RHSA, USN, DSA, SUSE-SU, ICSA, CWE, CPE, PURL…).
* **Precedence**:
  * OVAL/PSIRT **override** NVD for OS package ranges.
  * **KEV** sets exploitation flags only (no severity override).
  * Regional feeds **enrich** (severity text, mitigation, local notes).
* **Determinism**: merged canonical JSON is hashed; store in `merge_event`.
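
The identity rule can be sketched as a ranked lookup over an advisory's aliases. The fallback order is read left to right from the list in the identity bullet, which is an interpretation rather than something the text states; the scheme spellings and the `AdvisoryKey.Select` name are likewise assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdvisoryKey
{
    // Assumed priority order: CVE first, then the fallback keys in the order listed.
    private static readonly string[] SchemeOrder =
        { "CVE", "BDU", "JVN", "GHSA", "MSRC", "CISCO-SA", "VMSA", "USN", "DSA", "SUSE-SU", "ICSA" };

    // Returns the alias value of the highest-priority scheme, or null if none qualifies.
    // Several values within one scheme are ordered ordinally so the choice is stable.
    public static string? Select(IEnumerable<(string Scheme, string Value)> aliases)
    {
        return aliases
            .Select(a => (
                Rank: Array.FindIndex(SchemeOrder,
                    s => string.Equals(s, a.Scheme, StringComparison.OrdinalIgnoreCase)),
                a.Value))
            .Where(x => x.Rank >= 0)          // ignore schemes that cannot serve as identity
            .OrderBy(x => x.Rank)
            .ThenBy(x => x.Value, StringComparer.Ordinal)
            .Select(x => x.Value)
            .FirstOrDefault();
    }
}
```

Because the result depends only on the set of aliases and not on the order in which sources were ingested, the same inputs always produce the same `advisory_key`.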
---
## 9) Packaging & publishing
**v0**: render **vuln-list-shaped JSON** → invoke stock **`trivy-db`** builder to get `metadata.json` + `trivy.db` → tar to `db.tar.gz` → **ORAS push** to your registry with **Trivy DB media types**.
**v1**: native C# packager writing BoltDB + `metadata.json` and pushing via ORAS directly.
**Output contracts**
* **OCI media types**: layer `application/vnd.aquasec.trivy.db.layer.v1.tar+gzip`; config `application/vnd.aquasec.trivy.config.v1+json`.
* Consumers point Trivy at your repo: `--db-repository REGISTRY/PATH`; for air-gap ship `db.tar.gz`.
---
## 10) CLI (idempotent)
```
feedser init
feedser fetch --source nvd|cve|ghsa|osv|jvn|certcc|kev|kisa|certin|certfr|certbund|acsc|cccs|bdu|nkcki|msrc|cisco|oracle|adobe|apple|chromium|vmware|redhat|ubuntu|debian|suse [--since ...]
feedser parse --source ...
feedser merge
feedser export vuln-list --out ./out/vuln-list/
feedser pack trivy-db --out ./out/db.tar.gz
feedser push trivy-db --repo registry.local/security/trivy-db --tag 2 [--auth env|file]
feedser sign --artifact ./out/db.tar.gz --key cosign.key
feedser status
feedser gc --keep-raw 3
feedser doctor # media types, registry auth, schema checks
```
Exit codes: non-zero on schema failure, network failure after retries, or merge non-determinism.
---
## 11) Config (`etc/feedser.yaml`) — extended
```yaml
storage:
  driver: sqlite
  dsn: "Data Source=feedser.db"
sources:
  cve: { enabled: true }
  nvd: { enabled: true, window_hours: 6 }
  ghsa: { enabled: true, github_token: "${GITHUB_TOKEN:-}", api: "rest+graphql" }
  osv: { enabled: true }
  jvn:
    enabled: true
    api_base: "https://jvndb.jvn.jp/en/apis/"
    window_days: 7
  certcc: { enabled: true }
  kev: { enabled: true }
  kisa: { enabled: false } # enable when endpoints/feeds are reachable in environment
  certin: { enabled: true }
  certfr: { enabled: true }
  certbund: { enabled: true }
  acsc: { enabled: true }
  cccs: { enabled: true }
  ru:
    bdu:
      enabled: true
      htmlFallback: true
      llmFallback: "gated"
      minConfidence: 0.85
    nkcki:
      enabled: true
  msrc: { enabled: true }
  cisco: { enabled: true, token: "${CISCO_OPENVULN_TOKEN:-}" }
  oracle: { enabled: true }
  adobe: { enabled: true }
  apple: { enabled: true }
  chromium: { enabled: true }
  vmware: { enabled: true }
  redhat:
    enabled: true
    api_base: "https://access.redhat.com/hydra/rest/securitydata"
  ubuntu:
    enabled: true
    api_base: "https://ubuntu.com/security/api"
  debian: { enabled: true }
  suse: { enabled: true }
packaging:
  trivy:
    publish: true
    repo: "registry.local/security/trivy-db"
    tag: "2"
    offline_bundle: true
observability:
  metrics: "stdout"
  logs: "json"
  level: "Information"
  tracing: "otlp"
```
---
## 12) Observability & performance
* **Logs**: structured (Serilog); include `source`, `uri`, `status`, `parseMs`, `mappedCount`, `mergeDelta`.
* **Metrics**: fetch latency, parse/validation failures, dedupe ratio, DB compaction time, package size, **per-source rate-limit counters**.
* **Tracing**: OpenTelemetry spans per connector/step.
* **Perf**: bounded parallelism per source; streaming XML; content-hash short-circuit for unchanged docs.
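
The content-hash short-circuit could work roughly as sketched here. An in-memory dictionary stands in for the `document.content_sha256` column, and `ContentHashGate` is a hypothetical name; a real connector would compare against the stored row instead.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical gate: skips parse/map work when a fetched body is byte-identical
// to the last body seen for the same URI.
public sealed class ContentHashGate
{
    private readonly Dictionary<string, string> _lastHashByUri = new(StringComparer.Ordinal);

    // Returns true when the body is new or changed, false when it is unchanged.
    public bool ShouldProcess(string uri, byte[] body)
    {
        string hash = Convert.ToHexString(SHA256.HashData(body));
        if (_lastHashByUri.TryGetValue(uri, out var previous) && previous == hash)
            return false;

        _lastHashByUri[uri] = hash; // remember the latest content for the next run
        return true;
    }
}
```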
---
## 13) Tests & quality gates
* **Schema validation** for each connector (external JSON/XML → DTOs).
* **Golden fixtures** per source (NVD page, GHSA JSON, OSV, JVN JVNRSS/VULDEF, CERT/CC VU HTML, BDU HTML, NKCKI PDF→text, MSRC CVRF, Cisco openVuln JSON, Oracle CPU HTML, Adobe APSB HTML, Apple HT list, Chrome Releases HTML, VMSA HTML, Red Hat API JSON, USN JSON, Debian JSON, SUSE HTML).
* **Merge determinism** (hash-stable).
* **Parity scans**: compare Trivy scan using our DB vs upstream baseline on a reference set of images (differences expected where OVAL narrows ranges).
* **Media-type conformance** (OCI).
* **Reproducible packaging**: build ID = hash(vuln-list tree).
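
One way to read "build ID = hash(vuln-list tree)" is a digest over the sorted relative paths and file contents of the exported tree. The sketch below takes the tree as in-memory path/content pairs so that it stays self-contained; the `BuildId` name and the exact framing (path bytes, a zero byte, then the content digest) are assumptions, not a documented format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class BuildId
{
    // Hashes (relative path, content) pairs in ordinal path order, so the result does not
    // depend on enumeration order or on the platform's path separator.
    public static string FromTree(IEnumerable<KeyValuePair<string, byte[]>> files)
    {
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        var ordered = files
            .Select(f => (Path: f.Key.Replace('\\', '/'), Content: f.Value))
            .OrderBy(f => f.Path, StringComparer.Ordinal);

        foreach (var file in ordered)
        {
            hasher.AppendData(Encoding.UTF8.GetBytes(file.Path));
            hasher.AppendData(new byte[] { 0 });               // separates path from digest
            hasher.AppendData(SHA256.HashData(file.Content));  // fixed-length content digest
        }
        return Convert.ToHexString(hasher.GetHashAndReset()).ToLowerInvariant();
    }
}
```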
**Connector DoD**: watermarking; retries/backoff; schema-validated parsing; mapping; unit tests; goldens; incremental pass; metrics.
---
## 14) Security & compliance
* Default **offline**; explicit allowlist per source host.
* **LLM usage isolated** to BDU fallback; no external calls unless configured; redact logs; audit stored locally.
* **cosign** signing for artifacts; store SHA256 and manifest digests.
* Respect robots/ToS; prefer official APIs/feeds where available.
---
## 15) Concrete TODOs (first sprints)
1. **Storage**: EF Core models & migrations; `watermark` infra; repositories.
2. **NVD**: windowed fetch; JSON Schema validation; mapper; watermark.
3. **OVAL/Distros**: Red Hat (API+OVAL), Ubuntu (USN+API), Debian (JSON), SUSE (advisories).
4. **KEV**: JSON ingest → `exploit_known=true`.
5. **GHSA/OSV**: REST + GraphQL; map PURL/semver; handle `cvss_severities`.
6. **JVN**: JVNRSS + VULDEF; alias mapping; watermark.
7. **RU**: BDU HTML parser + XSD + LLM fallback; NKCKI bulletins harvester.
8. **PSIRTs**: MSRC CVRF; Cisco openVuln; Oracle CPU; Adobe APSB; Apple; Chromium; VMware (VMSA).
9. **Merge Engine**: aliasing + precedence; canonical JSON + hashing.
10. **Export/Pack**: vuln-list renderer; Trivy DB packaging; **ORAS push**; **offline bundle**.
11. **CLI & doctor**; **cosign sign**; **status**.
---
## 16) MASTER SOURCE CATALOG (as provided — preserved verbatim)
### Primary Vulnerability Databases / Advisory Portals
| Vulnerability DB | Who Supports It | Type | URL | DB Type | What Data It Has |
| ---------------------------------------------- | ---------------------------------------------------------- | ----------------------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------- | ------------------------------------------------------------------ |
| **CVE (Common Vulnerabilities and Exposures)** | MITRE (with CNA partners) | Identifier registry | [https://cve.org](https://cve.org) | Global ID registry | CVE IDs, basic description, references, assigner info |
| **CERT/CC Vulnerability Notes** | Carnegie Mellon CERT/CC | National CERT / coordination center | [https://kb.cert.org/vuls](https://kb.cert.org/vuls) | Vulnerability Notes DB | VU# IDs, description, impact, vendors affected, references |
| **JVN (Japan Vulnerability Notes)** | JPCERT/CC + IPA (Japan) | National CERT | [https://jvn.jp/en/](https://jvn.jp/en/) | Advisory DB | JVN IDs, affected products, mitigation, CVE mappings |
| **RU-CERT** | Coordination Center for .RU / Russian CERT | National CERT | [https://www.cert.ru](https://www.cert.ru) | Advisory DB | Russian advisories, incident/vulnerability notes |
| **CISA KEV Catalog & Advisories** | US CISA (DHS) | Government CERT / advisories | [https://www.cisa.gov/known-exploited-vulnerabilities](https://www.cisa.gov/known-exploited-vulnerabilities) | Catalog | KEV IDs, CVE links, exploited-in-wild status, remediation deadline |
| **KISA (Korean CERT)** | Korea Internet & Security Agency | National CERT | [https://www.krcert.or.kr](https://www.krcert.or.kr) / [https://www.boho.or.kr/en/main.do](https://www.boho.or.kr/en/main.do) | Advisory portal | Korean advisories, CVE refs, guidance |
| **CERT-In (India)** | Ministry of Electronics & IT | National CERT | [https://www.cert-in.org.in](https://www.cert-in.org.in) | Advisory portal | Indian CERT advisories, affected vendors, CVEs |
| **ANSSI (France)** | Agence nationale de la sécurité des systèmes d'information | National CERT | [https://www.cert.ssi.gouv.fr](https://www.cert.ssi.gouv.fr) | Advisory portal | French advisories, technical notes, CVE refs |
| **BSI (Germany, CERT-Bund)** | German Federal Office for Information Security | National CERT | [https://www.bsi.bund.de](https://www.bsi.bund.de) | Advisory portal | Vulnerability advisories, vendor notifications |
| **ACSC (Australia)** | Australian Cyber Security Centre | National CERT | [https://www.cyber.gov.au](https://www.cyber.gov.au) | Advisory portal | Australian advisories, CVE refs, guidance |
| **CCCS (Canada)** | Canadian Centre for Cyber Security | National CERT | [https://www.cyber.gc.ca](https://www.cyber.gc.ca) | Advisory portal | Canadian advisories, CVE refs |
### Vendor / PSIRT Databases (Primary)
| Vulnerability DB | Who Supports It | Type | URL | DB Type | What Data It Has |
| --------------------------------------------- | --------------- | ------------ | ---------------------------------------------------------------------------------------------------------------------------- | ----------------------- | ----------------------------------------------------------- |
| **Microsoft Security Response Center (MSRC)** | Microsoft | Vendor PSIRT | [https://msrc.microsoft.com/update-guide](https://msrc.microsoft.com/update-guide) | Advisory portal | MSRC IDs, CVE mappings, affected products/versions, patches |
| **Cisco PSIRT** | Cisco Systems | Vendor PSIRT | [https://tools.cisco.com/security/center/publicationListing.x](https://tools.cisco.com/security/center/publicationListing.x) | Advisory DB | Cisco advisories, CVEs, product impact, fixes |
| **Oracle CPU / Security Alerts** | Oracle | Vendor PSIRT | [https://www.oracle.com/security-alerts/](https://www.oracle.com/security-alerts/) | Advisory DB | Oracle CPUs, CVEs, affected products/versions, patches |
| **Adobe Security Bulletins & Advisories** | Adobe | Vendor PSIRT | [https://helpx.adobe.com/security.html](https://helpx.adobe.com/security.html) | Advisory DB | APSB/APA IDs, CVEs, affected software, patches |
| **Apple Security Updates** | Apple | Vendor PSIRT | [https://support.apple.com/en-us/HT201222](https://support.apple.com/en-us/HT201222) (security updates index) | Advisory portal | Apple advisories, CVEs, product versions, patches |
| **Google Chromium Security** | Google | Vendor PSIRT | [https://chromereleases.googleblog.com](https://chromereleases.googleblog.com) | Advisory blog / tracker | Chromium/Android advisories, CVEs, fixes |
| **VMware Security Advisories (VMSA)** | VMware/Broadcom | Vendor PSIRT | [https://www.broadcom.com/support/vmware-security-advisories](https://www.broadcom.com/support/vmware-security-advisories) | Advisory DB | VMSA IDs, CVEs, product versions, fixes |
### Linux Distribution Security Trackers (Primary)
| Vulnerability DB | Who Supports It | Type | URL | DB Type | What Data It Has |
| ------------------------------------------- | --------------- | ------------------ | ---------------------------------------------------------------------------------------- | ------------------------------------- | --------------------------------------------------------- |
| **Red Hat Security Data / RHSA** | Red Hat | Distro Security DB | [https://access.redhat.com/security/updates](https://access.redhat.com/security/updates) | Security advisories & OVAL/JSON feeds | RHSA IDs, CVEs, fixed package versions, affected products |
| **Canonical Ubuntu Security Notices (USN)** | Canonical | Distro Security DB | [https://ubuntu.com/security/notices](https://ubuntu.com/security/notices) | Advisory DB | USN IDs, CVEs, affected packages, patches |
| **Debian Security Tracker (DSA)** | Debian Project | Distro Security DB | [https://security-tracker.debian.org](https://security-tracker.debian.org) | Tracker + Advisories | DSA IDs, CVEs, package status per release |
| **SUSE Security Announcements** | SUSE | Distro Security DB | [https://www.suse.com/support/security/](https://www.suse.com/support/security/) | Advisory DB | SUSE-SA/Update IDs, CVEs, package fix versions |
### Open Source Ecosystem Advisory Databases (Primary)
| Vulnerability DB | Who Supports It | Type | URL | DB Type | What Data It Has |
| ----------------------------------------- | ---------------------- | ----------------------- | -------------------------------------------------------------- | --------------- | ------------------------------------------------------------------------------------------------- |
| **GitHub Security Advisories (GHSA)** | GitHub (Microsoft) | Open Source Advisory DB | [https://github.com/advisories](https://github.com/advisories) | Advisory DB | GHSA IDs, CVEs, affected repos/packages, patches, severity |
| **OSV.dev (Open Source Vulnerabilities)** | Google / OSS community | Open Source Advisory DB | [https://osv.dev](https://osv.dev) | Schema-based DB | OSV IDs, CVEs, affected ecosystems (npm, PyPI, Go, crates.io, Maven, etc.), version ranges, fixes |
### Specialized (ICS / Sectoral)
| Vulnerability DB | Who Supports It | Type | URL | DB Type | What Data It Has |
| ----------------------- | ------------------ | ------------- | ---------------------------------------------------------------------------------------- | --------------- | --------------------------------------------------------------------- |
| **CISA ICS Advisories** | US CISA (ICS-CERT) | ICS sector DB | [https://www.cisa.gov/ics/advisories](https://www.cisa.gov/ics/advisories) | Advisory DB | ICS advisory IDs, CVEs, affected vendors, exploitability, mitigations |
| **Kaspersky ICS CERT** | Kaspersky Lab | ICS CERT | [https://ics-cert.kaspersky.com/advisories/](https://ics-cert.kaspersky.com/advisories/) | Advisory portal | ICS advisories, CVEs, technical detail, mitigations |
---
## 17) Field-mapping guide (per family)
**PSIRT**: set `psirt_flags.vendor` + vendor advisory ID in `alias` and `psirt_flags.advisory_id_text`. Always attach **patch references** and **fixed versions** into `affected.fixed_by`.
**Distros**: treat **OVAL/JSON** as range authority; `alias` with RHSA/USN/DSA/SUSE-SU; attach per-suite/package status.
**CERTs**: attach `reference(kind=bulletin)` and severity text; use as enrichment unless they include authoritative package ranges.
**ICS**: map vendor & model families into `affected.platform="ics-vendor"` with product tags.
**KEV**: set exploitation flags only.
**BDU/JVN**: include local IDs (BDU, JVNDB) in `alias` and specific flags in `ru_flags`/`jp_flags`.
---
## 18) Reference commands & snippets
**ORAS push (Trivy DB v2)**
```bash
oras push --artifact-type application/vnd.aquasec.trivy.config.v1+json \
"registry.local/security/trivy-db:2" \
db.tar.gz:application/vnd.aquasec.trivy.db.layer.v1.tar+gzip
```
**Point Trivy at our repo**
```bash
trivy image --db-repository registry.local/security/trivy-db --download-db-only
```
**BDU LLM fallback gate (pseudo)**
```csharp
if (BduSchemaValidator.IsValid(parsed)) {
    Save(parsed, provenance: "parser");
} else {
    // Parser output failed schema validation: try the gated LLM fallback.
    var json = LlmExtractToJson(rawText, schema: BduSchema, temperature: 0);
    if (!BduSchemaValidator.IsValid(json) || Confidence(json) < minConfidence) {
        Fail("BDU: low confidence"); // reject: nothing is saved on this path
        return;
    }
    Save(json, provenance: "llm");
}
```
## Reference notes (authoritative links for the agent)
**Trivy self-hosting / DB media types / vuln-list**
* Trivy self-hosting databases and `--db-repository` flag. ([trivy.dev][1])
* DB repository & required OCI media type (`application/vnd.aquasec.trivy.db.layer.v1.tar+gzip`). ([Aqua Security][2])
* `vuln-list` and `vuln-list-update` (inputs/build). ([GitHub][3])
* `trivy-db` tool (builder/DB format). ([GitHub][4])
* GitLab registry media-type support for trivy-db (confirmation of the two media types). ([about.gitlab.com][5])
**Global registries / cross-ecosystem**
* CVE program (official). ([CVE][6])
* NVD general/search. ([NVD][7])
* GHSA DB and APIs (REST/GraphQL + deprecation notice). ([GitHub][8])
* OSV.dev (DB + data sources). ([OSV][9])
**National CERTs**
* CERT/CC Vulnerability Notes + docs. ([CERT Coordination Center][10])
* JVN / MyJVN API (Japan). ([JVN iPedia][11])
* RU-CERT (coordination center profile & site). ([cctld.ru][12])
* KISA/KrCERT portals and examples. ([boho.or.kr][13])
* CERT-In (site, CNA role, sample advisory). ([CERT-IN][14])
* ANSSI CERT-FR portal and *avis*. ([cert.ssi.gouv.fr][15])
* BSI CERT-Bund WID pages. ([wid.cert-bund.de][16])
* ACSC advisories hub. ([cyber.gov.au][17])
* CCCS advisories hub. ([Canadian Centre for Cyber Security][18])
**Russia-specific**
* BDU site and documentation of XML/Excel dumps (context). ([bdu.fstec.ru][19])
* NKCKI vulnerability bulletins list. ([safe-surf.ru][20])
**Vendor PSIRTs**
* MSRC Security Update Guide + CVRF API examples. ([msrc.microsoft.com][21])
* Cisco PSIRT advisories + openVuln API. ([Cisco][22])
* Oracle CPU schedule / advisories. ([Oracle][23])
* Adobe security advisories (index + product). ([Adobe Help Center][24])
* Apple security releases index (HT201222 lineage). ([Apple Support][25])
* Chrome Releases (stable updates with security fixes). ([Chrome Releases][26])
* VMware Security Advisories (VMSA) on Broadcom; move notice. ([Broadcom][27])
**Linux distributions**
* Red Hat Security Data API (+ changelog/pointers). ([Red Hat Docs][28])
* Ubuntu Security Notices & Security API. ([Ubuntu][29])
* Debian Security Tracker (docs + JSON). ([Debian Security Tracker][30])
* SUSE advisories. ([SUSE][31])
**Exploitation & ICS**
* CISA KEV catalog. ([CISA][32])
* CISA ICS advisories hub (ICSA). ([CISA][33])
* Kaspersky ICS-CERT advisories. ([Kaspersky ICS-CERT][34])
[1]: https://trivy.dev/v0.60/docs/advanced/self-hosting/ "Self-Hosting Trivy's Databases"
[2]: https://aquasecurity.github.io/trivy/v0.56/docs/configuration/db/ "DB"
[3]: https://github.com/aquasecurity/vuln-list "aquasecurity/vuln-list: NVD, Ubuntu, Alpine"
[4]: https://github.com/aquasecurity/trivy-db "aquasecurity/trivy-db"
[5]: https://gitlab.com/gitlab-org/container-registry/-/merge_requests/957 "Add trivy-db media types - container-registry"
[6]: https://www.cve.org/ "CVE: Common Vulnerabilities and Exposures"
[7]: https://nvd.nist.gov/vuln/search "NVD - Search and Statistics"
[8]: https://github.com/advisories "GitHub Advisory Database"
[9]: https://osv.dev/ "OSV - Open Source Vulnerabilities"
[10]: https://www.kb.cert.org/ "CERT Vulnerability Notes Database"
[11]: https://jvndb.jvn.jp/en/apis/index.html "MyJVN API"
[12]: https://cctld.ru/files/pdf/RU-CERT.pdf "RU-CERT.pdf"
[13]: https://www.boho.or.kr/en/main.do "KISA 인터넷 보호나라&KrCERT"
[14]: https://www.cert-in.org.in/CNA.jsp "CVE Numbering Authority (CNA) at CERT-In"
[15]: https://www.cert.ssi.gouv.fr/ "CERT-FR Centre gouvernemental de veille, d ... - l'ANSSI"
[16]: https://wid.cert-bund.de/ "Warn- und Informationsdienst - Startseite - CERT-Bund"
[17]: https://www.cyber.gov.au/about-us/view-all-content/alerts-and-advisories "Alerts and advisories"
[18]: https://www.cyber.gc.ca/en/alerts-advisories "Alerts and advisories"
[19]: https://bdu.fstec.ru/vul "Уязвимости - БДУ"
[20]: https://safe-surf.ru/specialists/bulletins-nkcki/ "Список новых уязвимостей ПО | Уведомления НКЦКИ"
[21]: https://msrc.microsoft.com/update-guide "Security Update Guide"
[22]: https://sec.cloudapps.cisco.com/security/center/publicationListing.x "Cisco Security Advisories"
[23]: https://www.oracle.com/security-alerts/ "Critical Patch Updates, Security Alerts and Bulletins"
[24]: https://helpx.adobe.com/security/security-bulletin.html "Security Bulletins and Advisories"
[25]: https://support.apple.com/en-us/100100 "Apple security releases"
[26]: https://chromereleases.googleblog.com/ "Chrome Releases"
[27]: https://www.broadcom.com/support/vmware-security-advisories "VMware Security Advisories"
[28]: https://docs.redhat.com/en/documentation/red_hat_security_data_api/1.0/html-single/red_hat_security_data_api/index "Red Hat Security Data API | 1.0"
[29]: https://ubuntu.com/security/notices "Ubuntu Security Notices"
[30]: https://security-tracker.debian.org/ "Security Bug Tracker - Debian"
[31]: https://www.suse.com/support/update/ "SUSE:Update Advisories"
[32]: https://www.cisa.gov/known-exploited-vulnerabilities-catalog "Known Exploited Vulnerabilities Catalog"
[33]: https://www.cisa.gov/news-events/ics-advisories "ICS Advisories"
[34]: https://ics-cert.kaspersky.com/advisories/ "Advisories"
## 19) Role Kickstart Playbooks (aligned with ARCHITECTURE.md & IMPLEMENTATION.md)
### 19.1 Shared pre-flight checklist
- Read **ARCHITECTURE.md §§1-3** and **IMPLEMENTATION.md §§0-3** before writing code; treat this document as the quick-start guide and the others as depth references.
- Confirm local toolchain: .NET 10 Preview 7 SDK (10.0.100-preview.7.25380.108), Docker, MongoDB (local or container), `oras` CLI, `cosign`, `yq`, `jq`, Chrome/Chromium for HTML schema inspection, `pdftotext` for NKCKI extracts.
- Sync repo structure with `IMPLEMENTATION.md` naming (`StellaOps.Feedser.*` projects, Mongo storage) even if the CLI-first layout above still exists; prefer additive commits that converge both plans until deprecation is agreed.
- Establish secrets handling: load tokens (GitHub, Cisco openVuln, etc.) via environment variables referenced in `etc/feedser.yaml`.
- Instrument everything: Serilog + OpenTelemetry hooks should be wired during the first implementation of any loop so QA can observe behaviour from day one.
- Definition of Done (all roles): schema validation in tests, deterministic outputs (hash snapshot checked in), logging/metrics assertions, and hand-off notes in `/docs/handbook/ROLE/<source>.md`.
### 19.2 BE-Base — Platform & Pipeline owner
- **Sprint 0 focus** (IMPLEMENTATION.md §1): create the `StellaOps.Feedser.sln`, add the WebService, Core, Models, Storage.Mongo, Source.Common, Exporter.Json, Exporter.TrivyDb projects; seed `Directory.Build.props/targets`, `.editorconfig`, and analyzer packages (`StyleCop.Analyzers`, nullable `enable`, treat warnings as errors in CI).
- Wire **CI** (`.github/workflows/feedser-ci.yml` or internal equivalent) running `dotnet restore/build/test`, lint (StyleCop), and container build check; artifacts should include the WebService image and test results.
- Produce **devcontainer + Dockerfile** aligned with the Mongo-first run mode (ARCHITECTURE.md §2). Ensure `mongo` sidecar is declared in `devcontainer.json` for immediate onboarding.
- Establish **configuration plumbing**: bind `appsettings.json` + `feedser.yaml` into strongly typed options, configure reload-on-change, and hydrate `SourceState`/`ExportState` repositories.
- Create **Mongo collections/indexes** exactly as catalogued in ARCHITECTURE.md §3; provide integration tests under `StellaOps.Feedser.Tests/Storage` that assert index presence and TTL semantics.
- Publish **contribution docs**: `/docs/contribute.md` summarizing coding standards, release tagging, and commit style (IMPLEMENTATION.md §1.5-1.6).
- Hand-off: once WebService boots with `/health` and `/ready` endpoints, scheduler skeleton, and Mongo indices created on startup, notify BE-Conn/BE-Merge/BE-Export via project board and land a baseline tag (`v0.1.0-alpha1`).
### 19.3 BE-Conn-X — Source Connector engineers
- Priority waves (IMPLEMENTATION.md §4 + §5): registries (CVE/NVD/GHSA/OSV), national CERTs, vendor PSIRTs, distros, KEV, ICS. Pick sources in order of dependency on precedence rules (e.g., Red Hat before Debian for RPM logic).
- **Workflow** per source:
1. Extend `StellaOps.Feedser.Source.Common` with fetch helpers (rate-limit, retries) if not already present; reuse `IConnectorClock` to respect windowed crawls.
2. Implement `FetchAsync` to persist documents into Mongo `document` collection with SHA256 + metadata; follow the watermark guidance in §5 of this file.
3. Validate raw payloads against schemas (JSON Schema, XSD, or `Joi` equivalent) and store sanitized DTOs in `dto`. Record validation stats in metrics.
4. Map DTOs into canonical advisories using `StellaOps.Feedser.Models`. Guarantee alias completeness and provenance entries (`parser` vs `llm`).
5. Provide **golden fixtures** in `samples/<source>/` and component tests under `StellaOps.Feedser.Tests/Source.<SourceName>` that cover fetch (with canned HTTP responses), parse, map, and incremental resume.
- **Definition of Ready**: Base pipeline live, HTTP client registered, schema stub written, test scaffolding ready, tokens/keys documented. Coordinate with BE-Base for any additional shared tooling (HTML to XML transforms, PDF text extraction).
- **Definition of Done**: deterministic DTO + map outputs (snapshot hashed, stored under `Feedser.Tests/__snapshots__`), metrics counters added, resume cursor unit tests, and documentation entry in `/docs/sources/<source>.md` describing rate limits, cursor logic, tested fixtures.
- Engage QA early: schedule schema reviews before mapper coding to catch field omissions.
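
Step 2 of the workflow above (persisting fetched payloads with SHA256 + metadata) can be sketched as follows. This is a minimal illustration, not the project's actual storage code: the `RawDocument` record and `RawDocumentFactory` helper are hypothetical names, and only the field names are borrowed from the `document` collection described in ARCHITECTURE.md §3.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical in-memory shape for a row in the `document` collection.
// Field names follow ARCHITECTURE.md §3; the record type itself is an assumption.
public sealed record RawDocument(
    string SourceName,
    string Uri,
    DateTimeOffset FetchedAt,
    string Sha256,
    string ContentType);

public static class RawDocumentFactory
{
    // Hashes the raw payload bytes exactly as fetched, so a re-fetch of
    // unchanged content yields the same digest and can be deduplicated.
    public static RawDocument Create(
        string sourceName,
        string uri,
        byte[] payload,
        string contentType,
        DateTimeOffset fetchedAt)
    {
        // Convert.ToHexString returns uppercase hex; lowercase is used here
        // only as a convention for stored digests.
        string sha256 = Convert.ToHexString(SHA256.HashData(payload)).ToLowerInvariant();
        return new RawDocument(sourceName, uri, fetchedAt, sha256, contentType);
    }
}
```

Passing `fetchedAt` in from the caller, rather than reading the system clock inside the factory, keeps the helper deterministic under a test clock.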
### 19.4 BE-Merge — Canonical merge & dedupe
- Start after BE-Base lands canonical models scaffold (IMPLEMENTATION.md §5). Own the `StellaOps.Feedser.Models` definitions, canonical serialization, and hash calculator used by `merge_event`.
- Implement **version range utilities** (RPM NEVRA, Debian EVR, SemVer) with exhaustive tests covering edge cases (epoch handling, tilde comparisons, wildcard SUSE ranges). Use fixtures from distro connectors to validate precedence rules.
- Build the **identity graph**: CVE-first resolution with fallback to other alias schemes (outlined in §8 of this file). Guarantee deterministic ordering (sort keys + stable merges) and record `beforeHash/afterHash` deltas.
- Enforce **precedence policies**: PSIRT/OVAL override generic ranges, KEV toggles exploitation flags without modifying severity, regional feeds enrich severity text but do not downgrade vendor truth. Cover these with integration tests using fused fixture sets.
- Expose a **Merge service** with idempotent `MergeAsync(IEnumerable<Guid> advisoryIds)` that writes both canonical document and `merge_event` records per run. Provide metrics (`merge.delta.count`, `merge.identity.conflicts`).
- Deliver initial **merge deterministic test**: same fixture set processed twice yields identical hashes; store hash snapshot under `/tests/data/merge/expected-hash.json`.
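
One way to approach the determinism test is to hash a canonical form of the advisory JSON in which object keys are sorted. The sketch below assumes `System.Text.Json` on .NET 8 or later (for `JsonNode.DeepClone`); `CanonicalHash` is a hypothetical helper, and the real canonical serialization rules belong to `StellaOps.Feedser.Models`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json.Nodes;

// Hypothetical helper: canonicalize JSON (sorted object keys, compact output)
// and hash it, so that key order in the input does not affect the digest.
public static class CanonicalHash
{
    public static string Compute(string json)
    {
        JsonNode? root = JsonNode.Parse(json);
        string canonical = Canonicalize(root)?.ToJsonString() ?? "null";
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }

    // Rebuilds the tree with object properties in ordinal key order.
    // Array order is preserved because it may be semantically meaningful.
    private static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        JsonObject obj => new JsonObject(
            obj.OrderBy(p => p.Key, StringComparer.Ordinal)
               .Select(p => KeyValuePair.Create(p.Key, Canonicalize(p.Value)))),
        JsonArray arr => new JsonArray(arr.Select(n => Canonicalize(n)).ToArray()),
        null => null,
        // Leaf values are cloned so they can be attached to the new tree.
        _ => node.DeepClone(),
    };
}
```

A determinism test can then assert that `CanonicalHash.Compute` returns the same value for the same advisory serialized with different key orders, and compare it with the checked-in snapshot.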
### 19.5 BE-Export — JSON & Trivy DB packaging
- After BE-Merge exposes stable canonical output, implement **JSON exporter** mirroring `aquasecurity/vuln-list` layout (Implementation §5 step 4; ARCHITECTURE.md stage 3). Ensure directory determinism (sorted keys, newline conventions) and record export cursor consumption.
- For **Trivy DB exporter**: wrap the official `trivy-db` builder initially; orchestrate invocation in-process or via CLI with reproducible environment variables. Persist resulting `metadata.json`, BoltDB file, and tarball in `export_state` with digests.
- Integrate **ORAS push** pipeline guarded by dry-run flag; support offline bundle packaging (zip/tar) for air-gapped delivery. Provide config-driven repo/tag resolution from `feedser.yaml`.
- Metrics to emit: `export.duration`, `export.records`, `export.size_bytes`, `export.delta` counts, ORAS push success/failure.
- Tests: snapshot JSON tree, verify OCI manifest media types, simulate incremental export using seeded `advisory` records. Provide CLI smoke test hitting `POST /jobs/export/trivydb` (or CLI equivalent once added).
### 19.6 QA — Validation & observability lead
- Build and maintain **test matrix** covering connectors, merge, export; enforce schema evolution workflows (any schema change must include updated fixtures, docs, and backward-compat diff summary).
- Own **component test harness** utilities (HTTP canned server, PDF→text conversion mocks, time-travel clock). Ensure connectors take an `IClock` dependency so tests stay deterministic.
- Set up **golden snapshot review** cadence; add tooling (`dotnet test --filter Category=Golden -- Dump`) to regenerate fixtures and compare diffs.
- Monitor **observability baselines**: define default alert thresholds for fetch failures, merge conflicts, export delays. Provide dashboards or documented query templates (Grafana/Prometheus) referenced in `/docs/observability.md`.
- Verify **reproducibility**: rerun end-to-end pipeline twice and compare exported digests; document discrepancies and feed them back to BE-Merge/BE-Export.
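
The time-travel clock mentioned above could look roughly like this. The document names `IClock` but does not define its shape, so the single `UtcNow` member and the `FakeClock` type are assumptions made for illustration.

```csharp
using System;

// Assumed shape of the clock abstraction; the real interface may differ.
public interface IClock
{
    DateTimeOffset UtcNow { get; }
}

// Production implementation backed by the system clock.
public sealed class SystemClock : IClock
{
    public DateTimeOffset UtcNow => DateTimeOffset.UtcNow;
}

// Test implementation: time only moves when the test advances it.
public sealed class FakeClock : IClock
{
    private DateTimeOffset _now;

    public FakeClock(DateTimeOffset start) => _now = start;

    public DateTimeOffset UtcNow => _now;

    public void Advance(TimeSpan delta) => _now += delta;
}
```

On .NET 8 and later, the built-in `System.TimeProvider` abstraction is an alternative worth considering before introducing a custom interface.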
### 19.7 Coordination & delivery cadence
- Sprint naming follows Implementation plan (Sprint 0-4+). Maintain a shared Kanban board with swimlanes per role; entries should reference the numbered tasks from IMPLEMENTATION.md for traceability.
- Hold **daily 15-minute sync** focused on blockers, schema changes, and source-specific rate limits discovered. Escalate major schema/API shifts immediately and capture in `/docs/incidents/YYYY-MM-DD.md`.
- Milestone gates:
1. **Platform GA** — BE-Base tasks 1-6 closed; Mongo + scheduler operational.
2. **Core connectors online** — NVD, GHSA, OSV, KEV delivering mapped advisories.
3. **Distro precedence** — Red Hat + Ubuntu connectors validating version precedence vs NVD.
4. **Merge deterministic** — hash stability test green.
5. **Export ready** — JSON + Trivy DB artifacts validated and pushed to staging registry.
### 19.8 Hand-off artifacts & documentation expectations
- Every completed task must append/adjust documentation: connector guides, merge rules, exporter usage. Store under `/docs` mirroring role ownership; include sample commands, curl invocations, and expected outputs.
- Check in sample configs (`etc/feedser.local.yaml.example`) per role with placeholders for sensitive values.
- Capture **post-task retros** in `docs/retros/<sprint>-<task>.md` summarizing lessons, API quirks, schema diffs, and follow-up tickets.
- Maintain a shared **glossary** (`docs/glossary.md`) covering abbreviations (KEV, OVAL, EVR, NEVRA, GridFS) to speed up onboarding for new engineers.
---
The sections above convert the role taxonomy from IMPLEMENTATION.md into actionable playbooks while keeping the authoritative architecture and data contracts inline with this guide. Use them to bootstrap work, then dive into ARCHITECTURE.md and IMPLEMENTATION.md for exhaustive references and sprint-by-sprint sequencing.
### Dependency Injection Registration Baseline
- Every product or library that needs DI registration must expose a `<ProductNamespace>.DependencyInjection` namespace/folder containing one or more implementations of `StellaOps.DependencyInjection.IDependencyInjectionRoutine`.
- When distributing only a subset of connectors/exporters, create a dedicated solution that references just the desired plugin projects; the shared build rules always emit every `StellaOps.Feedser.Source.*` and `StellaOps.Feedser.Exporter.*` assembly into `PluginBinaries` by default.
- The `IDependencyInjectionRoutine` interface lives in `src/__Libraries/StellaOps.DependencyInjection` and enables both static opt-in registration and reflection-driven root composition.
- Each library should provide static helpers that wrap its DI registrations and a thin routine class that forwards to those helpers, for example:
```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Contract implemented by every library that needs DI registration.
public interface IDependencyInjectionRoutine
{
    IServiceCollection Register(
        IServiceCollection services,
        IConfiguration configuration);
}

// Static helper that owns the library's actual registrations.
public static class NamespaceLibrary
{
    public static IServiceCollection RegisterNamespaceLibrary(
        IServiceCollection services,
        IConfiguration configuration)
    {
        // ...
        return services;
    }
}

// Thin routine that forwards to the static helper.
public sealed class DependencyInjectionRoutine : IDependencyInjectionRoutine
{
    public IServiceCollection Register(
        IServiceCollection services,
        IConfiguration configuration)
    {
        return NamespaceLibrary.RegisterNamespaceLibrary(services, configuration);
    }
}
```
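
The reflection-driven root composition mentioned above might be wired up along these lines. This is a sketch under assumptions: `DependencyInjectionRoutineLoader` and `RegisterRoutinesFrom` are hypothetical names, and the interface is repeated only so the example compiles on its own (in the repo it would come from `StellaOps.DependencyInjection`).

```csharp
using System;
using System.Linq;
using System.Reflection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Repeated from the block above for self-containment only.
public interface IDependencyInjectionRoutine
{
    IServiceCollection Register(IServiceCollection services, IConfiguration configuration);
}

// Hypothetical loader: finds every concrete routine with a public
// parameterless constructor in the given assemblies and invokes it.
public static class DependencyInjectionRoutineLoader
{
    public static IServiceCollection RegisterRoutinesFrom(
        this IServiceCollection services,
        IConfiguration configuration,
        params Assembly[] assemblies)
    {
        var routineTypes = assemblies
            .SelectMany(a => a.GetTypes())
            .Where(t => typeof(IDependencyInjectionRoutine).IsAssignableFrom(t)
                        && t is { IsClass: true, IsAbstract: false }
                        && t.GetConstructor(Type.EmptyTypes) is not null)
            // Stable ordering keeps registration order reproducible across runs.
            .OrderBy(t => t.FullName, StringComparer.Ordinal);

        foreach (var type in routineTypes)
        {
            var routine = (IDependencyInjectionRoutine)Activator.CreateInstance(type)!;
            routine.Register(services, configuration);
        }

        return services;
    }
}
```

A host could call `services.RegisterRoutinesFrom(configuration, pluginAssemblies)` after loading the assemblies found in `PluginBinaries`, while libraries that prefer static opt-in keep calling their own `Register...` helper directly.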

View File

@@ -1,191 +0,0 @@
# ARCHITECTURE.md — **StellaOps.Feedser**
> **Goal**: Build a sovereign-ready, self-hostable **feed-merge service** that ingests authoritative vulnerability sources, normalizes and de-duplicates them into **MongoDB**, and exports **JSON** and **Trivy-compatible DB** artifacts.
> **Form factor**: Long-running **Web Service** with **REST APIs** (health, status, control) and an embedded **internal cron scheduler**.
> **No signing inside Feedser** (signing is a separate pipeline step).
> **Runtime SDK baseline**: .NET 10 Preview 7 (SDK 10.0.100-preview.7.25380.108) targeting `net10.0`, aligned with the deployed api.stella-ops.org service.
> **Three explicit stages**:
>
> 1. **Source Download** → raw documents.
> 2. **Merge + Dedupe + Normalization** → MongoDB canonical.
> 3. **Export** → JSON or TrivyDB (full or delta), then (externally) sign/publish.
---
## 1) Naming & Solution Layout
**Solution root**: `StellaOps.Feedser`
**Source connectors** namespace prefix: `StellaOps.Feedser.Source.*`
**Exporters**:
* `StellaOps.Feedser.Exporter.Json`
* `StellaOps.Feedser.Exporter.TrivyDb`
**Projects** (`/src`):
```
StellaOps.Feedser.WebService/ # ASP.NET Core (Minimal API, net10.0 preview) WebService + embedded scheduler
StellaOps.Feedser.Core/ # Domain models, pipelines, merge/dedupe engine, jobs orchestration
StellaOps.Feedser.Models/ # Canonical POCOs, JSON Schemas, enums
StellaOps.Feedser.Storage.Mongo/ # Mongo repositories, GridFS access, indexes, resume "flags"
StellaOps.Feedser.Source.Common/ # HTTP clients, rate-limiters, schema validators, parsers utils
StellaOps.Feedser.Source.Cve/
StellaOps.Feedser.Source.Nvd/
StellaOps.Feedser.Source.Ghsa/
StellaOps.Feedser.Source.Osv/
StellaOps.Feedser.Source.Jvn/
StellaOps.Feedser.Source.CertCc/
StellaOps.Feedser.Source.Kev/
StellaOps.Feedser.Source.Kisa/
StellaOps.Feedser.Source.CertIn/
StellaOps.Feedser.Source.CertFr/
StellaOps.Feedser.Source.CertBund/
StellaOps.Feedser.Source.Acsc/
StellaOps.Feedser.Source.Cccs/
StellaOps.Feedser.Source.Ru.Bdu/ # HTML→schema with LLM fallback (gated)
StellaOps.Feedser.Source.Ru.Nkcki/ # PDF/HTML bulletins → structured
StellaOps.Feedser.Source.Vndr.Msrc/
StellaOps.Feedser.Source.Vndr.Cisco/
StellaOps.Feedser.Source.Vndr.Oracle/
StellaOps.Feedser.Source.Vndr.Adobe/
StellaOps.Feedser.Source.Vndr.Apple/
StellaOps.Feedser.Source.Vndr.Chromium/
StellaOps.Feedser.Source.Vndr.Vmware/
StellaOps.Feedser.Source.Distro.RedHat/
StellaOps.Feedser.Source.Distro.Ubuntu/
StellaOps.Feedser.Source.Distro.Debian/
StellaOps.Feedser.Source.Distro.Suse/
StellaOps.Feedser.Source.Ics.Cisa/
StellaOps.Feedser.Source.Ics.Kaspersky/
StellaOps.Feedser.Normalization/ # Canonical mappers, validators, version-range normalization
StellaOps.Feedser.Merge/ # Identity graph, precedence, deterministic merge
StellaOps.Feedser.Exporter.Json/
StellaOps.Feedser.Exporter.TrivyDb/
StellaOps.Feedser.Tests/ # Unit, component, integration & golden fixtures
```
---
## 2) Runtime Shape
**Process**: single service (`StellaOps.Feedser.WebService`)
* `Program.cs`: top-level entry using **Generic Host**, **DI**, **Options** binding from `appsettings.json` + environment + optional `feedser.yaml`.
* Built-in **scheduler** (cron-like) + **job manager** with **distributed locks** in Mongo to prevent overlaps, enforce timeouts, allow cancel/kill.
* **REST APIs** for health/readiness/progress/trigger/kill/status.
**Key NuGet concepts** (indicative): `MongoDB.Driver`, `Polly` (retry/backoff), `System.Threading.Channels`, `Microsoft.Extensions.Http`, `Microsoft.Extensions.Hosting`, `Serilog`, `OpenTelemetry`.
---
## 3) Data Storage — **MongoDB** (single source of truth)
**Database**: `feedser`
**Write concern**: `majority` for merge/export state, `acknowledged` for raw docs.
**Collections** (with “flags”/resume points):
* `source`
* `_id`, `name`, `type`, `baseUrl`, `auth`, `notes`.
* `source_state`
* Keys: `sourceName` (unique), `enabled`, `cursor`, `lastSuccess`, `failCount`, `backoffUntil`, `paceOverrides`, `paused`.
* Drives incremental fetch/parse/map resume and operator pause/pace controls.
* `document`
* `_id`, `sourceName`, `uri`, `fetchedAt`, `sha256`, `contentType`, `status`, `metadata`, `gridFsId`, `etag`, `lastModified`.
* Index `{sourceName:1, uri:1}` unique; optional TTL for superseded versions.
* `dto`
* `_id`, `sourceName`, `documentId`, `schemaVer`, `payload` (BSON), `validatedAt`.
* Index `{sourceName:1, documentId:1}`.
* `advisory`
* `_id`, `advisoryKey`, `title`, `summary`, `lang`, `published`, `modified`, `severity`, `exploitKnown`.
* Unique `{advisoryKey:1}` plus indexes on `modified` and `published`.
* `alias`
* `advisoryId`, `scheme`, `value` with index `{scheme:1, value:1}`.
* `affected`
* `advisoryId`, `platform`, `name`, `versionRange`, `cpe`, `purl`, `fixedBy`, `introducedVersion`.
* Index `{platform:1, name:1}`, `{advisoryId:1}`.
* `reference`
* `advisoryId`, `url`, `kind`, `sourceTag` (e.g., advisory/patch/kb).
* Flags collections: `kev_flag`, `ru_flags`, `jp_flags`, `psirt_flags` keyed by `advisoryId`.
* `merge_event`
* `_id`, `advisoryKey`, `beforeHash`, `afterHash`, `mergedAt`, `inputs` (document ids).
* `export_state`
* `_id` (`json`/`trivydb`), `baseExportId`, `baseDigest`, `lastFullDigest`, `lastDeltaDigest`, `exportCursor`, `targetRepo`, `exporterVersion`.
* `locks`
* `_id` (`jobKey`), `holder`, `acquiredAt`, `heartbeatAt`, `leaseMs`, `ttlAt` (TTL index cleans dead locks).
* `jobs`
* `_id`, `type`, `args`, `state`, `startedAt`, `endedAt`, `error`, `owner`, `heartbeatAt`, `timeoutMs`.
**GridFS buckets**: `fs.documents` for raw large payloads; referenced by `document.gridFsId`.
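
The `failCount` and `backoffUntil` fields in `source_state` imply some backoff policy, which this document does not specify. The sketch below assumes capped exponential backoff purely for illustration; `SourceBackoff` and its parameters are hypothetical.

```csharp
using System;

// Hypothetical policy: the delay doubles with each consecutive failure,
// starting at baseDelay and never exceeding maxDelay.
public static class SourceBackoff
{
    public static DateTimeOffset NextAttempt(
        DateTimeOffset now,
        int failCount,
        TimeSpan baseDelay,
        TimeSpan maxDelay)
    {
        // No recorded failures: the source may run immediately.
        if (failCount <= 0)
        {
            return now;
        }

        // Clamp the exponent so the multiplication cannot overflow.
        int exponent = Math.Min(failCount - 1, 30);
        double ticks = baseDelay.Ticks * Math.Pow(2, exponent);
        long capped = (long)Math.Min(ticks, maxDelay.Ticks);
        return now + TimeSpan.FromTicks(capped);
    }
}
```

The returned value would be stored as `backoffUntil`, and the scheduler would skip the source until the clock passes it.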
---
## 4) Job & Scheduler Model
* Scheduler stores cron expressions per source/exporter in config; persists next-run pointers in Mongo.
* Jobs acquire locks (`locks` collection) to ensure singleton execution per source/exporter.
* Supports manual triggers via API endpoints (`POST /jobs/{type}`) and pause/resume toggles per source.
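
The lease semantics behind the `locks` collection can be illustrated with an in-memory model. This is not the Mongo implementation: `InMemoryLockStore` is a hypothetical stand-in, and a real store would need to perform the same check-and-set as a single atomic operation on the `locks` document.

```csharp
using System;
using System.Collections.Generic;

// Mirrors the fields listed for the `locks` collection (minus `ttlAt`).
public sealed record JobLock(
    string JobKey,
    string Holder,
    DateTimeOffset AcquiredAt,
    DateTimeOffset HeartbeatAt,
    int LeaseMs)
{
    // A lock is dead once the lease has elapsed since the last heartbeat.
    public bool IsExpired(DateTimeOffset now) => now > HeartbeatAt.AddMilliseconds(LeaseMs);
}

public sealed class InMemoryLockStore
{
    private readonly Dictionary<string, JobLock> _locks = new(StringComparer.Ordinal);
    private readonly object _gate = new();

    // Succeeds when the key is free, already held by the caller, or expired.
    public bool TryAcquire(string jobKey, string holder, int leaseMs, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (_locks.TryGetValue(jobKey, out var existing)
                && existing.Holder != holder
                && !existing.IsExpired(now))
            {
                return false;
            }

            _locks[jobKey] = new JobLock(jobKey, holder, now, now, leaseMs);
            return true;
        }
    }

    // Extends the lease; fails if the caller no longer owns a live lock.
    public bool Heartbeat(string jobKey, string holder, DateTimeOffset now)
    {
        lock (_gate)
        {
            if (!_locks.TryGetValue(jobKey, out var existing)
                || existing.Holder != holder
                || existing.IsExpired(now))
            {
                return false;
            }

            _locks[jobKey] = existing with { HeartbeatAt = now };
            return true;
        }
    }
}
```

A job that fails a heartbeat should treat its lock as lost and cancel itself, since another instance may already have taken over.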
---
## 5) Connector Contracts
Connectors implement:
```csharp
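using System;
using System.Threading;
using System.Threading.Tasks;

public interface IFeedConnector {
    string SourceName { get; }
    Task FetchAsync(IServiceProvider sp, CancellationToken ct);  // raw payloads → `document`
    Task ParseAsync(IServiceProvider sp, CancellationToken ct);  // validated payloads → `dto`
    Task MapAsync(IServiceProvider sp, CancellationToken ct);    // DTOs → canonical advisories
}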
```
* Fetch populates `document` rows respecting rate limits, conditional GET, and `source_state.cursor`.
* Parse validates schema (JSON Schema, XSD) and writes sanitized DTO payloads.
* Map produces canonical advisory rows + provenance entries; must be idempotent.
* Base helpers in `StellaOps.Feedser.Source.Common` provide HTTP clients, retry policies, and watermark utilities.
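
The conditional GET behaviour expected from Fetch can be sketched with the standard `HttpClient` types. `ConditionalGet` is a hypothetical helper; it assumes the `etag` and `lastModified` values were stored on the `document` row from a previous response.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper: builds a GET request that lets the server answer
// 304 Not Modified when the previously fetched copy is still current.
public static class ConditionalGet
{
    public static HttpRequestMessage Build(Uri uri, string? etag, DateTimeOffset? lastModified)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // The stored ETag must be in header form, e.g. "\"abc\"" or W/"abc".
        // Values that do not parse are skipped instead of failing the fetch.
        if (!string.IsNullOrEmpty(etag) && EntityTagHeaderValue.TryParse(etag, out var parsed))
        {
            request.Headers.IfNoneMatch.Add(parsed);
        }

        if (lastModified is not null)
        {
            request.Headers.IfModifiedSince = lastModified;
        }

        return request;
    }
}
```

When the response status is `HttpStatusCode.NotModified`, the connector can skip hashing and storage for that URI and move on to the next one.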
---
## 6) Merge & Normalization
* Canonical model stored in `StellaOps.Feedser.Models` with serialization contracts used by storage/export layers.
* `StellaOps.Feedser.Normalization` handles NEVRA/EVR/PURL range parsing, CVSS normalization, localization.
* `StellaOps.Feedser.Merge` builds alias graphs keyed by CVE first, then falls back to vendor/regional IDs.
* Precedence rules: PSIRT/OVAL overrides generic ranges; KEV only toggles exploitation; regional feeds enrich severity but don't override vendor truth.
* Determinism enforced via canonical JSON hashing logged in `merge_event`.
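
The precedence rules above can be sketched as a small pure function. All type names here are hypothetical, and the relative ranking of PSIRT versus OVAL is an assumption, since the document only states that both override generic ranges.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum SourceKind { Psirt, Oval, Generic, Regional, Kev }

// One source's view of an advisory; null means "this source says nothing".
public sealed record Observation(SourceKind Kind, string SourceName, string? VersionRange, string? Severity);

public sealed record MergedView(string? VersionRange, string? Severity, bool ExploitKnown);

public static class Precedence
{
    // Lower rank wins. Psirt-before-Oval is an assumed tie-break.
    private static int Rank(SourceKind kind) => kind switch
    {
        SourceKind.Psirt => 0,
        SourceKind.Oval => 1,
        SourceKind.Generic => 2,
        SourceKind.Regional => 3,
        _ => int.MaxValue,
    };

    public static MergedView Merge(IEnumerable<Observation> observations)
    {
        var all = observations.ToList();

        // KEV entries never contribute ranges or severity.
        var ordered = all
            .Where(o => o.Kind != SourceKind.Kev)
            .OrderBy(o => Rank(o.Kind))
            // Ordinal tie-break keeps the result independent of input order.
            .ThenBy(o => o.SourceName, StringComparer.Ordinal)
            .ToList();

        return new MergedView(
            VersionRange: ordered.Select(o => o.VersionRange).FirstOrDefault(v => v is not null),
            Severity: ordered.Select(o => o.Severity).FirstOrDefault(s => s is not null),
            // KEV only toggles the exploitation flag.
            ExploitKnown: all.Any(o => o.Kind == SourceKind.Kev));
    }
}
```

In this sketch a regional severity is used only when no higher-ranked source supplies one, which is one reading of "enrich but don't override".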
---
## 7) Exporters
* JSON exporter mirrors `aquasecurity/vuln-list` layout with deterministic ordering and reproducible timestamps.
* Trivy DB exporter initially shells out to `trivy-db` builder; later will emit BoltDB directly.
* `StellaOps.Feedser.Storage.Mongo` provides cursors for delta exports based on `export_state.exportCursor`.
* Export jobs produce OCI tarballs (layer media type `application/vnd.aquasec.trivy.db.layer.v1.tar+gzip`) and optionally push via ORAS.
---
## 8) Observability
* Serilog structured logging with enrichment fields (`source`, `uri`, `stage`, `durationMs`).
* OpenTelemetry traces around fetch/parse/map/export; metrics for rate limit hits, schema failures, dedupe ratios, package size.
* Prometheus scraping endpoint served by WebService.
---
## 9) Security Considerations
* Offline-first: connectors only reach allowlisted hosts.
* BDU LLM fallback gated by config flag; logs audit trail with confidence score.
* No secrets written to logs; secrets loaded via environment or mounted files.
* Signing handled outside Feedser pipeline.
---
## 10) Deployment Notes
* Default storage MongoDB; for air-gapped, bundle Mongo image + seeded data backup.
* Horizontal scale achieved via multiple web service instances sharing Mongo locks.
* Provide `feedser.yaml` template describing sources, rate limits, and export settings.

View File

@@ -1,175 +0,0 @@
# IMPLEMENTATION.md — **StellaOps.Feedser** · *Sprints & Tasks Delivery Plan*
> **Objective**: Deliver a production-ready **StellaOps.Feedser** web service that ingests authoritative vulnerability sources, normalizes & de-duplicates into **MongoDB**, and exports **JSON** and **Trivy-compatible DB** artifacts with **resume flags** for inputs and outputs.
> **Team assumption**: **1 Base Engineer** (preparation / platform) in early sprints to unlock **parallel connector work** by many engineers later.
> **Task sizing**: each task is **0.5-2.0 days** (est.).
> **Runtime SDK baseline**: .NET 10 Preview 7 (SDK 10.0.100-preview.7.25380.108) with target framework `net10.0`, matching the deployed api.stella-ops.org build.
> **Strict staging** (as previously defined):
>
> 1. **Source Download** (docs) → 2) **Merge + Dedupe + Normalization** (Mongo) → 3) **Export** (JSON / TrivyDB).
> **No signing in Feedser**; signing/promotion is an **external pipeline** step.
---
## 0) Roles, Notation & Constraints
* **BE-Base** — Base Engineer (preparation/platform).
* **BE-Conn-X** — Connector Engineer for source X (waves).
* **BE-Merge** — Engineer focusing on merge/dedupe.
* **BE-Export** — Engineer focusing on exporters.
* **QA** — test/dev-ex reviewer (could be engineers rotating).
* **Est.** — estimated effort for the task (0.5/1/1.5/2 days).
* **Depends** — tasks that must finish first.
* **Unlocks** — parallelizable work that becomes possible.
---
## 1) Sprint 0 — Inception & Repo Bootstrap (Week 1)
> Goal: Stand up solution scaffolding, CI, coding standards, and minimal docs to let others clone and build.
1. **Create solution & project skeletons**
* Create `src/` solution with projects & namespaces:
* `StellaOps.Feedser.WebService`, `StellaOps.Feedser.Core`, `StellaOps.Feedser.Models`,
`StellaOps.Feedser.Storage.Mongo`, `StellaOps.Feedser.Source.Common`,
`StellaOps.Feedser.Exporter.Json`, `StellaOps.Feedser.Exporter.TrivyDb`,
stubs for `StellaOps.Feedser.Source.*` families.
* Add top-level `Directory.Build.props` / `Directory.Build.targets`.
* Pin all projects to `net10.0` and ensure CI/global.json uses .NET SDK 10.0.100-preview.7.25380.108 (same preview as api.stella-ops.org).
2. **Coding standards & analyzers**
* `.editorconfig`, `StyleCop.Analyzers`, Roslyn rulesets, nullable enabled, warnings as errors in CI.
3. **Solution build & test CI**
* GitHub Actions (or internal CI) with matrix: `dotnet build`, `dotnet test`, coverage, artifact publish.
4. **Docker build & devcontainer**
* Multi-stage Dockerfile for WebService; VS Code devcontainer (`.devcontainer/`).
5. **Docs scaffold**
* Seed `/docs` with `README.md`, link to `AGENTS.md`, `ARCHITECTURE.md`, contribution guide, commit message convention.
6. **Versioning & release tagging**
* SemVer policy; auto version from tags.
---
## 2) Sprint 1 — Configuration, MongoDB & Resume Flags (Week 2)
> Goal: Stand up MongoDB storage, resume flags for **sources** and **exports**, and config plumbing.
1. **Configuration plumbing** — Bind `appsettings.json` + optional `feedser.yaml` + ENV to strongly typed `FeedserConfig`.
2. **Mongo client & database bootstrap** — Register `IMongoClient` / `IMongoDatabase`, health probes, retry policies.
3. **Collections & indexes** — Create `source`, `source_state`, `document`, `dto`, `advisory`, `alias`, `affected`, `reference`, `kev_flag`, `ru_flags`, `jp_flags`, `psirt_flags`, `merge_event`, `export_state`, `locks`, `jobs`.
4. **GridFS ingestion path** — Store large payloads, link via `document.gridFsId`, dedupe by SHA256.
5. **Resume flags — sources** — CRUD helpers for `cursor`, `lastSuccess`, `failCount`, `backoffUntil`, `paused`, `paceOverrides`.
6. **Resume flags — exports** — Manage `baseExportId`, `baseDigest`, `lastFullDigest`, `lastDeltaDigest`, `exportCursor`, `targetRepo`, `exporterVersion`.
7. **Storage integration tests** — Validate indexes, TTL, GridFS round-trip, concurrency.
---
## 3) Sprint 2 — Web Service, APIs & Job Infrastructure (Week 3)
1. Minimal API host (`Program.cs`) exposing `/health`, `/ready`.
2. Status & control endpoints (`/status`, `/sources`, pause/pace controls).
3. Job ledger & state machine with `jobs` collection.
4. Distributed lock service with lease & heartbeat.
5. Scheduler with cron parsing, singleton enforcement, backoff.
6. Kill/timeout support via cancellation tokens.
7. Optional API key auth middleware.
---
## 4) Sprint 3 — Source Common, Rate Limit & Validation (Week 4)
1. Connector base classes implementing `IFeedConnector` helpers.
2. HTTP client factory with rate limiter and retry policies.
3. Schema/XSD/JSON validation framework.
4. Document pipeline helpers for conditional GET, hashing, dedupe.
5. DTO pipeline & provenance tracking.
6. Observability wiring (Serilog, OpenTelemetry).
---
## 5) Sprint 4 — Normalization & Merge Engine (Week 5)
1. Canonical models (`StellaOps.Feedser.Models`).
2. Version range utilities (RPM NEVRA, Debian EVR, SemVer).
3. Alias graph & identity resolution.
4. Precedence enforcement & deterministic serialization.
5. Merge service with hash logging and metrics.
6. Merge determinism tests.
---
## 6) Sprint 5 — Exporters & Packaging (Week 6)
1. JSON exporter mirroring `vuln-list` tree.
2. Trivy DB exporter integration (`trivy-db` builder initially).
3. ORAS push / offline bundle support.
4. Export metrics & audit logging.
5. Incremental export tests and media-type validation.
---
## 7) Sprint 6 — Connectors Wave 1 (Weeks 7-8)
* Registries: CVE, NVD, GHSA, OSV.
* KEV catalog ingestion.
* National CERTs: CERT/CC, JVN, CERT-FR, CERT-In, KISA, CERT-Bund, ACSC, CCCS.
Each connector deliverable includes: watermarking, schema validation, mapping, provenance, metrics, golden fixtures, incremental resume tests.
---
## 8) Sprint 7 — Connectors Wave 2 (Weeks 9-10)
* PSIRTs: MSRC, Cisco, Oracle, Adobe, Apple, Chromium, VMware.
* Distros: Red Hat (API + OVAL), Ubuntu, Debian, SUSE.
* ICS: CISA ICS, Kaspersky ICS-CERT.
* Russia-specific: BDU (HTML + LLM fallback), NKCKI.
---
## 9) Sprint 8 — Merge Hardening & QA
* Conflict resolution scenarios (mixed aliases, partial data).
* Performance tuning, batching, streaming parse.
* Deterministic output regression tests.
* Golden snapshot review tooling.
---
## 10) Sprint 9 — Packaging, Delivery & Ops
* Reproducible export pipelines (hash-based build IDs).
* OCI push automation, offline bundle scripts.
* Observability dashboards, alerts.
* Disaster recovery playbooks (Mongo backups, export restore).
---
## 11) Sprint 10 — Launch Readiness
* Penetration test fixes, security hardening.
* Documentation polish (operator guides, API reference).
* Release tagging, change log, migration notes.
* GA criteria review with stakeholders.
---
## Definition of Done (for any code change)
1. Unit/integration tests updated and passing.
2. Schema/golden fixtures regenerated when applicable.
3. Telemetry (logs/metrics/traces) reviewed for signal quality.
4. Docs updated (`/docs`, `/etc/feedser.yaml` examples).
5. Reproducibility verified (hash match on repeated run).