Add topology auth policies + journey findings notes

Concelier:
- Register Topology.Read, Topology.Manage, Topology.Admin authorization
  policies mapped to OrchRead/OrchOperate/PlatformContextRead/IntegrationWrite
  scopes. Previously these policies were referenced by endpoints but never
  registered, causing System.InvalidOperationException on every topology
  API call.

Gateway routes:
- Simplified targets/environments routes (removed specific sub-path routes,
  use catch-all patterns instead)
- Changed environments base route to JobEngine (where CRUD lives)
- Changed to ReverseProxy type for all topology routes

KNOWN ISSUE (not yet fixed):
- ReverseProxy routes don't forward the gateway's identity envelope to
  Concelier. The regions/targets/bindings endpoints return 401 because
  hasPrincipal=False — the gateway authenticates the user but doesn't
  pass the identity to the backend via ReverseProxy. Microservice routes
  use Valkey transport which includes envelope headers. Topology endpoints
  need either: (a) Valkey transport registration in Concelier, or
  (b) Concelier configured to accept raw bearer tokens on ReverseProxy paths.
  This is an architecture-level fix.

Journey findings collected so far:
- Integration wizard (Harbor + GitHub App): works end-to-end
- Advisory Check All: fixed (parallel individual checks)
- Mirror domain creation: works, generate-immediately fails silently
- Topology wizard Step 1 (Region): blocked by auth passthrough issue
- Topology wizard Step 2 (Environment): POST to JobEngine needs verification
- User ID resolution: raw hashes shown everywhere

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
master
2026-03-16 08:12:39 +02:00
parent 602df77467
commit da76d6e93e
223 changed files with 24763 additions and 489 deletions


@@ -0,0 +1,354 @@
# Advisory & VEX Mirror Setup Audit - 2026-03-15
**Auditor**: AI agent acting as first-time operator setting up Stella Ops as a vulnerability/VEX advisory mirror
**Stack**: Live local (stella-ops.local), logged in as admin/Admin@Stella2026!
**Scope**: End-to-end assessment of adding, selecting, grouping, and aggregating advisory/VEX sources via UI, CLI, and backend
---
## Executive Summary
Stella Ops has a **well-architected backend** for advisory/VEX aggregation (47 sources, rate limiting, backoff, deduplication, conflict detection, VEX normalization, airgap/offline support). The **CLI is fully functional** with source management commands. However, **the UI has critical gaps** that prevent a first-time operator from setting up advisory sources without CLI or developer knowledge.
| Layer | Readiness |
|-------|-----------|
| Backend catalog (47 sources, 9 categories) | READY |
| Rate limiting & backoff | READY |
| VEX ingestion pipeline | READY |
| CLI source management | READY |
| CLI setup wizard | READY |
| Feeds & Airgap operations page | PARTIAL |
| UI source addition flow | MISSING |
| UI group/batch source selection | MISSING |
| UI source configuration (API keys, intervals) | MISSING |
---
## 1. Backend Source Catalog Assessment
### 1.1 Supported Sources (47 total)
**File**: `src/Concelier/__Libraries/StellaOps.Concelier.Core/Sources/SourceDefinitions.cs`
| Category | Count | Sources |
|----------|-------|---------|
| Primary Databases | 6 | NVD (NIST), OSV (Google), GitHub Security Advisories, CVE.org (MITRE), EPSS (FIRST), CISA KEV |
| Vendor Advisories | 11 | Red Hat, Microsoft MSRC, Amazon Linux, Google, Oracle, Apple, Cisco, Fortinet, Juniper, Palo Alto, VMware |
| Linux Distributions | 9 | Debian, Ubuntu, Alpine, SUSE, RHEL, CentOS, Fedora, Arch, Gentoo |
| Language Ecosystems | 9 | npm, PyPI, Go, RubyGems, NuGet, Maven, Crates.io, Packagist, Hex.pm |
| CSAF/VEX | 3 | CSAF Aggregator, CSAF TC Trusted Publishers, VEX Hub |
| CERTs/Government | 8 | CERT-FR, CERT-Bund (DE), CERT.at (AT), CERT.be (BE), NCSC-CH (CH), CERT-EU, JPCERT/CC (JP), CISA (US) |
| StellaOps Mirror | 1 | Pre-aggregated mirror endpoint |
**Assessment**: Comprehensive coverage. Each source has: ID, display name, category, base endpoint, health check endpoint, auth requirements, credential env var, documentation URL, default priority, region tags, and grouping tags.
### 1.2 Source Grouping Support (Backend)
**Grouping methods available in `SourceDefinitions`**:
- `GetByCategory(SourceCategory)` - Group by Primary/Vendor/Distribution/Ecosystem/Cert/Csaf/Threat/Mirror
- `GetByTag(string)` - Group by tags (e.g., "linux", "network", "eu", "ecosystem")
- `GetByRegion(string)` - Group by geographic region (FR, DE, EU, JP, APAC, US, NA)
- `GetAuthenticatedSources()` - Filter sources requiring API keys
**Assessment**: Backend supports flexible grouping. Tags like "vendor", "distro", "linux", "eu", "ecosystem" enable batch operations. **However, none of this is exposed in the UI.**
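The grouping behaviour described above can be illustrated with a minimal Python sketch. It is not the `SourceDefinitions` implementation: the `Source` record, the catalog entries, and their tag and region values are hypothetical and chosen only to show how category, tag, and region filters could drive batch selection.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Source:
    # Hypothetical record shape; the real catalog holds more fields.
    id: str
    category: str
    tags: frozenset = frozenset()
    regions: frozenset = frozenset()
    requires_auth: bool = False

# Illustrative entries only; tags and regions are assumptions.
CATALOG = [
    Source("nvd", "Primary", frozenset({"primary"}), frozenset({"US"})),
    Source("debian", "Distribution", frozenset({"distro", "linux"})),
    Source("alpine", "Distribution", frozenset({"distro", "linux"})),
    Source("cert-fr", "Cert", frozenset({"eu"}), frozenset({"FR", "EU"})),
    Source("ghsa", "Primary", frozenset({"primary"}), requires_auth=True),
]

def get_by_category(category):
    """Return IDs of all sources in the given category."""
    return [s.id for s in CATALOG if s.category == category]

def get_by_tag(tag):
    """Return IDs of all sources carrying the given tag."""
    return [s.id for s in CATALOG if tag in s.tags]

def get_by_region(region):
    """Return IDs of all sources tagged with the given region."""
    return [s.id for s in CATALOG if region in s.regions]

def get_authenticated_sources():
    """Return IDs of sources that need credentials before they can be enabled."""
    return [s.id for s in CATALOG if s.requires_auth]
```

A UI action such as "Enable all Linux distribution sources" would then reduce to enabling every ID returned by `get_by_tag("linux")`.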
### 1.3 Configuration Model
**File**: `src/Concelier/__Libraries/StellaOps.Concelier.Core/Configuration/SourceConfiguration.cs`
- **Source modes**: Direct (upstream), Mirror (pre-aggregated), Hybrid (mirror + direct fallback)
- **Per-source config**: Enabled/disabled, priority, API key, custom endpoint, request delay, failure backoff, max pages per fetch, metadata
- **Mirror server config**: Export root, authentication (Anonymous/OAuth/ApiKey/mTLS), rate limits, DSSE attestation signing
- **Auto-enable**: `AutoEnableHealthySources = true` by default
**Assessment**: Configuration model is complete and well-designed.
---
## 2. Rate Limiting & Graceful Aggregation Assessment
### 2.1 Per-Source Rate Limiting (Outbound - Concelier)
**File**: `src/Concelier/__Libraries/StellaOps.Concelier.Core/Configuration/SourceConfiguration.cs`
| Setting | Default | Purpose |
|---------|---------|---------|
| `RequestDelay` | 200ms | Delay between consecutive API calls to same source |
| `FailureBackoff` | 5 minutes | Cooldown after a source returns errors |
| `MaxPagesPerFetch` | 10 | Cap pages fetched per sync cycle |
| `ConnectivityCheckTimeout` | 30 seconds | Health check timeout |
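**Assessment**: These defaults are reasonable for most sources. NVD allows 50 req/30s with an API key; a 200ms delay permits up to 5 req/s (150 req/30s), so a sync stays under that limit only because `MaxPagesPerFetch` caps a cycle at 10 pages (assuming one request per page). OSV has no published rate limit. GHSA via GraphQL is limited to 5000 points/hour.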
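The NVD figures can be checked with a short worst-case calculation. This sketch assumes one HTTP request per page and at most one sync cycle per 30-second window; neither assumption is confirmed by the source, and the function name is illustrative.

```python
def peak_requests(delay_ms: int, window_ms: int, max_requests: int) -> int:
    """Upper bound on requests started inside one window.

    Requests start at t = 0, delay, 2*delay, ... and stop after max_requests.
    """
    by_time = -(-window_ms // delay_ms)  # ceiling division
    return min(by_time, max_requests)

WINDOW_MS = 30_000          # NVD limits are expressed per 30 seconds
NVD_LIMIT_WITH_KEY = 50
NVD_LIMIT_WITHOUT_KEY = 5

# RequestDelay alone: 200ms spacing with no page cap.
uncapped = peak_requests(200, WINDOW_MS, max_requests=10**9)

# RequestDelay plus MaxPagesPerFetch = 10.
capped = peak_requests(200, WINDOW_MS, max_requests=10)

print(f"uncapped: {uncapped} req/30s, capped: {capped} req/30s")
print("within keyed limit:", capped <= NVD_LIMIT_WITH_KEY)
print("within keyless limit:", capped <= NVD_LIMIT_WITHOUT_KEY)
```

Under these assumptions the page cap, not the request delay, is what keeps a keyed sync inside the NVD limit, and a keyless sync still exceeds it.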
### 2.2 VEX Hub Polling (Scheduler)
**File**: `src/VexHub/__Libraries/StellaOps.VexHub.Core/Ingestion/VexIngestionScheduler.cs`
| Setting | Default | Purpose |
|---------|---------|---------|
| `DefaultPollingIntervalSeconds` | 3600 (1 hour) | How often each source is polled |
| `MaxConcurrentPolls` | 4 | SemaphoreSlim-limited concurrent ingestions |
| `MaxRetries` | 3 | Retries per ingestion attempt |
| `FetchTimeoutSeconds` | 300 (5 min) | Per-source fetch timeout |
| `BatchSize` | 500 | Statements per batch upsert |
**Scheduler behavior**:
- Runs every 1 minute checking for due sources (`GetDueForPollingAsync`)
- Throttles with `SemaphoreSlim` (max 4 concurrent)
- Updates `LastPolledAt` and `LastErrorMessage` per source after each poll
- Per-source configurable `PollingIntervalSeconds`
**Assessment**: A 1-hour default polling interval with at most 4 concurrent polls is conservative and unlikely to overload upstream providers. When a source fails, the error is logged and the next poll happens after the source's normal interval. **However, there is no exponential backoff**: a failing source is retried at the same interval every time. The `FailureBackoff` in `SourceConfig` (5 min) provides a short cooldown but not progressive backoff.
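The throttling behaviour described above (a semaphore capping concurrent ingestions at 4, with the error recorded per source) can be sketched in Python with `asyncio`. This is an illustration of the pattern, not the `VexIngestionScheduler` code; `poll_all`, `fake_poll`, and the source IDs are hypothetical.

```python
import asyncio

MAX_CONCURRENT_POLLS = 4  # mirrors the documented default

async def poll_all(source_ids, poll_one):
    """Poll every due source, never running more than 4 polls at once."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_POLLS)
    last_error = {}

    async def guarded(source_id):
        async with semaphore:
            try:
                await poll_one(source_id)
                last_error[source_id] = None
            except Exception as exc:
                # Record the failure and let the other polls continue.
                last_error[source_id] = str(exc)

    await asyncio.gather(*(guarded(s) for s in source_ids))
    return last_error

async def _demo():
    in_flight = 0
    peak = 0

    async def fake_poll(source_id):
        # Stand-in for a real fetch; tracks how many polls overlap.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        if source_id == "bad":
            raise RuntimeError("upstream returned 503")

    errors = await poll_all([f"s{i}" for i in range(10)] + ["bad"], fake_poll)
    return peak, errors

peak, errors = asyncio.run(_demo())
print("peak concurrent polls:", peak)
print("failed sources:", [s for s, e in errors.items() if e])
```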
### 2.3 Inbound Rate Limiting (VexHub Mirror Server)
**File**: `src/VexHub/StellaOps.VexHub.WebService/Middleware/RateLimitingMiddleware.cs`
| Setting | Default | Purpose |
|---------|---------|---------|
| Anonymous limit | 60 req/min | Sliding window per IP |
| Authenticated limit | 120 req/min | Sliding window per API key |
| Idle cleanup | 5 min | Expired client entries pruned |
**Headers**: `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`, `Retry-After`
**Assessment**: Proper rate limiting for when Stella Ops acts as a mirror server. Standard headers support client retry logic.
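A sliding-window limiter of the kind described above can be sketched as follows. This is a minimal Python illustration, not the `RateLimitingMiddleware` implementation: the class name is hypothetical, and treating `X-RateLimit-Reset` as "seconds until the oldest hit expires" is an assumption, since the source does not say how that header is computed.

```python
import math
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window_s` seconds."""

    def __init__(self, limit, window_s=60.0):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # client key -> timestamps of allowed hits

    def check(self, client_key, now):
        hits = self._hits[client_key]
        # Drop hits that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        allowed = len(hits) < self.limit
        if allowed:
            hits.append(now)
        reset_in = math.ceil(hits[0] + self.window_s - now) if hits else 0
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(max(self.limit - len(hits), 0)),
            "X-RateLimit-Reset": str(reset_in),
        }
        if not allowed:
            headers["Retry-After"] = str(reset_in)
        return allowed, headers

# Anonymous clients are keyed by IP and limited to 60 req/min.
anonymous = SlidingWindowLimiter(limit=60)
first_sixty = [anonymous.check("203.0.113.7", float(t))[0] for t in range(60)]
blocked, blocked_headers = anonymous.check("203.0.113.7", 59.5)
after_window, _ = anonymous.check("203.0.113.7", 60.0)
```

An authenticated client would use a second instance with `limit=120`, keyed by API key instead of IP.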
### 2.4 Deduplication & Conflict Detection
**VEX Ingestion Pipeline**:
- SHA-256 content digest for deduplication
- Conflict detection: when two sources disagree on VEX status for the same CVE+product
- Conflict severity: Low/Medium/High/Critical
- Auto-resolution for low-severity conflicts
- Provenance tracking (audit trail per statement)
**Assessment**: Well-designed. Prevents duplicate data accumulation and tracks disagreements between sources.
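The deduplication and conflict rules listed above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the pipeline's actual code: the statement fields (`cve`, `product`, `status`), the canonical-JSON hashing, and the `ingest` function are hypothetical, and severity grading and auto-resolution are left out.

```python
import hashlib
import json

def content_digest(statement: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(statement, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def ingest(statements):
    """Drop byte-identical statements, then flag CVE+product pairs with mixed statuses."""
    seen = set()
    by_subject = {}
    for source, stmt in statements:
        digest = content_digest(stmt)
        if digest in seen:
            continue  # duplicate content, already stored
        seen.add(digest)
        key = (stmt["cve"], stmt["product"])
        by_subject.setdefault(key, {})[source] = stmt["status"]
    conflicts = {
        key: statuses
        for key, statuses in by_subject.items()
        if len(set(statuses.values())) > 1
    }
    return len(seen), conflicts

incoming = [
    ("vendor-a", {"cve": "CVE-2024-0001", "product": "pkg:x", "status": "not_affected"}),
    ("vendor-a", {"status": "not_affected", "product": "pkg:x", "cve": "CVE-2024-0001"}),
    ("vendor-b", {"cve": "CVE-2024-0001", "product": "pkg:x", "status": "affected"}),
]
stored, conflicts = ingest(incoming)
```

The second statement is dropped as a duplicate despite its different key order, and the remaining two are reported as a conflict for the same CVE and product.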
---
## 3. CLI Source Management Assessment
### 3.1 Sources Commands
**File**: `src/Cli/StellaOps.Cli/Commands/Sources/SourcesCommandGroup.cs`
| Command | Purpose | Status |
|---------|---------|--------|
| `stella sources list [--category] [--enabled-only] [--json]` | List all 47 sources with category filtering | IMPLEMENTED |
| `stella sources check [source] [--all] [--parallel N] [--timeout N] [--auto-disable]` | Connectivity check with auto-disable | IMPLEMENTED |
| `stella sources enable <sources...>` | Enable one or more sources by ID | IMPLEMENTED |
| `stella sources disable <sources...>` | Disable one or more sources by ID | IMPLEMENTED |
| `stella sources status [--json]` | Show current configuration status | IMPLEMENTED |
**Assessment**: Full source management via CLI (list, check, enable, disable, status). Supports batch enable/disable (multiple source IDs in one command). Category filtering available. Auto-disable on connectivity failure.
### 3.2 Feeds Snapshot Commands
**File**: `src/Cli/StellaOps.Cli/Commands/FeedsCommandGroup.cs`
| Command | Purpose | Status |
|---------|---------|--------|
| `stella feeds snapshot create [--label] [--sources] [--json]` | Create atomic feed snapshot | IMPLEMENTED |
| `stella feeds snapshot list [--limit N]` | List available snapshots | IMPLEMENTED |
| `stella feeds snapshot export <id> --output <path> [--compression zstd\|gzip\|none]` | Export for offline/airgap use | IMPLEMENTED |
| `stella feeds snapshot import <file> [--validate]` | Import snapshot bundle | IMPLEMENTED |
| `stella feeds snapshot validate <id>` | Validate snapshot for drift | IMPLEMENTED |
**Assessment**: Complete snapshot lifecycle for offline/airgap operation. Supports zstd compression and integrity validation.
### 3.3 CLI Setup Wizard
**File**: `src/Cli/StellaOps.Cli/Commands/Setup/Steps/Implementations/SourcesSetupStep.cs`
The interactive setup wizard:
1. Runs connectivity checks against all 47 sources in parallel
2. Displays results with latency and error details
3. Offers remediation steps for failed sources
4. Prompts: auto-disable failures / manual fix / keep all
5. Prompts source mode: Mirror (recommended) / Direct / Hybrid
6. Optionally configures mirror server (export root, auth, rate limits)
7. Reports final count of enabled sources
**Assessment**: Excellent guided setup experience via CLI. This is exactly what the UI should replicate.
---
## 4. UI Assessment - Critical Gaps
### 4.1 Gap: No UI Flow to Add Advisory/VEX Sources (P0)
**Route**: `/setup/integrations/advisory-vex-sources` (or `/ops/integrations/advisory-vex-sources`)
**What exists**: An "Advisory & VEX" tab in the Integrations page showing "FeedMirror Integrations" with a "+ Add Integration" button.
**What happens**: Clicking "+ Add Integration" navigates to `/setup/integrations/onboarding` (or `/ops/integrations/onboarding`) which shows the generic onboarding hub with only 4 categories:
1. Container Registries (Harbor)
2. Source Control (GitHub App)
3. CI/CD Pipelines (disabled)
4. Hosts & Observers (disabled)
**Missing**: There is NO "Advisory & VEX Sources" category in the onboarding hub. A first-time operator clicking "Add Integration" from the Advisory & VEX tab lands on an irrelevant page with no way to add advisory sources.
**Impact**: The primary action for setting up advisory mirroring is a dead end in the UI.
### 4.2 Gap: No Source Catalog Browser in UI (P0)
The backend defines 47 sources with categories, descriptions, auth requirements, credential URLs, and documentation links. **None of this is exposed in any UI page.** A first-time operator has no way to:
- Browse available sources
- See which sources require API keys
- Understand source categories
- Learn about source coverage
### 4.3 Gap: No Group/Batch Source Selection in UI (P0)
The backend supports grouping by category, tag, and region (`GetByCategory`, `GetByTag`, `GetByRegion`). **The UI has no batch selection.** An operator cannot:
- "Enable all Linux distribution sources"
- "Enable all EU CERT sources"
- "Enable all ecosystem sources for my language stack"
- "Enable everything in the Primary category"
### 4.4 Gap: No Source Configuration UI (API keys, intervals) (P1)
Sources like GHSA and NuGet require a `GITHUB_PAT` token. NVD recommends an API key for higher rate limits. **The UI has no form for entering per-source credentials, polling intervals, or priority.**
### 4.5 Gap: FeedMirror Integrations Shows 0 but Feeds & Airgap Shows 2 (P1)
**Disconnection**:
- `/ops/operations/feeds-airgap` shows "Mirrors 2" (NVD Mirror, OSV Mirror) both "Fresh" and "OK"
- `/setup/integrations/advisory-vex-sources` shows "No feedmirror integrations found" with "0 pass / 0 warn / 0 fail"
These two pages show contradictory data. The operations page knows about 2 active mirrors but the integrations page shows 0. They appear to query different data sources.
### 4.6 Gap: Security Page Shows 6 Sources All Offline (P1)
**Route**: `/security` > "Advisories & VEX Health" section
Shows 6 sources all "offline - unknown":
- Internal VEX
- KEV
- NVD
- OSV
- Vendor Advisories
- Vendor VEX
Yet the Feeds & Airgap page shows NVD and OSV as "Fresh" and "OK". Another data disconnection.
**Also**: The "Configure sources" link on this section navigates to `/ops/integrations/advisory-vex-sources` which is the empty FeedMirror Integrations page. Dead end loop.
### 4.7 Gap: No Source Mode Selection in UI (P1)
The backend supports Direct/Mirror/Hybrid modes. The CLI setup wizard presents this choice prominently. **The UI has no way to select or view the current source mode.**
### 4.8 Gap: No Mirror Server Configuration in UI (P2)
When Stella Ops operates as a mirror for downstream instances, the mirror server needs configuration (export root, authentication, rate limits, DSSE signing). **The CLI handles this but the UI does not.**
### 4.9 Gap: No Connectivity Check UI (P2)
The CLI has `stella sources check` with parallel connectivity testing, auto-disable, and remediation guidance. **The UI has no equivalent** - no "Test All Sources" button, no health check results.
### 4.10 Gap: Airgap Bundles Tab Not Exercised (P2)
**Route**: `/ops/operations/feeds-airgap?tab=airgap-bundles`
The Airgap Bundles and Version Locks tabs exist in the Feeds & Airgap page but were not exercised in this session (the audit stayed on the Feed Mirrors tab). These represent the offline/airgap workflow counterpart to `stella feeds snapshot export/import`.
---
## 5. What Works Well
| Feature | Location | Status |
|---------|----------|--------|
| Feed Mirrors monitoring | `/ops/operations/feeds-airgap` | 2 mirrors (NVD, OSV) synced, fresh, OK |
| Feed status in context bar | Global header | "Feed: Live" indicator with link |
| Freshness indicators | Feeds & Airgap table | "Fresh" with timestamp |
| Storage tracking | Feeds & Airgap summary | 12.4 GB tracked |
| Mirror mode display | Feeds & Airgap | "Mode: live mirrors (read-write)" |
| CLI source list/check/enable/disable | `stella sources *` | Full management |
| CLI setup wizard | `stella setup` | Guided interactive flow |
| CLI feed snapshots | `stella feeds snapshot *` | Complete offline workflow |
| Backend rate limiting | SourceConfig + VexIngestionScheduler | 200ms delay, 5min backoff, 4 concurrent max |
| Deduplication | VexIngestionService | SHA-256 content digest |
| Conflict detection | VexConflictRepository | Auto-resolve + manual review |
---
## 6. Aggregation Gracefulness Assessment
### Will upstream providers cut off access?
**Risk: LOW** with current defaults.
| Source | Rate Limit | Stella Default | Safe? |
|--------|-----------|---------------|-------|
| NVD | 50 req/30s (with key), 5 req/30s (without) | 200ms delay = 5 req/s, 1hr polling | YES (with key) |
| OSV | No published limit | 200ms delay, 1hr polling | YES |
| GHSA | 5000 points/hr (GraphQL) | 200ms delay, 1hr polling | YES |
| KEV | Static JSON file | 1hr polling | YES |
| EPSS | No published limit | 200ms delay, 1hr polling | YES |
| Vendor/CERT | Varies | 200ms delay, 1hr polling | YES |
**Concerns**:
1. **No exponential backoff**: Failed sources retry at the same interval. If a source is temporarily down, Stella will retry every hour indefinitely. Should implement exponential backoff (1hr -> 2hr -> 4hr -> max 24hr).
2. **NVD without API key**: Default rate is 5 req/30s. The 200ms delay (5 req/s) would exceed this. The `RequiresAuthentication = false` flag and optional `NVD_API_KEY` env var are correctly modeled, but there's no UI guidance to obtain a key.
3. **MaxPagesPerFetch = 10**: This caps each sync to 10 pages, preventing bulk initial downloads from overwhelming sources. Good design.
4. **4 concurrent polls max**: Prevents parallel requests to the same source type from multiplying load. Good design.
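The progressive backoff suggested in concern 1 could look like the following Python sketch. It assumes a simple doubling schedule capped at 24 hours and a per-source counter of consecutive failures; the function and constant names are hypothetical and nothing like this exists in the audited code.

```python
BASE_INTERVAL_S = 3600        # the documented 1-hour default polling interval
MAX_INTERVAL_S = 24 * 3600    # proposed ceiling of 24 hours

def next_poll_delay(consecutive_failures: int) -> int:
    """Seconds to wait before the next poll: doubles per failure, capped at 24h."""
    delay = BASE_INTERVAL_S * (2 ** max(consecutive_failures, 0))
    return min(delay, MAX_INTERVAL_S)

schedule_hours = [next_poll_delay(n) // 3600 for n in range(6)]
print(schedule_hours)
```

Resetting the failure counter to zero after a successful poll returns the source to its normal 1-hour interval.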
---
## 7. Priority Matrix
| Priority | Issue | Category |
|----------|-------|----------|
| P0 | No UI flow to add advisory/VEX sources | UI |
| P0 | No source catalog browser in UI | UI |
| P0 | No group/batch source selection in UI | UI |
| P1 | No source configuration UI (API keys, intervals) | UI |
| P1 | FeedMirror Integrations vs Feeds & Airgap data disconnection | Data |
| P1 | Security page shows 6 sources offline while feeds page shows 2 healthy | Data |
| P1 | No source mode selection in UI (Direct/Mirror/Hybrid) | UI |
| P2 | No mirror server configuration in UI | UI |
| P2 | No connectivity check in UI | UI |
| P2 | No exponential backoff for failed sources | Backend |
| P2 | NVD without API key may exceed rate limit | Config |
| P3 | Airgap bundles and version locks tabs not wired to UX guidance | UI |
---
## 8. Top 5 Actions for Maximum Self-Serve Impact
1. **Add "Advisory & VEX Sources" category to the onboarding hub** - With a grouped source picker showing all 47 sources organized by category (Primary, Vendor, Distribution, Ecosystem, CERT, CSAF), with checkboxes, descriptions, auth requirements, and "Enable All in Category" buttons.
2. **Wire FeedMirror Integrations page to actual feed mirror data** - The integrations page shows 0 while the operations page shows 2. These need to query the same data source so operators see a single truth.
3. **Add source mode selector to setup** - Allow choosing Direct/Mirror/Hybrid from the UI, matching what the CLI setup wizard offers.
4. **Add per-source configuration panel** - When clicking a source, show: enable/disable toggle, API key field (with link to credential URL), polling interval, priority, health status, last sync time.
5. **Add exponential backoff for failed sources** - Currently retries at constant interval. Implement progressive backoff (1hr -> 2hr -> 4hr -> 8hr -> max 24hr) to be a good upstream citizen.
---
## 9. Comparison: CLI vs UI Feature Parity
| Feature | CLI | UI |
|---------|-----|-----|
| List all 47 sources | `stella sources list` | NO |
| Filter by category | `--category primary` | NO |
| Filter enabled only | `--enabled-only` | NO |
| Enable sources | `stella sources enable nvd osv ghsa` | NO |
| Disable sources | `stella sources disable centos arch` | NO |
| Batch enable/disable | Multiple IDs in one command | NO |
| Connectivity check | `stella sources check --all` | NO |
| Auto-disable failed | `--auto-disable` | NO |
| Source status | `stella sources status` | PARTIAL (Feeds & Airgap) |
| Source mode selection | Setup wizard prompt | NO |
| Mirror server config | Setup wizard prompt | NO |
| Feed snapshot create | `stella feeds snapshot create` | NO (only via Feeds & Airgap operations) |
| Feed snapshot export | `stella feeds snapshot export` | NO |
| Feed snapshot import | `stella feeds snapshot import` | NO |
| Feed freshness view | N/A | YES (Feeds & Airgap) |
| Feed health monitoring | N/A | YES (Feeds & Airgap + context bar) |
**Conclusion**: The CLI is the only functional path for setting up advisory sources. The UI is read-only for feed operations and completely missing the write/configure path. This is the single biggest gap for making Stella Ops a self-serve product for vulnerability mirror setup.