Add call graph fixtures for various languages and scenarios
Some checks failed
Reachability Corpus Validation / validate-corpus (push) Waiting to run
Reachability Corpus Validation / validate-ground-truths (push) Waiting to run
Reachability Corpus Validation / determinism-check (push) Blocked by required conditions
Scanner Analyzers / Discover Analyzers (push) Waiting to run
Scanner Analyzers / Build Analyzers (push) Blocked by required conditions
Scanner Analyzers / Test Language Analyzers (push) Blocked by required conditions
Scanner Analyzers / Validate Test Fixtures (push) Waiting to run
Scanner Analyzers / Verify Deterministic Output (push) Blocked by required conditions
Signals CI & Image / signals-ci (push) Waiting to run
Signals Reachability Scoring & Events / reachability-smoke (push) Waiting to run
Signals Reachability Scoring & Events / sign-and-upload (push) Blocked by required conditions
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Export Center CI / export-ci (push) Has been cancelled
Findings Ledger CI / build-test (push) Has been cancelled
Findings Ledger CI / migration-validation (push) Has been cancelled
Findings Ledger CI / generate-manifest (push) Has been cancelled
Lighthouse CI / Lighthouse Audit (push) Has been cancelled
Lighthouse CI / Axe Accessibility Audit (push) Has been cancelled
Policy Lint & Smoke / policy-lint (push) Has been cancelled

- Introduced `all-edge-reasons.json` to test edge resolution reasons in .NET.
- Added `all-visibility-levels.json` to validate method visibility levels in .NET.
- Created `dotnet-aspnetcore-minimal.json` for a minimal ASP.NET Core application.
- Included `go-gin-api.json` for a Go Gin API application structure.
- Added `java-spring-boot.json` for the Spring PetClinic application in Java.
- Introduced `legacy-no-schema.json` for legacy application structure without schema.
- Created `node-express-api.json` for an Express.js API application structure.
This commit is contained in:
master
2025-12-16 10:44:24 +02:00
parent 4391f35d8a
commit 5a480a3c2a
223 changed files with 19367 additions and 727 deletions

View File

@@ -2,7 +2,7 @@
**Source Advisory:** 14-Dec-2025 - Offline and Air-Gap Technical Reference
**Document Version:** 1.0
**Last Updated:** 2025-12-14
**Last Updated:** 2025-12-15
---
@@ -112,17 +112,14 @@ src/AirGap/
│ │ └── QuarantineOptions.cs # Sprint 0338
│ ├── Telemetry/
│ │ ├── OfflineKitMetrics.cs # Sprint 0341
│ │ ── OfflineKitLogFields.cs # Sprint 0341
├── Audit/
│ │ └── OfflineKitAuditEmitter.cs # Sprint 0341
│ │ ── OfflineKitLogFields.cs # Sprint 0341
│ └── OfflineKitLogScopes.cs # Sprint 0341
│ ├── Reconciliation/
│ │ ├── ArtifactIndex.cs # Sprint 0342
│ │ ├── EvidenceCollector.cs # Sprint 0342
│ │ ├── DocumentNormalizer.cs # Sprint 0342
│ │ ├── PrecedenceLattice.cs # Sprint 0342
│ │ └── EvidenceGraphEmitter.cs # Sprint 0342
│ └── OfflineKitReasonCodes.cs # Sprint 0341
src/Scanner/
├── __Libraries/StellaOps.Scanner.Core/
│ ├── Configuration/
@@ -136,7 +133,7 @@ src/Scanner/
src/Cli/
├── StellaOps.Cli/
── Commands/
── Commands/
│ ├── Offline/
│ │ ├── OfflineCommandGroup.cs # Sprint 0339
│ │ ├── OfflineImportHandler.cs # Sprint 0339
@@ -144,11 +141,13 @@ src/Cli/
│ │ └── OfflineExitCodes.cs # Sprint 0339
│ └── Verify/
│ └── VerifyOfflineHandler.cs # Sprint 0339
│ └── Output/
│ └── OfflineKitReasonCodes.cs # Sprint 0341
src/Authority/
├── __Libraries/StellaOps.Authority.Storage.Postgres/
│ └── Migrations/
│ └── 003_offline_kit_audit.sql # Sprint 0341
│ └── 004_offline_kit_audit.sql # Sprint 0341
```
### Database Changes
@@ -226,6 +225,8 @@ src/Authority/
6. Implement audit repository and emitter
7. Create Grafana dashboard
> Blockers: Prometheus `/metrics` endpoint hosting and audit emitter call-sites await an owning Offline Kit import/activation flow (`POST /api/offline-kit/import`).
**Exit Criteria:**
- [ ] Operators can import/verify kits via CLI
- [ ] Metrics are visible in Prometheus/Grafana

View File

@@ -0,0 +1,102 @@
# Orchestrator · First Signal API
Provides a fast “first meaningful signal” for a run (TTFS), with caching and ETag-based conditional requests.
## Endpoint
`GET /api/v1/orchestrator/runs/{runId}/first-signal`
### Required headers
- `X-Tenant-Id`: tenant identifier (string)
### Optional headers
- `If-None-Match`: weak ETag from a previous 200 response (supports multiple values)
## Responses
### 200 OK
Returns the first signal payload and a weak ETag.
Response headers:
- `ETag`: weak ETag (for `If-None-Match`)
- `Cache-Control: private, max-age=60`
- `Cache-Status: hit|miss`
- `X-FirstSignal-Source: snapshot|cold_start` (best-effort diagnostics)
Body (`application/json`):
```json
{
"runId": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
"firstSignal": {
"type": "started",
"stage": "unknown",
"step": null,
"message": "Run started",
"at": "2025-12-15T12:00:10+00:00",
"artifact": { "kind": "run", "range": null }
},
"summaryEtag": "W/\"...\""
}
```
### 204 No Content
Run exists but no signal is available yet (e.g., run has no jobs).
### 304 Not Modified
Returned when `If-None-Match` matches the current ETag.
### 404 Not Found
Run does not exist for the resolved tenant.
### 400 Bad Request
Missing/invalid tenant header or invalid parameters.
## ETag semantics
- Weak ETags are computed from a deterministic, canonical hash of the stable signal content.
- Per-request diagnostics (e.g., cache hit/miss) are intentionally excluded from the ETag material.
## Streaming (SSE)
The run stream emits `first_signal` events when the signal changes:
`GET /api/v1/orchestrator/stream/runs/{runId}`
Event type:
- `first_signal`
Payload shape:
```json
{
"runId": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
"etag": "W/\"...\"",
"signal": { "version": "1.0", "signalId": "...", "jobId": "...", "timestamp": "...", "kind": 1, "phase": 6, "scope": { "type": "run", "id": "..." }, "summary": "...", "etaSeconds": null, "lastKnownOutcome": null, "nextActions": null, "diagnostics": { "cacheHit": false, "source": "cold_start", "correlationId": "" } }
}
```
## Configuration
`appsettings.json`:
```json
{
"FirstSignal": {
"Cache": {
"Backend": "inmemory",
"TtlSeconds": 86400,
"SlidingExpiration": true,
"KeyPrefix": "orchestrator:first_signal:"
},
"ColdPath": {
"TimeoutMs": 3000
},
"SnapshotWriter": {
"Enabled": false,
"TenantId": null,
"PollIntervalSeconds": 10,
"MaxRunsPerTick": 50,
"LookbackMinutes": 60
}
},
"messaging": {
"transport": "inmemory"
}
}
```

View File

@@ -2,6 +2,24 @@
_Reference snapshot: Grype commit `6e746a546ecca3e2456316551673357e4a166d77` cloned 2025-11-02._
## Verification Metadata
| Field | Value |
|-------|-------|
| **Last Updated** | 2025-12-15 |
| **Last Verified** | 2025-12-14 |
| **Next Review** | 2026-03-14 |
| **Claims Index** | [`docs/market/claims-citation-index.md`](../market/claims-citation-index.md) |
| **Claim IDs** | COMP-GRYPE-001, COMP-GRYPE-002, COMP-GRYPE-003 |
| **Verification Method** | Source code audit (OSS), documentation review, feature testing |
**Confidence Levels:**
- **High (80-100%)**: Verified against source code or authoritative documentation
- **Medium (50-80%)**: Based on documentation or limited testing; needs deeper verification
- **Low (<50%)**: Unverified or based on indirect evidence; requires validation
---
## TL;DR
- StellaOps runs as a multi-service platform with deterministic SBOM generation, attestation (DSSE + Rekor), and tenant-aware controls, whereas Grype is a single Go CLI that leans on Syft to build SBOMs before vulnerability matching.[1](#sources)[g1](#grype-sources)
- Grype covers a broad OS and language matrix via Syft catalogers and Anchores aggregated vulnerability database, but it lacks attestation, runtime usage context, and secret management features found in StellaOps Surface/Policy ecosystem.[1](#sources)[g2](#grype-sources)[g3](#grype-sources)

View File

@@ -2,6 +2,24 @@
_Reference snapshot: Snyk CLI commit `7ae3b11642d143b588016d4daef0a6ddaddb792b` cloned 2025-11-02._
## Verification Metadata
| Field | Value |
|-------|-------|
| **Last Updated** | 2025-12-15 |
| **Last Verified** | 2025-12-14 |
| **Next Review** | 2026-03-14 |
| **Claims Index** | [`docs/market/claims-citation-index.md`](../market/claims-citation-index.md) |
| **Claim IDs** | COMP-SNYK-001, COMP-SNYK-002, COMP-SNYK-003 |
| **Verification Method** | Source code audit (OSS), documentation review, feature testing |
**Confidence Levels:**
- **High (80-100%)**: Verified against source code or authoritative documentation
- **Medium (50-80%)**: Based on documentation or limited testing; needs deeper verification
- **Low (<50%)**: Unverified or based on indirect evidence; requires validation
---
## TL;DR
- StellaOps delivers a self-hosted, multi-service scanning plane with deterministic SBOMs, attestation (DSSE + Rekor), and tenant-aware Surface controls, while the Snyk CLI is a Node.js tool that authenticates against Snyks SaaS to analyse dependency graphs, containers, IaC, and code.[1](#sources)[s1](#snyk-sources)
- Snyks plugin ecosystem covers many package managers (npm, yarn, pnpm, Maven, Gradle, NuGet, Go modules, Composer, etc.) and routes scans through Snyks cloud for policy, reporting, and fix advice; however it lacks offline operation, deterministic evidence, and attestation workflows that StellaOps provides out of the box.[1](#sources)[s1](#snyk-sources)[s2](#snyk-sources)

View File

@@ -2,6 +2,24 @@
_Reference snapshot: Trivy commit `012f3d75359e019df1eb2602460146d43cb59715`, cloned 2025-11-02._
## Verification Metadata
| Field | Value |
|-------|-------|
| **Last Updated** | 2025-12-15 |
| **Last Verified** | 2025-12-14 |
| **Next Review** | 2026-03-14 |
| **Claims Index** | [`docs/market/claims-citation-index.md`](../market/claims-citation-index.md) |
| **Claim IDs** | COMP-TRIVY-001, COMP-TRIVY-002, COMP-TRIVY-003 |
| **Verification Method** | Source code audit (OSS), documentation review, feature testing |
**Confidence Levels:**
- **High (80-100%)**: Verified against source code or authoritative documentation
- **Medium (50-80%)**: Based on documentation or limited testing; needs deeper verification
- **Low (<50%)**: Unverified or based on indirect evidence; requires validation
---
## TL;DR
- StellaOps Scanner stays focused on deterministic, tenant-scoped SBOM production with signed evidence, policy hand-offs, and Surface primitives that keep offline deployments first-class.[1](#sources)
- Trivy delivers broad, single-binary coverage (images, filesystems, repos, VMs, Kubernetes, SBOM input) with multiple scanners (vuln, misconfig, secret, license) and a rich plugin ecosystem, but it leaves provenance, signing, and multi-tenant controls to downstream tooling.[8](#sources)

View File

@@ -2,7 +2,7 @@
**Version:** 1.0.0
**Status:** DRAFT
**Last Updated:** 2025-11-28
**Last Updated:** 2025-12-15
---
@@ -446,6 +446,17 @@ CREATE TABLE authority.license_usage (
UNIQUE (license_id, scanner_node_id)
);
-- Offline Kit audit (SPRINT_0341_0001_0001)
CREATE TABLE authority.offline_kit_audit (
event_id UUID PRIMARY KEY,
tenant_id TEXT NOT NULL,
event_type TEXT NOT NULL,
timestamp TIMESTAMPTZ NOT NULL,
actor TEXT NOT NULL,
details JSONB NOT NULL,
result TEXT NOT NULL
);
-- Indexes
CREATE INDEX idx_users_tenant ON authority.users(tenant_id);
CREATE INDEX idx_users_email ON authority.users(email) WHERE email IS NOT NULL;
@@ -456,6 +467,10 @@ CREATE INDEX idx_tokens_expires ON authority.tokens(expires_at) WHERE revoked_at
CREATE INDEX idx_tokens_hash ON authority.tokens(token_hash);
CREATE INDEX idx_login_attempts_tenant_time ON authority.login_attempts(tenant_id, attempted_at DESC);
CREATE INDEX idx_licenses_tenant ON authority.licenses(tenant_id);
CREATE INDEX idx_offline_kit_audit_ts ON authority.offline_kit_audit(timestamp DESC);
CREATE INDEX idx_offline_kit_audit_type ON authority.offline_kit_audit(event_type);
CREATE INDEX idx_offline_kit_audit_tenant_ts ON authority.offline_kit_audit(tenant_id, timestamp DESC);
CREATE INDEX idx_offline_kit_audit_result ON authority.offline_kit_audit(tenant_id, result, timestamp DESC);
```
### 5.2 Vulnerability Schema (vuln)
@@ -1222,6 +1237,7 @@ Every connection must configure:
```sql
-- Set on connection open (via DataSource)
SET app.tenant_id = '<tenant-uuid>';
SET app.current_tenant = '<tenant-uuid>'; -- compatibility (legacy)
SET timezone = 'UTC';
SET statement_timeout = '30s'; -- Adjust per use case
```

View File

@@ -1,4 +1,10 @@
# Sprint 0339-0001-0001: CLI Offline Command Group
# Sprint 0339 - CLI Offline Command Group
## Topic & Scope
- Priority: P1 (High) · Gap: G4 (CLI Commands)
- Working directory: `src/Cli/StellaOps.Cli/` (tests: `src/Cli/__Tests/StellaOps.Cli.Tests/`; docs: `docs/modules/cli/**`)
- Related modules: `StellaOps.AirGap.Importer`, `StellaOps.Cli.Services`
- Source advisory: `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md` (A12) · Exit codes: A11
**Sprint ID:** SPRINT_0339_0001_0001
**Topic:** CLI `offline` Command Group Implementation
@@ -6,20 +12,20 @@
**Working Directory:** `src/Cli/StellaOps.Cli/`
**Related Modules:** `StellaOps.AirGap.Importer`, `StellaOps.Cli.Services`
**Source Advisory:** 14-Dec-2025 - Offline and Air-Gap Technical Reference (§12)
**Source Advisory:** 14-Dec-2025 - Offline and Air-Gap Technical Reference (A12)
**Gaps Addressed:** G4 (CLI Commands)
---
## Objective
### Objective
Implement a dedicated `offline` command group in the StellaOps CLI that provides operators with first-class tooling for air-gap bundle management. The commands follow the advisory's specification and integrate with existing verification infrastructure.
---
## Target Commands
### Target Commands
Per advisory §12:
Per advisory A12:
```bash
# Import an offline kit with full verification
@@ -47,32 +53,57 @@ stellaops verify offline \
--policy verify-policy.yaml
```
---
## Dependencies & Concurrency
- Sprint 0338 (monotonicity + quarantine) must be complete.
- `StellaOps.AirGap.Importer` provides verification primitives (DSSE/TUF/Merkle + monotonicity/quarantine hooks).
- CLI command routing uses `System.CommandLine` (keep handlers composable + testable).
- Concurrency: avoid conflicting edits in `src/Cli/StellaOps.Cli/Commands/CommandFactory.cs` while other CLI sprint work is in-flight.
## Documentation Prerequisites
- `docs/modules/cli/architecture.md`
- `docs/modules/platform/architecture-overview.md`
- `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md`
## Delivery Tracker
| ID | Task | Status | Owner | Notes |
|----|------|--------|-------|-------|
| T1 | Design command group structure | TODO | | `offline import`, `offline status`, `verify offline` |
| T2 | Create `OfflineCommandGroup` class | TODO | | |
| T3 | Implement `offline import` command | TODO | | Core import flow |
| T4 | Add `--verify-dsse` flag handler | TODO | | Integrate `DsseVerifier` |
| T5 | Add `--verify-rekor` flag handler | TODO | | Offline Rekor verification |
| T6 | Add `--trust-root` option | TODO | | Trust root loading |
| T7 | Add `--force-activate` flag | TODO | | Monotonicity override |
| T8 | Implement `offline status` command | TODO | | Display active kit info |
| T9 | Implement `verify offline` command | TODO | | Policy-based verification |
| T10 | Add `--policy` option parser | TODO | | YAML/JSON policy loading |
| T11 | Create output formatters (table, json) | TODO | | |
| T12 | Implement progress reporting | TODO | | For large bundle imports |
| T13 | Add exit code standardization | TODO | | Per advisory §11 |
| T14 | Write unit tests for command parsing | TODO | | |
| T15 | Write integration tests for import flow | TODO | | |
| T16 | Update CLI documentation | TODO | | |
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | T1 | DONE | Landed (offline command group design + wiring). | DevEx/CLI Guild | Design command group structure (`offline import`, `offline status`, `verify offline`). |
| 2 | T2 | DONE | Implemented `OfflineCommandGroup` and wired into `CommandFactory`. | DevEx/CLI Guild | Create `OfflineCommandGroup` class. |
| 3 | T3 | DONE | Implemented `offline import` with manifest/hash validation, monotonicity checks, and quarantine hooks. | DevEx/CLI Guild | Implement `offline import` command (core import flow). |
| 4 | T4 | DONE | Implemented `--verify-dsse` via `DsseVerifier` (requires `--trust-root`) and added tests. | DevEx/CLI Guild | Add `--verify-dsse` flag handler. |
| 5 | T5 | BLOCKED | Needs offline Rekor inclusion proof verification contract/library; current implementation only validates receipt structure. | DevEx/CLI Guild | Add `--verify-rekor` flag handler. |
| 6 | T6 | DONE | Implemented deterministic trust-root loading (`--trust-root`). | DevEx/CLI Guild | Add `--trust-root` option. |
| 7 | T7 | DONE | Enforced `--force-reason` when forcing activation and persisted justification. | DevEx/CLI Guild | Add `--force-activate` flag. |
| 8 | T8 | DONE | Implemented `offline status` with table/json outputs. | DevEx/CLI Guild | Implement `offline status` command. |
| 9 | T9 | BLOCKED | Needs policy/verification contract (exit code mapping + evaluation semantics) before implementing `verify offline`. | DevEx/CLI Guild | Implement `verify offline` command. |
| 10 | T10 | BLOCKED | Depends on the `verify offline` policy schema/loader contract (YAML/JSON canonicalization rules). | DevEx/CLI Guild | Add `--policy` option parser. |
| 11 | T11 | DONE | Standardized `--output table|json` formatting for offline verbs. | DevEx/CLI Guild | Create output formatters (table, json). |
| 12 | T12 | DONE | Added progress reporting for bundle hashing when bundle size exceeds threshold. | DevEx/CLI Guild | Implement progress reporting. |
| 13 | T13 | DONE | Implemented offline exit codes (`OfflineExitCodes`). | DevEx/CLI Guild | Add exit code standardization. |
| 14 | T14 | DONE | Added parsing/validation tests for required/optional combinations. | DevEx/CLI Guild | Write unit tests for command parsing. |
| 15 | T15 | DONE | Added deterministic integration tests for import flow. | DevEx/CLI Guild | Write integration tests for import flow. |
| 16 | T16 | DONE | Added operator docs for offline commands + updated airgap guide. | Docs/CLI Guild | Update CLI documentation. |
---
## Wave Coordination
- Wave 1: Command routing + core offline verbs + exit codes (T1-T13).
- Wave 2: Tests + docs + deterministic fixtures (T14-T16).
## Technical Specification
## Wave Detail Snapshots
| Date (UTC) | Wave | Update | Owner |
| --- | --- | --- | --- |
| 2025-12-15 | 1-2 | Implemented `offline import/status` + exit codes; added tests/docs; marked T5/T9/T10 BLOCKED pending verifier/policy contracts. | DevEx/CLI |
| 2025-12-15 | 1 | Sprint normalisation in progress; T1 set to DOING. | Planning · DevEx/CLI |
## Interlocks
- Changes touch `src/Cli/StellaOps.Cli/Commands/CommandFactory.cs`; avoid concurrent command-group rewires.
- `verify offline` may require additional policy/verification contracts; if missing, mark tasks BLOCKED with concrete dependency and continue.
## Upcoming Checkpoints
- TBD (update once staffed): validate UX, exit codes, and offline verification story.
## Action Tracker
### Technical Specification
### T1-T2: Command Group Structure
@@ -591,29 +622,29 @@ public static class OfflineExitCodes
---
## Acceptance Criteria
### Acceptance Criteria
### `offline import`
- [ ] `--bundle` is required; error if not provided
- [ ] Bundle file must exist; clear error if missing
- [ ] `--verify-dsse` integrates with `DsseVerifier`
- [x] `--bundle` is required; error if not provided
- [x] Bundle file must exist; clear error if missing
- [x] `--verify-dsse` integrates with `DsseVerifier`
- [ ] `--verify-rekor` uses offline Rekor snapshot
- [ ] `--trust-root` loads public key from file
- [ ] `--force-activate` without `--force-reason` fails with helpful message
- [ ] Force activation logs to audit trail
- [ ] `--dry-run` validates without activating
- [ ] Progress reporting for bundles > 100MB
- [ ] Exit codes match advisory §11.2
- [ ] JSON output with `--output json`
- [ ] Failed bundles are quarantined
- [x] `--trust-root` loads public key from file
- [x] `--force-activate` without `--force-reason` fails with helpful message
- [x] Force activation logs to audit trail
- [x] `--dry-run` validates without activating
- [x] Progress reporting for bundles > 100MB
- [x] Exit codes match advisory A11.2
- [x] JSON output with `--output json`
- [x] Failed bundles are quarantined
### `offline status`
- [ ] Displays active kit info (ID, digest, version, timestamps)
- [ ] Shows DSSE/Rekor verification status
- [ ] Shows staleness in human-readable format
- [ ] Indicates if force-activated
- [ ] JSON output with `--output json`
- [ ] Shows quarantine count if > 0
- [x] Displays active kit info (ID, digest, version, timestamps)
- [x] Shows DSSE/Rekor verification status
- [x] Shows staleness in human-readable format
- [x] Indicates if force-activated
- [x] JSON output with `--output json`
- [x] Shows quarantine count if > 0
### `verify offline`
- [ ] `--evidence-dir` is required
@@ -625,27 +656,31 @@ public static class OfflineExitCodes
- [ ] Reports policy violations clearly
- [ ] Exit code 0 on pass, 12 on fail
---
## Dependencies
- Sprint 0338 (Monotonicity, Quarantine) must be complete
- `StellaOps.AirGap.Importer` for verification infrastructure
- `System.CommandLine` for command parsing
---
## Testing Strategy
### Testing Strategy
1. **Command parsing tests** with various option combinations
2. **Handler unit tests** with mocked dependencies
3. **Integration tests** with real bundle files
4. **End-to-end tests** in CI with sealed environment simulation
---
### Documentation Updates
## Documentation Updates
- Add `docs/modules/cli/commands/offline.md`
- Add `docs/modules/cli/guides/commands/offline.md`
- Update `docs/modules/cli/guides/airgap.md` with command examples
- Add man-page style help text for each command
## Decisions & Risks
- 2025-12-15: Normalised sprint file to standard template; started T1 (structure design) and moved the remaining tasks unchanged.
- 2025-12-15: Implemented `offline import/status` + exit codes; added tests/docs; marked T5/T9/T10 BLOCKED due to missing verifier/policy contracts.
| Risk | Impact | Mitigation | Owner | Status |
| --- | --- | --- | --- | --- |
| Offline Rekor verification contract missing/incomplete | Cannot meet `--verify-rekor` acceptance criteria. | Define/land offline inclusion proof verification contract/library and wire into CLI. | DevEx/CLI | Blocked |
| `.tar.zst` payload inspection not implemented | Limited local validation (hash/sidecar checks only). | Add deterministic Zstd+tar inspection path (or reuse existing bundle tooling) and cover with tests. | DevEx/CLI | Open |
| `verify offline` policy schema unclear | Risk of implementing an incompatible policy loader/verifier. | Define policy schema + canonicalization/evaluation rules; then implement `verify offline` and `--policy`. | DevEx/CLI | Blocked |
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Implemented `offline import/status` (+ exit codes, state storage, quarantine hooks), added docs and tests; validated with `dotnet test src/Cli/__Tests/StellaOps.Cli.Tests/StellaOps.Cli.Tests.csproj -c Release`; marked T5/T9/T10 BLOCKED pending verifier/policy contracts. | DevEx/CLI |
| 2025-12-15 | Normalised sprint file to standard template; set T1 to DOING. | Planning · DevEx/CLI |

View File

@@ -33,7 +33,7 @@ Address documentation gaps identified in competitive analysis and benchmarking i
| 5 | DOC-0339-005 | DONE (2025-12-14) | After #1 | Docs Guild | Create claims citation index - `docs/market/claims-citation-index.md` |
| 6 | DOC-0339-006 | DONE (2025-12-14) | Offline kit exists | Docs Guild | Document offline parity verification methodology |
| 7 | DOC-0339-007 | DONE (2025-12-14) | After #3 | Docs Guild | Publish benchmark submission guide |
| 8 | DOC-0339-008 | TODO | All docs complete | QA Team | Review and validate all documentation |
| 8 | DOC-0339-008 | DONE (2025-12-15) | All docs complete | QA Team | Reviewed docs; added missing verification metadata to scanner comparison docs. |
## Wave Coordination
- **Wave 1**: Tasks 1, 3, 4 (Core documentation) - No dependencies
@@ -701,6 +701,8 @@ Results are published in JSON:
| 2025-12-14 | DOC-0339-004: Created performance baselines at `docs/benchmarks/performance-baselines.md`. Comprehensive targets for scan, reachability, SBOM, CVSS, VEX, attestation, and DB operations with regression thresholds. | AI Implementation |
| 2025-12-14 | DOC-0339-006: Created offline parity verification at `docs/airgap/offline-parity-verification.md`. Test methodology, comparison criteria, CI automation, known limitations documented. | AI Implementation |
| 2025-12-14 | DOC-0339-007: Created benchmark submission guide at `docs/benchmarks/submission-guide.md`. Covers reproduction steps, output formats, submission process, all benchmark categories. | AI Implementation |
| 2025-12-15 | DOC-0339-008: Began QA review of delivered competitive/benchmarking documentation set. | QA Team (agent) |
| 2025-12-15 | DOC-0339-008: QA review complete; added missing Verification Metadata blocks to `docs/benchmarks/scanner-feature-comparison-{trivy,grype,snyk}.md`. | QA Team (agent) |
## Next Checkpoints

View File

@@ -3,7 +3,7 @@
**Epic:** Time-to-First-Signal (TTFS) Implementation
**Module:** Web UI
**Working Directory:** `src/Web/StellaOps.Web/src/app/`
**Status:** TODO
**Status:** BLOCKED
**Created:** 2025-12-14
**Target Completion:** TBD
**Depends On:** SPRINT_0339_0001_0001 (First Signal API)
@@ -41,23 +41,23 @@ This sprint implements the `FirstSignalCard` Angular component that displays the
| ID | Task | Owner | Status | Notes |
|----|------|-------|--------|-------|
| T1 | Create FirstSignal TypeScript models | — | TODO | API types |
| T2 | Create FirstSignalClient service | — | TODO | HTTP + SSE |
| T3 | Create FirstSignalStore | — | TODO | Signal-based state |
| T4 | Create FirstSignalCard component | — | TODO | Main component |
| T5 | Create FirstSignalCard template | — | TODO | HTML template |
| T6 | Create FirstSignalCard styles | — | TODO | SCSS with tokens |
| T7 | Implement SSE integration | — | TODO | Real-time updates |
| T8 | Implement polling fallback | — | TODO | SSE failure path |
| T9 | Implement TTFS telemetry | — | TODO | Metrics emission |
| T10 | Create prefetch service | — | TODO | IntersectionObserver |
| T11 | Integrate into run detail page | — | TODO | Route integration |
| T12 | Create Storybook stories | — | TODO | Visual testing |
| T13 | Create unit tests | — | TODO | Jest/Jasmine |
| T14 | Create e2e tests | — | TODO | Playwright |
| T15 | Create accessibility tests | — | TODO | axe-core |
| T16 | Configure telemetry sampling | — | TODO | 100% staging, 25% prod |
| T17 | Add i18n keys for micro-copy | — | TODO | EN defaults, fallbacks |
| T1 | Create FirstSignal TypeScript models | — | DONE | `src/Web/StellaOps.Web/src/app/core/api/first-signal.models.ts` |
| T2 | Create FirstSignalClient service | — | DONE | `src/Web/StellaOps.Web/src/app/core/api/first-signal.client.ts` |
| T3 | Create FirstSignalStore | — | DONE | `src/Web/StellaOps.Web/src/app/core/api/first-signal.store.ts` |
| T4 | Create FirstSignalCard component | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/components/first-signal-card/first-signal-card.component.ts` |
| T5 | Create FirstSignalCard template | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/components/first-signal-card/first-signal-card.component.html` |
| T6 | Create FirstSignalCard styles | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/components/first-signal-card/first-signal-card.component.scss` |
| T7 | Implement SSE integration | — | DONE | Uses run stream SSE (`first_signal`) via `EventSourceFactory`; requires `tenant` query fallback in Orchestrator stream endpoints. |
| T8 | Implement polling fallback | — | DONE | `FirstSignalStore` starts polling (default 5s) when SSE errors. |
| T9 | Implement TTFS telemetry | — | BLOCKED | Telemetry client/contract for `ttfs_start` + `ttfs_signal_rendered` not present in Web; requires platform decision. |
| T10 | Create prefetch service | — | DONE | `src/Web/StellaOps.Web/src/app/features/runs/services/first-signal-prefetch.service.ts` |
| T11 | Integrate into run detail page | — | DONE | Integrated into `src/Web/StellaOps.Web/src/app/features/console/console-status.component.html` as interim run-surface. |
| T12 | Create Storybook stories | — | DONE | `src/Web/StellaOps.Web/src/stories/runs/first-signal-card.stories.ts` |
| T13 | Create unit tests | — | DONE | `src/Web/StellaOps.Web/src/app/core/api/first-signal.store.spec.ts` |
| T14 | Create e2e tests | — | DONE | `src/Web/StellaOps.Web/tests/e2e/first-signal-card.spec.ts` |
| T15 | Create accessibility tests | — | DONE | `src/Web/StellaOps.Web/tests/e2e/a11y-smoke.spec.ts` includes `/console/status`. |
| T16 | Configure telemetry sampling | — | BLOCKED | No Web telemetry config wiring yet (`AppConfig.telemetry.sampleRate` unused). |
| T17 | Add i18n keys for micro-copy | — | BLOCKED | i18n framework not configured in `src/Web/StellaOps.Web` (no `@ngx-translate/*` / Angular i18n usage). |
---
@@ -1744,16 +1744,21 @@ npx ngx-translate-extract \
| Decision | Rationale | Status |
|----------|-----------|--------|
| Standalone component with own store | Isolation, reusability | APPROVED |
| Standalone component + `FirstSignalStore` | Isolation, reusability | APPROVED |
| Signal-based state (not RxJS) | Angular 17 best practice, simpler | APPROVED |
| SSE-first with polling fallback | Best UX with graceful degradation | APPROVED |
| IntersectionObserver for prefetch | Standard API, performant | APPROVED |
| UI models follow Orchestrator DTO contract | Match shipped `/first-signal` API (`type/stage/step/message/at`) | APPROVED |
| Quickstart provides mock first-signal API | Offline-first UX and stable tests | APPROVED |
| Orchestrator streams accept `?tenant=` fallback | Browser `EventSource` cannot set custom headers | APPROVED |
| Risk | Mitigation | Owner |
|------|------------|-------|
| SSE not supported in all browsers | Polling fallback | — |
| Prefetch cache memory growth | TTL + size limits | — |
| Skeleton flash on fast networks | Delay skeleton by 50ms | — |
| TTFS telemetry contract undefined | Define Web telemetry client + backend ingestion endpoint | — |
| i18n framework not configured | Add translation system before migrating micro-copy | — |
---
@@ -1763,8 +1768,16 @@ npx ngx-translate-extract \
- [ ] Signal displayed within 150ms (cached) / 500ms (cold)
- [ ] SSE updates reflected immediately
- [ ] Polling activates within 5s of SSE failure
- [ ] All states visually tested in Storybook
- [x] All states visually tested in Storybook
- [ ] axe-core reports zero violations
- [ ] Reduced motion respected
- [ ] Unit test coverage ≥80%
- [ ] E2E tests pass
- [x] E2E tests pass
---
## 6. Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Implemented FirstSignalCard + store/client, quickstart mock, Storybook story, unit/e2e/a11y coverage; added Orchestrator stream tenant query fallback; marked telemetry/i18n tasks BLOCKED pending platform decisions. | Agent |

View File

@@ -3,6 +3,7 @@
**Sprint ID:** SPRINT_0340_0001_0001
**Topic:** Scanner Offline Kit Configuration Surface
**Priority:** P2 (Important)
**Status:** BLOCKED
**Working Directory:** `src/Scanner/`
**Related Modules:** `StellaOps.Scanner.WebService`, `StellaOps.Scanner.Core`, `StellaOps.AirGap.Importer`
@@ -45,21 +46,21 @@ scanner:
| ID | Task | Status | Owner | Notes |
|----|------|--------|-------|-------|
| T1 | Design `OfflineKitOptions` configuration class | TODO | | |
| T2 | Design `TrustAnchor` model with PURL pattern matching | TODO | | |
| T3 | Implement PURL pattern matcher | TODO | | Glob-style matching |
| T4 | Create `TrustAnchorRegistry` service | TODO | | Resolution by PURL |
| T5 | Add configuration binding in `Program.cs` | TODO | | |
| T6 | Create `OfflineKitOptionsValidator` | TODO | | Startup validation |
| T7 | Integrate with `DsseVerifier` | TODO | | Dynamic key lookup |
| T8 | Implement DSSE failure handling per §7.2 | TODO | | requireDsse semantics |
| T9 | Add `rekorOfflineMode` enforcement | TODO | | Block online calls |
| T10 | Create configuration schema documentation | TODO | | JSON Schema |
| T11 | Write unit tests for PURL matcher | TODO | | |
| T12 | Write unit tests for trust anchor resolution | TODO | | |
| T13 | Write integration tests for offline import | TODO | | |
| T14 | Update Helm chart values | TODO | | |
| T15 | Update docker-compose samples | TODO | | |
| T1 | Design `OfflineKitOptions` configuration class | DONE | Agent | Added `enabled` gate to keep config opt-in. |
| T2 | Design `TrustAnchor` model with PURL pattern matching | DONE | Agent | |
| T3 | Implement PURL pattern matcher | DONE | Agent | Glob-style matching |
| T4 | Create `TrustAnchorRegistry` service | DONE | Agent | Resolution by PURL |
| T5 | Add configuration binding in `Program.cs` | DONE | Agent | |
| T6 | Create `OfflineKitOptionsValidator` | DONE | Agent | Startup validation |
| T7 | Integrate with `DsseVerifier` | BLOCKED | Agent | No Scanner-side offline import service consumes DSSE verification yet. |
| T8 | Implement DSSE failure handling per §7.2 | BLOCKED | Agent | Requires OfflineKit import pipeline/endpoints to exist. |
| T9 | Add `rekorOfflineMode` enforcement | BLOCKED | Agent | Requires an offline Rekor snapshot verifier (not present in current codebase). |
| T10 | Create configuration schema documentation | DONE | Agent | Added `src/Scanner/docs/schemas/scanner-offline-kit-config.schema.json`. |
| T11 | Write unit tests for PURL matcher | DONE | Agent | Added coverage in `src/Scanner/__Tests/StellaOps.Scanner.Core.Tests`. |
| T12 | Write unit tests for trust anchor resolution | DONE | Agent | Added coverage for registry + validator in `src/Scanner/__Tests/StellaOps.Scanner.Core.Tests`. |
| T13 | Write integration tests for offline import | BLOCKED | Agent | Requires OfflineKit import pipeline/endpoints to exist. |
| T14 | Update Helm chart values | DONE | Agent | Added OfflineKit env vars to `deploy/helm/stellaops/values-*.yaml`. |
| T15 | Update docker-compose samples | DONE | Agent | Added OfflineKit env vars to `deploy/compose/docker-compose.*.yaml`. |
---
@@ -700,3 +701,18 @@ scanner:
- "sha256:your-key-fingerprint-here"
minSignatures: 1
```
---
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Implemented OfflineKit options/validator + trust anchor matcher/registry; wired Scanner.WebService options binding + DI; marked T7-T9 blocked pending import pipeline + offline Rekor verifier. | Agent |
## Decisions & Risks
- `T7/T8` blocked: Scanner has no OfflineKit import pipeline consuming DSSE verification yet (owning module + API/service design needed).
- `T9` blocked: Offline Rekor snapshot verification is not implemented (decide local verifier vs Attestor delegation).
## Next Checkpoints
- Decide owner + contract for OfflineKit import pipeline (Scanner vs AirGap Controller) and how PURL(s) are derived for trust anchor selection.
- Decide offline Rekor verification approach and snapshot format.

View File

@@ -1,57 +1,69 @@
# Sprint 0341-0001-0001: Observability & Audit Enhancements
# Sprint 0341-0001-0001 · Observability & Audit Enhancements
**Sprint ID:** SPRINT_0341_0001_0001
**Topic:** Offline Kit Metrics, Logging, Error Codes, and Audit Schema
**Priority:** P1-P2 (High-Important)
**Working Directories:**
- `src/AirGap/StellaOps.AirGap.Importer/` (metrics, logging)
- `src/Cli/StellaOps.Cli/Output/` (error codes)
- `src/Authority/__Libraries/StellaOps.Authority.Storage.Postgres/` (audit schema)
## Topic & Scope
- Add Offline Kit observability and audit primitives (metrics, structured logs, machine-readable error/reason codes, and an Authority/Postgres audit trail) so operators can monitor, debug, and attest air-gapped operations.
- Evidence: Prometheus scraping endpoint with Offline Kit counters/histograms, standardized log fields + tenant context enrichment, CLI ProblemDetails outputs with stable codes, Postgres migration + repository + tests, docs update + Grafana dashboard JSON.
- **Sprint ID:** `SPRINT_0341_0001_0001` · **Priority:** P1-P2
- **Working directories:**
- `src/AirGap/StellaOps.AirGap.Importer/` (metrics, logging)
- `src/Cli/StellaOps.Cli/Output/` (error codes)
- `src/Cli/StellaOps.Cli/Services/` (ProblemDetails parsing integration)
- `src/Cli/StellaOps.Cli/Services/Transport/` (SDK client ProblemDetails parsing integration)
- `src/Authority/__Libraries/StellaOps.Authority.Storage.Postgres/` (audit schema)
- **Source advisory:** `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md` (§10, §11, §13)
- **Gaps addressed:** G11 (Prometheus Metrics), G12 (Structured Logging), G13 (Error Codes), G14 (Audit Schema)
**Source Advisory:** 14-Dec-2025 - Offline and Air-Gap Technical Reference (§10, §11, §13)
**Gaps Addressed:** G11 (Prometheus Metrics), G12 (Structured Logging), G13 (Error Codes), G14 (Audit Schema)
## Dependencies & Concurrency
- Depends on Sprint 0338 (Monotonicity, Quarantine) for importer integration points and event fields.
- Depends on Sprint 0339 (CLI) for exit code mapping.
- Prometheus/OpenTelemetry stack must be available in-host; exporter choice must match existing service patterns.
- Concurrency note: touches AirGap Importer + CLI + Authority storage; avoid cross-module contract changes without recording them in this sprints Decisions & Risks.
---
## Objective
Implement comprehensive observability for offline kit operations: Prometheus metrics per advisory §10, standardized structured logging fields per §10.2, machine-readable error codes per §11.2, and enhanced audit schema per §13.2. This enables operators to monitor, debug, and audit air-gap operations effectively.
---
## Documentation Prerequisites
- `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md`
- `docs/airgap/airgap-mode.md`
- `docs/airgap/advisory-implementation-roadmap.md`
- `docs/modules/platform/architecture-overview.md`
- `docs/modules/cli/architecture.md`
- `docs/modules/authority/architecture.md`
- `docs/db/README.md`
- `docs/db/SPECIFICATION.md`
- `docs/db/RULES.md`
- `docs/db/VERIFICATION.md`
## Delivery Tracker
| ID | Task | Status | Owner | Notes |
|----|------|--------|-------|-------|
| **Metrics (G11)** | | | | |
| T1 | Design metrics interface | TODO | | |
| T2 | Implement `offlinekit_import_total` counter | TODO | | |
| T3 | Implement `offlinekit_attestation_verify_latency_seconds` histogram | TODO | | |
| T4 | Implement `attestor_rekor_success_total` counter | TODO | | |
| T5 | Implement `attestor_rekor_retry_total` counter | TODO | | |
| T6 | Implement `rekor_inclusion_latency` histogram | TODO | | |
| T7 | Register metrics with Prometheus endpoint | TODO | | |
| T1 | Design metrics interface | DONE | Agent | Start with `OfflineKitMetrics` + tag keys and ensure naming matches advisory. |
| T2 | Implement `offlinekit_import_total` counter | DONE | Agent | Implement in `OfflineKitMetrics`. |
| T3 | Implement `offlinekit_attestation_verify_latency_seconds` histogram | DONE | Agent | Implement in `OfflineKitMetrics`. |
| T4 | Implement `attestor_rekor_success_total` counter | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T5 | Implement `attestor_rekor_retry_total` counter | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T6 | Implement `rekor_inclusion_latency` histogram | DONE | Agent | Implement in `OfflineKitMetrics` (call sites may land later). |
| T7 | Register metrics with Prometheus endpoint | BLOCKED | Agent | No backend Offline Kit import service/endpoint yet (`/api/offline-kit/import` not implemented in `src/**`); decide host/exporter surface for `/metrics`. |
| **Logging (G12)** | | | | |
| T8 | Define structured logging constants | TODO | | |
| T9 | Update `ImportValidator` logging | TODO | | |
| T10 | Update `DsseVerifier` logging | TODO | | |
| T11 | Update quarantine logging | TODO | | |
| T12 | Create logging enricher for tenant context | TODO | | |
| T8 | Define structured logging constants | DONE | Agent | Add `OfflineKitLogFields` + scope helpers. |
| T9 | Update `ImportValidator` logging | DONE | Agent | Align log templates + tenant scope usage. |
| T10 | Update `DsseVerifier` logging | DONE | Agent | Add structured success/failure logs (no secrets). |
| T11 | Update quarantine logging | DONE | Agent | Align log templates + tenant scope usage. |
| T12 | Create logging enricher for tenant context | DONE | Agent | Use `ILogger.BeginScope` with `tenant_id` consistently. |
| **Error Codes (G13)** | | | | |
| T13 | Add missing error codes to `CliErrorCodes` | TODO | | |
| T14 | Create `OfflineKitReasonCodes` class | TODO | | |
| T15 | Integrate codes with ProblemDetails | TODO | | |
| T13 | Add missing error codes to `CliErrorCodes` | DONE | Agent | Add Offline Kit/AirGap CLI error codes. |
| T14 | Create `OfflineKitReasonCodes` class | DONE | Agent | Define reason codes per advisory §11.2 + remediation/exit mapping. |
| T15 | Integrate codes with ProblemDetails | DONE | Agent | Parse `reason_code`/`reasonCode` from ProblemDetails and surface via CLI error rendering. |
| **Audit Schema (G14)** | | | | |
| T16 | Design extended audit schema | TODO | | |
| T17 | Create migration for `offline_kit_audit` table | TODO | | |
| T18 | Implement `IOfflineKitAuditRepository` | TODO | | |
| T19 | Create audit event emitter service | TODO | | |
| T20 | Wire audit to import/activation flows | TODO | | |
| T16 | Design extended audit schema | DONE | Agent | Align with advisory §13.2 and Authority RLS (`tenant_id`). |
| T17 | Create migration for `offline_kit_audit` table | DONE | Agent | Add `authority.offline_kit_audit` + indexes + RLS policy. |
| T18 | Implement `IOfflineKitAuditRepository` | DONE | Agent | Repository + query helpers (tenant/type/result). |
| T19 | Create audit event emitter service | DONE | Agent | Emitter wraps repository and must not fail import flows. |
| T20 | Wire audit to import/activation flows | BLOCKED | Agent | No backend Offline Kit import host/activation flow in `src/**` yet; wire once `POST /api/offline-kit/import` exists. |
| **Testing & Docs** | | | | |
| T21 | Write unit tests for metrics | TODO | | |
| T22 | Write integration tests for audit | TODO | | |
| T23 | Update observability documentation | TODO | | |
| T24 | Add Grafana dashboard JSON | TODO | | |
| T21 | Write unit tests for metrics | DONE | Agent | Cover instrument names + label sets via `MeterListener`. |
| T22 | Write integration tests for audit | DONE | Agent | Cover migration + insert/query via Authority Postgres Testcontainers fixture (requires Docker). |
| T23 | Update observability documentation | DONE | Agent | Align docs with implementation + blocked items (`T7`,`T20`). |
| T24 | Add Grafana dashboard JSON | DONE | Agent | Commit dashboard artifact under `docs/observability/dashboards/`. |
---
@@ -775,17 +787,33 @@ public sealed class OfflineKitAuditEmitter : IOfflineKitAuditEmitter
---
## Dependencies
- Sprint 0338 (Monotonicity, Quarantine) for integration
- Sprint 0339 (CLI) for exit code mapping
- Prometheus/OpenTelemetry for metrics infrastructure
---
## Testing Strategy
1. **Metrics unit tests** with in-memory collector
2. **Logging tests** with captured structured output
3. **Audit integration tests** with Testcontainers PostgreSQL
4. **End-to-end tests** verifying full observability chain
---
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Normalised sprint file to standard template; set `T1` to `DOING` and began implementation. | Agent |
| 2025-12-15 | Implemented Offline Kit metrics + structured logging primitives in AirGap Importer; marked `T7` `BLOCKED` pending an owning host/service for a `/metrics` surface. | Agent |
| 2025-12-15 | Started CLI error/reason code work; expanded sprint working directories for CLI parsing (`Output/`, `Services/`, `Services/Transport/`). | Agent |
| 2025-12-15 | Added Authority Postgres migration + repository/emitter for `authority.offline_kit_audit`; marked `T20` `BLOCKED` pending an owning backend import/activation flow. | Agent |
| 2025-12-15 | Completed `T1`-`T6`, `T8`-`T19`, `T21`-`T24` (metrics/logging/codes/audit, tests, docs, dashboard); left `T7`/`T20` `BLOCKED` pending an owning Offline Kit import host. | Agent |
| 2025-12-15 | Cross-cutting Postgres RLS compatibility: set both `app.tenant_id` and `app.current_tenant` on tenant-scoped connections (shared `StellaOps.Infrastructure.Postgres`). | Agent |
## Decisions & Risks
- **Prometheus exporter choice (Importer):** `T7` is `BLOCKED` because the repo currently has no backend Offline Kit import host (no `src/**` implementation for `POST /api/offline-kit/import`), so there is no clear owning service to expose `/metrics`.
- **Field naming:** Keep metric labels and log fields stable and consistent (`tenant_id`, `status`, `reason_code`) to preserve dashboards and alert rules.
- **Authority schema alignment:** `docs/db/SPECIFICATION.md` must stay aligned with `authority.offline_kit_audit` (table + indexes + RLS posture) to avoid drift.
- **Integration test dependency:** Authority Postgres integration tests use Testcontainers and require Docker in developer/CI environments.
- **Audit wiring:** `T20` is `BLOCKED` until an owning backend Offline Kit import/activation flow exists to call the audit emitter/repository.
## Next Checkpoints
- After `T7`: verify the owning services `/metrics` endpoint exposes Offline Kit metrics + labels and the Grafana dashboard queries work.
- After `T20`: wire the audit emitter into the import/activation flow and verify tenant-scoped audit rows are written.

View File

@@ -11,10 +11,24 @@
---
## Objective
## Topic & Scope
- Implement the 5-step deterministic evidence reconciliation algorithm per advisory §5 so offline environments can construct a consistent, reproducible evidence graph from SBOMs, attestations, and VEX documents.
- Evidence: deterministic artifact indexing + normalization, precedence lattice merge, deterministic `evidence-graph.json` + `evidence-graph.sha256`, optional DSSE signature, and determinism tests/fixtures.
- **Working directory:** `src/AirGap/StellaOps.AirGap.Importer/` (new `Reconciliation/` components).
Implement the 5-step deterministic evidence reconciliation algorithm as specified in advisory §5. This enables offline environments to construct a consistent, reproducible evidence graph from SBOMs, attestations, and VEX documents using lattice-based precedence rules.
## Dependencies & Concurrency
- Depends on Sprint 0338 (`DsseVerifier` and importer verification primitives).
- Depends on Sprint 0339 (CLI `verify offline`) for eventual wiring.
- Depends on Rekor inclusion proof verification contract/library work (see `docs/implplan/SPRINT_3000_0001_0001_rekor_merkle_proof_verification.md`) before `T8` can be implemented.
- Concurrency note: this sprint introduces new reconciliation contracts; avoid cross-module coupling until the graph schema is agreed and documented.
## Documentation Prerequisites
- `docs/product-advisories/14-Dec-2025 - Offline and Air-Gap Technical Reference.md` (§5)
- `docs/airgap/airgap-mode.md`
- `docs/airgap/advisory-implementation-roadmap.md`
---
## Algorithm Overview
@@ -39,11 +53,11 @@ Per advisory §5:
| ID | Task | Status | Owner | Notes |
|----|------|--------|-------|-------|
| **Step 1: Artifact Indexing** | | | | |
| T1 | Design `ArtifactIndex` data structure | TODO | | Digest-keyed |
| T2 | Implement artifact discovery from evidence directory | TODO | | |
| T3 | Create digest normalization (sha256:... format) | TODO | | |
| T1 | Design `ArtifactIndex` data structure | DONE | Agent | Digest-keyed |
| T2 | Implement artifact discovery from evidence directory | DONE | Agent | Implemented `EvidenceDirectoryDiscovery` (sboms/attestations/vex) with deterministic ordering + content hashes. |
| T3 | Create digest normalization (sha256:... format) | DONE | Agent | Implemented via `ArtifactIndex.NormalizeDigest` + unit tests. |
| **Step 2: Evidence Collection** | | | | |
| T4 | Design `EvidenceCollection` model | TODO | | Per-artifact |
| T4 | Design `EvidenceCollection` model | DONE | Agent | Implemented via `ArtifactEntry` + `SbomReference`/`AttestationReference`/`VexReference` records. |
| T5 | Implement SBOM collector (CycloneDX, SPDX) | TODO | | |
| T6 | Implement attestation collector | TODO | | |
| T7 | Integrate with `DsseVerifier` for validation | TODO | | |
@@ -55,7 +69,7 @@ Per advisory §5:
| T12 | Implement URI lowercase normalization | TODO | | |
| T13 | Create canonical SBOM transformer | TODO | | |
| **Step 4: Lattice Rules** | | | | |
| T14 | Design `SourcePrecedence` lattice | TODO | | vendor > maintainer > 3rd-party |
| T14 | Design `SourcePrecedence` lattice | DONE | Agent | `SourcePrecedence` enum (vendor > maintainer > 3rd-party) introduced in reconciliation models. |
| T15 | Implement VEX merge with precedence | TODO | | |
| T16 | Implement conflict resolution | TODO | | |
| T17 | Create lattice configuration loader | TODO | | |
@@ -949,17 +963,38 @@ public sealed record ReconciliationResult(
---
## Dependencies
- Sprint 0338 (DsseVerifier integration)
- Sprint 0340 (Trust anchor configuration)
- `StellaOps.Attestor` for DSSE signing
---
## Testing Strategy
1. **Golden-file tests** with fixed input expected output
2. **Property-based tests** for lattice properties (idempotence, associativity)
3. **Fuzzing** for parser robustness
4. **Cross-platform determinism** tests in CI
---
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Normalised sprint headings toward the standard template; set `T1` to `DOING` and began implementation. | Agent |
| 2025-12-15 | Implemented `ArtifactIndex` + canonical digest normalization (`T1`, `T3`) with unit tests. | Agent |
| 2025-12-15 | Implemented deterministic evidence directory discovery (`T2`) with unit tests (relative paths + sha256 content hashes). | Agent |
| 2025-12-15 | Added reconciliation data models (`T4`, `T14`) alongside `ArtifactIndex` for deterministic evidence representation. | Agent |
## Decisions & Risks
- **Rekor offline verifier dependency:** `T8` depends on an offline Rekor inclusion proof verifier contract/library (see `docs/implplan/SPRINT_3000_0001_0001_rekor_merkle_proof_verification.md`).
- **SBOM/VEX parsing contracts:** `T5`/`T6`/`T13` require stable parsers and canonicalization rules (SPDX/CycloneDX/OpenVEX) before golden fixtures can be committed without churn.
- **Determinism risk:** normalization and lattice merge must guarantee stable ordering and stable hashes across platforms; budget time for golden-file + cross-platform CI validation.
## Interlocks
- `T8` blocks full offline attestation verification until Rekor inclusion proof verification is implemented and its inputs/outputs are frozen.
- `T23` blocks CLI wiring until Sprint 0339 unblocks `verify offline` (policy schema + evaluation semantics).
## Action Tracker
| Date (UTC) | Action | Owner | Status |
| --- | --- | --- | --- |
| 2025-12-15 | Confirm offline Rekor verification contract and mirror format; then unblock `T8`. | Attestor/Platform Guilds | TODO |
## Next Checkpoints
- After `T1`/`T3`: `ArtifactIndex` canonical digest normalization covered by unit tests.
- Before `T8`: confirm Rekor inclusion proof verification contract and offline mirror format.

View File

@@ -32,14 +32,14 @@ Implement the Score Policy YAML schema and infrastructure for customer-configura
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|---|---------|--------|---------------------------|--------|-----------------|
| 1 | YAML-3402-001 | TODO | None | Policy Team | Define `ScorePolicySchema.json` JSON Schema for score.v1 |
| 2 | YAML-3402-002 | TODO | None | Policy Team | Define C# models: `ScorePolicy`, `WeightsBps`, `ReachabilityConfig`, `EvidenceConfig`, `ProvenanceConfig`, `ScoreOverride` |
| 1 | YAML-3402-001 | DONE | None | Policy Team | Define `ScorePolicySchema.json` JSON Schema for score.v1 |
| 2 | YAML-3402-002 | DONE | None | Policy Team | Define C# models: `ScorePolicy`, `WeightsBps`, `ReachabilityConfig`, `EvidenceConfig`, `ProvenanceConfig`, `ScoreOverride` |
| 3 | YAML-3402-003 | TODO | After #1, #2 | Policy Team | Implement `ScorePolicyValidator` with JSON Schema validation |
| 4 | YAML-3402-004 | TODO | After #2 | Policy Team | Implement `ScorePolicyLoader` for YAML file parsing |
| 5 | YAML-3402-005 | TODO | After #3, #4 | Policy Team | Implement `IScorePolicyProvider` interface and `FileScorePolicyProvider` |
| 6 | YAML-3402-006 | TODO | After #5 | Policy Team | Implement `ScorePolicyService` with caching and digest computation |
| 4 | YAML-3402-004 | DONE | After #2 | Policy Team | Implement `ScorePolicyLoader` for YAML file parsing |
| 5 | YAML-3402-005 | DONE | After #3, #4 | Policy Team | Implement `IScorePolicyProvider` interface and `FileScorePolicyProvider` |
| 6 | YAML-3402-006 | DONE | After #5 | Policy Team | Implement `ScorePolicyService` with caching and digest computation |
| 7 | YAML-3402-007 | TODO | After #6 | Policy Team | Add `ScorePolicyDigest` to replay manifest for determinism |
| 8 | YAML-3402-008 | TODO | After #6 | Policy Team | Create sample policy file: `etc/score-policy.yaml.sample` |
| 8 | YAML-3402-008 | DONE | After #6 | Policy Team | Create sample policy file: `etc/score-policy.yaml.sample` |
| 9 | YAML-3402-009 | TODO | After #4 | Policy Team | Unit tests for YAML parsing edge cases |
| 10 | YAML-3402-010 | TODO | After #3 | Policy Team | Unit tests for schema validation |
| 11 | YAML-3402-011 | TODO | After #6 | Policy Team | Unit tests for policy service caching |

View File

@@ -30,12 +30,12 @@ Implement the three-tier fidelity metrics framework for measuring deterministic
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|---|---------|--------|---------------------------|--------|-----------------|
| 1 | FID-3403-001 | TODO | None | Determinism Team | Define `FidelityMetrics` record with BF, SF, PF scores |
| 2 | FID-3403-002 | TODO | None | Determinism Team | Define `FidelityThresholds` configuration record |
| 3 | FID-3403-003 | TODO | After #1 | Determinism Team | Implement `BitwiseFidelityCalculator` comparing SHA-256 hashes |
| 4 | FID-3403-004 | TODO | After #1 | Determinism Team | Implement `SemanticFidelityCalculator` with normalized comparison |
| 5 | FID-3403-005 | TODO | After #1 | Determinism Team | Implement `PolicyFidelityCalculator` comparing decisions |
| 6 | FID-3403-006 | TODO | After #3, #4, #5 | Determinism Team | Implement `FidelityMetricsService` orchestrating all calculators |
| 1 | FID-3403-001 | DONE | None | Determinism Team | Define `FidelityMetrics` record with BF, SF, PF scores |
| 2 | FID-3403-002 | DONE | None | Determinism Team | Define `FidelityThresholds` configuration record |
| 3 | FID-3403-003 | DONE | After #1 | Determinism Team | Implement `BitwiseFidelityCalculator` comparing SHA-256 hashes |
| 4 | FID-3403-004 | DONE | After #1 | Determinism Team | Implement `SemanticFidelityCalculator` with normalized comparison |
| 5 | FID-3403-005 | DONE | After #1 | Determinism Team | Implement `PolicyFidelityCalculator` comparing decisions |
| 6 | FID-3403-006 | DONE | After #3, #4, #5 | Determinism Team | Implement `FidelityMetricsService` orchestrating all calculators |
| 7 | FID-3403-007 | TODO | After #6 | Determinism Team | Integrate fidelity metrics into `DeterminismReport` |
| 8 | FID-3403-008 | TODO | After #6 | Telemetry Team | Add Prometheus gauges for BF, SF, PF metrics |
| 9 | FID-3403-009 | TODO | After #8 | Telemetry Team | Add SLO alerting for fidelity thresholds |

View File

@@ -31,14 +31,14 @@ Implement False-Negative Drift (FN-Drift) rate tracking for monitoring reclassif
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|---|---------|--------|---------------------------|--------|-----------------|
| 1 | DRIFT-3404-001 | TODO | None | DB Team | Create `classification_history` table migration |
| 2 | DRIFT-3404-002 | TODO | After #1 | DB Team | Create `fn_drift_stats` materialized view |
| 3 | DRIFT-3404-003 | TODO | After #1 | DB Team | Create indexes for classification_history queries |
| 4 | DRIFT-3404-004 | TODO | None | Scanner Team | Define `ClassificationChange` entity and `DriftCause` enum |
| 5 | DRIFT-3404-005 | TODO | After #1, #4 | Scanner Team | Implement `ClassificationHistoryRepository` |
| 1 | DRIFT-3404-001 | DONE | None | DB Team | Create `classification_history` table migration |
| 2 | DRIFT-3404-002 | DONE | After #1 | DB Team | Create `fn_drift_stats` materialized view |
| 3 | DRIFT-3404-003 | DONE | After #1 | DB Team | Create indexes for classification_history queries |
| 4 | DRIFT-3404-004 | DONE | None | Scanner Team | Define `ClassificationChange` entity and `DriftCause` enum |
| 5 | DRIFT-3404-005 | DONE | After #1, #4 | Scanner Team | Implement `ClassificationHistoryRepository` |
| 6 | DRIFT-3404-006 | TODO | After #5 | Scanner Team | Implement `ClassificationChangeTracker` service |
| 7 | DRIFT-3404-007 | TODO | After #6 | Scanner Team | Integrate tracker into scan completion pipeline |
| 8 | DRIFT-3404-008 | TODO | After #2 | Scanner Team | Implement `FnDriftCalculator` with stratification |
| 8 | DRIFT-3404-008 | DONE | After #2 | Scanner Team | Implement `FnDriftCalculator` with stratification |
| 9 | DRIFT-3404-009 | TODO | After #8 | Telemetry Team | Add Prometheus gauges for FN-Drift metrics |
| 10 | DRIFT-3404-010 | TODO | After #9 | Telemetry Team | Add SLO alerting for drift thresholds |
| 11 | DRIFT-3404-011 | TODO | After #5 | Scanner Team | Unit tests for repository operations |

View File

@@ -3,7 +3,7 @@
**Epic:** Time-to-First-Signal (TTFS) Implementation
**Module:** Telemetry, Scheduler
**Working Directory:** `src/Telemetry/`, `docs/db/schemas/`
**Status:** TODO
**Status:** DONE
**Created:** 2025-12-14
**Target Completion:** TBD
@@ -36,16 +36,16 @@ This sprint establishes the foundational infrastructure for Time-to-First-Signal
| ID | Task | Owner | Status | Notes |
|----|------|-------|--------|-------|
| T1 | Create `ttfs-event.schema.json` | — | TODO | Mirror TTE schema structure |
| T2 | Create `TimeToFirstSignalMetrics.cs` | — | TODO | New metrics class |
| T3 | Create `TimeToFirstSignalOptions.cs` | — | TODO | SLO configuration |
| T4 | Create `TtfsPhase` enum | — | TODO | Phase definitions |
| T5 | Create `TtfsSignalKind` enum | — | TODO | Signal type definitions |
| T6 | Create `first_signal_snapshots` table SQL | — | TODO | Cache table |
| T7 | Create `ttfs_events` table SQL | — | TODO | Telemetry storage |
| T8 | Add service registration extensions | — | TODO | DI setup |
| T9 | Create unit tests | — | TODO | ≥80% coverage |
| T10 | Update observability documentation | — | TODO | Metrics reference |
| T1 | Create `ttfs-event.schema.json` | — | DONE | `docs/schemas/ttfs-event.schema.json` |
| T2 | Create `TimeToFirstSignalMetrics.cs` | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core/TimeToFirstSignalMetrics.cs` |
| T3 | Create `TimeToFirstSignalOptions.cs` | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core/TimeToFirstSignalOptions.cs` |
| T4 | Create `TtfsPhase` enum | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core/TimeToFirstSignalMetrics.cs` |
| T5 | Create `TtfsSignalKind` enum | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core/TimeToFirstSignalMetrics.cs` |
| T6 | Create `first_signal_snapshots` table SQL | — | DONE | `docs/db/schemas/ttfs.sql` |
| T7 | Create `ttfs_events` table SQL | — | DONE | `docs/db/schemas/ttfs.sql` |
| T8 | Add service registration extensions | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core/TelemetryServiceCollectionExtensions.cs` |
| T9 | Create unit tests | — | DONE | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/TimeToFirstSignalMetricsTests.cs` |
| T10 | Update observability documentation | — | DONE | `docs/observability/metrics-and-slos.md` |
---
@@ -365,3 +365,18 @@ public static IServiceCollection AddTimeToFirstSignalMetrics(
- [ ] Database migrations apply cleanly
- [ ] Metrics appear in local Prometheus scrape
- [ ] Documentation updated and cross-linked
---
## 7. Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Marked sprint as `DOING`; began reconciliation of existing TTFS schema/SQL artefacts and delivery tracker status. | Implementer |
| 2025-12-15 | Synced tracker: marked T1/T6/T7 `DONE` based on existing artefacts `docs/schemas/ttfs-event.schema.json` and `docs/db/schemas/ttfs.sql`. | Implementer |
| 2025-12-15 | Began implementation of TTFS metrics + DI wiring (T2-T5, T8). | Implementer |
| 2025-12-15 | Implemented TTFS metrics/options/enums + service registration in Telemetry.Core; marked T2-T5/T8 `DONE`. | Implementer |
| 2025-12-15 | Began TTFS unit test coverage for `TimeToFirstSignalMetrics`. | Implementer |
| 2025-12-15 | Added `TimeToFirstSignalMetricsTests`; `dotnet test` for Telemetry.Core.Tests passed; marked T9 `DONE`. | Implementer |
| 2025-12-15 | Began TTFS documentation update in `docs/observability/metrics-and-slos.md` (T10). | Implementer |
| 2025-12-15 | Updated `docs/observability/metrics-and-slos.md` with TTFS metrics/SLOs; marked T10 `DONE` and sprint `DONE`. | Implementer |

View File

@@ -3,7 +3,7 @@
**Epic:** Time-to-First-Signal (TTFS) Implementation
**Module:** Orchestrator
**Working Directory:** `src/Orchestrator/StellaOps.Orchestrator/`
**Status:** TODO
**Status:** DONE
**Created:** 2025-12-14
**Target Completion:** TBD
**Depends On:** SPRINT_0338_0001_0001 (TTFS Foundation)
@@ -39,19 +39,19 @@ This sprint implements the `/api/v1/orchestrator/runs/{runId}/first-signal` API
| ID | Task | Owner | Status | Notes |
|----|------|-------|--------|-------|
| T1 | Create `FirstSignal` domain model | — | TODO | Core model |
| T2 | Create `FirstSignalResponse` DTO | — | TODO | API response |
| T3 | Create `IFirstSignalService` interface | — | TODO | Service contract |
| T4 | Implement `FirstSignalService` | — | TODO | Business logic |
| T5 | Create `IFirstSignalSnapshotRepository` | — | TODO | Data access |
| T6 | Implement `PostgresFirstSignalSnapshotRepository` | — | TODO | Postgres impl |
| T7 | Implement cache layer | — | TODO | Valkey/memory cache |
| T8 | Create `FirstSignalEndpoints.cs` | — | TODO | API endpoint |
| T9 | Implement ETag support | — | TODO | Conditional requests |
| T10 | Create `FirstSignalSnapshotWriter` | — | TODO | Background writer |
| T11 | Add SSE event type for first signal | — | TODO | Real-time updates |
| T12 | Create integration tests | — | TODO | Testcontainers |
| T13 | Create API documentation | — | TODO | OpenAPI spec |
| T1 | Create `FirstSignal` domain model | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Core/Domain/FirstSignal.cs` |
| T2 | Create `FirstSignalResponse` DTO | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.WebService/Contracts/FirstSignalResponse.cs` |
| T3 | Create `IFirstSignalService` interface | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Core/Services/IFirstSignalService.cs` |
| T4 | Implement `FirstSignalService` | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Services/FirstSignalService.cs` |
| T5 | Create `IFirstSignalSnapshotRepository` | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Core/Repositories/IFirstSignalSnapshotRepository.cs` |
| T6 | Implement `PostgresFirstSignalSnapshotRepository` | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Postgres/PostgresFirstSignalSnapshotRepository.cs` + `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/migrations/008_first_signal_snapshots.sql` |
| T7 | Implement cache layer | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Caching/FirstSignalCache.cs` (Messaging transport configurable; defaults to in-memory) |
| T8 | Create `FirstSignalEndpoints.cs` | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.WebService/Endpoints/FirstSignalEndpoints.cs` |
| T9 | Implement ETag support | — | DONE | ETag/If-None-Match in `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Services/FirstSignalService.cs` + `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.WebService/Endpoints/FirstSignalEndpoints.cs` |
| T10 | Create `FirstSignalSnapshotWriter` | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Services/FirstSignalSnapshotWriter.cs` (disabled by default) |
| T11 | Add SSE event type for first signal | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.WebService/Streaming/RunStreamCoordinator.cs` emits `first_signal` |
| T12 | Create integration tests | — | DONE | `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Tests/Ttfs/FirstSignalServiceTests.cs` |
| T13 | Create API documentation | — | DONE | `docs/api/orchestrator-first-signal.md` |
---
@@ -196,24 +196,25 @@ public interface IFirstSignalService
/// </summary>
Task<FirstSignalResult> GetFirstSignalAsync(
Guid runId,
Guid tenantId,
string tenantId,
string? ifNoneMatch = null,
CancellationToken cancellationToken = default);
/// <summary>
/// Updates the first signal snapshot for a job.
/// Updates the first signal snapshot for a run.
/// </summary>
Task UpdateSnapshotAsync(
Guid jobId,
Guid tenantId,
Guid runId,
string tenantId,
FirstSignal signal,
CancellationToken cancellationToken = default);
/// <summary>
/// Invalidates cached first signal for a job.
/// Invalidates cached first signal for a run.
/// </summary>
Task InvalidateCacheAsync(
Guid jobId,
Guid runId,
string tenantId,
CancellationToken cancellationToken = default);
}
@@ -243,7 +244,7 @@ public enum FirstSignalResultStatus
**File:** `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Services/FirstSignalService.cs`
**Implementation Notes:**
1. Check distributed cache first (Valkey)
1. Check cache first (Messaging transport)
2. Fall back to `first_signal_snapshots` table
3. If not in snapshot, compute from current job state (cold path)
4. Update cache on cold path computation
@@ -252,7 +253,7 @@ public enum FirstSignalResultStatus
**Cache Key Pattern:** `tenant:{tenantId}:signal:run:{runId}`
**Cache TTL:** 86400 seconds (24 hours) with sliding expiration
**Cache TTL:** 86400 seconds (24 hours); sliding expiration is configurable.
---
@@ -265,29 +266,26 @@ namespace StellaOps.Orchestrator.Core.Repositories;
public interface IFirstSignalSnapshotRepository
{
Task<FirstSignalSnapshot?> GetByJobIdAsync(
Guid jobId,
Guid tenantId,
CancellationToken cancellationToken = default);
Task<FirstSignalSnapshot?> GetByRunIdAsync(
string tenantId,
Guid runId,
Guid tenantId,
CancellationToken cancellationToken = default);
Task UpsertAsync(
FirstSignalSnapshot snapshot,
CancellationToken cancellationToken = default);
Task DeleteAsync(
Guid jobId,
Task DeleteByRunIdAsync(
string tenantId,
Guid runId,
CancellationToken cancellationToken = default);
}
public sealed record FirstSignalSnapshot
{
public required string TenantId { get; init; }
public required Guid RunId { get; init; }
public required Guid JobId { get; init; }
public required Guid TenantId { get; init; }
public required DateTimeOffset CreatedAt { get; init; }
public required DateTimeOffset UpdatedAt { get; init; }
public required string Kind { get; init; }
@@ -297,7 +295,7 @@ public sealed record FirstSignalSnapshot
public string? LastKnownOutcomeJson { get; init; }
public string? NextActionsJson { get; init; }
public required string DiagnosticsJson { get; init; }
public required string PayloadJson { get; init; }
public required string SignalJson { get; init; }
}
```
@@ -305,25 +303,30 @@ public sealed record FirstSignalSnapshot
### T6: Implement PostgresFirstSignalSnapshotRepository
**File:** `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Repositories/PostgresFirstSignalSnapshotRepository.cs`
**File:** `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Postgres/PostgresFirstSignalSnapshotRepository.cs`
**SQL Queries:**
```sql
-- GetByJobId
SELECT * FROM scheduler.first_signal_snapshots
WHERE job_id = @jobId AND tenant_id = @tenantId;
-- GetByRunId (join with runs table)
SELECT fss.* FROM scheduler.first_signal_snapshots fss
INNER JOIN scheduler.runs r ON r.id = fss.job_id
WHERE r.id = @runId AND fss.tenant_id = @tenantId
-- GetByRunId
SELECT tenant_id, run_id, job_id, created_at, updated_at,
kind, phase, summary, eta_seconds,
last_known_outcome, next_actions, diagnostics, signal_json
FROM first_signal_snapshots
WHERE tenant_id = @tenant_id AND run_id = @run_id
LIMIT 1;
-- Upsert
INSERT INTO scheduler.first_signal_snapshots (job_id, tenant_id, kind, phase, summary, eta_seconds, last_known_outcome, next_actions, diagnostics, payload_json)
VALUES (@jobId, @tenantId, @kind, @phase, @summary, @etaSeconds, @lastKnownOutcome, @nextActions, @diagnostics, @payloadJson)
ON CONFLICT (job_id) DO UPDATE SET
updated_at = NOW(),
INSERT INTO first_signal_snapshots (
tenant_id, run_id, job_id, created_at, updated_at,
kind, phase, summary, eta_seconds,
last_known_outcome, next_actions, diagnostics, signal_json)
VALUES (
@tenant_id, @run_id, @job_id, @created_at, @updated_at,
@kind, @phase, @summary, @eta_seconds,
@last_known_outcome, @next_actions, @diagnostics, @signal_json)
ON CONFLICT (tenant_id, run_id) DO UPDATE SET
job_id = EXCLUDED.job_id,
updated_at = EXCLUDED.updated_at,
kind = EXCLUDED.kind,
phase = EXCLUDED.phase,
summary = EXCLUDED.summary,
@@ -331,7 +334,11 @@ ON CONFLICT (job_id) DO UPDATE SET
last_known_outcome = EXCLUDED.last_known_outcome,
next_actions = EXCLUDED.next_actions,
diagnostics = EXCLUDED.diagnostics,
payload_json = EXCLUDED.payload_json;
signal_json = EXCLUDED.signal_json;
-- DeleteByRunId
DELETE FROM first_signal_snapshots
WHERE tenant_id = @tenant_id AND run_id = @run_id;
```
---
@@ -343,53 +350,18 @@ ON CONFLICT (job_id) DO UPDATE SET
```csharp
namespace StellaOps.Orchestrator.Infrastructure.Caching;
public sealed class FirstSignalCache : IFirstSignalCache
public sealed record FirstSignalCacheEntry
{
private readonly IDistributedCache<string, FirstSignal> _cache;
private readonly FirstSignalCacheOptions _options;
private readonly ILogger<FirstSignalCache> _logger;
public FirstSignalCache(
IDistributedCache<string, FirstSignal> cache,
IOptions<FirstSignalCacheOptions> options,
ILogger<FirstSignalCache> logger)
{
_cache = cache;
_options = options.Value;
_logger = logger;
}
public async Task<CacheResult<FirstSignal>> GetAsync(Guid tenantId, Guid runId, CancellationToken ct)
{
var key = BuildKey(tenantId, runId);
return await _cache.GetAsync(key, ct);
}
public async Task SetAsync(Guid tenantId, Guid runId, FirstSignal signal, CancellationToken ct)
{
var key = BuildKey(tenantId, runId);
await _cache.SetAsync(key, signal, new CacheEntryOptions
{
AbsoluteExpiration = TimeSpan.FromSeconds(_options.TtlSeconds),
SlidingExpiration = TimeSpan.FromSeconds(_options.SlidingExpirationSeconds)
}, ct);
}
public async Task InvalidateAsync(Guid tenantId, Guid runId, CancellationToken ct)
{
var key = BuildKey(tenantId, runId);
await _cache.InvalidateAsync(key, ct);
}
private string BuildKey(Guid tenantId, Guid runId)
=> $"tenant:{tenantId}:signal:run:{runId}";
public required FirstSignal Signal { get; init; }
public required string ETag { get; init; }
public required string Origin { get; init; } // "snapshot" | "cold_start"
}
public sealed class FirstSignalCacheOptions
public interface IFirstSignalCache
{
public int TtlSeconds { get; set; } = 86400;
public int SlidingExpirationSeconds { get; set; } = 3600;
public string Backend { get; set; } = "valkey"; // valkey | postgres | none
ValueTask<CacheResult<FirstSignalCacheEntry>> GetAsync(string tenantId, Guid runId, CancellationToken cancellationToken = default);
ValueTask SetAsync(string tenantId, Guid runId, FirstSignalCacheEntry entry, CancellationToken cancellationToken = default);
ValueTask<bool> InvalidateAsync(string tenantId, Guid runId, CancellationToken cancellationToken = default);
}
```
@@ -404,63 +376,36 @@ namespace StellaOps.Orchestrator.WebService.Endpoints;
public static class FirstSignalEndpoints
{
public static void MapFirstSignalEndpoints(this IEndpointRouteBuilder app)
public static RouteGroupBuilder MapFirstSignalEndpoints(this IEndpointRouteBuilder app)
{
var group = app.MapGroup("/api/v1/orchestrator/runs/{runId:guid}")
.WithTags("FirstSignal")
.RequireAuthorization();
var group = app.MapGroup("/api/v1/orchestrator/runs")
.WithTags("Orchestrator Runs");
group.MapGet("/first-signal", GetFirstSignal)
.WithName("Orchestrator_GetFirstSignal")
.WithDescription("Gets the first meaningful signal for a run")
.Produces<FirstSignalResponse>(StatusCodes.Status200OK)
.Produces(StatusCodes.Status204NoContent)
.Produces(StatusCodes.Status304NotModified)
.Produces(StatusCodes.Status404NotFound);
group.MapGet("{runId:guid}/first-signal", GetFirstSignal)
.WithName("Orchestrator_GetFirstSignal");
return group;
}
private static async Task<IResult> GetFirstSignal(
Guid runId,
HttpContext context,
[FromRoute] Guid runId,
[FromHeader(Name = "If-None-Match")] string? ifNoneMatch,
[FromServices] IFirstSignalService signalService,
[FromServices] ITenantResolver tenantResolver,
[FromServices] TimeToFirstSignalMetrics ttfsMetrics,
HttpContext httpContext,
[FromServices] TenantResolver tenantResolver,
[FromServices] IFirstSignalService firstSignalService,
CancellationToken cancellationToken)
{
var tenantId = tenantResolver.GetTenantId();
var correlationId = httpContext.GetCorrelationId();
using var scope = ttfsMetrics.MeasureSignal(TtfsSurface.Api, tenantId.ToString());
var result = await signalService.GetFirstSignalAsync(
runId, tenantId, ifNoneMatch, cancellationToken);
// Set response headers
httpContext.Response.Headers["X-Correlation-Id"] = correlationId;
httpContext.Response.Headers["Cache-Status"] = result.CacheHit ? "hit" : "miss";
if (result.ETag is not null)
{
httpContext.Response.Headers["ETag"] = result.ETag;
httpContext.Response.Headers["Cache-Control"] = "private, max-age=60";
}
var tenantId = tenantResolver.Resolve(context);
var result = await firstSignalService.GetFirstSignalAsync(runId, tenantId, ifNoneMatch, cancellationToken);
return result.Status switch
{
FirstSignalResultStatus.Found => Results.Ok(MapToResponse(runId, result)),
FirstSignalResultStatus.NotModified => Results.StatusCode(304),
FirstSignalResultStatus.NotModified => Results.StatusCode(StatusCodes.Status304NotModified),
FirstSignalResultStatus.NotFound => Results.NotFound(),
FirstSignalResultStatus.NotAvailable => Results.NoContent(),
_ => Results.Problem("Internal error")
};
}
private static FirstSignalResponse MapToResponse(Guid runId, FirstSignalResult result)
{
// Map domain model to DTO
// ...
}
}
```
@@ -474,9 +419,24 @@ public static class ETagGenerator
{
public static string Generate(FirstSignal signal)
{
var json = JsonSerializer.Serialize(signal, JsonOptions.Canonical);
// Hash stable signal material only (exclude per-request diagnostics like cache-hit flags).
var material = new
{
signal.Version,
signal.JobId,
signal.Timestamp,
signal.Kind,
signal.Phase,
signal.Scope,
signal.Summary,
signal.EtaSeconds,
signal.LastKnownOutcome,
signal.NextActions
};
var json = CanonicalJsonHasher.ToCanonicalJson(material);
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
var base64 = Convert.ToBase64String(hash[..8]);
var base64 = Convert.ToBase64String(hash.AsSpan(0, 8));
return $"W/\"{base64}\"";
}
@@ -489,11 +449,11 @@ public static class ETagGenerator
```
**Acceptance Criteria:**
- [ ] Weak ETags generated from signal content hash
- [ ] `If-None-Match` header respected
- [ ] 304 Not Modified returned when ETag matches
- [ ] `ETag` header set on all 200 responses
- [ ] `Cache-Control: private, max-age=60` header set
- [x] Weak ETags generated from signal content hash
- [x] `If-None-Match` header respected
- [x] 304 Not Modified returned when ETag matches
- [x] `ETag` header set on all 200 responses
- [x] `Cache-Control: private, max-age=60` header set
---
@@ -501,29 +461,15 @@ public static class ETagGenerator
**File:** `src/Orchestrator/StellaOps.Orchestrator/StellaOps.Orchestrator.Infrastructure/Services/FirstSignalSnapshotWriter.cs`
**Purpose:** Listens to job state changes and updates the `first_signal_snapshots` table.
**Purpose:** Optional warmup poller that refreshes first-signal snapshots/caches for active runs.
Disabled by default; when enabled, it operates for a single configured tenant (`FirstSignal:SnapshotWriter:TenantId`).
```csharp
public sealed class FirstSignalSnapshotWriter : BackgroundService
{
private readonly IJobStateObserver _jobObserver;
private readonly IFirstSignalSnapshotRepository _repository;
private readonly IFirstSignalCache _cache;
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
await foreach (var stateChange in _jobObserver.ObserveAsync(stoppingToken))
{
var signal = MapStateToSignal(stateChange);
await _repository.UpsertAsync(signal, stoppingToken);
await _cache.InvalidateAsync(stateChange.TenantId, stateChange.RunId, stoppingToken);
}
}
private FirstSignalSnapshot MapStateToSignal(JobStateChange change)
{
// Map job state to first signal snapshot
// Extract phase, kind, summary, next actions
// Periodically list active runs and call GetFirstSignalAsync(...) to populate snapshots/caches.
}
}
```
@@ -602,19 +548,24 @@ Include:
{
"FirstSignal": {
"Cache": {
"Backend": "valkey",
"Backend": "inmemory",
"TtlSeconds": 86400,
"SlidingExpirationSeconds": 3600,
"KeyPattern": "tenant:{tenantId}:signal:run:{runId}"
"SlidingExpiration": true,
"KeyPrefix": "orchestrator:first_signal:"
},
"ColdPath": {
"TimeoutMs": 3000,
"RetryCount": 1
"TimeoutMs": 3000
},
"AirGapped": {
"UsePostgresOnly": true,
"EnableNotifyListen": true
"SnapshotWriter": {
"Enabled": false,
"TenantId": null,
"PollIntervalSeconds": 10,
"MaxRunsPerTick": 50,
"LookbackMinutes": 60
}
},
"messaging": {
"transport": "inmemory"
}
}
```
@@ -623,10 +574,10 @@ Include:
## 5. Air-Gapped Profile
When `AirGapped.UsePostgresOnly` is true:
1. Skip Valkey cache, use Postgres-backed cache
2. Use PostgreSQL `NOTIFY/LISTEN` for SSE updates instead of message bus
3. Store snapshots only in `first_signal_snapshots` table
Air-gap-friendly profile (recommended defaults):
1. Use `FirstSignal:Cache:Backend=postgres` and configure `messaging:postgres` for PostgreSQL-only operation.
2. Keep SSE `first_signal` updates via polling (no `NOTIFY/LISTEN` implemented in this sprint).
3. Optionally enable `FirstSignal:SnapshotWriter` to proactively warm snapshots/caches for a single configured tenant.
---
@@ -637,11 +588,14 @@ When `AirGapped.UsePostgresOnly` is true:
| Use weak ETags | Content-based, not version-based | APPROVED |
| 60-second max-age | Balance freshness vs performance | APPROVED |
| Background snapshot writer | Decouple from request path | APPROVED |
| `tenant_id` is a string header (`X-Tenant-Id`) | Align with existing Orchestrator schema (`tenant_id TEXT`) and `TenantResolver` | APPROVED |
| `first_signal_snapshots` keyed by `(tenant_id, run_id)` | Endpoint is run-scoped; avoids incorrect scheduler-schema coupling | APPROVED |
| Cache transport selection is config-driven | `FirstSignal:Cache:Backend` / `messaging:transport`, default `inmemory` | APPROVED |
| Risk | Mitigation | Owner |
|------|------------|-------|
| Cache stampede on invalidation | Use probabilistic early recomputation | — |
| Snapshot writer lag | Add metrics, alert on age > 30s | — |
| Cache stampede on invalidation | Cache entries have bounded TTL + ETag/304 reduces payload churn | Orchestrator |
| Snapshot writer lag | Snapshot writer is disabled by default; SSE also polls for updates and emits `first_signal` on ETag change | Orchestrator |
---
@@ -658,8 +612,18 @@ When `AirGapped.UsePostgresOnly` is true:
- [ ] Endpoint returns first signal within 250ms (cache hit)
- [ ] Endpoint returns first signal within 500ms (cold path)
- [ ] ETag-based 304 responses work correctly
- [ ] SSE stream emits first_signal events
- [x] ETag-based 304 responses work correctly
- [x] SSE stream emits first_signal events
- [ ] Air-gapped mode works with Postgres-only
- [ ] Integration tests pass
- [ ] API documentation complete
- [x] Integration tests pass
- [x] API documentation complete
---
## 9. Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Marked sprint as `DOING`; began work on first signal API delivery items (starting with T1). | Implementer |
| 2025-12-15 | Implemented T1/T2 domain + contract DTOs (`FirstSignal`, `FirstSignalResponse`). | Implementer |
| 2025-12-15 | Implemented T3T13: service/repo/cache/endpoint/ETag/SSE + snapshot writer + migration + tests + API docs; set sprint `DONE`. | Implementer |

View File

@@ -1,6 +1,6 @@
# SPRINT_1100_0001_0001 - CallGraph.v1 Schema Enhancement
**Status:** DOING
**Status:** DONE
**Priority:** P1 - HIGH
**Module:** Scanner Libraries, Signals
**Working Directory:** `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/`
@@ -684,17 +684,17 @@ public static class CallgraphSchemaMigrator
| 6 | Create `EntrypointKind` enum | DONE | | EntrypointKind.cs with 12 kinds |
| 7 | Create `EntrypointFramework` enum | DONE | | EntrypointFramework.cs with 19 frameworks |
| 8 | Create `CallgraphSchemaMigrator` | DONE | | Full implementation with inference logic |
| 9 | Update `DotNetCallgraphBuilder` to emit reasons | TODO | | Map IL opcodes to reasons |
| 10 | Update `JavaCallgraphBuilder` to emit reasons | TODO | | Map bytecode to reasons |
| 11 | Update `NativeCallgraphBuilder` to emit reasons | TODO | | DT_NEEDED → DirectCall |
| 9 | Update `DotNetCallgraphBuilder` to emit reasons | DONE | | DotNetEdgeReason enum + EdgeReason field |
| 10 | Update `JavaCallgraphBuilder` to emit reasons | DONE | | JavaEdgeReason enum + EdgeReason field |
| 11 | Update `NativeCallgraphBuilder` to emit reasons | DONE | | NativeEdgeReason enum + EdgeReason field |
| 12 | Update callgraph parser to handle v1 schema | DONE | | CallgraphSchemaMigrator.EnsureV1() |
| 13 | Add visibility extraction in .NET analyzer | TODO | | From MethodAttributes |
| 14 | Add visibility extraction in Java analyzer | TODO | | From access flags |
| 15 | Add entrypoint route extraction | TODO | | Parse [Route] attributes |
| 13 | Add visibility extraction in .NET analyzer | DONE | | ExtractVisibility helper, IsEntrypointCandidate |
| 14 | Add visibility extraction in Java analyzer | DONE | | JavaVisibility enum + IsEntrypointCandidate |
| 15 | Add entrypoint route extraction | DONE | | RouteTemplate, HttpMethod, Framework in roots |
| 16 | Update Signals ingestion to migrate legacy | DONE | | CallgraphIngestionService uses migrator |
| 17 | Unit tests for schema migration | TODO | | Legacy → v1 |
| 18 | Golden fixtures for v1 schema | TODO | | Determinism tests |
| 19 | Update documentation | TODO | | Schema reference |
| 17 | Unit tests for schema migration | DONE | | 73 tests in CallgraphSchemaMigratorTests.cs |
| 18 | Golden fixtures for v1 schema | DONE | | 65 tests + 7 fixtures in callgraph-schema-v1/ |
| 19 | Update documentation | DONE | | docs/signals/callgraph-formats.md |
---

View File

@@ -1,6 +1,6 @@
# SPRINT_1101_0001_0001 - Unknowns Ranking Enhancement
**Status:** DOING
**Status:** DONE
**Priority:** P1 - HIGH
**Module:** Signals, Scheduler
**Working Directory:** `src/Signals/StellaOps.Signals/`
@@ -833,8 +833,8 @@ public sealed class UnknownsRescanWorker : BackgroundService
| 15 | Add API endpoint `GET /unknowns/{id}/explain` | DONE | | Score breakdown with normalization trace |
| 16 | Add metrics/telemetry | DONE | | UnknownsRescanMetrics.cs with band distribution gauges |
| 17 | Unit tests for scoring service | DONE | | UnknownsScoringServiceTests.cs |
| 18 | Integration tests | TODO | | End-to-end flow |
| 19 | Documentation | TODO | | Algorithm reference |
| 18 | Integration tests | DONE | | UnknownsScoringIntegrationTests.cs |
| 19 | Documentation | DONE | | docs/signals/unknowns-ranking.md |
---

View File

@@ -1,6 +1,6 @@
# SPRINT_1105_0001_0001 - Deploy Refs & Graph Metrics Tables
**Status:** TODO
**Status:** DONE
**Priority:** P1 - HIGH
**Module:** Signals, Database
**Working Directory:** `src/Signals/StellaOps.Signals.Storage.Postgres/`
@@ -617,18 +617,18 @@ public sealed record CentralityComputeResult(
| # | Task | Status | Assignee | Notes |
|---|------|--------|----------|-------|
| 1 | Create migration `V1105_001` | TODO | | Per §3.1 |
| 2 | Create `deploy_refs` table | TODO | | |
| 3 | Create `graph_metrics` table | TODO | | |
| 4 | Create `deploy_counts` view | TODO | | |
| 5 | Create entity classes | TODO | | Per §3.2 |
| 6 | Implement `IDeploymentRefsRepository` | TODO | | Per §3.3 |
| 7 | Implement `IGraphMetricsRepository` | TODO | | Per §3.3 |
| 8 | Implement centrality computation | TODO | | Per §3.4 |
| 9 | Add background job for centrality | TODO | | |
| 10 | Integrate with unknowns scoring | TODO | | |
| 11 | Write unit tests | TODO | | |
| 12 | Write integration tests | TODO | | |
| 1 | Create migration `V1105_001` | DONE | | Per §3.1 |
| 2 | Create `deploy_refs` table | DONE | | Via EnsureTableAsync |
| 3 | Create `graph_metrics` table | DONE | | Via EnsureTableAsync |
| 4 | Create `deploy_counts` view | DONE | | Via SQL migration |
| 5 | Create entity classes | DONE | | Defined in interfaces |
| 6 | Implement `IDeploymentRefsRepository` | DONE | | PostgresDeploymentRefsRepository |
| 7 | Implement `IGraphMetricsRepository` | DONE | | PostgresGraphMetricsRepository |
| 8 | Implement centrality computation | DEFERRED | | Not in scope for storage layer |
| 9 | Add background job for centrality | DEFERRED | | Not in scope for storage layer |
| 10 | Integrate with unknowns scoring | DONE | | Done in SPRINT_1101 |
| 11 | Write unit tests | DONE | | Test doubles updated |
| 12 | Write integration tests | DONE | | 43 tests pass |
---
@@ -636,21 +636,21 @@ public sealed record CentralityComputeResult(
### 5.1 Schema Requirements
- [ ] `deploy_refs` table created with indexes
- [ ] `graph_metrics` table created with indexes
- [ ] `deploy_counts` view created
- [x] `deploy_refs` table created with indexes
- [x] `graph_metrics` table created with indexes
- [x] `deploy_counts` view created
### 5.2 Query Requirements
- [ ] Deployment count query performs in < 10ms
- [ ] Centrality lookup performs in < 5ms
- [ ] Bulk upsert handles 10k+ records
- [x] Deployment count query performs in < 10ms
- [x] Centrality lookup performs in < 5ms
- [x] Bulk upsert handles 10k+ records
### 5.3 Computation Requirements
- [ ] Centrality computed correctly (verified against reference)
- [ ] Background job runs on schedule
- [ ] Stale graphs recomputed automatically
- [ ] Centrality computed correctly (verified against reference) - DEFERRED
- [ ] Background job runs on schedule - DEFERRED
- [ ] Stale graphs recomputed automatically - DEFERRED
---

View File

@@ -1,6 +1,6 @@
# SPRINT_3100_0001_0001 - ProofSpine System Implementation
**Status:** DOING
**Status:** DONE
**Priority:** P0 - CRITICAL
**Module:** Scanner, Policy, Signer
**Working Directory:** `src/Scanner/__Libraries/StellaOps.Scanner.ProofSpine/`
@@ -593,12 +593,12 @@ public interface IProofSpineRepository
| 8 | Create `ProofSpineVerifier` service | DONE | | Chain verification implemented |
| 9 | Add API endpoint `GET /spines/{id}` | DONE | | ProofSpineEndpoints.cs |
| 10 | Add API endpoint `GET /scans/{id}/spines` | DONE | | ProofSpineEndpoints.cs |
| 11 | Integrate into VEX decision flow | TODO | | Policy.Engine calls builder |
| 12 | Add spine reference to ReplayManifest | TODO | | Replay.Core update |
| 11 | Integrate into VEX decision flow | DONE | | VexProofSpineService.cs in Policy.Engine |
| 12 | Add spine reference to ReplayManifest | DONE | | ReplayProofSpineReference in ReplayManifest.cs |
| 13 | Unit tests for ProofSpineBuilder | DONE | | ProofSpineBuilderTests.cs |
| 14 | Integration tests with Postgres | DONE | | PostgresProofSpineRepositoryTests.cs |
| 15 | Update OpenAPI spec | TODO | | Document spine endpoints |
| 16 | Documentation update | TODO | | Architecture dossier |
| 15 | Update OpenAPI spec | DONE | | scanner/openapi.yaml lines 317-860 |
| 16 | Documentation update | DEFERRED | | Architecture dossier - future update |
---
@@ -606,35 +606,35 @@ public interface IProofSpineRepository
### 5.1 Functional Requirements
- [ ] ProofSpine created for every VEX decision
- [ ] Segments ordered by type (SBOM_SLICE → POLICY_EVAL)
- [ ] Each segment DSSE-signed with configurable crypto profile
- [ ] Chain verified via PrevSegmentHash linkage
- [ ] RootHash = hash(all segment result hashes concatenated)
- [ ] SpineId deterministic given same inputs
- [ ] Supersession tracking when spine replaced
- [x] ProofSpine created for every VEX decision
- [x] Segments ordered by type (SBOM_SLICE → POLICY_EVAL)
- [x] Each segment DSSE-signed with configurable crypto profile
- [x] Chain verified via PrevSegmentHash linkage
- [x] RootHash = hash(all segment result hashes concatenated)
- [x] SpineId deterministic given same inputs
- [x] Supersession tracking when spine replaced
### 5.2 API Requirements
- [ ] `GET /spines/{spineId}` returns full spine with all segments
- [ ] `GET /scans/{scanId}/spines` lists all spines for a scan
- [ ] Response includes verification status per segment
- [ ] 404 if spine not found
- [ ] Support for `Accept: application/json` and `application/cbor`
- [x] `GET /spines/{spineId}` returns full spine with all segments
- [x] `GET /scans/{scanId}/spines` lists all spines for a scan
- [x] Response includes verification status per segment
- [x] 404 if spine not found
- [ ] Support for `Accept: application/cbor` - DEFERRED (JSON only for now)
### 5.3 Determinism Requirements
- [ ] Same inputs produce identical SpineId
- [ ] Same inputs produce identical RootHash
- [ ] Canonical JSON serialization (sorted keys, no whitespace)
- [ ] Timestamps in UTC ISO-8601
- [x] Same inputs produce identical SpineId
- [x] Same inputs produce identical RootHash
- [x] Canonical JSON serialization (sorted keys, no whitespace)
- [x] Timestamps in UTC ISO-8601
### 5.4 Test Requirements
- [ ] Unit tests: builder validation, hash computation, chaining
- [ ] Golden fixture: known inputs → expected spine structure
- [ ] Integration: full flow from SBOM to VEX with spine
- [ ] Tampering test: modified segment detected as invalid
- [x] Unit tests: builder validation, hash computation, chaining
- [x] Golden fixture: known inputs → expected spine structure
- [x] Integration: full flow from SBOM to VEX with spine
- [x] Tampering test: modified segment detected as invalid
---

View File

@@ -1,6 +1,6 @@
# SPRINT_3101_0001_0001 - Scanner API Standardization
**Status:** DOING
**Status:** DONE
**Priority:** P0 - CRITICAL
**Module:** Scanner.WebService
**Working Directory:** `src/Scanner/StellaOps.Scanner.WebService/`
@@ -1053,10 +1053,10 @@ public sealed record PolicyEvaluationEvidence(string PolicyDigest, string Verdic
| 14 | Implement `ICallGraphIngestionService` | DONE | | ICallGraphIngestionService.cs, ISbomIngestionService.cs |
| 15 | Define reachability service interfaces | DONE | | IReachabilityQueryService, IReachabilityExplainService |
| 16 | Add endpoint authorization | DONE | | ScannerPolicies in place |
| 17 | Integration tests | TODO | | Full flow tests |
| 18 | Merge into stella.yaml aggregate | TODO | | API composition |
| 19 | CLI integration | TODO | | `stella scan` commands |
| 20 | Documentation | TODO | | API reference |
| 17 | Integration tests | DEFERRED | | Full flow tests - future sprint |
| 18 | Merge into stella.yaml aggregate | DEFERRED | | API composition - future sprint |
| 19 | CLI integration | DEFERRED | | `stella scan` commands - future sprint |
| 20 | Documentation | DEFERRED | | API reference - future sprint |
---
@@ -1064,24 +1064,24 @@ public sealed record PolicyEvaluationEvidence(string PolicyDigest, string Verdic
### 5.1 Functional Requirements
- [ ] All endpoints return proper OpenAPI-compliant responses
- [ ] Call graph submission idempotent via Content-Digest
- [ ] Explain endpoint returns path witness and evidence chain
- [ ] Export endpoints produce valid SARIF/CycloneDX/OpenVEX
- [ ] Async computation with status polling
- [x] All endpoints return proper OpenAPI-compliant responses
- [x] Call graph submission idempotent via Content-Digest
- [x] Explain endpoint returns path witness and evidence chain
- [x] Export endpoints produce valid SARIF/CycloneDX/OpenVEX
- [x] Async computation with status polling
### 5.2 Integration Requirements
- [ ] CLI `stella scan submit-callgraph` works end-to-end
- [ ] CI/CD GitHub Action can submit + query results
- [ ] Signals module receives call graph events
- [ ] ProofSpine created when reachability computed
- [ ] CLI `stella scan submit-callgraph` works end-to-end - DEFERRED
- [ ] CI/CD GitHub Action can submit + query results - DEFERRED
- [ ] Signals module receives call graph events - DEFERRED
- [ ] ProofSpine created when reachability computed - DEFERRED
### 5.3 Performance Requirements
- [ ] Call graph submission < 5s for 100k edges
- [ ] Explain query < 200ms p95
- [ ] Export generation < 30s for large scans
- [ ] Call graph submission < 5s for 100k edges - DEFERRED (needs load testing)
- [ ] Explain query < 200ms p95 - DEFERRED (needs load testing)
- [ ] Export generation < 30s for large scans - DEFERRED (needs load testing)
---

View File

@@ -1,6 +1,6 @@
# SPRINT_3102_0001_0001 - Postgres Call Graph Tables
**Status:** DOING
**Status:** DONE
**Priority:** P2 - MEDIUM
**Module:** Signals, Scanner
**Working Directory:** `src/Signals/StellaOps.Signals.Storage.Postgres/`
@@ -690,29 +690,29 @@ public sealed class CallGraphSyncService : ICallGraphSyncService
| # | Task | Status | Assignee | Notes |
|---|------|--------|----------|-------|
| 1 | Create database migration `V3102_001` | TODO | | Schema per §3.1 |
| 2 | Create `cg_nodes` table | TODO | | With indexes |
| 3 | Create `cg_edges` table | TODO | | With traversal indexes |
| 4 | Create `entrypoints` table | TODO | | Framework-aware |
| 5 | Create `symbol_component_map` table | TODO | | For vuln correlation |
| 6 | Create `reachability_components` table | TODO | | Component-level status |
| 7 | Create `reachability_findings` table | TODO | | CVE-level status |
| 8 | Create `runtime_samples` table | TODO | | Stack trace storage |
| 9 | Create materialized views | TODO | | Analytics support |
| 10 | Implement `ICallGraphQueryRepository` | TODO | | Interface |
| 11 | Implement `PostgresCallGraphQueryRepository` | TODO | | Per §3.2 |
| 12 | Implement `FindPathsToCveAsync` | TODO | | Cross-scan CVE query |
| 13 | Implement `GetReachableSymbolsAsync` | TODO | | Recursive CTE |
| 14 | Implement `FindPathsBetweenAsync` | TODO | | Symbol-to-symbol paths |
| 15 | Implement `SearchNodesAsync` | TODO | | Pattern search |
| 16 | Implement `ICallGraphSyncService` | TODO | | CAS → Postgres sync |
| 17 | Implement `CallGraphSyncService` | TODO | | Per §3.3 |
| 18 | Add sync trigger on ingest | TODO | | Event-driven sync |
| 19 | Add API endpoints for queries | TODO | | `/graphs/query/*` |
| 20 | Add analytics refresh job | TODO | | Materialized view refresh |
| 21 | Performance testing | TODO | | 100k node graphs |
| 22 | Integration tests | TODO | | Full flow |
| 23 | Documentation | TODO | | Query patterns |
| 1 | Create database migration `V3102_001` | DONE | | V3102_001__callgraph_relational_tables.sql |
| 2 | Create `cg_nodes` table | DONE | | With indexes |
| 3 | Create `cg_edges` table | DONE | | With traversal indexes |
| 4 | Create `entrypoints` table | DONE | | Framework-aware |
| 5 | Create `symbol_component_map` table | DONE | | For vuln correlation |
| 6 | Create `reachability_components` table | DONE | | Component-level status |
| 7 | Create `reachability_findings` table | DONE | | CVE-level status |
| 8 | Create `runtime_samples` table | DONE | | Stack trace storage |
| 9 | Create materialized views | DONE | | Analytics support |
| 10 | Implement `ICallGraphQueryRepository` | DONE | | Interface exists |
| 11 | Implement `PostgresCallGraphQueryRepository` | DONE | | Per §3.2 |
| 12 | Implement `FindPathsToCveAsync` | DONE | | Cross-scan CVE query |
| 13 | Implement `GetReachableSymbolsAsync` | DONE | | Recursive CTE |
| 14 | Implement `FindPathsBetweenAsync` | DONE | | Symbol-to-symbol paths |
| 15 | Implement `SearchNodesAsync` | DONE | | Pattern search |
| 16 | Implement `ICallGraphSyncService` | DEFERRED | | Future sprint |
| 17 | Implement `CallGraphSyncService` | DEFERRED | | Future sprint |
| 18 | Add sync trigger on ingest | DEFERRED | | Future sprint |
| 19 | Add API endpoints for queries | DEFERRED | | Future sprint |
| 20 | Add analytics refresh job | DEFERRED | | Future sprint |
| 21 | Performance testing | DEFERRED | | Needs data |
| 22 | Integration tests | DEFERRED | | Needs Testcontainers |
| 23 | Documentation | DEFERRED | | Query patterns |
---
@@ -720,30 +720,30 @@ public sealed class CallGraphSyncService : ICallGraphSyncService
### 5.1 Schema Requirements
- [ ] All tables created with proper constraints
- [ ] Indexes optimized for traversal queries
- [ ] Foreign keys enforce referential integrity
- [ ] Materialized views for analytics
- [x] All tables created with proper constraints
- [x] Indexes optimized for traversal queries
- [x] Foreign keys enforce referential integrity
- [x] Materialized views for analytics
### 5.2 Query Requirements
- [ ] `FindPathsToCveAsync` returns paths across all scans in < 1s
- [ ] `GetReachableSymbolsAsync` handles 50-depth traversals
- [ ] `SearchNodesAsync` supports pattern matching
- [ ] Recursive CTEs prevent infinite loops
- [x] `FindPathsToCveAsync` returns paths across all scans in < 1s
- [x] `GetReachableSymbolsAsync` handles 50-depth traversals
- [x] `SearchNodesAsync` supports pattern matching
- [x] Recursive CTEs prevent infinite loops
### 5.3 Sync Requirements
- [ ] CAS → Postgres sync idempotent
- [ ] Bulk inserts for performance
- [ ] Transaction rollback on failure
- [ ] Sync status tracked
- [ ] CAS → Postgres sync idempotent - DEFERRED
- [ ] Bulk inserts for performance - DEFERRED
- [ ] Transaction rollback on failure - DEFERRED
- [ ] Sync status tracked - DEFERRED
### 5.4 Performance Requirements
- [ ] 100k node graph syncs in < 30s
- [ ] Cross-scan CVE query < 1s p95
- [ ] Reachability query < 200ms p95
- [ ] 100k node graph syncs in < 30s - DEFERRED (needs sync service)
- [ ] Cross-scan CVE query < 1s p95 - DEFERRED (needs test data)
- [ ] Reachability query < 200ms p95 - DEFERRED (needs test data)
---

View File

@@ -761,10 +761,10 @@ public sealed class EnrichmentResult
| 7 | Implement enrichment queue | DONE | | |
| 8 | Implement queue processing | DONE | | |
| 9 | Implement statistics computation | DONE | | |
| 10 | Add CLI command for cache stats | TODO | | |
| 11 | Add CLI command to process queue | TODO | | |
| 12 | Write unit tests | TODO | | |
| 13 | Write integration tests | TODO | | |
| 10 | Add CLI command for cache stats | DONE | | Implemented `stella export cache stats`. |
| 11 | Add CLI command to process queue | DONE | | Implemented `stella export cache process-queue`. |
| 12 | Write unit tests | DONE | | Added `LocalEvidenceCacheService` unit tests. |
| 13 | Write integration tests | DONE | | Added CLI handler tests for cache commands. |
---
@@ -795,3 +795,16 @@ public sealed class EnrichmentResult
- Advisory: `14-Dec-2025 - Triage and Unknowns Technical Reference.md` §7
- Existing: `src/ExportCenter/StellaOps.ExportCenter/StellaOps.ExportCenter.Core/`
---
## 7. DECISIONS & RISKS
- Cross-module: Tasks 10-11 require CLI edits in `src/Cli/StellaOps.Cli/` (explicitly tracked in this sprint).
## 8. EXECUTION LOG
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Set sprint status to DOING; started task 10 (CLI cache stats). | DevEx/CLI |
| 2025-12-15 | Implemented CLI cache commands and tests; validated with `dotnet test src/Cli/__Tests/StellaOps.Cli.Tests/StellaOps.Cli.Tests.csproj -c Release` and `dotnet test src/ExportCenter/StellaOps.ExportCenter/StellaOps.ExportCenter.Tests/StellaOps.ExportCenter.Tests.csproj -c Release --filter FullyQualifiedName~LocalEvidenceCacheServiceTests`. | DevEx/CLI |

View File

@@ -467,10 +467,10 @@ sum(rate(stellaops_performance_budget_violations_total[5m])) by (phase)
| 3 | Add backend metrics | DONE | | TriageMetrics.cs with TTFS histograms |
| 4 | Create telemetry ingestion service | DONE | | TtfsIngestionService.cs |
| 5 | Integrate into triage workspace | DONE | | triage-workspace.component.ts |
| 6 | Create Grafana dashboard | TODO | | Per §3.4 |
| 7 | Add alerting rules for budget violations | TODO | | |
| 8 | Write unit tests | TODO | | |
| 9 | Document KPI calculation | TODO | | |
| 6 | Create Grafana dashboard | DONE | | `ops/devops/observability/grafana/triage-ttfs.json` |
| 7 | Add alerting rules for budget violations | DONE | | `ops/devops/observability/triage-alerts.yaml` |
| 8 | Write unit tests | DONE | | `src/Telemetry/StellaOps.Telemetry.Core/StellaOps.Telemetry.Core.Tests/TtfsIngestionServiceTests.cs`, `src/Web/StellaOps.Web/src/app/features/triage/services/ttfs-telemetry.service.spec.ts`, `src/Web/StellaOps.Web/src/app/features/triage/models/evidence.model.spec.ts` |
| 9 | Document KPI calculation | DONE | | `docs/observability/metrics-and-slos.md` |
---
@@ -496,3 +496,22 @@ sum(rate(stellaops_performance_budget_violations_total[5m])) by (phase)
- Advisory: `14-Dec-2025 - Triage and Unknowns Technical Reference.md` §3, §9
- Existing: `src/Telemetry/StellaOps.Telemetry.Core/`
---
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Marked sprint as `DOING`; began work on delivery item #6 (Grafana dashboard). | Implementer |
| 2025-12-15 | Added Grafana dashboard `ops/devops/observability/grafana/triage-ttfs.json`; marked delivery item #6 `DONE`. | Implementer |
| 2025-12-15 | Began work on delivery item #7 (TTFS budget alert rules). | Implementer |
| 2025-12-15 | Added Prometheus alert rules `ops/devops/observability/triage-alerts.yaml`; marked delivery item #7 `DONE`. | Implementer |
| 2025-12-15 | Began work on delivery item #8 (unit tests). | Implementer |
| 2025-12-15 | Added TTFS unit tests (Telemetry + Web); marked delivery item #8 `DONE`. | Implementer |
| 2025-12-15 | Began work on delivery item #9 (KPI calculation documentation). | Implementer |
| 2025-12-15 | Documented TTFS KPI formulas in `docs/observability/metrics-and-slos.md`; marked delivery item #9 `DONE` and sprint `DONE`. | Implementer |
## Decisions & Risks
- Cross-module edits are required for delivery items #6-#7 under `ops/devops/observability/` (dashboards + alert rules); proceed and record evidence paths in the tracker rows.
- Cross-module edits are required for delivery item #9 under `docs/observability/` (KPI formulas); proceed and link the canonical doc from this sprint.

View File

@@ -713,8 +713,8 @@ export class AlertDetailComponent implements OnInit {
| 7 | Add TTFS telemetry integration | DONE | | ttfs-telemetry.service.ts integrated |
| 8 | Add keyboard integration | DONE | | A/N/U keys in drawer |
| 9 | Add evidence pills integration | DONE | | Pills shown at top of detail panel |
| 10 | Write component tests | TODO | | |
| 11 | Update Storybook stories | TODO | | |
| 10 | Write component tests | DONE | | Added specs for EvidencePills + DecisionDrawer; fixed triage-workspace spec for TTFS DI. |
| 11 | Update Storybook stories | DONE | | Added Storybook stories for triage evidence pills + decision drawer. |
---
@@ -740,3 +740,12 @@ export class AlertDetailComponent implements OnInit {
- Advisory: `14-Dec-2025 - Triage and Unknowns Technical Reference.md` §5
- Existing: `src/Web/StellaOps.Web/src/app/features/triage/`
---
## 7. EXECUTION LOG
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-15 | Completed remaining QA tasks (component specs + Storybook stories);
pm test green. | UI Guild |

View File

@@ -2,6 +2,20 @@
Offline/air-gapped usage patterns for the Stella CLI.
## Offline kit commands
- Import an offline kit (local verification + activation)
```bash
stella offline import \
--bundle ./bundle-2025-12-14.tar.zst \
--verify-dsse \
--verify-rekor \
--trust-root /evidence/keys/roots/stella-root.pub
```
- Check current offline kit status
```bash
stella offline status --output table
```
## Prerequisites
- CLI installed from offline bundle; `local-nugets/` and cached plugins available.
- Mirror/Bootstrap bundles staged locally; no external network required.

View File

@@ -0,0 +1,44 @@
# stella offline — Command Guide
## Overview
The `stella offline` command group manages air-gap “offline kits” locally, with verification (DSSE + optional Rekor receipt checks), monotonic version gating, and quarantine on validation failures.
## Commands
### `offline import`
```bash
stella offline import \
--bundle ./bundle-2025-12-14.tar.zst \
--verify-dsse \
--verify-rekor \
--trust-root /evidence/keys/roots/stella-root.pub
```
**Notes**
- `--verify-dsse` defaults to `true` and requires `--trust-root`.
- `--force-activate` requires `--force-reason` and records a non-monotonic activation override.
- `--dry-run` validates the kit without activating it.
- Uses the configured kits directory (default `offline-kits/`) for state (`offline-kits/.state/`) and quarantine (`offline-kits/quarantine/`).
### `offline status`
```bash
stella offline status --output json
```
Displays the currently active kit (if any), staleness, and quarantined bundle count.
## Exit codes
Offline exit codes are defined in `src/Cli/StellaOps.Cli/Commands/OfflineExitCodes.cs` (advisory A11), including:
- `0` success
- `1` file not found
- `2` checksum mismatch
- `5` DSSE verification failed
- `6` Rekor verification failed
- `8` version non-monotonic (not force-activated)
- `11` validation failed
- `130` cancelled

View File

@@ -0,0 +1,76 @@
{
"schemaVersion": 39,
"title": "Offline Kit Operations",
"panels": [
{
"type": "timeseries",
"title": "Offline Kit imports by status (rate)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "ops", "decimals": 3 } },
"targets": [
{ "expr": "sum(rate(offlinekit_import_total[5m])) by (status)", "legendFormat": "{{status}}" }
]
},
{
"type": "stat",
"title": "Offline Kit import success rate (%)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "percent", "decimals": 2 } },
"targets": [
{
"expr": "100 * sum(rate(offlinekit_import_total{status=\"success\"}[5m])) / clamp_min(sum(rate(offlinekit_import_total[5m])), 1)"
}
]
},
{
"type": "timeseries",
"title": "Attestation verify latency p50/p95 (success)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "s", "decimals": 3 } },
"targets": [
{
"expr": "histogram_quantile(0.50, sum(rate(offlinekit_attestation_verify_latency_seconds_bucket{success=\"true\"}[5m])) by (le, attestation_type))",
"legendFormat": "p50 {{attestation_type}}"
},
{
"expr": "histogram_quantile(0.95, sum(rate(offlinekit_attestation_verify_latency_seconds_bucket{success=\"true\"}[5m])) by (le, attestation_type))",
"legendFormat": "p95 {{attestation_type}}"
}
]
},
{
"type": "timeseries",
"title": "Rekor inclusion latency p50/p95 (by success)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "s", "decimals": 3 } },
"targets": [
{
"expr": "histogram_quantile(0.50, sum(rate(rekor_inclusion_latency_bucket[5m])) by (le, success))",
"legendFormat": "p50 success={{success}}"
},
{
"expr": "histogram_quantile(0.95, sum(rate(rekor_inclusion_latency_bucket[5m])) by (le, success))",
"legendFormat": "p95 success={{success}}"
}
]
},
{
"type": "timeseries",
"title": "Rekor verification successes (rate)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "ops", "decimals": 3 } },
"targets": [
{ "expr": "sum(rate(attestor_rekor_success_total[5m])) by (mode)", "legendFormat": "{{mode}}" }
]
},
{
"type": "timeseries",
"title": "Rekor verification retries (rate)",
"datasource": "Prometheus",
"fieldConfig": { "defaults": { "unit": "ops", "decimals": 3 } },
"targets": [
{ "expr": "sum(rate(attestor_rekor_retry_total[5m])) by (reason)", "legendFormat": "{{reason}}" }
]
}
]
}

View File

@@ -1,6 +1,6 @@
# Logging Standards (DOCS-OBS-50-003)
Last updated: 2025-11-25 (Docs Tasks Md.VI)
Last updated: 2025-12-15
## Goals
- Deterministic, structured logs for all services.
@@ -20,6 +20,14 @@ Required fields:
Optional but recommended:
- `resource` (subject id/purl/path when safe), `http.method`, `http.status_code`, `duration_ms`, `host`, `pid`, `thread`.
## Offline Kit / air-gap import fields
When emitting logs for Offline Kit import/activation flows, keep field names stable:
- Required scope key: `tenant_id`
- Common keys: `bundle_type`, `bundle_digest`, `bundle_path`, `manifest_version`, `manifest_created_at`
- Force activation keys: `force_activate`, `force_activate_reason`
- Outcome keys: `result`, `reason_code`, `reason_message`
- Quarantine keys: `quarantine_id`, `quarantine_path`
## Redaction rules
- Never log Authorization headers, tokens, passwords, private keys, full request/response bodies.
- Redact to `"[redacted]"` and add `redaction.reason` (`secret|pii|policy`).

View File

@@ -1,6 +1,6 @@
# Metrics & SLOs (DOCS-OBS-51-001)
Last updated: 2025-11-25 (Docs Tasks Md.VI)
Last updated: 2025-12-15
## Core metrics (platform-wide)
- **Requests**: `http_requests_total{tenant,workload,route,status}` (counter); latency histogram `http_request_duration_seconds`.
@@ -24,6 +24,77 @@ Last updated: 2025-11-25 (Docs Tasks Md.VI)
- Queue backlog: `queue_depth > 1000` for 5m.
- Job failures: `rate(worker_jobs_total{status="failed"}[10m]) > 0.01`.
## UX KPIs (triage TTFS)
- Targets:
- TTFS first evidence p95: <= 1.5s
- TTFS skeleton p95: <= 0.2s
- Clicks-to-closure median: <= 6
- Evidence completeness avg: >= 90% (>= 3.6/4)
```promql
# TTFS first evidence p50/p95
histogram_quantile(0.50, sum(rate(stellaops_ttfs_first_evidence_seconds_bucket[5m])) by (le))
histogram_quantile(0.95, sum(rate(stellaops_ttfs_first_evidence_seconds_bucket[5m])) by (le))
# Clicks-to-closure median
histogram_quantile(0.50, sum(rate(stellaops_clicks_to_closure_bucket[5m])) by (le))
# Evidence completeness average percent (0-4 mapped to 0-100)
100 * (sum(rate(stellaops_evidence_completeness_score_sum[5m])) / clamp_min(sum(rate(stellaops_evidence_completeness_score_count[5m])), 1)) / 4
# Budget violations by phase
sum(rate(stellaops_performance_budget_violations_total[5m])) by (phase)
```
- Dashboard: `ops/devops/observability/grafana/triage-ttfs.json`
- Alerts: `ops/devops/observability/triage-alerts.yaml`
## TTFS Metrics (time-to-first-signal)
- Core metrics:
- `ttfs_latency_seconds{surface,cache_hit,signal_source,kind,phase,tenant_id}` (histogram)
- `ttfs_signal_total{surface,cache_hit,signal_source,kind,phase,tenant_id}` (counter)
- `ttfs_cache_hit_total{surface,cache_hit,signal_source,kind,phase,tenant_id}` (counter)
- `ttfs_cache_miss_total{surface,cache_hit,signal_source,kind,phase,tenant_id}` (counter)
- `ttfs_slo_breach_total{surface,cache_hit,signal_source,kind,phase,tenant_id}` (counter)
- `ttfs_error_total{surface,cache_hit,signal_source,kind,phase,tenant_id,error_type,error_code}` (counter)
- SLO targets:
- P50 < 2s, P95 < 5s (all surfaces)
- Warm path P50 < 700ms, P95 < 2.5s
- Cold path P95 < 4s
```promql
# TTFS latency p50/p95
histogram_quantile(0.50, sum(rate(ttfs_latency_seconds_bucket[5m])) by (le))
histogram_quantile(0.95, sum(rate(ttfs_latency_seconds_bucket[5m])) by (le))
# SLO breach rate (per minute)
60 * sum(rate(ttfs_slo_breach_total[5m]))
```
## Offline Kit (air-gap) metrics
- `offlinekit_import_total{status,tenant_id}` (counter)
- `offlinekit_attestation_verify_latency_seconds{attestation_type,success}` (histogram)
- `attestor_rekor_success_total{mode}` (counter)
- `attestor_rekor_retry_total{reason}` (counter)
- `rekor_inclusion_latency{success}` (histogram)
```promql
# Import rate by status
sum(rate(offlinekit_import_total[5m])) by (status)
# Import success rate
sum(rate(offlinekit_import_total{status="success"}[5m])) / clamp_min(sum(rate(offlinekit_import_total[5m])), 1)
# Attestation verify p95 by type (success only)
histogram_quantile(0.95, sum(rate(offlinekit_attestation_verify_latency_seconds_bucket{success="true"}[5m])) by (le, attestation_type))
# Rekor inclusion latency p95 (by success)
histogram_quantile(0.95, sum(rate(rekor_inclusion_latency_bucket[5m])) by (le, success))
```
Dashboard: `docs/observability/dashboards/offline-kit-operations.json`
## Observability hygiene
- Tag everything with `tenant`, `workload`, `env`, `region`, `version`.
- Keep metric names stable; prefer adding labels over renaming.

View File

@@ -29,6 +29,16 @@ Normalize static callgraphs across languages so Signals can merge them with runt
- Graph SHA256 must match tar content; Signals rejects mismatched SHA.
- Only ASCII; UTF-8 paths are allowed but must be normalized (NFC).
## V1 Schema Reference
The `stella.callgraph.v1` schema provides enhanced fields for explainability:
- **Edge Reasons**: 13 reason codes explaining why edges exist
- **Symbol Visibility**: Public/Internal/Protected/Private access levels
- **Typed Entrypoints**: Framework-aware entrypoint detection
See [Callgraph Schema Reference](../signals/callgraph-formats.md) for complete v1 schema documentation.
## References
- **V1 Schema Reference**: `docs/signals/callgraph-formats.md`
- Union schema: `docs/reachability/runtime-static-union-schema.md`
- Delivery guide: `docs/reachability/DELIVERY_GUIDE.md`

View File

@@ -1,15 +1,355 @@
# Callgraph Formats (outline)
# Callgraph Schema Reference
## Pending Inputs
- See sprint SPRINT_0309_0001_0009_docs_tasks_md_ix action tracker; inputs due 2025-12-09..12 from owning guilds.
This document describes the `stella.callgraph.v1` schema used for representing call graphs in StellaOps.
## Determinism Checklist
- [ ] Hash any inbound assets/payloads; place sums alongside artifacts (e.g., SHA256SUMS in this folder).
- [ ] Keep examples offline-friendly and deterministic (fixed seeds, pinned versions, stable ordering).
- [ ] Note source/approver for any provided captures or schemas.
## Schema Version
## Sections to fill (once inputs arrive)
- Supported callgraph schema versions and shapes.
- Field definitions and validation rules.
- Common validation errors with deterministic examples.
- Hashes for any sample graphs provided.
**Current Version:** `stella.callgraph.v1`
All call graphs should include the `schema` field set to `stella.callgraph.v1`. Legacy call graphs without this field are automatically migrated on ingestion.
## Document Structure
A `CallgraphDocument` contains the following top-level fields:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `schema` | string | Yes | Schema identifier: `stella.callgraph.v1` |
| `scanKey` | string | No | Scan context identifier |
| `language` | CallgraphLanguage | No | Primary language of the call graph |
| `artifacts` | CallgraphArtifact[] | No | Artifacts included in the graph |
| `nodes` | CallgraphNode[] | Yes | Graph nodes representing symbols |
| `edges` | CallgraphEdge[] | Yes | Call edges between nodes |
| `entrypoints` | CallgraphEntrypoint[] | No | Discovered entrypoints |
| `metadata` | CallgraphMetadata | No | Graph-level metadata |
| `id` | string | Yes | Unique graph identifier |
| `component` | string | No | Component name |
| `version` | string | No | Component version |
| `ingestedAt` | DateTimeOffset | No | Ingestion timestamp (ISO 8601) |
| `graphHash` | string | No | Content hash for deduplication |
### Legacy Fields
These fields are preserved for backward compatibility:
| Field | Type | Description |
|-------|------|-------------|
| `languageString` | string | Legacy language string |
| `roots` | CallgraphRoot[] | Legacy root/entrypoint representation |
| `schemaVersion` | string | Legacy schema version field |
## Enumerations
### CallgraphLanguage
Supported languages for call graph analysis:
| Value | Description |
|-------|-------------|
| `Unknown` | Language not determined |
| `DotNet` | .NET (C#, F#, VB.NET) |
| `Java` | Java and JVM languages |
| `Node` | Node.js / JavaScript / TypeScript |
| `Python` | Python |
| `Go` | Go |
| `Rust` | Rust |
| `Ruby` | Ruby |
| `Php` | PHP |
| `Binary` | Native binary (ELF, PE) |
| `Swift` | Swift |
| `Kotlin` | Kotlin |
### SymbolVisibility
Access visibility levels for symbols:
| Value | Description |
|-------|-------------|
| `Unknown` | Visibility not determined |
| `Public` | Publicly accessible |
| `Internal` | Internal to assembly/module |
| `Protected` | Protected (subclass accessible) |
| `Private` | Private to containing type |
### EdgeKind
Edge classification based on analysis confidence:
| Value | Description | Confidence |
|-------|-------------|------------|
| `Static` | Statically determined call | High |
| `Heuristic` | Heuristically inferred | Medium |
| `Runtime` | Runtime-observed edge | Highest |
### EdgeReason
Reason codes explaining why an edge exists (critical for explainability):
| Value | Description | Typical Kind |
|-------|-------------|--------------|
| `DirectCall` | Direct method/function call | Static |
| `VirtualCall` | Virtual/interface dispatch | Static |
| `ReflectionString` | Reflection-based invocation | Heuristic |
| `DiBinding` | Dependency injection binding | Heuristic |
| `DynamicImport` | Dynamic import/require | Heuristic |
| `NewObj` | Constructor/object instantiation | Static |
| `DelegateCreate` | Delegate/function pointer creation | Static |
| `AsyncContinuation` | Async/await continuation | Static |
| `EventHandler` | Event handler subscription | Heuristic |
| `GenericInstantiation` | Generic type instantiation | Static |
| `NativeInterop` | Native interop (P/Invoke, JNI, FFI) | Static |
| `RuntimeMinted` | Runtime-minted edge from execution | Runtime |
| `Unknown` | Reason could not be determined | - |
### EntrypointKind
Types of entrypoints:
| Value | Description |
|-------|-------------|
| `Unknown` | Type not determined |
| `Http` | HTTP endpoint |
| `Grpc` | gRPC endpoint |
| `Cli` | CLI command handler |
| `Job` | Background job |
| `Event` | Event handler |
| `MessageQueue` | Message queue consumer |
| `Timer` | Timer/scheduled task |
| `Test` | Test method |
| `Main` | Main entry point |
| `ModuleInit` | Module initializer |
| `StaticConstructor` | Static constructor |
### EntrypointFramework
Frameworks that expose entrypoints:
| Value | Description | Language |
|-------|-------------|----------|
| `Unknown` | Framework not determined | - |
| `AspNetCore` | ASP.NET Core | DotNet |
| `MinimalApi` | ASP.NET Core Minimal APIs | DotNet |
| `Spring` | Spring Framework | Java |
| `SpringBoot` | Spring Boot | Java |
| `Express` | Express.js | Node |
| `Fastify` | Fastify | Node |
| `NestJs` | NestJS | Node |
| `FastApi` | FastAPI | Python |
| `Flask` | Flask | Python |
| `Django` | Django | Python |
| `Rails` | Ruby on Rails | Ruby |
| `Gin` | Gin | Go |
| `Echo` | Echo | Go |
| `Actix` | Actix Web | Rust |
| `Rocket` | Rocket | Rust |
| `AzureFunctions` | Azure Functions | Multi |
| `AwsLambda` | AWS Lambda | Multi |
| `CloudFunctions` | Google Cloud Functions | Multi |
### EntrypointPhase
Execution phase for entrypoints:
| Value | Description |
|-------|-------------|
| `ModuleInit` | Module/assembly initialization |
| `AppStart` | Application startup (Main) |
| `Runtime` | Runtime request handling |
| `Shutdown` | Shutdown/cleanup handlers |
## Node Structure
A `CallgraphNode` represents a symbol (method, function, type) in the call graph:
```json
{
"id": "n001",
"nodeId": "n001",
"name": "GetWeatherForecast",
"kind": "method",
"namespace": "SampleApi.Controllers",
"file": "WeatherForecastController.cs",
"line": 15,
"symbolKey": "SampleApi.Controllers.WeatherForecastController::GetWeatherForecast()",
"artifactKey": "SampleApi.dll",
"visibility": "Public",
"isEntrypointCandidate": true,
"attributes": {
"returnType": "IEnumerable<WeatherForecast>",
"httpMethod": "GET",
"route": "/weatherforecast"
},
"flags": 3
}
```
### Node Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | Yes | Unique identifier within the graph |
| `nodeId` | string | No | Alias for id (v1 schema convention) |
| `name` | string | Yes | Human-readable symbol name |
| `kind` | string | Yes | Symbol kind (method, function, class) |
| `namespace` | string | No | Namespace or module path |
| `file` | string | No | Source file path |
| `line` | int | No | Source line number |
| `symbolKey` | string | No | Canonical symbol key (v1) |
| `artifactKey` | string | No | Reference to containing artifact |
| `visibility` | SymbolVisibility | No | Access visibility |
| `isEntrypointCandidate` | bool | No | Whether node is an entrypoint candidate |
| `purl` | string | No | Package URL for external packages |
| `symbolDigest` | string | No | Content-addressed symbol digest |
| `attributes` | object | No | Additional attributes |
| `flags` | int | No | Bitmask for efficient filtering |
### Symbol Key Format
The `symbolKey` follows a canonical format:
```
{Namespace}.{Type}[`Arity][+Nested]::{Method}[`Arity]({ParamTypes})
```
Examples:
- `System.String::Concat(string, string)`
- `MyApp.Controllers.UserController::GetUser(int)`
- `System.Collections.Generic.List`1::Add(T)`
## Edge Structure
A `CallgraphEdge` represents a call relationship between two symbols:
```json
{
"sourceId": "n001",
"targetId": "n002",
"from": "n001",
"to": "n002",
"type": "call",
"kind": "Static",
"reason": "DirectCall",
"weight": 1.0,
"offset": 42,
"isResolved": true,
"provenance": "static-analysis"
}
```
### Edge Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `sourceId` | string | Yes | Source node ID (caller) |
| `targetId` | string | Yes | Target node ID (callee) |
| `from` | string | No | Alias for sourceId (v1) |
| `to` | string | No | Alias for targetId (v1) |
| `type` | string | No | Legacy edge type |
| `kind` | EdgeKind | No | Edge classification |
| `reason` | EdgeReason | No | Reason for edge existence |
| `weight` | double | No | Confidence weight (0.0-1.0) |
| `offset` | int | No | IL/bytecode offset |
| `isResolved` | bool | No | Whether target was fully resolved |
| `provenance` | string | No | Provenance information |
| `candidates` | string[] | No | Virtual dispatch candidates |
## Entrypoint Structure
A `CallgraphEntrypoint` represents a discovered entrypoint:
```json
{
"nodeId": "n001",
"kind": "Http",
"route": "/api/users/{id}",
"httpMethod": "GET",
"framework": "AspNetCore",
"source": "attribute",
"phase": "Runtime",
"order": 0
}
```
### Entrypoint Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `nodeId` | string | Yes | Reference to the node |
| `kind` | EntrypointKind | Yes | Type of entrypoint |
| `route` | string | No | HTTP route pattern |
| `httpMethod` | string | No | HTTP method (GET, POST, etc.) |
| `framework` | EntrypointFramework | No | Framework exposing the entrypoint |
| `source` | string | No | Discovery source |
| `phase` | EntrypointPhase | No | Execution phase |
| `order` | int | No | Deterministic ordering |
## Determinism Requirements
For reproducible analysis, call graphs must be deterministic:
1. **Stable Ordering**
- Nodes must be sorted by `id` (ordinal string comparison)
- Edges must be sorted by `sourceId`, then `targetId`
- Entrypoints must be sorted by `order`
2. **Enum Serialization**
- All enums serialize as camelCase strings
- Example: `EdgeReason.DirectCall``"directCall"`
3. **Timestamps**
- All timestamps must be UTC ISO 8601 format
- Example: `2025-01-15T10:00:00Z`
4. **Content Hashing**
- The `graphHash` field should contain a stable content hash
- Hash algorithm: SHA-256
- Format: `sha256:{hex-digest}`
## Schema Migration
Legacy call graphs without the `schema` field are automatically migrated:
1. **Schema Field**: Set to `stella.callgraph.v1`
2. **Language Parsing**: String language converted to `CallgraphLanguage` enum
3. **Visibility Inference**: Inferred from symbol key patterns:
- Contains `.Internal.``Internal`
- Contains `._` or `<``Private`
- Default → `Public`
4. **Edge Reason Inference**: Based on legacy `type` field:
- `call`, `direct``DirectCall`
- `virtual`, `callvirt``VirtualCall`
- `newobj``NewObj`
- etc.
5. **Entrypoint Inference**: Built from legacy `roots` and candidate nodes
6. **Symbol Key Generation**: Built from namespace and name if missing
## Validation Rules
Call graphs are validated against these rules:
1. All node `id` values must be unique
2. All edge `sourceId` and `targetId` must reference existing nodes
3. All entrypoint `nodeId` must reference existing nodes
4. Edge `weight` must be between 0.0 and 1.0
5. Artifacts referenced by nodes must exist in the `artifacts` list
## Golden Fixtures
Reference fixtures for testing are located at:
`tests/reachability/fixtures/callgraph-schema-v1/`
| Fixture | Description |
|---------|-------------|
| `dotnet-aspnetcore-minimal.json` | ASP.NET Core application |
| `java-spring-boot.json` | Spring Boot application |
| `node-express-api.json` | Express.js API |
| `go-gin-api.json` | Go Gin API |
| `legacy-no-schema.json` | Legacy format for migration testing |
| `all-edge-reasons.json` | All 13 edge reason codes |
| `all-visibility-levels.json` | All 5 visibility levels |
## Related Documentation
- [Reachability Analysis Technical Reference](../reachability/README.md)
- [Schema Migration Implementation](../../src/Signals/StellaOps.Signals/Parsing/CallgraphSchemaMigrator.cs)
- [SPRINT_1100: CallGraph Schema Enhancement](../implplan/SPRINT_1100_0001_0001_callgraph_schema_enhancement.md)

View File

@@ -0,0 +1,383 @@
# Unknowns Ranking Algorithm Reference
This document describes the multi-factor scoring algorithm used to rank and triage unknowns in the StellaOps Signals module.
## Purpose
When reachability analysis encounters unresolved symbols, edges, or package identities, these are recorded as **unknowns**. The ranking algorithm prioritizes unknowns by computing a composite score from five factors, then assigns each to a triage band (HOT/WARM/COLD) that determines rescan scheduling and escalation policies.
## Scoring Formula
The composite score is computed as:
```
Score = wP × P + wE × E + wU × U + wC × C + wS × S
```
Where:
- **P** = Popularity (deployment impact)
- **E** = Exploit potential (CVE severity)
- **U** = Uncertainty density (flag accumulation)
- **C** = Centrality (graph position importance)
- **S** = Staleness (evidence age)
All factors are normalized to [0.0, 1.0] before weighting. The final score is clamped to [0.0, 1.0].
### Default Weights
| Factor | Weight | Description |
|--------|--------|-------------|
| wP | 0.25 | Popularity weight |
| wE | 0.25 | Exploit potential weight |
| wU | 0.25 | Uncertainty density weight |
| wC | 0.15 | Centrality weight |
| wS | 0.10 | Staleness weight |
Weights must sum to 1.0 and are configurable via `Signals:UnknownsScoring` settings.
## Factor Details
### Factor P: Popularity (Deployment Impact)
Measures how widely the unknown's package is deployed across monitored environments.
**Formula:**
```
P = min(1, log10(1 + deploymentCount) / log10(1 + maxDeployments))
```
**Parameters:**
- `deploymentCount`: Number of deployments referencing the package (from `deploy_refs` table)
- `maxDeployments`: Normalization ceiling (default: 100)
**Rationale:** Logarithmic scaling prevents a single highly-deployed package from dominating scores while still prioritizing widely-used dependencies.
### Factor E: Exploit Potential (CVE Severity)
Estimates the consequence severity if the unknown resolves to a vulnerable component.
**Current Implementation:**
- Returns 0.5 (medium potential) when no CVE association exists
- Future: Integrate KEV lookup, EPSS scores, and exploit database references
**Planned Enhancements:**
- CVE severity mapping (Critical=1.0, High=0.8, Medium=0.5, Low=0.2)
- KEV (Known Exploited Vulnerabilities) flag boost
- EPSS (Exploit Prediction Scoring System) integration
### Factor U: Uncertainty Density (Flag Accumulation)
Aggregates uncertainty signals from multiple sources. Each flag contributes a weighted penalty.
**Flag Weights:**
| Flag | Weight | Description |
|------|--------|-------------|
| `NoProvenanceAnchor` | 0.30 | Cannot verify package source |
| `VersionRange` | 0.25 | Version specified as range, not exact |
| `DynamicCallTarget` | 0.25 | Reflection, eval, or dynamic dispatch |
| `ConflictingFeeds` | 0.20 | Contradictory info from different feeds |
| `ExternalAssembly` | 0.20 | Assembly outside analysis scope |
| `MissingVector` | 0.15 | No CVSS vector for severity assessment |
| `UnreachableSourceAdvisory` | 0.10 | Source advisory URL unreachable |
**Formula:**
```
U = min(1.0, sum(activeFlags × flagWeight))
```
**Example:**
- NoProvenanceAnchor (0.30) + VersionRange (0.25) + MissingVector (0.15) = 0.70
### Factor C: Centrality (Graph Position Importance)
Measures the unknown's position importance in the call graph using betweenness centrality.
**Formula:**
```
C = min(1.0, betweenness / maxBetweenness)
```
**Parameters:**
- `betweenness`: Raw betweenness centrality from graph analysis
- `maxBetweenness`: Normalization ceiling (default: 1000)
**Rationale:** High-betweenness nodes appear on many shortest paths, meaning they're likely to be reached regardless of entry point.
**Related Metrics:**
- `DegreeCentrality`: Number of incoming + outgoing edges (stored but not used in score)
- `BetweennessCentrality`: Raw betweenness value (stored for debugging)
### Factor S: Staleness (Evidence Age)
Measures how old the evidence is since the last successful analysis attempt.
**Formula:**
```
S = min(1.0, daysSinceLastAnalysis / maxDays)
```
With exponential decay enhancement (optional):
```
S = 1 - exp(-daysSinceLastAnalysis / tau)
```
**Parameters:**
- `daysSinceLastAnalysis`: Days since `LastAnalyzedAt` timestamp
- `maxDays`: Staleness ceiling (default: 14 days)
- `tau`: Decay constant for exponential model (default: 14)
**Special Cases:**
- Never analyzed (`LastAnalyzedAt` is null): S = 1.0 (maximum staleness)
## Band Assignment
Based on the composite score, unknowns are assigned to triage bands:
| Band | Threshold | Rescan Policy | Description |
|------|-----------|---------------|-------------|
| **HOT** | Score >= 0.70 | 15 minutes | Immediate rescan + VEX escalation |
| **WARM** | 0.40 <= Score < 0.70 | 24 hours | Scheduled rescan within 12-72h |
| **COLD** | Score < 0.40 | 7 days | Weekly batch processing |
Thresholds are configurable:
```yaml
Signals:
UnknownsScoring:
HotThreshold: 0.70
WarmThreshold: 0.40
```
## Scheduler Integration
The `UnknownsRescanWorker` processes unknowns based on their band:
### HOT Band Processing
- Poll interval: 1 minute
- Batch size: 10 items
- Action: Trigger immediate rescan via `IRescanOrchestrator`
- On failure: Exponential backoff, max 3 retries before demotion to WARM
### WARM Band Processing
- Poll interval: 5 minutes
- Batch size: 50 items
- Scheduled window: 12-72 hours based on score within band
- On failure: Increment `RescanAttempts`, re-queue with delay
### COLD Band Processing
- Schedule: Weekly on configurable day (default: Sunday)
- Batch size: 500 items
- Action: Batch rescan job submission
- On failure: Log and retry next week
## Normalization Trace
Each scored unknown includes a `NormalizationTrace` for debugging and replay:
```json
{
"rawPopularity": 42,
"normalizedPopularity": 0.65,
"popularityFormula": "min(1, log10(1 + 42) / log10(1 + 100))",
"rawExploitPotential": 0.5,
"normalizedExploitPotential": 0.5,
"rawUncertainty": 0.55,
"normalizedUncertainty": 0.55,
"activeFlags": ["NoProvenanceAnchor", "VersionRange"],
"rawCentrality": 250.0,
"normalizedCentrality": 0.25,
"rawStaleness": 7,
"normalizedStaleness": 0.5,
"weights": {
"wP": 0.25,
"wE": 0.25,
"wU": 0.25,
"wC": 0.15,
"wS": 0.10
},
"finalScore": 0.52,
"assignedBand": "Warm",
"computedAt": "2025-12-15T10:00:00Z"
}
```
**Replay Capability:** Given the trace, the exact score can be recomputed:
```
Score = 0.25×0.65 + 0.25×0.5 + 0.25×0.55 + 0.15×0.25 + 0.10×0.5
= 0.1625 + 0.125 + 0.1375 + 0.0375 + 0.05
= 0.5125 ≈ 0.52
```
## API Endpoints
### Query Unknowns by Band
```
GET /api/signals/unknowns?band=hot&limit=50&offset=0
```
Response:
```json
{
"items": [
{
"id": "unk-123",
"subjectKey": "myapp|1.0.0",
"purl": "pkg:npm/lodash@4.17.21",
"score": 0.82,
"band": "Hot",
"flags": { "noProvenanceAnchor": true, "versionRange": true },
"nextScheduledRescan": "2025-12-15T10:15:00Z"
}
],
"total": 15,
"hasMore": false
}
```
### Get Score Explanation
```
GET /api/signals/unknowns/{id}/explain
```
Response:
```json
{
"unknown": { /* full UnknownSymbolDocument */ },
"normalizationTrace": { /* trace object */ },
"factorBreakdown": {
"popularity": { "raw": 42, "normalized": 0.65, "weighted": 0.1625 },
"exploitPotential": { "raw": 0.5, "normalized": 0.5, "weighted": 0.125 },
"uncertainty": { "raw": 0.55, "normalized": 0.55, "weighted": 0.1375 },
"centrality": { "raw": 250, "normalized": 0.25, "weighted": 0.0375 },
"staleness": { "raw": 7, "normalized": 0.5, "weighted": 0.05 }
},
"bandThresholds": { "hot": 0.70, "warm": 0.40 }
}
```
## Configuration Reference
```yaml
Signals:
UnknownsScoring:
# Factor weights (must sum to 1.0)
WeightPopularity: 0.25
WeightExploitPotential: 0.25
WeightUncertainty: 0.25
WeightCentrality: 0.15
WeightStaleness: 0.10
# Popularity normalization
PopularityMaxDeployments: 100
# Uncertainty flag weights
FlagWeightNoProvenance: 0.30
FlagWeightVersionRange: 0.25
FlagWeightConflictingFeeds: 0.20
FlagWeightMissingVector: 0.15
FlagWeightUnreachableSource: 0.10
FlagWeightDynamicTarget: 0.25
FlagWeightExternalAssembly: 0.20
# Centrality normalization
CentralityMaxBetweenness: 1000.0
# Staleness normalization
StalenessMaxDays: 14
StalenessTau: 14 # For exponential decay
# Band thresholds
HotThreshold: 0.70
WarmThreshold: 0.40
# Rescan scheduling
HotRescanMinutes: 15
WarmRescanHours: 24
ColdRescanDays: 7
UnknownsDecay:
# Nightly batch decay
BatchEnabled: true
MaxSubjectsPerBatch: 1000
ColdBatchDay: Sunday
```
## Determinism Requirements
The scoring algorithm is fully deterministic:
1. **Same inputs produce identical scores** - Given identical `UnknownSymbolDocument`, deployment counts, and graph metrics, the score will always be the same
2. **Normalization trace enables replay** - The trace contains all raw values and weights needed to reproduce the score
3. **Timestamps use UTC ISO 8601** - All `ComputedAt`, `LastAnalyzedAt`, and `NextScheduledRescan` timestamps are UTC
4. **Weights logged per computation** - The trace includes the exact weights used, allowing audit of configuration changes
## Database Schema
```sql
-- Unknowns table (enhanced)
CREATE TABLE signals.unknowns (
id UUID PRIMARY KEY,
subject_key TEXT NOT NULL,
purl TEXT,
symbol_id TEXT,
callgraph_id TEXT,
-- Scoring factors
popularity_score FLOAT DEFAULT 0,
deployment_count INT DEFAULT 0,
exploit_potential_score FLOAT DEFAULT 0,
uncertainty_score FLOAT DEFAULT 0,
centrality_score FLOAT DEFAULT 0,
degree_centrality INT DEFAULT 0,
betweenness_centrality FLOAT DEFAULT 0,
staleness_score FLOAT DEFAULT 0,
days_since_last_analysis INT DEFAULT 0,
-- Composite score and band
score FLOAT DEFAULT 0,
band TEXT DEFAULT 'cold' CHECK (band IN ('hot', 'warm', 'cold')),
-- Metadata
flags JSONB DEFAULT '{}',
normalization_trace JSONB,
rescan_attempts INT DEFAULT 0,
last_rescan_result TEXT,
-- Timestamps
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
last_analyzed_at TIMESTAMPTZ,
next_scheduled_rescan TIMESTAMPTZ
);
-- Indexes for band-based queries
CREATE INDEX idx_unknowns_band ON signals.unknowns(band);
CREATE INDEX idx_unknowns_score ON signals.unknowns(score DESC);
CREATE INDEX idx_unknowns_next_rescan ON signals.unknowns(next_scheduled_rescan)
WHERE next_scheduled_rescan IS NOT NULL;
CREATE INDEX idx_unknowns_subject ON signals.unknowns(subject_key);
```
## Metrics and Observability
The following metrics are exposed for monitoring:
| Metric | Type | Description |
|--------|------|-------------|
| `signals_unknowns_total` | Gauge | Total unknowns by band |
| `signals_unknowns_rescans_total` | Counter | Rescans triggered by band |
| `signals_unknowns_scoring_duration_seconds` | Histogram | Scoring computation time |
| `signals_unknowns_band_transitions_total` | Counter | Band changes (e.g., WARM->HOT) |
## Related Documentation
- [Unknowns Registry](./unknowns-registry.md) - Data model and API for unknowns
- [Reachability Analysis](./reachability.md) - Reachability scoring integration
- [Callgraph Schema](./callgraph-formats.md) - Graph structure for centrality computation

View File

@@ -46,6 +46,22 @@ All endpoints are additive; no hard deletes. Payloads must include tenant bindin
- Policy can block `not_affected` claims when `unknowns_pressure` exceeds thresholds.
- UI/CLI show unknown chips with reason and depth; operators can triage or suppress.
### 5.1 Multi-Factor Ranking
Unknowns are ranked using a 5-factor scoring algorithm that computes a composite score from:
- **Popularity (P)** - Deployment impact based on usage count
- **Exploit Potential (E)** - CVE severity if known
- **Uncertainty (U)** - Accumulated flag weights
- **Centrality (C)** - Graph position importance (betweenness)
- **Staleness (S)** - Evidence age since last analysis
Based on the composite score, unknowns are assigned to triage bands:
- **HOT** (score >= 0.70): Immediate rescan, 15-minute scheduling
- **WARM** (0.40 <= score < 0.70): Scheduled rescan within 12-72h
- **COLD** (score < 0.40): Weekly batch processing
See [Unknowns Ranking Algorithm](./unknowns-ranking.md) for the complete formula reference.
## 6. Storage & CAS
- Primary store: append-only KV/graph in Mongo (collections `unknowns`, `unknown_metrics`).