tests fixes and sprints work

This commit is contained in:
master
2026-01-22 19:08:46 +02:00
parent c32fff8f86
commit 726d70dc7f
881 changed files with 134434 additions and 6228 deletions


@@ -0,0 +1,170 @@
# Product Advisory: Golden Corpus Patch-Paired Artifacts
> **Date:** 2026-01-21
> **Status:** ARCHIVED - Translated to sprint tasks
> **Archive Date:** 2026-01-21
---
## Advisory Summary
This advisory proposed building a **permissively-licensed "golden corpus" of patch-paired artifacts** and a **minimal offline harness** to prove SBOM reproducibility and binary-level patch provenance.
### Key Value Proposition
If Stella Ops can **prove** (offline) that a shipped binary matches a fixed advisory and that its SBOM is deterministic, it enables:
- **Auditor-ready evidence bundles** for air-gapped customers
- **Clear moat signals** competitors lack
- **Verifiable patch provenance** independent of package metadata
---
## Original Proposal
### Corpus Sources
| Source | Type | URL |
|--------|------|-----|
| Debian Security Tracker / DSAs | Advisory | https://www.debian.org/security/ |
| Debian Snapshot | Binary archive | https://snapshot.debian.org |
| Ubuntu Security Notices (USN) | Advisory | https://ubuntu.com/security/notices |
| Alpine secdb | Advisory YAML | https://github.com/alpinelinux/alpine-secdb |
| OSV full dump | Unified schema | https://osv.dev |
### Dataset Selection Rules
1. Primary advisory present (DSA/USN/secdb) naming package + fixed version(s)
2. Patch-paired artifacts available (both pre-fix and post-fix)
3. Permissive licensing (MIT/Apache/BSD)
4. Reproducible-build tractability
### Proposed KPIs
| KPI | Target |
|-----|--------|
| Per-function match rate | >= 90% |
| False-negative patch detection | <= 5% |
| SBOM canonical-hash stability | 3/3 |
| Binary reconstruction equivalence | Track trend |
| End-to-end offline verify time | Track trend |
### Six-Week Deliverable Plan
- Wk 1-2: Mirror & pick 10 targets
- Wk 2-3: Canonical SBOM PoC
- Wk 3-4: Lifter/Matcher PoC
- Wk 5-6: End-to-end bundle & verifier
---
## Implementation Status
### Existing Capabilities (Pre-Advisory)
The following infrastructure already existed in the codebase:
| Component | Location | Status |
|-----------|----------|--------|
| Ground-truth corpus infrastructure | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.*` | EXISTS |
| Golden set schema (YAML) | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/` | EXISTS |
| Delta-sig framework | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/` | EXISTS |
| SBOM canonicalization | `src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Writers/SpdxWriter.cs` | EXISTS |
| AirGap bundle format v2.0.0 | `src/AirGap/__Libraries/StellaOps.AirGap.Bundle/` | EXISTS |
| Semantic analysis library | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/` | EXISTS |
| Symbol source abstractions | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Abstractions/` | EXISTS |
### Gaps Identified
1. Validation harness orchestration (the "glue")
2. Complete symbol source connector implementations
3. Corpus-level data governance (deduplication, versioning)
4. CLI commands for corpus management
5. CI regression gates for KPIs
6. Offline evidence bundle export/import
7. Doctor health checks for corpus infrastructure
---
## Sprint Deliverables
This advisory was translated into three implementation sprints:
### Sprint 034 - Foundation (Weeks 1-2)
**File:** `docs/implplan/SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md`
**Deliverables:**
- Local mirror layer for corpus sources
- Debuginfod symbol source connector (complete)
- Validation harness skeleton
- KPI tracking schema and baseline infrastructure
- 10 seed targets documented
### Sprint 035 - Connectors & CLI (Weeks 3-4)
**File:** `docs/implplan/SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli.md`
**Deliverables:**
- Ubuntu ddeb connector
- Debian buildinfo connector
- Alpine secdb connector
- SBOM canonical-hash stability KPI
- CLI commands (`stella groundtruth ...`)
- OSV cross-correlation
### Sprint 036 - Bundle & Verification (Weeks 5-6)
**File:** `docs/implplan/SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification.md`
**Deliverables:**
- Offline corpus bundle export
- Offline corpus bundle import and verification
- Standalone offline verifier binary
- Doctor health checks
- CI regression gates
- Documentation and runbooks
---
## Documentation Updates
The following documentation was created or updated as part of this advisory processing:
| Document | Status |
|----------|--------|
| `docs/benchmarks/golden-corpus-kpis.md` | CREATED |
| `docs/benchmarks/golden-corpus-seed-list.md` | CREATED |
| `docs/modules/binary-index/architecture.md` | UPDATED (Section 10 added) |
---
## References
### External Sources
- [Debian Security Tracker](https://www.debian.org/security/)
- [Debian Snapshot](https://snapshot.debian.org)
- [Ubuntu Security Notices](https://ubuntu.com/security/notices)
- [Alpine secdb](https://github.com/alpinelinux/alpine-secdb)
- [OSV Data Sources](https://google.github.io/osv.dev/data/)
- [Chromium Courgette/Zucchini](https://www.chromium.org/developers/design-documents/software-updates-courgette/)
- [zchunk](https://github.com/zchunk/zchunk)
### Internal Documentation
- [BinaryIndex Architecture](../../../docs/modules/binary-index/architecture.md)
- [Ground-Truth Corpus Specification](../../../docs/benchmarks/ground-truth-corpus.md)
- [Golden Corpus KPIs](../../../docs/benchmarks/golden-corpus-kpis.md)
---
## Archive Notes
This advisory has been fully processed:
- [x] Gaps identified and documented
- [x] Sprint tasks created (034, 035, 036)
- [x] Documentation created/updated
- [x] Architecture docs updated with KPIs and corpus sources
- [x] Advisory archived with implementation references
**Archive Reason:** Advisory fully translated into sprint tasks and documentation.


@@ -0,0 +1,51 @@
# Archive Manifest: Golden Corpus Patch-Paired Artifacts
> **Archive Date:** 2026-01-21
> **Archive Reason:** Advisory translated to sprint tasks and documentation
## Archived Files
| File | Description |
|------|-------------|
| `21-Jan-2026 - Golden Corpus Patch-Paired Artifacts.md` | Original advisory with implementation status |
## Implementation References
### Sprint Files
| Sprint | File | Status |
|--------|------|--------|
| 034 | `docs/implplan/SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md` | TODO |
| 035 | `docs/implplan/SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli.md` | TODO |
| 036 | `docs/implplan/SPRINT_20260121_036_BinaryIndex_golden_corpus_bundle_verification.md` | TODO |
### Documentation Created
| Document | Path |
|----------|------|
| KPI Specification | `docs/benchmarks/golden-corpus-kpis.md` |
| Seed List | `docs/benchmarks/golden-corpus-seed-list.md` |
### Documentation Updated
| Document | Path | Change |
|----------|------|--------|
| BinaryIndex Architecture | `docs/modules/binary-index/architecture.md` | Added Section 10: Golden Corpus |
## Maturity Assessment
**Pre-Advisory Maturity:** 60-70%
The codebase had strong foundational components:
- Ground-truth infrastructure with symbol source abstractions
- Complete SBOM canonicalization and SPDX 3.0.1 support
- DeltaSig v2 predicate framework with VEX integration
- AirGap bundle format with offline verification
- Reproducibility validation primitives
**Gaps Addressed by Sprints:**
- Validation harness orchestration
- Complete symbol source connector implementations
- CLI commands and user-facing workflows
- CI regression gates
- Offline evidence bundle export/import


@@ -0,0 +1,90 @@
# Archive Manifest: Delta-Sig Predicate Advisory
**Archived**: 2026-01-22
**Status**: Superseded by existing implementation
**Disposition**: No action required - functionality already implemented
---
## Advisory Summary
The advisory proposed a DSSE-signed "delta-signature predicate" for proving byte-level changes in images/SBOMs with:
- `hunks[]` for byte-level patch evidence
- `original/patched.sbom_cdx_hash` for SBOM linking
- RFC 8785 JCS canonicalization
- OCI referrer storage
- Rekor transparency log recording
- Optional `function_fp` fingerprints
---
## Why Archived (Already Implemented)
The Stella Ops codebase has **more sophisticated implementations** of all proposed functionality:
### 1. Delta Signatures (Function-Level, Not Byte-Level)
- **Existing**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- `DeltaSigPredicate.cs` (v1) - function-level binary diffs
- `DeltaSigPredicateV2.cs` (v2) - with symbol provenance & IR diffs
- `DeltaSignatureGenerator.cs`, `DeltaSignatureMatcher.cs`
- **Predicate URIs**:
- `https://stellaops.dev/delta-sig/v1`
- `https://stella-ops.org/predicates/deltasig/v2`
- **Advantage over advisory**: Tracks **function semantics** (IR hashes, semantic similarity scores) rather than raw byte hunks, which is more resilient to compiler variations.
### 2. DSSE Signing
- **Existing**: `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Signing/DsseSigningService.cs`
- Supports ECDSA P-256, Ed25519, RSA-PSS
### 3. RFC 8785 JCS Canonicalization
- **Existing**: `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/Rfc8785JsonCanonicalizer.cs`
- Includes NFC Unicode normalization for cross-platform stability
### 4. SBOM Canonicalization
- **Existing**: `src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Canonicalization/SbomCanonicalizer.cs`
- **Documentation**: `docs/sboms/DETERMINISM.md` (comprehensive guide)
### 5. SBOM Delta Predicates
- **Existing schema**: `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Schemas/sbom-delta.v1.schema.json`
- Tracks component-level changes: added, removed, version changes
### 6. OCI Referrer Storage
- **Existing**: `src/Attestor/__Libraries/StellaOps.Attestor.Oci/Services/OrasAttestationAttacher.cs`
- OCI Distribution Spec 1.1 compliant, cosign compatible
### 7. Rekor Integration
- **Existing**: `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Infrastructure/Rekor/RekorBackendResolver.cs`
- V2 tile-based verification support
### 8. CLI Tooling
- **Existing**: `src/Cli/StellaOps.Cli/Commands/DeltaSig/DeltaSigCommandGroup.cs`
- Commands: `extract`, `author`, `sign`, `verify`, `match`, `pack`, `inspect`
---
## Key Files for Reference
| Component | Path |
|-----------|------|
| DeltaSig v1 Predicate | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicate.cs` |
| DeltaSig v2 Predicate | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicateV2.cs` |
| DSSE Signing | `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Signing/DsseSigningService.cs` |
| RFC 8785 JCS | `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/Rfc8785JsonCanonicalizer.cs` |
| SBOM Canonicalizer | `src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Canonicalization/SbomCanonicalizer.cs` |
| OCI Attacher | `src/Attestor/__Libraries/StellaOps.Attestor.Oci/Services/OrasAttestationAttacher.cs` |
| SBOM Delta Schema | `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Schemas/sbom-delta.v1.schema.json` |
| CLI Commands | `src/Cli/StellaOps.Cli/Commands/DeltaSig/DeltaSigCommandGroup.cs` |
| Determinism Docs | `docs/sboms/DETERMINISM.md` |
---
## Potential Minor Enhancements (Optional)
1. **Predicate Type URI alignment**: Could standardize on `https://stella-ops.org/...` vs `https://stellaops.dev/...`
2. **Documentation**: Could add formal schema documentation to `docs/modules/binary-index/delta-sig-predicate-spec.md`
---
## Reviewer Notes
The advisory describes a **simplified version** of what is already a **mature system**. The existing implementation is architecturally superior for backport detection because it operates at the function semantic level rather than raw bytes, which handles compiler/optimization variations.


@@ -0,0 +1,79 @@
# Archive Manifest: eBPF Witness Contract Advisory
**Archived**: 2026-01-22
**Status**: Partially implemented - simplified action items identified
**Disposition**: Archive with sprint tasks for remaining gaps
---
## Advisory Summary
The advisory proposed an eBPF-based witness contract for cryptographically proving runtime code execution paths with:
- eBPF probe types (kprobe, uprobe, tracepoint, USDT)
- Build ID extraction for binary provenance
- `stella.ops/ebpfWitness@v1` predicate type
- JCS canonicalization + DSSE signing
- Offline replay verification
- Rekor transparency log integration
---
## Implementation Assessment (~85% Complete)
### Already Implemented
| Component | Location | Coverage |
|-----------|----------|----------|
| eBPF capture abstraction | `src/RuntimeInstrumentation/StellaOps.RuntimeInstrumentation.Linux/Adapters/LinuxEbpfCaptureAdapter.cs` | Full |
| Tetragon integration | `src/RuntimeInstrumentation/StellaOps.RuntimeInstrumentation.Tetragon/TetragonWitnessBridge.cs` | Full |
| Build ID extraction | `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.SymbolInfo/SymbolInfo.cs` | Full |
| Runtime witness predicates | `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/RuntimeWitnessPredicateTypes.cs` | Full |
| DSSE signing | `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Signing/DsseSigningService.cs` | Full |
| JCS canonicalization | `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/Rfc8785JsonCanonicalizer.cs` | Full |
| Rekor V2 integration | `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Infrastructure/Rekor/RekorBackendResolver.cs` | Full |
| Witness CLI commands | `src/Cli/StellaOps.Cli/Commands/WitnessCommandGroup.cs` | Full |
| Zastava CLI commands | `src/Cli/StellaOps.Cli/Commands/ZastavaCommandGroup.cs` | Full |
| Witness viewer UI | `src/Web/StellaOps.Web/src/app/shared/ui/witness-viewer/` | Full |
### Simplified Gap Analysis
**Decision**: Use existing `runtimeWitness@v1` predicate type with `SourceType=Tetragon` rather than creating a separate `ebpfWitness@v1` type. The existing model is sufficient; we only need to add probe-type granularity.
| Gap | Priority | Action |
|-----|----------|--------|
| No `ProbeType` field in `RuntimeObservation` | Medium | Add optional `EbpfProbeType` enum and field |
| No probe-type CLI filtering | Low | Add `--probe-type` to `witness list` |
| No offline replay algorithm docs | Low | Document in Zastava architecture |
---
## Sprint Reference
**Sprint file**: `docs/implplan/SPRINT_20260122_038_Scanner_ebpf_probe_type.md`
### Tasks Created
| Task ID | Description | Status |
|---------|-------------|--------|
| EBPF-001 | Add ProbeType field to RuntimeObservation | TODO |
| EBPF-002 | Update Tetragon parser to populate ProbeType | TODO |
| EBPF-003 | Add --probe-type filter to witness list CLI | TODO |
| EBPF-004 | Document offline replay algorithm | TODO |
---
## Key Files for Reference
| Component | Path |
|-----------|------|
| Tetragon Bridge | `src/RuntimeInstrumentation/StellaOps.RuntimeInstrumentation.Tetragon/TetragonWitnessBridge.cs` |
| eBPF Adapter | `src/RuntimeInstrumentation/StellaOps.RuntimeInstrumentation.Linux/Adapters/LinuxEbpfCaptureAdapter.cs` |
| Predicate Types | `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/RuntimeWitnessPredicateTypes.cs` |
| Witness CLI | `src/Cli/StellaOps.Cli/Commands/WitnessCommandGroup.cs` |
| Zastava Architecture | `docs/modules/zastava/architecture.md` |
---
## Reviewer Notes
The existing implementation covers the advisory's core goals. The Tetragon integration provides production-grade eBPF observation capture with DSSE signing and Rekor publication. The simplified approach adds probe-type granularity to the existing model rather than creating a new predicate type, reducing complexity while still enabling probe-type-specific filtering and policy evaluation.


@@ -0,0 +1,129 @@
# Rekor v2 Tile-Backed PostgreSQL Integration Advisory
> **Source:** ChatGPT-generated advisory
> **Date:** 2026-01-22
> **Status:** Archived (capabilities already exist)
---
Here's a tight game plan to run **Rekor v2 (tile-backed)** with **PostgreSQL for tile metadata** and **object storage for big tile blobs**, so you can bundle it cleanly inside Stella Ops without MySQL.
---
### Why this works (one-liner)
Rekor v2 ("rekor-tiles") already abstracts storage and ships with a modern **tile-backed** design and client SDKs, so adding a Postgres-metadata + object-blob driver fits the upstream model and ops goals. ([GitHub][1])
---
### Minimal Postgres schema (compact)
Use Postgres only for coordinates/indices and small metadata; keep bulk bytes in S3/GCS/MinIO.
```sql
CREATE TABLE tiles (
tile_id UUID PRIMARY KEY,
shard INT NOT NULL,
level INT NOT NULL,
x INT NOT NULL,
y INT NOT NULL,
tile_hash TEXT UNIQUE NOT NULL, -- content hash of the tile/bundle
storage_url TEXT NOT NULL, -- s3://bucket/... or gs://... or minio://...
size_bytes INT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX tile_coords_idx ON tiles(level, x, y);
CREATE INDEX tile_shard_idx ON tiles(shard, level, x, y);
CREATE INDEX tile_hash_idx ON tiles(tile_hash);
```
> If you *must* support tiny single-node/dev installs, you can add `tile_blob BYTEA` behind a feature flag, but avoid it at scale (BYTEA/LO trade-offs get painful for large binaries). ([CYBERTEC PostgreSQL | Services & Support][2])
---
### Storage driver sketch (upstream-friendly)
* **Metadata driver (Postgres):** CRUD for tile rows; coordinate queries; hash lookups.
* **Blob driver (Object store):** `PutTile`, `GetTile` read/write raw tile bytes to `storage_url`.
* **Reader path:** fetch coords+URL from Postgres -> stream bytes from object store.
* **Writer path:** write bytes to object store (get URL & size) -> commit metadata row in Postgres.
* **Dual-read option:** if URL missing, fall back to legacy backend during migration.
Rekor v2's clients read **checkpoints, tiles, bundles** over HTTP/gRPC; your server just needs to expose compatible read endpoints backed by these drivers. ([Go Packages][3])
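A minimal in-memory sketch of the reader/writer paths above (class and field names are illustrative stand-ins, not upstream APIs):

```python
import hashlib
import uuid

class TileStore:
    """Sketch of the two-layer driver: metadata rows (stand-in for Postgres)
    plus a blob backend (stand-in for S3/MinIO). All names are hypothetical."""

    def __init__(self):
        self.meta = {}   # (shard, level, x, y) -> row dict
        self.blobs = {}  # storage_url -> bytes

    def put_tile(self, shard, level, x, y, data: bytes) -> str:
        # Writer path: blob first, metadata commit last, so a crash between
        # the two steps leaves an orphan blob, never a dangling row.
        tile_hash = "sha256:" + hashlib.sha256(data).hexdigest()
        url = f"s3://rekor-tiles/{shard}/{level}/{x}/{y}/{tile_hash}"
        self.blobs[url] = data
        self.meta[(shard, level, x, y)] = {
            "tile_id": str(uuid.uuid4()),
            "tile_hash": tile_hash,
            "storage_url": url,
            "size_bytes": len(data),
        }
        return tile_hash

    def get_tile(self, shard, level, x, y) -> bytes:
        # Reader path: coords -> metadata row -> stream bytes from blob store.
        row = self.meta[(shard, level, x, y)]
        return self.blobs[row["storage_url"]]
```

The same shape works with psycopg against the `tiles` table and any S3-compatible client behind `put_tile`/`get_tile`.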
---
### Migration & cutover (high-level)
1. **Backfill:** iterate existing tiles; upload to object store; insert Postgres rows (hash, coords, URL, size).
2. **Dual-read phase:** serve reads from **Postgres + object store**, still write to old backend.
3. **Client readiness:** follow the v2 client guidance/SigningConfig changes; verify your clients (cosign/sigstore-* SDKs) before flipping. ([Sigstore Blog][4])
4. **Canary writes -> shard cutover:** switch a shard (or % traffic) to new writers, validate, then complete.
5. **Decommission:** once parity checks pass, retire old storage.
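Step 1 (backfill) can be sketched as an idempotent job; every callable here is a hypothetical stand-in for the real object-store and Postgres drivers:

```python
import hashlib

def backfill(legacy_tiles, upload_blob, insert_row, existing_hashes):
    """Idempotent backfill skeleton: iterate legacy tiles, upload bytes,
    commit a metadata row. Re-runs skip already-migrated tiles."""
    migrated = 0
    for coords, data in legacy_tiles:
        tile_hash = "sha256:" + hashlib.sha256(data).hexdigest()
        if tile_hash in existing_hashes:
            continue  # UNIQUE(tile_hash) in Postgres makes re-runs safe
        url = upload_blob(coords, data)                 # object store first
        insert_row(coords, tile_hash, url, len(data))   # then metadata commit
        existing_hashes.add(tile_hash)
        migrated += 1
    return migrated
```

Seeding `existing_hashes` from a `SELECT tile_hash FROM tiles` at startup keeps restarts cheap.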
---
### Compose (dev) snippet idea
```yaml
services:
rekor-v2:
image: yourfork/rekor-tiles:latest
env_file: .env
environment:
TILE_META_DSN: "Host=postgres;Database=rekor;Username=rekor;Password=rekor;SSL Mode=disable"
TILE_BLOB_BACKEND: "s3"
S3_ENDPOINT: "http://minio:9000"
S3_BUCKET: "rekor-tiles"
S3_ACCESS_KEY_ID: "minio"
S3_SECRET_ACCESS_KEY: "miniosecret"
S3_FORCE_PATH_STYLE: "true"
depends_on: [postgres, minio]
postgres:
image: postgres:16
environment:
POSTGRES_DB: rekor
POSTGRES_USER: rekor
POSTGRES_PASSWORD: rekor
minio:
image: quay.io/minio/minio
command: server /data --address ":9000" --console-address ":9001"
environment:
MINIO_ROOT_USER: minio
MINIO_ROOT_PASSWORD: miniosecret
```
(Use Helm for prod; Rekor v2 has charts you can study for flags/healthchecks.) ([Artifact Hub][5])
---
### Ops notes you'll care about
* **Indexing:** hot paths are `(level,x,y)` and `tile_hash` for dedupe/lookup; keep those btree indexes.
* **Blob size policy:** set a cutoff (e.g., >1-5 MB -> object store); avoid Postgres bloat. ([CYBERTEC PostgreSQL | Services & Support][2])
* **Sharding/rotation:** v2 embraces shard-per-URL (CT-style). Plan your S3 prefixes per shard/year. ([Sigstore Blog][4])
* **Telemetry:** follow v2 infra notes (load balancer metrics, alerts) once you fork. ([GitHub][6])
---
### Licensing & forking
Rekor and Rekor-tiles are open source; upstream encourages client compatibility and publishes v2 milestones/blogs. Keep your storage drivers cleanly pluggable and upstreamable to reduce long-term burden. ([GitHub][7])
---
### Next small steps
* Wire a **prototype**: Postgres metadata + MinIO blobs behind the read APIs.
* Add **env-switch** for dual-read & a **backfill job**.
* Run **compat tests** with cosign using the v2 SigningConfig flow before enabling writes. ([Sigstore Blog][4])
Want me to draft the Postgres driver interface (Go) and the backfill job skeleton next?
[1]: https://github.com/sigstore/rekor-tiles?utm_source=chatgpt.com "sigstore/rekor-tiles"
[2]: https://www.cybertec-postgresql.com/en/binary-data-performance-in-postgresql/?utm_source=chatgpt.com "Binary data performance in PostgreSQL"
[3]: https://pkg.go.dev/github.com/sigstore/rekor-tiles/pkg/client/read?utm_source=chatgpt.com "read package - github.com/sigstore/rekor-tiles/pkg/client/read"
[4]: https://blog.sigstore.dev/rekor-v2-ga/?utm_source=chatgpt.com "Rekor v2 GA - Cheaper to run, simpler to maintain"
[5]: https://artifacthub.io/packages/helm/sigstore/rekor-tiles?utm_source=chatgpt.com "rekor-tiles - sigstore"
[6]: https://github.com/sigstore/rekor-tiles/milestone/3?utm_source=chatgpt.com "GA (v2.0) - Milestone #3 - sigstore/rekor-tiles"
[7]: https://github.com/sigstore/rekor?utm_source=chatgpt.com "sigstore/rekor: Software Supply Chain Transparency Log"


@@ -0,0 +1,100 @@
# Archive Manifest: Rekor v2 Tile-Backed PostgreSQL Integration
## Metadata
- **Original Date:** 2026-01-22
- **Archived Date:** 2026-01-22
- **Advisory Title:** Rekor v2 (tile-backed) with PostgreSQL for tile metadata and object storage for blob data
- **Processing Owner:** Planning
## Summary
Product advisory proposing integration of Rekor v2 tile-backed architecture using PostgreSQL for tile metadata and S3/MinIO/GCS object storage for large tile blobs, eliminating MySQL dependency.
After analysis of the existing codebase, **no critical gaps were identified**; the core Rekor v2 functionality is already production-ready in StellaOps.
## Gap Analysis Results
### Existing Capabilities (Already Implemented)
| Advisory Recommendation | Current Implementation | Status |
|------------------------|------------------------|--------|
| Rekor v2 tile-backed architecture | `IRekorTileClient`, `HttpRekorTileClient` | **Complete** |
| PostgreSQL for metadata | `attestor.rekor_root_checkpoints`, `attestor.rekor_submission_queue` | **Complete** |
| RFC 6962 Merkle proof verification | `MerkleProofVerifier`, inclusion proof structures | **Complete** |
| Checkpoint signature verification | `CheckpointSignatureVerifier` (Ed25519/ECDSA) | **Complete** |
| Durable submission queue | `PostgresRekorSubmissionQueue` with exponential backoff | **Complete** |
| Offline verification | `RekorOfflineReceiptVerifier`, checkpoint bundling | **Complete** |
| Tile caching | `FileSystemRekorTileCache` (immutable, SHA-256 indexed) | **Complete** |
| Docker compose support | `devops/compose/docker-compose.rekor-v2.yaml` (POSIX tiles) | **Complete** |
| Background verification | `RekorVerificationJob`, `RekorVerificationService` | **Complete** |
| Time skew validation | `ITimeCorrelationValidator` with configurable thresholds | **Complete** |
| Health checks | Doctor plugin: connectivity, clock skew, job monitoring | **Complete** |
| Metrics & observability | OpenTelemetry: queue depth, verification counts, latency histograms | **Complete** |
| CLI tooling | `stella attest rekor *` commands | **Complete** |
| Budget/rate limiting | Per-tenant limits, burst allowance, queue caps | **Complete** |
| VEX linkage | `excititor.vex_observations` with Rekor columns | **Complete** |
### Optional Future Enhancement (Low Priority)
| Enhancement | Current State | Benefit |
|-------------|--------------|---------|
| S3/MinIO/GCS blob storage for tiles | Using `FileSystemRekorTileCache` | Better for distributed multi-node deployments |
| Tile coordinate indexing (level, x, y) | Using checkpoint-focused schema | Slightly faster tile lookups at extreme scale |
## Decision
**Archive without implementation sprint** - The advisory's core goals are already achieved:
1. **Rekor v2 support**: Fully implemented via `HttpRekorTileClient`
2. **PostgreSQL backend**: Already the standard (no MySQL dependency)
3. **Offline/air-gap support**: Checkpoint bundling and tile caching work
4. **MySQL elimination**: Already using POSIX tiles backend
The S3/MinIO blob storage enhancement is a nice-to-have for specific scale scenarios but is not blocking any current use cases. The existing filesystem cache is sufficient for:
- Single-node deployments
- Development environments
- Air-gap scenarios (tiles are bundled with checkpoints)
## Related Documentation
| Document | Location |
|----------|----------|
| Rekor Verification Design | `docs/modules/attestor/rekor-verification-design.md` |
| Transparency Architecture | `docs/modules/attestor/transparency.md` |
| Offline Verification Guide | `docs/modules/attestor/guides/offline-verification.md` |
| Rekor Policy (Rate Limits) | `docs/operations/rekor-policy.md` |
| Rekor Sync Guide | `docs/operations/rekor-sync-guide.md` |
| Checkpoint Divergence Runbook | `docs/operations/checkpoint-divergence-runbook.md` |
| Rekor Unavailable Runbook | `docs/operations/runbooks/attestor-rekor-unavailable.md` |
## Existing Infrastructure
### Database Tables
- `attestor.rekor_submission_queue` - Durable retry queue
- `attestor.rekor_root_checkpoints` - Checkpoint storage
- `attestor.entries` - Entry tracking with verification metadata
- `excititor.vex_observations` - VEX-Rekor linkage
### Key Source Files
- `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Core/Rekor/` - Core Rekor clients
- `src/Attestor/StellaOps.Attestor/StellaOps.Attestor.Infrastructure/Rekor/` - PostgreSQL implementations
- `devops/compose/docker-compose.rekor-v2.yaml` - Compose overlay
### Test Coverage
- 20+ test files covering unit, integration, and E2E scenarios
- Byzantine fault detection tests
- Offline verification tests
- Queue durability tests
## Advisory Source Reference
- Source: ChatGPT-generated advisory
- Links referenced: sigstore/rekor-tiles, Sigstore Blog, Artifact Hub, CYBERTEC PostgreSQL
## Future Considerations
If distributed tile storage becomes a requirement:
1. Add `ITileBlobStore` interface with S3/MinIO/GCS implementations
2. Extend `tiles` schema with `storage_url` column
3. Update `FileSystemRekorTileCache` to `ObjectStoreTileCache`
4. Add environment variables: `TILE_BLOB_BACKEND`, `S3_ENDPOINT`, `S3_BUCKET`
This would be a straightforward enhancement (~2-3 days) when demand arises.
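As a rough sketch of that seam (a Python stand-in for what would be a C# interface in the Attestor codebase; all names hypothetical):

```python
from abc import ABC, abstractmethod

class TileBlobStore(ABC):
    """Hypothetical ITileBlobStore contract: the filesystem cache and an
    S3/MinIO/GCS client would both implement it."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> str:
        """Store tile bytes; return the storage URL recorded in metadata."""

    @abstractmethod
    def get(self, url: str) -> bytes:
        """Fetch tile bytes previously stored under `url`."""

class InMemoryTileBlobStore(TileBlobStore):
    # Minimal implementation just to show the contract in action.
    def __init__(self):
        self._data = {}

    def put(self, key, data):
        url = f"mem://{key}"
        self._data[url] = data
        return url

    def get(self, url):
        return self._data[url]
```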


@@ -0,0 +1,210 @@
# Deterministic "Trust Score" Algebra (replayable)
**Date:** 2026-01-22
**Status:** Archived - Translated to sprint tasks
**Sprint:** SPRINT_20260122_037_Signals_unified_trust_score_algebra
**Architecture Doc:** docs/technical/scoring-algebra.md
---
**Core idea:**
Aggregate normalized signals with fixed weights, clamp to bounds, and always emit an evidence trail plus an "unknowns" flag so missing data is explicit (never guessed).
**Formula**
* **Score:** `score = clamp(floor, ceil, Σ_i (w_i * s_i))`
* **Unknowns:** `U = 1 - completeness_fraction` (0 = nothing missing, 1 = everything missing)
* If any primary input is missing → flip `unknowns=true`, include a delta note ("what would change if present").
**Signals (examples & ranges)**
* `cvss_v4_base_norm ∈ [0,1]` (CVSS base / 10)
* `kev_flag ∈ {0,1}` (in CISA KEV)
* `rekor_anchor ∈ {0,1}` (entry present)
* `dsse_signed ∈ {0,1}` (valid DSSE chain to trusted root)
* `lifter_match ∈ [0,1]` (binary/source lifter confidence)
* `attestation_age_decay = exp(-λ · days_since_attestation)` (λ tunable; recent = closer to 1)
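As a minimal sketch of the decay signal: with λ = 0.02, an attestation 12 days old decays to about 0.79.

```python
import math

def attestation_age_decay(days_since_attestation: float, lam: float = 0.02) -> float:
    # Recent attestations score near 1; the signal decays exponentially with age.
    return math.exp(-lam * days_since_attestation)
```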
**Weights**
* Versioned, immutable, and themselves part of provenance (e.g., `weights@v2026-01-22.json`).
* Examples (tweak to taste):
`cvss: +0.55, kev: +0.35, rekor: +0.15, dsse: +0.10, lifter: +0.20, age_decay: +0.10`
Negative weights are allowed for "good news" signals (e.g., strong provenance reduces risk).
**Bounds**
* `floor=0`, `ceil=10` by default (or 0-100 if you prefer percent).
**Canonicalization (so runs are replayable)**
* Normalize inputs: CycloneDX/SPDX → canonical form.
* Cryptographic ordering: JCS (JSON Canonicalization Scheme).
* Deterministic transforms only (record transform name + parameters).
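A rough stand-in for the JCS step (note: full RFC 8785 also pins number formatting and string escaping, so this sorted-keys approximation only holds for simple inputs):

```python
import hashlib
import json

def canonical_hash(obj) -> str:
    """Approximate canonical form: sorted keys, no whitespace. A real
    implementation must follow RFC 8785 serialization rules exactly."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=True)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The property that matters for replay: key order in the input never changes the hash.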
**Evidence chain (minimal but sufficient)**
* Ordered list of:
1. Canonical input hashes (SBOM, attestations, KEV snapshot, Rekor query result)
2. Normalized signals `s_i` with exact extraction rules
3. Weight set ID + hash
4. Transform IDs (e.g., `"normalize_spdx@1.1"`, `"apply_age_decay@λ=0.02"`)
5. Final Σ and clamp values
6. Unknowns bit and explicit deltas for each missing primary input
---
## JSON shape (input → output)
**Inputs**
```json
{
"artifact_digest": "sha256:...",
"sbom": { "type": "spdx", "body": "..." },
"attestations": [{ "type": "dsse", "body": "..." }],
"vuln": {
"cvss_v4_base": 7.8,
"kev": true
},
"supply_chain": {
"rekor_entry": true,
"lifter_match": 0.83,
"attested_at_utc": "2026-01-10T12:00:00Z"
},
"params": {
"lambda_age_decay": 0.02,
"bounds": { "floor": 0, "ceil": 10 },
"weights_ref": "weights@v2026-01-22.json"
}
}
```
**Outputs**
```json
{
"score": 10.0,
"bounds": { "floor": 0, "ceil": 10 },
"unknowns": false,
"U": 0.0,
"signals": {
"cvss_v4_base_norm": 0.78,
"kev_flag": 1,
"rekor_anchor": 1,
"dsse_signed": 1,
"lifter_match": 0.83,
"attestation_age_decay": 0.786
},
"weights_ref": "weights@v2026-01-22.json#sha256:...",
"evidence": {
"canonical_inputs": [
{"name":"sbom.spdx", "hash":"sha256:..."},
{"name":"dsse.att","hash":"sha256:..."},
{"name":"kev.snapshot","hash":"sha256:..."},
{"name":"rekor.query","hash":"sha256:..."}
],
"transforms": [
{"name":"canonicalize_spdx","version":"1.1"},
{"name":"normalize_cvss_v4","version":"1.0"},
{"name":"age_decay","params":{"lambda":0.02}}
],
"sum_components": [
{"signal":"cvss_v4_base_norm","w":0.55,"term":0.429},
{"signal":"kev_flag","w":0.35,"term":0.350},
{"signal":"rekor_anchor","w":0.15,"term":0.150},
{"signal":"dsse_signed","w":0.10,"term":0.100},
{"signal":"lifter_match","w":0.20,"term":0.166},
{"signal":"attestation_age_decay","w":0.10,"term":0.079}
],
"sum_raw": 1.274,
"scaled_sum": 10.0
},
"missing_inputs": []
}
```
---
## Handling missing data (no silent guesses)
* If `rekor_entry` is unknown:
* Set `unknowns=true`, include `missing_inputs=["rekor_entry"]`
* Compute `score` **without** it
* Add `"delta_if_present": {"rekor_anchor@1": +0.15}` so reviewers see maximum effect if/when it arrives.
---
## Why this helps (esp. for Stella Ops)
* **Auditable & reproducible:** Same inputs → same score; evidence lets auditors replay.
* **Deterministic merges:** Works cleanly with VEX/policy lattices—this is just the scalar "presentation" layer for dashboards and gates.
* **No hidden heuristics:** All weights are versioned artifacts you can pin in release pipelines.
* **Risk + uncertainty:** Operators see both *risk* and *how much we don't know* (U), which is often the real risk.
---
## Drop-in implementation sketch (pseudocode)
```python
def trust_score(signals, weights, floor=0, ceil=10):
unknowns = [k for k,v in signals.items() if v is None]
completeness = (len(signals)-len(unknowns))/len(signals)
U = 1 - completeness
# treat None as 0 in sum, but keep unknowns bit and deltas
total = 0.0
terms = []
for k, w in weights.items():
s = 0.0 if signals.get(k) is None else signals[k]
term = w * s
terms.append((k, w, s, term))
total += term
# map raw total (usually already in 0..1-ish) into floor..ceil if needed
scaled = max(floor, min(ceil, total if ceil <= 1 else total * ceil))
return scaled, U, unknowns, terms
```
---
## Practical defaults
* Bounds: 0-10
* λ (age decay): `0.02` (half-life ≈ 35 days)
* Start weights (tune later):
* CVSS base 0.55
* KEV 0.35
* Rekor 0.15
* DSSE 0.10
* Lifter 0.20
* Age decay 0.10
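Putting the defaults together: a minimal worked run using the weighted-sum approach sketched above. Signal values here are illustrative only (a CVSS-derived 0.78, a KEV hit, a missing Rekor entry, a 12-day-old attestation):

```python
import math

weights = {
    "cvss_v4_base_norm": 0.55, "kev_flag": 0.35, "rekor_anchor": 0.15,
    "dsse_signed": 0.10, "lifter_match": 0.20, "attestation_age_decay": 0.10,
}
lam, age_days = 0.02, 12  # default lambda; half-life = ln(2)/0.02 ~= 34.7 days
signals = {
    "cvss_v4_base_norm": 0.78,
    "kev_flag": 1.0,
    "rekor_anchor": None,  # unknown -> excluded from the sum, surfaced via U
    "dsse_signed": 1.0,
    "lifter_match": 0.83,
    "attestation_age_decay": math.exp(-lam * age_days),
}
unknowns = [k for k, v in signals.items() if v is None]
U = 1 - (len(signals) - len(unknowns)) / len(signals)
raw = sum(w * (signals[k] or 0.0) for k, w in weights.items())
# raw ~= 1.124; U ~= 0.167; unknowns == ["rekor_anchor"]
```

How the raw sum is scaled onto 0–10 is a presentation choice; the raw sum, the per-term breakdown, and U are what must replay deterministically.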
---
## Archive Note
This advisory has been processed and translated into:
1. **Architecture Documentation:** `docs/technical/scoring-algebra.md`
- Full specification of the trust score algebra
- Signal normalization rules
- Weight manifest schema
- Evidence chain structure
- Input/output contracts
2. **Implementation Sprint:** `SPRINT_20260122_037_Signals_unified_trust_score_algebra`
- 10 tasks covering full implementation
- Weight manifest infrastructure (TSA-001)
- Signal normalizers (TSA-002)
- Evidence chain builder (TSA-003)
- Core scoring engine (TSA-004)
- Determinism verification (TSA-005)
- Unknowns integration (TSA-006)
- API endpoints (TSA-007)
- CLI commands (TSA-008)
- Attestation integration (TSA-009)
- Documentation updates (TSA-010)

# Trust Score Replay Subsystem Advisory
**Date:** 22-Jan-2026
**Status:** Archived (translated to sprint tasks)
**Related Sprint:** SPRINT_20260122_037_Signals_unified_trust_score_algebra.md
---
## Original Advisory Content
### Why this exists (plain English)
Modern pipelines ingest lots of evidence (SBOMs, VEX, KEV lists, runtime witnesses). Teams need a **repeatable** way to normalize that evidence, score risk, and produce **proof** that anyone can independently replay.
### System components (at a glance)
* **Ingestors**: pull SBOMs, VEX, and CISA KEV; accept runtime witnesses.
* **Evidence Normalizer**: canonicalizes inputs + hashes them (stable byte-for-byte representation).
* **Trust Algebra Engine**: deterministic evaluator that turns normalized inputs into a numeric score.
* **Replay Verifier**: replays the exact steps (with versions + hashes) to prove the score.
* **Transparency Anchor**: writes inclusion proofs (e.g., Rekor v2 receipt).
* **Evidence Store**: keeps artifacts as OCI referrers (e.g., "StellaBundle").
* **UI / Audit Export**: human-readable view + downloadable signed replay logs.
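The Evidence Normalizer's "canonicalize + hash" step can be sketched as canonical JSON (sorted keys, no insignificant whitespace) plus SHA-256. This is a simplification of full RFC 8785 canonicalization, which additionally pins number formatting:

```python
import hashlib
import json

def canonical_hash(document: dict) -> str:
    """Stable byte-for-byte digest: same logical content -> same hash."""
    canonical = json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# Key order and whitespace no longer matter once canonicalized:
a = {"name": "sbom.json", "version": 2}
b = {"version": 2, "name": "sbom.json"}
assert canonical_hash(a) == canonical_hash(b)
```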
### Data flow (simple)
**ingest -> normalize -> evaluate -> anchor -> store proof**
### Mini API (essential endpoints)
**POST `/v1/score/evaluate`**
Request:
```json
{
"sbom_ref": "oci://registry/app@sha256:...",
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
"vex_refs": ["oci://.../vex1", "oci://.../vex2"],
"rekor_receipts": ["BASE64-RECEIPT"],
"runtime_witnesses": [{"type":"process","data":"..."}],
"options": {"decay_lambda": 0.015, "weight_set_id": "default-v1"}
}
```
Response:
```json
{
"score_id": "sc_01H...",
"score_value": 8.5,
"unknowns": ["pkg:deb/...?version=?"],
"proof_ref": "oci://.../score-proof@sha256:..."
}
```
**GET `/v1/score/{id}/replay`**
Response:
```json
{
"signed_replay_log_dsse": "BASE64",
"rekor_inclusion": {"logIndex":12345,"rootHash":"..."},
"canonical_inputs": [{"name":"sbom.json","sha256":"..."}]
}
```
### Example signed attestation (DSSE-style)
```json
{
"payloadType": "application/vnd.stella.score+json",
"payload": "eyJzY29yZV92YWwiOjguNSwiZWFjaCI6W3siZGF0YSI6IiN...",
"signatures": [{"sig":"BASE64SIG","keyid":"authority:02"}]
}
```
The **replay log** records: input hashes, normalizer version, evaluator commit SHA, step-by-step algebra decisions, plus the Rekor inclusion proof.
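Replaying starts from the recorded input hashes. A sketch of that first step, using the `canonical_inputs` shape from the `GET /v1/score/{id}/replay` response above (the verifier function name is illustrative):

```python
import hashlib

def verify_canonical_inputs(replay_log, artifacts):
    """Replay step 1: every input named in the log must hash to the recorded
    digest. `artifacts` maps name -> raw bytes fetched from local storage."""
    for entry in replay_log["canonical_inputs"]:
        digest = hashlib.sha256(artifacts[entry["name"]]).hexdigest()
        if digest != entry["sha256"]:
            return False, entry["name"]
    return True, None

sbom = b'{"name":"app","components":[]}'
log = {"canonical_inputs": [
    {"name": "sbom.json", "sha256": hashlib.sha256(sbom).hexdigest()}
]}
ok, bad = verify_canonical_inputs(log, {"sbom.json": sbom})
assert ok and bad is None
```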
### Determinism knobs (useful defaults)
* **Canonicalizer version** pinned per run.
* **Weight set** (e.g., "default-v1") for the Trust Algebra Engine.
* **Time decay** (e.g., `decay_lambda`) to gently drift stale evidence downward without surprise jumps.
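The time-decay knob reads as a simple exponential on evidence age; with the suggested `decay_lambda = 0.015`, evidence loses half its influence after roughly 46 days (ln 2 / 0.015 ≈ 46.2):

```python
import math

def decayed_weight(base_weight: float, age_days: float,
                   decay_lambda: float = 0.015) -> float:
    """Stale evidence drifts down smoothly instead of cliff-dropping."""
    return base_weight * math.exp(-decay_lambda * age_days)

half_life = math.log(2) / 0.015  # ~46.2 days
assert abs(decayed_weight(1.0, half_life) - 0.5) < 1e-9
```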
### 90-day rollout (high level)
* **Weeks 0-2**: freeze algebra + canonicalizer; build golden-corpus replay tests.
* **Weeks 3-6**: implement Trust Algebra Engine, unitized Replay Verifier, and anchoring flow.
* **Weeks 7-10**: expose API; DSSE signing; Rekor v2 anchoring; internal audit.
* **Weeks 11-13**: pilot with two teams (CI gating + triage UI).
* **Week 14 onward**: tune weights, add "lifter" integrations, publish public audit docs, ship a signed **validator CLI** for external auditors.
### What you get out-of-the-box
* **Explainability**: every score is replayable, line-by-line.
* **Interop**: OCI refs for all artifacts; DSSE for signatures; Rekor receipts for transparency.
* **Vendor-safe**: unknowns are explicit; weight sets are swappable without changing code.
* **CI-ready**: single POST for a score, single GET to prove it.
---
## Archive Notes
This advisory was analyzed alongside the earlier "Deterministic Trust Score Algebra" advisory. After deep analysis of existing EWS and Determinization systems, we determined:
1. **Most components already exist** - Ingestors, Evidence Normalizer (partial), Trust Algebra Engine (EWS), Transparency Anchor (Rekor), Evidence Store exist
2. **B+C+D facade approach adopted** - Rather than rewrite, we expose existing systems through unified facade
3. **New additions from this advisory:**
- TSF-011: Explicit `/score/{id}/replay` endpoint with signed DSSE attestation
- TSF-007 expanded: `stella score replay` and `stella score verify` CLI commands
- DSSE payload type: `application/vnd.stella.score+json`
- OCI referrer pattern for replay proofs ("StellaBundle")
See Sprint 037 for full implementation details.

# Archive Manifest: Trust Score Algebra Advisories
## Metadata
- **Original Date:** 2026-01-22
- **Archived Date:** 2026-01-22
- **Advisory Titles:**
1. Deterministic "Trust Score" Algebra (replayable)
2. Trust Score Replay Subsystem
- **Processing Owner:** Planning
## Summary
Two related product advisories proposing a unified, deterministic trust score algebra for aggregating multiple provenance and security signals into one auditable risk number.
After deep analysis of existing EWS, Determinization, and RiskEngine systems, the **B+C+D facade approach** was adopted instead of a full rewrite:
- **B: Unified API** - Single facade combining EWS scores + Determinization entropy
- **C: Versioned weight manifests** - Extract EWS weights to `etc/weights/*.json`
- **D: Unknowns fraction (U)** - Expose Determinization entropy as unified metric
## Gap Analysis Results
### Existing Capabilities (Preserved)
- **EWS (Evidence-Weighted Score):** 6-dimension scoring with guardrails, conflict detection
- **Determinization:** Entropy calculation, confidence decay, content-addressed fingerprints
- **VEX Trust Lattice:** Provenance, coverage, replayability vectors
- **Risk Scoring:** CVSS/KEV/EPSS providers with offline support
- **Rekor Integration:** Transparency anchoring via `RekorSubmissionService`
- **DSSE Signing:** `DsseVerificationReportSigner` for attestations
- **Score Proofs API:** Determinism hashes (policy digest, fingerprints)
### Gaps Addressed via Facade
1. No unified API combining EWS + Determinization -> TSF-002 (UnifiedScoreService)
2. No versioned weight manifests -> TSF-001 (weight manifest files)
3. No user-facing U metric -> TSF-003 (unknowns bands)
4. No delta-if-present for missing signals -> TSF-004
5. No explicit replay endpoint -> TSF-011 (from second advisory)
6. CLI/UI don't expose unified view -> TSF-006, TSF-007, TSF-008
### What We're NOT Doing (per B+C+D decision)
- NOT replacing EWS formula
- NOT replacing Determinization entropy calculation
- NOT changing guardrail logic
- NOT changing conflict detection
- NOT breaking existing CLI commands or API contracts
## Deliverables Created
### Documentation
| File | Description |
|------|-------------|
| `docs/technical/scoring-algebra.md` | Unified trust score architecture (facade approach) |
| `etc/weights/v2026-01-22.weights.json` | Initial weight manifest matching EWS defaults |
### Sprint Tasks
| Sprint | Tasks |
|--------|-------|
| `SPRINT_20260122_037_Signals_unified_trust_score_algebra` | 11 implementation tasks (TSF-001 through TSF-011) |
## Task Summary (B+C+D Facade Approach)
| Task ID | Summary | Status |
|---------|---------|--------|
| TSF-001 | Extract EWS Weights to Manifest Files | TODO |
| TSF-002 | Unified Score Facade Service | TODO |
| TSF-003 | Unknowns Band Mapping | TODO |
| TSF-004 | Delta-If-Present Calculations | TODO |
| TSF-005 | Platform API Endpoints (Score Evaluate) | TODO |
| TSF-006 | CLI `stella gate score` Enhancement | TODO |
| TSF-007 | CLI `stella score` Top-Level Command (incl. replay/verify) | TODO |
| TSF-008 | Console UI Score Display Enhancement | TODO |
| TSF-009 | Determinism & Replay Tests | TODO |
| TSF-010 | Documentation Updates | TODO |
| TSF-011 | Score Replay & Verification Endpoint | TODO |
## API Endpoints (Final)
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/score/evaluate` | POST | Compute unified score |
| `/api/v1/score/{id}/replay` | GET | Fetch signed replay proof |
| `/api/v1/score/weights` | GET | List weight manifests |
| `/api/v1/score/weights/{version}` | GET | Get specific manifest |
## CLI Commands (Final)
| Command | Description |
|---------|-------------|
| `stella gate score evaluate --show-unknowns --show-deltas` | Enhanced gate scoring |
| `stella gate score weights list\|show\|diff` | Weight manifest management |
| `stella score compute` | Direct unified score computation |
| `stella score explain <finding-id>` | Detailed score breakdown |
| `stella score replay <score-id>` | Fetch replay proof |
| `stella score verify <score-id>` | Verify score locally |
## Related Documents
- Architecture: `docs/modules/policy/architecture.md` (Determinization section)
- EWS Design: `docs/modules/policy/design/confidence-to-ews-migration.md`
- Score Proofs: `docs/api/scanner-score-proofs-api.md`
- Scoring Algebra: `docs/technical/scoring-algebra.md`
## Advisory Files
| File | Description |
|------|-------------|
| `22-Jan-2026 - Deterministic Trust Score Algebra.md` | First advisory (scoring formula) |
| `22-Jan-2026 - Trust Score Replay Subsystem.md` | Second advisory (replay/verification) |