# Signals Provenance Contract v1.0.0

**Status:** APPROVED  
**Version:** 1.0.0  
**Effective:** 2025-12-19  
**Owner:** Signals Guild + Platform Storage Guild  
**Sprint:** SPRINT_0140_0001_0001 (unblocks SIGNALS-24-002, 24-003, 24-004, 24-005)

---

## 1. Purpose

This contract defines provenance tracking for runtime facts, callgraph storage, and CAS (Content-Addressable Storage) promotion policies. It enables deterministic, auditable signal processing with signed manifests and attestations.

## 2. Schema References

| Schema | Location |
|--------|----------|
| Provenance Feed | `docs/schemas/provenance-feed.schema.json` |
| Runtime Facts | `docs/signals/runtime-facts.md` |
| Reachability Input | `docs/modules/policy/contracts/reachability-input-contract.md` |

## 3. CAS Storage Architecture

### 3.1 Bucket Structure

```
cas://signals/
├── callgraphs/
│   ├── {tenant}/
│   │   ├── {graph_id}.ndjson.zst        # Compressed callgraph
│   │   └── {graph_id}.meta.json         # Callgraph metadata
│   └── global/
│       └── ...
├── manifests/
│   ├── {graph_id}.json                  # Signed manifest
│   └── {graph_id}.json.dsse             # DSSE envelope
├── runtime-facts/
│   ├── {tenant}/
│   │   ├── {batch_id}.ndjson.zst        # Runtime fact batch
│   │   └── {batch_id}.provenance.json   # Provenance record
│   └── global/
│       └── ...
└── attestations/
    └── {batch_id}.dsse                  # Batch attestation
```

### 3.2 Access Policies

| Principal | callgraphs | manifests | runtime-facts | attestations |
|-----------|------------|-----------|---------------|--------------|
| Signals Service | read/write | read/write | read/write | read/write |
| Policy Engine | read | read | read | read |
| Scanner Worker | write | - | - | - |
| Audit Service | read | read | read | read |
| All Others | deny | deny | deny | deny |

### 3.3 Retention Policies

| Content Type | Retention | GC Policy |
|--------------|-----------|-----------|
| Manifests | Indefinite | Never delete |
| Callgraphs (referenced) | Indefinite | Never delete |
| Callgraphs (orphan) | 30 days | Rolling GC |
| Runtime Facts | 90 days | Rolling GC |
| Attestations | Indefinite | Never delete |

## 4. Manifest Schema

### 4.1 CallgraphManifest

```csharp
using System.Collections.Immutable;

public sealed record CallgraphManifest
{
    /// <summary>Unique graph identifier (ULID).</summary>
    public required string GraphId { get; init; }

    /// <summary>SHA-256 digest of callgraph content.</summary>
    public required string Digest { get; init; }

    /// <summary>Programming language.</summary>
    public required string Language { get; init; }

    /// <summary>Source identifier (scanner, analyzer, runtime agent).</summary>
    public required string Source { get; init; }

    /// <summary>When the callgraph was created.</summary>
    public required DateTimeOffset CreatedAt { get; init; }

    /// <summary>Tenant scope.</summary>
    public required string TenantId { get; init; }

    /// <summary>Component PURL.</summary>
    public required string ComponentPurl { get; init; }

    /// <summary>Entry points discovered.</summary>
    public ImmutableArray<string> EntryPoints { get; init; }

    /// <summary>Node count in the graph.</summary>
    public int NodeCount { get; init; }

    /// <summary>Edge count in the graph.</summary>
    public int EdgeCount { get; init; }

    /// <summary>Signing key ID.</summary>
    public string? SignerKeyId { get; init; }

    /// <summary>Signature (Base64).</summary>
    public string? Signature { get; init; }

    /// <summary>Rekor log UUID if transparency-logged.</summary>
    public string? RekorUuid { get; init; }
}
```

### 4.2 JSON Example

```json
{
  "graphId": "01HWXYZ123456789ABCDEFGHJK",
  "digest": "sha256:7d9cd5f1a2a0dd9a41a2c43a5b7d8a0bcd9e34cf39b3f43a70595c834f0a4aee",
  "language": "javascript",
  "source": "stella-callgraph-node",
  "createdAt": "2025-12-19T10:00:00Z",
  "tenantId": "tenant-001",
  "componentPurl": "pkg:npm/%40acme/backend@1.2.3",
  "entryPoints": ["src/index.js", "src/server.js"],
  "nodeCount": 1523,
  "edgeCount": 4892,
  "signerKeyId": "signals-signer-2025-001",
  "signature": "base64...",
  "rekorUuid": "24296fb24b8ad77a..."
}
```

## 5. Runtime Facts Provenance

### 5.1 ProvenanceRecord

```csharp
using System.Collections.Immutable;

public sealed record RuntimeFactProvenance
{
    /// <summary>Provenance record ID (ULID).</summary>
    public required string ProvenanceId { get; init; }

    /// <summary>Callgraph ID this fact batch relates to.</summary>
    public required string CallgraphId { get; init; }

    /// <summary>Batch ID for this fact set.</summary>
    public required string BatchId { get; init; }

    /// <summary>When facts were ingested.</summary>
    public required DateTimeOffset IngestedAt { get; init; }

    /// <summary>When facts were received from source.</summary>
    public required DateTimeOffset ReceivedAt { get; init; }

    /// <summary>Tenant scope.</summary>
    public required string TenantId { get; init; }

    /// <summary>Source host/service.</summary>
    public required string Source { get; init; }

    /// <summary>Pipeline version (git SHA or build ID).</summary>
    public required string PipelineVersion { get; init; }

    /// <summary>SHA-256 of raw fact blob.</summary>
    public required string ProvenanceHash { get; init; }

    /// <summary>Signing key ID.</summary>
    public string? SignerKeyId { get; init; }

    /// <summary>Rekor log UUID if transparency-logged.</summary>
    public string? RekorUuid { get; init; }

    /// <summary>Skip reason if not transparency-logged.</summary>
    public string? SkipReason { get; init; }

    /// <summary>Fact count in this batch.</summary>
    public int FactCount { get; init; }

    /// <summary>Fact types included.</summary>
    public ImmutableArray<string> FactTypes { get; init; }
}
```

### 5.2 Enrichment Pipeline

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Runtime Agent   │────▶│ Signals Ingest   │────▶│ CAS Storage     │
│ (runtime-facts) │     │ (provenance)     │     │ (facts+prov)    │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌──────────────────┐
                        │ DSSE Attestation │
                        │ (per batch)      │
                        └──────────────────┘
```

## 6. API Endpoints

### 6.1 Callgraph Management

| Endpoint | Method | Description |
|----------|--------|-------------|
| `POST /signals/callgraphs` | POST | Store new callgraph |
| `GET /signals/callgraphs/{graphId}` | GET | Retrieve callgraph |
| `GET /signals/callgraphs/{graphId}/manifest` | GET | Get signed manifest |
| `GET /signals/callgraphs/by-purl/{purl}` | GET | Find by component PURL |

### 6.2 Runtime Facts

| Endpoint | Method | Description |
|----------|--------|-------------|
| `POST /signals/runtime-facts` | POST | Ingest runtime fact batch |
| `GET /signals/runtime-facts/{batchId}` | GET | Retrieve fact batch |
| `GET /signals/runtime-facts/{batchId}/provenance` | GET | Get provenance record |
| `GET /signals/runtime-facts/ndjson` | GET | Stream facts (with provenance) |

### 6.3 Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `tenant` | string | Filter by tenant |
| `callgraph_id` | string | Filter by callgraph |
| `since` | datetime | Facts after timestamp |
| `include_provenance` | bool | Include `provenance_hash` and `callgraph_id` |

## 7. Signing and Attestation

### 7.1 Manifest Signing

When `SignManifests` is enabled, callgraph manifests are signed using:

- Algorithm: `ECDSA-P256-SHA256` or `Ed25519`
- Key management: Via Authority service key registry
- Transparency: Optional Sigstore Rekor logging

```csharp
public interface IManifestSigner
{
    Task<SignedManifest> SignAsync(
        CallgraphManifest manifest,
        CancellationToken cancellationToken = default);

    Task<bool> VerifyAsync(
        SignedManifest signedManifest,
        CancellationToken cancellationToken = default);
}
```

### 7.2 Batch Attestation

Runtime fact batches are attested using in-toto/DSSE:

```csharp
using System.Collections.Immutable;

public sealed record RuntimeFactAttestation
{
    public required string PredicateType { get; init; } // "https://stella.ops/attestation/runtime-facts/v1"
    public required string BatchId { get; init; }
    public required string ProvenanceHash { get; init; }
    public required int FactCount { get; init; }
    public required DateTimeOffset Timestamp { get; init; }
    public required ImmutableArray<string> Subjects { get; init; } // callgraph IDs
}
```

## 8. Telemetry

### 8.1 Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `signals_callgraphs_stored_total` | counter | `language`, `tenant` | Callgraphs stored |
| `signals_callgraph_nodes_total` | histogram | `language` | Nodes per callgraph |
| `signals_runtime_facts_ingested_total` | counter | `fact_type`, `tenant` | Facts ingested |
| `signals_runtime_facts_batch_size` | histogram | - | Facts per batch |
| `signals_provenance_records_total` | counter | - | Provenance records created |
| `signals_attestations_created_total` | counter | `result` | DSSE attestations created |
| `signals_cas_operations_total` | counter | `operation`, `result` | CAS operations |

### 8.2 Alerts

```yaml
groups:
  - name: signals-provenance
    rules:
      - alert: SignalsAttestationFailure
        expr: increase(signals_attestations_created_total{result="failure"}[5m]) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Runtime fact attestation failures detected"

      - alert: SignalsProvenanceMissing
        expr: signals_runtime_facts_ingested_total - signals_provenance_records_total > 100
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Runtime facts missing provenance records"
```

## 9. Configuration

```yaml
# etc/signals.yaml
Signals:
  CAS:
    BucketPrefix: "cas://signals"
    WriteEnabled: true
    RetentionDays:
      RuntimeFacts: 90
      OrphanCallgraphs: 30
  Provenance:
    Enabled: true
    SignManifests: true
    AttestBatches: true
    RekorEnabled: true  # Set to false for air-gapped deployments
  Signing:
    KeyId: "signals-signer-2025-001"
    Algorithm: "ECDSA-P256-SHA256"
```

## 10. Validation Rules

1. `GraphId` must be a valid ULID
2. `Digest` must be a valid `sha256:`-prefixed hex string
3. `Language` must be a known language identifier
4. `TenantId` must exist in the Authority tenant registry
5. `ComponentPurl` must be a valid Package URL
6. `ProvenanceHash` must match the recomputed hash of the fact blob
7. Manifests must have a valid signature if `SignManifests: true`
8. Attestations must have a valid DSSE envelope

---

## Changelog

| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2025-12-19 | Initial release - unblocks SIGNALS-24-002 through 24-005 |
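
## Appendix: Validation Sketch (non-normative)

Validation rules 1, 2, and 6 from Section 10 are mechanical enough to illustrate in code. The following is a minimal sketch, not the service's implementation, and it is written in Python rather than the C# used for the records above so it can be run standalone. All function names are hypothetical. It assumes ULIDs are presented in canonical uppercase Crockford Base32, that digests use lowercase hex, and that `ProvenanceHash` carries the same `sha256:` prefix as `Digest`; the contract does not state any of these explicitly. Rules that need external lookups (tenant registry, PURL parsing, signature and DSSE verification) are left out.

```python
import hashlib
import re

# ULID: 26 Crockford Base32 characters (alphabet excludes I, L, O, U).
# The first character must be 0-7 so the value fits in 128 bits.
# Assumption: only the canonical uppercase form is accepted.
ULID_RE = re.compile(r"[0-7][0-9A-HJKMNP-TV-Z]{25}")

# Assumption: digests are "sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def is_valid_ulid(value: str) -> bool:
    """Rule 1: the identifier must be a well-formed ULID."""
    return ULID_RE.fullmatch(value) is not None


def is_valid_digest(value: str) -> bool:
    """Rule 2: the digest must be sha256-prefixed hex."""
    return DIGEST_RE.fullmatch(value) is not None


def compute_provenance_hash(fact_blob: bytes) -> str:
    """Hash the raw fact blob (assumed format: sha256:<hex>)."""
    return "sha256:" + hashlib.sha256(fact_blob).hexdigest()


def provenance_hash_matches(recorded: str, fact_blob: bytes) -> bool:
    """Rule 6: the recorded hash must equal the recomputed hash."""
    return recorded == compute_provenance_hash(fact_blob)


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of rule violations for a manifest in its JSON form."""
    errors = []
    if not is_valid_ulid(manifest.get("graphId", "")):
        errors.append("graphId: not a valid ULID")
    if not is_valid_digest(manifest.get("digest", "")):
        errors.append("digest: expected sha256:<64 hex chars>")
    return errors
```

Passing the JSON example from Section 4.2 to `validate_manifest` returns an empty list, since its `graphId` and `digest` satisfy both format checks. Returning a list of violations rather than raising on the first one is a design choice for this sketch: it lets a caller report every failing field in a single response.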