# Symbol Manifest v1 Specification
> **Status:** Draft, Sprint 401 (Symbols Server rollout)
> **Owners:** Symbols Guild · Scanner Guild · Runtime Signals Guild · DevOps Guild
## 1. Purpose
Provide a deterministic manifest format for publishing debug symbols, source maps, and runtime lookup metadata. Manifests are DSSE-signed and optionally logged to Rekor so Scanner.Symbolizer and runtime probes can resolve functions in air-gapped or sovereign environments.
## 2. Manifest structure
```json
{
  "schema": "stellaops.symbols/manifest@v1",
  "artifactDigest": "sha256:…", // build or container digest
  "entries": [
    {
      "debugId": "3b2d…ef",
      "os": "linux",
      "arch": "amd64",
      "format": "dwarf",
      "hash": "sha256:…", // hash of blob archive
      "path": "symbols/3b/2d/…/index.zip",
      "size": 1234567,
      "metadata": {
        "lang": "c++",
        "compiler": "clang-16"
      }
    }
  ],
  "sourceMaps": [
    {
      "asset": "app.min.js",
      "debugId": "sourcemap:…",
      "hash": "sha256:…",
      "path": "maps/app.min.js.map"
    }
  ],
  "toolchain": {
    "name": "gha@actions",
    "version": "2025.11.10",
    "builderId": "urn:stellaops:builder:release"
  },
  "provenance": {
    "timestamp": "2025-11-10T09:00:00Z",
    "attestor": "stellaops-ci",
    "reproducible": true
  }
}
```
* `schema` is fixed to `stellaops.symbols/manifest@v1`.
* `entries` covers ELF/PE/Mach-O debug bundles; `sourceMaps` is optional.
* Paths are relative to the blob store root (e.g., MinIO bucket). DSSE signatures cover the canonical JSON (sorted keys, minified).
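The canonical form described above (sorted keys, minified) can be approximated with the Python standard library. This is a minimal sketch, not the reference canonicaliser: the spec does not name a formal scheme such as RFC 8785, and the `canonicalize` helper is a hypothetical name introduced here for illustration.

```python
import hashlib
import json


def canonicalize(manifest: dict) -> bytes:
    """Serialise a manifest with sorted keys and no insignificant whitespace."""
    return json.dumps(
        manifest,
        sort_keys=True,          # deterministic key order at every nesting level
        separators=(",", ":"),   # minified: no spaces after commas or colons
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")


# Two manifests that differ only in key order yield identical bytes,
# so a signature over one also verifies over the other.
a = {"schema": "stellaops.symbols/manifest@v1", "entries": []}
b = {"entries": [], "schema": "stellaops.symbols/manifest@v1"}
assert canonicalize(a) == canonicalize(b)

print(canonicalize(a).decode("utf-8"))
print(hashlib.sha256(canonicalize(a)).hexdigest())
```

Signing the canonical bytes rather than the file as written means that whitespace or key-order changes introduced by tooling do not invalidate the signature.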
## 3. Canonical keys per platform
| Platform | `debugId` derivation | Notes |
|----------|---------------------|-------|
| ELF | NT_GNU_BUILD_ID (`.note.gnu.build-id`) or SHA-256 of `.text` as fallback | Task `SYMS-CLIENT-401-012` |
| PE/COFF | `pdbGuid:pdbAge` from CodeView debug directory | Portable PDB preferred |
| Mach-O | LC_UUID | Use corresponding dSYM when available |
| JVM | JAR SHA-256 + class/method signature triple | ASM-based scanner |
| Node/TS | Asset SHA-256 + sourceMap URL | Includes sourcemap content |
| Go/Rust/C++ | DWARF CU UUID or binary digest + address ranges | Handles stripped symbols |

Derivers live in `IPlatformKeyDeriver` implementations.
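As an illustration of the ELF and PE/COFF rows, the sketch below derives a `debugId` from values that are assumed to have been extracted from the binary already. It is written in Python for brevity and is not one of the actual `IPlatformKeyDeriver` implementations; the function names, the lowercase-hex encoding, and the exact string layout of the PE key are assumptions.

```python
import hashlib
import uuid
from typing import Optional


def elf_debug_id(build_id: Optional[bytes], text_section: bytes) -> str:
    """Prefer the GNU build ID; fall back to SHA-256 of the .text section."""
    if build_id:
        return build_id.hex()  # assumption: lowercase hex encoding
    return hashlib.sha256(text_section).hexdigest()


def pe_debug_id(pdb_guid: uuid.UUID, pdb_age: int) -> str:
    """Join the CodeView GUID and age as `pdbGuid:pdbAge`."""
    return f"{pdb_guid}:{pdb_age}"  # assumption: hyphenated lowercase GUID


# A binary that carries a build ID uses it directly.
print(elf_debug_id(bytes.fromhex("3b2d"), b""))
# A stripped binary without a build-id note falls back to the .text digest.
print(elf_debug_id(None, b"\x90\x90\xc3"))
print(pe_debug_id(uuid.UUID("12345678-1234-5678-1234-567812345678"), 2))
```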
## 4. Upload & verification (`SYMS-INGEST-401-013`)
1. CI builds debug artefacts (PDB/dSYM/ELF DWARF, sourcemaps).
2. `symbols ingest` CLI:
   * Normalises manifest JSON (sorted keys, minified).
   * Signs the manifest via DSSE (keyless or KMS per tenant).
   * Uploads blobs to MinIO/S3 using deterministic prefixes: `symbols/{tenant}/{os}/{arch}/{debugId}/…`.
   * Calls `POST /v1/symbols/upload` with the signed manifest and metadata.
   * Submits manifest DSSE to Rekor (optional but recommended).
3. Symbols.Server validates DSSE, stores manifest metadata in PostgreSQL (`symbol_index` table), and publishes gRPC/REST lookup availability.
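Two of the ingest steps above lend themselves to a short sketch: building the deterministic blob prefix and wrapping the canonical manifest in a DSSE envelope. The pre-authentication encoding (PAE) and envelope fields follow the public DSSE specification; everything else is an assumption made for illustration, including the `PAYLOAD_TYPE` value, the helper names, and the caller-supplied `sign` callable that stands in for keyless or KMS signing.

```python
import base64
from typing import Callable

# Hypothetical media type; the spec does not name the DSSE payloadType.
PAYLOAD_TYPE = "application/vnd.stellaops.symbols.manifest+json"


def blob_prefix(tenant: str, os_name: str, arch: str, debug_id: str) -> str:
    """Build the `symbols/{tenant}/{os}/{arch}/{debugId}/` prefix."""
    segments = [tenant, os_name, arch, debug_id]
    for seg in segments:
        # Refuse segments that could escape the tenant's prefix.
        if not seg or "/" in seg or seg in (".", ".."):
            raise ValueError(f"unsafe path segment: {seg!r}")
    return "symbols/" + "/".join(segments) + "/"


def dsse_pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that actually get signed."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def dsse_envelope(payload: bytes, sign: Callable[[bytes], bytes], keyid: str = "") -> dict:
    """Wrap a canonical manifest in a DSSE envelope using a caller-supplied signer."""
    sig = sign(dsse_pae(PAYLOAD_TYPE, payload))
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


print(blob_prefix("acme", "linux", "amd64", "3b2d"))
# The lambda is a placeholder signer, not a real signature scheme.
print(dsse_envelope(b"{}", sign=lambda message: b"placeholder-signature"))
```

Because the signature covers the PAE bytes rather than the raw payload, the payload type is bound into the signature and cannot be swapped after signing.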
## 5. Resolve APIs (`SYMS-SERVER-401-011`)
* `GET /v1/symbols/resolve?tenant=…&os=…&arch=…&debugId=…`
  Returns blob location, hashes, and manifest metadata (sanitised per tenancy).
* `POST /v1/lookup/addresses`
  Input: `{ debugId, addresses: [0x401000, …] }`
  Output: `[{ addr, function, file, line }]`.
* `GET /v1/manifests/by-artifact/:digest`
  Lists all debug IDs published for a build or image digest.

All lookups require OpTok scopes (`symbols.resolve`). Multi-tenant filtering is enforced at the query level.
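A client-side sketch of how requests to the first two endpoints might be assembled is shown below. The base URL and helper names are hypothetical, authentication is omitted, and sending addresses as JSON integers is an assumption: the `0x401000` notation in the spec is not valid JSON, and the wire encoding is not stated.

```python
import json
from typing import Iterable
from urllib.parse import urlencode


def resolve_url(base: str, tenant: str, os_name: str, arch: str, debug_id: str) -> str:
    """Build the GET /v1/symbols/resolve URL with percent-encoded query parameters."""
    query = urlencode(
        {"tenant": tenant, "os": os_name, "arch": arch, "debugId": debug_id}
    )
    return f"{base.rstrip('/')}/v1/symbols/resolve?{query}"


def lookup_body(debug_id: str, addresses: Iterable[int]) -> str:
    """Build the JSON body for POST /v1/lookup/addresses."""
    # Assumption: addresses travel as plain integers.
    return json.dumps({"debugId": debug_id, "addresses": list(addresses)})


print(resolve_url("https://symbols.example", "acme", "linux", "amd64", "3b2d"))
print(lookup_body("3b2d", [0x401000]))
```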
## 6. Runtime proxy & caching
* Optional `Symbols.Proxy` sidecar runs near runtime probes, caching resolve results on disk with a TTL and a size cap.
* Scanner.Symbolizer and runtime probes check local LRU caches before hitting the server, and fall back to offline bundles in air-gap mode.
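The lookup order described above (local cache, then server, then offline bundle) can be sketched as follows. This is an in-memory illustration only: the real proxy caches on disk, and the `ResolveCache` class, the `resolve` helper, and the two fetch callables are hypothetical names.

```python
import time
from collections import OrderedDict


class ResolveCache:
    """LRU cache with a per-entry TTL and a fixed capacity."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._items[key]      # expired: drop and report a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used


def resolve(key, cache, fetch_remote, fetch_offline):
    """Check the cache, then the server, then the offline bundle."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    try:
        value = fetch_remote(key)
    except OSError:                   # includes ConnectionError and timeouts
        value = fetch_offline(key)
    cache.put(key, value)
    return value
```

Injecting the clock keeps expiry testable without sleeping; a disk-backed variant would keep the same interface and persist entries keyed by `debugId`.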
## 7. Offline bundles (`SYMS-BUNDLE-401-014`)
* `symbols bundle create` generates a TAR archive with:
  * DSSE-signed `SymbolManifest v1`.
  * Blob archives (zip/tar).
  * Rekor checkpoints (if present).
* Bundles are content-addressed (CAS prefix `reachability/symbols/…`) and signed before distribution.
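Content addressing only works if the same inputs always produce the same archive bytes. The sketch below shows one way to build such a TAR with Python's `tarfile` module; it is not the `symbols bundle create` implementation, and the `create_bundle` helper and member names are hypothetical.

```python
import hashlib
import io
import tarfile
from typing import Dict, Tuple


def create_bundle(files: Dict[str, bytes]) -> Tuple[str, bytes]:
    """Pack files into a reproducible TAR and return (digest, archive bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):        # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0                # fixed timestamp
            info.mode = 0o644             # fixed permissions; uid/gid default to 0
            tar.addfile(info, io.BytesIO(data))
    archive = buf.getvalue()
    return "sha256:" + hashlib.sha256(archive).hexdigest(), archive


# Hypothetical member names; insertion order does not affect the digest.
digest_1, _ = create_bundle({"manifest.dsse.json": b"{}", "blobs/index.zip": b"zip"})
digest_2, _ = create_bundle({"blobs/index.zip": b"zip", "manifest.dsse.json": b"{}"})
assert digest_1 == digest_2
print(digest_1)
```

The resulting digest would then serve as the key under the CAS prefix, and the archive would be signed before distribution as described above.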
## 8. Security considerations
* Enforce per-tenant bucket prefixes; optionally replicate “public” symbol sets for vendor-supplied packages.
* DSSE + Rekor ensure tamper detection; Authority manages key rotation routes (GOST/SM/eIDAS) for sovereign deployments.
* Reject uploads where `hash` does not match the uploaded blob or `artifactDigest` is not tied to a known release pipeline.
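The rejection rule above could be enforced on the server with a check along these lines. This is a sketch under stated assumptions: `UploadRejected`, `validate_upload`, and the `known_release_digests` allow-list are hypothetical, and only `sha256:` hashes are handled because that is the only algorithm the manifest example shows.

```python
import hashlib
from typing import Set


class UploadRejected(Exception):
    """Raised when an upload fails validation."""


def validate_upload(entry: dict, blob: bytes, artifact_digest: str,
                    known_release_digests: Set[str]) -> None:
    """Reject blobs whose hash mismatches or whose artifact is unknown."""
    algo, _, expected = entry["hash"].partition(":")
    if algo != "sha256":
        raise UploadRejected(f"unsupported hash algorithm: {algo!r}")
    actual = hashlib.sha256(blob).hexdigest()
    if actual != expected.lower():
        raise UploadRejected("blob hash does not match manifest entry")
    # Assumption: release pipelines register their output digests in advance.
    if artifact_digest not in known_release_digests:
        raise UploadRejected("artifactDigest is not tied to a known release pipeline")


blob = b"debug symbols"
entry = {"hash": "sha256:" + hashlib.sha256(blob).hexdigest()}
validate_upload(entry, blob, "sha256:abc", {"sha256:abc"})  # passes silently
```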
## 9. Related tasks
| Area | Task ID | Notes |
|------|---------|-------|
| Server | `SYMS-SERVER-401-011` | REST/gRPC microservice |
| Client | `SYMS-CLIENT-401-012` | SDK + key derivation |
| CLI | `SYMS-INGEST-401-013` | DSSE-signed manifest upload |
| Offline bundles | `SYMS-BUNDLE-401-014` | Air-gap support |
| Docs | `DOCS-SYMS-70-003` | (this document) |

Future revisions (`@v2`) will extend the manifest with packer classification hints and reachability graph references.