consolidation of some of the modules, localization fixes, product advisories work, qa work

This commit is contained in:
master
2026-03-05 03:54:22 +02:00
parent 7bafcc3eef
commit 8e1cb9448d
3878 changed files with 72600 additions and 46861 deletions

View File

@@ -0,0 +1,52 @@
# Symbols Module
**Status:** Implemented
**Source:** `src/Symbols/`
## Purpose
The Symbols module provides debug symbol storage, resolution, and distribution for binary analysis and call-graph construction. It supports multi-tenant symbol management with DSSE-signed manifests and CAS-backed blob storage.
## Components
**Services:**
- **StellaOps.Symbols.Server** - REST API for symbol management and resolution
- **StellaOps.Symbols.Client** - Client SDK for Scanner/runtime integration
- **StellaOps.Symbols.Ingestor.Cli** - CLI tool for symbol ingestion
- **StellaOps.Symbols.Core** - Core abstractions, models, and interfaces
- **StellaOps.Symbols.Infrastructure** - In-memory and production implementations
- **StellaOps.Symbols.Bundle** - Offline bundle management
## Key Features
- **Symbol Resolution** - Address-to-symbol mapping for binaries (ELF, PE, Mach-O)
- **Multi-tenant Storage** - Tenant-isolated symbol manifests with BLAKE3 content hashing
- **DSSE Integration** - Optional cryptographic signing of manifests for transparency
- **CAS Blob Storage** - Content-addressed storage for symbol files and metadata
- **Debug Identifiers** - Support for ELF Build-IDs, PDB GUIDs, and GNU build-ids
## API Endpoints
- `POST /v1/symbols/manifests` - Upload symbol manifest
- `GET /v1/symbols/manifests/{id}` - Get manifest by ID
- `GET /v1/symbols/manifests` - Query manifests with filters
- `POST /v1/symbols/resolve` - Resolve addresses to symbols
- `GET /v1/symbols/by-debug-id/{debugId}` - Get manifests by debug ID
- `GET /health` - Health check (anonymous)
## Dependencies
- PostgreSQL (for production storage)
- Authority (authentication and multi-tenancy)
- Policy Engine (optional, for overlay integration)
- Graph module (for call-graph construction support)
## Related Documentation
- **Architecture:** See `src/Symbols/AGENTS.md` for detailed agent instructions
- **Sprint Plan:** Check `docs/implplan/SPRINT_*.md` for current development status
- **Fixtures:** Test data available in `tests/Symbols/fixtures/`
## Current Status
Active development. Supports symbol ingestion, resolution, and multi-tenant distribution with DSSE-signed manifests and CAS-backed storage.

View File

@@ -0,0 +1,79 @@
# component_architecture_symbols.md - **Stella Ops Symbols** (2025Q4)
> Symbol resolution and debug information management.
> **Scope.** Library and service architecture for **Symbols**: managing debug symbols, build IDs, and symbol-to-package mappings for reachability analysis.
---
## 0) Mission & boundaries
**Mission.** Provide **symbol resolution infrastructure** for native binary analysis. Map symbols to packages, manage debug information, and support stripped binary analysis.
**Boundaries.**
* Symbols **resolves and maps** symbols; execution is handled by Scanner.
* Debug symbols are **optional**; stripped binaries use heuristics.
* Supports **offline symbol stores** for air-gapped operation.
---
## 1) Solution & project layout
```
src/Symbols/
├─ StellaOps.Symbols/ # Core symbol resolution
│ ├─ Services/
│ │ ├─ ISymbolResolver.cs
│ │ └─ BuildIdResolver.cs
│ └─ Models/
│ ├─ SymbolInfo.cs
│ └─ BuildIdMapping.cs
└─ Integration with Scanner:
└─ src/Scanner/__Libraries/StellaOps.Scanner.Symbols.Native/
```
---
## 2) Contracts & data model
### 2.1 Symbol Info
```json
{
"symbolId": "sha256:abc123",
"name": "_ZN3foo3barEv",
"demangledName": "foo::bar()",
"sourceFile": "src/foo.cpp",
"lineNumber": 42,
"buildId": "abc123def456",
"packagePurl": "pkg:deb/debian/libfoo@1.2.3"
}
```
### 2.2 Build ID Mapping
```json
{
"buildId": "abc123def456",
"path": "/usr/lib/libfoo.so.1",
"packagePurl": "pkg:deb/debian/libfoo@1.2.3",
"debugPath": "/usr/lib/debug/.build-id/ab/c123def456.debug"
}
```
---
## Related Documentation
* Scanner native analysis: `../scanner/architecture.md`
* Reachability: `../../reachability/`
## Advisory Commitments (2026-02-26 Batch)
- `SPRINT_20260226_226_Symbols_dsse_rekor_merkle_and_hash_integrity` is the active commitment for:
- removing placeholder hashing labels in production verification paths,
- DSSE signing and verification for symbol bundles,
- Rekor submit/inclusion verification behavior and deterministic status surfaces,
- Merkle inclusion proof verification with negative test vectors.

View File

@@ -0,0 +1,129 @@
# Symbol Marketplace Architecture
**Module**: `src/Symbols/StellaOps.Symbols.Marketplace/`
**Server**: `src/Symbols/StellaOps.Symbols.Server/`
**Sprint**: SPRINT_20260220_001, SPRINT_20260220_002
**Status**: Implemented
---
## Overview
The Symbol Marketplace extends the existing Symbols module with a registry of symbol/debug pack sources, a browsable catalog, and a four-dimension trust scoring model. It provides the infrastructure needed to discover, evaluate, and install debug symbol packs from vendor, distro, community, and partner providers.
This directly strengthens the "Symbolized call-stack proofs" moat by ensuring Stella Ops can source verified debug symbols for any artifact in the reachability graph, enabling DSSE-signed call-stack resolution across platforms.
## Domain Primitives
### SymbolPackSource
Registry entry for a symbol provider. Each source has:
- **Key/Name**: Human-readable identifier (e.g., `microsoft-symbols`, `ubuntu-debuginfod`).
- **SourceType**: `vendor` | `distro` | `community` | `partner`.
- **Priority**: Integer ordering for resolution precedence.
- **FreshnessSLA**: Target sync interval in seconds (default: 6 hours).
- **WarningRatio**: Threshold (0-1) for warning state transition.
### SymbolPackCatalogEntry
Represents an installable symbol/debug pack:
- **PackId**: PURL-formatted package identifier.
- **Platform**: Target platform (e.g., `linux/amd64`, `any`).
- **Components**: Array of debug components included.
- **DsseDigest**: DSSE signature digest for integrity verification.
- **Installed**: Whether the pack is active for the tenant.
### SymbolSourceFreshnessRecord
Materialized freshness projection following the advisory source pattern:
- Tracks sync cadence, error rates, and SLA compliance.
- Freshness state machine: `healthy` -> `warning` -> `stale` -> `unavailable`.
- Includes signature coverage metrics (signed/unsigned/failure counts).
### SymbolSourceTrustScore
Four-dimension trust scoring:
| Dimension | Weight | Description |
|-----------|--------|-------------|
| Freshness | 0.30 | How up-to-date the source is relative to SLA |
| Signature | 0.30 | DSSE signature coverage (signed packs / total packs) |
| Coverage | 0.20 | Artifact coverage derived from sync success rate |
| SLA Compliance | 0.20 | Whether source stays within freshness window |
Overall score = weighted average, clamped to [0, 1].
## Database Schema
### symbol_pack_sources
| Column | Type | Description |
|--------|------|-------------|
| id | uuid PK | Source identifier |
| key | text UNIQUE | Machine-readable key |
| name | text | Display name |
| source_type | text | vendor/distro/community/partner |
| url | text NULL | Source endpoint URL |
| priority | int | Resolution priority |
| enabled | boolean | Active flag |
| freshness_sla_seconds | int | Target sync interval |
| warning_ratio | decimal | Warning threshold |
| created_at | timestamptz | Creation timestamp |
| updated_at | timestamptz NULL | Last update |
### symbol_pack_catalog
| Column | Type | Description |
|--------|------|-------------|
| id | uuid PK | Entry identifier |
| source_id | uuid FK | References symbol_pack_sources |
| pack_id | text | PURL identifier |
| platform | text | Target platform |
| components | text[] | Component list |
| dsse_digest | text | Signature digest |
| version | text | Pack version |
| size_bytes | bigint | Pack size |
| published_at | timestamptz | Publish date |
## API Surface
### Symbol Sources (`/api/v1/symbols/sources`)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | List sources with freshness projections |
| GET | `/summary` | Summary cards (healthy/stale/unavailable counts + avg trust) |
| GET | `/{id}` | Source detail with trust score |
| GET | `/{id}/freshness` | Freshness detail |
| POST | `/` | Create source |
| PUT | `/{id}` | Update source |
| DELETE | `/{id}` | Disable source |
### Marketplace Catalog (`/api/v1/symbols/marketplace`)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | List catalog entries |
| GET | `/search` | Search by PURL/platform |
| GET | `/{entryId}` | Catalog entry detail |
| POST | `/{entryId}/install` | Install pack for tenant |
| POST | `/{entryId}/uninstall` | Uninstall pack |
| GET | `/installed` | List installed packs |
| POST | `/sync` | Trigger sync from sources |
All responses include `dataAsOf` timestamp for staleness detection.
## Integration Points
### IntegrationType.SymbolSource (= 7)
New integration type added to `StellaOps.Integrations.Core`:
- `MicrosoftSymbols = 700`
- `UbuntuDebuginfod = 701`
- `FedoraDebuginfod = 702`
- `DebianDebuginfod = 703`
- `PartnerSymbols = 704`
### UI Integration
- **Symbol Sources list**: `/security-risk/symbol-sources` — freshness summary + source table.
- **Symbol Source detail**: `/security-risk/symbol-sources/:sourceId` — trust breakdown, sync timeline.
- **Symbol Marketplace**: `/security-risk/symbol-marketplace` — catalog browse/search with install/uninstall.
- Sidebar entries under "Security and Risk" section.
### Existing Module Touchpoints
- **Scanner**: Symbol resolution uses marketplace-installed packs for call-stack symbolication.
- **ReachGraph**: Coverage dimension reflects artifact matching from reachability analysis.
- **Attestor**: DSSE signatures on packs are verified through the existing proof chain infrastructure.
- **Policy**: Trust scores feed into policy gate decisions for symbol-dependent verdicts.

View File

@@ -0,0 +1,80 @@
# SYMBOL_MANIFEST v1
> **Imposed rule:** Symbol bundles must be content-addressed and tenant-scoped; every manifest entry must include the originating image digest and build-id to prevent cross-tenant leakage.
This document specifies the manifest format for distributing native debug symbols (ELF, PDB, dSYM) within StellaOps Offline Kits and symbol servers.
## 1. Use cases
- Offline debugging: GDB/LLDB/WinDBG pointing at local symbol server or file tree.
- Reachability analysis: map call stacks and function addresses to packages for scanner reachability overlays.
- Forensics: correlate runtime crash dumps to signed builds.
## 2. File layout
```
symbols/
manifest.json # SYMBOL_MANIFEST v1 (this spec)
manifest.json.sha256 # sha256 of manifest.json
/.build-id/aa/bbbbb.debug # ELF split DWARF
/.build-id/cc/ddddd.sym # PE/PDB (optional)
/mach-o/<uuid>.dSYM # Apple dSYM bundle (optional)
```
## 3. Manifest schema (JSON)
Top-level fields:
```jsonc
{
"schema": "SYMBOL_MANIFEST/v1",
"generated_at": "2025-11-26T12:00:00Z",
"generator": "stella symbol pack 1.0.0",
"tenant": "acme",
"entries": [
{
"image_digest": "sha256:...", // source image
"component": "openssl", // optional package/name
"build_id": "abcdef1234567890", // GNU build-id or PE GUID
"file": ".build-id/ab/cdef.debug", // relative path inside bundle
"format": "elf-debug" , // elf-debug | pdb | dsym
"arch": "linux/amd64", // GOARCH-style
"size": 123456, // bytes
"sha256": "sha256:...", // file digest
"source": {
"kind": "ci" | "vendor" | "runtime-capture",
"pipeline": "https://ci.example/pipeline/123", // optional
"attestation": "sha256:..." // DSSE bundle digest (optional)
}
}
]
}
```
Constraints:
- `schema` must be exactly `SYMBOL_MANIFEST/v1`.
- Entries sorted by `build_id` then `file` to keep deterministic ordering.
- `tenant` required; manifests are not shareable across tenants.
## 4. Validation
- Verify `manifest.json.sha256` matches `manifest.json`.
- For each entry, hash the referenced file and compare to `sha256`.
- Ensure `build_id` path matches file location (for ELF: `/.build-id/<aa>/<rest>.debug`).
- When attestation is present, validate the DSSE bundle before serving symbols.
## 5. Transport
- OCI artifact (recommended): media type `application/vnd.stella.symbols.manifest.v1+json`; symbols packed as a tar layer with manifest at root.
- File archive: deterministic `tar.gz` with POSIX `ustar`, sorted entries, UTC mtimes set to `0`.
- Export Center mirrors symbol bundles alongside SBOM/attestation bundles for air-gapped installs.
## 6. Tenant controls
- Symbol server enforces tenant header `X-Stella-Tenant`; manifests without matching tenant are rejected.
- Offline bundles carry tenant in manifest; Console warns if loading mismatched tenant.
## 7. Versioning
- Future versions add optional fields; parsers must ignore unknown fields.
- Breaking changes will bump to `SYMBOL_MANIFEST/v2`.
## 8. References
- `docs/OFFLINE_KIT.md` (debug store expectations)
- `docs/benchmarks/signals/bench-determinism.md`
- `docs/modules/scanner/architecture.md` (reachability + symbol linkage)

View File

@@ -0,0 +1,35 @@
# Symbol Server API
> **Imposed rule:** All API responses must include tenant scoping and content digests; cross-tenant symbol access is forbidden.
Base path: `/api/v1/symbols` (service: Symbol Server / Export Center plugin).
## Endpoints
- `GET /manifest` returns `SYMBOL_MANIFEST/v1` for the tenant.
- Headers: `X-Stella-Tenant` (required)
- Query: `image_digest` (optional filter), `build_id` (exact match)
- `GET /files/{path}` stream a symbol file by relative path in manifest.
- Headers: `X-Stella-Tenant`
- Responds with `Content-SHA256` header and ETag; 404 if tenant mismatch.
- `POST /ingest` upload a symbol bundle (tar or OCI artifact) and manifest.
- Headers: `X-Stella-Tenant`, `X-Stella-Attestation` (optional DSSE digest)
- Validates manifest checksum, entry digests, and tenant.
- `GET /health` readiness/liveness.
## Error model
- Problem+JSON; include `tenant`, `correlation_id`, and `policy` fields when access is denied.
- Rate limits: `429` with `Retry-After`; deterministic budget per tenant.
## Security
- Auth via Authority-issued JWT; enforce `symbols:read`/`symbols:write` scopes.
- Tenant check on every request; manifest tenant must equal header.
- Optional DSSE attestation digest header is recorded and surfaced in `/manifest` under `source.attestation`.
## Caching & offline
- Console/CLI cache manifest + files in CAS; revalidate via `If-None-Match` on `GET /manifest`.
- Offline kits mount symbol bundle read-only; API client can be pointed at `file://` CAS handler for air-gapped use.
## Observability
- Emit counters per tenant: `symbol_manifest_requests`, `symbol_file_bytes_served`, `symbol_ingest_failures`.
- Logs include `build_id`, `image_digest`, `tenant`, `attested` flag.

View File

@@ -0,0 +1,55 @@
# Symbol Bundle Guide
This guide explains how to package, validate, and distribute symbol bundles that comply with `SYMBOL_MANIFEST/v1`.
## 1. Packaging steps
1. Gather debug artifacts (ELF `.debug`, PDB, dSYM) for the target release.
2. Compute `sha256` for each file and record size/arch/format.
3. Build `manifest.json` as per `SYMBOL_MANIFEST_v1.md`; sort entries by `build_id`, then `file`.
4. Write `manifest.json.sha256` with the hex digest of `manifest.json`.
5. Create a deterministic tarball:
- POSIX ustar
- Sorted file order
- `mtime=0`, `uid=gid=0`, `uname=guname=root`
- Compression: gzip `-n` to avoid timestamps
6. Optional: wrap as OCI artifact with media type `application/vnd.stella.symbols.manifest.v1+json`.
## 2. Validation checklist
- [ ] `manifest.json` hashes to `manifest.json.sha256`.
- [ ] Each file hash matches manifest entry.
- [ ] Build-id path structure is correct (ELF `.build-id/aa/<rest>.debug`).
- [ ] Tenant in manifest matches upload tenant.
- [ ] Tarball ordering is lexicographic and mtimes are zeroed.
## 3. Ingestion (API)
- POST the tar/OCI blob to `/api/v1/symbols/ingest` with header `X-Stella-Tenant`.
- Server recomputes digests; rejects mismatches or tenant mismatch.
- Optional DSSE attestation digest recorded in manifest for downstream verification.
## 4. Reachability integration
- Scanner attaches `build_id` and source image digest to reachability edges; Graph API can fetch symbol manifests to render function names in overlays.
- When symbols are missing, UI shows “symbol lookup unavailable” badge; import the matching manifest to enable function-level overlays.
## 5. Offline kits
- Place `symbols/` directory (manifest + files) at the kit root; include tarball and manifest digest.
- `debug-manifest.json` in Offline Kit should link to symbol manifest for cross-reference.
## 6. Tenant controls & audit
- Symbol server enforces tenant; exports are tagged with tenant in manifest and tar annotations.
- Emit Timeline events on ingest with bundle digest and tenant; attach DSSE attestation if present.
## 7. Offline verification procedure (2026-02-26 batch)
1. Load bundle archive and `manifest.json`.
2. Verify manifest hash and every object digest before cryptographic checks.
3. Verify DSSE envelope signature and payload binding.
4. Verify Rekor inclusion metadata when present (or classify as explicit offline/missing-proof state).
5. Verify Merkle inclusion proof nodes against expected root and reject mismatches deterministically.
Expected failure classes for automation:
- `payload_too_large`
- `missing_subject`
- `invalid_media_type`
- `referrer_cycle_detected`
- `missing_symbol_bundle`