feat(ruby): Implement RubyManifestParser for parsing gem groups and dependencies
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

feat(ruby): Add RubyVendorArtifactCollector to collect vendor artifacts

test(deno): Add golden tests for Deno analyzer with various fixtures

test(deno): Create Deno module and package files for testing

test(deno): Implement Deno lock and import map for dependency management

test(deno): Add FFI and worker scripts for Deno testing

feat(ruby): Set up Ruby workspace with Gemfile and dependencies

feat(ruby): Add expected output for Ruby workspace tests

feat(signals): Introduce CallgraphManifest model for signal processing
This commit is contained in:
master
2025-11-10 09:27:03 +02:00
parent 69c59defdc
commit 56c687253f
87 changed files with 2462 additions and 542 deletions

View File

@@ -14,5 +14,5 @@ Execute the tasks below strictly in order; each artifact unblocks the next analy
| 4 | `SCANNER-ANALYZERS-DENO-26-004` | DONE | Add the permission/capability analyzer covering FS/net/env/process/crypto/FFI/workers plus dynamic-import + literal fetch heuristics with reason codes. | Deno Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-003 |
| 5 | `SCANNER-ANALYZERS-DENO-26-005` | DONE | Build bundle/binary inspectors for eszip and `deno compile` executables to recover graphs, configs, embedded resources, and snapshots. | Deno Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-004 |
| 6 | `SCANNER-ANALYZERS-DENO-26-006` | DONE | Implement the OCI/container adapter that stitches per-layer Deno caches, vendor trees, and compiled binaries back into provenance-aware analyzer inputs. | Deno Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-005 |
| 7 | `SCANNER-ANALYZERS-DENO-26-007` | DOING | Produce AOC-compliant observation writers (entrypoints, modules, capability edges, workers, warnings, binaries) with deterministic reason codes. | Deno Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-006 |
| 8 | `SCANNER-ANALYZERS-DENO-26-008` | TODO | Finalize fixture + benchmark suite (vendor/npm/FFI/worker/dynamic import/bundle/cache/container cases) validating analyzer determinism and performance. | Deno Analyzer Guild, QA Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-007 |
| 7 | `SCANNER-ANALYZERS-DENO-26-007` | DONE | Produce AOC-compliant observation writers (entrypoints, modules, capability edges, workers, warnings, binaries) with deterministic reason codes. | Deno Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-006 |
| 8 | `SCANNER-ANALYZERS-DENO-26-008` | DOING | Finalize fixture + benchmark suite (vendor/npm/FFI/worker/dynamic import/bundle/cache/container cases) validating analyzer determinism and performance. | Deno Analyzer Guild, QA Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Deno) | SCANNER-ANALYZERS-DENO-26-007 |

View File

@@ -14,15 +14,17 @@
| `SCANNER-ENG-0013` | TODO | Plan Swift Package Manager coverage (Package.resolved, xcframeworks, runtime hints) with policy hooks. | Swift Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Swift) | — |
| `SCANNER-ENG-0014` | TODO | Align Kubernetes/VM target coverage between Scanner and Zastava per comparison findings; publish joint roadmap. | Runtime Guild, Zastava Guild (docs/modules/scanner) | — |
| `SCANNER-ENG-0015` | DOING (2025-11-09) | Document DSSE/Rekor operator enablement guidance and rollout levers surfaced in the gap analysis. | Export Center Guild, Scanner Guild (docs/modules/scanner) | — |
| `SCANNER-ENG-0016` | DOING (2025-11-02) | Implement `RubyLockCollector` + vendor cache ingestion per design §4.14.3. | Ruby Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Ruby) | SCANNER-ENG-0009 |
| `SCANNER-ENG-0016` | DOING (2025-11-10) | Implement `RubyLockCollector` + vendor cache ingestion per design §4.14.3. | Ruby Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Ruby) | SCANNER-ENG-0009 |
| `SCANNER-ENG-0017` | DONE (2025-11-09) | Build the runtime require/autoload graph builder with tree-sitter Ruby per design §4.4 and integrate EntryTrace hints. | Ruby Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Ruby) | SCANNER-ENG-0016 |
| `SCANNER-ENG-0018` | DONE (2025-11-09) | Emit Ruby capability + framework surface signals as defined in design §4.5 with policy predicate hooks. | Ruby Analyzer Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Ruby) | SCANNER-ENG-0017 |
| `SCANNER-ENG-0019` | DOING (2025-11-10) | Ship Ruby CLI verbs (`stella ruby inspect|resolve`) and Offline Kit packaging per design §4.6. | Ruby Analyzer Guild, CLI Guild (src/Scanner/StellaOps.Scanner.Analyzers.Lang.Ruby) | SCANNER-ENG-0016..0018 |
| `SCANNER-LIC-0001` | DOING (2025-11-02) | Vet tree-sitter Ruby licensing + Offline Kit packaging requirements and document SPDX posture. | Scanner Guild, Legal Guild (docs/modules/scanner) | SCANNER-ENG-0016 |
| `SCANNER-POLICY-0001` | TODO | Define Policy Engine predicates for Ruby groups/capabilities and align lattice weights. | Policy Guild, Ruby Analyzer Guild (docs/modules/scanner) | SCANNER-ENG-0018 |
| `SCANNER-CLI-0001` | DOING (2025-11-09) | Coordinate CLI UX/help text for new Ruby verbs and update CLI docs/golden outputs. | CLI Guild, Ruby Analyzer Guild (src/Cli/StellaOps.Cli) | SCANNER-ENG-0019 |
| `SCANNER-CLI-0001` | DONE (2025-11-10) | Coordinate CLI UX/help text for new Ruby verbs and update CLI docs/golden outputs. | CLI Guild, Ruby Analyzer Guild (src/Cli/StellaOps.Cli) | SCANNER-ENG-0019 |
### Updates — 2025-11-09
- `SCANNER-CLI-0001`: Completed Spectre table wrapping fix for runtime/lockfile columns, expanded Ruby resolve JSON assertions, removed ad-hoc debug artifacts, and drafted CLI docs covering `stellaops-cli ruby inspect|resolve`. Pending: final verification + handoff once docs/tests merge.
- `SCANNER-CLI-0001`: Wired `stellaops-cli ruby inspect|resolve` into `CommandFactory` so the verbs are available via `System.CommandLine` with the expected `--root`, `--image/--scan-id`, and `--format` options; `dotnet test ... --filter Ruby` passes.
- `SCANNER-CLI-0001`: Added CLI unit tests (`CommandFactoryTests`, Ruby inspect JSON assertions) to guard the new verbs and runtime metadata output; `dotnet test src/Cli/__Tests/StellaOps.Cli.Tests/StellaOps.Cli.Tests.csproj --filter "CommandFactoryTests|Ruby"` now covers the CLI surface.
- `SCANNER-ENG-0016`: 2025-11-10 — resumed to finish `RubyLockCollector` + vendor cache ingestion (Codex agent) per §4.14.3, targeting lockfile multi-source coverage and bundler group metadata.

View File

@@ -51,6 +51,7 @@ This file now only tracks the runtime & signals status snapshot. Active backlog
| Concelier Link-Not-Merge schema slips | SBOM-SERVICE-21-001..004 + Advisory AI SBOM endpoints stay blocked | Concelier + Cartographer guilds to publish CARTO-GRAPH-21-002 ETA during next coordination call; SBOM guild to prep schema doc meanwhile. |
| Scanner surface artifact delay | GRAPH-INDEX-28-007+ and ZASTAVA-SURFACE-* cannot even start | Scanner guild to deliver analyzer artifact roadmap; Graph/Zastava teams to prepare mocks/tests in advance. |
| Signals host/callgraph merge misses 2025-11-09 | SIGNALS-24-003/004/005 remain blocked, pushing reachability scoring past sprint goals | Signals + Authority guilds to prioritize AUTH-SIG-26-001 review and merge SIGNALS-24-001/002 before 2025-11-10 standup. |
| Authority build regression (`PackApprovalFreshAuthWindow`) | Signals test suite cannot run in CI, delaying validation of new endpoints | Coordinate with Authority guild to restore missing constant in `StellaOps.Auth.ServerIntegration`; rerun Signals tests once fixed. |
# Coordination log

View File

@@ -19,5 +19,6 @@ SIGNALS-24-003 | DOING (2025-11-09) | Implement runtime facts ingestion endpoint
> 2025-11-07: Upstream SIGNALS-24-001 / SIGNALS-24-002 now DOING; this flips to DOING once host + callgraph ingestion merge.
> 2025-11-08: Targeting 2025-11-09 merge for SIGNALS-24-001/002; schema + AOC contract drafted so SIGNALS-24-003 can move to DOING immediately after those PRs land (dependencies confirmed, none missing).
> 2025-11-09: Added runtime facts ingestion service + endpoint, aggregated runtime hit storage, and unit tests; next steps are NDJSON/gzip ingestion and provenance metadata wiring.
> 2025-11-09: Added `/signals/runtime-facts/ndjson` streaming endpoint (JSON/NDJSON + gzip) with sealed-mode gating; provenance/context enrichment + scoring linkage remain.
SIGNALS-24-004 | BLOCKED (2025-10-27) | Deliver reachability scoring engine producing states/scores and writing to `reachability_facts`; expose configuration for weights. Dependencies: SIGNALS-24-003.<br>2025-10-27: Upstream ingestion pipelines (`SIGNALS-24-002/003`) blocked; scoring engine cannot proceed. | Signals Guild, Data Science (src/Signals/StellaOps.Signals)
SIGNALS-24-005 | BLOCKED (2025-10-27) | Implement Redis caches (`reachability_cache:*`), invalidation on new facts, and publish `signals.fact.updated` events. Dependencies: SIGNALS-24-004.<br>2025-10-27: Awaiting scoring engine and ingestion layers before wiring cache/events. | Signals Guild, Platform Events Guild (src/Signals/StellaOps.Signals)

View File

@@ -9,11 +9,11 @@ Each wave groups sprints that declare the same leading dependency. Start waves o
- Shared prerequisite(s): None (explicit)
- Parallelism guidance: No upstream sprint recorded; confirm module AGENTS and readiness gates before parallel execution.
- Sprints:
- SPRINT_110_ingestion_evidence.md — Sprint 110 - Ingestion & Evidence
- SPRINT_130_scanner_surface.md — Sprint 130 - Scanner & Surface
- SPRINT_137_scanner_gap_design.md — Sprint 137 - Scanner & Surface
- SPRINT_138_scanner_ruby_parity.md — Sprint 138 - Scanner & Surface
- SPRINT_140_runtime_signals.md — Sprint 140 - Runtime & Signals
- SPRINT_110_ingestion_evidence.md — Sprint 110 - Ingestion & Evidence. Done.
- SPRINT_130_scanner_surface.md — Sprint 130 - Scanner & Surface. Done.
- SPRINT_137_scanner_gap_design.md — Sprint 137 - Scanner & Surface. Done.
- SPRINT_138_scanner_ruby_parity.md — Sprint 138 - Scanner & Surface. In progress.
- SPRINT_140_runtime_signals.md — Sprint 140 - Runtime & Signals. In progress.
- SPRINT_150_scheduling_automation.md — Sprint 150 - Scheduling & Automation
- SPRINT_160_export_evidence.md — Sprint 160 - Export & Evidence
- SPRINT_170_notifications_telemetry.md — Sprint 170 - Notifications & Telemetry

View File

@@ -25,7 +25,7 @@ This guide translates the deterministic reachability blueprint into concrete wor
|-------|----------|----------|--------------|
| SBOM per layer & composed image | Scanner Worker + Sbomer | `sbom.layer.cdx.json`, `sbom.image.cdx.json` | Deterministic CycloneDX 1.6, DSSE envelope, CAS URI |
| Static reachability graph | Scanner Worker lifters (DotNet, Go, Node/Deno, Rust, Swift, JVM, Binary, Shell) | `richgraph-v1.json` + `sha256` | Canonical SymbolIDs, framework entries, predicates, graph hash |
| Runtime facts | Zastava Observer / runtime probes | `runtime-trace.ndjson` | EntryTrace schema, CAS pointer, optional compression |
| Runtime facts | Zastava Observer / runtime probes | `runtime-trace.ndjson` (gzip or JSON) | EntryTrace schema, CAS pointer, process/socket/container metadata, optional compression |
| Replay manifest | Scanner Worker + Replay Core | `replay.yaml` | Contains analyzer versions, feed locks, graph hash, runtime trace digests |
| VEX statements | Scanner WebService + Policy Engine | `reachability.json` + OpenVEX doc | Links SBOM attn, graph attn, runtime evidence IDs |
| Signed bundle | Authority + Signer | DSSE envelope referencing above | Support FIPS + PQ variants (Dilithium where required) |
@@ -37,7 +37,7 @@ This guide translates the deterministic reachability blueprint into concrete wor
| Stream | Owner Guild(s) | Key deliverables |
|--------|----------------|------------------|
| **Language lifters** | Scanner Worker | CLI/hosted lifters for DotNet, Go, Node/Deno, JVM, Rust, Swift, Binary, Shell with CAS uploads and richgraph output |
| **Signals ingestion & scoring** | Signals | `/callgraphs`, `/runtime-facts`, `/graphs/{id}`, `/reachability/recompute` GA; CAS-backed storage, runtime dedupe, BFS+predicates scoring |
| **Signals ingestion & scoring** | Signals | `/callgraphs`, `/runtime-facts` (JSON + NDJSON/gzip), `/graphs/{id}`, `/reachability/recompute` GA; CAS-backed storage, runtime dedupe, BFS+predicates scoring |
| **Runtime capture** | Zastava + Runtime Guild | EntryTrace/eBPF samplers, NDJSON batches (symbol IDs + timestamps + counts) |
| **Replay evidence** | Replay Core + Scanner Worker | Manifest schema v2, `ReachabilityReplayWriter` integration, hash-lock tests |
| **Authority attestations** | Authority + Signer | DSSE predicates for SBOM, Graph, Replay, VEX; Rekor mirror alignment |

View File

@@ -22,7 +22,7 @@ Out of scope: implementing disassemblers or symbol servers; those will be handle
|-------------|-------------|-----------------|-------|
| Immutable code identity (`code_id` = `{format, build_id, start, length}` + optional `code_block_hash`) | Callgraph nodes are opaque strings with no address metadata. | Sprint401 `GRAPH-CAS-401-001`, `GAP-SCAN-001`, `GAP-SYM-007` | `code_id` should live alongside existing `SymbolID` helpers so analyzers can emit it without duplicating logic. |
| Symbol hints (demangled name, source, confidence) | No schema fields for symbol metadata; demangling is ad-hoc per analyzer. | `GAP-SYM-007` | Require deterministic casing + `symbol.source ∈ {DWARF,PDB,SYM,none}`. |
| Runtime facts mapped to code anchors | `/signals/runtime-facts` is a stub; Zastava streams only Build-IDs. | Sprint400 `ZASTAVA-REACH-201-001`, Sprint401 `SIGNALS-RUNTIME-401-002`, `GAP-ZAS-002`, `GAP-SIG-003` | Need NDJSON schema documenting `code_id`, `symbol.sid`, `hit_count`, `loader_base`. |
| Runtime facts mapped to code anchors | `/signals/runtime-facts` now accepts JSON and NDJSON (gzip) streams, stores symbol/code/process/container metadata. | Sprint400 `ZASTAVA-REACH-201-001`, Sprint401 `SIGNALS-RUNTIME-401-002`, `GAP-ZAS-002`, `GAP-SIG-003` | Provenance enrichment (process/socket/container) persisted; next step is exposing CAS URIs + context facts and emitting events for Policy/Replay. |
| Replay/DSSE coverage | Replay manifests dont enforce hash/CAS registration for graphs/traces. | Sprint400 `REPLAY-REACH-201-005`, Sprint401 `REPLAY-401-004`, `GAP-REP-004` | Extend manifest v2 with analyzer versions + BLAKE3 digests; add DSSE predicate types. |
| Policy/VEX/UI explainability | Policy uses coarse `reachability:*` tags; UI/CLI cannot show call paths or evidence hashes. | Sprint401 `POLICY-VEX-401-006`, `UI-CLI-401-007`, `GAP-POL-005`, `GAP-VEX-006`, `EXPERIENCE-GAP-401-012` | Evidence blocks must cite `code_id`, graph hash, runtime CAS URI, analyzer version. |
| Operator documentation & samples | No guide shows how to replay `{build_id,start,len}` across CLI/API. | Sprint401 `QA-DOCS-401-008`, `GAP-DOC-008` | Produce samples under `samples/reachability/**` plus CLI walkthroughs. |
@@ -78,6 +78,14 @@ API contracts to amend:
- `POST /signals/runtime-facts` request body schema (NDJSON) with `symbol_id`, `code_id`, `hit_count`, `loader_base`.
- `GET /policy/findings` payload must surface `reachability.evidence[]` objects.
### 4.1 Signals runtime ingestion snapshot (Nov 2025)
- `/signals/runtime-facts` (JSON) and `/signals/runtime-facts/ndjson` (streaming, optional gzip) accept the following event fields:
- `symbolId` (required), `codeId`, `loaderBase`, `hitCount`, `processId`, `processName`, `socketAddress`, `containerId`, `evidenceUri`, `metadata`.
- Subject context (`scanId` / `imageDigest` / `component` / `version`) plus `callgraphId` is supplied either in the JSON body or as query params for the NDJSON endpoint.
- Signals dedupes events, merges metadata, and persists the aggregated `RuntimeFacts` onto `ReachabilityFactDocument`. These facts now feed reachability scoring (SIGNALS-24-004/005) as part of the runtime bonus lattice.
- Outstanding work: record CAS URIs for runtime traces, emit provenance events, and expose the enriched context to Policy/Replay consumers.
---
## 5. Test & Fixture Expectations
@@ -99,4 +107,3 @@ All fixtures must remain deterministic: sort nodes/edges, normalise casing, and
5. Before shipping, run the reachbench fixtures end-to-end and capture hashes for inclusion in replay docs.
Keep this document updated as tasks change state; it is the authoritative hand-off note for the advisory.