feat(scanner): Implement Deno analyzer and associated tests
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
- Added Deno analyzer with comprehensive metadata and evidence structure. - Created a detailed implementation plan for Sprint 130 focusing on Deno analyzer. - Introduced AdvisoryAiGuardrailOptions for managing guardrail configurations. - Developed GuardrailPhraseLoader for loading blocked phrases from JSON files. - Implemented tests for AdvisoryGuardrailOptions binding and phrase loading. - Enhanced telemetry for Advisory AI with metrics tracking. - Added VexObservationProjectionService for querying VEX observations. - Created extensive tests for VexObservationProjectionService functionality. - Introduced Ruby language analyzer with tests for simple and complex workspaces. - Added Ruby application fixtures for testing purposes.
This commit is contained in:
@@ -36,6 +36,8 @@ This guide translates the deterministic reachability blueprint into concrete wor
|
||||
|
||||
| Stream | Owner Guild(s) | Key deliverables |
|
||||
|--------|----------------|------------------|
|
||||
| **Native symbols & callgraphs** | Scanner Worker · Symbols Guild | Ship `Scanner.Symbols.Native` + `Scanner.CallGraph.Native`, integrate Symbol Manifest v1, demangle Itanium/MSVC names, emit `FuncNode`/`CallEdge` CAS bundles (task `SCANNER-NATIVE-401-015`). |
|
||||
| **Reachability store** | Signals · BE-Base Platform | Provision shared Mongo collections (`func_nodes`, `call_edges`, `cve_func_hits`), indexes, and repositories plus REST hooks for reuse (task `SIG-STORE-401-016`). |
|
||||
| **Language lifters** | Scanner Worker | CLI/hosted lifters for DotNet, Go, Node/Deno, JVM, Rust, Swift, Binary, Shell with CAS uploads and richgraph output |
|
||||
| **Signals ingestion & scoring** | Signals | `/callgraphs`, `/runtime-facts` (JSON + NDJSON/gzip), `/graphs/{id}`, `/reachability/recompute` GA; CAS-backed storage, runtime dedupe, BFS+predicates scoring |
|
||||
| **Runtime capture** | Zastava + Runtime Guild | EntryTrace/eBPF samplers, NDJSON batches (symbol IDs + timestamps + counts) |
|
||||
@@ -104,7 +106,8 @@ Each sprint is two weeks; refer to `docs/implplan/SPRINT_401_reachability_eviden
|
||||
|
||||
- Place developer-facing updates here (`docs/reachability`).
|
||||
- [Function-level evidence guide](function-level-evidence.md) captures the Nov 2025 advisory scope, task references, and schema expectations; keep it in lockstep with sprint status.
|
||||
- Operator runbooks (`docs/runbooks/reachability-runtime.md`) – TODO reference to be added when runtime pipeline lands.
|
||||
- [Reachability runtime runbook](../runbooks/reachability-runtime.md) now documents ingestion, CAS staging, air-gap handling, and troubleshooting—link every runtime feature PR to this guide.
|
||||
- [VEX Evidence Playbook](../benchmarks/vex-evidence-playbook.md) defines the bench repo layout, artifact shapes, verifier tooling, and metrics; keep it updated when Policy/Signer/CLI features land.
|
||||
- Update module dossiers (Scanner, Signals, Replay, Authority, Policy, UI) once each guild lands work.
|
||||
|
||||
---
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Function-Level Evidence Readiness (Nov 2025 Advisory)
|
||||
|
||||
_Last updated: 2025-11-09. Owner: Business Analysis Guild._
|
||||
_Last updated: 2025-11-12. Owner: Business Analysis Guild._
|
||||
|
||||
This memo captures the outstanding work required to make Stella Ops scanners emit stable, function-level evidence that matches the November 2025 advisory. It does **not** implement any code; instead it enumerates requirements, links them to sprint tasks, and spells out the schema/API updates that the next agent must land.
|
||||
|
||||
@@ -62,6 +62,18 @@ Out of scope: implementing disassemblers or symbol servers; those will be handle
|
||||
* Write CLI/API walkthroughs in `docs/09_API_CLI_REFERENCE.md` and `docs/api/policy.md` showing how to request reachability evidence and verify DSSE chains.
|
||||
* Produce OpenVEX + replay samples under `samples/reachability/` showing `facts.type = "stella.reachability"` with `graph_hash` and `code_id` arrays.
|
||||
|
||||
### 3.6 Native lifter & Reachability Store (SCANNER-NATIVE-401-015 / SIG-STORE-401-016)
|
||||
|
||||
* Stand up `Scanner.Symbols.Native` + `Scanner.CallGraph.Native` libraries that:
|
||||
* parse ELF (DWARF + `.symtab`/`.dynsym`), PE/COFF (CodeView/PDB), and stripped binaries via probabilistic carving;
|
||||
* emit deterministic `FuncNode` + `CallEdge` records with demangled names, language hints, and `{confidence,evidence}` arrays; and
|
||||
* attach analyzer + toolchain identifiers consumed by `richgraph-v1`.
|
||||
* Introduce `Reachability.Store` collections in Mongo:
|
||||
* `func_nodes` – keyed by `func:<format>:<sha256>:<va>` with `{binDigest,name,addr,size,lang,confidence,sym}`.
|
||||
* `call_edges` – `{from,to,kind,confidence,evidence[]}` linking internal/external nodes.
|
||||
* `cve_func_hits` – `{cve,purl,func_id,match_kind,confidence,source}` for advisory alignment.
|
||||
* Build indexes (`binDigest+name`, `from→to`, `cve+func_id`) and expose repository interfaces so Scanner, Signals, and Policy can reuse the same canonical data without duplicating queries.
|
||||
|
||||
---
|
||||
|
||||
## 4. Schema & API Touchpoints
|
||||
@@ -86,6 +98,50 @@ API contracts to amend:
|
||||
- Signals dedupes events, merges metadata, and persists the aggregated `RuntimeFacts` onto `ReachabilityFactDocument`. These facts now feed reachability scoring (SIGNALS-24-004/005) as part of the runtime bonus lattice.
|
||||
- Outstanding work: record CAS URIs for runtime traces, emit provenance events, and expose the enriched context to Policy/Replay consumers.
|
||||
|
||||
### 4.2 Reachability store layout (SIG-STORE-401-016)
|
||||
|
||||
All producers **must** persist native function evidence using the shared collections below (names are advisory; exact names live in Mongo options):
|
||||
|
||||
```json
|
||||
// func_nodes
|
||||
{
|
||||
"_id": "func:ELF:sha256:4012a0",
|
||||
"binDigest": "sha256:deadbeef...",
|
||||
"name": "ssl3_read_bytes",
|
||||
"addr": "0x4012a0",
|
||||
"size": 312,
|
||||
"lang": "c",
|
||||
"confidence": 0.92,
|
||||
"symbol": { "mangled": "_Z15ssl3_read_bytes", "demangled": "ssl3_read_bytes", "source": "DWARF" },
|
||||
"sym": "present"
|
||||
}
|
||||
|
||||
// call_edges
|
||||
{
|
||||
"from": "func:ELF:sha256:4012a0",
|
||||
"to": "func:ELF:sha256:40f0ff",
|
||||
"kind": "static",
|
||||
"confidence": 0.88,
|
||||
"evidence": ["reloc:.plt.got", "bb-target:0x40f0ff"]
|
||||
}
|
||||
|
||||
// cve_func_hits
|
||||
{
|
||||
"cve": "CVE-2023-XXXX",
|
||||
"purl": "pkg:generic/openssl@1.1.1u",
|
||||
"func_id": "func:ELF:sha256:4012a0",
|
||||
"match": "name+version",
|
||||
"confidence": 0.77,
|
||||
"source": "concelier:openssl-advisory"
|
||||
}
|
||||
```
|
||||
|
||||
Writers **must**:
|
||||
|
||||
1. Upsert `func_nodes` before emitting edges/hits to ensure `_id` lookups remain stable.
|
||||
2. Serialize evidence arrays in deterministic order (`reloc`, `bb-target`, `import`, …) and normalise hex casing.
|
||||
3. Attach analyzer fingerprints (`scanner.native@sha256:...`) so Replay/Policy can enforce provenance.
|
||||
|
||||
---
|
||||
|
||||
## 5. Test & Fixture Expectations
|
||||
|
||||
Reference in New Issue
Block a user