feat(scanner): Implement Deno analyzer and associated tests
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

- Added Deno analyzer with comprehensive metadata and evidence structure.
- Created a detailed implementation plan for Sprint 130 focusing on Deno analyzer.
- Introduced AdvisoryAiGuardrailOptions for managing guardrail configurations.
- Developed GuardrailPhraseLoader for loading blocked phrases from JSON files.
- Implemented tests for AdvisoryGuardrailOptions binding and phrase loading.
- Enhanced telemetry for Advisory AI with metrics tracking.
- Added VexObservationProjectionService for querying VEX observations.
- Created extensive tests for VexObservationProjectionService functionality.
- Introduced Ruby language analyzer with tests for simple and complex workspaces.
- Added Ruby application fixtures for testing purposes.
This commit is contained in:
master
2025-11-12 10:01:54 +02:00
parent 0e8655cbb1
commit babb81af52
75 changed files with 3346 additions and 187 deletions

View File

@@ -138,7 +138,27 @@ Both subcommands honour offline-first expectations (no network access) and norma
* Uses `STELLAOPS_ADVISORYAI_URL` when configured; otherwise it reuses the backend base address and adds `X-StellaOps-Scopes` (`advisory:run` + task scope) per request.
* `--timeout 0` performs a single cache lookup (for CI flows that only want cached artefacts).
### 2.12 Air-gap guard
### 2.12 Decision evidence (new)
- `decision export`
* Parameters: `--cve`, `--product <purl or digest>`, `--scan-id <optional>`, `--output-dir`.
* Pulls `decision.openvex.json`, `decision.dsse.json`, `rekor.txt`, and evidence metadata from Policy Engine and writes them into the `bench/findings/<CVE>/` layout defined in [docs/benchmarks/vex-evidence-playbook.md](../benchmarks/vex-evidence-playbook.md).
* When `--sync` is set, uploads the bundle to Git (bench repo) with deterministic commit messages.
- `decision verify`
* Offline verifier that wraps `tools/verify.sh`/`verify.py` from the bench repo. Checks DSSE signature, optional Rekor inclusion, and recomputes digests for reachability/SBOM artifacts.
* Supports `--from bench` (local path) and `--remote` (fetch via API). Exit codes align with `verify.sh` (0 success, 3 signature failure, 18 truncated evidence).
- `decision compare`
* Executes the benchmark harness against baseline scanners (Trivy/Syft/Grype/Snyk/Xray), capturing false-positive reduction, mean-time-to-decision, and reproducibility metrics into `results/summary.csv`.
* Flags regressions when StellaOps produces more false positives or slower MTTD than the configured target.
All verbs require scopes `policy.findings:read`, `signer.verify`, and (for Rekor lookups) `attestor.read`. They honour sealed-mode rules by falling back to offline verification only when Rekor/Signer endpoints are unreachable.
### 2.13 Air-gap guard
- CLI outbound HTTP flows (Authority auth, backend APIs, advisory downloads) route through `StellaOps.AirGap.Policy`. When sealed mode is active the CLI refuses commands that would require external egress and surfaces the shared `AIRGAP_EGRESS_BLOCKED` remediation guidance instead of attempting the request.

View File

@@ -556,12 +556,13 @@ concelier:
* `concelier.fetch.docs_total{source}`
* `concelier.fetch.bytes_total{source}`
* `concelier.parse.failures_total{source}`
* `concelier.map.statements_total{source}`
* `concelier.observations.write_total{result=ok|noop|error}`
* `concelier.linksets.updated_total{result=ok|skip|error}`
* `concelier.linksets.conflicts_total{type}`
* `concelier.export.bytes{kind}`
* `concelier.export.duration_seconds{kind}`
* `concelier.map.statements_total{source}`
* `concelier.observations.write_total{result=ok|noop|error}`
* `concelier.linksets.updated_total{result=ok|skip|error}`
* `concelier.linksets.conflicts_total{type}`
* `concelier.export.bytes{kind}`
* `concelier.export.duration_seconds{kind}`
* `advisory_ai_chunk_requests_total{tenant,result,cache}` and `advisory_ai_guardrail_blocks_total{tenant,reason,cache}` instrument the `/advisories/{key}/chunks` surfaces that Advisory AI consumes. Cache hits now emit the same guardrail counters so operators can see blocked segments even when responses are served from cache.
* **Tracing** around fetch/parse/map/observe/linkset/export.
* **Logs**: structured with `source`, `uri`, `docDigest`, `advisoryKey`, `exportId`.

View File

@@ -15,12 +15,13 @@ The service operates strictly downstream of the **Aggregation-Only Contract (AOC
## 1·Responsibilities & Constraints
- Compile and evaluate `stella-dsl@1` policy packs into deterministic verdicts.
- Join SBOM inventory, Concelier advisories, and Excititor VEX evidence via canonical linksets and equivalence tables.
- Materialise effective findings (`effective_finding_{policyId}`) with append-only history and produce explain traces.
- Operate incrementally: react to change streams (advisory/vex/SBOM deltas) with ≤5min SLA.
- Provide simulations with diff summaries for UI/CLI workflows without modifying state.
- Enforce strict determinism guard (no wall-clock, RNG, network beyond allow-listed services) and RBAC + tenancy via Authority scopes.
- Compile and evaluate `stella-dsl@1` policy packs into deterministic verdicts.
- Join SBOM inventory, Concelier advisories, and Excititor VEX evidence via canonical linksets and equivalence tables.
- Materialise effective findings (`effective_finding_{policyId}`) with append-only history and produce explain traces.
- Emit per-finding OpenVEX decisions anchored to reachability evidence, forward them to Signer/Attestor for DSSE/Rekor, and publish the resulting artifacts for bench/verification consumers.
- Operate incrementally: react to change streams (advisory/vex/SBOM deltas) with ≤5min SLA.
- Provide simulations with diff summaries for UI/CLI workflows without modifying state.
- Enforce strict determinism guard (no wall-clock, RNG, network beyond allow-listed services) and RBAC + tenancy via Authority scopes.
- Support sealed/air-gapped deployments with offline bundles and sealed-mode hints.
Non-goals: policy authoring UI (handled by Console), ingestion or advisory normalisation (Concelier), VEX consensus (Excititor), runtime enforcement (Zastava).
@@ -110,8 +111,9 @@ Key notes:
| **Materialiser** (`Materialization/`) | Upsert effective findings, append history, manage explain bundle exports. | Mongo transactions per SBOM chunk. |
| **Orchestrator** (`Runs/`) | Change-stream ingestion, fairness, retry/backoff, queue writer. | Works with Scheduler Models DTOs. |
| **API** (`Api/`) | Minimal API endpoints, DTO validation, problem responses, idempotency. | Generated clients for CLI/UI. |
| **Observability** (`Telemetry/`) | Metrics (`policy_run_seconds`, `rules_fired_total`), traces, structured logs. | Sampled rule-hit logs with redaction. |
| **Offline Adapter** (`Offline/`) | Bundle export/import (policies, simulations, runs), sealed-mode enforcement. | Uses DSSE signing via Signer service. |
| **Observability** (`Telemetry/`) | Metrics (`policy_run_seconds`, `rules_fired_total`), traces, structured logs. | Sampled rule-hit logs with redaction. |
| **Offline Adapter** (`Offline/`) | Bundle export/import (policies, simulations, runs), sealed-mode enforcement. | Uses DSSE signing via Signer service. |
| **VEX Decision Emitter** (`Vex/Emitter/`) | Build OpenVEX statements, attach reachability evidence hashes, request DSSE signing, and persist artifacts for Export Center / bench repo. | New (Sprint401); integrates with Signer predicate `stella.ops/vexDecision@v1` and Attestor Rekor logging. |
---
@@ -172,22 +174,36 @@ Determinism guard instrumentation wraps the evaluator, rejecting access to forbi
---
## 6·Run Orchestration & Incremental Flow
- **Change streams:** Concelier and Excititor publish document changes to the scheduler queue (`policy.trigger.delta`). Payload includes `tenant`, `source`, `linkset digests`, `cursor`.
- **Orchestrator:** Maintains per-tenant backlog; merges deltas until time/size thresholds met, then enqueues `PolicyRunRequest`.
- **Queue:** Mongo queue with lease; each job assigned `leaseDuration`, `maxAttempts`.
## 6·Run Orchestration & Incremental Flow
- **Change streams:** Concelier and Excititor publish document changes to the scheduler queue (`policy.trigger.delta`). Payload includes `tenant`, `source`, `linkset digests`, `cursor`.
- **Orchestrator:** Maintains per-tenant backlog; merges deltas until time/size thresholds met, then enqueues `PolicyRunRequest`.
- **Queue:** Mongo queue with lease; each job assigned `leaseDuration`, `maxAttempts`.
- **Workers:** Lease jobs, execute evaluation pipeline, report status (success/failure/canceled). Failures with recoverable errors requeue with backoff; determinism or schema violations mark job `failed` and raise incident event.
- **Fairness:** Round-robin per `{tenant, policyId}`; emergency jobs (`priority=emergency`) jump queue but limited via circuit breaker.
- **Replay:** On demand, orchestrator rehydrates run via stored cursors and exports sealed bundle for audit/CI determinism checks.
- **Batch evaluation service (`/api/policy/eval/batch`):** Stateless evaluator powering Findings Ledger and replay/offline workflows. Requests contain canonical ledger events plus optional current projection; responses return status/severity/labels/rationale without mutating state. Policy Engine enforces per-tenant cost budgets, caches results by `(tenantId, policyVersion, eventHash, projectionHash)`, and falls back to inline evaluation when the remote service is disabled.
---
## 7·Security & Tenancy
- **Auth:** All API calls pass through Authority gateway; DPoP tokens enforced for service-to-service (Policy Engine service principal). CLI/UI tokens include scope claims.
- **Scopes:** Mutations require `policy:*` scopes corresponding to action; `effective:write` restricted to service identity.
### 6.1·VEX decision attestation pipeline
1. **Verdict capture.** Each evaluation result contains `{findingId, cve, productKey, reachabilityState, evidenceRefs}` plus SBOM and runtime CAS hashes.
2. **OpenVEX serialization.** `VexDecisionEmitter` builds an OpenVEX document with one statement per `(cve, productKey)` and fills:
- `status`, `justification`, `status_notes`, `impact_statement`, `action_statement`.
- `products` (purl) and `evidence` array referencing `reachability.json`, `sbom.cdx.json`, `runtimeFacts`.
3. **DSSE signing.** The emitter calls Signer `POST /api/v1/signer/sign/dsse` with predicate `stella.ops/vexDecision@v1`. Signer verifies PoE + scanner integrity and returns a DSSE envelope (`decision.dsse.json`).
4. **Transparency (optional).** When Rekor integration is enabled, Attestor logs the DSSE payload and returns `{uuid, logIndex, checkpoint}` which we persist next to the decision.
5. **Export.** API/CLI endpoints expose `decision.openvex.json`, `decision.dsse.json`, `rekor.txt`, and evidence metadata so Export Center + bench automation can mirror them into `bench/findings/**` as defined in the [VEX Evidence Playbook](../../benchmarks/vex-evidence-playbook.md).
All payloads are immutable and include analyzer fingerprints (`scanner.native@sha256:...`, `policyEngine@sha256:...`) so replay tooling can recompute identical digests. Determinism tests cover both the OpenVEX JSON and the DSSE payload bytes.
---
## 7·Security & Tenancy
- **Auth:** All API calls pass through Authority gateway; DPoP tokens enforced for service-to-service (Policy Engine service principal). CLI/UI tokens include scope claims.
- **Scopes:** Mutations require `policy:*` scopes corresponding to action; `effective:write` restricted to service identity.
- **Tenancy:** All queries filter by `tenant`. Service identity uses `tenant-global` for shared policies; cross-tenant reads prohibited unless `policy:tenant-admin` scope present.
- **Secrets:** Configuration loaded via environment variables or sealed secrets; runtime avoids writing secrets to logs.
- **Determinism guard:** Static analyzer prevents referencing forbidden namespaces; runtime guard intercepts `DateTime.Now`, `Random`, `Guid`, HTTP clients beyond allow-list.

View File

@@ -31,7 +31,9 @@ src/
├─ StellaOps.Scanner.EntryTrace/ # ENTRYPOINT/CMD → terminal program resolver (shell AST)
├─ StellaOps.Scanner.Analyzers.OS.[Apk|Dpkg|Rpm]/
├─ StellaOps.Scanner.Analyzers.Lang.[Java|Node|Python|Go|DotNet|Rust]/
├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/ # PE/Mach-O planned (M2)
├─ StellaOps.Scanner.Analyzers.Native.[ELF|PE|MachO]/ # PE/Mach-O planned (M2)
├─ StellaOps.Scanner.Symbols.Native/ # NEW native symbol reader/demangler (Sprint 401)
├─ StellaOps.Scanner.CallGraph.Native/ # NEW function/call-edge builder + CAS emitter
├─ StellaOps.Scanner.Emit.CDX/ # CycloneDX (JSON + Protobuf)
├─ StellaOps.Scanner.Emit.SPDX/ # SPDX 3.0.1 JSON
├─ StellaOps.Scanner.Diff/ # image→layer→component threeway diff
@@ -216,14 +218,17 @@ When `scanner.events.enabled = true`, the WebService serialises the signed repor
> **Rule:** We only report components proven **on disk** with authoritative metadata. Lockfiles are evidence only.
**C) Native link graph**
* **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs.
* **PE/MachO** (planned M2): import table, delayimports; version resources; code signatures.
* Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components.
* The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines
directly Policy evaluates quiet rules against package provenance while Exporters forward the enriched fields into
downstream JSON/Trivy payloads.
**C) Native link graph**
* **ELF**: parse `PT_INTERP`, `DT_NEEDED`, RPATH/RUNPATH, **GNU symbol versions**; map **SONAMEs** to file paths; link executables → libs.
* **PE/MachO** (planned M2): import table, delayimports; version resources; code signatures.
* Map libs back to **OS packages** if possible (via file lists); else emit `bin:{sha256}` components.
* The exported metadata (`stellaops.os.*` properties, license list, source package) feeds policy scoring and export pipelines
directly Policy evaluates quiet rules against package provenance while Exporters forward the enriched fields into
downstream JSON/Trivy payloads.
* Sprint401 introduces `StellaOps.Scanner.Symbols.Native` (DWARF/PDB reader + demangler) and `StellaOps.Scanner.CallGraph.Native`
(function boundary detector + call-edge builder). These libraries feed `FuncNode`/`CallEdge` CAS bundles and enrich reachability
graphs with `{code_id, confidence, evidence}` so Signals/Policy/UI can cite function-level justifications.
**D) EntryTrace (ENTRYPOINT/CMD → terminal program)**

View File

@@ -20,17 +20,17 @@
## 1) Responsibilities (contract)
1. **Authenticate** caller with **OpTok** (Authority OIDC, DPoP or mTLSbound).
2. **Authorize** scopes (`signer.sign`) + audience (`aud=signer`) + tenant/installation.
3. **Validate entitlement** via **PoE** (ProofofEntitlement) against Cloud Licensing `/license/introspect`.
4. **Verify release integrity** of the **scanner** image digest presented in the request: must be **cosignsigned** by StellaOps release key, discoverable via **OCI Referrers API**.
5. **Enforce plan & quotas** (concurrency/QPS/artifact size/rate caps).
6. **Mint signing identity**:
1. **Authenticate** caller with **OpTok** (Authority OIDC, DPoP or mTLSbound).
2. **Authorize** scopes (`signer.sign`) + audience (`aud=signer`) + tenant/installation.
3. **Validate entitlement** via **PoE** (ProofofEntitlement) against Cloud Licensing `/license/introspect`.
4. **Verify release integrity** of the **scanner** image digest presented in the request: must be **cosignsigned** by StellaOps release key, discoverable via **OCI Referrers API**.
5. **Enforce plan & quotas** (concurrency/QPS/artifact size/rate caps).
6. **Mint signing identity**:
* **Keyless** (default): get a shortlived X.509 cert from **Fulcio** using the Signers OIDC identity and sign the DSSE.
* **Keyful** (optional): sign with an HSM/KMS key.
7. **Return DSSE bundle** (subject digests + predicate + cert chain or KMS key id).
8. **Audit** every decision; expose metrics.
7. **Return DSSE bundle** (subject digests + predicate + cert chain or KMS key id).
8. **Audit** every decision; expose metrics.
---
@@ -115,18 +115,30 @@ Errors (RFC7807):
* `400 invalid_request` (schema/predicate/type invalid)
* `500 signing_unavailable` (Fulcio/KMS outage)
### 3.2 `GET /verify/referrers?imageDigest=<sha256>`
Checks whether the **image** at digest is signed by **StellaOps release key**.
Response:
### 3.2 `GET /verify/referrers?imageDigest=<sha256>`
Checks whether the **image** at digest is signed by **StellaOps release key**.
Response:
```json
{ "trusted": true, "signatures": [ { "type": "cosign", "digest": "sha256:...", "signedBy": "StellaOps Release 2027 Q2" } ] }
```
{ "trusted": true, "signatures": [ { "type": "cosign", "digest": "sha256:...", "signedBy": "StellaOps Release 2027 Q2" } ] }
```
> **Note:** This endpoint is also used internally by Signer before issuing signatures.
### 3.3 Predicate catalog (Sprint401 update)
Signer now enforces an allowlist of predicate identifiers:
| Predicate | Description | Producer |
|-----------|-------------|----------|
| `stella.ops/sbom@v1` | SBOM/report attestation (existing). | Scanner WebService. |
| `stella.ops/promotion@v1` | Promotion evidence (see `docs/release/promotion-attestations.md`). | DevOps/Export Center. |
| `stella.ops/vexDecision@v1` | OpenVEX decision for a single `(cve, product)` pair, including reachability evidence references. | Policy Engine / VEXer. |
Requests with unknown predicates receive `400 predicate_not_allowed`. Policy Engine must supply the OpenVEX JSON as the `predicate` body; Signer preserves payload bytes verbatim so DSSE digest = OpenVEX digest.
---
### KMS drivers (keyful mode)