Add PHP Analyzer Plugin and Composer Lock Data Handling
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

- Implemented the PhpAnalyzerPlugin to analyze PHP projects.
- Created ComposerLockData class to represent data from composer.lock files.
- Developed ComposerLockReader to load and parse composer.lock files asynchronously.
- Introduced ComposerPackage class to encapsulate package details.
- Added PhpPackage class to represent PHP packages with metadata and evidence.
- Implemented PhpPackageCollector to gather packages from ComposerLockData.
- Created PhpLanguageAnalyzer to perform analysis and emit results.
- Added capability signals for known PHP frameworks and CMS.
- Developed unit tests for the PHP language analyzer and its components.
- Included sample composer.lock and expected output for testing.
- Updated project files for the new PHP analyzer library and tests.
This commit is contained in:
StellaOps Bot
2025-11-22 14:02:49 +02:00
parent a7f3c7869a
commit b6b9ffc050
158 changed files with 16272 additions and 809 deletions

View File

@@ -37,7 +37,7 @@ Each conflict includes `field`, `reason`, and `values` (array of `source: value`
## Linkset output shape additions
- `key.confidence`: populated from formula above.
- `conflicts[]`: as defined; may be empty but never null.
- `conflicts[]`: as defined; may be empty but never null. Each conflict also carries `sourceIds[]` (vendors/sources that produced the values) for provenance.
- `normalized` retains add-only fields from `link-not-merge-schema.md`; do not drop raw ranges even when disjoint.
- `provenance.hashes`: sorted list of `observationHash` values; used by replay bundles.

View File

@@ -25,6 +25,7 @@ Canonical JSON must sort object keys (`bundleId`, `importOperator`, …) to keep
2. **Event enrichment:** The importer populates `airgap.bundle` fields on each event produced from the bundle. `bundleId` equals manifest digest (SHA-256). `merkleRoot` is the bundles manifest Merkle root; `timeAnchor` is the authoritative timestamp from the bundle.
3. **Anchoring:** Merkle batching includes bundle metadata; anchor references in `ledger_merkle_roots.anchor_reference` use format `airgap::<bundleId>` when not externally anchored.
4. **Projection staleness:** Projector updates `airgap.stalenessSeconds` comparing current time with `bundle.timeAnchor` per artifact scope; CLI + Console read the value to display freshness indicators.
5. **API surface:** `POST /internal/ledger/airgap-import` records bundle provenance (returns `ledgerEventId`, `chainId`, `sequence`) and persists the same metadata into `airgap_imports` for audit.
## 4. Staleness enforcement
- Config option `AirGapPolicies:FreshnessThresholdSeconds` (default 604800 = 7days) sets allowable age.

View File

@@ -12,9 +12,9 @@
| Metric | Type | Labels | Description / target |
| --- | --- | --- | --- |
| `ledger_write_latency_seconds` | Histogram | `tenant`, `event_type` | End-to-end append latency (API ingress → persisted). P95 ≤120ms. |
| `ledger_write_duration_seconds` | Histogram | `tenant`, `event_type`, `source` | End-to-end append latency (API ingress → persisted). P95 ≤120ms. |
| `ledger_events_total` | Counter | `tenant`, `event_type`, `source` (`policy`, `workflow`, `orchestrator`) | Incremented per committed event. Mirrors Merkle leaf count. |
| `ledger_ingest_backlog_events` | Gauge | `tenant` | Number of events buffered in the writer queue. Alert when >5000 for 5min. |
| `ledger_ingest_backlog_events` | Gauge | | Number of events buffered in the writer/anchor queues. Alert when >5000 for 5min. |
| `ledger_projection_lag_seconds` | Gauge | `tenant` | Wall-clock difference between latest ledger event and projection tail. Target <30s. |
| `ledger_projection_rebuild_seconds` | Histogram | `tenant` | Duration of replay/rebuild operations triggered by LEDGER-29-008 harness. |
| `ledger_projection_apply_seconds` | Histogram | `tenant`, `event_type`, `policy_version`, `evaluation_status` | Time to apply a single ledger event to projection. Target P95 <1s. |

View File

@@ -19,7 +19,8 @@
1. **Ingestion:** Cartographer/SBOM Service emit SBOM snapshots (`sbom_snapshot` events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates, policy runs attach overlay metadata.
2. **ETL:** Normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible or document + adjacency lists in Mongo).
3. **Overlay computation:** Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
4. **Diffing:** `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
4. **Diffing:** `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
5. **Analytics (Runtime & Signals 140.A):** background workers run Louvain-style clustering + degree/betweenness approximations on ingested graphs, emitting overlays per tenant/snapshot and writing cluster ids back to nodes when enabled.
## 3) APIs
@@ -44,7 +45,8 @@
## 6) Observability
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration.
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration.
- New analytics metrics: `graph_analytics_runs_total`, `graph_analytics_failures_total`, `graph_analytics_clusters_total`, `graph_analytics_centrality_total`, plus change-stream/backfill counters (`graph_changes_total`, `graph_backfill_total`, `graph_change_failures_total`, `graph_change_lag_seconds`).
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.

View File

@@ -0,0 +1,31 @@
# Graph Indexer packaging (Runtime & Signals 140.A)
## Deployment overlays
- Helm/Compose should expose two timers for analytics: `GRAPH_ANALYTICS_CLUSTER_INTERVAL` and `GRAPH_ANALYTICS_CENTRALITY_INTERVAL` (ISO-8601 duration, default 5m). Map to `GraphAnalyticsOptions`.
- Change-stream/backfill worker toggles via `GRAPH_CHANGE_POLL_INTERVAL`, `GRAPH_BACKFILL_INTERVAL`, `GRAPH_CHANGE_MAX_RETRIES`, `GRAPH_CHANGE_RETRY_BACKOFF`.
- New Mongo collections:
- `graph_cluster_overlays` — cluster assignments (`tenant`, `snapshot_id`, `node_id`, `cluster_id`, `generated_at`).
- `graph_centrality_overlays` — degree + betweenness approximations per node.
- optional node updates write `attributes.cluster_id` when `WriteClusterAssignmentsToNodes=true`.
## Offline kit alignment
- Cluster/centrality overlays are exportable alongside `nodes.jsonl`/`edges.jsonl`; keep under `artifacts/graph-snapshots/{snapshotId}/overlays/` for air-gapped imports.
- Seed bundle layout:
- `clusters.ndjson` — overlay records (one per node) matching `graph_cluster_overlays` schema.
- `centrality.ndjson` — overlay records with `degree`/`betweenness`.
- `manifest.json` — references snapshot manifest hash and run timestamps.
- Determinism: overlays ordered by `node_id` (ordinal) to keep bundle hashes stable.
## Observability hooks
- Metrics (Meter `StellaOps.Graph.Indexer`):
- `graph_analytics_runs_total`, `graph_analytics_failures_total`, `graph_analytics_duration_seconds`, `graph_analytics_clusters_total`, `graph_analytics_centrality_total`.
- `graph_changes_total`, `graph_backfill_total`, `graph_change_failures_total`, `graph_change_lag_seconds`.
- Recommended alerts: lag > 5m, failures > 0 over 10m window, cluster job duration > 2m.
## Configuration defaults
- Cluster/centrality intervals: 5 minutes; label-propagation iterations: 6; betweenness sample size: 12.
- Change stream: poll every 5s, backfill every 15m, max retries 3 with 3s backoff, batch size 256.
## Notes
- Analytics writes are idempotent (upserts keyed on tenant+snapshot+node_id). Change-stream processing is idempotent via sequence tokens persisted in `IIdempotencyStore` (Mongo or in-memory for tests).
- Keep Helm/Compose values in sync with these defaults when publishing the Runtime & Signals 140.A bundle.

View File

@@ -0,0 +1,21 @@
# Link-Not-Merge v1 Fixtures
Status: Awaiting drop (2025-11-22)
Expected contents (all JSON, canonicalized, UTF-8):
- `projections.json` — canonical SBOM projection payloads keyed by snapshot ID.
- `assets.json` — asset metadata overlays (tenant-scoped, append-only).
- `paths.json` — ordered dependency paths with runtime flags and blast-radius hints.
- `events.json``sbom.version.created` envelopes aligned to CAS/provenance fields.
- `schema-version.txt` — git SHA / semantic version of the frozen projection schema.
- `SHA256SUMS` — checksums for all files above.
Drop instructions:
- Place files in this directory and update `SHA256SUMS` via `sha256sum *.json *.txt > SHA256SUMS`.
- Keep ordering stable; prefer NDJSON converted to JSON arrays only if deterministic sorting is applied.
- Record drop commit in sprint 0140/0142 Execution Logs and link here.
Consumers:
- SBOM-SERVICE-21-001..004 implementation and tests.
- Advisory AI and Console replay suites.
- AirGap parity review (`docs/modules/sbomservice/runbooks/airgap-parity-review.md`).

View File

@@ -0,0 +1,31 @@
# SBOM Service Prep — PREP-SBOM-SERVICE-GUILD-CARTOGRAPHER-GUILD-OB
Status: Published (2025-11-22)
Owners: SBOM Service Guild · Cartographer Guild · Observability Guild · Zastava Observer/Webhook Guilds · Security Guild
Scope: Capture a single readiness note for Runtime & Signals wave (0140) so SBOM-SERVICE-21-001..004 and SBOM-AIAI-31-001/002 can start once fixtures and AirGap approvals land.
## Current inputs (as of 2025-11-22)
- Link-Not-Merge v1 projection schema frozen on 2025-11-17 (per Sprint 0140 decisions); JSON fixtures have not been published.
- Mock surface bundle v1 exists; real scanner cache ETA is still outstanding, so Graph/Zastava cannot validate parity yet.
- CAS/provenance decisions are tracked under `docs/signals/cas-promotion-24-002.md` and `docs/signals/provenance-24-003.md`; SBOM events must align with these provenance fields.
## Outstanding blockers to flip SBOM wave to DOING
- Publish LNM v1 JSON fixtures with hash list to `docs/modules/sbomservice/fixtures/lnm-v1/` plus `SHA256SUMS`. Owners: Concelier Core · Cartographer Guild.
- Run AirGap parity review for `/sbom/paths`, `/sbom/versions`, and `/sbom/events`; template and minutes location published at `docs/modules/sbomservice/runbooks/airgap-parity-review.md`. Owner: Observability Guild with SBOM Service Guild.
- Confirm scanner cache drop timeline and hash for the real surface cache; mirror in sprint 0140 tracker once published. Owner: Scanner Guild.
## Ready-to-start checklist for SBOM-SERVICE-21-001..004
- Verify fixtures landed at the path above and match the frozen field list; add deterministic fixture IDs to tests.
- Emit projection change events with schema version and fixture set hash; expose counters and optional OTEL traces behind config.
- Provide observability baselines (dashboards/alerts) for path/timeline endpoints with latency and error-rate SLOs.
- Document tenant scoping and add-only evolution in API reference before exposing to Console and Advisory AI consumers.
## Evidence
- This prep note: `docs/modules/sbomservice/prep/2025-11-22-prep-sbom-service-guild-cartographer-ob.md`.
- Blocker detail mirrored in `docs/implplan/SPRINT_0140_0001_0001_runtime_signals.md` Delivery Tracker and Decisions & Risks.
## Exit criteria
- LNM v1 fixtures and AirGap review minutes committed and linked in sprints 0140 and 0142.
- Sprint 0140 SBOM wave can move from BLOCKED to DOING with cache ETA recorded.

View File

@@ -0,0 +1,31 @@
# AirGap Parity Review — SBOM Service runtime/signals (Sprint 0140/0142)
Status: Template published (2025-11-22)
Owners: Observability Guild · SBOM Service Guild · Cartographer Guild · Runtime & Signals coordination (0140) · Concelier Core (schema fidelity)
## Purpose
Document a repeatable AirGap parity review for `/sbom/paths`, `/sbom/versions`, and SBOM event streams so SBOM-SERVICE-21-001..004 can move from BLOCKED to DOING once fixtures land.
## Prerequisites
- Link-Not-Merge v1 fixtures available under `docs/modules/sbomservice/fixtures/lnm-v1/` with `SHA256SUMS`.
- Projection schema frozen (record SHA/commit).
- Mock surface bundle hash and real scanner cache ETA published in sprint 0140 tracker.
- CAS/provenance appendices (signals) frozen: `docs/signals/cas-promotion-24-002.md`, `docs/signals/provenance-24-003.md`.
- Test environment with offline toggle enabled; mirrored packages only.
## Checklist
- Verify fixture integrity: run `sha256sum -c SHA256SUMS` in `fixtures/lnm-v1`.
- Replay fixtures in offline mode; capture latency/p95/p99 for `/sbom/paths` and `/sbom/versions` with deterministic seeds.
- Confirm tenant scoping and add-only evolution (no in-place updates) using two-tenant replay script.
- Validate event envelopes (`sbom.version.created`) against CAS/provenance requirements; ensure DSSE fields present or `skip_reason: offline`.
- Check orchestrator backpressure behavior with AirGap throttling; record SLO thresholds.
- Capture logs/traces snapshots (if enabled) and redact secrets before attaching.
## Outputs
- Minutes + decisions appended to this file (Execution Notes section) with timestamps and owners.
- Metrics table with p50/p95/p99 latency, error rate, and cache hit ratio.
- Actions list with owners and due dates; blockers mirrored to sprint 0140/0142 Decisions & Risks.
## Execution Notes
- 2025-11-22: Template published; awaiting fixtures and review scheduling.

View File

@@ -9,29 +9,30 @@ This document specifies how the Deno analyzer will generate `deno-runtime.ndjson
## Approach
1) **Shim loader**
- Entry file `trace-shim.ts` injected ahead of user entrypoint (via `--import-map` or `--unstable-preload-module`).
- Entry file `trace-shim.ts` is written alongside the analyzer and executed via `deno run --cached-only --allow-read --allow-env --quiet trace-shim.ts` with `STELLA_DENO_ENTRYPOINT` set to the target module.
- Registers listeners:
- `Deno.permissions.query/deny/permit` wrappers to observe grants.
- `globalThis.__originalImport = WebAssembly.instantiateStreaming` to observe wasm loads (fallback to buffer) and record importer URL.
- Wraps dynamic import by monkeypatching `import` via `globalThis.__dynamicImport` using `createDynamicImportProxy` helper (supported in Deno 1.42+).
- Hooks `Deno[Deno.internal].moduleLoader.load` (where available) to observe resolved specifier and cache hit/miss reason; fallback to `performance.resourceTimingBuffer` not used.
- `Deno.permissions.request/query/revoke` wrappers to capture permission uses and maintain a granted-permission snapshot (normalized to fs/net/env/ffi/process/worker).
- Hooks `Deno[Deno.internal].moduleLoader.load` when available to observe module loads (static/dynamic/npm) before execution.
- Wraps `WebAssembly.instantiate` / `instantiateStreaming` to record wasm loads.
- Wraps `Deno.dlopen` to record FFI permission use.
- Uses a synchronous SHA-256 implementation (no WebCrypto) to hash normalized module paths for determinism/offline safety.
2) **Event buffering**
- Collects events in-memory; each event includes UTC timestamp and relative path (computed against analyzer root) plus `path_sha256`.
- Origin normalization: for remote specifiers, strip query/fragment; record registry host/version if npm.
3) **Execution**
- Analyzer runs `deno run --allow-read --allow-env --no-lock --no-npm --quiet --import-map trace-import-map.json trace-shim.ts <user-entry>`.
- Optional: respect `DENO_DIR` from workspace normalization; no network fetch allowed (set `--cached-only`).
- Analyzer/worker runs `deno run --cached-only --allow-read --allow-env --quiet trace-shim.ts` with `STELLA_DENO_ENTRYPOINT=<entry>` (absolute or cwd-relative) and optional `STELLA_DENO_BINARY` override.
- Respects `DENO_DIR` if present for npm cache resolution; still offline (`--cached-only`).
4) **Output**
- After user code exits, shim writes buffered events as NDJSON sorted by timestamp then type to `<root>/deno-runtime.ndjson`.
- Also prints SHA256 to stdout for diagnostics; Analyzer reads file and stores payload in AnalysisStore + signals.
- Analyzer ingests the NDJSON, hashes content, stores payload in AnalysisStore under `ScanAnalysisKeys.DenoRuntimePayload` (legacy alias `"deno.runtime"` kept for backward compatibility), and emits policy signals keyed `surface.lang.deno.*`.
5) **Determinism & safety**
- Timestamps: `Date.now()` captured and converted to ISO-8601 UTC.
- Paths: use analyzer root + `path.relative` + forward slashes; hash with SHA256(lowercase hex).
- No module source or env values persisted; only paths + hashes.
- Timestamps: `Date.now()` captured and converted to ISO-8601 UTC; events sorted by ts then type.
- Paths: resolved to analyzer-relative form, forward-slash normalized, hashed with built-in synchronous SHA-256 (lowercase hex); remote origins normalized to protocol//host/path.
- No module source or env values persisted; only paths + hashes; npm resolutions recorded as cache hits only.
## Validation plan
- Add fixtures: simple import graph, dynamic import, wasm load, npm: chalk (cached), permission use via `Deno.permissions.request`.