Add PHP Analyzer Plugin and Composer Lock Data Handling
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled

- Implemented the PhpAnalyzerPlugin to analyze PHP projects.
- Created ComposerLockData class to represent data from composer.lock files.
- Developed ComposerLockReader to load and parse composer.lock files asynchronously.
- Introduced ComposerPackage class to encapsulate package details.
- Added PhpPackage class to represent PHP packages with metadata and evidence.
- Implemented PhpPackageCollector to gather packages from ComposerLockData.
- Created PhpLanguageAnalyzer to perform analysis and emit results.
- Added capability signals for known PHP frameworks and CMS.
- Developed unit tests for the PHP language analyzer and its components.
- Included sample composer.lock and expected output for testing.
- Updated project files for the new PHP analyzer library and tests.
This commit is contained in:
StellaOps Bot
2025-11-22 14:02:49 +02:00
parent a7f3c7869a
commit b6b9ffc050
158 changed files with 16272 additions and 809 deletions

View File

@@ -19,7 +19,8 @@
1. **Ingestion:** Cartographer/SBOM Service emit SBOM snapshots (`sbom_snapshot` events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates, policy runs attach overlay metadata.
2. **ETL:** Normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible or document + adjacency lists in Mongo).
3. **Overlay computation:** Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
4. **Diffing:** `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
4. **Diffing:** `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
5. **Analytics (Runtime & Signals 140.A):** background workers run Louvain-style clustering + degree/betweenness approximations on ingested graphs, emitting overlays per tenant/snapshot and writing cluster ids back to nodes when enabled.
## 3) APIs
@@ -44,7 +45,8 @@
## 6) Observability
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration.
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration.
- New analytics metrics: `graph_analytics_runs_total`, `graph_analytics_failures_total`, `graph_analytics_clusters_total`, `graph_analytics_centrality_total`, plus change-stream/backfill counters (`graph_changes_total`, `graph_backfill_total`, `graph_change_failures_total`, `graph_change_lag_seconds`).
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.

View File

@@ -0,0 +1,31 @@
# Graph Indexer packaging (Runtime & Signals 140.A)
## Deployment overlays
- Helm/Compose should expose two timers for analytics: `GRAPH_ANALYTICS_CLUSTER_INTERVAL` and `GRAPH_ANALYTICS_CENTRALITY_INTERVAL` (ISO-8601 duration, default 5m). Map to `GraphAnalyticsOptions`.
- Change-stream/backfill worker toggles via `GRAPH_CHANGE_POLL_INTERVAL`, `GRAPH_BACKFILL_INTERVAL`, `GRAPH_CHANGE_MAX_RETRIES`, `GRAPH_CHANGE_RETRY_BACKOFF`.
- New Mongo collections:
- `graph_cluster_overlays` — cluster assignments (`tenant`, `snapshot_id`, `node_id`, `cluster_id`, `generated_at`).
- `graph_centrality_overlays` — degree + betweenness approximations per node.
- optional node updates write `attributes.cluster_id` when `WriteClusterAssignmentsToNodes=true`.
## Offline kit alignment
- Cluster/centrality overlays are exportable alongside `nodes.jsonl`/`edges.jsonl`; keep under `artifacts/graph-snapshots/{snapshotId}/overlays/` for air-gapped imports.
- Seed bundle layout:
- `clusters.ndjson` — overlay records (one per node) matching `graph_cluster_overlays` schema.
- `centrality.ndjson` — overlay records with `degree`/`betweenness`.
- `manifest.json` — references snapshot manifest hash and run timestamps.
- Determinism: overlays ordered by `node_id` (ordinal) to keep bundle hashes stable.
## Observability hooks
- Metrics (Meter `StellaOps.Graph.Indexer`):
- `graph_analytics_runs_total`, `graph_analytics_failures_total`, `graph_analytics_duration_seconds`, `graph_analytics_clusters_total`, `graph_analytics_centrality_total`.
- `graph_changes_total`, `graph_backfill_total`, `graph_change_failures_total`, `graph_change_lag_seconds`.
- Recommended alerts: lag > 5m, failures > 0 over 10m window, cluster job duration > 2m.
## Configuration defaults
- Cluster/centrality intervals: 5 minutes; label-propagation iterations: 6; betweenness sample size: 12.
- Change stream: poll every 5s, backfill every 15m, max retries 3 with 3s backoff, batch size 256.
## Notes
- Analytics writes are idempotent (upserts keyed on tenant+snapshot+node_id). Change-stream processing is idempotent via sequence tokens persisted in `IIdempotencyStore` (Mongo or in-memory for tests).
- Keep Helm/Compose values in sync with these defaults when publishing the Runtime & Signals 140.A bundle.