Add Ruby language analyzer and related functionality

- Introduced global usings for Ruby analyzer.
- Implemented RubyLockData, RubyLockEntry, and RubyLockParser for handling Gemfile.lock files.
- Created RubyPackage and RubyPackageCollector to manage Ruby packages and vendor cache.
- Developed RubyAnalyzerPlugin and RubyLanguageAnalyzer for analyzing Ruby projects.
- Added tests for Ruby language analyzer with sample Gemfile.lock and expected output.
- Included necessary project files and references for the Ruby analyzer.
- Added third-party licenses for tree-sitter dependencies.
2025-11-03 01:15:43 +02:00
parent ff0eca3a51
commit bf2bf4b395
88 changed files with 6557 additions and 1568 deletions
--- a/docs/modules/advisory-ai/architecture.md
+++ b/docs/modules/advisory-ai/architecture.md
@@ -1,100 +1,115 @@
-# Advisory AI architecture
-
-> Captures the retrieval, guardrail, and inference packaging requirements defined in the Advisory AI implementation plan and related module guides.
-
-## 1) Goals
-
- Summarise advisories/VEX evidence into operator-ready briefs with citations.
- Explain conflicting statements with provenance and trust weights (using VEX Lens & Excititor data).
- Suggest remediation plans aligned with Offline Kit deployment models and scheduler follow-ups.
- Operate deterministically where possible; cache generated artefacts with digests for audit.
-
-## 2) Pipeline overview
-
-```
-                       +---------------------+
-   Concelier/VEX Lens  |  Evidence Retriever |
-   Policy Engine ----> |  (vector + keyword) | ---> Context Pack (JSON)
-   Zastava runtime     +---------------------+
-                               |
-                               v
-                        +-------------+
-                        | Prompt      |
-                        | Assembler   |
-                        +-------------+
-                               |
-                               v
-                        +-------------+
-                        | Guarded LLM |
-                        | (local/host)|
-                        +-------------+
-                               |
-                               v
-                        +-----------------+
-                        | Citation &     |
-                        | Validation      |
-                        +-----------------+
-                               |
-                               v
-                        +----------------+
-                        | Output cache   |
-                        | (hash, bundle) |
-                        +----------------+
-```
-
-## 3) Retrieval & context
-
- Hybrid search: vector embeddings (SBERT-compatible) + keyword filters for advisory IDs, PURLs, CVEs.
- Context packs include:
-  - Advisory raw excerpts with highlighted sections and source URLs.
-  - VEX statements (normalized tuples + trust metadata).
-  - Policy explain traces for the affected finding.
-  - Runtime/impact hints from Zastava (exposure, entrypoints).
-  - Export-ready remediation data (fixed versions, patches).
-
-All context references include `content_hash` and `source_id` enabling verifiable citations.
-
-## 4) Guardrails
-
- Prompt templates enforce structure: summary, conflicts, remediation, references.
- Response validator ensures:
-  - No hallucinated advisories (every fact must map to input context).
-  - Citations follow `[n]` indexing referencing actual sources.
-  - Remediation suggestions only cite policy-approved sources (fixed versions, vendor hotfixes).
- Moderation/PII filters prevent leaking secrets; responses failing validation are rejected and logged.
-
-## 5) Output persistence
-
- Cached artefacts stored in `advisory_ai_outputs` with fields:
-  - `output_hash` (sha256 of JSON response).
-  - `input_digest` (hash of context pack).
-  - `summary`, `conflicts`, `remediation`, `citations`.
-  - `generated_at`, `model_id`, `profile` (Sovereign/FIPS etc.).
-  - `signatures` (optional DSSE if run in deterministic mode).
- Offline bundle format contains `summary.md`, `citations.json`, `context_manifest.json`, `signatures/`.
-
-## 6) Profiles & sovereignty
-
- **Profiles:** `default`, `fips-local` (FIPS-compliant local model), `gost-local`, `cloud-openai` (optional, disabled by default). Each profile defines allowed models, key management, and telemetry endpoints.
- **CryptoProfile/RootPack integration:** generated artefacts can be signed using configured CryptoProfile to satisfy procurement/trust requirements.
-
-## 7) APIs
-
- `POST /v1/advisory-ai/summaries` — generate (or retrieve cached) summary for `{advisoryKey, artifactId, policyVersion}`.
- `POST /v1/advisory-ai/conflicts` — explain conflicting VEX statements with trust ranking.
- `POST /v1/advisory-ai/remediation` — fetch remediation plan with target fix versions, prerequisites, verification steps.
- `GET /v1/advisory-ai/outputs/{hash}` — retrieve cached artefact (used by CLI/Console/Export Center).
-
-All endpoints accept `profile` parameter (default `fips-local`) and return `output_hash`, `input_digest`, and `citations` for verification.
-
-## 8) Observability
-
- Metrics: `advisory_ai_requests_total{profile,type}`, `advisory_ai_latency_seconds`, `advisory_ai_validation_failures_total`.
- Logs: include `output_hash`, `input_digest`, `profile`, `model_id`, `tenant`, `artifacts`. Sensitive context is not logged.
- Traces: spans for retrieval, prompt assembly, model inference, validation, cache write.
-
-## 9) Operational controls
-
- Feature flags per tenant (`ai.summary.enabled`, `ai.remediation.enabled`).
- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.
+# Advisory AI architecture
+
+> Captures the retrieval, guardrail, and inference packaging requirements defined in the Advisory AI implementation plan and related module guides.
+
+## 1) Goals
+
+- Summarise advisories/VEX evidence into operator-ready briefs with citations.
+- Explain conflicting statements with provenance and trust weights (using VEX Lens & Excititor data).
+- Suggest remediation plans aligned with Offline Kit deployment models and scheduler follow-ups.
+- Operate deterministically where possible; cache generated artefacts with digests for audit.
+
+## 2) Pipeline overview
+
+```
+                       +---------------------+
+   Concelier/VEX Lens  |  Evidence Retriever |
+   Policy Engine ----> |  (vector + keyword) | ---> Context Pack (JSON)
+   Zastava runtime     +---------------------+
+                               |
+                               v
+                        +-------------+
+                        | Prompt      |
+                        | Assembler   |
+                        +-------------+
+                               |
+                               v
+                        +-------------+
+                        | Guarded LLM |
+                        | (local/host)|
+                        +-------------+
+                               |
+                               v
+                        +-----------------+
+                        | Citation &      |
+                        | Validation      |
+                        +-----------------+
+                               |
+                               v
+                        +----------------+
+                        | Output cache   |
+                        | (hash, bundle) |
+                        +----------------+
+```
+
+## 3) Retrieval & context
+
+- Hybrid search: vector embeddings (SBERT-compatible) + keyword filters for advisory IDs, PURLs, CVEs.
+- Context packs include:
+  - Advisory raw excerpts with highlighted sections and source URLs.
+  - VEX statements (normalized tuples + trust metadata).
+  - Policy explain traces for the affected finding.
+  - Runtime/impact hints from Zastava (exposure, entrypoints).
+  - Export-ready remediation data (fixed versions, patches).
+- **SBOM context retriever** (AIAI-31-002) hydrates:
+  - Version timelines (first/last observed, status, fix availability).
+  - Dependency paths (runtime vs build/test, deduped by coordinate chain).
+  - Tenant environment flags (prod/stage toggles) with optional blast radius summary.
+  - Service-side clamps: max 500 timeline entries, 200 dependency paths, with client-provided toggles for env/blast data.
+
+Retriever requests and results are trimmed/normalized before hashing; metadata (counts, provenance keys) is returned for downstream guardrails. Unit coverage ensures deterministic ordering and flag handling.
+
+All context references include `content_hash` and `source_id` enabling verifiable citations.
+
+## 4) Guardrails
+
+- Prompt templates enforce structure: summary, conflicts, remediation, references.
+- Response validator ensures:
+  - No hallucinated advisories (every fact must map to input context).
+  - Citations follow `[n]` indexing referencing actual sources.
+  - Remediation suggestions only cite policy-approved sources (fixed versions, vendor hotfixes).
+- Moderation/PII filters prevent leaking secrets; responses failing validation are rejected and logged.
+
+## 5) Deterministic tooling
+
+- **Version comparators** — offline semantic version + RPM EVR parsers with range evaluators. Supports chained constraints (`>=`, `<=`, `!=`) used by remediation advice and blast radius calcs.
+  - Registered via `AddAdvisoryDeterministicToolset` for reuse across orchestrator, CLI, and services.
+- **Orchestration pipeline** — see `orchestration-pipeline.md` for prerequisites, task breakdown, and cross-guild responsibilities before wiring the execution flows.
+- **Planned extensions** — NEVRA/EVR comparators, ecosystem-specific normalisers, dependency chain scorers (AIAI-31-003 scope).
+- Exposed via internal interfaces to allow orchestrator/toolchain reuse; all helpers stay side-effect free and deterministic for golden testing.
+
+## 6) Output persistence
+
+- Cached artefacts stored in `advisory_ai_outputs` with fields:
+  - `output_hash` (sha256 of JSON response).
+  - `input_digest` (hash of context pack).
+  - `summary`, `conflicts`, `remediation`, `citations`.
+  - `generated_at`, `model_id`, `profile` (Sovereign/FIPS etc.).
+  - `signatures` (optional DSSE if run in deterministic mode).
+- Offline bundle format contains `summary.md`, `citations.json`, `context_manifest.json`, `signatures/`.
+
+## 7) Profiles & sovereignty
+
+- **Profiles:** `default`, `fips-local` (FIPS-compliant local model), `gost-local`, `cloud-openai` (optional, disabled by default). Each profile defines allowed models, key management, and telemetry endpoints.
+- **CryptoProfile/RootPack integration:** generated artefacts can be signed using configured CryptoProfile to satisfy procurement/trust requirements.
+
+## 8) APIs
+
+- `POST /v1/advisory-ai/summaries` — generate (or retrieve cached) summary for `{advisoryKey, artifactId, policyVersion}`.
+- `POST /v1/advisory-ai/conflicts` — explain conflicting VEX statements with trust ranking.
+- `POST /v1/advisory-ai/remediation` — fetch remediation plan with target fix versions, prerequisites, verification steps.
+- `GET /v1/advisory-ai/outputs/{hash}` — retrieve cached artefact (used by CLI/Console/Export Center).
+
+All endpoints accept a `profile` parameter (default `fips-local`) and return `output_hash`, `input_digest`, and `citations` for verification.
+
+## 9) Observability
+
+- Metrics: `advisory_ai_requests_total{profile,type}`, `advisory_ai_latency_seconds`, `advisory_ai_validation_failures_total`.
+- Logs: include `output_hash`, `input_digest`, `profile`, `model_id`, `tenant`, `artifacts`. Sensitive context is not logged.
+- Traces: spans for retrieval, prompt assembly, model inference, validation, cache write.
+
+## 10) Operational controls
+
+- Feature flags per tenant (`ai.summary.enabled`, `ai.remediation.enabled`).
+- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
+- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.
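The chained-constraint evaluation described under "Deterministic tooling" can be illustrated with a small sketch. It is not the `AddAdvisoryDeterministicToolset` implementation: the function names, the comma-separated constraint syntax, and the restriction to plain dotted numeric versions (no pre-release tags, no RPM EVR) are all assumptions made for illustration.

```python
import operator
import re
from typing import Callable, List, Tuple

Version = Tuple[int, ...]

# Two-character operators come first so ">=" is never read as ">" plus "=".
_OPS: List[Tuple[str, Callable[[Version, Version], bool]]] = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("!=", operator.ne),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text: str) -> Version:
    """Parse a plain dotted version such as '1.4.2' into a tuple of ints."""
    text = text.strip()
    if not re.fullmatch(r"\d+(\.\d+)*", text):
        raise ValueError(f"unsupported version: {text!r}")
    return tuple(int(part) for part in text.split("."))


def _pad(a: Version, b: Version) -> Tuple[Version, Version]:
    """Right-pad the shorter version with zeros so '1.2' equals '1.2.0'."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version: str, constraints: str) -> bool:
    """Return True when `version` meets every comma-separated constraint."""
    candidate = parse_version(version)
    for clause in constraints.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                left, right = _pad(candidate, bound)
                if not compare(left, right):
                    return False
                break
        else:
            # No operator matched: reject rather than guess the intent.
            raise ValueError(f"unsupported constraint: {clause!r}")
    return True
```

For example, `satisfies("1.4.2", ">=1.2.0, <=2.0.0, !=1.5.0")` holds, while the same chain rejects `1.5.0`. The function is pure and has no side effects, which is the property the section relies on for golden testing.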
--- a/docs/modules/advisory-ai/orchestration-pipeline.md
+++ b/docs/modules/advisory-ai/orchestration-pipeline.md
@@ -0,0 +1,82 @@
+# Advisory AI Orchestration Pipeline (Planning Notes)
+
+> **Status:** Draft – prerequisite design for AIAI-31-004 integration work.  
+> **Audience:** Advisory AI guild, WebService/Worker guilds, CLI guild, Docs/QA support teams.
+
+## 1. Goal
+
+Wire the deterministic pipeline (Summary / Conflict / Remediation flows) into the Advisory AI service, workers, and CLI with deterministic caching, prompt preparation, and guardrail fallback. This document captures the pre-integration checklist and task breakdown so each guild understands their responsibilities before coding begins.
+
+## 2. Prerequisites
+
+| Area | Requirement | Owner | Status |
+|------|-------------|-------|--------|
+| **Toolset** | Deterministic comparators, dependency analyzer (`IDeterministicToolset`, `AdvisoryPipelineOrchestrator`) | Advisory AI | ✅ landed (AIAI-31-003) |
+| **SBOM context** | Real SBOM context client delivering timelines + dependency paths | SBOM Service Guild | ⏳ pending (AIAI-31-002) |
+| **Prompt artifacts** | Liquid/Handlebars prompt templates for summary/conflict/remediation | Advisory AI Docs Guild | ⏳ authoring needed |
+| **Cache strategy** | Decision on DSSE or hash-only cache entries, TTLs, and eviction policy | Advisory AI + Platform | 🔲 define |
+| **Auth scopes** | Confirm service account scopes for new API endpoints/worker-to-service calls | Authority Guild | 🔲 define |
+
+**Blocking risk:** SBOM client and prompt templates must exist (even stubbed) before the orchestrator can produce stable plans.
+
+## 3. Integration plan (high-level)
+
+1. **Service layer (WebService / Worker)**
+   - Inject `IAdvisoryPipelineOrchestrator` via `AddAdvisoryPipeline`.
+   - Define REST endpoint `POST /v1/advisories/{key}/pipeline/{task}` (task ∈ summary/conflict/remediation).
+   - Worker consumes queue messages (`advisory.pipeline.execute`) -> fetches plan -> executes prompt -> persists output & provenance.
+   - Add metrics: `advisory_pipeline_requests_total`, `advisory_pipeline_plan_cache_hits_total`, `advisory_pipeline_latency_seconds`.
+2. **CLI**
+   - New command `stella advise run <task>` with flags for artifact id, profile, policy version, `--force-refresh`.
+   - Render JSON/Markdown outputs; handle caching hints (print cache key, status).
+3. **Caching / storage**
+   - Choose storage (Mongo collection vs existing DSSE output store).  
+   - Persist `AdvisoryTaskPlan` metadata + generated output keyed by cache key + policy version.
+   - Expose TTL/force-refresh semantics.
+4. **Docs & QA**
+   - Publish API spec (`docs/advisory-ai/api.md`) + CLI docs.
+   - Add golden outputs for deterministic runs; property tests for cache key stability.
+
+## 4. Task Breakdown
+
+### AIAI-31-004A (Service orchestration wiring)
+
+- **Scope:** WebService/Worker injection, REST/queue plumbing, metrics counters, basic cache stub.
+- **Dependencies:** `AddAdvisoryPipeline`, SBOM client stub.
+- **Exit:** API responds with plan metadata + queue message; worker logs execution attempt; metrics emitted.
+
+### AIAI-31-004B (Prompt assembly & cache persistence)
+
+- **Scope:** Implement prompt assembler, connect to guardrails, persist cache entries w/ DSSE metadata.
+- **Dependencies:** Prompt templates, cache storage decision, guardrail interface.
+- **Exit:** Deterministic outputs stored; force-refresh honoured; tests cover prompt assembly + caching.
+
+### AIAI-31-004C (CLI integration & docs)
+
+- **Scope:** CLI command + output renderer, docs updates, CLI tests (golden outputs).
+- **Dependencies:** Service endpoints stable, caching semantics documented.
+- **Exit:** CLI command produces deterministic output, docs updated, smoke tests recorded.
+
+### Supporting tasks (other guilds)
+
+- **AUTH-AIAI-31-004** – Update scopes and DSSE policy (Authority guild).
+- **DOCS-AIAI-31-003** – Publish API documentation, CLI guide updates (Docs guild).
+- **QA-AIAI-31-004** – Golden, property, and performance test suite for the pipeline (QA guild).
+
+## 5. Acceptance checklist (per task)
+
+| Item | Notes |
+|------|-------|
+| Cache key stability | `AdvisoryPipelineOrchestrator` hash must remain stable under re-run of identical inputs. |
+| Metrics & logging | Request id, cache key, task type, profile, latency; guardrail results logged without sensitive prompt data. |
+| Offline readiness | All prompt templates bundled with Offline Kit; CLI works in air-gapped mode with cached data. |
+| Policy awareness | Plans encode policy version used; outputs reference policy digest for audit. |
+| Testing | Unit tests (plan generation, cache keys, DI), integration (service endpoint, worker, CLI), deterministic golden outputs. |
+
+## 6. Next steps
+
+1. Finalize SBOM context client (AIAI-31-002) and prompt templates.
+2. Create queue schema spec (`docs/modules/advisory-ai/queue-contracts.md`) if not already available.
+3. Schedule cross-guild kickoff to agree on cache store & DSSE policy.
+
+_Last updated: 2025-11-02_
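The "cache key stability" item in the acceptance checklist can be made concrete with a sketch of one way to derive a stable key. This is an assumption-laden illustration, not the `AdvisoryPipelineOrchestrator` hash: the field names and the choice of canonical JSON plus SHA-256 are hypothetical.

```python
import hashlib
import json
from typing import Mapping

# Hypothetical request fields; the real plan inputs may differ.
REQUIRED_FIELDS = ("task", "advisory_key", "artifact_id", "policy_version", "profile")


def cache_key(request: Mapping[str, str]) -> str:
    """Derive a hex SHA-256 key that depends only on the field values."""
    missing = [field for field in REQUIRED_FIELDS if field not in request]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Trim values and ignore unknown fields so incidental differences
    # (whitespace, extra metadata) do not change the key.
    normalised = {field: str(request[field]).strip() for field in REQUIRED_FIELDS}
    # Sorted keys and fixed separators give one canonical byte sequence.
    canonical = json.dumps(normalised, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the policy version is part of the hashed payload, a policy change yields a new key without any explicit invalidation step, while re-running identical inputs always reproduces the same key.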
--- a/docs/modules/issuer-directory/architecture.md
+++ b/docs/modules/issuer-directory/architecture.md
@@ -90,6 +90,11 @@ Payloads follow the contract in `Contracts/IssuerDtos.cs` and align with domain
 3. **SDK integration (ISSUER-30-004)** — supply cached issuer metadata to VEX Lens and Excititor clients.
 4. **Observability & Ops (ISSUER-30-005/006)** — metrics, dashboards, deployment automation, offline kit.

+## 9. Operations & runbooks
+- [Deployment guide](operations/deployment.md)
+- [Backup & restore](operations/backup-restore.md)
+- [Offline kit notes](operations/offline-kit.md)
+
 ---

 *Document owner: Issuer Directory Guild*
--- a/docs/modules/issuer-directory/operations/backup-restore.md
+++ b/docs/modules/issuer-directory/operations/backup-restore.md
@@ -0,0 +1,103 @@
+# Issuer Directory Backup & Restore
+
+## Scope
+- **Applies to:** Issuer Directory when deployed via Docker Compose (`deploy/compose/docker-compose.*.yaml`) or the Helm chart (`deploy/helm/stellaops`).
+- **Artifacts covered:** MongoDB database `issuer-directory`, service configuration (`etc/issuer-directory.yaml`), CSAF seed file (`data/csaf-publishers.json`), and secret material for the Mongo connection string.
+- **Frequency:** Take a hot backup before every upgrade and at least daily in production. Keep encrypted copies off-site/air-gapped according to your compliance program.
+
+## Inventory checklist
+| Component | Location (Compose default) | Notes |
+| --- | --- | --- |
+| Mongo data | `mongo-data` volume (`/var/lib/docker/volumes/.../mongo-data`) | Contains `issuers`, `issuer_keys`, `issuer_trust_overrides`, and `issuer_audit` collections. |
+| Configuration | `etc/issuer-directory.yaml` | Mounted read-only at `/etc/issuer-directory.yaml` inside the container. |
+| CSAF seed file | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` | Ensure customised seeds are part of the backup; regenerate if you ship regional overrides. |
+| Mongo secret | `.env` entry `ISSUER_DIRECTORY_MONGO_CONNECTION_STRING` or secret store export | Required to restore connectivity; treat as sensitive. |
+
+> **Tip:** Export the secret via `kubectl get secret issuer-directory-secrets -o yaml` (sanitize before storage) or copy the Compose `.env` file into an encrypted vault.
+
+## Hot backup (no downtime)
+1. **Create output directory**
+   ```bash
+   BACKUP_DIR=backup/issuer-directory/$(date +%Y-%m-%dT%H%M%S)
+   mkdir -p "$BACKUP_DIR"
+   ```
+2. **Dump Mongo collections**
+   ```bash
+   # Compute the archive name once so the dump and the copy refer to the same file.
+   ARCHIVE="issuer-directory-$(date -u +%Y%m%dT%H%M%SZ).gz"
+
+   docker compose -f deploy/compose/docker-compose.prod.yaml exec mongo \
+     mongodump --archive="/dump/$ARCHIVE" \
+     --gzip --db issuer-directory
+
+   docker compose -f deploy/compose/docker-compose.prod.yaml cp \
+     "mongo:/dump/$ARCHIVE" "$BACKUP_DIR/"
+   ```
+   For Kubernetes, run the same `mongodump` command inside the `stellaops-mongo` pod and copy the archive via `kubectl cp`.
+3. **Capture configuration and seeds**
+   ```bash
+   cp etc/issuer-directory.yaml "$BACKUP_DIR/"
+   cp src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json "$BACKUP_DIR/"
+   ```
+4. **Capture secrets**
+   ```bash
+   grep '^ISSUER_DIRECTORY_MONGO_CONNECTION_STRING=' dev.env > "$BACKUP_DIR/issuer-directory.mongo.secret"
+   chmod 600 "$BACKUP_DIR/issuer-directory.mongo.secret"
+   ```
+5. **Generate checksums and encrypt**
+   ```bash
+   (cd "$BACKUP_DIR" && sha256sum * > SHA256SUMS)
+   tar czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_DIR" .
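+   # AGE_RECIPIENT is a placeholder for an age public key (age1...) or SSH public key; age does not accept email addresses.
+   age -r "$AGE_RECIPIENT" "$BACKUP_DIR.tar.gz" > "$BACKUP_DIR.tar.gz.age"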
+   ```
+
+## Cold backup (planned downtime)
+1. Notify stakeholders and pause automation calling the API.
+2. Stop services:
+   ```bash
+   docker compose -f deploy/compose/docker-compose.prod.yaml down issuer-directory
+   ```
+   (For Helm: `kubectl scale deploy stellaops-issuer-directory --replicas=0`.)
+3. Snapshot volumes:
+   ```bash
+   docker run --rm -v mongo-data:/data \
+     -v "$(pwd)":/backup busybox tar czf /backup/mongo-data-$(date +%Y%m%d).tar.gz -C /data .
+   ```
+4. Copy configuration, seeds, and secrets as in the hot backup.
+5. Restart services and confirm `/health/live` returns `200 OK`.
+
+## Restore procedure
+1. **Provision clean volumes**
+   - Compose: `docker volume rm mongo-data` (optional) then `docker compose up -d mongo`.
+   - Helm: delete the Mongo PVC or attach a fresh volume snapshot.
+2. **Restore Mongo**
+   ```bash
+   docker compose exec -T mongo \
+     mongorestore --archive \
+     --gzip --drop < issuer-directory-YYYYMMDDTHHMMSSZ.gz
+   ```
+3. **Restore configuration/secrets**
+   - Copy `issuer-directory.yaml` into `etc/`.
+   - Reapply the secret: `kubectl apply -f issuer-directory-secret.yaml` or repopulate `.env`.
+4. **Restore CSAF seeds** (optional)
+   - If you maintain a customised seed file, copy it back before starting the container. Otherwise the bundled file will be used.
+5. **Start services**
+   ```bash
+   docker compose up -d issuer-directory
+   # or
+   kubectl scale deploy stellaops-issuer-directory --replicas=1
+   ```
+6. **Validate**
+   - `curl -fsSL https://localhost:8447/health/live`
+   - Issue an access token and list issuers to confirm results.
+   - Check Mongo counts match expectations (`db.issuers.countDocuments({})`, etc.).
+
+## Disaster recovery notes
+- **Retention:** Maintain 30 daily + 12 monthly archives. Store copies in geographically separate, access-controlled vaults.
+- **Audit reconciliation:** Ensure `issuer_audit` entries cover the restore window; export them for compliance.
+- **Seed replay:** If the CSAF seed file was lost, set `ISSUER_DIRECTORY_SEED_CSAF=true` for the first restart to rehydrate the global tenant.
+- **Testing:** Run quarterly restore drills in a staging environment to catch drift between this procedure and the deployed stack.
+
+## Verification checklist
+- [ ] `/health/live` returns `200 OK`.
+- [ ] Mongo collections (`issuers`, `issuer_keys`, `issuer_trust_overrides`) have expected counts.
+- [ ] `issuer_directory_changes_total` and `issuer_directory_key_operations_total` metrics resume within 1 minute.
+- [ ] Audit entries exist for post-restore CRUD activity.
+- [ ] Client integrations (VEX Lens, Excititor) resolve issuers successfully.
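The hot backup writes a `SHA256SUMS` file, but the restore procedure does not say how to use it. A minimal sketch of a pre-restore check follows, assuming the archive has already been decrypted and extracted into a directory. The `verify_checksums` helper is hypothetical and only handles the plain `sha256sum` line format (digest, one space, a mode character, file name).

```python
import hashlib
from pathlib import Path
from typing import List


def _sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large dump archives fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(backup_dir: str, manifest: str = "SHA256SUMS") -> List[str]:
    """Return the names of files that are missing or whose digest differs."""
    root = Path(backup_dir)
    failures: List[str] = []
    for line in (root / manifest).read_text().splitlines():
        if not line.strip():
            continue
        expected, _, rest = line.partition(" ")
        name = rest[1:]  # skip the mode character (" " text, "*" binary)
        target = root / name
        if not target.is_file() or _sha256(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty list means every file listed in the manifest is present and unmodified; any other result is a reason to stop before running `mongorestore`.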
--- a/docs/modules/issuer-directory/operations/deployment.md
+++ b/docs/modules/issuer-directory/operations/deployment.md
@@ -0,0 +1,100 @@
+# Issuer Directory Deployment Guide
+
+## Scope
+- **Applies to:** Issuer Directory WebService (`stellaops/issuer-directory-web`) running via the provided Docker Compose bundles (`deploy/compose/docker-compose.*.yaml`) or the Helm chart (`deploy/helm/stellaops`).
+- **Covers:** Environment prerequisites, secret handling, Compose + Helm rollout steps, and post-deploy verification.
+- **Audience:** Platform/DevOps engineers responsible for Identity & Signing sprint deliverables.
+
+## 1 · Prerequisites
+- Authority must be running and reachable at the issuer URL you configure (default Compose host: `https://authority:8440`).
+- MongoDB 4.2+ with credentials for the `issuer-directory` database (Compose defaults to the root user defined in `.env`).
+- Network access to Authority, MongoDB, and (optionally) Prometheus if you scrape metrics.
+- Issuer Directory configuration file `etc/issuer-directory.yaml` checked and customised for your environment (tenant header, audiences, telemetry level, CSAF seed path).
+
+> **Secrets:** Use `etc/secrets/issuer-directory.mongo.secret.example` as a template. Store the real connection string in an untracked file or secrets manager and reference it via environment variables (`ISSUER_DIRECTORY_MONGO_CONNECTION_STRING`) rather than committing credentials.
+
+## 2 · Deploy with Docker Compose
+1. **Prepare environment variables**
+   ```bash
+   cp deploy/compose/env/dev.env.example dev.env
+   cp etc/secrets/issuer-directory.mongo.secret.example issuer-directory.mongo.env
+   # Edit dev.env and issuer-directory.mongo.env with production-ready secrets.
+   ```
+
+2. **Inspect the merged configuration**
+   ```bash
+   docker compose \
+     --env-file dev.env \
+     --env-file issuer-directory.mongo.env \
+     -f deploy/compose/docker-compose.dev.yaml config
+   ```
+   The command confirms the new `issuer-directory` service resolves the port (`${ISSUER_DIRECTORY_PORT:-8447}`) and the Mongo connection string is in place.
+
+3. **Launch the stack**
+   ```bash
+   docker compose \
+     --env-file dev.env \
+     --env-file issuer-directory.mongo.env \
+     -f deploy/compose/docker-compose.dev.yaml up -d issuer-directory
+   ```
+   Compose automatically mounts `../../etc/issuer-directory.yaml` into the container at `/etc/issuer-directory.yaml`, seeds CSAF publishers, and exposes the API on `https://localhost:8447`.
+
+4. **Smoke test**
+   ```bash
+   curl -k https://localhost:8447/health/live
+   stellaops-cli issuer-directory issuers list \
+     --base-url https://localhost:8447 \
+     --tenant demo \
+     --access-token "$(stellaops-cli auth token issue --scope issuer-directory:read)"
+   ```
+
+5. **Upgrade & rollback**
+   - Update Compose images to the desired release manifest (`deploy/releases/*.yaml`), re-run `docker compose config`, then `docker compose up -d`.
+   - Rollbacks follow the same steps with the previous manifest. Mongo collections are backwards compatible within `2025.10.x`.
+
+## 3 · Deploy with Helm
+1. **Create or update the secret**
+   ```bash
+   kubectl create secret generic issuer-directory-secrets \
+     --from-literal=ISSUERDIRECTORY__MONGO__CONNECTIONSTRING='mongodb://stellaops:<password>@stellaops-mongo:27017' \
+     --dry-run=client -o yaml | kubectl apply -f -
+   ```
+   Add optional overrides (e.g. `ISSUERDIRECTORY__AUTHORITY__ISSUER`) if your Authority issuer differs from the default.
+
+2. **Template for validation**
+   ```bash
+   helm template issuer-directory deploy/helm/stellaops \
+     -f deploy/helm/stellaops/values-prod.yaml \
+     --set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org \
+     > /tmp/issuer-directory.yaml
+   ```
+
+3. **Install / upgrade**
+   ```bash
+   helm upgrade --install stellaops deploy/helm/stellaops \
+     -f deploy/helm/stellaops/values-prod.yaml \
+     --set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org
+   ```
+   The chart provisions:
+   - ConfigMap `stellaops-issuer-directory-config` with `IssuerDirectory` settings.
+   - Deployment `stellaops-issuer-directory` with readiness/liveness probes on `/health/live`.
+   - Service on port `8080` (ClusterIP by default).
+
+4. **Expose for operators (optional)**
+   - Use an Ingress/HTTPRoute to publish `https://issuer-directory.<env>.stella-ops.org`.
+   - If requests are proxied through an API gateway, ensure it forwards the DPoP headers to the upstream service.
+
+5. **Post-deploy validation**
+   ```bash
+   kubectl exec deploy/stellaops-issuer-directory -- \
+     curl -sf http://127.0.0.1:8080/health/live
+   kubectl logs deploy/stellaops-issuer-directory | grep 'IssuerDirectory Mongo connected'
+   ```
+   Prometheus should begin scraping `issuer_directory_changes_total` and related metrics (labels: `tenant`, `issuer`, `action`).
+
+## 4 · Operational checklist
+- **Secrets:** Connection strings live in `issuer-directory-secrets` (Helm) or an `.env` file stored in your secrets vault (Compose). Rotate credentials via secret update + pod restart.
+- **Audit streams:** Confirm `issuer_directory_audit` collection receives entries when CRUD operations run; export logs for compliance.
+- **Tenants:** The service enforces the `X-StellaOps-Tenant` header. For multi-tenant staging, configure the reverse proxy to inject the correct tenant or issue scoped tokens.
+- **CSAF seeds:** `ISSUER_DIRECTORY_SEED_CSAF=true` replays `data/csaf-publishers.json` on startup. Set to `false` once production tenants are fully managed, or override `csafSeedPath` with a curated bundle.
+- **Release alignment:** Before promotion, run `deploy/tools/validate-profiles.sh` to lint Compose/Helm bundles, then verify the new `issuer-directory-web` entry in `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) matches the channel you intend to ship.
--- a/docs/modules/issuer-directory/operations/offline-kit.md
+++ b/docs/modules/issuer-directory/operations/offline-kit.md
@@ -0,0 +1,71 @@
+# Issuer Directory Offline Kit Notes
+
+## Purpose
+Operators bundling Stella Ops for fully disconnected environments must include the Issuer Directory service so VEX Lens, Excititor, and Policy Engine can resolve trusted issuers without reaching external registries.
+
+## 1 · Bundle contents
+Include the following artefacts in your Offline Update Kit staging tree:
+
+| Path (within kit) | Source | Notes |
+| --- | --- | --- |
+| `images/issuer-directory-web.tar` | `registry.stella-ops.org/stellaops/issuer-directory-web` (digest from `deploy/releases/<channel>.yaml`) | Export with `crane pull --format=tarball` or `skopeo copy docker://... docker-archive:...` so the archive can be loaded with `docker load`. |
+| `config/issuer-directory/issuer-directory.yaml` | `etc/issuer-directory.yaml` (customised) | Replace Authority issuer, tenant header, and log level as required. |
+| `config/issuer-directory/csaf-publishers.json` | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` or regional override | Operators can edit before import to add private publishers. |
+| `secrets/issuer-directory/connection.env` | Secure secret store export (`ISSUER_DIRECTORY_MONGO_CONNECTION_STRING=`) | Encrypt at rest; Offline Kit importer places it in the Compose/Helm secret. |
+| `docs/issuer-directory/deployment.md` | `docs/modules/issuer-directory/operations/deployment.md` | Ship alongside kit documentation for operators. |
+
+> **Image digests:** Update `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) with the exact digest before building the kit so `offline-manifest.json` can assert integrity.
+
+## 2 · Compose (air-gapped) deployment
+1. Load images locally on the target:
+   ```bash
+   docker load < images/issuer-directory-web.tar
+   ```
+2. Copy Compose artefacts:
+   ```bash
+   cp deploy/compose/docker-compose.airgap.yaml .
+   cp deploy/compose/env/airgap.env.example airgap.env
+   cp secrets/issuer-directory/connection.env issuer-directory.mongo.env
+   ```
+3. Update `airgap.env` with site-specific values (Authority issuer, tenant, ports) and remove outbound endpoints.
+4. Bring up the service:
+   ```bash
+   docker compose \
+     --env-file airgap.env \
+     --env-file issuer-directory.mongo.env \
+     -f docker-compose.airgap.yaml up -d issuer-directory
+   ```
+5. Verify via `curl -k https://issuer-directory.airgap.local:8447/health/live`.
+
+## 3 · Kubernetes (air-gapped) deployment
+1. Pre-load the OCI image into your local registry mirror and update `values-airgap.yaml` to reference it.
+2. Apply the secret bundled in the kit:
+   ```bash
+   kubectl apply -f secrets/issuer-directory/connection-secret.yaml
+   ```
+   (This manifest is not listed in the staging table above; generate it during packaging with `kubectl create secret generic issuer-directory-secrets ... --dry-run=client -o yaml` and add it to the kit.)
+3. Install/upgrade the chart:
+   ```bash
+   helm upgrade --install stellaops deploy/helm/stellaops \
+     -f deploy/helm/stellaops/values-airgap.yaml \
+     --set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.airgap.local/realms/stellaops
+   ```
+4. Confirm `issuer_directory_changes_total` is visible in your offline Prometheus stack.
+
+## 4 · Import workflow summary
+1. Run `ops/offline-kit/build_offline_kit.py` with the additional artefacts noted above.
+2. Sign the resulting tarball and manifest (Cosign) and record the SHA-256 in the release notes.
+3. At the destination:
+   ```bash
+   stellaops-cli offline kit import \
+     --bundle stella-ops-offline-kit-<version>-airgap.tar.gz \
+     --destination /opt/stellaops/offline-kit
+   ```
+4. Follow the Compose or Helm path depending on your topology.
+
+## 5 · Post-import validation
+- [ ] `docker images | grep issuer-directory` (Compose) or `kubectl get deploy stellaops-issuer-directory -o wide` (Helm) shows the expected version.
+- [ ] `csaf-publishers.json` in the container matches the offline bundle (hash check).
+- [ ] `/issuer-directory/issuers` returns global seed issuers (requires token with `issuer-directory:read` scope).
+- [ ] Audit collection receives entries when you create/update issuers offline.
+- [ ] Offline kit manifest (`offline-manifest.json`) lists `images/issuer-directory-web.tar` and `config/issuer-directory/issuer-directory.yaml` with the SHA-256 values you recorded during packaging.
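
The hash checks in the validation list above can be scripted. The following is a minimal Python sketch, not part of the Offline Kit tooling: `verify_kit` and the `expected` mapping (kit-relative path to hex SHA-256) are hypothetical, and the real `offline-manifest.json` schema may differ, so load the digests you recorded during packaging in whatever form you have them.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large image tarballs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_kit(kit_root: Path, expected: dict[str, str]) -> list[str]:
    """Return the kit-relative paths that are missing or whose digest differs.

    `expected` is a hypothetical mapping of kit-relative path -> hex SHA-256;
    it is an assumption for illustration, not the real manifest schema.
    """
    mismatches = []
    for relative_path, expected_digest in sorted(expected.items()):
        candidate = kit_root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected_digest.lower():
            mismatches.append(relative_path)
    return mismatches
```

An empty list from `verify_kit(Path("/opt/stellaops/offline-kit"), expected)` means every listed artefact is present and matches its recorded digest.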
--- a/docs/modules/scanner/TASKS.md
+++ b/docs/modules/scanner/TASKS.md
@@ -8,13 +8,20 @@
 | SCANNER-DOCS-0002 | DONE (2025-11-02) | Docs Guild | Keep scanner benchmark comparisons (Trivy/Grype/Snyk) and deep-dive matrix current with source references. | Coordinate with docs/benchmarks owners |
 | SCANNER-DOCS-0003 | TODO | Docs Guild, Product Guild | Gather Windows/macOS analyzer demand signals and record findings in `docs/benchmarks/scanner/windows-macos-demand.md`. | Coordinate with Product Marketing & Sales enablement |
 | SCANNER-ENG-0008 | TODO | EntryTrace Guild, QA Guild | Maintain EntryTrace heuristic cadence per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Include quarterly pattern review + explain trace updates |
-| SCANNER-ENG-0009 | TODO | Ruby Analyzer Guild | SCANNER-ANALYZERS-RUBY-28-001..012 | Deliver Ruby analyzer parity and observation pipeline per gap doc (lockfiles, runtime graph, policy signals). | Design complete; fixtures published; CLI/Offline docs updated. |
+| SCANNER-ENG-0009 | DOING (2025-11-02) | Ruby Analyzer Guild | SCANNER-ANALYZERS-RUBY-28-001..012 | Deliver Ruby analyzer parity and observation pipeline per gap doc (lockfiles, runtime graph, policy signals). | Design complete; fixtures published; CLI/Offline docs updated. |
 | SCANNER-ENG-0010 | TODO | PHP Analyzer Guild | SCANNER-ANALYZERS-PHP-27-001..012 | Ship PHP analyzer pipeline (composer lock, autoload graph, capability signals) to close comparison gaps. | Analyzer + policy integration merged; fixtures + docs aligned. |
 | SCANNER-ENG-0011 | TODO | Language Analyzer Guild | — | Scope Deno runtime analyzer (lockfile resolver, import graphs) based on competitor techniques. | Design doc approved; backlog split into analyzer/runtime work. |
 | SCANNER-ENG-0012 | TODO | Language Analyzer Guild | — | Evaluate Dart analyzer requirements (pubspec parsing, AOT artifacts) to restore parity. | Investigation summary + task split filed with Dart guild. |
 | SCANNER-ENG-0013 | TODO | Swift Analyzer Guild | — | Plan Swift Package Manager coverage (Package.resolved, xcframeworks, runtime hints) with policy hooks. | Design brief approved; backlog seeded with analyzer tasks. |
 | SCANNER-ENG-0014 | TODO | Runtime Guild, Zastava Guild | — | Align Kubernetes/VM target coverage roadmap between Scanner and Zastava per comparison findings. | Joint roadmap doc approved; cross-guild tasks opened. |
 | SCANNER-ENG-0015 | TODO | Export Center Guild, Scanner Guild | — | Document DSSE/Rekor operator enablement guidance and rollout levers surfaced in gap analysis. | Playbook drafted; Export Center backlog updated. |
+| SCANNER-ENG-0016 | DOING (2025-11-02) | Ruby Analyzer Guild (Lockfile Squad) | Implement `RubyLockCollector` and vendor cache ingestion per design §4.1–4.3. | Coordinate fixtures under `fixtures/lang/ruby/lockfiles`; target alpha by Sprint 21. |
+| SCANNER-ENG-0017 | TODO | Ruby Analyzer Guild (Runtime Squad) | Build runtime require/autoload graph builder with tree-sitter Ruby per design §4.4. | Deliver edges with reason codes and integrate EntryTrace hints. |
+| SCANNER-ENG-0018 | TODO | Ruby Analyzer Guild (Capability Squad) | Emit Ruby capability and framework surface signals as defined in design §4.5. | Policy predicates prototyped; capability records available in SBOM overlays. |
+| SCANNER-ENG-0019 | TODO | Ruby Analyzer Guild, CLI Guild | Ship Ruby CLI verbs (`stella ruby inspect\|resolve`) and Offline Kit packaging per design §4.6. | CLI commands documented; offline manifest updated; e2e tests pass. |
+| SCANNER-LIC-0001 | DOING (2025-11-02) | Scanner Guild, Legal Guild | Vet tree-sitter Ruby licensing and Offline Kit packaging requirements. | SPDX review complete; packaging plan approved. |
+| SCANNER-POLICY-0001 | TODO | Policy Guild, Ruby Analyzer Guild | Define Policy Engine predicates for Ruby groups/capabilities and align lattice weights. | Policy schema merged; tests cover new predicates. |
+| SCANNER-CLI-0001 | TODO | CLI Guild, Ruby Analyzer Guild | Coordinate CLI UX/help text for new Ruby verbs and update CLI docs. | CLI help + docs updated; golden outputs recorded. |
 | SCANNER-ENG-0002 | TODO | Scanner Guild, CLI Guild | Design Node.js lockfile collector/CLI validator per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Capture Surface & policy requirements before implementation |
 | SCANNER-ENG-0003 | TODO | Python Analyzer Guild, CLI Guild | Design Python lockfile/editable install parity checks per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Include policy predicates & CLI story in design |
 | SCANNER-ENG-0004 | TODO | Java Analyzer Guild, CLI Guild | Design Java lockfile ingestion & validation per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Cover Gradle/SBT collectors, CLI verb, policy hooks |
--- a/docs/modules/scanner/design/ruby-analyzer.md
+++ b/docs/modules/scanner/design/ruby-analyzer.md
@@ -0,0 +1,137 @@
+# Ruby Analyzer Parity Design (SCANNER-ENG-0009)
+
+**Status:** Draft • Owner: Ruby Analyzer Guild • Updated: 2025-11-02
+
+## 1. Goals & Non-Goals
+- **Goals**
+  - Deterministically catalogue Ruby application dependencies (Gemfile/Gemfile.lock, vendored specs, .gem archives) for container layers and local workspaces.
+  - Build runtime usage graphs (require/require_relative, Zeitwerk autoloads, Rack boot chains, Sidekiq/ActiveJob schedulers).
+  - Emit capability signals (exec/fs/net/serialization, framework fingerprints, job schedulers) consumable by Policy Engine and explain traces.
+  - Provide CLI verbs (`stella ruby inspect`, `stella ruby resolve`) and Offline Kit parity for air-gapped deployments.
+- **Non-Goals**
+  - Shipping dynamic runtime profilers (log-based or APM) in this iteration.
+  - Implementing UI changes beyond exposing explain traces the Policy/UI guilds already support.
+
+## 2. Scope & Inputs
+| Input | Location | Notes |
+|-------|----------|-------|
+| Gemfile / Gemfile.lock | Source tree, layer filesystem | Handle multiple apps per repo; honour Bundler groups. |
+| Vendor bundles (`vendor/bundle`, `.bundle/config`) | Layer filesystem | Needed for offline/built images; avoid double-counting platform-specific gems. |
+| `.gemspec` files / cached specs | `~/.bundle/cache`, `vendor/cache`, gems in layers | Support deterministic parsing without executing gem metadata. |
+| Framework configs | `config/application.rb`, `config/routes.rb`, `config/sidekiq.yml`, etc. | Feed framework surface mapper. |
+| Container metadata | Layer digests via RustFS CAS | Support incremental composition per layer. |
+
+## 3. High-Level Architecture
+```
+┌─────────────────────────┐        ┌────────────────────┐
+│  Bundler Lock Collector │───────▶│  Package Graph     │
+└─────────────────────────┘        │  Aggregator        │
+                                   └─────────┬──────────┘
+┌─────────────────────────┐                │
+│  Gemspec Inspector      │───────────────▶│
+└─────────────────────────┘                │
+                                           ▼
+                                   ┌────────────────────┐
+┌─────────────────────────┐        │ Runtime Graph      │
+│  Require/Autoload Scan  │───────▶│ Builder (Zeitwerk) │
+└─────────────────────────┘        └─────────┬──────────┘
+                                           │
+                                           ▼
+                                   ┌────────────────────┐
+                                   │ Capability Emitter │
+                                   └─────────┬──────────┘
+                                           │
+                                           ▼
+                                   ┌────────────────────┐
+                                   │ SBOM Writer        │
+                                   │ + Policy Signals   │
+                                   └────────────────────┘
+```
+
+## 4. Detailed Components
+### 4.1 Bundler Lock Collector
+- Parse `Gemfile.lock` deterministically (no network) using a new `RubyLockCollector` under `StellaOps.Scanner.Analyzers.Lang.Ruby`.
+- Support alternative manifests (`gems.rb`, `gems.locked`) and workspace overrides.
+- Emit package nodes with fields: `name`, `version`, `source` (path/git/rubygems), `bundlerGroup[]`, `platform`, `declaredOnly` flag.
+- Implementation:
+  - Reuse the parsing strategy from Trivy (`pkg/fanal/analyzer/language/ruby/bundler`), ported to C# with a streaming reader and stable ordering.
+  - Integrate with Surface.Validation to enforce size limits and tenant allowlists for git/path sources.
+
+### 4.2 Gemspec Inspector
+- Scan cached specs under `vendor/cache`, `.bundle/cache`, and gem directories to pick up transitive packages when lockfiles are missing.
+- Parse without executing Ruby by using a deterministic DSL subset (similar to Trivy's gemspec parser).
+- Link results to lockfile entries by `<name, version, platform>`; create new records flagged `InferredFromSpec` when the lockfile is absent.
+
+### 4.3 Package Aggregator
+- A new orchestrator, `RubyPackageAggregator`, merges lock and gemspec data with installed gems from container layers (once the runtime analyzer ships).
+- Precedence: Installed > Lockfile > Gemspec.
+- Deduplicate by package key (name+version+platform) and attach provenance bits for Policy Engine.
+
+### 4.4 Runtime Graph Builder
+- Static analysis for `require`, `require_relative`, `autoload`, Zeitwerk conventions, and Rails initialisers.
+- Implementation phases:
+  1. Parse AST using tree-sitter Ruby embedded under `StellaOps.Scanner.Analyzers.Lang.Ruby.Syntax` with deterministic bindings.
+  2. Generate edges `entrypoint -> file` and `file -> package` with reason codes (`require-static`, `autoload-zeitwerk`, `autoload-const_missing`).
+  3. Identify framework entrypoints (Rails controllers, Rack middleware, Sidekiq workers) via heuristics defined in `SCANNER-ANALYZERS-RUBY-28-*` tasks.
+- Output merges with EntryTrace usage hints to support runtime filtering in Policy Engine.
+
+### 4.5 Capability & Surface Signals
+- Emit evidence documents for:
+  - Process/exec usage (`Kernel.system`, `` `cmd` ``, `Open3`).
+  - Network clients (`Net::HTTP`, `Faraday`, `Redis`, `ActiveRecord::Base.establish_connection`).
+  - Serialization sinks (`Marshal.load`, `YAML.load`, `Oj.load`).
+  - Job schedulers (Sidekiq, Resque, ActiveJob, Whenever, Clockwork) with schedule metadata.
+- Capability records flow to Policy Engine under `capability.ruby.*` namespaces to allow gating on dangerous constructs.
+
+### 4.6 CLI & Offline Integration
+- Add CLI verbs:
+  - `stella ruby inspect <path>` – runs collector locally, outputs JSON summary with provenance.
+  - `stella ruby resolve --image <ref>` – fetches scan artifacts, prints dependency graph grouped by bundler group/platform.
+- Ship analyzer DLLs and rules in the Offline Kit manifest; include autoload/Zeitwerk fingerprints and heuristics, hashed for determinism.
+
+## 5. Data Contracts
+| Artifact | Shape | Consumer |
+|----------|-------|----------|
+| `ruby_packages.json` | Array `{id, name, version, source, provenance, groups[], platform}` | SBOM Composer, Policy Engine |
+| `ruby_runtime_edges.json` | Edges `{from, to, reason, confidence}` | EntryTrace overlay, Policy explain traces |
+| `ruby_capabilities.json` | Capability `{kind, location, evidenceHash, params}` | Policy Engine (capability predicates) |
+
+All records follow AOC appender rules (immutable, tenant-scoped) and include `hash`, `layerDigest`, and `timestamp` normalized to UTC ISO-8601.
+
+## 6. Testing Strategy
+- **Fixtures**: Extend `fixtures/lang/ruby` with Rails, Sinatra, Sidekiq, Rack, container images (with/without vendor cache).
+- **Determinism**: Golden snapshots for package lists and capability outputs across repeated runs.
+- **Integration**: Worker e2e to ensure per-layer aggregation; CLI golden outputs (`stella ruby inspect`).
+- **Policy**: Unit tests verifying new predicates (`ruby.group`, `ruby.capability.exec`, etc.) in Policy Engine test suite.
+
+## 7. Rollout Plan & Dependencies
+1. Implement collectors and aggregators (SCANNER-ANALYZERS-RUBY-28-001..004).
+2. Add capability analyzer and observations (SCANNER-ANALYZERS-RUBY-28-005..008).
+3. Wire CLI commands and Offline Kit packaging (SCANNER-ANALYZERS-RUBY-28-011).
+4. Update docs (DOCS-SCANNER-BENCH-62-009 follow-up) once the analyzer alpha is ready.
+
+**Dependencies**
+- Tree-sitter Ruby grammar inclusion (needs Offline Kit packaging and licensing check).
+- Policy Engine support for new predicates and capability schemas.
+- Surface.Validation updates for git/path gem sources and secret resolution.
+
+## 8. Open Questions
+- Do we require dynamic runtime logs (e.g., `ActiveSupport::Notifications`) for confidence boosts? (defer to future iteration)
+- Should we enforce signed gem provenance in MVP? Pending Product decision.
+- Need alignment with Export Center on Ruby-specific manifest emissions.
+
+## 9. Licensing & Offline Packaging (SCANNER-LIC-0001)
+- **License**: The tree-sitter core and the `tree-sitter-ruby` grammar are MIT-licensed (confirmed via upstream LICENSE files retrieved 2025-11-02).
+- **Obligations**:
+  1. Include both MIT license texts in `/third-party-licenses/` and in Offline Kit manifests.
+  2. Update `NOTICE.md` to acknowledge embedded grammars per company policy.
+  3. Record the grammar commit hashes in build metadata; regenerate the generated C/WASM artifacts deterministically.
+  4. Ensure the build pipeline uses `tree-sitter-cli` only as a build-time tool (not redistributed) to avoid extra licensing obligations.
+- **Deliverables**:
+  - SCANNER-LIC-0001 to capture Legal sign-off and update packaging scripts.
+  - Export Center to mirror license files into Offline Kit bundle.
+
+---
+*References:*
+- Trivy: `pkg/fanal/analyzer/language/ruby/bundler`, `pkg/fanal/analyzer/language/ruby/gemspec`
+- Gap analysis: `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md#ruby-analyzer-parity-trivy-grype-snyk`
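
To illustrate the deterministic lock parsing described in §4.1, here is a minimal Python sketch. It is not the C# `RubyLockCollector`: `LockEntry` and `parse_lockfile` are hypothetical names, only the `GEM`, `GIT`, and `PATH` sections are read, and splitting the platform off at the first `-` in the version is a simplifying assumption. The `bundlerGroup[]` and `declaredOnly` fields are omitted.

```python
import re
from dataclasses import dataclass

# Lockfile sections that carry resolved specs, mapped to the `source` field values.
SECTION_SOURCES = {"GEM": "rubygems", "GIT": "git", "PATH": "path"}
# Resolved specs sit at exactly four spaces of indentation: "    name (version)".
# Their dependency constraints sit at six spaces and are skipped.
SPEC_LINE = re.compile(r"^ {4}(?P<name>\S+) \((?P<version>[^)]+)\)$")


@dataclass(frozen=True, order=True)
class LockEntry:
    name: str
    version: str
    platform: str
    source: str


def parse_lockfile(text: str) -> list[LockEntry]:
    """Parse resolved specs from Gemfile.lock text without network access or Ruby execution."""
    entries = set()
    source = None
    for line in text.splitlines():
        if line and not line.startswith(" "):
            # An unindented line opens a new section (GEM, PLATFORMS, DEPENDENCIES, ...).
            source = SECTION_SOURCES.get(line.strip())
            continue
        match = SPEC_LINE.match(line) if source else None
        if match:
            # Assumption: a platform suffix follows the first "-", e.g. "1.15.4-x86_64-linux".
            version, _, platform = match.group("version").partition("-")
            entries.add(LockEntry(match.group("name"), version, platform or "ruby", source))
    # Sorting by (name, version, platform, source) keeps output stable across runs.
    return sorted(entries)
```

Collecting into a set and sorting at the end means repeated runs over the same lockfile yield identical output, which is the property the golden snapshots in §6 rely on.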