Add Ruby language analyzer and related functionality
- Introduced global usings for Ruby analyzer.
- Implemented RubyLockData, RubyLockEntry, and RubyLockParser for handling Gemfile.lock files.
- Created RubyPackage and RubyPackageCollector to manage Ruby packages and vendor cache.
- Developed RubyAnalyzerPlugin and RubyLanguageAnalyzer for analyzing Ruby projects.
- Added tests for Ruby language analyzer with sample Gemfile.lock and expected output.
- Included necessary project files and references for the Ruby analyzer.
- Added third-party licenses for tree-sitter dependencies.
@@ -1,100 +1,115 @@
# Advisory AI architecture

> Captures the retrieval, guardrail, and inference packaging requirements defined in the Advisory AI implementation plan and related module guides.

## 1) Goals

- Summarise advisories/VEX evidence into operator-ready briefs with citations.
- Explain conflicting statements with provenance and trust weights (using VEX Lens & Excititor data).
- Suggest remediation plans aligned with Offline Kit deployment models and scheduler follow-ups.
- Operate deterministically where possible; cache generated artefacts with digests for audit.

## 2) Pipeline overview

```
                      +---------------------+
 Concelier/VEX Lens   |  Evidence Retriever |
 Policy Engine  ----> |  (vector + keyword) | ---> Context Pack (JSON)
 Zastava runtime      +---------------------+
                                 |
                                 v
                          +-------------+
                          | Prompt      |
                          | Assembler   |
                          +-------------+
                                 |
                                 v
                          +-------------+
                          | Guarded LLM |
                          | (local/host)|
                          +-------------+
                                 |
                                 v
                        +-----------------+
                        | Citation &      |
                        | Validation      |
                        +-----------------+
                                 |
                                 v
                        +----------------+
                        | Output cache   |
                        | (hash, bundle) |
                        +----------------+
```

## 3) Retrieval & context

- Hybrid search: vector embeddings (SBERT-compatible) + keyword filters for advisory IDs, PURLs, CVEs.
- Context packs include:
  - Advisory raw excerpts with highlighted sections and source URLs.
  - VEX statements (normalized tuples + trust metadata).
  - Policy explain traces for the affected finding.
  - Runtime/impact hints from Zastava (exposure, entrypoints).
  - Export-ready remediation data (fixed versions, patches).
- **SBOM context retriever** (AIAI-31-002) hydrates:
  - Version timelines (first/last observed, status, fix availability).
  - Dependency paths (runtime vs build/test, deduped by coordinate chain).
  - Tenant environment flags (prod/stage toggles) with optional blast radius summary.
  - Service-side clamps: max 500 timeline entries, 200 dependency paths, with client-provided toggles for env/blast data.

Retriever requests and results are trimmed/normalized before hashing; metadata (counts, provenance keys) is returned for downstream guardrails. Unit coverage ensures deterministic ordering and flag handling.

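The trimming, ordering, and clamping described above could look roughly like the following. This is a minimal sketch, not the service's implementation: the helper names `normalize_paths` and `paths_digest`, the one-path-per-line input format, and the use of SHA-256 are assumptions made for illustration. Only the 200-path clamp comes from the text.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: normalise dependency paths read from stdin (one per line).
# Trims whitespace, drops blank lines, dedupes and sorts bytewise so the order
# is deterministic, then clamps to the first N entries (default 200).
normalize_paths() {
  local max="${1:-200}"
  awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } NF' \
    | LC_ALL=C sort -u \
    | awk -v max="$max" 'NR <= max'
}

# Hypothetical helper: digest of the normalised list (SHA-256 is an assumption).
paths_digest() {
  normalize_paths "${1:-200}" | sha256sum | cut -d' ' -f1
}

# The same paths in a different order, with stray whitespace and a duplicate,
# produce the same digest.
printf 'app -> lib-b\n  app -> lib-a \n\napp -> lib-b\n' | paths_digest
printf 'app -> lib-a\napp -> lib-b\n' | paths_digest
```

Normalising before hashing is what makes the digest reproducible: two requests that differ only in ordering or whitespace map to the same value.
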
All context references include `content_hash` and `source_id`, enabling verifiable citations.

## 4) Guardrails

- Prompt templates enforce structure: summary, conflicts, remediation, references.
- Response validator ensures:
  - No hallucinated advisories (every fact must map to input context).
  - Citations follow `[n]` indexing referencing actual sources.
  - Remediation suggestions only cite policy-approved sources (fixed versions, vendor hotfixes).
- Moderation/PII filters prevent leaking secrets; responses failing validation are rejected and logged.

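As an illustration of the `[n]` citation rule, a validator might reject any response whose markers point outside the supplied source list. The sketch below is hypothetical: the function name `validate_citations` and the 1-based indexing are assumptions, and it checks only the index range, not whether each fact maps to the context.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical check: every [n] marker in the response must point at one of
# the source_count sources supplied in the context pack (1-based indexing is
# an assumption).
validate_citations() {
  local response="$1" source_count="$2" n
  while read -r n; do
    if [ "$n" -lt 1 ] || [ "$n" -gt "$source_count" ]; then
      echo "citation [$n] has no matching source" >&2
      return 1
    fi
  done < <(printf '%s\n' "$response" | grep -oE '\[[0-9]+\]' | tr -d '[]' || true)
  return 0
}

validate_citations "Fixed in 1.2.3 [1]; the vendor disputes impact [2]." 2 && echo "accepted"
validate_citations "See [3] for details." 2 || echo "rejected"
```

A rejected response would then be logged rather than returned, as the guardrail list describes.
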
## 5) Deterministic tooling

- **Version comparators:** offline semantic version + RPM EVR parsers with range evaluators. Supports chained constraints (`>=`, `<=`, `!=`) used by remediation advice and blast radius calculations.
  - Registered via `AddAdvisoryDeterministicToolset` for reuse across orchestrator, CLI, and services.
- **Orchestration pipeline:** see `orchestration-pipeline.md` for prerequisites, task breakdown, and cross-guild responsibilities before wiring the execution flows.
- **Planned extensions:** NEVRA/EVR comparators, ecosystem-specific normalisers, dependency chain scorers (AIAI-31-003 scope).
  - Exposed via internal interfaces to allow orchestrator/toolchain reuse; all helpers stay side-effect free and deterministic for golden testing.

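To make the idea of chained constraints concrete, here is a small sketch of a range evaluator. It is not the toolset registered by `AddAdvisoryDeterministicToolset`: the function names are hypothetical, ordering is delegated to GNU `sort -V`, and only plain dotted numeric versions are handled (no pre-release tags, no RPM EVR).

```bash
#!/usr/bin/env bash
set -eu

# Prints -1, 0, or 1 depending on how version $1 orders against $2.
# Relies on GNU `sort -V`; only plain dotted numeric versions are handled here.
version_cmp() {
  if [ "$1" = "$2" ]; then printf '%s\n' 0; return 0; fi
  local lowest
  lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | sed -n '1p')"
  if [ "$lowest" = "$1" ]; then printf '%s\n' -1; else printf '%s\n' 1; fi
}

# satisfies VERSION CONSTRAINT...  e.g. satisfies 1.10.0 '>=1.9.0' '<=2.0.0'
# Every constraint must hold (logical AND).
satisfies() {
  local version="$1" constraint op bound cmp
  shift
  for constraint in "$@"; do
    op="${constraint%%[0-9]*}"      # leading operator, e.g. ">="
    bound="${constraint#"$op"}"     # remaining version, e.g. "1.9.0"
    cmp="$(version_cmp "$version" "$bound")"
    case "$op" in
      '>=') [ "$cmp" -ge 0 ] || return 1 ;;
      '<=') [ "$cmp" -le 0 ] || return 1 ;;
      '!=') [ "$cmp" -ne 0 ] || return 1 ;;
      *) echo "unsupported operator: $op" >&2; return 2 ;;
    esac
  done
}

satisfies 1.10.0 '>=1.9.0' '<=2.0.0' '!=1.9.5' && echo "1.10.0 is in range"
satisfies 1.9.5 '>=1.9.0' '!=1.9.5' || echo "1.9.5 is excluded"
```

Because the helpers read nothing but their arguments, the same inputs always give the same answer, which is the property golden tests depend on.
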
## 6) Output persistence

- Cached artefacts stored in `advisory_ai_outputs` with fields:
  - `output_hash` (sha256 of JSON response).
  - `input_digest` (hash of context pack).
  - `summary`, `conflicts`, `remediation`, `citations`.
  - `generated_at`, `model_id`, `profile` (Sovereign/FIPS etc.).
  - `signatures` (optional DSSE if run in deterministic mode).
- Offline bundle format contains `summary.md`, `citations.json`, `context_manifest.json`, `signatures/`.

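A consumer that wants to check a cached artefact against its `output_hash` could do something like the following. This is a sketch under assumptions: the helper names are hypothetical, and it hashes the exact stored bytes because the text does not say whether the JSON is canonicalised first.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: SHA-256 over the exact bytes of a stored JSON response.
output_hash() {
  sha256sum "$1" | cut -d' ' -f1
}

# Returns 0 when the file still matches the hash recorded at write time.
verify_output() {
  [ "$(output_hash "$1")" = "$2" ]
}

response="$(mktemp)"
printf '{"summary":"example","citations":[1]}' > "$response"
recorded="$(output_hash "$response")"
verify_output "$response" "$recorded" && echo "output_hash matches"
rm -f "$response"
```
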
## 7) Profiles & sovereignty

- **Profiles:** `default`, `fips-local` (FIPS-compliant local model), `gost-local`, `cloud-openai` (optional, disabled by default). Each profile defines allowed models, key management, and telemetry endpoints.
- **CryptoProfile/RootPack integration:** generated artefacts can be signed using configured CryptoProfile to satisfy procurement/trust requirements.

## 8) APIs

- `POST /v1/advisory-ai/summaries`: generate (or retrieve cached) summary for `{advisoryKey, artifactId, policyVersion}`.
- `POST /v1/advisory-ai/conflicts`: explain conflicting VEX statements with trust ranking.
- `POST /v1/advisory-ai/remediation`: fetch remediation plan with target fix versions, prerequisites, verification steps.
- `GET /v1/advisory-ai/outputs/{hash}`: retrieve cached artefact (used by CLI/Console/Export Center).

All endpoints accept a `profile` parameter (default `fips-local`) and return `output_hash`, `input_digest`, and `citations` for verification.

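A request to the summaries endpoint might be assembled as shown below. Treat it as a sketch: sending `profile` in the JSON body, the `ADVISORY_AI_URL` variable, and the sample values are assumptions, authentication is omitted, and the body builder does no JSON escaping.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical request body for POST /v1/advisory-ai/summaries. Sending
# `profile` in the JSON body is an assumption; values are not JSON-escaped.
summary_request_body() {
  printf '{"advisoryKey":"%s","artifactId":"%s","policyVersion":"%s","profile":"%s"}\n' \
    "$1" "$2" "$3" "${4:-fips-local}"
}

# Hypothetical call; ADVISORY_AI_URL is a placeholder for the service base URL.
post_summary() {
  summary_request_body "$@" | curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    --data-binary @- \
    "${ADVISORY_AI_URL:?set the service base URL}/v1/advisory-ai/summaries"
}

# Print the body only; post_summary is defined but not called here.
summary_request_body "ADV-EXAMPLE-1" "sha256:example" "policy-v1"
```
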
## 9) Observability

- Metrics: `advisory_ai_requests_total{profile,type}`, `advisory_ai_latency_seconds`, `advisory_ai_validation_failures_total`.
- Logs: include `output_hash`, `input_digest`, `profile`, `model_id`, `tenant`, `artifacts`. Sensitive context is not logged.
- Traces: spans for retrieval, prompt assembly, model inference, validation, cache write.

## 10) Operational controls

- Feature flags per tenant (`ai.summary.enabled`, `ai.remediation.enabled`).
- Rate limits (per tenant, per profile) enforced by Orchestrator to prevent runaway usage.
- Offline/air-gapped deployments run local models packaged with Offline Kit; model weights validated via manifest digests.

82
docs/modules/advisory-ai/orchestration-pipeline.md
Normal file
@@ -0,0 +1,82 @@
# Advisory AI Orchestration Pipeline (Planning Notes)

> **Status:** Draft – prerequisite design for AIAI-31-004 integration work.
> **Audience:** Advisory AI guild, WebService/Worker guilds, CLI guild, Docs/QA support teams.

## 1. Goal

Wire the deterministic pipeline (Summary / Conflict / Remediation flows) into the Advisory AI service, workers, and CLI with deterministic caching, prompt preparation, and guardrail fallback. This document captures the pre-integration checklist and task breakdown so each guild understands their responsibilities before coding begins.

## 2. Prerequisites

| Area | Requirement | Owner | Status |
|------|-------------|-------|--------|
| **Toolset** | Deterministic comparators, dependency analyzer (`IDeterministicToolset`, `AdvisoryPipelineOrchestrator`) | Advisory AI | ✅ landed (AIAI-31-003) |
| **SBOM context** | Real SBOM context client delivering timelines + dependency paths | SBOM Service Guild | ⏳ pending (AIAI-31-002) |
| **Prompt artifacts** | Liquid/Handlebars prompt templates for summary/conflict/remediation | Advisory AI Docs Guild | ⏳ authoring needed |
| **Cache strategy** | Decision on DSSE or hash-only cache entries, TTLs, and eviction policy | Advisory AI + Platform | 🔲 define |
| **Auth scopes** | Confirm service account scopes for new API endpoints/worker-to-service calls | Authority Guild | 🔲 define |

**Blocking risk:** SBOM client and prompt templates must exist (even stubbed) before the orchestrator can produce stable plans.

## 3. Integration plan (high-level)

1. **Service layer (WebService / Worker)**
   - Inject `IAdvisoryPipelineOrchestrator` via `AddAdvisoryPipeline`.
   - Define REST endpoint `POST /v1/advisories/{key}/pipeline/{task}` (task ∈ summary/conflict/remediation).
   - Worker consumes queue messages (`advisory.pipeline.execute`) -> fetches plan -> executes prompt -> persists output & provenance.
   - Add metrics: `advisory_pipeline_requests_total`, `advisory_pipeline_plan_cache_hits_total`, `advisory_pipeline_latency_seconds`.
2. **CLI**
   - New command `stella advise run <task>` with flags for artifact id, profile, policy version, `--force-refresh`.
   - Render JSON/Markdown outputs; handle caching hints (print cache key, status).
3. **Caching / storage**
   - Choose storage (Mongo collection vs existing DSSE output store).
   - Persist `AdvisoryTaskPlan` metadata + generated output keyed by cache key + policy version.
   - Expose TTL/force-refresh semantics.
4. **Docs & QA**
   - Publish API spec (`docs/advisory-ai/api.md`) + CLI docs.
   - Add golden outputs for deterministic runs; property tests for cache key stability.

## 4. Task Breakdown

### AIAI-31-004A (Service orchestration wiring)

- **Scope:** WebService/Worker injection, REST/queue plumbing, metrics counters, basic cache stub.
- **Dependencies:** `AddAdvisoryPipeline`, SBOM client stub.
- **Exit:** API responds with plan metadata + queue message; worker logs execution attempt; metrics emitted.

### AIAI-31-004B (Prompt assembly & cache persistence)

- **Scope:** Implement prompt assembler, connect to guardrails, persist cache entries with DSSE metadata.
- **Dependencies:** Prompt templates, cache storage decision, guardrail interface.
- **Exit:** Deterministic outputs stored; force-refresh honoured; tests cover prompt assembly + caching.

### AIAI-31-004C (CLI integration & docs)

- **Scope:** CLI command + output renderer, docs updates, CLI tests (golden outputs).
- **Dependencies:** Service endpoints stable, caching semantics documented.
- **Exit:** CLI command produces deterministic output, docs updated, smoke tests recorded.

### Supporting tasks (other guilds)

- **AUTH-AIAI-31-004** – Update scopes and DSSE policy (Authority guild).
- **DOCS-AIAI-31-003** – Publish API documentation, CLI guide updates (Docs guild).
- **QA-AIAI-31-004** – Golden/property/perf suite for pipeline (QA guild).

## 5. Acceptance checklist (per task)

| Item | Notes |
|------|-------|
| Cache key stability | `AdvisoryPipelineOrchestrator` hash must remain stable under re-run of identical inputs. |
| Metrics & logging | Request id, cache key, task type, profile, latency; guardrail results logged without sensitive prompt data. |
| Offline readiness | All prompt templates bundled with Offline Kit; CLI works in air-gapped mode with cached data. |
| Policy awareness | Plans encode policy version used; outputs reference policy digest for audit. |
| Testing | Unit tests (plan generation, cache keys, DI), integration (service endpoint, worker, CLI), deterministic golden outputs. |

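The cache key stability item can be illustrated with a small sketch. This is not how `AdvisoryPipelineOrchestrator` derives its hash: the `cache_key` function, the `key=value` field format, and the field names are hypothetical, chosen to show why inputs are sorted before hashing.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical cache key: SHA-256 over the sorted key=value inputs, so the
# order in which callers supply fields does not change the key.
cache_key() {
  printf '%s\n' "$@" | LC_ALL=C sort | sha256sum | cut -d' ' -f1
}

first="$(cache_key task=summary advisoryKey=ADV-EXAMPLE-1 policyVersion=policy-v1 profile=fips-local)"
second="$(cache_key profile=fips-local policyVersion=policy-v1 advisoryKey=ADV-EXAMPLE-1 task=summary)"
[ "$first" = "$second" ] && echo "cache key is stable across argument order"
```

A property test for stability would shuffle the inputs many times and assert that the key never changes, while any change to a value (for example the policy version) must produce a different key.
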
## 6. Next steps

1. Finalize SBOM context client (AIAI-31-002) and prompt templates.
2. Create queue schema spec (`docs/modules/advisory-ai/queue-contracts.md`) if not already available.
3. Schedule cross-guild kickoff to agree on cache store & DSSE policy.

_Last updated: 2025-11-02_

@@ -90,6 +90,11 @@ Payloads follow the contract in `Contracts/IssuerDtos.cs` and align with domain
3. **SDK integration (ISSUER-30-004)**: supply cached issuer metadata to VEX Lens and Excititor clients.
4. **Observability & Ops (ISSUER-30-005/006)**: metrics, dashboards, deployment automation, offline kit.

## 9. Operations & runbooks

- [Deployment guide](operations/deployment.md)
- [Backup & restore](operations/backup-restore.md)
- [Offline kit notes](operations/offline-kit.md)

---

*Document owner: Issuer Directory Guild*

103
docs/modules/issuer-directory/operations/backup-restore.md
Normal file
@@ -0,0 +1,103 @@
# Issuer Directory Backup & Restore

## Scope

- **Applies to:** Issuer Directory when deployed via Docker Compose (`deploy/compose/docker-compose.*.yaml`) or the Helm chart (`deploy/helm/stellaops`).
- **Artifacts covered:** MongoDB database `issuer-directory`, service configuration (`etc/issuer-directory.yaml`), CSAF seed file (`data/csaf-publishers.json`), and secret material for the Mongo connection string.
- **Frequency:** Take a hot backup before every upgrade and at least daily in production. Keep encrypted copies off-site/air-gapped according to your compliance program.

## Inventory checklist

| Component | Location (Compose default) | Notes |
| --- | --- | --- |
| Mongo data | `mongo-data` volume (`/var/lib/docker/volumes/.../mongo-data`) | Contains `issuers`, `issuer_keys`, `issuer_trust_overrides`, and `issuer_audit` collections. |
| Configuration | `etc/issuer-directory.yaml` | Mounted read-only at `/etc/issuer-directory.yaml` inside the container. |
| CSAF seed file | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` | Ensure customised seeds are part of the backup; regenerate if you ship regional overrides. |
| Mongo secret | `.env` entry `ISSUER_DIRECTORY_MONGO_CONNECTION_STRING` or secret store export | Required to restore connectivity; treat as sensitive. |

> **Tip:** Export the secret via `kubectl get secret issuer-directory-secrets -o yaml` (sanitize before storage) or copy the Compose `.env` file into an encrypted vault.

## Hot backup (no downtime)

1. **Create output directory**
   ```bash
   BACKUP_DIR=backup/issuer-directory/$(date +%Y-%m-%dT%H%M%S)
   mkdir -p "$BACKUP_DIR"
   ```
2. **Dump Mongo collections**
   ```bash
   # Compute the archive name once (in UTC, matching the Z suffix) so the dump
   # and the copy refer to the same file.
   ARCHIVE="issuer-directory-$(date -u +%Y%m%dT%H%M%SZ).gz"

   docker compose -f deploy/compose/docker-compose.prod.yaml exec mongo \
     mongodump --archive="/dump/$ARCHIVE" \
     --gzip --db issuer-directory

   docker compose -f deploy/compose/docker-compose.prod.yaml cp \
     "mongo:/dump/$ARCHIVE" "$BACKUP_DIR/"
   ```
   For Kubernetes, run the same `mongodump` command inside the `stellaops-mongo` pod and copy the archive via `kubectl cp`.
3. **Capture configuration and seeds**
   ```bash
   cp etc/issuer-directory.yaml "$BACKUP_DIR/"
   cp src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json "$BACKUP_DIR/"
   ```
4. **Capture secrets**
   ```bash
   grep '^ISSUER_DIRECTORY_MONGO_CONNECTION_STRING=' dev.env > "$BACKUP_DIR/issuer-directory.mongo.secret"
   chmod 600 "$BACKUP_DIR/issuer-directory.mongo.secret"
   ```
5. **Generate checksums and encrypt**
   ```bash
   (cd "$BACKUP_DIR" && sha256sum * > SHA256SUMS)
   tar czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_DIR" .
   # AGE_RECIPIENT must hold an age public key (age1...) or an SSH public key;
   # age does not accept e-mail addresses as recipients.
   age -r "$AGE_RECIPIENT" -o "$BACKUP_DIR.tar.gz.age" "$BACKUP_DIR.tar.gz"
   ```
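Before restoring from an archive produced by the steps above, it is worth confirming that the `SHA256SUMS` file still matches the unpacked contents. The following is a minimal sketch: the function name `verify_backup_dir` and the file paths in the comments are placeholders, not part of the documented procedure.

```bash
#!/usr/bin/env bash
set -eu

# Verify every file listed in SHA256SUMS inside an unpacked backup directory.
# Returns non-zero if any file is missing or its checksum differs.
verify_backup_dir() {
  (cd "$1" && sha256sum --quiet -c SHA256SUMS)
}

# Typical use after decrypting and unpacking (paths are placeholders):
#   age -d -i key.txt -o backup.tar.gz backup.tar.gz.age
#   mkdir restore && tar xzf backup.tar.gz -C restore
#   verify_backup_dir restore
```
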
## Cold backup (planned downtime)

1. Notify stakeholders and pause automation calling the API.
2. Stop services:
   ```bash
   # `stop` keeps the container definition in place and works on Compose
   # releases that do not accept a service name for `down`.
   docker compose -f deploy/compose/docker-compose.prod.yaml stop issuer-directory
   ```
   (For Helm: `kubectl scale deploy stellaops-issuer-directory --replicas=0`.)
3. Snapshot volumes:
   ```bash
   docker run --rm -v mongo-data:/data \
     -v "$(pwd)":/backup busybox tar czf /backup/mongo-data-$(date +%Y%m%d).tar.gz -C /data .
   ```
4. Copy configuration, seeds, and secrets as in the hot backup.
5. Restart services and confirm `/health/live` returns `200 OK`.

## Restore procedure

1. **Provision clean volumes**
   - Compose: `docker volume rm mongo-data` (optional) then `docker compose up -d mongo`.
   - Helm: delete the Mongo PVC or attach a fresh volume snapshot.
2. **Restore Mongo**
   ```bash
   docker compose exec -T mongo \
     mongorestore --archive \
     --gzip --drop < issuer-directory-YYYYMMDDTHHMMSSZ.gz
   ```
3. **Restore configuration/secrets**
   - Copy `issuer-directory.yaml` into `etc/`.
   - Reapply the secret: `kubectl apply -f issuer-directory-secret.yaml` or repopulate `.env`.
4. **Restore CSAF seeds** (optional)
   - If you maintain a customised seed file, copy it back before starting the container. Otherwise the bundled file will be used.
5. **Start services**
   ```bash
   docker compose up -d issuer-directory
   # or
   kubectl scale deploy stellaops-issuer-directory --replicas=1
   ```
6. **Validate**
   - `curl -fsSL https://localhost:8447/health/live`
   - Issue an access token and list issuers to confirm results.
   - Check Mongo counts match expectations (`db.issuers.countDocuments()`, etc.).

## Disaster recovery notes

- **Retention:** Maintain 30 daily + 12 monthly archives. Store copies in geographically separate, access-controlled vaults.
- **Audit reconciliation:** Ensure `issuer_audit` entries cover the restore window; export them for compliance.
- **Seed replay:** If the CSAF seed file was lost, set `ISSUER_DIRECTORY_SEED_CSAF=true` for the first restart to rehydrate the global tenant.
- **Testing:** Run quarterly restore drills in a staging environment to detect procedure drift.
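
The 30 daily + 12 monthly retention rule could be applied with a filter along these lines. It is a sketch under stated assumptions, not a documented tool: the function name `select_keep` is hypothetical, archive names are assumed to embed a `YYYYMMDD` stamp and share a common prefix, and "monthly" is interpreted as the newest archive of each of the most recent months.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical retention filter: reads archive names from stdin and prints the
# ones to keep. Assumes each name embeds a YYYYMMDD stamp and that names share
# a common prefix, so a reverse bytewise sort lists the newest first.
select_keep() {
  local daily="${1:-30}" monthly="${2:-12}"
  LC_ALL=C sort -r | awk -v daily="$daily" -v monthly="$monthly" '
    BEGIN { n = 0; months = 0 }
    match($0, /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
      n++
      month = substr($0, RSTART, 6)
      keep = (n <= daily)                       # newest N daily archives
      if (!(month in seen) && months < monthly) {
        seen[month] = 1; months++; keep = 1     # newest archive of each recent month
      }
      if (keep) print
    }'
}

printf '%s\n' backup-20250101.tar.gz.age backup-20250102.tar.gz.age backup-20250201.tar.gz.age \
  | select_keep 1 12
```

Anything the filter does not print is a candidate for deletion; review the list before removing archives, since the naming assumption may not match your backup layout.
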

## Verification checklist

- [ ] `/health/live` returns `200 OK`.
- [ ] Mongo collections (`issuers`, `issuer_keys`, `issuer_trust_overrides`) have expected counts.
- [ ] `issuer_directory_changes_total` and `issuer_directory_key_operations_total` metrics resume within 1 minute.
- [ ] Audit entries exist for post-restore CRUD activity.
- [ ] Client integrations (VEX Lens, Excititor) resolve issuers successfully.

100
docs/modules/issuer-directory/operations/deployment.md
Normal file
@@ -0,0 +1,100 @@
# Issuer Directory Deployment Guide

## Scope

- **Applies to:** Issuer Directory WebService (`stellaops/issuer-directory-web`) running via the provided Docker Compose bundles (`deploy/compose/docker-compose.*.yaml`) or the Helm chart (`deploy/helm/stellaops`).
- **Covers:** Environment prerequisites, secret handling, Compose + Helm rollout steps, and post-deploy verification.
- **Audience:** Platform/DevOps engineers responsible for Identity & Signing sprint deliverables.

## 1 · Prerequisites

- Authority must be running and reachable at the issuer URL you configure (default Compose host: `https://authority:8440`).
- MongoDB 4.2+ with credentials for the `issuer-directory` database (Compose defaults to the root user defined in `.env`).
- Network access to Authority, MongoDB, and (optionally) Prometheus if you scrape metrics.
- Issuer Directory configuration file `etc/issuer-directory.yaml` checked and customised for your environment (tenant header, audiences, telemetry level, CSAF seed path).

> **Secrets:** Use `etc/secrets/issuer-directory.mongo.secret.example` as a template. Store the real connection string in an untracked file or secrets manager and reference it via environment variables (`ISSUER_DIRECTORY_MONGO_CONNECTION_STRING`) rather than committing credentials.

## 2 · Deploy with Docker Compose

1. **Prepare environment variables**
   ```bash
   cp deploy/compose/env/dev.env.example dev.env
   cp etc/secrets/issuer-directory.mongo.secret.example issuer-directory.mongo.env
   # Edit dev.env and issuer-directory.mongo.env with production-ready secrets.
   ```

2. **Inspect the merged configuration**
   ```bash
   docker compose \
     --env-file dev.env \
     --env-file issuer-directory.mongo.env \
     -f deploy/compose/docker-compose.dev.yaml config
   ```
   The command confirms the new `issuer-directory` service resolves the port (`${ISSUER_DIRECTORY_PORT:-8447}`) and the Mongo connection string is in place.

3. **Launch the stack**
   ```bash
   docker compose \
     --env-file dev.env \
     --env-file issuer-directory.mongo.env \
     -f deploy/compose/docker-compose.dev.yaml up -d issuer-directory
   ```
   Compose automatically mounts `../../etc/issuer-directory.yaml` into the container at `/etc/issuer-directory.yaml`, seeds CSAF publishers, and exposes the API on `https://localhost:8447`.

4. **Smoke test**
   ```bash
   curl -k https://localhost:8447/health/live
   stellaops-cli issuer-directory issuers list \
     --base-url https://localhost:8447 \
     --tenant demo \
     --access-token "$(stellaops-cli auth token issue --scope issuer-directory:read)"
   ```

5. **Upgrade & rollback**
   - Update Compose images to the desired release manifest (`deploy/releases/*.yaml`), re-run `docker compose config`, then `docker compose up -d`.
   - Rollbacks follow the same steps with the previous manifest. Mongo collections are backwards compatible within `2025.10.x`.

## 3 · Deploy with Helm

1. **Create or update the secret**
   ```bash
   kubectl create secret generic issuer-directory-secrets \
     --from-literal=ISSUERDIRECTORY__MONGO__CONNECTIONSTRING='mongodb://stellaops:<password>@stellaops-mongo:27017' \
     --dry-run=client -o yaml | kubectl apply -f -
   ```
   Add optional overrides (e.g. `ISSUERDIRECTORY__AUTHORITY__ISSUER`) if your Authority issuer differs from the default.

2. **Template for validation**
   ```bash
   helm template issuer-directory deploy/helm/stellaops \
     -f deploy/helm/stellaops/values-prod.yaml \
     --set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org \
     > /tmp/issuer-directory.yaml
   ```

3. **Install / upgrade**
   ```bash
   helm upgrade --install stellaops deploy/helm/stellaops \
     -f deploy/helm/stellaops/values-prod.yaml \
     --set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.prod.stella-ops.org
   ```
   The chart provisions:
   - ConfigMap `stellaops-issuer-directory-config` with `IssuerDirectory` settings.
   - Deployment `stellaops-issuer-directory` with readiness/liveness probes on `/health/live`.
   - Service on port `8080` (ClusterIP by default).

4. **Expose for operators (optional)**
   - Use an Ingress/HTTPRoute to publish `https://issuer-directory.<env>.stella-ops.org`.
   - Ensure the upstream includes DPoP headers if proxied through an API gateway.

5. **Post-deploy validation**
   ```bash
   kubectl exec deploy/stellaops-issuer-directory -- \
     curl -sf http://127.0.0.1:8080/health/live
   kubectl logs deploy/stellaops-issuer-directory | grep 'IssuerDirectory Mongo connected'
   ```
   Prometheus should begin scraping `issuer_directory_changes_total` and related metrics (labels: `tenant`, `issuer`, `action`).

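The health checks in the Compose and Helm procedures can fail if they run before the service has finished starting. One way to make them less brittle in automation is a small retry wrapper. The sketch below is an illustration only: `wait_until` is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```bash
#!/usr/bin/env bash
set -eu

# Retry a command until it succeeds or the attempt budget is exhausted.
# usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example (not executed here): wait up to about 60 s for the Compose deployment.
#   wait_until 30 2 curl -fsk https://localhost:8447/health/live
```
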
## 4 · Operational checklist

- **Secrets:** Connection strings live in `issuer-directory-secrets` (Helm) or an `.env` file stored in your secrets vault (Compose). Rotate credentials via secret update + pod restart.
- **Audit streams:** Confirm `issuer_directory_audit` collection receives entries when CRUD operations run; export logs for compliance.
- **Tenants:** The service enforces the `X-StellaOps-Tenant` header. For multi-tenant staging, configure the reverse proxy to inject the correct tenant or issue scoped tokens.
- **CSAF seeds:** `ISSUER_DIRECTORY_SEED_CSAF=true` replays `data/csaf-publishers.json` on startup. Set to `false` once production tenants are fully managed, or override `csafSeedPath` with a curated bundle.
- **Release alignment:** Before promotion, run `deploy/tools/validate-profiles.sh` to lint Compose/Helm bundles, then verify the new `issuer-directory-web` entry in `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) matches the channel you intend to ship.

71
docs/modules/issuer-directory/operations/offline-kit.md
Normal file
@@ -0,0 +1,71 @@
# Issuer Directory Offline Kit Notes

## Purpose

Operators bundling Stella Ops for fully disconnected environments must include the Issuer Directory service so VEX Lens, Excititor, and Policy Engine can resolve trusted issuers without reaching external registries.

## 1 · Bundle contents

Include the following artefacts in your Offline Update Kit staging tree:

| Path (within kit) | Source | Notes |
| --- | --- | --- |
| `images/issuer-directory-web.tar` | `registry.stella-ops.org/stellaops/issuer-directory-web` (digest from `deploy/releases/<channel>.yaml`) | Export with `crane pull --format=tarball` or `skopeo copy docker://... docker-archive:...` so the file can be imported with `docker load`. |
| `config/issuer-directory/issuer-directory.yaml` | `etc/issuer-directory.yaml` (customised) | Replace Authority issuer, tenant header, and log level as required. |
| `config/issuer-directory/csaf-publishers.json` | `src/IssuerDirectory/StellaOps.IssuerDirectory/data/csaf-publishers.json` or regional override | Operators can edit before import to add private publishers. |
| `secrets/issuer-directory/connection.env` | Secure secret store export (`ISSUER_DIRECTORY_MONGO_CONNECTION_STRING=`) | Encrypt at rest; Offline Kit importer places it in the Compose/Helm secret. |
| `docs/issuer-directory/deployment.md` | `docs/modules/issuer-directory/operations/deployment.md` | Ship alongside kit documentation for operators. |

> **Image digests:** Update `deploy/releases/2025.10-edge.yaml` (or the relevant manifest) with the exact digest before building the kit so `offline-manifest.json` can assert integrity.

## 2 · Compose (air-gapped) deployment
|
||||
1. Load images locally on the target:
|
||||
```bash
|
||||
docker load < images/issuer-directory-web.tar
|
||||
```
|
||||
2. Copy Compose artefacts:
|
||||
```bash
|
||||
cp deploy/compose/docker-compose.airgap.yaml .
|
||||
cp deploy/compose/env/airgap.env.example airgap.env
|
||||
cp secrets/issuer-directory/connection.env issuer-directory.mongo.env
|
||||
```
|
||||
3. Update `airgap.env` with site-specific values (Authority issuer, tenant, ports) and remove outbound endpoints.
|
||||
4. Bring up the service:
|
||||
```bash
|
||||
docker compose \
|
||||
--env-file airgap.env \
|
||||
--env-file issuer-directory.mongo.env \
|
||||
-f docker-compose.airgap.yaml up -d issuer-directory
|
||||
```
|
||||
5. Verify via `curl -k https://issuer-directory.airgap.local:8447/health/live`.
|
||||
|
||||
## 3 · Kubernetes (air-gapped) deployment
|
||||
1. Pre-load the OCI image into your local registry mirror and update `values-airgap.yaml` to reference it.
|
||||
2. Apply the secret bundled in the kit:
|
||||
```bash
|
||||
kubectl apply -f secrets/issuer-directory/connection-secret.yaml
|
||||
```
|
||||
(Generate this file during packaging with `kubectl create secret generic issuer-directory-secrets ... --dry-run=client -o yaml`.)
|
||||
3. Install/upgrade the chart:
|
||||
```bash
|
||||
helm upgrade --install stellaops deploy/helm/stellaops \
|
||||
-f deploy/helm/stellaops/values-airgap.yaml \
|
||||
--set services.issuer-directory.env.ISSUERDIRECTORY__AUTHORITY__ISSUER=https://authority.airgap.local/realms/stellaops
|
||||
```
|
||||
4. Confirm `issuer_directory_changes_total` is visible in your offline Prometheus stack.
|
||||
|
||||
## 4 · Import workflow summary
|
||||
1. Run `ops/offline-kit/build_offline_kit.py` with the additional artefacts noted above.
|
||||
2. Sign the resulting tarball and manifest (Cosign) and record the SHA-256 in the release notes.
|
||||
3. At the destination:
|
||||
```bash
|
||||
stellaops-cli offline kit import \
|
||||
--bundle stella-ops-offline-kit-<version>-airgap.tar.gz \
|
||||
--destination /opt/stellaops/offline-kit
|
||||
```
|
||||
4. Follow the Compose or Helm path depending on your topology.
|
||||
|
||||
## 5 · Post-import validation
|
||||
- [ ] `docker images | grep issuer-directory` (Compose) or `kubectl get deploy stellaops-issuer-directory` (Helm) shows the expected version.
|
||||
- [ ] `csaf-publishers.json` in the container matches the offline bundle (hash check).
|
||||
- [ ] `/issuer-directory/issuers` returns global seed issuers (requires token with `issuer-directory:read` scope).
|
||||
- [ ] Audit collection receives entries when you create/update issuers offline.
|
||||
- [ ] Offline kit manifest (`offline-manifest.json`) lists `images/issuer-directory-web.tar` and `config/issuer-directory/issuer-directory.yaml` with SHA-256 values you recorded during packaging.
|
||||
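The hash checks in the list above can be scripted. The following is a minimal Python sketch, not part of the Offline Kit tooling: it assumes you kept a mapping of kit-relative paths to the SHA-256 values recorded during packaging, and the helper names `sha256_of` and `find_mismatches` are hypothetical. It does not parse `offline-manifest.json`, whose schema is not described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large image tarballs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(kit_root: Path, recorded: dict[str, str]) -> list[str]:
    """Return kit-relative paths that are missing or whose digest differs from the recorded value."""
    mismatches = []
    for relative_path, expected in sorted(recorded.items()):
        candidate = kit_root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected.lower():
            mismatches.append(relative_path)
    return mismatches
```

Calling `find_mismatches(Path("/opt/stellaops/offline-kit"), recorded)` with the digests for `images/issuer-directory-web.tar` and `config/issuer-directory/issuer-directory.yaml` should return an empty list when the import is intact.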
@@ -8,13 +8,20 @@
|
||||
| SCANNER-DOCS-0002 | DONE (2025-11-02) | Docs Guild | Keep scanner benchmark comparisons (Trivy/Grype/Snyk) and deep-dive matrix current with source references. | Coordinate with docs/benchmarks owners |
|
||||
| SCANNER-DOCS-0003 | TODO | Docs Guild, Product Guild | Gather Windows/macOS analyzer demand signals and record findings in `docs/benchmarks/scanner/windows-macos-demand.md`. | Coordinate with Product Marketing & Sales enablement |
|
||||
| SCANNER-ENG-0008 | TODO | EntryTrace Guild, QA Guild | Maintain EntryTrace heuristic cadence per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Include quarterly pattern review + explain trace updates |
|
||||
| SCANNER-ENG-0009 | TODO | Ruby Analyzer Guild | SCANNER-ANALYZERS-RUBY-28-001..012 | Deliver Ruby analyzer parity and observation pipeline per gap doc (lockfiles, runtime graph, policy signals). | Design complete; fixtures published; CLI/Offline docs updated. |
|
||||
| SCANNER-ENG-0009 | DOING (2025-11-02) | Ruby Analyzer Guild | SCANNER-ANALYZERS-RUBY-28-001..012 | Deliver Ruby analyzer parity and observation pipeline per gap doc (lockfiles, runtime graph, policy signals). | Design complete; fixtures published; CLI/Offline docs updated. |
|
||||
| SCANNER-ENG-0010 | TODO | PHP Analyzer Guild | SCANNER-ANALYZERS-PHP-27-001..012 | Ship PHP analyzer pipeline (composer lock, autoload graph, capability signals) to close comparison gaps. | Analyzer + policy integration merged; fixtures + docs aligned. |
|
||||
| SCANNER-ENG-0011 | TODO | Language Analyzer Guild | — | Scope Deno runtime analyzer (lockfile resolver, import graphs) based on competitor techniques. | Design doc approved; backlog split into analyzer/runtime work. |
|
||||
| SCANNER-ENG-0012 | TODO | Language Analyzer Guild | — | Evaluate Dart analyzer requirements (pubspec parsing, AOT artifacts) to restore parity. | Investigation summary + task split filed with Dart guild. |
|
||||
| SCANNER-ENG-0013 | TODO | Swift Analyzer Guild | — | Plan Swift Package Manager coverage (Package.resolved, xcframeworks, runtime hints) with policy hooks. | Design brief approved; backlog seeded with analyzer tasks. |
|
||||
| SCANNER-ENG-0014 | TODO | Runtime Guild, Zastava Guild | — | Align Kubernetes/VM target coverage roadmap between Scanner and Zastava per comparison findings. | Joint roadmap doc approved; cross-guild tasks opened. |
|
||||
| SCANNER-ENG-0015 | TODO | Export Center Guild, Scanner Guild | — | Document DSSE/Rekor operator enablement guidance and rollout levers surfaced in gap analysis. | Playbook drafted; Export Center backlog updated. |
|
||||
| SCANNER-ENG-0016 | DOING (2025-11-02) | Ruby Analyzer Guild (Lockfile Squad) | Implement `RubyLockCollector` and vendor cache ingestion per design §4.1–4.3. | Coordinate fixtures under `fixtures/lang/ruby/lockfiles`; target alpha by Sprint 21. |
|
||||
| SCANNER-ENG-0017 | TODO | Ruby Analyzer Guild (Runtime Squad) | Build runtime require/autoload graph builder with tree-sitter Ruby per design §4.4. | Deliver edges with reason codes and integrate EntryTrace hints. |
|
||||
| SCANNER-ENG-0018 | TODO | Ruby Analyzer Guild (Capability Squad) | Emit Ruby capability and framework surface signals as defined in design §4.5. | Policy predicates prototyped; capability records available in SBOM overlays. |
|
||||
| SCANNER-ENG-0019 | TODO | Ruby Analyzer Guild, CLI Guild | Ship Ruby CLI verbs (`stella ruby inspect|resolve`) and Offline Kit packaging per design §4.6. | CLI commands documented; offline manifest updated; e2e tests pass. |
|
||||
| SCANNER-LIC-0001 | DOING (2025-11-02) | Scanner Guild, Legal Guild | Vet tree-sitter Ruby licensing and Offline Kit packaging requirements. | SPDX review complete; packaging plan approved. |
|
||||
| SCANNER-POLICY-0001 | TODO | Policy Guild, Ruby Analyzer Guild | Define Policy Engine predicates for Ruby groups/capabilities and align lattice weights. | Policy schema merged; tests cover new predicates. |
|
||||
| SCANNER-CLI-0001 | TODO | CLI Guild, Ruby Analyzer Guild | Coordinate CLI UX/help text for new Ruby verbs and update CLI docs. | CLI help + docs updated; golden outputs recorded. |
|
||||
| SCANNER-ENG-0002 | TODO | Scanner Guild, CLI Guild | Design Node.js lockfile collector/CLI validator per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Capture Surface & policy requirements before implementation |
|
||||
| SCANNER-ENG-0003 | TODO | Python Analyzer Guild, CLI Guild | Design Python lockfile/editable install parity checks per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Include policy predicates & CLI story in design |
|
||||
| SCANNER-ENG-0004 | TODO | Java Analyzer Guild, CLI Guild | Design Java lockfile ingestion & validation per `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md`. | Cover Gradle/SBT collectors, CLI verb, policy hooks |
|
||||
|
||||
137
docs/modules/scanner/design/ruby-analyzer.md
Normal file
@@ -0,0 +1,137 @@
|
||||
# Ruby Analyzer Parity Design (SCANNER-ENG-0009)
|
||||
|
||||
**Status:** Draft • Owner: Ruby Analyzer Guild • Updated: 2025-11-02
|
||||
|
||||
## 1. Goals & Non-Goals
|
||||
- **Goals**
|
||||
- Deterministically catalogue Ruby application dependencies (Gemfile/Gemfile.lock, vendored specs, .gem archives) for container layers and local workspaces.
|
||||
- Build runtime usage graphs (require/require_relative, Zeitwerk autoloads, Rack boot chains, Sidekiq/ActiveJob schedulers).
|
||||
- Emit capability signals (exec/fs/net/serialization, framework fingerprints, job schedulers) consumable by Policy Engine and explain traces.
|
||||
- Provide CLI verbs (`stella ruby inspect`, `stella ruby resolve`) and Offline Kit parity for air-gapped deployments.
|
||||
- **Non-Goals**
|
||||
- Shipping dynamic runtime profilers (log-based or APM) in this iteration.
|
||||
- Implementing UI changes beyond exposing explain traces the Policy/UI guilds already support.
|
||||
|
||||
## 2. Scope & Inputs
|
||||
| Input | Location | Notes |
|
||||
|-------|----------|-------|
|
||||
| Gemfile / Gemfile.lock | Source tree, layer filesystem | Handle multiple apps per repo; honour Bundler groups. |
|
||||
| Vendor bundles (`vendor/bundle`, `.bundle/config`) | Layer filesystem | Needed for offline/built images; avoid double-counting platform-specific gems. |
|
||||
| `.gemspec` files / cached specs | `~/.bundle/cache`, `vendor/cache`, gems in layers | Support deterministic parsing without executing gem metadata. |
|
||||
| Framework configs | `config/application.rb`, `config/routes.rb`, `config/sidekiq.yml`, etc. | Feed framework surface mapper. |
|
||||
| Container metadata | Layer digests via RustFS CAS | Support incremental composition per layer. |
|
||||
|
||||
## 3. High-Level Architecture
|
||||
```
|
||||
┌─────────────────────────┐ ┌────────────────────┐
|
||||
│ Bundler Lock Collector │───────▶│ Package Graph │
|
||||
└─────────────────────────┘ │ Aggregator │
|
||||
└─────────┬──────────┘
|
||||
┌─────────────────────────┐ │
|
||||
│ Gemspec Inspector │───────────────▶│
|
||||
└─────────────────────────┘ │
|
||||
▼
|
||||
┌────────────────────┐
|
||||
┌─────────────────────────┐ │ Runtime Graph │
|
||||
│ Require/Autoload Scan │───────▶│ Builder (Zeitwerk) │
|
||||
└─────────────────────────┘ └─────────┬──────────┘
|
||||
│
|
||||
▼
|
||||
┌────────────────────┐
|
||||
│ Capability Emitter │
|
||||
└─────────┬──────────┘
|
||||
│
|
||||
▼
|
||||
┌────────────────────┐
|
||||
│ SBOM Writer │
|
||||
│ + Policy Signals │
|
||||
└────────────────────┘
|
||||
```
|
||||
|
||||
## 4. Detailed Components
|
||||
### 4.1 Bundler Lock Collector
|
||||
- Parse `Gemfile.lock` deterministically (no network access) using the new `RubyLockCollector` under `StellaOps.Scanner.Analyzers.Lang.Ruby`.
|
||||
- Support alternative manifests (`gems.rb`, `gems.locked`) and workspace overrides.
|
||||
- Emit package nodes with fields: `name`, `version`, `source` (path/git/rubygems), `bundlerGroup[]`, `platform`, `declaredOnly` flag.
|
||||
- Implementation:
|
||||
- Reuse the parsing strategy from Trivy (`pkg/fanal/analyzer/language/ruby/bundler`), ported to C# with a streaming reader and stable ordering.
|
||||
- Integrate with Surface.Validation to enforce size limits and tenant allowlists for git/path sources.
|
||||
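To make the collector's output concrete, here is a minimal Python sketch of deterministic lockfile parsing. It is an illustration only: the design calls for a C# `RubyLockCollector`, and the names `LockEntry` and `parse_lock` are hypothetical. The sketch assumes the usual `Gemfile.lock` layout (resolved specs indented four spaces under `GEM`, `GIT`, or `PATH` sections) and that a platform, when present, follows the first hyphen in the version string. It omits `bundlerGroup[]` and `declaredOnly`, which cannot be derived from the lockfile alone.

```python
import re
from dataclasses import dataclass

# Resolved specs sit at exactly four spaces; their dependencies use six and are skipped.
SPEC_LINE = re.compile(r"^ {4}(?P<name>\S+) \((?P<version>[^)]+)\)$")
SOURCE_KINDS = {"GEM": "rubygems", "GIT": "git", "PATH": "path"}


@dataclass(frozen=True, order=True)
class LockEntry:
    name: str
    version: str
    platform: str
    source: str


def parse_lock(text: str) -> list[LockEntry]:
    """Return lockfile entries in a stable order, without any network or Ruby execution."""
    entries = set()
    section = None
    for line in text.splitlines():
        if line and not line.startswith(" "):
            section = line.strip()  # top-level header such as GEM, PLATFORMS, DEPENDENCIES
            continue
        match = SPEC_LINE.match(line)
        if section in SOURCE_KINDS and match:
            version, _, platform = match["version"].partition("-")
            entries.add(LockEntry(match["name"], version, platform or "ruby", SOURCE_KINDS[section]))
    return sorted(entries)
```

Sorting the de-duplicated entries is what keeps repeated runs byte-for-byte comparable, which is the property the golden snapshots in the testing strategy rely on.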
|
||||
### 4.2 Gemspec Inspector
|
||||
- Scan cached specs under `vendor/cache`, `.bundle/cache`, and gem directories to pick up transitive packages when lockfiles are missing.
|
||||
- Parse without executing Ruby by using a deterministic DSL subset (similar to Trivy gemspec parser).
|
||||
- Link results to lockfile entries by `<name, version, platform>`; create new records flagged `InferredFromSpec` when the lockfile is absent.
|
||||
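One way to read gemspec metadata without executing Ruby is to accept only literal string assignments and skip everything else. The Python sketch below illustrates that idea under stated assumptions: it is not the Trivy parser or the planned C# inspector, the function name `inspect_gemspec` is hypothetical, and the accepted subset (quoted `name`, `version`, and `platform` assignments) is far narrower than a real DSL subset would be.

```python
import re

# Only literal assignments such as `s.name = "rack"` are accepted; computed values are ignored.
ASSIGNMENT = re.compile(
    r"""^\s*\w+\.(name|version|platform)\s*=\s*(["'])([^"']+)\2""",
    re.MULTILINE,
)


def inspect_gemspec(text: str):
    """Return a package record for a gemspec, or None when name/version are not literals."""
    fields = {}
    for key, _quote, value in ASSIGNMENT.findall(text):
        fields.setdefault(key, value)  # first literal assignment wins
    if "name" not in fields or "version" not in fields:
        return None
    return {
        "name": fields["name"],
        "version": fields["version"],
        "platform": fields.get("platform", "ruby"),
        "provenance": "InferredFromSpec",
    }
```

A gemspec that sets its version from a constant (for example `Foo::VERSION`) yields `None` here instead of a guess, which keeps the output deterministic at the cost of coverage.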
|
||||
### 4.3 Package Aggregator
|
||||
- New orchestrator `RubyPackageAggregator` merges lock and gemspec data with installed gems from container layers (once runtime analyzer ships).
|
||||
- Precedence: Installed > Lockfile > Gemspec.
|
||||
- Deduplicate by package key (name+version+platform) and attach provenance bits for Policy Engine.
|
||||
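The precedence and de-duplication rules above can be sketched as follows. This is a Python illustration under assumptions, not the `RubyPackageAggregator` itself: the `origin` field, its three values, and the `aggregate` function are hypothetical names chosen for the example.

```python
# Lower number wins: Installed > Lockfile > Gemspec.
PRECEDENCE = {"installed": 0, "lockfile": 1, "gemspec": 2}


def aggregate(records):
    """Merge records by (name, version, platform), keeping the highest-precedence one."""
    merged = {}
    for record in records:
        key = (record["name"], record["version"], record["platform"])
        slot = merged.setdefault(key, {"winner": record, "provenance": set()})
        slot["provenance"].add(record["origin"])
        if PRECEDENCE[record["origin"]] < PRECEDENCE[slot["winner"]["origin"]]:
            slot["winner"] = record
    # Sort by package key and by precedence so the output order never depends on input order.
    return [
        {**slot["winner"], "provenance": sorted(slot["provenance"], key=PRECEDENCE.__getitem__)}
        for _key, slot in sorted(merged.items())
    ]
```

Each merged record keeps every origin that reported the package, so a consumer such as Policy Engine could distinguish a gem seen only in a gemspec from one confirmed as installed.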
|
||||
### 4.4 Runtime Graph Builder
|
||||
- Static analysis for `require`, `require_relative`, `autoload`, Zeitwerk conventions, and Rails initialisers.
|
||||
- Implementation phases:
|
||||
1. Parse AST using tree-sitter Ruby embedded under `StellaOps.Scanner.Analyzers.Lang.Ruby.Syntax` with deterministic bindings.
|
||||
2. Generate edges `entrypoint -> file` and `file -> package` with reason codes (`require-static`, `autoload-zeitwerk`, `autoload-const_missing`).
|
||||
3. Identify framework entrypoints (Rails controllers, Rack middleware, Sidekiq workers) via heuristics defined in `SCANNER-ANALYZERS-RUBY-28-*` tasks.
|
||||
- Output merges with EntryTrace usage hints to support runtime filtering in Policy Engine.
|
||||
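As a rough illustration of the edge shape, the Python sketch below extracts static `require` and `require_relative` targets with a regular expression. This is a deliberately simpler technique than the tree-sitter AST parsing the design specifies, and it misses autoloads, Zeitwerk conventions, and multi-line calls. The function name `scan_requires` and the `relative` field are hypothetical; only the `require-static` reason code comes from the design.

```python
import re

# Matches `require "x"` and `require_relative 'x'` at the start of a line.
# Targets containing `#` are rejected so interpolated (dynamic) paths are skipped.
REQUIRE = re.compile(
    r"""^\s*(require_relative|require)\s*\(?\s*(["'])([^"'#]+)\2""",
    re.MULTILINE,
)


def scan_requires(file_path: str, source: str):
    """Return `file -> target` edges for statically resolvable requires, in a stable order."""
    edges = []
    for keyword, _quote, target in REQUIRE.findall(source):
        edges.append({
            "from": file_path,
            "to": target,
            "reason": "require-static",
            "relative": keyword == "require_relative",
        })
    return sorted(edges, key=lambda edge: (edge["to"], edge["relative"]))
```

Commented-out requires do not match because the pattern is anchored to the start of the line, which is a crude stand-in for what an AST walk would give for free.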
|
||||
### 4.5 Capability & Surface Signals
|
||||
- Emit evidence documents for:
|
||||
- Process/exec usage (`Kernel.system`, `` `cmd` ``, `Open3`).
|
||||
- Network clients (`Net::HTTP`, `Faraday`, `Redis`, `ActiveRecord::Base.establish_connection`).
|
||||
- Serialization sinks (`Marshal.load`, `YAML.load`, `Oj.load`).
|
||||
- Job schedulers (Sidekiq, Resque, ActiveJob, Whenever, Clockwork) with schedule metadata.
|
||||
- Capability records flow to Policy Engine under `capability.ruby.*` namespaces to allow gating on dangerous constructs.
|
||||
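A line-based pattern scan gives a feel for what a capability record might contain. The Python sketch below is an assumption-heavy illustration, not the planned emitter: the patterns cover only a few of the constructs listed above, the `scan_capabilities` name is hypothetical, and the evidence hash is computed over a location-plus-match string chosen for this example.

```python
import hashlib
import re

CAPABILITY_PATTERNS = [
    ("exec", re.compile(r"\b(?:Kernel\.)?system\s*\(|\bOpen3\.|`[^`]+`")),
    ("net", re.compile(r"\bNet::HTTP\b|\bFaraday\b")),
    ("serialization", re.compile(r"\b(?:Marshal|YAML|Oj)\.load\b")),
]


def scan_capabilities(file_path: str, source: str):
    """Emit one record per (line, capability kind) match, skipping full-line comments."""
    records = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue
        for kind, pattern in CAPABILITY_PATTERNS:
            match = pattern.search(line)
            if match:
                location = f"{file_path}:{line_number}"
                evidence = f"{location}:{match.group(0)}"
                records.append({
                    "kind": f"capability.ruby.{kind}",
                    "location": location,
                    "evidenceHash": hashlib.sha256(evidence.encode("utf-8")).hexdigest(),
                })
    return records
```

Note that `YAML.safe_load` and `YAML.load_file` do not match the serialization pattern, and `system` called without parentheses is missed; a syntax-aware scan would be needed to close those gaps.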
|
||||
### 4.6 CLI & Offline Integration
|
||||
- Add CLI verbs:
|
||||
- `stella ruby inspect <path>` – runs collector locally, outputs JSON summary with provenance.
|
||||
- `stella ruby resolve --image <ref>` – fetches scan artifacts, prints dependency graph grouped by bundler group/platform.
|
||||
- Ship analyzer DLLs and rules in Offline Kit manifest; include autoload/zeitwerk fingerprints and heuristics hashed for determinism.
|
||||
|
||||
## 5. Data Contracts
|
||||
| Artifact | Shape | Consumer |
|
||||
|----------|-------|----------|
|
||||
| `ruby_packages.json` | Array `{id, name, version, source, provenance, groups[], platform}` | SBOM Composer, Policy Engine |
|
||||
| `ruby_runtime_edges.json` | Edges `{from, to, reason, confidence}` | EntryTrace overlay, Policy explain traces |
|
||||
| `ruby_capabilities.json` | Capability `{kind, location, evidenceHash, params}` | Policy Engine (capability predicates) |
|
||||
|
||||
All records follow AOC appender rules (immutable, tenant-scoped) and include `hash`, `layerDigest`, and `timestamp` normalized to UTC ISO-8601.
|
||||
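The envelope fields named above (`hash`, `layerDigest`, `timestamp`) could be attached along these lines. This Python sketch rests on assumptions the design does not state: that the hash covers a canonical JSON rendering of the payload only, that it is SHA-256 with a `sha256:` prefix, and that timestamps use second precision with a `Z` suffix. The `seal_record` name is hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def seal_record(payload: dict, layer_digest: str, observed_at: datetime) -> dict:
    """Return a copy of the payload with hash, layerDigest, and a UTC ISO-8601 timestamp."""
    if observed_at.tzinfo is None:
        raise ValueError("observed_at must be timezone-aware")
    timestamp = observed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Sorted keys and fixed separators make the hash independent of key insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        **payload,
        "hash": "sha256:" + digest,
        "layerDigest": layer_digest,
        "timestamp": timestamp,
    }
```

Because the function returns a new dictionary and never mutates its input, it fits the append-only, immutable handling the records are expected to follow.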
|
||||
## 6. Testing Strategy
|
||||
- **Fixtures**: Extend `fixtures/lang/ruby` with Rails, Sinatra, Sidekiq, Rack, container images (with/without vendor cache).
|
||||
- **Determinism**: Golden snapshots for package lists and capability outputs across repeated runs.
|
||||
- **Integration**: Worker e2e to ensure per-layer aggregation; CLI golden outputs (`stella ruby inspect`).
|
||||
- **Policy**: Unit tests verifying new predicates (`ruby.group`, `ruby.capability.exec`, etc.) in Policy Engine test suite.
|
||||
|
||||
## 7. Rollout Plan & Dependencies
|
||||
1. Implement collectors and aggregators (SCANNER-ANALYZERS-RUBY-28-001..004).
|
||||
2. Add capability analyzer and observations (SCANNER-ANALYZERS-RUBY-28-005..008).
|
||||
3. Wire CLI commands and Offline Kit packaging (SCANNER-ANALYZERS-RUBY-28-011).
|
||||
4. Update docs (DOCS-SCANNER-BENCH-62-009 follow-up) once the analyzer alpha is ready.
|
||||
|
||||
**Dependencies**
|
||||
- Tree-sitter Ruby grammar inclusion (needs Offline Kit packaging and licensing check).
|
||||
- Policy Engine support for new predicates and capability schemas.
|
||||
- Surface.Validation updates for git/path gem sources and secret resolution.
|
||||
|
||||
## 8. Open Questions
|
||||
- Do we require dynamic runtime logs (e.g., `ActiveSupport::Notifications`) for confidence boosts? (defer to future iteration)
|
||||
- Should we enforce signed gem provenance in MVP? Pending Product decision.
|
||||
- Need alignment with Export Center on Ruby-specific manifest emissions.
|
||||
|
||||
## 9. Licensing & Offline Packaging (SCANNER-LIC-0001)
|
||||
- **License**: tree-sitter core and `tree-sitter-ruby` grammar are MIT licensed (confirmed via upstream LICENSE files retrieved 2025-11-02).
|
||||
- **Obligations**:
|
||||
1. Include both MIT license texts in `/third-party-licenses/` and in Offline Kit manifests.
|
||||
2. Update `NOTICE.md` to acknowledge embedded grammars per company policy.
|
||||
3. Record the grammar commit hashes in build metadata; regenerate the C/WASM artifacts deterministically.
|
||||
4. Ensure build pipeline uses `tree-sitter-cli` only as a build-time tool (not redistributed) to avoid extra licensing obligations.
|
||||
- **Deliverables**:
|
||||
- SCANNER-LIC-0001 to capture Legal sign-off and update packaging scripts.
|
||||
- Export Center to mirror license files into Offline Kit bundle.
|
||||
|
||||
---
|
||||
*References:*
|
||||
- Trivy: `pkg/fanal/analyzer/language/ruby/bundler`, `pkg/fanal/analyzer/language/ruby/gemspec`
|
||||
- Gap analysis: `docs/benchmarks/scanner/scanning-gaps-stella-misses-from-competitors.md#ruby-analyzer-parity-trivy-grype-snyk`
|
||||