# Ruby Capability & Source Predicates (SCANNER-POLICY-0001)

**Status:** Implemented · Owner: Policy Guild · Updated: 2025-11-10

**Scope:** Extend Policy Engine DSL to consume Ruby analyzer metadata (`groups`, `declaredOnly`, capabilities, git/path provenance) emitted in Sprint 138.

---

## 1. Goals

1. Allow policies to express intent around Bundler groups (e.g., blocking `development` gems in production promotes).
2. Expose Ruby capability evidence (exec/net/serialization/job schedulers) as first-class predicates.
3. Differentiate package provenance: registry, git, path/vendor cache.
4. Ensure new predicates work in offline/air-gapped evaluation and export deterministically.

Non-goals: UI wiring (handled by Policy Studio team), policy templates rollout (tracked separately in DOCS-POLICY backlog).

## 2. Source Metadata

Scanner now emits the following fields per Ruby component:

| Field | Type | Example | Notes |
|-------|------|---------|-------|
| `groups` | `string` (semi-colon list) | `development;test` | Aggregated from manifest + lockfile. |
| `declaredOnly` | `bool` (string `"true"/"false"`) | `"false"` | False indicates vendor cache evidence present. |
| `source` | `string` | `git:https://github.com/example/git-gem.git@` | Registry (`https://`), `git:`, `path:`, `vendor-cache`. |
| `artifact` | `string?` | `vendor/cache/path-gem-2.1.3.gem` | Only when cached artefact observed. |
| Capability flags | `string -> bool` | `capability.exec = "true"` etc. | Includes scheduler sub-keys. |

## 3. Proposed Predicates

| Predicate | Signature | Description |
|-----------|-----------|-------------|
| `ruby.group(name: string)` | `bool` | True if component belongs to Bundler group `name`. |
| `ruby.groups()` | `set` | Returns all groups for aggregations. |
| `ruby.declared_only()` | `bool` | True when component has no vendor/installed evidence. |
| `ruby.source(kind?: string)` | `bool` | Kind matches prefix (`registry`, `git`, `path`, `vendor-cache`). |
| `ruby.capability(name: string)` | `bool` | Supported names: `exec`, `net`, `serialization`, `scheduler`, scheduler subtypes (`scheduler.activejob`, etc.). |
| `ruby.capability_any(names: set)` | `bool` | Utility predicate to check multiple capabilities. |

Implementation detail: compile-time validation ensures predicate usage only within Ruby component scope (similar to `node.group` pattern).

## 4. DSL & Engine Changes

1. **Schema mapping:** Update `ComponentFacts` model to surface new Ruby metadata in evaluation context.
2. **Predicate registry:** Add Ruby-specific predicate handlers to `PolicyPredicateRegistry` with deterministic ordering.
3. **Explain traces:** Include matched predicates + metadata in explain output.
4. **Exports:** Ensure Offline Kit bundles include updated predicate metadata (no runtime fetch).

## 5. Policy Templates (follow-up)

Create sample rules under `policy/templates/ruby`:

- Block `ruby.group("development")` when `promotion.target == "prod"`.
- Flag `ruby.capability("exec")` components unless allowlisted.
- Require `ruby.source("git")` packages to provide pinned hash allowlists.

Tracking: DOCS-POLICY follow-up (not part of SCANNER-POLICY-0001 initial kick-off).

## 6. Testing Strategy

- Unit tests for each predicate (true/false cases, unsupported values).
- Integration test tying sample Scanner payload to simulated policy evaluation.
- Determinism run: repeated evaluation with same snapshot must yield identical explain trace hash.
- Offline regression: ensure `seed-data/analyzers/ruby/git-sources` fixture flows through offline-kit policy evaluation script.

## 7. Timeline & Dependencies

| Step | Owner | Target |
|------|-------|--------|
| Predicate implementation + tests | Policy Engine Guild | Sprint 138 (in progress) |
| Offline kit regression update | Policy + Ops | Sprint 138 |
| Policy templates & docs | Docs Guild | Sprint 139 |

Dependencies: Scanner metadata in place (SCANNER-ENG-0016 DONE); no additional service contracts required.

## 8. Open Questions

1. Should `declaredOnly` interact with existing waiver semantics (e.g., treat as lower severity)? → Needs risk review.
2. Do we expose scheduler sub-types individually or aggregate under `ruby.capability("scheduler")` only? → Proposed to expose both for flexibility.
3. Is git URL normalization required (strip credentials, hash fragments)? → Ensure sanitization before evaluation.

Please comment in `docs/modules/policy/design/ruby-capability-predicates.md` or via SCANNER-POLICY-0001 sprint entry.
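
For reviewers who want something concrete to comment against, the predicate semantics from sections 2 and 3 can be sketched in a few lines of Python. This is an illustrative model only, not the Policy Engine implementation: the `RubyFacts` class, its method names, and the flat metadata dictionary are hypothetical. It also assumes that a missing `declaredOnly` is treated as false, that `source()` without a `kind` is true whenever a recognised source is present, that scheduler sub-keys are stored as `capability.scheduler.<subtype>`, and that `capability("scheduler")` aggregates its sub-types (the "expose both" proposal in open question 2).

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional

# Prefix -> kind mapping taken from the `source` row in section 2.
_SOURCE_PREFIXES = (
    ("git:", "git"),
    ("path:", "path"),
    ("vendor-cache", "vendor-cache"),
    ("https://", "registry"),
)


@dataclass
class RubyFacts:
    """Hypothetical read-only view over one Ruby component's metadata."""

    metadata: Dict[str, str] = field(default_factory=dict)

    def groups(self) -> FrozenSet[str]:
        # `groups` is a semicolon-separated list, e.g. "development;test".
        raw = self.metadata.get("groups", "")
        return frozenset(g.strip() for g in raw.split(";") if g.strip())

    def group(self, name: str) -> bool:
        return name in self.groups()

    def declared_only(self) -> bool:
        # Assumption: an absent field is treated as "false".
        return self.metadata.get("declaredOnly", "").lower() == "true"

    def source_kind(self) -> Optional[str]:
        src = self.metadata.get("source", "")
        for prefix, kind in _SOURCE_PREFIXES:
            if src.startswith(prefix):
                return kind
        return None

    def source(self, kind: Optional[str] = None) -> bool:
        # Assumption: with no `kind`, any recognised source counts as a match.
        actual = self.source_kind()
        return actual is not None if kind is None else actual == kind

    def capability(self, name: str) -> bool:
        # Assumption: "scheduler" also matches "capability.scheduler.<subtype>".
        key = f"capability.{name}"
        return any(
            (k == key or k.startswith(key + ".")) and v.lower() == "true"
            for k, v in self.metadata.items()
        )

    def capability_any(self, names: Iterable[str]) -> bool:
        return any(self.capability(n) for n in names)
```

Under these assumptions, a component carrying `groups = "development;test"` and `capability.scheduler.activejob = "true"` would satisfy `group("development")`, `capability("scheduler")`, and `capability("scheduler.activejob")`, but not `capability("exec")`. The sketch reads only the component's own metadata, in keeping with the offline evaluation goal in section 1.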