Restructure solution layout by module
# Policy Engine FAQ

Answers to questions that Support, Ops, and Policy Guild teams receive most frequently. Pair this FAQ with the [Policy Lifecycle](../policy/lifecycle.md), [Runs](../policy/runs.md), and [CLI guide](../cli/policy.md) for deeper explanations.

---

## Authoring & DSL

**Q:** *Lint succeeds locally, but submit still fails with `ERR_POL_001`. Why?*  
**A:** The CLI rejects submissions whose lint and compile artefacts are older than 24 hours. Re-run `stella policy lint` and `stella policy compile` before submitting, and make sure you upload the latest diff files with `--attach`.
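
The 24-hour freshness rule can be pre-checked before submitting. The following is a minimal sketch, not the CLI's actual implementation: the function names and the artefact labels are hypothetical, and it assumes an artefact aged exactly 24 hours already counts as stale.

```python
from datetime import datetime, timedelta, timezone

# Window taken from the answer above; the exact boundary handling is an assumption.
MAX_ARTEFACT_AGE = timedelta(hours=24)


def is_fresh(produced_at: datetime, now: datetime) -> bool:
    """Return True if the artefact was produced less than 24 hours before `now`."""
    return now - produced_at < MAX_ARTEFACT_AGE


def stale_artefacts(artefacts: dict[str, datetime], now: datetime) -> list[str]:
    """Return the (hypothetical) artefact labels that need regenerating."""
    return sorted(name for name, ts in artefacts.items() if not is_fresh(ts, now))


if __name__ == "__main__":
    now = datetime(2025, 10, 26, 12, 0, tzinfo=timezone.utc)
    produced = {
        "lint": now - timedelta(hours=2),      # recent enough
        "compile": now - timedelta(hours=30),  # too old, re-run compile
    }
    print(stale_artefacts(produced, now))
```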

**Q:** *How do I layer tenant-specific overrides on top of the baseline policy?*  
**A:** Keep the baseline in `tenant-global`. For tenant overrides, create a policy that references the baseline via the CLI (`stella policy new --from baseline@<version>`), then adjust its rules. Activation is per tenant.

**Q:** *Can I import YAML/Rego policies from earlier releases?*  
**A:** There is no direct import. Use the migration script (`stella policy migrate legacy.yaml`), which outputs `stella-dsl@1` skeletons. Review them manually before submission.

---

## Simulation & Determinism

**Q:** *Simulation shows huge differences even though I only tweaked metadata. What did I miss?*  
**A:** Check whether your simulation used the same SBOM set and environment as previous runs. The CLI defaults to golden fixtures, while the UI can store custom presets. Large diffs may also indicate Concelier updates; compare advisory cursors in the Simulation tab.

**Q:** *How do we guard against non-deterministic behaviour?*  
**A:** CI runs `policy simulate` twice with identical inputs and compares outputs (`DEVOPS-POLICY-20-003`). Any difference fails the pipeline. Locally, you can use `stella policy run replay` to verify determinism.
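
The run-twice-and-compare idea can be illustrated with a small sketch. It does not reproduce the CI job itself: `simulate` stands in for whatever produces a simulation result, both example simulations are hypothetical, and hashing canonical JSON is just one reasonable way to compare outputs.

```python
import hashlib
import itertools
import json
from typing import Any, Callable


def digest(result: Any) -> str:
    """Hash a JSON-serialisable result in a canonical form (sorted keys, fixed separators)."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(simulate: Callable[[dict], Any], inputs: dict) -> bool:
    """Run `simulate` twice on identical inputs and compare the output digests."""
    return digest(simulate(inputs)) == digest(simulate(inputs))


# Hypothetical simulations, used only to exercise the check.
def stable_simulation(inputs: dict) -> dict:
    return {"blocked": sorted(inputs["findings"])}


_calls = itertools.count()


def unstable_simulation(inputs: dict) -> dict:
    # Leaks hidden state into the output, similar to reading the wall clock.
    return {"blocked": sorted(inputs["findings"]), "seq": next(_calls)}
```

A pipeline step built on this would fail the build whenever `is_deterministic` returns `False`.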

**Q:** *What happens if the determinism guard (`ERR_POL_004`) triggers?*  
**A:** Policy Engine halts the run, raises `policy.run.failed` with code `ERR_POL_004`, and switches to incident mode (100% sampling). Review recent code changes; the usual cause is a new helper that calls `DateTime.Now` or uses a non-allowlisted HTTP client.

---

## VEX & Suppressions

**Q:** *A vendor marked a CVE `not_affected` but the policy still blocks. Why?*  
**A:** Check the required justifications. The baseline policy only accepts `component_not_present` and `vulnerable_code_not_present`. Other justifications need explicit rules. Use `stella findings explain` to see which VEX statement was considered.
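
As a rough illustration of why a `not_affected` statement can still leave a finding blocked, the sketch below models the baseline behaviour described above. The function name is hypothetical and real policy evaluation happens in the policy DSL, not in Python; only the two accepted justification values come from this FAQ.

```python
from typing import Optional

# The two justifications the baseline policy accepts, per the answer above.
ACCEPTED_JUSTIFICATIONS = frozenset(
    {"component_not_present", "vulnerable_code_not_present"}
)


def vex_clears_finding(status: str, justification: Optional[str]) -> bool:
    """Return True only for `not_affected` statements with an accepted justification."""
    return status == "not_affected" and justification in ACCEPTED_JUSTIFICATIONS
```

Under this model, a vendor statement with status `not_affected` and a justification such as `vulnerable_code_not_in_execute_path` does not clear the finding until an explicit rule is added for it.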

**Q:** *Can we quiet a finding indefinitely?*  
**A:** Avoid indefinite quiets. Policy DSL requires an `until` timestamp. If the use case is permanent, move the rule into baseline logic with strong justification and documentation.
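
The effect of a mandatory `until` timestamp can be sketched as follows. This is an illustration under assumptions, not the DSL's own validation: the function name is hypothetical, and it assumes a quiet stops applying at the instant `until` is reached.

```python
from datetime import datetime
from typing import Optional


def quiet_is_active(until: Optional[datetime], now: datetime) -> bool:
    """Return True while the quiet is still in force.

    A missing `until` is rejected outright, mirroring the rule that
    quiets cannot be open-ended.
    """
    if until is None:
        raise ValueError("a quiet must carry an `until` timestamp")
    return now < until
```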

**Q:** *How do we detect overuse of suppressions?*  
**A:** Observability exports the `policy_suppressions_total` metric, and the CLI provides `stella policy stats`. Review both weekly; Support flags tenants whose suppressions grow faster than remediation tickets.

---

## Runs & Operations

**Q:** *Incremental runs are backlogged. What should we check first?*  
**A:** Inspect the `policy_run_queue_depth` and `policy_delta_backlog_age_seconds` dashboards. If queue depth is high, scale worker replicas or investigate upstream change storms (Concelier/Excititor). Use `stella policy run list --status failed` to see recent errors.

**Q:** *Full runs take longer than 30 min. Is that a breach?*  
**A:** The goal is ≤ 30 min, but large tenants may exceed it temporarily. Ensure Mongo indexes are current and that worker nodes meet the sizing guidance (4 vCPU). Consider sharding runs by SBOM group.

**Q:** *How do I replay a run for audit evidence?*  
**A:** `stella policy run replay <runId> --output replay.tgz` produces a sealed bundle. Upload it to the evidence locker or attach it to incident tickets.

---

## Approvals & Governance

**Q:** *Can authors approve their own policies?*  
**A:** No. Authority denies approval if `approved_by == submitted_by`. Assign at least two reviewers (one security, one product).
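
A pre-submission check along these lines can catch review problems early. It is only a sketch: the source states that Authority enforces the self-approval rule, while the reviewer mix is a recommendation, and the `Reviewer` type, its role labels, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    name: str
    role: str  # hypothetical labels: "security" or "product"


def review_problems(submitted_by: str, reviewers: list[Reviewer]) -> list[str]:
    """Return human-readable problems with a proposed reviewer list."""
    problems = []
    if any(r.name == submitted_by for r in reviewers):
        problems.append("author cannot approve their own policy")
    # Only reviewers other than the author count towards the recommended mix.
    roles = {r.role for r in reviewers if r.name != submitted_by}
    for required in ("security", "product"):
        if required not in roles:
            problems.append(f"missing {required} reviewer")
    return problems
```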

**Q:** *What scopes do bots need for CI pipelines?*  
**A:** Typically `policy:read`, `policy:simulate`, and `policy:runs`. Only grant `policy:run` if the pipeline should trigger runs. Never give CI tokens `policy:approve`.
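
The scope guidance above lends itself to a simple audit of the scopes granted to a CI token. The sketch below is an assumption-laden illustration: the scope names come from this FAQ, but the function, the report shape, and the idea of flagging any other scope as "unexpected" are hypothetical.

```python
BASELINE_CI_SCOPES = frozenset({"policy:read", "policy:simulate", "policy:runs"})
OPTIONAL_CI_SCOPES = frozenset({"policy:run"})  # only if the pipeline triggers runs
FORBIDDEN_CI_SCOPES = frozenset({"policy:approve"})


def audit_ci_scopes(granted: set[str]) -> dict[str, list[str]]:
    """Classify a CI token's scopes against the guidance above."""
    known = BASELINE_CI_SCOPES | OPTIONAL_CI_SCOPES | FORBIDDEN_CI_SCOPES
    return {
        "forbidden": sorted(granted & FORBIDDEN_CI_SCOPES),
        "missing": sorted(BASELINE_CI_SCOPES - granted),
        "unexpected": sorted(granted - known),
    }
```

A token is in good shape when all three lists come back empty.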

**Q:** *How do we manage policies in air-gapped deployments?*  
**A:** Use `stella policy bundle export --sealed` on a connected site, transfer the bundle via approved media, then run `stella policy bundle import` inside the enclave. Enable the `--sealed` flag in the CLI/UI to block accidental outbound calls.

---

## Troubleshooting

**Q:** *API calls return `403` despite a valid token.*  
**A:** Verify that the token's scope includes the specific operation (`policy:activate` vs `policy:run`). Check that the tenant header matches the token's tenant. Inspect Authority logs for the denial reason (`policy_scope_denied_total` metric).

**Q:** *`stella policy run` exits with code `30`.*  
**A:** This is a network/transport error. Check connectivity to the Policy Engine endpoint, TLS configuration, and CLI proxy settings.

**Q:** *The Explain drawer shows no VEX data.*  
**A:** Either no VEX statement matched or the tenant lacks the `findings:read` scope. If VEX data should exist, confirm Excititor ingestion and policy joiners (see Observability dashboards).

---

## Compliance Checklist

- [ ] FAQ linked from Console help menu and CLI `stella policy help`.
- [ ] Entries reviewed quarterly by Policy & Support Guilds.
- [ ] Answers cross-reference lifecycle, runs, observability, and governance docs.
- [ ] Incident/Escalation contact details kept current in Support playbooks.
- [ ] FAQ translated for supported locales (if applicable).

---

*Last updated: 2025-10-26 (Sprint 20).*