feat(crypto): Complete Phase 2 - Configuration-driven crypto architecture with 100% compliance

## Summary

This commit completes Phase 2 of the configuration-driven crypto architecture. Production
code no longer contains hardcoded cryptographic implementations: every crypto operation goes
through the ICryptoProvider abstraction (100% crypto compliance).

## Key Changes

### Phase 1: Plugin Loader Infrastructure
- **Plugin Discovery System**: Created StellaOps.Cryptography.PluginLoader with manifest-based loading
- **Configuration Model**: Added CryptoPluginConfiguration with regional profiles support
- **Dependency Injection**: Extended DI to support plugin-based crypto provider registration
- **Regional Configs**: Created appsettings.crypto.{international,russia,eu,china}.yaml
- **CI Workflow**: Added .gitea/workflows/crypto-compliance.yml for audit enforcement

### Phase 2: Code Refactoring
- **API Extension**: Added ICryptoProvider.CreateEphemeralVerifier for verification-only scenarios
- **Plugin Implementation**: Created OfflineVerificationCryptoProvider with ephemeral verifier support
  - Supports ES256/384/512, RS256/384/512, PS256/384/512
  - SubjectPublicKeyInfo (SPKI) public key format
- **100% Compliance**: Refactored DsseVerifier to remove all BouncyCastle cryptographic usage
- **Unit Tests**: Created OfflineVerificationProviderTests with 39 passing tests
- **Documentation**: Created comprehensive security guide at docs/security/offline-verification-crypto-provider.md
- **Audit Infrastructure**: Created scripts/audit-crypto-usage.ps1 for static analysis
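
The algorithm identifiers listed above are the JOSE names from RFC 7518. As a rough illustration of what a verification-only provider has to resolve before checking a signature, the sketch below maps each identifier to its signature family, hash, and curve. `AlgorithmSpec` and `resolve_algorithm` are hypothetical names used for illustration only; they are not part of the StellaOps API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlgorithmSpec:
    family: str            # signature scheme family
    hash_name: str         # hash function used by the scheme
    curve: Optional[str]   # NIST curve for ECDSA, None for RSA schemes


# Mapping per RFC 7518 section 3.1. Note that ES512 uses curve P-521, not "P-512".
JOSE_ALGORITHMS = {
    "ES256": AlgorithmSpec("ECDSA", "SHA-256", "P-256"),
    "ES384": AlgorithmSpec("ECDSA", "SHA-384", "P-384"),
    "ES512": AlgorithmSpec("ECDSA", "SHA-512", "P-521"),
    "RS256": AlgorithmSpec("RSASSA-PKCS1-v1_5", "SHA-256", None),
    "RS384": AlgorithmSpec("RSASSA-PKCS1-v1_5", "SHA-384", None),
    "RS512": AlgorithmSpec("RSASSA-PKCS1-v1_5", "SHA-512", None),
    "PS256": AlgorithmSpec("RSASSA-PSS", "SHA-256", None),
    "PS384": AlgorithmSpec("RSASSA-PSS", "SHA-384", None),
    "PS512": AlgorithmSpec("RSASSA-PSS", "SHA-512", None),
}


def resolve_algorithm(alg_id: str) -> AlgorithmSpec:
    """Look up an algorithm identifier, rejecting anything not in the table."""
    try:
        return JOSE_ALGORITHMS[alg_id]
    except KeyError:
        raise ValueError(f"unsupported algorithm: {alg_id}") from None
```

Rejecting unknown identifiers up front keeps the verifier from silently falling back to a default algorithm.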

### Testing Infrastructure (TestKit)
- **Determinism Gate**: Created DeterminismGate for reproducibility validation
- **Test Fixtures**: Added PostgresFixture and ValkeyFixture using Testcontainers
- **Traits System**: Implemented test lane attributes for parallel CI execution
- **JSON Assertions**: Added CanonicalJsonAssert for deterministic JSON comparisons
- **Test Lanes**: Created test-lanes.yml workflow for parallel test execution

### Documentation
- **Architecture**: Created CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md master plan
- **Sprint Tracking**: Created SPRINT_1000_0007_0002_crypto_refactoring.md (COMPLETE)
- **API Documentation**: Updated docs2/cli/crypto-plugins.md and crypto.md
- **Testing Strategy**: Created testing strategy documents in docs/implplan/SPRINT_5100_0007_*

## Compliance & Testing

- Zero direct System.Security.Cryptography usage in production code
- All crypto operations go through ICryptoProvider abstraction
- 39/39 unit tests passing for OfflineVerificationCryptoProvider
- Build successful (AirGap, Crypto plugin, DI infrastructure)
- Audit script validates crypto boundaries

## Files Modified

**Core Crypto Infrastructure:**
- src/__Libraries/StellaOps.Cryptography/CryptoProvider.cs (API extension)
- src/__Libraries/StellaOps.Cryptography/CryptoSigningKey.cs (verification-only constructor)
- src/__Libraries/StellaOps.Cryptography/EcdsaSigner.cs (fixed ephemeral verifier)

**Plugin Implementation:**
- src/__Libraries/StellaOps.Cryptography.Plugin.OfflineVerification/ (new)
- src/__Libraries/StellaOps.Cryptography.PluginLoader/ (new)

**Production Code Refactoring:**
- src/AirGap/StellaOps.AirGap.Importer/Validation/DsseVerifier.cs (100% compliant)

**Tests:**
- src/__Libraries/__Tests/StellaOps.Cryptography.Plugin.OfflineVerification.Tests/ (new, 39 tests)
- src/__Libraries/__Tests/StellaOps.Cryptography.PluginLoader.Tests/ (new)

**Configuration:**
- etc/crypto-plugins-manifest.json (plugin registry)
- etc/appsettings.crypto.*.yaml (regional profiles)

**Documentation:**
- docs/security/offline-verification-crypto-provider.md (600+ lines)
- docs/implplan/CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md (master plan)
- docs/implplan/SPRINT_1000_0007_0002_crypto_refactoring.md (Phase 2 complete)

## Next Steps

Phase 3: Docker & CI/CD Integration
- Create multi-stage Dockerfiles with all plugins
- Build regional Docker Compose files
- Implement runtime configuration selection
- Add deployment validation scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
master
2025-12-23 18:20:00 +02:00
parent b444284be5
commit dac8e10e36
241 changed files with 22567 additions and 307 deletions


@@ -0,0 +1,46 @@
# Air-gap bundles and formats
Air-gapped deployments use signed bundles with deterministic manifests. Bundles
are verified before import and tracked by mirror generation.
Bundle types
- Mirror and bootstrap bundles (images, charts, plugins).
- Advisory and VEX bundles with AOC guardrails.
- Risk and EPSS bundles for scoring.
- Symbol bundles for reachability overlays.
- Evidence bundles for findings and decisions.
- Revocation bundles for Authority token and key revocations.
Bundle format (offline bundles)
- Archive: .stella.bundle.tgz with deterministic tar settings.
- manifest.json lists entries with sha256 hashes and sizes.
- DSSE envelope signs the manifest payload.
- Optional receipt.json records import verification and audit metadata.
Manifest rules
- Sorted keys and stable ordering.
- SHA-256 digests for every entry.
- root_hash over all entries for quick validation.
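
The manifest rules above can be sketched as follows. This is a minimal illustration, not the actual bundle writer: the field names `path`, `sha256`, `size`, and `entries`, and the way `root_hash` is derived (SHA-256 over `path:sha256` lines in sorted order), are assumptions made for the example.

```python
import hashlib
import json


def build_manifest(entries: dict) -> dict:
    """Build a manifest from a mapping of archive path -> file content (bytes)."""
    items = [
        {"path": path, "sha256": hashlib.sha256(data).hexdigest(), "size": len(data)}
        for path, data in sorted(entries.items())  # stable ordering by path
    ]
    # Assumed root_hash construction: hash the sorted "path:sha256" lines.
    root = hashlib.sha256()
    for item in items:
        root.update(f"{item['path']}:{item['sha256']}\n".encode("utf-8"))
    return {"entries": items, "root_hash": root.hexdigest()}


def canonical_bytes(manifest: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so the bytes are reproducible."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
```

Because entries are sorted and keys are serialized in a fixed order, two builds from the same inputs produce byte-identical manifests, which is what makes the signed payload stable.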
Time anchors and staleness
- Time anchors are signed snapshots of time source state.
- Staleness checks gate use of bundles in sealed mode.
- Offline bundles should include time anchor and staleness metadata.
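
A staleness gate of the kind described above might look like the sketch below. The function name, the idea of a single maximum-age budget, and the choice to treat an anchor dated in the future as stale are all assumptions for illustration; the real policy lives in docs/airgap/staleness-and-time.md.

```python
from datetime import datetime, timedelta, timezone


def is_stale(anchor_time: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the time anchor is too old (or implausibly new) to trust."""
    if anchor_time.tzinfo is None or now.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    age = now - anchor_time
    # Assumed policy: an anchor from the future means the local clock cannot be trusted.
    if age < timedelta(0):
        return True
    return age > max_age
```

For example, with a 24-hour budget an anchor taken one hour ago passes the gate, while one taken two days ago does not.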
Sealed mode expectations
- Deny-all egress; only registered bundles are accepted.
- Imports emit audit events and are tracked by mirrorGeneration.
- UI displays sealed-mode banner with manifest hash and time anchor status.
Verification workflow
- Verify archive hash and DSSE signature.
- Validate manifest and any schema-specific entries.
- Reject bundles with missing provenance or invalid hashes.
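
For the DSSE step, the signature is not computed over the raw payload but over the pre-authentication encoding (PAE) defined by the DSSE specification. The sketch below reconstructs the bytes a verifier must check; the signature check itself is omitted because it depends on the configured crypto provider. The helper names are illustrative, and the envelope is assumed to use standard base64 for `payload`.

```python
import base64


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1 <len(type)> <type> <len(body)> <body>'."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def signed_bytes(envelope: dict) -> bytes:
    """Return the exact bytes each signature in the envelope is expected to cover."""
    payload = base64.b64decode(envelope["payload"])
    return pae(envelope["payloadType"], payload)
```

Binding the payload type into the signed bytes is what prevents a valid signature over one document type from being replayed as another.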
Related references
- docs/airgap/overview.md
- docs/airgap/offline-bundle-format.md
- docs/airgap/staleness-and-time.md
- docs/airgap/portable-evidence.md
- docs/airgap/symbol-bundles.md
- docs/security/revocation-bundle.md


@@ -0,0 +1,27 @@
# Air-gap runbooks (summary)
Core runbooks
- Import and verify: unpack bundle, validate manifest, verify DSSE signatures.
- AV scan: scan bundle contents before import if required by policy.
- Quarantine: isolate bundles with hash or signature mismatches.
- Sealed startup diagnostics: confirm egress block and time anchor validity.
Import and verify
- Validate bundle hash, manifest entries, and schema checks.
- Record import receipt with operator, time anchor, and manifest hash.
- Reject and log any mismatches or missing provenance.
Quarantine handling
- Preserve the original bundle and verification logs.
- Open an incident if mismatches indicate tampering.
- Re-import only after a new bundle is signed and verified.
Operational notes
- Keep previous mirror generation as rollback baseline.
- Use deterministic tools and fixed ordering for all checks.
Related references
- docs/airgap/runbooks/import-verify.md
- docs/airgap/runbooks/av-scan.md
- docs/airgap/runbooks/quarantine-investigation.md
- docs/airgap/sealed-startup-diagnostics.md


@@ -32,3 +32,7 @@
- DSSE envelopes and cached transparency proofs enable local verification.
- Reachability and replay bundles can be verified without network access.
- Keep analyzer manifests and policy hashes with the replay bundle.
## Related references
- docs2/operations/airgap-bundles.md
- docs2/operations/airgap-runbooks.md


@@ -0,0 +1,17 @@
# Binary prerequisites
StellaOps supports offline operation by pinning binaries and packages in
local mirrors with deterministic manifests.
Layout
- local-nugets/ for NuGet packages and cache.
- vendor/ for pinned third-party binaries.
- offline/feeds/ for air-gap bundles.
Rules
- Update manifest files when adding binaries.
- Prefer source builds when possible.
- Enforce offline builds with local sources first.
Related references
- docs/ops/binary-prereqs.md


@@ -0,0 +1,28 @@
# Deployment versioning
StellaOps uses environment-specific version tags and promotion steps to keep
deployments reproducible and auditable.
Version tags
- Release tags follow semver (X.Y.Z).
- Environment variants append a suffix to the version (for example, X.Y.Z-airgap).
- Immutable deployments use digests instead of tags.
Promotion model
- Dev to staging: unit and integration tests are green.
- Staging to prod: end-to-end, security, and performance gates pass.
- Prod to airgap: offline validation and bundle verification complete.
Naming conventions
- registry/<service>:<version>
- registry/<service>:<version>-<variant>
- registry/<service>@sha256:<digest>
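
The three naming forms above can be told apart mechanically. The parser below is a sketch under the assumption that versions are plain X.Y.Z semver and variants are lowercase alphanumeric; `ImageRef` and `parse_image_ref` are hypothetical names, not part of any StellaOps tooling.

```python
import re
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    repository: str
    version: Optional[str]
    variant: Optional[str]
    digest: Optional[str]


# repository@sha256:<64 hex chars>
_DIGEST_REF = re.compile(r"(?P<repo>[^@]+)@sha256:(?P<digest>[0-9a-f]{64})")
# repository:X.Y.Z or repository:X.Y.Z-variant (greedy repo allows registry ports)
_TAG_REF = re.compile(
    r"(?P<repo>.+):(?P<version>\d+\.\d+\.\d+)(?:-(?P<variant>[a-z0-9]+))?"
)


def parse_image_ref(ref: str) -> ImageRef:
    """Classify a reference as digest-pinned or version-tagged, else raise ValueError."""
    match = _DIGEST_REF.fullmatch(ref)
    if match:
        return ImageRef(match["repo"], None, None, match["digest"])
    match = _TAG_REF.fullmatch(ref)
    if match:
        return ImageRef(match["repo"], match["version"], match["variant"], None)
    raise ValueError(f"unrecognized image reference: {ref}")
```

A deployment check could then require `digest is not None` for air-gapped imports, in line with the guidance to pin digests there.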
Operational guidance
- Keep version matrices in sync with release bundles.
- Use pinned digests for air-gapped imports.
- Record promotion metadata with evidence bundles.
Related references
- docs/deployment/VERSION_MATRIX.md
- docs/13_RELEASE_ENGINEERING_PLAYBOOK.md


@@ -0,0 +1,40 @@
# Notifications Studio
Notifications Studio turns platform events into tenant-scoped alerts that are
explainable, deterministic, and offline friendly.
Core capabilities
- Rules engine for filtering by event kind, severity, and context.
- Channel connectors for chat, email, and webhook delivery.
- Templates with deterministic rendering and safe helpers.
- Digests to coalesce bursts into scheduled summaries.
- Delivery ledger for audit and troubleshooting.
Operational model
- Notify.Worker evaluates rules per tenant.
- Connectors deliver rendered payloads and report outcomes.
- Notify.WebService exposes API endpoints for UI and CLI.
Security and governance
- Tenancy enforced on all rules and deliveries.
- Secrets are referenced via secretRef, not stored in config.
- Ack tokens are DSSE signed and authority scoped.
- Webhook deliveries are HMAC-SHA256 signed with nonce or timestamp.
- Outbound allowlists block public egress in sealed deployments.
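
To illustrate the webhook signing rule, here is a minimal sketch of HMAC-SHA256 signing with a timestamp. The message layout (`<timestamp>.<body>`), the hex encoding, and the 300-second tolerance are assumptions for the example; the actual header names and layout used by Notify are not stated here.

```python
import hashlib
import hmac


def sign_webhook(secret: bytes, timestamp: int, body: bytes) -> str:
    """Sign '<timestamp>.<body>' so the timestamp cannot be altered independently."""
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, timestamp: int, body: bytes, signature: str,
                   now: int, tolerance_seconds: int = 300) -> bool:
    """Reject deliveries outside the time window, then compare in constant time."""
    if abs(now - timestamp) > tolerance_seconds:
        return False
    expected = sign_webhook(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Including the timestamp in the signed message lets a receiver discard replayed deliveries without keeping per-message state; a nonce-based variant would need a seen-nonce store instead.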
Offline posture
- Offline kits bundle default rules, templates, and plugins.
- Deterministic rendering keeps hashes stable across environments.
Related references
- docs/notifications/overview.md
- docs/notifications/rules.md
- docs/notifications/templates.md
- docs/notifications/digests.md
- docs/modules/notify/architecture.md
- docs2/notifications/overview.md
- docs2/notifications/rules.md
- docs2/notifications/channels.md
- docs2/notifications/templates.md
- docs2/notifications/digests.md
- docs2/notifications/pack-approvals.md


@@ -0,0 +1,35 @@
# Quickstart
This quickstart covers a minimal first scan in a local or lab environment.
It assumes container runtime access and a basic Docker or Kubernetes setup.
Prerequisites
- Linux host with container runtime and Compose or Kubernetes.
- Local PostgreSQL and Valkey instances, or the bundled containers.
- Sufficient disk for SBOM caches and bundles.
Baseline steps
1) Prepare configuration
- Set admin credentials and service endpoints.
- Use local or bundled database and cache for first run.
2) Start core services
- Bring up Authority, Scanner, Concelier, Policy, and UI services.
- Confirm health endpoints are ready.
3) Run first scan
- Authenticate CLI with Authority.
- Submit a scan for a known image or SBOM.
4) Verify results
- Open the Console to inspect findings and evidence.
- Export a DSSE bundle and verify signatures.
Offline and sovereign notes
- Offline kits bundle feeds, plugins, and config for sealed installs.
- Crypto profiles can be applied without rebuilding services.
Related references
- docs/quickstart.md
- docs/21_INSTALL_GUIDE.md
- docs/24_OFFLINE_KIT.md


@@ -0,0 +1,26 @@
# Router rate limiting
Router rate limiting is enforced at the gateway to avoid per-service throttling.
It supports instance-local and environment-wide limits.
Behavior
- Denied requests return 429 with Retry-After and rate limit headers.
- Response includes a JSON body with limit and window details.
Scopes
- Instance: in-memory sliding window per router instance.
- Environment: Valkey-backed fixed window across instances.
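
The instance scope can be pictured as a per-process sliding window. The sketch below is an illustration, not the router's implementation: the class name, the response body fields `limit` and `window_seconds`, and the use of only the `Retry-After` header are assumptions.

```python
import json
import math
from collections import deque


class SlidingWindowLimiter:
    """In-memory sliding window: at most `limit` requests per `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def check(self, now: float):
        """Return (allowed, retry_after_seconds) for a request arriving at `now`."""
        cutoff = now - self.window
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()  # drop hits that have left the window
        if len(self._hits) < self.limit:
            self._hits.append(now)
            return True, 0
        # The oldest hit leaves the window at hits[0] + window.
        retry_after = math.ceil(self._hits[0] + self.window - now)
        return False, max(retry_after, 1)


def denied_response(limit: int, window_seconds: float, retry_after: int):
    """Build a 429 response tuple: (status, headers, JSON body)."""
    headers = {"Retry-After": str(retry_after)}
    body = json.dumps({"limit": limit, "window_seconds": window_seconds}, sort_keys=True)
    return 429, headers, body
```

Only accepted requests are recorded, so a client that keeps retrying while denied does not extend its own lockout.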
Configuration
- rate_limiting.process_back_pressure_when_more_than_per_5min gates Valkey use.
- rules support multiple windows with AND semantics.
- microservice overrides replace default rules.
- route overrides apply per service route name.
Failover
- Environment limiting is fail-open when Valkey is unavailable.
- Instance limits remain active for baseline protection.
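
The fail-open behavior can be sketched with a fixed-window counter behind a store interface. Here `InMemoryCounterStore` stands in for Valkey, and the key format, exception type, and function names are hypothetical, chosen only to make the example self-contained.

```python
class StoreUnavailable(Exception):
    """Raised when the shared counter store cannot be reached."""


class InMemoryCounterStore:
    """Stand-in for a shared store such as Valkey; `available` simulates outages."""

    def __init__(self):
        self._counts = {}
        self.available = True

    def increment(self, key: str) -> int:
        if not self.available:
            raise StoreUnavailable(key)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def environment_allows(store, scope: str, limit: int,
                       window_seconds: int, now: float) -> bool:
    """Fixed-window check shared across instances; fail open if the store is down."""
    bucket = int(now // window_seconds)          # all instances agree on the bucket
    key = f"{scope}:{window_seconds}:{bucket}"   # assumed key layout
    try:
        count = store.increment(key)
    except StoreUnavailable:
        return True  # fail-open: instance-level limits still apply
    return count <= limit
```

Failing open trades strict enforcement for availability during a store outage, which is why the instance limiter is kept active as a baseline.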
Related references
- docs/router/rate-limiting.md
- docs/router/rate-limiting-routes.md


@@ -0,0 +1,20 @@
# Runtime readiness
Runtime readiness ensures services expose the metadata required by downstream
consumers and operations tooling.
Core checks
- Event schemas and samples are up to date.
- Signed report payloads include required summary fields.
- Scan progress events include stable data keys.
- Health and readiness endpoints reflect dependency checks.
Validation
- Validate event payloads against JSON schemas.
- Capture canonical samples for replay and regression tests.
- Verify DSSE signatures on report artifacts.
Related references
- docs/runtime/SCANNER_RUNTIME_READINESS.md
- docs/events/README.md
- docs/09_API_CLI_REFERENCE.md

docs2/operations/slo.md Normal file

@@ -0,0 +1,18 @@
# Service SLOs
Service level objectives define availability, latency, and queue health
expectations for core services.
Typical SLOs (example)
- API availability target per month.
- P95 run duration target for standard workflows.
- Queue backlog thresholds per tenant.
- Event delivery success targets.
Operational practice
- Track error budgets over a rolling window.
- Alert on burn rates and sustained backlog.
- Keep dashboards aligned with SLO definitions.
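
Error budgets and burn rates follow from the SLO target by simple arithmetic. The sketch below uses the common definition (burn rate is the observed error rate divided by the allowed error rate); the function names are illustrative and the numbers in the usage note are made-up examples, not StellaOps targets.

```python
def error_budget(slo_target: float) -> float:
    """Allowed failure fraction, e.g. roughly 0.001 for a 99.9% target."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    return 1.0 - slo_target


def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the allowed error rate over the same window."""
    if total == 0:
        return 0.0  # no traffic, no budget consumed
    return (failed / total) / error_budget(slo_target)


def budget_remaining(failed: int, total: int, slo_target: float) -> float:
    """Fraction of the window's error budget still unspent (negative if overspent)."""
    if total == 0:
        return 1.0
    return 1.0 - failed / (total * error_budget(slo_target))
```

A burn rate of 1 means the budget would be exactly used up by the end of the window; with a 99% target, 10 failures in 1000 requests burn at about 1, and 50 failures burn at about 5.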
Related references
- docs/slo/orchestrator-slo.md