feat: Add new projects to solution and implement contract testing documentation

- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution.
- Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done.
- Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
2025-10-27 07:57:55 +02:00
parent 1e41ba7ffa
commit 651b8e0fa3
355 changed files with 17276 additions and 1160 deletions


@@ -0,0 +1,356 @@
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
# Epic 17: SDKs and OpenAPI Docs
**Short name:** SDKs & OpenAPI
**Primary components:** API Gateway, Web Services, Policy Engine, Conseiller (Feedser), Excitator (VEXer), Orchestrator, Findings Ledger, Export Center, Authority & Tenancy, Console, CLI
**Surfaces:** OpenAPI 3.1 contracts, language SDKs (TS/Node, Python, Go, Java, C#), dev portal, examples, mock server, conformance tests, changelogs, versioning, deprecations
**Dependencies:** Authority-Backed Scopes & Tenancy, CLI Parity, Export Center, Notifications Studio, Air-Gapped Mode, Observability
**AOC ground rule reminder:** Conseiller and Excitator aggregate and link advisories/VEX. They never merge or mutate source records. SDKs must preserve this invariant and expose source identity in all models.
---
## 1) What it is
A contract-first program that standardizes StellaOps APIs with OpenAPI 3.1 and ships official, versioned SDKs for popular languages. It includes:
* A single **source-of-truth OpenAPI** for each service and a canonical aggregate spec.
* **Generated SDKs** with idiomatic ergonomics, retries, auth helpers, pagination cursors, streaming downloads, and typed error envelopes.
* A **developer portal** with interactive reference, runnable examples, and “copy-curl” snippets.
* **Mock server & conformance tests** so changes are validated against the contract before code ships.
* **Versioning & deprecation policy**, automated changelogs, and notification hooks.
* **Air-gapped bundles** of docs and SDKs for disconnected environments.
Net result: partners and internal teams integrate quickly without reverse-engineering request bodies from error logs.
---
## 2) Why
* Reduce friction and support load with a single, accurate contract.
* Make the platform extensible: third parties can build automation, dashboards, and policy pipelines without trawling source code.
* Enforce stability: contract linting and backwards-compat checks prevent accidental breakage.
* Bring CLI and Console parity to programmatic users through first-class clients.
---
## 3) How it should work
### 3.1 Source of truth and layout
* Each service owns a **module-scoped OAS** file: `src/StellaOps.Api.OpenApi/<service>/openapi.yaml`.
* An aggregate spec `src/StellaOps.Api.OpenApi/stella.yaml` is produced by build tooling that composes per-service specs, resolves `$ref`s, and validates cross-service schemas.
* JSON Schema dialect: 2020-12 (OpenAPI 3.1). No vendor-specific features for core models.
* Every response and error has at least one **validated example**.
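
The compose step can be sketched as follows. This is a minimal illustration, assuming per-service specs are already parsed into dictionaries with `$ref`s resolved; `compose_aggregate`, its conflict rules, and the placement of `x-stella-service` on each path item are assumptions, not the actual build tooling.

```python
from typing import Any, Dict


def compose_aggregate(specs: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge parsed per-service OpenAPI documents into one aggregate document.

    Assumed rules: a path may be owned by only one service; a component
    schema may be repeated across services only if the definitions are equal.
    """
    paths: Dict[str, Any] = {}
    schemas: Dict[str, Any] = {}
    path_owner: Dict[str, str] = {}

    for service, spec in sorted(specs.items()):
        for path, item in spec.get("paths", {}).items():
            if path in path_owner:
                raise ValueError(
                    f"{path} is defined by both {path_owner[path]} and {service}"
                )
            path_owner[path] = service
            # Tag each path item with its owning service for traceability.
            paths[path] = {**item, "x-stella-service": service}
        for name, schema in spec.get("components", {}).get("schemas", {}).items():
            existing = schemas.get(name)
            if existing is not None and existing != schema:
                raise ValueError(f"schema {name} differs between services")
            schemas[name] = schema

    return {
        "openapi": "3.1.0",
        "info": {"title": "StellaOps aggregate API", "version": "0.0.0"},
        "paths": paths,
        "components": {"schemas": schemas},
    }
```

Failing the build on a conflict, rather than letting the last service win, keeps cross-service schema disagreements visible at compose time.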
### 3.2 API conventions (normative)
* **Paths:** `/v1/{resource}`, plural nouns. Subresources use `/v1/resources/{id}/subresources`.
* **Identifiers:** `id` fields are ULID/UUIDv7 as strings.
* **Pagination:** cursor-based: `?cursor=<token>&limit=<n>`, response envelope includes `next_cursor`.
* **Sorting/filtering:** `?sort=field:asc|desc`, `?filter[field]=op:value` with documented operators.
* **Idempotency:** POST operations that create or mutate accept `Idempotency-Key`.
* **Errors:** single envelope:
```json
{
  "error": {
    "code": "STRING_CODE",
    "message": "human friendly",
    "details": { "field": "value" },
    "trace_id": "..."
  }
}
```
Standard codes include `AIRGAP_EGRESS_BLOCKED`, `POLICY_VIOLATION`, `NOT_FOUND`, `RATE_LIMITED`.
* **Auth:** OAuth2 client credentials and PAT. Scopes are explicit (see 14: Authority-Backed Scopes). Tenancy via claims; optional override header: `X-Stella-Scope: tenant/<id>` if the token permits delegation.
* **Content negotiation:** JSON only for request/response unless endpoint is a stream or file download (`application/octet-stream`).
* **Long-running operations:** either webhooks (if enabled) or polling via `operation_id` resource.
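
The cursor convention above lends itself to a generator-style paginator. The sketch below is illustrative only: `fetch_page` is a hypothetical transport callback, and the `items` field name is an assumption, since the conventions only specify `next_cursor`.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: takes query params, returns the decoded JSON envelope.
FetchPage = Callable[[Dict[str, Any]], Dict[str, Any]]


def paginate(fetch_page: FetchPage, limit: int = 100) -> Iterator[Any]:
    """Yield items across pages until the server stops returning next_cursor."""
    cursor: Optional[str] = None
    while True:
        params: Dict[str, Any] = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor
        page = fetch_page(params)
        # "items" is an assumed envelope field; only next_cursor is specified.
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            return
```

Because the function is a generator, callers can stop iterating early without fetching the remaining pages.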
### 3.3 Versioning and deprecation
* **SemVer** for the aggregate API: `v1`, `v2` in base path.
* Backwards-compatible changes allowed in minor versions (add fields, new optional params).
* Breaking changes require new major version and coexistence for a **deprecation window** (min 12 months) with:
* Deprecation headers: `Deprecation: true`, `Sunset: <rfc1123 date>`, `Link: <doc-url>; rel="deprecation"`.
* Portal banners and Notifications Studio broadcast.
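
A minimal sketch of how a service might build the deprecation headers, using only the standard library. `deprecation_headers` and the documentation URL are hypothetical; the `Link` value follows the standard `<target>; rel="..."` header syntax.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def deprecation_headers(sunset: datetime, doc_url: str) -> Dict[str, str]:
    """Build the response headers for a deprecated endpoint."""
    if sunset.tzinfo is None:
        raise ValueError("sunset must be timezone-aware")
    return {
        "Deprecation": "true",
        # usegmt=True renders the HTTP date form ending in "GMT".
        "Sunset": format_datetime(sunset.astimezone(timezone.utc), usegmt=True),
        "Link": f'<{doc_url}>; rel="deprecation"',
    }


headers = deprecation_headers(
    datetime(2026, 11, 1, tzinfo=timezone.utc),
    "https://docs.example.invalid/api/versioning",  # placeholder URL
)
```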
### 3.4 Governance and linting
* Enforce naming, pagination, error envelope, and example requirements via an OAS linter.
* CI gate: no PR merges if OAS validation fails or coverage < 100% for operation examples.
* **Compatibility check:** diff new OAS vs previous release, fail on breaking changes unless explicitly flagged.
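
The compatibility check can be approximated by a structural diff of two parsed specs. The sketch below covers only three kinds of breaking change (removed paths, removed operations, newly required parameters) and assumes `$ref`s are already resolved; `breaking_changes` is a hypothetical name, and a real checker would also compare schemas and response codes.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def breaking_changes(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """List removed paths, removed operations, and newly required parameters."""
    problems: List[str] = []
    new_paths = new.get("paths", {})
    for path, old_item in old.get("paths", {}).items():
        if path not in new_paths:
            problems.append(f"removed path {path}")
            continue
        for method, old_op in old_item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            new_op = new_paths[path].get(method)
            if new_op is None:
                problems.append(f"removed operation {method.upper()} {path}")
                continue
            old_required = {
                p.get("name") for p in old_op.get("parameters", []) if p.get("required")
            }
            for p in new_op.get("parameters", []):
                if p.get("required") and p.get("name") not in old_required:
                    problems.append(
                        f"new required parameter {p.get('name')} on {method.upper()} {path}"
                    )
    return problems
```

A CI gate would fail when the returned list is non-empty, unless the change is explicitly flagged as an intended break.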
### 3.5 SDK generation
* Initial languages: **TypeScript/Node**, **Python**, **Go**, **Java**. C# and Rust are followups.
* Generated via a stable, reproducible toolchain. Post-generation patches are applied by templates, not hand edits.
* **Capabilities:**
* Auth helpers: PAT and OAuth2.
* Retries with decorrelated jitter and `Retry-After` respect.
* Pluggable HTTP transport for proxies and air-gapped environments.
* Binary download helpers and upload helpers for multipart endpoints.
* Paginators that yield items and handle `next_cursor`.
* Rich error types mapping `error.code` to language enums.
* Telemetry hooks (before/after request callbacks).
* **Packaging:**
* TS: npm package with ESM and CJS builds, types included.
* Python: PyPI package, Pydantic-friendly models, type hints.
* Go: module with context-aware methods and `io.Reader` streaming.
* Java: Maven coordinates, builder pattern, OkHttp/HTTP client provider.
* **Versioning:** SDK major matches API major. Minor/patch track generator changes only.
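
The retry capability listed above can be sketched as follows. This is an illustration rather than the generated SDK code: the `Response` tuple, the default base and cap values, and the choice to retry on 429 and 5xx are assumptions, and the transport is assumed to have already parsed `Retry-After` into seconds.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport result: (status_code, Retry-After seconds or None, body).
Response = Tuple[int, Optional[float], bytes]


def next_delay(previous: float, base: float = 0.5, cap: float = 30.0,
               retry_after: Optional[float] = None,
               rng: Callable[[float, float], float] = random.uniform) -> float:
    """Decorrelated jitter: uniform in [base, previous * 3], capped at `cap`.

    A server-provided Retry-After value acts as a lower bound on the delay.
    """
    delay = min(cap, rng(base, max(base, previous * 3)))
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def send_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Retry on 429 and 5xx; return the first other response or the last one."""
    delay = 0.5
    response = send()
    for _ in range(max_attempts - 1):
        status, retry_after, _body = response
        if status != 429 and status < 500:
            return response
        delay = next_delay(delay, retry_after=retry_after)
        sleep(delay)
        response = send()
    return response
```

Retrying a POST this way is only safe when the request carries an `Idempotency-Key`, which is why the two helpers are listed together in the common SDK tasks.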
### 3.6 Dev portal and artifacts
* **Reference docs** auto-built from the aggregate OAS with searchable nav, schema diagrams, and example blocks.
* **Try-it** panel wired to the sandbox environment (disabled in air-gap).
* **Download center:** links to SDKs, changelogs, and Postman/HTTP collection exports.
* **.well-known discovery:** `GET /.well-known/openapi` returns the canonical spec.
### 3.7 Conformance testing
* **Mock server** generated from OAS for contract tests.
* **Replay tests**: real services are validated against the OAS via request/response capture; deviations fail CI.
* **Golden examples**: every endpoint has recorded examples exercised in tests.
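
The example requirement can be checked mechanically against a parsed spec. The sketch below is one assumed way to compute the coverage gap; `operations_missing_examples` is a hypothetical name, and it inspects only the `example` and `examples` fields of each response media type.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations_missing_examples(spec: Dict[str, Any]) -> List[str]:
    """Return one entry per response media type that has no example."""
    missing: List[str] = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue
            for status, response in operation.get("responses", {}).items():
                # Responses without a body (for example 204) have nothing to check.
                for media_type, media in response.get("content", {}).items():
                    if "example" not in media and not media.get("examples"):
                        missing.append(f"{method.upper()} {path} {status} {media_type}")
    return missing
```

A CI job can fail the build whenever the returned list is non-empty, which corresponds to the 100% example coverage gate.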
### 3.8 Air-Gapped support
* Export Center can build a **Docs & SDKs bundle**: `stella export devportal --offline`, including HTML docs, specs, and packages.
* SDKs avoid network discovery and accept explicit base URLs; no auto-updates.
### 3.9 Domain-specific notes
* **Conseiller/Excitator:** models expose `source_id`, `source_type`, `source_digest`. SDKs never hide source multiplicity.
* **Policy Engine:** policy documents are versioned; SDK supports dry-run/simulate endpoints with structured explanations.
* **Findings Ledger:** paginated listing includes stable, filterable fields for evidence export.
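
To illustrate the source-multiplicity rule for Conseiller/Excitator models, the sketch below groups records under a shared key without merging them. Only `source_id`, `source_type`, and `source_digest` come from the notes above; `advisory_key` and `link_by_advisory` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class SourceRecord:
    source_id: str
    source_type: str
    source_digest: str
    advisory_key: str  # hypothetical linking key, e.g. a CVE identifier


def link_by_advisory(
    records: Iterable[SourceRecord],
) -> Dict[str, Tuple[SourceRecord, ...]]:
    """Group records under a shared key; every source record is kept as-is."""
    linked: Dict[str, List[SourceRecord]] = {}
    for record in records:
        linked.setdefault(record.advisory_key, []).append(record)
    # Frozen records and tuples make accidental mutation or merging harder.
    return {key: tuple(group) for key, group in linked.items()}
```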
---
## 4) Architecture
### 4.1 New modules
* `src/StellaOps.Api.OpenApi/*`: per-service specs and the aggregate composer
* `src/StellaOps.Api.Governance`: OAS linter rules and compatibility checker
* `src/StellaOps.Sdk.Generator`: codegen drivers, post-processing templates, smoke tests
* `src/StellaOps.Sdk.Release`: packaging, signing, publishing
* `src/StellaOps.DevPortal.Site`: static generator and assets
* `test/contract`: mock server config, golden examples
* `src/StellaOps.ExportCenter.DevPortalOffline`: bundler
### 4.2 Build flow
1. Validate per-service specs → compose aggregate → lint → compatibility diff.
2. Generate SDKs → build → run language-level tests → publish to internal registry.
3. Build dev portal and publish.
4. Optionally build offline bundle.
### 4.3 Runtime contracts
* `GET /.well-known/openapi` per service and at the gateway.
* All services embed `x-stella-service` and `x-stella-version` extensions for traceability.
---
## 5) APIs and contracts (select)
* **Discovery**: `GET /.well-known/openapi` → JSON or YAML.
* **Errors**: standard envelope (see 3.2).
* **Rate limits**: expose `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`.
* **Operations**: long-running ops expose `operation_id` and `status` via `GET /v1/operations/{id}`.
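
On the client side, the standard error envelope can be mapped to typed errors. The sketch below is an assumption about how a Python SDK might do it: the class names and `error_from_body` are hypothetical, and only the four codes listed in section 3.2 are mapped.

```python
import json
from typing import Any, Dict, Optional


class StellaApiError(Exception):
    """Base error carrying the fields of the standard envelope."""

    def __init__(self, code: str, message: str,
                 details: Optional[Dict[str, Any]] = None,
                 trace_id: Optional[str] = None) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}
        self.trace_id = trace_id


class NotFoundError(StellaApiError): ...
class RateLimitedError(StellaApiError): ...
class PolicyViolationError(StellaApiError): ...
class AirgapEgressBlockedError(StellaApiError): ...


_ERROR_TYPES = {
    "NOT_FOUND": NotFoundError,
    "RATE_LIMITED": RateLimitedError,
    "POLICY_VIOLATION": PolicyViolationError,
    "AIRGAP_EGRESS_BLOCKED": AirgapEgressBlockedError,
}


def error_from_body(body: str) -> StellaApiError:
    """Decode an error envelope; unknown codes fall back to the base class."""
    envelope = json.loads(body)["error"]
    error_type = _ERROR_TYPES.get(envelope["code"], StellaApiError)
    return error_type(
        envelope["code"],
        envelope.get("message", ""),
        envelope.get("details"),
        envelope.get("trace_id"),
    )
```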
---
## 6) Documentation changes
Create or update:
1. `/docs/api/overview.md`
* API surface, auth, tenancy, pagination, idempotency, rate limits.
2. `/docs/api/conventions.md`
* Path, naming, errors, filters, sorting, examples.
3. `/docs/api/versioning.md`
* SemVer policy, deprecation windows, headers, migration playbooks.
4. `/docs/api/reference/`
* Auto-generated OAS site; link into service pages.
5. `/docs/sdks/overview.md`
* Supported languages, install, hello-world, retry/auth patterns.
6. `/docs/sdks/typescript.md`, `/python.md`, `/go.md`, `/java.md`
* Language-specific guides, snippets, paginator usage, streaming.
7. `/docs/devportal/publishing.md`
* Build pipeline, offline bundle steps.
8. `/docs/contributing/api-contracts.md`
* How to edit OAS, lint rules, compatibility checks, examples.
9. `/docs/testing/contract-testing.md`
* Mock server, golden examples, replay tests.
10. `/docs/security/auth-scopes.md`
* OAuth2, PAT, scope mapping, tenancy header.
11. `/docs/airgap/devportal-offline.md`
* Air-gapped docs and SDK bundle usage.
Add the banner at the top of each page:
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 7) Implementation plan
### Phase 1 — Foundations
* Establish per-service OAS skeletons and the aggregate composer.
* Introduce linting and compatibility checks in CI.
* Define the standard error envelope and migrate services.
### Phase 2 — Reference & discovery
* Implement `/.well-known/openapi` for gateway and services.
* Build the dev portal with search, schema diagrams, and examples.
### Phase 3 — SDKs (TS, Python, Go, Java)
* Implement generator drivers and templates.
* Publish alpha packages internally; integrate in CLI and Console integration tests.
* Add paginators, retries, auth helpers, and streaming.
### Phase 4 — Conformance & examples
* Wire mock server into PR CI.
* Record golden example fixtures and replay tests against staging.
* Automate example extraction into docs.
### Phase 5 — Release automation & deprecation
* Automate changelogs from OAS diffs.
* Notifications Studio integration for API deprecations.
* Offline dev portal bundle through Export Center.
### Phase 6 — Followups
* C# and Rust SDKs, Postman/HTTP collections, sample apps repo.
---
## 8) Engineering tasks
**OAS & governance**
* [ ] Create `src/StellaOps.Api.OpenApi/<service>/openapi.yaml` for all services with minimal paths and shared components.
* [ ] Implement aggregate composer and `$ref` resolver.
* [ ] Add CI job: lint, validate, compatibility diff; block merges on failure.
* [ ] Migrate all endpoints to standard error envelope and provide examples.
**Discovery & portal**
* [ ] Implement `GET /.well-known/openapi` at service and gateway.
* [ ] Build dev portal: nav, search, schema viewer, try-it (non-prod), copy-curl.
* [ ] Add version selector for v1/v2 specs.
**SDKs**
* [ ] Generator driver with pinned templates; forbid manual edits in generated folders.
* [ ] TS SDK: ESM/CJS build, tree-shaking, paginator, middleware hooks.
* [ ] Python SDK: async and sync clients, type hints, file upload/download helpers.
* [ ] Go SDK: context-first API, streaming, error type mapping.
* [ ] Java SDK: builder pattern, HTTP client provider abstraction.
* [ ] Common: retries, `Retry-After` handling, idempotency key helper, auth helpers, telemetry hooks.
* [ ] Language-specific tests and smoke examples.
**Conformance**
* [ ] Mock server config with operation examples.
* [ ] Replay tests against staging; fail on schema drift.
* [ ] Golden example extraction pipeline.
**Air-Gapped**
* [ ] Export Center job: `devportal --offline` producing HTML docs, specs, and package artifacts.
* [ ] SDKs accept explicit base URLs; disable online discovery.
**Authority & Tenancy**
* [ ] Document scopes per endpoint in OAS (`securitySchemes` + `security` blocks).
* [ ] Implement optional `X-Stella-Scope` override with validation.
**Release automation**
* [ ] Version bump tooling for OAS and SDKs; SemVer aligned.
* [ ] Auto-generate `CHANGELOG.md` from OAS diffs.
* [ ] Publish to registries with signed artifacts and provenance.
**Docs**
* [ ] Author all pages listed in section 6; embed code snippets pulled from tested examples.
* [ ] Insert banner statement in each page.
**Testing**
* [ ] Contract tests in PR CI; 100% operation coverage with at least one example.
* [ ] Language SDK integration tests against mock server and staging.
* [ ] Backwards-compat test suite comparing last N releases.
---
## 9) Feature changes required in other components
* **Web Services:** unify on error envelope, pagination, idempotency handling, and deprecation headers.
* **CLI:** consume the official TS or Go SDK instead of bespoke HTTP calls; this enforces parity.
* **Console:** use SDKs for backend calls where appropriate; helps dogfood the clients.
* **Export Center:** add `devportal --offline` and package signing.
* **Observability:** include `x-stella-service` and API version attributes in spans; trace IDs mirrored in error responses.
* **Notifications Studio:** templates for API deprecations and SDK updates.
* **Air-Gapped Mode:** ship offline dev portal and SDKs bundle; console disables try-it.
> **Imposed rule reminder:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 10) Acceptance criteria
* Aggregate OpenAPI validates, lints cleanly, and covers 100% of public endpoints with examples.
* `/.well-known/openapi` available at gateway and service level.
* Dev portal builds with search, example blocks, and version selector.
* TS/Python/Go/Java SDKs publish successfully; each has paginators, retries, auth helpers, streaming, and typed errors.
* CLI integrations pass using SDKs.
* Contract tests run in PR CI; schema drift causes failures.
* Deprecation headers and Notifications Studio flow proven in a staged deprecation.
* Offline dev portal bundle exports and renders in a sealed environment.
---
## 11) Risks and mitigations
* **Spec drift vs code reality:** mock-first development and replay tests keep services aligned with OAS.
* **Generator churn:** pin generator and templates; only update via planned minor releases.
* **Breaking changes under pressure:** enforce compatibility gate and documented exception process.
* **SDK ergonomics mismatch:** run language-native design reviews with maintainers before GA.
* **Air-gapped constraints:** pre-build full offline bundles; avoid dynamic CDN assets in docs.
---
## 12) Philosophy
* **Contract first, code second.** The spec is the product; servers and SDKs are implementations.
* **Stability over cleverness.** Boring, predictable APIs beat “magical” behavior.
* **Truth preservation.** Never hide or merge advisory/VEX sources; surface provenance everywhere.
* **Automation everywhere.** Humans shouldn't manually edit generated code or publish packages.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.