Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
Epic 17: SDKs and OpenAPI Docs
Short name: SDKs & OpenAPI
Primary components: API Gateway, Web Services, Policy Engine, Conseiller (Feedser), Excitator (VEXer), Orchestrator, Findings Ledger, Export Center, Authority & Tenancy, Console, CLI
Surfaces: OpenAPI 3.1 contracts, language SDKs (TS/Node, Python, Go, Java, C#), dev portal, examples, mock server, conformance tests, changelogs, versioning, deprecations
Dependencies: Authority‑Backed Scopes & Tenancy, CLI Parity, Export Center, Notifications Studio, Air‑Gapped Mode, Observability
AOC ground rule reminder: Conseiller and Excitator aggregate and link advisories/VEX. They never merge or mutate source records. SDKs must preserve this invariant and expose source identity in all models.
1) What it is
A contract‑first program that standardizes StellaOps’ APIs with OpenAPI 3.1 and ships official, versioned SDKs for popular languages. It includes:
- A single source‑of‑truth OpenAPI for each service and a canonical aggregate spec.
- Generated SDKs with idiomatic ergonomics, retries, auth helpers, pagination cursors, streaming downloads, and typed error envelopes.
- A developer portal with interactive reference, runnable examples, and “copy‑curl” snippets.
- Mock server & conformance tests so changes are validated against the contract before code ships.
- Versioning & deprecation policy, automated changelogs, and notification hooks.
- Air‑gapped bundles of docs and SDKs for disconnected environments.
Net result: partners and internal teams integrate quickly without reverse‑engineering request bodies from error logs.
2) Why
- Reduce friction and support load with a single, accurate contract.
- Make the platform extensible: third parties can build automation, dashboards, and policy pipelines without trawling source code.
- Enforce stability: contract linting and backwards‑compat checks prevent accidental breakage.
- Bring CLI and Console parity to programmatic users through first‑class clients.
3) How it should work
3.1 Source of truth and layout
- Each service owns a module‑scoped OAS file: src/StellaOps.Api.OpenApi/<service>/openapi.yaml.
- An aggregate spec, src/StellaOps.Api.OpenApi/stella.yaml, is produced by build tooling that composes per‑service specs, resolves $refs, and validates cross‑service schemas.
- JSON Schema dialect: 2020‑12 (OpenAPI 3.1). No vendor‑specific features for core models.
- Every response and error has at least one validated example.
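The composition step could look roughly like the sketch below. It assumes per‑service specs are already parsed into dicts with $refs resolved; the function name, the aggregate title, and the choice to stamp x-stella-service on each path item are illustrative assumptions, not the actual build tooling.

```python
from copy import deepcopy


def compose_aggregate(service_specs):
    """Merge per-service OpenAPI documents (parsed dicts) into one aggregate dict.

    Hypothetical helper: fails on path collisions and on shared schemas that
    differ between services, which is one way to "validate cross-service schemas".
    """
    aggregate = {
        "openapi": "3.1.0",
        "info": {"title": "StellaOps API (aggregate)", "version": "0.0.0"},  # placeholder values
        "paths": {},
        "components": {"schemas": {}},
    }
    for service, spec in sorted(service_specs.items()):
        for path, item in spec.get("paths", {}).items():
            if path in aggregate["paths"]:
                raise ValueError(f"path {path!r} is defined by more than one service")
            item = deepcopy(item)
            item["x-stella-service"] = service  # assumption: extension placed on the path item
            aggregate["paths"][path] = item
        for name, schema in spec.get("components", {}).get("schemas", {}).items():
            existing = aggregate["components"]["schemas"].get(name)
            if existing is not None and existing != schema:
                raise ValueError(f"schema {name!r} differs between services")
            aggregate["components"]["schemas"][name] = deepcopy(schema)
    return aggregate
```

Failing loudly on collisions keeps the aggregate a pure composition: no service can silently override another service's path or shared model.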
3.2 API conventions (normative)
- Paths: /v1/{resource}, plural nouns. Subresources use /v1/resources/{id}/subresources.
- Identifiers: id fields are ULID/UUIDv7 as strings.
- Pagination: cursor‑based: ?cursor=<token>&limit=<n>; the response envelope includes next_cursor.
- Sorting/filtering: ?sort=field:asc|desc, ?filter[field]=op:value with documented operators.
- Idempotency: POST operations that create or mutate accept Idempotency-Key.
- Errors: single envelope: { "error": { "code": "STRING_CODE", "message": "human friendly", "details": { "field": "value" }, "trace_id": "..." } }. Standard codes include AIRGAP_EGRESS_BLOCKED, POLICY_VIOLATION, NOT_FOUND, RATE_LIMITED.
- Auth: OAuth2 client credentials and PAT. Scopes are explicit (see 14: Authority‑Backed Scopes). Tenancy via claims; optional override header: X-Stella-Scope: tenant/<id> if the token permits delegation.
- Content negotiation: JSON only for request/response unless endpoint is a stream or file download (application/octet-stream).
- Long‑running operations: either webhooks (if enabled) or polling via operation_id resource.
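As an illustration of the error envelope, a client might map error.code onto an enum along these lines. This is a minimal sketch: ApiError, parse_error, and the UNKNOWN fallback member are hypothetical names, and only the four codes listed above are taken from this document.

```python
import json
from enum import Enum


class ErrorCode(str, Enum):
    AIRGAP_EGRESS_BLOCKED = "AIRGAP_EGRESS_BLOCKED"
    POLICY_VIOLATION = "POLICY_VIOLATION"
    NOT_FOUND = "NOT_FOUND"
    RATE_LIMITED = "RATE_LIMITED"
    UNKNOWN = "UNKNOWN"  # hypothetical fallback for codes this client does not know yet


class ApiError(Exception):
    """Typed view of the standard envelope; raw_code keeps the server's exact string."""

    def __init__(self, code, raw_code, message, details=None, trace_id=None):
        super().__init__(f"{raw_code}: {message}")
        self.code = code
        self.raw_code = raw_code
        self.message = message
        self.details = details or {}
        self.trace_id = trace_id


def parse_error(body):
    """Turn a JSON error body into an ApiError without losing unknown codes."""
    err = json.loads(body)["error"]
    raw = err["code"]
    try:
        code = ErrorCode(raw)
    except ValueError:
        code = ErrorCode.UNKNOWN
    return ApiError(code, raw, err.get("message", ""), err.get("details"), err.get("trace_id"))
```

Keeping raw_code alongside the enum means a server that adds a new code in a minor version does not break older clients.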
3.3 Versioning and deprecation
- SemVer for the aggregate API: v1, v2 in base path.
- Backwards‑compatible changes allowed in minor versions (add fields, new optional params).
- Breaking changes require new major version and coexistence for a deprecation window (min 12 months) with:
  - Deprecation headers: Deprecation: true, Sunset: <rfc1123 date>, Link: <doc>; rel="deprecation".
  - Portal banners and Notifications Studio broadcast.
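A sketch of how a service might build the deprecation headers, assuming the sunset moment is known as a timezone‑aware datetime. The function name and the documentation URL are placeholders; the Link value uses the standard <url>; rel="..." form.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset, doc_url):
    """Build the three deprecation headers for a deprecated operation.

    sunset must be timezone-aware; it is rendered as an RFC 1123 style
    HTTP date in GMT.
    """
    if sunset.tzinfo is None:
        raise ValueError("sunset must be timezone-aware")
    return {
        "Deprecation": "true",
        "Sunset": format_datetime(sunset.astimezone(timezone.utc), usegmt=True),
        "Link": f'<{doc_url}>; rel="deprecation"',
    }


# Placeholder URL; the real migration guide location is not defined here.
headers = deprecation_headers(
    datetime(2026, 1, 1, tzinfo=timezone.utc),
    "https://example.invalid/docs/api/versioning",
)
```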
3.4 Governance and linting
- Enforce naming, pagination, error envelope, and example requirements via an OAS linter.
- CI gate: no PR merges if OAS validation fails or coverage < 100% for operation examples.
- Compatibility check: diff new OAS vs previous release, fail on breaking changes unless explicitly flagged.
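The compatibility check can be pictured as a diff over two parsed specs. The sketch below covers only three kinds of breakage (removed paths, removed operations, newly required parameters) and assumes $refs are already resolved so every parameter carries a name; a real gate would need many more rules.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def breaking_changes(old, new):
    """List human-readable breaking changes between two OpenAPI dicts."""
    problems = []
    for path, old_item in old.get("paths", {}).items():
        new_item = new.get("paths", {}).get(path)
        if new_item is None:
            problems.append(f"removed path {path}")
            continue
        for method, old_op in old_item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters" or extensions
            new_op = new_item.get(method)
            if new_op is None:
                problems.append(f"removed operation {method.upper()} {path}")
                continue
            old_required = {p["name"] for p in old_op.get("parameters", []) if p.get("required")}
            for p in new_op.get("parameters", []):
                if p.get("required") and p["name"] not in old_required:
                    problems.append(
                        f"new required parameter {p['name']!r} on {method.upper()} {path}"
                    )
    return problems
```

A CI job would fail when the returned list is non‑empty unless the change has been explicitly flagged as an approved break.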
3.5 SDK generation
- Initial languages: TypeScript/Node, Python, Go, Java. C# and Rust are follow‑ups.
- Generated via a stable, reproducible toolchain. Post‑generation patches are applied by templates, not hand edits.
- Capabilities:
  - Auth helpers: PAT and OAuth2.
  - Retries with decorrelated jitter and Retry-After respect.
  - Pluggable HTTP transport for proxies and air‑gapped environments.
  - Binary download helpers and upload helpers for multipart endpoints.
  - Paginators that yield items and handle next_cursor.
  - Rich error types mapping error.code to language enums.
  - Telemetry hooks (before/after request callbacks).
- Packaging:
  - TS: npm package with ESM and CJS builds, types included.
  - Python: PyPI package, Pydantic‑friendly models, type hints.
  - Go: module with context‑aware methods and io.Reader streaming.
  - Java: Maven coordinates, builder pattern, OkHttp/HTTP client provider.
- Versioning: SDK major matches API major. Minor/patch track generator changes only.
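A paginator of the kind listed under Capabilities might be shaped like this. The sketch assumes the response envelope carries the page under an items key, which this document does not specify; fetch_page stands in for whatever transport call the SDK makes.

```python
def paginate(fetch_page, limit=100):
    """Yield items across pages until the server stops returning next_cursor.

    fetch_page(cursor, limit) is a caller-supplied function that performs the
    request and returns the decoded envelope as a dict.
    """
    cursor = None
    while True:
        page = fetch_page(cursor, limit)
        yield from page.get("items", [])  # "items" is an assumed field name
        cursor = page.get("next_cursor")
        if not cursor:
            return
```

Because it is a generator, callers can stop early without fetching the remaining pages.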
3.6 Dev portal and artifacts
- Reference docs auto‑built from the aggregate OAS with searchable nav, schema diagrams, and example blocks.
- Try‑it panel wired to the sandbox environment (disabled in air‑gap).
- Download center: links to SDKs, changelogs, and Postman/HTTP collection exports.
- .well‑known discovery: GET /.well-known/openapi returns the canonical spec.
3.7 Conformance testing
- Mock server generated from OAS for contract tests.
- Replay tests: real services are validated against the OAS via request/response capture; deviations fail CI.
- Golden examples: every endpoint has recorded examples exercised in tests.
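To make the replay idea concrete, the sketch below checks a captured response body against a schema fragment. It deliberately supports only type, required, properties, and items; a real pipeline would presumably use a full JSON Schema 2020‑12 validator, and the function name is hypothetical.

```python
TYPES = {
    "object": dict, "array": list, "string": str,
    "integer": int, "number": (int, float), "boolean": bool,
}


def violations(value, schema, path="$"):
    """Return a list of deviations between a captured value and a schema subset."""
    expected = schema.get("type")
    if expected in TYPES:
        ok = isinstance(value, TYPES[expected])
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False  # bool is a subclass of int in Python, but not a JSON number
        if not ok:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    found = []
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                found.append(f"{path}.{name}: missing required property")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                found.extend(violations(value[name], sub, f"{path}.{name}"))
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            found.extend(violations(item, schema["items"], f"{path}[{i}]"))
    return found
```

A replay test would run this over every captured request/response pair and fail CI when any list is non‑empty.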
3.8 Air‑Gapped support
- Export Center can build a Docs & SDKs bundle: stella export devportal --offline, including HTML docs, specs, and packages.
- SDKs avoid network discovery and accept explicit base URLs; no auto‑updates.
3.9 Domain‑specific notes
- Conseiller/Excitator: models expose source_id, source_type, source_digest. SDKs never hide source multiplicity.
- Policy Engine: policy documents are versioned; SDK supports dry‑run/simulate endpoints with structured explanations.
- Findings Ledger: paginated listing includes stable, filterable fields for evidence export.
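The source‑multiplicity rule can be illustrated with a small model. Only source_id, source_type, and source_digest come from this document; the class name, advisory_key, and status fields are hypothetical. The point of the sketch is that linking groups records under a key while leaving each source record intact, even when sources disagree.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedStatement:
    advisory_key: str   # hypothetical linking key, for example a CVE identifier
    source_id: str
    source_type: str
    source_digest: str
    status: str         # hypothetical per-source claim


def link_by_advisory(statements):
    """Group statements by advisory key without merging or rewriting any of them."""
    linked = defaultdict(list)
    for statement in statements:
        linked[statement.advisory_key].append(statement)
    return dict(linked)
```

A consumer that wants a single answer has to choose among the preserved statements explicitly, which keeps provenance visible.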
4) Architecture
4.1 New modules
- src/StellaOps.Api.OpenApi/*: per service and aggregate composer
- src/StellaOps.Api.Governance: OAS linter rules and compatibility checker
- src/StellaOps.Sdk.Generator: codegen drivers, post‑processing templates, smoke tests
- src/StellaOps.Sdk.Release: packaging, signing, publishing
- src/StellaOps.DevPortal.Site: static generator and assets
- test/contract: mock server config, golden examples
- src/StellaOps.ExportCenter.DevPortalOffline: bundler
4.2 Build flow
- Validate per‑service specs → compose aggregate → lint → compatibility diff.
- Generate SDKs → build → run language‑level tests → publish to internal registry.
- Build dev portal and publish.
- Optionally build offline bundle.
4.3 Runtime contracts
- GET /.well-known/openapi per service and at the gateway.
- All services embed x-stella-service and x-stella-version extensions for traceability.
5) APIs and contracts (select)
- Discovery: GET /.well-known/openapi → JSON or YAML.
- Errors: standard envelope (see 3.2).
- Rate limits: expose X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
- Operations: long‑running ops expose operation_id and status via GET /v1/operations/{id}.
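Polling a long‑running operation could be wrapped as below. The terminal status values are assumptions (this document names the status field but not its values), and get_operation stands in for the SDK call behind GET /v1/operations/{id}.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}  # assumed values, not from the spec


def wait_for_operation(get_operation, operation_id, interval=1.0, timeout=60.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the operation reaches a terminal status or the timeout elapses.

    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        operation = get_operation(operation_id)
        if operation["status"] in TERMINAL_STATUSES:
            return operation
        if clock() >= deadline:
            raise TimeoutError(f"operation {operation_id} still {operation['status']!r}")
        sleep(interval)
```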
6) Documentation changes
Create or update:
- /docs/api/overview.md
  - API surface, auth, tenancy, pagination, idempotency, rate limits.
- /docs/api/conventions.md
  - Path, naming, errors, filters, sorting, examples.
- /docs/api/versioning.md
  - SemVer policy, deprecation windows, headers, migration playbooks.
- /docs/api/reference/
  - Auto‑generated OAS site; link into service pages.
- /docs/sdks/overview.md
  - Supported languages, install, hello‑world, retry/auth patterns.
- /docs/sdks/typescript.md, /python.md, /go.md, /java.md
  - Language‑specific guides, snippets, paginator usage, streaming.
- /docs/devportal/publishing.md
  - Build pipeline, offline bundle steps.
- /docs/contributing/api-contracts.md
  - How to edit OAS, lint rules, compatibility checks, examples.
- /docs/testing/contract-testing.md
  - Mock server, golden examples, replay tests.
- /docs/security/auth-scopes.md
  - OAuth2, PAT, scope mapping, tenancy header.
- /docs/airgap/devportal-offline.md
  - Air‑gapped docs and SDK bundle usage.
Add the banner at the top of each page:
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Implementation plan
Phase 1 — Foundations
- Establish per‑service OAS skeletons and the aggregate composer.
- Introduce linting and compatibility checks in CI.
- Define the standard error envelope and migrate services.
Phase 2 — Reference & discovery
- Implement /.well-known/openapi for gateway and services.
- Build the dev portal with search, schema diagrams, and examples.
Phase 3 — SDKs (TS, Python, Go, Java)
- Implement generator drivers and templates.
- Publish alpha packages internally; integrate in CLI and Console integration tests.
- Add paginators, retries, auth helpers, and streaming.
Phase 4 — Conformance & examples
- Wire mock server into PR CI.
- Record golden example fixtures and replay tests against staging.
- Automate example extraction into docs.
Phase 5 — Release automation & deprecation
- Automate changelogs from OAS diffs.
- Notifications Studio integration for API deprecations.
- Offline dev portal bundle through Export Center.
Phase 6 — Follow‑ups
- C# and Rust SDKs, Postman/HTTP collections, sample apps repo.
8) Engineering tasks
OAS & governance
- Create src/StellaOps.Api.OpenApi/<service>/openapi.yaml for all services with minimal paths and shared components.
- Implement aggregate composer and $ref resolver.
- Add CI job: lint, validate, compatibility diff; block merges on failure.
- Migrate all endpoints to standard error envelope and provide examples.
Discovery & portal
- Implement GET /.well-known/openapi at service and gateway.
- Build dev portal: nav, search, schema viewer, try‑it (non‑prod), copy‑curl.
- Add version selector for v1/v2 specs.
SDKs
- Generator driver with pinned templates; forbid manual edits in generated folders.
- TS SDK: ESM/CJS build, tree‑shaking, paginator, middleware hooks.
- Python SDK: async and sync clients, type hints, file upload/download helpers.
- Go SDK: context‑first API, streaming, error type mapping.
- Java SDK: builder pattern, HTTP client provider abstraction.
- Common: retries, Retry-After handling, idempotency key helper, auth helpers, telemetry hooks.
- Language‑specific tests and smoke examples.
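The common retry behavior might be sketched as follows. Decorrelated jitter here means each delay is drawn uniformly between a base and three times the previous delay, then capped. The retryable status set, the defaults, and the send callable are assumptions, and only the delta‑seconds form of Retry-After is handled.

```python
import random
import time

RETRYABLE = {429, 502, 503, 504}  # assumption: the retried statuses are not specified here


def next_delay(previous, base=0.5, cap=30.0, retry_after=None, rng=random.uniform):
    """Return the next sleep in seconds; a numeric Retry-After wins over jitter."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch; fall back to jitter
    return min(cap, rng(base, max(base, previous * 3)))


def call_with_retries(send, max_attempts=5, base=0.5, cap=30.0,
                      sleep=time.sleep, rng=random.uniform):
    """Call send() until it returns a non-retryable status or attempts run out.

    send() is a caller-supplied function returning (status, headers, body).
    """
    delay = base
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, headers, body
        delay = next_delay(delay, base, cap, headers.get("Retry-After"), rng)
        sleep(delay)
```

Retried POST requests should carry the same Idempotency-Key on every attempt so that a retry cannot create a duplicate resource.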
Conformance
- Mock server config with operation examples.
- Replay tests against staging; fail on schema drift.
- Golden example extraction pipeline.
Air‑Gapped
- Export Center job: devportal --offline producing HTML docs, specs, and package artifacts.
- SDKs accept explicit base URLs; disable online discovery.
Authority & Tenancy
- Document scopes per endpoint in OAS (securitySchemes + security blocks).
- Implement optional X-Stella-Scope override with validation.
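Validation of the X-Stella-Scope override might look like the sketch below. The claim names tenant and delegated_tenants and the allowed tenant ID characters are assumptions for illustration; only the header name and the tenant/<id> form come from this document.

```python
import re

_SCOPE_HEADER = re.compile(r"tenant/([A-Za-z0-9_-]+)")  # assumed tenant ID alphabet


def resolve_tenant(claims, scope_header=None):
    """Return the effective tenant for a request.

    claims is the decoded token payload. Without an override the token's own
    tenant applies; with one, the token must explicitly permit delegation.
    """
    home = claims["tenant"]  # hypothetical claim name
    if scope_header is None:
        return home
    match = _SCOPE_HEADER.fullmatch(scope_header)
    if match is None:
        raise ValueError("malformed X-Stella-Scope header")
    requested = match.group(1)
    if requested == home:
        return home
    if requested not in claims.get("delegated_tenants", ()):  # hypothetical claim name
        raise PermissionError(f"token does not permit delegation to tenant {requested!r}")
    return requested
```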
Release automation
- Version bump tooling for OAS and SDKs; SemVer aligned.
- Auto‑generate CHANGELOG.md from OAS diffs.
- Publish to registries with signed artifacts and provenance.
Docs
- Author all pages listed in section 6; embed code snippets pulled from tested examples.
- Insert banner statement in each page.
Testing
- Contract tests in PR CI; 100% operation coverage with at least one example.
- Language SDK integration tests against mock server and staging.
- Backwards‑compat test suite comparing last N releases.
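The example‑coverage requirement could be enforced with a check like this one. It is a sketch that treats a response as covered when any of its media types has an example or a non‑empty examples map, and it skips responses without content (such as 204); the function name is hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def responses_missing_examples(spec):
    """Return (METHOD, path, status) for every response body that lacks an example."""
    missing = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue
            for status, response in operation.get("responses", {}).items():
                media_types = response.get("content", {})
                if not media_types:
                    continue  # nothing to exemplify, for example a 204
                covered = any(
                    "example" in media or media.get("examples")
                    for media in media_types.values()
                )
                if not covered:
                    missing.append((method.upper(), path, status))
    return missing
```

CI would fail the build when the returned list is non‑empty.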
9) Feature changes required in other components
- Web Services: unify on error envelope, pagination, idempotency handling, and deprecation headers.
- CLI: consume the official TS or Go SDK instead of bespoke HTTP calls; this enforces parity.
- Console: use SDKs for backend calls where appropriate; helps dogfood the clients.
- Export Center: add devportal --offline and package signing.
- Observability: include x-stella-service and API version attributes in spans; trace IDs mirrored in error responses.
- Notifications Studio: templates for API deprecations and SDK updates.
- Air‑Gapped Mode: ship offline dev portal and SDKs bundle; console disables try‑it.
Imposed rule reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
10) Acceptance criteria
- Aggregate OpenAPI validates, lints cleanly, and covers 100% of public endpoints with examples.
- /.well-known/openapi available at gateway and service level.
- Dev portal builds with search, example blocks, and version selector.
- TS/Python/Go/Java SDKs publish successfully; each has paginators, retries, auth helpers, streaming, and typed errors.
- CLI integrations pass using SDKs.
- Contract tests run in PR CI; schema drift causes failures.
- Deprecation headers and Notifications Studio flow proven in a staged deprecation.
- Offline dev portal bundle exports and renders in a sealed environment.
11) Risks and mitigations
- Spec drift vs code reality: mock‑first development and replay tests keep services aligned with OAS.
- Generator churn: pin generator and templates; only update via planned minor releases.
- Breaking changes under pressure: enforce compatibility gate and documented exception process.
- SDK ergonomics mismatch: run language‑native design reviews with maintainers before GA.
- Air‑gapped constraints: prebuild full offline bundles; avoid dynamic CDN assets in docs.
12) Philosophy
- Contract first, code second. The spec is the product; servers and SDKs are implementations.
- Stability over cleverness. Boring, predictable APIs beat “magical” behavior.
- Truth preservation. Never hide or merge advisory/VEX sources; surface provenance everywhere.
- Automation everywhere. Humans shouldn’t manually edit generated code or publish packages.
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.