# StellaOps Engineering Code of Conduct
Technical excellence + safe for change = best-in-class product

---

## 0. Mission and operating intent

StellaOps is a sovereign, self-hostable release control plane for non-Kubernetes container estates. We ship reproducible, auditable, security-gated releases where every decision is explainable and defensible.

### Engineering pledge (non-negotiable)
Every contribution must improve the product without regressions in:

1. Correctness (it does what it claims).
2. Safety for change (tests and structure allow refactoring without fear).
3. Determinism (same inputs produce the same outputs for evidence and replay).
4. Security (fail secure; least privilege; cryptographic correctness).
5. Operability (observable, diagnosable, offline/air-gap friendly).

This document is written for humans and autonomous implementers. It is intentionally prescriptive.

---

## 1. Scope, authority, and precedence

### 1.1 Applies to
- All code contributions (C#, TypeScript, Angular, SQL, Dockerfiles, Compose/Helm, CI pipelines, scripts).
- All documentation that defines contracts or operational reality (architecture dossiers, API/CLI references, runbooks, sprint plans).
- All tests and test infrastructure.

### 1.2 Authority
- This document supersedes informal guidance.
- Module-local `AGENTS.md` files may impose stricter requirements, but may not relax these standards.
- If a rule is unclear, prefer the interpretation that increases determinism, testability, and security.

---

## 2. Mandatory reading (before non-trivial work)

Before contributing to any module, read:
1. This file (`CODE_OF_CONDUCT.md`)
2. `docs/README.md`
3. `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
4. `docs/modules/platform/architecture-overview.md`
5. [TESTING_PRACTICES.md](./TESTING_PRACTICES.md)
6. The relevant module dossier: `docs/modules/<module>/architecture.md` (or `architecture*.md`)
7. The module-local `AGENTS.md` if present (e.g., `src/Scanner/AGENTS.md`)

Enforcement: changes that contradict documented architecture/contracts will be rejected with a pointer to the violated doc.

---

## 3. Definition of Done (DoD): required evidence

A task/PR is DONE only when the required evidence exists and is linked from the sprint task (or documented explicitly as N/A).

Minimum DoD for any change:
- [ ] Tests exist for the change OR a written justification exists in sprint "Decisions & Risks".
- [ ] Determinism: any output participating in evidence/signatures/hashes is deterministic (ordering, timestamps, canonicalization).
- [ ] Docs: contract/workflow/config/schema changes are reflected in `docs/**` and linked from the sprint.
- [ ] Observability: new workflows emit structured logs; core success/failure counters exist.
- [ ] Security: external inputs validated; authorization enforced; secrets not persisted; least privilege respected.
- [ ] Sprint discipline: task status updated (`TODO -> DOING -> DONE` or `BLOCKED`) and completion criteria checked off.

If you cannot satisfy DoD due to constraints, the task is BLOCKED until the constraint is resolved.

---

## 4. Change-type checklists (select at least one per task/PR)

These checklists are mandatory. Pick the relevant checklist(s) and satisfy them.

### 4.1 Bug fix
- [ ] Repro captured as a failing test first (unless impossible; justify).
- [ ] Fix implemented.
- [ ] Regression test remains and is named for the bug.

### 4.2 New behavior / feature
- [ ] Unit tests cover core logic.
- [ ] Integration tests cover service/API contracts where applicable.
- [ ] Docs updated if user-visible or contract/schema changes occur.
- [ ] Backward-compatibility note recorded if behavior changes.

### 4.3 New API endpoint / CLI command
- [ ] Request/response validation is explicit.
- [ ] Authz policy is defined and tested.
- [ ] Negative tests: unauthorized, invalid input, boundary values.
- [ ] API/CLI docs updated with at least one example.

### 4.4 Schema / persistence change
- [ ] Migration present and tested.
- [ ] Indexing considered for new query patterns.
- [ ] Serialization round-trip tests updated/added.
- [ ] Roll-forward behavior documented; rollback constraints noted.

### 4.5 Security-sensitive change
- [ ] Threat note added in sprint "Decisions & Risks" (what can go wrong + mitigation).
- [ ] Audit events exist where relevant (who/what/when/why).
- [ ] Secrets handling verified (no plaintext persistence).
- [ ] Default posture is deny/fail-secure.

---

## 5. Review rejection criteria (hard NACK)

A PR must be rejected if it introduces any of the following:
- Silent stubs or TODOs without a sprint task reference.
- Nondeterminism in production paths (direct time/ID/random usage).
- Missing validation/authz for new external inputs/endpoints.
- Cross-module dependency creep without a clear interface boundary and documented rationale.
- Network access in tests (except local infrastructure like Testcontainers), or tests that cannot run offline.
- Public APIs exposing mutable collections or mutable internal state.
- Repo-wide refactors unrelated to sprint scope.
- New dependency without justification and supply-chain considerations.

---

## 6. Engineering standards (rules + examples)

### 6.1 Compiler and warning discipline
Rule:
- All projects must treat warnings as errors.

Example:

```xml
<PropertyGroup>
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>
```

Rationale:

* Warnings become regressions later. Zero-warning builds are required for safe change.

---

### 6.2 Determinism: time, IDs, randomness

Rule:

* Never use `DateTime.UtcNow`, `DateTimeOffset.UtcNow`, `Guid.NewGuid()`, or `Random.Shared` directly in production code.
* Inject `TimeProvider` and an ID generator abstraction (e.g., `IGuidGenerator`).

Example:

```csharp
// [BAD] nondeterministic, hard to test
public sealed class BadService
{
    public Record CreateRecord() => new()
    {
        Id = Guid.NewGuid(),
        CreatedAt = DateTimeOffset.UtcNow,
    };
}

// [OK] injectable and deterministic under test
public sealed class GoodService(TimeProvider timeProvider, IGuidGenerator guidGenerator)
{
    public Record CreateRecord() => new()
    {
        Id = guidGenerator.NewGuid(),
        CreatedAt = timeProvider.GetUtcNow(),
    };
}
```

Rationale:

* Deterministic behavior is required for reproducible evidence and reliable tests.
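
The example above relies on an `IGuidGenerator` abstraction whose shape this document does not define. The following is a minimal sketch of what such an abstraction and its test doubles might look like. The interface members and the names `SystemGuidGenerator`, `SequentialGuidGenerator`, and `FixedTimeProvider` are assumptions for illustration, not the repository's actual types. Only `TimeProvider` and its virtual `GetUtcNow()` come from the .NET base library.

```csharp
using System;

// Hypothetical shape of the ID abstraction; the real interface may differ.
public interface IGuidGenerator
{
    Guid NewGuid();
}

// Production implementation: the single place allowed to call Guid.NewGuid().
public sealed class SystemGuidGenerator : IGuidGenerator
{
    public Guid NewGuid() => Guid.NewGuid();
}

// Test double: yields a predictable sequence (00000001-..., 00000002-..., and so on).
public sealed class SequentialGuidGenerator : IGuidGenerator
{
    private int _next;

    public Guid NewGuid()
    {
        _next++;
        return new Guid(_next, 0, 0, new byte[8]);
    }
}

// Test double: a TimeProvider frozen at a fixed instant.
public sealed class FixedTimeProvider(DateTimeOffset now) : TimeProvider
{
    public override DateTimeOffset GetUtcNow() => now;
}
```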

---

### 6.3 Culture-invariant parsing and formatting

Rule:

* Always use `CultureInfo.InvariantCulture` for parsing/formatting any value that is persisted, hashed, compared, or exported.

Example:

```csharp
// [BAD] culture-sensitive
var value = double.Parse(input);
var formatted = percentage.ToString("P2");

// [OK] deterministic across locales
var value = double.Parse(input, CultureInfo.InvariantCulture);
var formatted = percentage.ToString("P2", CultureInfo.InvariantCulture);
```

Rationale:

* Locale-dependent parsing breaks determinism and evidence reproducibility.
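
To make the failure mode concrete, the sketch below formats the same number under a comma-decimal culture and under the invariant culture. The class and method names are hypothetical. The comma-decimal culture is cloned from the invariant culture so that the sketch does not depend on locale data being installed on the host, which is an assumption made for portability rather than a statement about how any real locale is configured.

```csharp
using System;
using System.Globalization;

public static class CultureDemo
{
    // A culture that uses ',' as the decimal separator, built from a writable
    // clone of the invariant culture.
    public static CultureInfo CommaDecimalCulture()
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = ",";
        culture.NumberFormat.NumberGroupSeparator = ".";
        return culture;
    }

    // Culture-sensitive: the text depends on the culture that is passed in.
    public static string FormatWith(double value, CultureInfo culture) =>
        value.ToString(culture);

    // Culture-invariant: safe for persisted, hashed, or exported values.
    public static string FormatForExport(double value) =>
        value.ToString(CultureInfo.InvariantCulture);
}
```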

---

### 6.4 ASCII-only output for logs, comments, and exported artifacts

Rule:

* Use ASCII-only characters in:

  * log messages,
  * comments,
  * console output,
  * exported artifacts (NDJSON, evidence packets, reports).
* Avoid non-ASCII glyphs in code and examples.

Exceptions:

* End-user UI text may be internationalized; if Unicode is required in stored/hashed/exported artifacts, document the rationale and encoding explicitly.

Example:

```csharp
// [BAD] non-ASCII glyph in log output (U+2192, written as an escape to keep this file ASCII)
_logger.LogInformation("Success \u2192 proceeding");

// [OK] ASCII-only
_logger.LogInformation("[OK] Success - proceeding");
```

Rationale:

* Non-ASCII output can break in constrained environments and is risky for canonicalization/signatures.

---

### 6.5 Immutable collection returns

Rule:

* Public APIs must return immutable or read-only collections (`IReadOnlyList<T>`, `ImmutableArray<T>`, or defensive copies).
* Never expose mutable backing stores.

Example:

```csharp
// [BAD] caller can mutate internal state
public sealed class BadRegistry
{
    private readonly List<string> _scopes = new();
    public List<string> Scopes => _scopes;
}

// [OK] caller cannot mutate internal state
public sealed class GoodRegistry
{
    private readonly List<string> _scopes = new();
    public IReadOnlyList<string> Scopes => _scopes.AsReadOnly();
}
```

Rationale:

* Immutability is a safety contract that reduces hidden coupling and race conditions.

---

### 6.6 No silent stubs

Rule:

* Unimplemented code must throw `NotImplementedException` (or return an explicit error result).
* Never return success from unimplemented paths.

Example:

```csharp
// [BAD] ships broken behavior
public Task<Result> ProcessAsync() =>
    Task.FromResult(Result.Success());

// [OK] explicit failure with traceability
public Task<Result> ProcessAsync() =>
    throw new NotImplementedException("Not implemented. See sprint task: <TASK-ID>.");
```

Rationale:

* Silent stubs create production incidents and false confidence.

---

### 6.7 CancellationToken propagation

Rule:

* Propagate `CancellationToken` through async call chains.
* Do not use `CancellationToken.None` in production paths.

Example:

```csharp
// [BAD] ignores cancellation
await _repo.SaveAsync(entity, CancellationToken.None);
await Task.Delay(1000);

// [OK] respects cancellation
await _repo.SaveAsync(entity, ct);
await Task.Delay(1000, ct);
```

Rationale:

* Cancellation is required for graceful shutdown and avoiding resource leaks.

---

### 6.8 HttpClient via IHttpClientFactory

Rule:

* Never instantiate `HttpClient` directly.
* Use `IHttpClientFactory` with explicit timeouts and resilience policy.

Example:

```csharp
// [BAD] risks socket exhaustion
using var client = new HttpClient();

// [OK] factory-managed
var client = httpClientFactory.CreateClient("MyApi");
```

Rationale:

* The factory pools and recycles message handlers, which avoids socket exhaustion and stale DNS entries in long-running services.

---

### 6.9 Bounded caches with eviction

Rule:

* No unbounded `Dictionary`/`ConcurrentDictionary` caches.
* Use bounded caches with eviction (size limit + TTL), or an external cache with explicit retention.

Example:

```csharp
// [BAD] unbounded cache
private readonly ConcurrentDictionary<string, CacheEntry> _cache = new();

// [OK] bounded cache; once SizeLimit is set, every entry must declare a Size
private readonly MemoryCache _cache = new(new MemoryCacheOptions { SizeLimit = 10_000 });
```

Rationale:

* Unbounded caches grow until memory is exhausted, which eventually crashes long-running services.

---

### 6.10 Options validation at startup

Rule:

* Validate configuration at startup using `ValidateOnStart()`.
* Use `IValidateOptions<T>` for complex validation.

Example:

```csharp
services.AddOptions<MyOptions>()
    .Bind(config.GetSection("My"))
    .ValidateDataAnnotations()
    .ValidateOnStart();
```

Rationale:

* Fail fast. Do not defer critical config failures to runtime.

---

### 6.11 Explicit paths and roots

Rule:

* Do not infer repository or data roots with fragile parent-directory walks.
* Use explicit CLI options and environment variables.

Rationale:

* Parent walks break in containers, CI, and offline kits.
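
As one possible reading of this rule, the sketch below resolves a data root from an explicit CLI value first and an environment variable second, and fails loudly when neither is set. The variable name `STELLAOPS_DATA_ROOT`, the `--data-root` option mentioned in the error message, and the `DataRootResolver` class are all hypothetical; the real option and variable names are defined elsewhere in the repository. The environment lookup is passed in as a delegate (for example `Environment.GetEnvironmentVariable`) so that the resolver stays deterministic under test.

```csharp
#nullable enable
using System;
using System.IO;

public static class DataRootResolver
{
    // Hypothetical name; the real variable is defined by the product docs.
    public const string EnvironmentVariable = "STELLAOPS_DATA_ROOT";

    // Precedence: explicit CLI value, then environment variable.
    // There is deliberately no parent-directory walk as a fallback.
    public static string Resolve(string? cliValue, Func<string, string?> getEnv)
    {
        var candidate = !string.IsNullOrWhiteSpace(cliValue)
            ? cliValue
            : getEnv(EnvironmentVariable);

        if (string.IsNullOrWhiteSpace(candidate))
        {
            throw new InvalidOperationException(
                $"Data root not configured. Pass --data-root or set {EnvironmentVariable}.");
        }

        var fullPath = Path.GetFullPath(candidate);
        if (!Directory.Exists(fullPath))
        {
            throw new DirectoryNotFoundException($"Data root does not exist: {fullPath}");
        }

        return fullPath;
    }
}
```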

---

## 7. Cryptographic and evidence standards

### 7.1 DSSE PAE consistency

Rule:

* Use one shared, spec-compliant DSSE PAE helper across the codebase.
* Never reimplement DSSE PAE.

Rationale:

* Crypto encoding drift breaks verification and can create security issues.
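
For orientation only, the sketch below shows the pre-authentication encoding that the DSSE specification defines, so reviewers can recognize what the shared helper is expected to produce. It is not a substitute for the shared helper, and the rule above still applies to production code. The class name `DssePaeSketch` is hypothetical.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Illustration only: production code must call the shared helper.
public static class DssePaeSketch
{
    // PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    // LEN is the byte length in ASCII decimal; SP is a single 0x20 byte.
    public static byte[] Encode(string payloadType, ReadOnlySpan<byte> payload)
    {
        var typeBytes = Encoding.UTF8.GetBytes(payloadType);
        using var stream = new MemoryStream();
        WriteAscii(stream, "DSSEv1 ");
        WriteAscii(stream, typeBytes.Length.ToString(CultureInfo.InvariantCulture));
        stream.WriteByte(0x20);
        stream.Write(typeBytes);
        stream.WriteByte(0x20);
        WriteAscii(stream, payload.Length.ToString(CultureInfo.InvariantCulture));
        stream.WriteByte(0x20);
        stream.Write(payload);
        return stream.ToArray();
    }

    private static void WriteAscii(Stream stream, string text) =>
        stream.Write(Encoding.ASCII.GetBytes(text));
}
```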

---

### 7.2 RFC 8785 JSON canonicalization

Rule:

* For digest/signature inputs, use the shared RFC 8785 canonicalizer.
* Do not use relaxed encoders or naming policies for canonical outputs.

Rationale:

* Canonical JSON is required for stable signatures and reproducible evidence.

---

### 7.3 Evidence integrity requirements

Rule:

* Evidence artifacts must be:

  * content-addressed or hashed,
  * reproducible from declared inputs,
  * exportable,
  * linked from decisions (policy, approvals, deployments) in a traceable chain.

Rationale:

* The product promise is audit-grade, verifiable release decisioning.
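
A minimal sketch of the "content-addressed or hashed" requirement follows, assuming SHA-256 and a `sha256:<hex>` identifier format. Both the algorithm choice and the identifier format are assumptions for illustration; this document does not prescribe them. The input is expected to be already canonicalized (see 7.2), because hashing non-canonical bytes would make the address unstable.

```csharp
using System;
using System.Security.Cryptography;

public static class ContentAddress
{
    // Returns "sha256:<lowercase hex>" for exactly the bytes given.
    // The caller is responsible for canonicalizing before hashing.
    public static string Sha256Of(ReadOnlySpan<byte> canonicalBytes)
    {
        Span<byte> hash = stackalloc byte[32];
        SHA256.HashData(canonicalBytes, hash);
        return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```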

---

## 8. Data and time correctness

### 8.1 PostgreSQL timestamptz

Rule:

* Store and retrieve timestamps as UTC `DateTimeOffset`.
* Use `reader.GetFieldValue<DateTimeOffset>()` for timestamptz.

Example:

```csharp
// [BAD] loses offset
var createdAt = reader.GetDateTime(ordinal);

// [OK] preserves offset
var createdAt = reader.GetFieldValue<DateTimeOffset>(ordinal);
```

Rationale:

* Offset loss causes timeline confusion and breaks audit fidelity.

---

## 9. Testing requirements (summary)

All code contributions must include tests. The full policy is in [TESTING_PRACTICES.md](./TESTING_PRACTICES.md).

Minimum expectations:

* Unit tests for new logic.
* Integration tests for contracts that cross process boundaries (DB, messaging, storage, HTTP).
* Determinism tests for any artifact that is exported, hashed, or used as evidence.
* Tests must run offline (no live network dependencies).

Test categorization:

* Use `[Trait("Category", "Unit")]` for unit tests.
* Use `[Trait("Category", "Integration")]` for integration tests.
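
The determinism tests listed in the minimum expectations can be as simple as feeding the same logical input in two different orders and comparing bytes. The sketch below shows that idea with a hypothetical `FindingsExporter`; the exporter, its `key=value` line format, and the `DeterminismCheck` class are invented for illustration. In the repository this check would presumably live in an xUnit test carrying the `Unit` category trait; it is written as a plain method here to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class FindingsExporter
{
    // Hypothetical exporter: one "key=value" line per entry, sorted with ordinal
    // comparison so output does not depend on insertion order or culture.
    public static byte[] Export(IEnumerable<KeyValuePair<string, string>> entries)
    {
        var lines = entries
            .OrderBy(e => e.Key, StringComparer.Ordinal)
            .Select(e => $"{e.Key}={e.Value}");
        return Encoding.UTF8.GetBytes(string.Join("\n", lines) + "\n");
    }
}

public static class DeterminismCheck
{
    // Same logical input in a different order must yield byte-identical output.
    public static bool IsOrderIndependent()
    {
        var first = new Dictionary<string, string> { ["b"] = "2", ["a"] = "1" };
        var second = new Dictionary<string, string> { ["a"] = "1", ["b"] = "2" };
        return FindingsExporter.Export(first).SequenceEqual(FindingsExporter.Export(second));
    }
}
```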

---

## 10. Documentation and sprint discipline

### 10.1 Docs must match reality

Any change affecting contracts, schemas, behavior, or operations must update `docs/**` and be linked from the sprint.

### 10.2 Sprint discipline is mandatory

* Every task must live in a sprint file in `docs/implplan/`.
* Status must be updated (`TODO -> DOING -> DONE/BLOCKED`).
* Decisions and risks must be recorded in the sprint.

---

## 11. Technology stack compliance

Mandatory technologies:

* Runtime: .NET 10 (`net10.0`) with C# preview where the repo uses it
* Frontend: Angular v17
* Database: PostgreSQL 16+
* Testing: xUnit, Testcontainers, Moq

NuGet versioning rules:

* Do not specify package versions in `.csproj`.
* Use `src/Directory.Packages.props` for package versions.
* Prefer latest stable versions unless the repo pins otherwise.

---

## 12. Supply-chain and dependency security

Rule:

* New dependencies must include:

  * justification (why it is needed),
  * operational impact (offline/air-gap behavior),
  * security posture (known CVEs and mitigation plan if any).

Note on CVEs:

* "No high/critical CVEs" is not a workable standard in real ecosystems.
* The enforceable standard is:

  * no unaddressed high/critical findings without an explicit policy decision record and/or VEX justification,
  * documented mitigation or acceptance rationale.

---

## 13. PR evidence block (required in PR description)

Every PR must include an evidence block. If not applicable, write "N/A".

* Sprint task(s): `<SPRINT file + task IDs>`
* Working directory: `<path>`
* Summary (1-3 bullets):

  * ...
* Tests added/updated:

  * Unit: `<project/path>` (filters or test names)
  * Integration: `<project/path>` (filters or test names)
  * E2E/Perf/Sec: `<if applicable>`
* Determinism considerations:

  * `<what was made deterministic and how it was validated>`
* Docs updated:

  * `<doc paths>`
* Observability:

  * Logs: `<events / correlation IDs>`
  * Metrics: `<counter names>`
* Security notes:

  * Input validation: `<where>`
  * Authz: `<where>`
  * Secrets: `<where/how ensured not persisted>`

---

## 14. Enforcement and continuous improvement

### 14.1 Enforcement

PRs will be rejected if they violate any rule in this document or fail the DoD requirements.

### 14.2 Continuous improvement

This document is living. Improve it by:

* proposing new rules when recurring defects appear,
* documenting new patterns in module dossiers and module-local `AGENTS.md`,
* adding tests that prevent regressions.