stella-ops.org/git.stella-ops.org

master 644887997c test fixes and new product advisories work

2026-01-28 02:30:48 +02:00

StellaOps Engineering Code of Conduct

Technical excellence + safe for change = best-in-class product

0. Mission and operating intent

StellaOps is a sovereign, self-hostable release control plane for non-Kubernetes container estates. We ship reproducible, auditable, security-gated releases where every decision is explainable and defensible.

Engineering pledge (non-negotiable)

Every contribution must improve the product without regressions in:

Correctness (it does what it claims).
Safety for change (tests and structure allow refactoring without fear).
Determinism (same inputs produce the same outputs for evidence and replay).
Security (fail secure; least privilege; cryptographic correctness).
Operability (observable, diagnosable, offline/air-gap friendly).

This document is written for humans and autonomous implementers. It is intentionally prescriptive.

1. Scope, authority, and precedence

1.1 Applies to

All code contributions (C#, TypeScript, Angular, SQL, Dockerfiles, Compose/Helm, CI pipelines, scripts).
All documentation that defines contracts or operational reality (architecture dossiers, API/CLI references, runbooks, sprint plans).
All tests and test infrastructure.

1.2 Authority

This document supersedes informal guidance.
Module-local AGENTS.md files may impose stricter requirements, but may not relax these standards.
If a rule is unclear, prefer the interpretation that increases determinism, testability, and security.

2. Mandatory reading (before non-trivial work)

Before contributing to any module, read:

This file (CODE_OF_CONDUCT.md)
docs/README.md
docs/07_HIGH_LEVEL_ARCHITECTURE.md
docs/modules/platform/architecture-overview.md
TESTING_PRACTICES.md
The relevant module dossier: docs/modules/<module>/architecture.md (or architecture*.md)
The module-local AGENTS.md if present (e.g., src/Scanner/AGENTS.md)

Enforcement: changes that contradict documented architecture/contracts will be rejected with a pointer to the violated doc.

3. Definition of Done (DoD): required evidence

A task/PR is DONE only when the required evidence exists and is linked from the sprint task (or documented explicitly as N/A).

Minimum DoD for any change:

Tests exist for the change OR a written justification exists in sprint "Decisions & Risks".
Determinism: any output participating in evidence/signatures/hashes is deterministic (ordering, timestamps, canonicalization).
Docs: contract/workflow/config/schema changes are reflected in docs/** and linked from the sprint.
Observability: new workflows emit structured logs; core success/failure counters exist.
Security: external inputs validated; authorization enforced; secrets not persisted; least privilege respected.
Sprint discipline: task status updated (TODO -> DOING -> DONE or BLOCKED) and completion criteria checked off.

If you cannot satisfy DoD due to constraints, the task is BLOCKED until the constraint is resolved.

4. Change-type checklists (select at least one per task/PR)

These checklists are mandatory. Pick the relevant checklist(s) and satisfy them.

4.1 Bug fix

Repro captured as a failing test first (unless impossible; justify).
Fix implemented.
Regression test remains and is named for the bug.

4.2 New behavior / feature

Unit tests cover core logic.
Integration tests cover service/API contracts where applicable.
Docs updated if user-visible or contract/schema changes occur.
Backward-compatibility note recorded if behavior changes.

4.3 New API endpoint / CLI command

Request/response validation is explicit.
Authz policy is defined and tested.
Negative tests: unauthorized, invalid input, boundary values.
API/CLI docs updated with at least one example.

4.4 Schema / persistence change

Migration present and tested.
Indexing considered for new query patterns.
Serialization round-trip tests updated/added.
Roll-forward behavior documented; rollback constraints noted.

4.5 Security-sensitive change

Threat note added in sprint "Decisions & Risks" (what can go wrong + mitigation).
Audit events exist where relevant (who/what/when/why).
Secrets handling verified (no plaintext persistence).
Default posture is deny/fail-secure.

5. Review rejection criteria (hard NACK)

A PR must be rejected if it introduces any of the following:

Silent stubs or TODOs without a sprint task reference.
Nondeterminism in production paths (direct time/ID/random usage).
Missing validation/authz for new external inputs/endpoints.
Cross-module dependency creep without a clear interface boundary and documented rationale.
Network access in tests (except local infrastructure like Testcontainers), or tests that cannot run offline.
Public APIs exposing mutable collections or mutable internal state.
Repo-wide refactors unrelated to sprint scope.
New dependency without justification and supply-chain considerations.

6. Engineering standards (rules + examples)

6.1 Compiler and warning discipline

Rule:

All projects must treat warnings as errors.

Example:

<PropertyGroup>
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>

Rationale:

Warnings become regressions later. Zero-warning builds are required for safe change.

6.2 Determinism: time, IDs, randomness

Rule:

Never use DateTime.UtcNow, DateTimeOffset.UtcNow, Guid.NewGuid(), or Random.Shared directly in production code.
Inject TimeProvider and an ID generator abstraction (e.g., IGuidGenerator).

Example:

// [BAD] nondeterministic, hard to test
public sealed class BadService
{
    public Record CreateRecord() => new()
    {
        Id = Guid.NewGuid(),
        CreatedAt = DateTimeOffset.UtcNow,
    };
}

// [OK] injectable and deterministic under test
public sealed class GoodService(TimeProvider timeProvider, IGuidGenerator guidGenerator)
{
    public Record CreateRecord() => new()
    {
        Id = guidGenerator.NewGuid(),
        CreatedAt = timeProvider.GetUtcNow(),
    };
}

Rationale:

Deterministic behavior is required for reproducible evidence and reliable tests.
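Sketch (illustrative only): one way the injected abstractions can be pinned in a test so the output is reproducible. `FixedTimeProvider`, `SequentialGuidGenerator`, and the shapes of `IGuidGenerator` and `Record` are assumptions made for this example; the rule above only names `IGuidGenerator` as an example abstraction.

```csharp
using System;

// Two services built from equivalent fakes produce identical records.
var first = new GoodService(
    new FixedTimeProvider(DateTimeOffset.UnixEpoch), new SequentialGuidGenerator()).CreateRecord();
var second = new GoodService(
    new FixedTimeProvider(DateTimeOffset.UnixEpoch), new SequentialGuidGenerator()).CreateRecord();
Console.WriteLine(first == second); // True: records compare by value

// Hypothetical abstraction; the real interface may differ.
public interface IGuidGenerator
{
    Guid NewGuid();
}

// Hypothetical record shape matching the example above.
public sealed record Record
{
    public Guid Id { get; init; }
    public DateTimeOffset CreatedAt { get; init; }
}

public sealed class GoodService(TimeProvider timeProvider, IGuidGenerator guidGenerator)
{
    public Record CreateRecord() => new()
    {
        Id = guidGenerator.NewGuid(),
        CreatedAt = timeProvider.GetUtcNow(),
    };
}

// Test double: always returns the same instant.
public sealed class FixedTimeProvider(DateTimeOffset now) : TimeProvider
{
    public override DateTimeOffset GetUtcNow() => now;
}

// Test double: 00000001-..., 00000002-..., and so on. Not thread-safe; test use only.
public sealed class SequentialGuidGenerator : IGuidGenerator
{
    private int _next;

    public Guid NewGuid() => new Guid(++_next, 0, 0, new byte[8]);
}
```

Because both inputs are fixed, a test can assert exact values instead of "not empty" or "recent enough".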

6.3 Culture-invariant parsing and formatting

Rule:

Always use CultureInfo.InvariantCulture for parsing/formatting any value that is persisted, hashed, compared, or exported.

Example:

// [BAD] culture-sensitive
var value = double.Parse(input);
var formatted = percentage.ToString("P2");

// [OK] deterministic across locales
var value = double.Parse(input, CultureInfo.InvariantCulture);
var formatted = percentage.ToString("P2", CultureInfo.InvariantCulture);

Rationale:

Locale-dependent parsing breaks determinism and evidence reproducibility.
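Sketch (illustrative only): a small helper that keeps the persisted text form of a number independent of the host locale. `InvariantNumber` is a hypothetical name, not an existing type in the repo.

```csharp
using System;
using System.Globalization;

// Round-trip a value through its persisted text form.
var text = InvariantNumber.Format(1234.5);
Console.WriteLine(text);                                  // 1234.5
Console.WriteLine(InvariantNumber.Parse(text) == 1234.5); // True

public static class InvariantNumber
{
    // "R" requests a round-trippable representation.
    public static string Format(double value) =>
        value.ToString("R", CultureInfo.InvariantCulture);

    // NumberStyles.Float does not allow group separators.
    public static double Parse(string text) =>
        double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
}
```

Passing `NumberStyles.Float` explicitly is a design choice in this sketch: an input such as "1,5" is rejected with a `FormatException` instead of having the comma accepted as a group separator.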

6.4 ASCII-only output for logs, comments, and exported artifacts

Rule:

Use ASCII-only characters in:
- log messages,
- comments,
- console output,
- exported artifacts (NDJSON, evidence packets, reports).
Avoid non-ASCII glyphs in code and examples.

Exceptions:

End-user UI text may be internationalized; if Unicode is required in stored/hashed/exported artifacts, document the rationale and encoding explicitly.

Example:

// [BAD] non-ASCII glyphs in logs (written as \u escapes so this file stays ASCII)
_logger.LogInformation("\u2713 Success \u2192 proceeding");

// [OK] ASCII-only
_logger.LogInformation("[OK] Success - proceeding");

Rationale:

Non-ASCII characters can be mangled in constrained environments and put canonicalization and signatures at risk.

6.5 Immutable collection returns

Rule:

Public APIs must return immutable or read-only collections (IReadOnlyList<T>, ImmutableArray<T>, or defensive copies).
Never expose mutable backing stores.

Example:

// [BAD] caller can mutate internal state
public sealed class BadRegistry
{
    private readonly List<string> _scopes = new();
    public List<string> Scopes => _scopes;
}

// [OK] caller cannot mutate internal state
public sealed class GoodRegistry
{
    private readonly List<string> _scopes = new();
    public IReadOnlyList<string> Scopes => _scopes.AsReadOnly();
}

Rationale:

Immutability is a safety contract that reduces hidden coupling and race conditions.
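Sketch (illustrative only): why the read-only wrapper matters even when the property type is already `IReadOnlyList<T>`. `ScopeRegistry` and `LeakyScopes` are hypothetical names used for contrast.

```csharp
using System;
using System.Collections.Generic;

var registry = new ScopeRegistry();
registry.Add("read");

// The wrapper is not a List<string>, so a down-cast cannot reach the backing store.
Console.WriteLine(registry.Scopes is List<string>);      // False
// Returning the list itself only hides it behind an interface.
Console.WriteLine(registry.LeakyScopes is List<string>); // True

public sealed class ScopeRegistry
{
    private readonly List<string> _scopes = new();

    public void Add(string scope) => _scopes.Add(scope);

    // [OK] read-only wrapper over the backing list
    public IReadOnlyList<string> Scopes => _scopes.AsReadOnly();

    // [BAD] kept only for contrast: callers can cast back to List<string>
    public IReadOnlyList<string> LeakyScopes => _scopes;
}
```

The wrapper is a live view: later changes made by the owner are visible through it. When callers need a stable snapshot, a defensive copy or an `ImmutableArray<T>` is the safer return type.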

6.6 No silent stubs

Rule:

Unimplemented code must throw NotImplementedException (or return an explicit error result).
Never return success from unimplemented paths.

Example:

// [BAD] ships broken behavior
public Task<Result> ProcessAsync() =>
    Task.FromResult(Result.Success());

// [OK] explicit failure with traceability
public Task<Result> ProcessAsync() =>
    throw new NotImplementedException("Not implemented. See sprint task: <TASK-ID>.");

Rationale:

Silent stubs create production incidents and false confidence.

6.7 CancellationToken propagation

Rule:

Propagate CancellationToken through async call chains.
Do not use CancellationToken.None in production paths.

Example:

// [BAD] ignores cancellation
await _repo.SaveAsync(entity, CancellationToken.None);
await Task.Delay(1000);

// [OK] respects cancellation
await _repo.SaveAsync(entity, ct);
await Task.Delay(1000, ct);

Rationale:

Cancellation is required for graceful shutdown and avoiding resource leaks.
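Sketch (illustrative only): a call chain in which every async method accepts the token and hands it to the next call. `Worker`, `RunAsync`, and `StepAsync` are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
cts.Cancel(); // simulate a shutdown signal that arrived before the work started

try
{
    await Worker.RunAsync(cts.Token);
}
catch (OperationCanceledException)
{
    Console.WriteLine("cancelled");
}

public static class Worker
{
    // The entry point takes the token and passes it down unchanged.
    public static async Task RunAsync(CancellationToken ct)
    {
        await StepAsync(ct);
    }

    private static async Task StepAsync(CancellationToken ct)
    {
        ct.ThrowIfCancellationRequested();             // cheap check before doing work
        await Task.Delay(TimeSpan.FromSeconds(1), ct); // the delay observes the same token
    }
}
```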

6.8 HttpClient via IHttpClientFactory

Rule:

Never instantiate HttpClient directly.
Use IHttpClientFactory with explicit timeouts and resilience policy.

Example:

// [BAD] risks socket exhaustion
using var client = new HttpClient();

// [OK] factory-managed
var client = httpClientFactory.CreateClient("MyApi");

Rationale:

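The factory pools and recycles message handlers, which avoids socket exhaustion and stale DNS entries, and it keeps timeouts and resilience policy in one named configuration.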
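Sketch (illustrative only): registering the named client with an explicit timeout and resolving it through the factory. It assumes the Microsoft.Extensions.Http package is referenced; the base address is a placeholder, and the resilience policy is left out because the source does not say which library the repo uses for it.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Timeouts live with the named registration, not at each call site.
services.AddHttpClient("MyApi", c =>
{
    c.BaseAddress = new Uri("https://api.example.invalid/"); // placeholder address
    c.Timeout = TimeSpan.FromSeconds(10);
});

using var provider = services.BuildServiceProvider();
var factory = provider.GetRequiredService<IHttpClientFactory>();

var client = factory.CreateClient("MyApi");
Console.WriteLine(client.Timeout); // 00:00:10
```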

6.9 Bounded caches with eviction

Rule:

No unbounded Dictionary/ConcurrentDictionary caches.
Use bounded caches with eviction (size limit + TTL), or an external cache with explicit retention.

Example:

// [BAD] unbounded cache
private readonly ConcurrentDictionary<string, CacheEntry> _cache = new();

// [OK] bounded cache
private readonly MemoryCache _cache = new(new MemoryCacheOptions { SizeLimit = 10_000 });

Rationale:

Unbounded caches grow until a long-running service runs out of memory.
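Sketch (illustrative only): adding an entry to a size-limited `MemoryCache`. Once `SizeLimit` is set, every entry has to declare a `Size`; an insert without one throws `InvalidOperationException`. The key, value, and five-minute TTL are assumptions for the example, and it assumes the Microsoft.Extensions.Caching.Memory package is referenced.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

using var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = 10_000 });

var entryOptions = new MemoryCacheEntryOptions
{
    Size = 1,                                                  // counts against SizeLimit
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5), // TTL
};

cache.Set("scan:abc", "result", entryOptions);
Console.WriteLine(cache.Get<string>("scan:abc") == "result"); // True
```

`Size` is a unit the application chooses (here, one unit per entry), not a byte count measured by the cache.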

6.10 Options validation at startup

Rule:

Validate configuration at startup using ValidateOnStart().
Use IValidateOptions<T> for complex validation.

Example:

services.AddOptions<MyOptions>()
    .Bind(config.GetSection("My"))
    .ValidateDataAnnotations()
    .ValidateOnStart();

Rationale:

Fail fast. Do not defer critical config failures until the option is first used.
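Sketch (illustrative only): an `IValidateOptions<T>` implementation for checks that data annotations express poorly. The `MyOptions` properties and the validation messages are assumptions for the example, and it assumes the Microsoft.Extensions.Options package is referenced.

```csharp
#nullable enable
using System;
using Microsoft.Extensions.Options;

var validator = new MyOptionsValidator();
var result = validator.Validate(null, new MyOptions { Endpoint = "", TimeoutSeconds = 0 });
Console.WriteLine(result.Failed); // True

// Hypothetical options shape.
public sealed class MyOptions
{
    public string Endpoint { get; set; } = "";
    public int TimeoutSeconds { get; set; }
}

public sealed class MyOptionsValidator : IValidateOptions<MyOptions>
{
    public ValidateOptionsResult Validate(string? name, MyOptions options)
    {
        if (!Uri.TryCreate(options.Endpoint, UriKind.Absolute, out _))
        {
            return ValidateOptionsResult.Fail("My:Endpoint must be an absolute URI.");
        }

        if (options.TimeoutSeconds <= 0)
        {
            return ValidateOptionsResult.Fail("My:TimeoutSeconds must be positive.");
        }

        return ValidateOptionsResult.Success;
    }
}
```

Registering the validator with `services.AddSingleton<IValidateOptions<MyOptions>, MyOptionsValidator>()` lets `ValidateOnStart()` run it together with the data-annotation checks.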

6.11 Explicit paths and roots

Rule:

Do not infer repository or data roots with fragile parent-directory walks.
Use explicit CLI options and environment variables.

Rationale:

Parent walks break in containers, CI, and offline kits.
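Sketch (illustrative only): resolving a data root from an explicit CLI value first and an environment variable second, and failing loudly when neither is set. `RootResolver` and the variable name `STELLAOPS_DATA_ROOT` are hypothetical.

```csharp
#nullable enable
using System;
using System.IO;

// The temp directory stands in for a value parsed from a CLI option.
var root = RootResolver.Resolve(Path.GetTempPath(), "STELLAOPS_DATA_ROOT");
Console.WriteLine(Path.IsPathFullyQualified(root)); // True

public static class RootResolver
{
    // Precedence: explicit CLI value, then environment variable. No directory walking.
    public static string Resolve(string? cliValue, string envVar)
    {
        var candidate = !string.IsNullOrWhiteSpace(cliValue)
            ? cliValue
            : Environment.GetEnvironmentVariable(envVar);

        if (string.IsNullOrWhiteSpace(candidate))
        {
            throw new InvalidOperationException(
                $"Data root not configured. Pass it explicitly or set {envVar}.");
        }

        var full = Path.GetFullPath(candidate);
        if (!Directory.Exists(full))
        {
            throw new DirectoryNotFoundException($"Data root does not exist: {full}");
        }

        return full;
    }
}
```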

7. Cryptographic and evidence standards

7.1 DSSE PAE consistency

Rule:

Use one shared, spec-compliant DSSE PAE helper across the codebase.
Never reimplement DSSE PAE.

Rationale:

Crypto encoding drift breaks verification and can create security issues.
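Sketch (illustrative only): what a single shared helper is expected to compute according to the DSSE specification, shown so reviewers can recognize a divergent copy. `DssePae.Encode` is a hypothetical name; production code should call the repo's existing shared helper rather than copy this.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

var pae = DssePae.Encode("http://example.com/HelloWorld", Encoding.UTF8.GetBytes("hello world"));
Console.WriteLine(Encoding.UTF8.GetString(pae));
// DSSEv1 29 http://example.com/HelloWorld 11 hello world

public static class DssePae
{
    // PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    // LEN is the byte length written as ASCII decimal without leading zeros.
    public static byte[] Encode(string payloadType, byte[] payload)
    {
        var typeBytes = Encoding.UTF8.GetBytes(payloadType);

        using var buffer = new MemoryStream();
        WriteAscii(buffer, "DSSEv1 ");
        WriteAscii(buffer, typeBytes.Length.ToString(CultureInfo.InvariantCulture));
        WriteAscii(buffer, " ");
        buffer.Write(typeBytes, 0, typeBytes.Length);
        WriteAscii(buffer, " ");
        WriteAscii(buffer, payload.Length.ToString(CultureInfo.InvariantCulture));
        WriteAscii(buffer, " ");
        buffer.Write(payload, 0, payload.Length);
        return buffer.ToArray();
    }

    private static void WriteAscii(Stream stream, string text)
    {
        var bytes = Encoding.ASCII.GetBytes(text);
        stream.Write(bytes, 0, bytes.Length);
    }
}
```

The lengths count bytes, not characters, which is the detail ad hoc reimplementations most often get wrong for non-ASCII payload types.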

7.2 RFC 8785 JSON canonicalization

Rule:

For digest/signature inputs, use the shared RFC 8785 canonicalizer.
Do not use relaxed encoders or naming policies for canonical outputs.

Rationale:

Canonical JSON is required for stable signatures and reproducible evidence.

7.3 Evidence integrity requirements

Rule:

Evidence artifacts must be:
- content-addressed or hashed,
- reproducible from declared inputs,
- exportable,
- linked from decisions (policy, approvals, deployments) in a traceable chain.

Rationale:

The product promise is audit-grade, verifiable release decisioning.
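Sketch (illustrative only): deriving a content address for an evidence artifact from its bytes. `ContentAddress` and the `sha256:` prefix format are assumptions for the example; for JSON artifacts the input should be the canonical form from the shared RFC 8785 canonicalizer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Sample bytes; a real artifact would be canonicalized first.
var bytes = Encoding.UTF8.GetBytes("{\"decision\":\"allow\"}");
var address = ContentAddress.Sha256(bytes);

// The same bytes always map to the same address.
Console.WriteLine(address == ContentAddress.Sha256(bytes)); // True

public static class ContentAddress
{
    // Lowercase hex keeps the address stable as a string key.
    public static string Sha256(ReadOnlySpan<byte> content) =>
        "sha256:" + Convert.ToHexString(SHA256.HashData(content)).ToLowerInvariant();
}
```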

8. Data and time correctness

8.1 PostgreSQL timestamptz

Rule:

Store and retrieve timestamps as UTC DateTimeOffset.
Use reader.GetFieldValue<DateTimeOffset>(ordinal) for timestamptz columns.

Example:

// [BAD] loses offset
var createdAt = reader.GetDateTime(ordinal);

// [OK] preserves offset
var createdAt = reader.GetFieldValue<DateTimeOffset>(ordinal);

Rationale:

Offset loss causes timeline confusion and breaks audit fidelity.

9. Testing requirements (summary)

All code contributions must include tests. The full policy is in: TESTING_PRACTICES.md

Minimum expectations:

Unit tests for new logic.
Integration tests for contracts that cross process boundaries (DB, messaging, storage, HTTP).
Determinism tests for any artifact that is exported, hashed, or used as evidence.
Tests must run offline (no live network dependencies).

Test categorization:

Use [Trait("Category", "Unit")] for unit tests.
Use [Trait("Category", "Integration")] for integration tests.

9.1 Turn #6 testing enhancements

The following practices from TESTING_PRACTICES.md are required for compliance-critical and safety-critical modules:

Intent tagging: Use [Trait("Intent", "<category>")] to classify test purpose (Regulatory, Safety, Performance, Competitive, Operational).
Observability contracts: Validate OTel traces, structured logs, and metrics as APIs with schema enforcement.
Evidence traceability: Link requirements to tests to artifacts for audit chains using [Requirement("...", SprintTaskId = "...")].
Cross-version testing: Validate N-1 and N+1 compatibility for release gating.
Time-extended testing: Run longevity tests for memory leaks, counter drift, and resource exhaustion.
Post-incident replay: Every P1/P2 incident produces a permanent regression test tagged with [Trait("Category", "PostIncident")].

See TESTING_PRACTICES.md for full details, examples, and enforcement guidance.

10. Documentation and sprint discipline

10.1 Docs must match reality

Any change affecting:

contracts,
schemas,
behavior,
operations,

must update docs/** and be linked from the sprint.

10.2 Sprint discipline is mandatory

Every task must live in a sprint file in docs/implplan/.
Status must be updated (TODO -> DOING -> DONE/BLOCKED).
Decisions and risks must be recorded in the sprint.

11. Technology stack compliance

Mandatory technologies:

Runtime: .NET 10 (net10.0) with C# preview where the repo uses it
Frontend: Angular v17
Database: PostgreSQL 16+
Testing: xUnit, Testcontainers, Moq

NuGet versioning rules:

Do not specify package versions in .csproj.
Use src/Directory.Packages.props for package versions.
Prefer latest stable versions unless the repo pins otherwise.

12. Supply-chain and dependency security

Rule:

New dependencies must include:
- justification (why it is needed),
- operational impact (offline/air-gap behavior),
- security posture (known CVEs and mitigation plan if any).

Note on CVEs:

"No high/critical CVEs" is not a workable standard in real ecosystems.
The enforceable standard is:
- no unaddressed high/critical findings without an explicit policy decision record and/or VEX justification,
- documented mitigation or acceptance rationale.

13. PR evidence block (required in PR description)

Every PR must include an evidence block. For any field that does not apply, write "N/A".

Sprint task(s): <SPRINT file + task IDs>
Working directory:
Summary (1-3 bullets):
- ...
Tests added/updated:
- Unit: <project/path> (filters or test names)
- Integration: <project/path> (filters or test names)
- E2E/Perf/Sec:
Determinism considerations:
Docs updated:
Observability:
- Logs: <events / correlation IDs>
- Metrics:
Security notes:
- Input validation:
- Authz:
- Secrets: <where/how ensured not persisted>

14. Enforcement and continuous improvement

14.1 Enforcement

PRs will be rejected if they violate any rule in this document or fail the DoD requirements.

14.2 Continuous improvement

This document is living. Improve it by:

proposing new rules when recurring defects appear,
documenting new patterns in module dossiers and module-local AGENTS.md,
adding tests that prevent regressions.
