Standup Sprint Kickstarters (30 November 2025)

Here's a crisp, no-drama standup plan with three small wins that unblock bigger work. Background first, then exact tasks you can ship in a day.


1) Scanner post-mortem → 2 reproducible regressions

Why: Post-mortems sprawl. Two bullet-proof repros turn theories into fixable tickets.

Task today

  • Pick the two highest-impact failure modes (e.g., wrong reachability verdict; missed OSV/CVE due to parser).
  • For each, produce a 1-command repro script (Docker image + SBOM fixture) and an expected vs actual JSON artifact.
  • Store under tests/scanner/regressions/<case-id>/ with README.md and make test target.
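
A minimal sketch of what that could look like on disk, plus a make target that runs every case; run.sh contents and exact file names are assumptions to adapt:

tests/scanner/regressions/
  SCN-0001-<short-slug>/
    run.sh            # the 1-command repro: load fixture image/SBOM, run scanner, diff JSON
    input/            # Docker image ref or tarball + SBOM fixture
    expected.json     # agreed-correct output
    actual.json       # written on each run (git-ignored)
    README.md         # what failed, why it matters, how expected.json was produced

# Makefile (sketch)
.PHONY: test-scanner-regressions
test-scanner-regressions:
	set -e; for case in tests/scanner/regressions/*/; do sh "$${case}run.sh"; done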

Definition of done

  • CI job Scanner-Regression runs both and fails deterministically if behavior regresses.

2) Mongo→Postgres slice prototype (pick one)

Why: A focused end-to-end slice beats a giant migration plan.

Candidates

  • Authority.Tokens (licensing/entitlements)
  • Scheduler.Jobs (enqueued/executed metadata)
  • VEX.Verdicts (append-only history)

Task today

  • Draft minimal DDL in db/pg/migrations/0001_<slice>.sql and EF Core model.
  • Add writer + reader paths behind I<Slice>Store; toggle backend using STORE_BACKEND=Mongo|Postgres.
  • Compose a data-parity test: write N records → read from both stores → byte-compare normalized JSON.

Definition of done

  • Service boots with STORE_BACKEND=Postgres, parity test passes locally, migration cleanly creates/drops the slice.

3) DSSE + Rekor attestation batch sweep

Plain English: sign DSSE envelopes, submit to Rekor, and size batches until failures appear.

Task today

  • Choose an artifact type (e.g., scanner-report.json).
  • Implement a tiny CLI/script that signs K reports into DSSE, submits to Rekor/mirror, and logs latency + bytes + success.
  • Sweep K in {1,5,10,25,50}, output CSV under experiments/dsse_rekor/.

Definition of done

  • Deliver plot/table with recommended batch size and failure threshold (e.g., “use 10; >25 implies 8% retries”).

Suggested assignments (1-day scope)

  • Alex → #1 repros
  • Mira → #2 PG slice (Authority.Tokens)
  • Petar → #3 DSSE/Rekor sweep

Ready-to-copy tickets

  • SCN-201: “Create 2 deterministic regression tests for Scanner failures” (include folder layout + CI target).
  • DB-301: “Prototype Postgres backend for Authority.Tokens with parity test and feature flag.”
  • SEC-411: “Run DSSE+Rekor batch sweep for scanner reports; publish CSV and default threshold.”

If you want, I can generate the folders, Makefiles, EF Core stubs, and DSSE/Rekor sweep script next. Good, this helps narrow the scope. I'll go point by point and then expand #3.


1) How these fit into Stella Ops, concretely

Think in terms of your core promises / moats:

  • Deterministic, replayable scans
  • Verifiable SBOM → VEX chain with proofs
  • Future-proof data layout (Mongo + “recognized” SQL)
  • Proof-of-integrity / Rekor-style transparency

The three tasks line up like this:

Task 1: Scanner post-mortem → 2 reproducible test cases

Where it lives: scanner.webservice + your test suite.

Why it matters for Stella Ops:

  • Your brand is: “we don't silently miss things; if we ever do, it becomes a test forever”.

  • Each “failure mode” becomes a fixture that:

    • feeds into reachability heuristics,
    • is later used by Vexer/Excititor to show “we've seen this pattern in the wild and we handle it”.

It's not random QA; it's the foundation for your “golden reachability dataset” idea from the other branch. Every time you study a competitor miss or your own miss, it turns into:

  • “Golden fixture #NN: a known tricky case, guarded forever by CI”.

So this is directly supporting:

  • Deterministic scanner behavior
  • Trust Algebra Studio later being able to say: “policy X passes all N golden fixtures”.

Very practical outcome: it gives your devs a concrete target instead of a vague “scanner is sometimes wrong”.


Task 2: Postgres without migration: why and how

You're right: there is no migration today, only shape-finding.

You said earlier: “conversion, not migration” and “we use PostgreSQL mainly because of recognition”. That can be turned into something useful now, without over-engineering:

Goal in this phase

  • Define one slice of the data model in Postgres that:

    • is understandable to auditors / integrators,
    • is stable enough that, if you later decide to “convert” Mongo → PG, you already know how it should look,
    • forces you to create a clean abstraction (IWhateverStore) rather than hard-wiring Mongo into every service.

So instead of “migration plan”, think:

“We are prototyping a Postgres-friendly façade for one core concept, behind an interface.”

Example: Authority.Tokens or Scheduler.Jobs.

  • You keep Mongo as the actual source of truth for now.
  • You add a minimal Postgres model in EF Core.
  • You add a parity test (write/read in both backends, compare).
  • You wire a feature flag like STORE_BACKEND=Mongo|Postgres so you can switch environments on/off.

This gives you:

  • Early signal about “does this data model work in SQL?”
  • A future-proof seam where “conversion” can happen when the product stabilizes.
  • Something that looks familiar to enterprise customers (“yes, we have Postgres, here is the schema”).

No migration script, no DMS, just learning and shaping.

If you prefer, you can drop the CI parity test and only keep:

  • Interface
  • Two implementations
  • A simple console test or integration test

to keep ceremony minimal while still forcing a clean boundary.


3) DSSE + Rekor attestation experiment: deeper elaboration

I'll treat this as: “Explain what exactly my team should build and why it matters to Stella Ops.”

3.1. Why you care at all

This task supports at least three of your moats:

  • Deterministic, replayable scans: the DSSE envelope + Rekor entry is a cryptographic “receipt” for a given scan + SBOM + VEX result.
  • Proof-of-Integrity Graph / Proof-Market Ledger: if you later build your own Rekor mirror or “Proof-Market Ledger”, you need to know real batch sizes and behavior now.
  • Crypto-sovereign readiness: eventually you want GOST / SM / PQC signatures; this small experiment tells you how your stack behaves with any signature scheme you plug in later.

So we're doing one focused measurement:

For one type of attestation, find the largest batch size that:

  • keeps latency acceptable,
  • doesn't cause excessive timeouts or retries,
  • doesn't make envelopes so large they become awkward.

This becomes your default configuration for Scanner → Attestation → Rekor in all future designs.


3.2. What exactly to build

Propose a tiny .NET 10 console tool, e.g.:

src/Experiments/StellaOps.Attest.Bench/StellaOps.Attest.Bench.csproj

Binary: stella-attest-bench

Inputs

  • A directory with scanner reports, e.g.: artifacts/scanner-reports/*.json
  • Rekor endpoint and credentials (or test/mirror instance)
  • Batch sizes to sweep: e.g. 1,5,10,25,50

CLI sketch

stella-attest-bench \
  --reports-dir ./artifacts/scanner-reports \
  --rekor-url https://rekor.stella.local \
  --batch-sizes 1,5,10,25,50 \
  --out ./experiments/dsse_rekor/results.csv

What each run does

For each batch size K:

  1. Take K reports from the directory.

  2. For each report:

    • Wrap into a DSSE envelope:

      {
        "payloadType": "application/vnd.stellaops.scanner-report+json",
        "payload": "<base64(report.json)>",
        "signatures": [
          {
            "keyid": "authority-key-1",
            "sig": "<base64(signature-by-key-1)>"
          }
        ]
      }
      
    • Measure size of the envelope in bytes.

  3. Submit the K envelopes to Rekor:

    • Either one by one, or if your client API supports it, in a single batch call.

    • Record:

      • start timestamp
      • end timestamp
      • status (success / failure / retry count)
  4. Append a row to results.csv:

    timestamp,batch_size,envelopes_count,total_bytes,avg_bytes,latency_ms,successes,failures,retries
    2025-11-30T14:02:00Z,10,10,123456,12345.6,820,10,0,0
    

You can enrich it later with HTTP codes, Rekor log index, etc., but this is enough to choose a default.


3.3. Minimal internal structure

Rough C# layout (no full code, just architecture so devs don't wander):

// Program.cs
// - Parse args
// - Build IServiceProvider
// - Resolve and run BenchRunner

public sealed class BenchConfig
{
    public string ReportsDirectory { get; init; } = default!;
    public Uri RekorUrl { get; init; } = default!;
    public int[] BatchSizes { get; init; } = Array.Empty<int>();
    public string OutputCsvPath { get; init; } = default!;
}

public sealed class BenchRunner
{
    private readonly IDsseSigner _signer;
    private readonly IRekorClient _rekor;
    private readonly IResultsSink _sink;

    public async Task RunAsync(BenchConfig config, CancellationToken ct);
}

// IDsseSigner: wraps "payload" (report json bytes) -> DSSE envelope bytes
// IRekorClient: envelope bytes -> result (success/fail, latency, retries)
// IResultsSink: append rows to CSV

Later, IDsseSigner and IRekorClient can be reused inside a proper StellaOps.Attest library that the Scanner or Scheduler uses.


3.4. What “done” looks like in a way that matters to Stella Ops

Not just “we have a toy”. You want a decision out of this:

  • DEFAULT_DSSE_BATCH_SIZE = 10

  • “For this Rekor setup, we observe:

    • ≤10: 100 % success, latency ~X ms
    • 25: occasional retries
    • 50: frequent 5xx / timeouts; not recommended”

You then:

  • Write this into a short markdown note, e.g. docs/attestation/dsse-rekor-batch-sizing.md

  • Use it when designing:

    • Scanner → “Attestation job” messages (avoid pushing >10 envelopes at once),
    • The future “Proof-Market Ledger” so it doesn't assume unbounded batches,
    • Air-gap mirror behavior (“we queue up batches of 10 before syncing”).

So the outcome is a concrete configuration default and a constraint that all future modules respect.


If you want to simplify further

If this still feels too heavy for current prototyping speed, the absolute minimum to keep the value:

  • Hardcode 2 batch sizes: 1 and 10.

  • Only measure:

    • latency
    • success/fail
  • Print results to stdout instead of CSV.

The key is: one small executable experiment that turns “DSSE + Rekor is cool” into “our default batch size is X for environment Y”.


If you tell me which module names you want to standardize on for this (e.g. StellaOps.Attest, StellaOps.Experiments, etc.), I can sketch exact folder names, namespaces, and even a first .csproj + Program.cs your dev can paste in. I'll rewrite all three as tight internal guidelines you can drop into docs/dev/. Think of each as something you can hand to a new engineer and say: “Follow this exactly.”


G1 · Scanner regression fixtures

File: docs/dev/scanner-regressions.md · Applies to: scanner.webservice and any libraries it uses.

1. Purpose

Any serious scanner failure must become a deterministic regression fixture. Over time this becomes the Stella Ops “golden dataset” for reachability and correctness.

Outcomes:

  • Bugs don't repeat silently.
  • Heuristics and future policies are validated against the same corpus.
  • Postmortems always end with a guardrail in CI.

2. When to create a regression fixture

Create a fixture when all three hold:

  1. The bug affects at least one of:

    • Severity ≥ Medium
    • High-volume ecosystems (OS packages, Java, Python, Node, container base images)
    • Core behaviors (reachability, deduplication, suppression, parser correctness)
  2. The behavior is reproducible from static inputs (image, SBOM, or manifests).

  3. The expected correct behavior is agreed by at least one more engineer on the scanner team.

If in doubt: add the fixture. It is cheap, and it strengthens the golden corpus.

3. Directory structure & naming

Test project:

tests/
  StellaOps.Scanner.RegressionTests/
    Regression/
      SCN-0001-missed-cve-in-layer/
      SCN-0002-wrong-reachability/
      ...

Fixture layout (example):

SCN-0001-missed-cve-in-layer/
  input/
    image.sbom.json        # or image.tar, etc.
    config.json            # scanner flags, if needed
  expected/
    findings.json          # canonical expected findings
  case.metadata.json       # machine-readable description
  case.md                  # short human narrative

case.metadata.json schema:

{
  "id": "SCN-0001",
  "title": "Missed CVE-2025-12345 in lower layer",
  "kind": "vulnerability-missed",
  "source": "internal-postmortem", 
  "severity": "high",
  "tags": [
    "reachability",
    "language:java",
    "package:log4j"
  ]
}

case.md should answer:

  • What failed?
  • Why is this case representative / important?
  • What is the correct expected behavior?

4. Test harness rules

Global rules for all regression tests:

  • No network access (fixtures must be fully self-contained).
  • No time-dependent logic (use fixed timestamps if necessary).
  • No non-deterministic behavior (seed any randomness).

Comparison rules:

  • Normalize scanner output before comparison:

    • Sort arrays (e.g. findings).
    • Remove volatile fields (generated IDs, timestamps, internal debug metadata).
  • Compare canonical JSON (e.g. string equality on normalized JSON).

Implementation sketch (xUnit):

public class GoldenRegressionTests
{
    [Theory]
    [MemberData(nameof(RegressionSuite.LoadCases), MemberType = typeof(RegressionSuite))]
    public async Task Scanner_matches_expected_findings(RegressionCase @case)
    {
        var actual = await ScannerTestHost.RunAsync(@case.InputDirectory);

        var expectedJson = File.ReadAllText(@case.ExpectedFindingsPath);

        var normalizedActual = FindingsNormalizer.Normalize(actual);
        var normalizedExpected = FindingsNormalizer.Normalize(expectedJson);

        Assert.Equal(normalizedExpected, normalizedActual);
    }
}
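
The harness above references RegressionSuite and FindingsNormalizer without defining them. A minimal sketch of both, assuming System.Text.Json, a top-level findings array in the scanner output, and placeholder volatile field names (adjust all three to the real output shape):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Nodes;

public sealed class RegressionCase
{
    public string InputDirectory { get; init; } = default!;
    public string ExpectedFindingsPath { get; init; } = default!;
}

public static class RegressionSuite
{
    // Yields one test case per fixture folder under Regression/.
    public static IEnumerable<object[]> LoadCases()
    {
        var root = Path.Combine(AppContext.BaseDirectory, "Regression");
        foreach (var dir in Directory.EnumerateDirectories(root).OrderBy(d => d, StringComparer.Ordinal))
        {
            yield return new object[]
            {
                new RegressionCase
                {
                    InputDirectory = Path.Combine(dir, "input"),
                    ExpectedFindingsPath = Path.Combine(dir, "expected", "findings.json")
                }
            };
        }
    }
}

public static class FindingsNormalizer
{
    // Placeholder field names; list whatever the scanner actually emits as volatile.
    private static readonly HashSet<string> VolatileFields = new() { "scanId", "generatedAt", "debug" };

    public static string Normalize(string json)
    {
        var root = JsonNode.Parse(json)!;
        Scrub(root);
        if (root is JsonObject obj && obj["findings"] is JsonArray findings)
        {
            // Sort findings by their canonical JSON text so ordering differences never fail the test.
            var sorted = findings.Select(f => f!.ToJsonString())
                                 .OrderBy(s => s, StringComparer.Ordinal)
                                 .Select(s => JsonNode.Parse(s))
                                 .ToArray();
            obj["findings"] = new JsonArray(sorted);
        }
        return root.ToJsonString();
    }

    // Overload for in-memory results: serialize first, then normalize the JSON text.
    public static string Normalize<T>(T findings) => Normalize(JsonSerializer.Serialize(findings));

    private static void Scrub(JsonNode node)
    {
        switch (node)
        {
            case JsonObject obj:
                foreach (var field in VolatileFields) obj.Remove(field);
                foreach (var child in obj.Select(p => p.Value).ToList())
                    if (child is not null) Scrub(child);
                break;
            case JsonArray arr:
                foreach (var child in arr)
                    if (child is not null) Scrub(child);
                break;
        }
    }
}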

5. How to add a new regression case

Checklist for developers:

  1. Reproduce the bug using local dev tools.

  2. Minimize the input:

    • Prefer a trimmed SBOM or minimal image over full customer artifacts.
    • Remove sensitive data; use synthetic equivalents if needed.
  3. Create folder under Regression/SCN-XXXX-short-slug/.

  4. Populate:

    • input/ with all needed inputs (SBOMs, manifests, config).
    • expected/findings.json with the correct canonical output (not the buggy one).
    • case.metadata.json and case.md.
  5. Run tests locally: dotnet test tests/StellaOps.Scanner.RegressionTests

  6. Fix the scanner behavior (if not already fixed).

  7. Ensure tests fail without the fix and pass with it.

  8. Open PR with:

    • Fixture directory.
    • Scanner fix.
    • Any harness adjustments.

6. CI integration & “done” definition

  • CI job Scanner-Regression runs in PR validation and main.

  • A regression case is “live” when:

    • It is present under Regression/.
    • It is picked up by the harness.
    • CI fails if the behavior regresses.

G2 · Postgres slice prototype (shape-finding, no migration)

File: docs/dev/authority-store-backends.md · Applies to: Authority (or similar) services that currently use Mongo.

1. Purpose

This is not data migration. It is about:

  • Designing a clean storage interface.
  • Prototyping a Postgres-friendly schema for one bounded slice (e.g. Authority.Tokens).
  • Being able to run the service with either Mongo or Postgres behind the same interface.

This supports future “conversion” and enterprise expectations (“we speak Postgres”) without blocking current prototyping speed.

2. Scope & constraints

  • Scope: one slice only (e.g. tokens, jobs, or VEX verdicts).
  • Mongo remains the operational source of truth for now.
  • Postgres path is opt-in via configuration.
  • No backward migration or synchronization logic.

3. Repository layout

Example for Authority.Tokens:

src/
  StellaOps.Authority/
    Domain/
      Token.cs
      TokenId.cs
      ...
    Stores/
      ITokenStore.cs
      MongoTokenStore.cs
      PostgresTokenStore.cs
    Persistence/
      AuthorityPgDbContext.cs
      Migrations/
        0001_AuthTokens_Init.sql
docs/
  dev/
    authority-store-backends.md

4. Store interface design

Guidelines:

  • Keep the interface narrow and domain-centric.
  • Do not leak Mongo or SQL constructs.

Example:

public interface ITokenStore
{
    Task<Token?> GetByIdAsync(TokenId id, CancellationToken ct = default);
    Task<IReadOnlyList<Token>> GetByOwnerAsync(PrincipalId ownerId, CancellationToken ct = default);
    Task SaveAsync(Token token, CancellationToken ct = default);
    Task RevokeAsync(TokenId id, RevocationReason reason, CancellationToken ct = default);
}

Both MongoTokenStore and PostgresTokenStore must implement this contract.

5. Postgres schema guidelines

Principles:

  • Use schemas per bounded context, e.g. authority.
  • Choose stable primary keys (uuid or bigint), not composite keys unless necessary.
  • Index only for known access patterns.

Example DDL:

CREATE SCHEMA IF NOT EXISTS authority;

CREATE TABLE IF NOT EXISTS authority.tokens (
    id              uuid            PRIMARY KEY,
    owner_id        uuid            NOT NULL,
    kind            text            NOT NULL,
    issued_at_utc   timestamptz     NOT NULL,
    expires_at_utc  timestamptz     NULL,
    is_revoked      boolean         NOT NULL DEFAULT false,
    revoked_at_utc  timestamptz     NULL,
    revoked_reason  text            NULL,
    payload_json    jsonb           NOT NULL,
    created_at_utc  timestamptz     NOT NULL DEFAULT now(),
    updated_at_utc  timestamptz     NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS ix_tokens_owner_id
    ON authority.tokens (owner_id);

CREATE INDEX IF NOT EXISTS ix_tokens_kind
    ON authority.tokens (kind);

EF Core:

  • Map with ToTable("tokens", "authority").
  • Use owned types or value converters for payload_json.
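
A minimal mapping sketch for that, assuming EF Core with the Npgsql provider; TokenRecord and TokenPayload are illustrative persistence-side shapes, and the column names follow the DDL above:

using System;
using System.Text.Json;
using Microsoft.EntityFrameworkCore;

public sealed class AuthorityPgDbContext : DbContext
{
    public AuthorityPgDbContext(DbContextOptions<AuthorityPgDbContext> options) : base(options) { }

    public DbSet<TokenRecord> Tokens => Set<TokenRecord>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        var tokens = modelBuilder.Entity<TokenRecord>();
        tokens.ToTable("tokens", "authority");
        tokens.HasKey(t => t.Id);
        tokens.Property(t => t.Id).HasColumnName("id");
        tokens.Property(t => t.OwnerId).HasColumnName("owner_id");
        tokens.Property(t => t.Kind).HasColumnName("kind");
        // ... remaining scalar columns mapped the same way ...

        // Value converter: keep a typed payload in the domain, persist it as jsonb.
        tokens.Property(t => t.Payload)
              .HasColumnName("payload_json")
              .HasColumnType("jsonb")
              .HasConversion(
                  p => JsonSerializer.Serialize(p, (JsonSerializerOptions?)null),
                  s => JsonSerializer.Deserialize<TokenPayload>(s, (JsonSerializerOptions?)null)!);

        tokens.HasIndex(t => t.OwnerId).HasDatabaseName("ix_tokens_owner_id");
        tokens.HasIndex(t => t.Kind).HasDatabaseName("ix_tokens_kind");
    }
}

// Persistence-side shapes; PostgresTokenStore maps the real Token domain type to/from these.
public sealed class TokenRecord
{
    public Guid Id { get; set; }
    public Guid OwnerId { get; set; }
    public string Kind { get; set; } = default!;
    public TokenPayload Payload { get; set; } = default!;
}

public sealed class TokenPayload { /* slice-specific fields */ }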

6. Configuration & feature flag

Configuration:

// appsettings.json
{
  "Authority": {
    "StoreBackend": "Mongo", // or "Postgres"
    "Postgres": {
      "ConnectionString": "Host=...;Database=stella;..."
    },
    "Mongo": {
      "ConnectionString": "...",
      "Database": "stella_authority"
    }
  }
}

Environment override:

  • AUTHORITY_STORE_BACKEND=Postgres

DI wiring:

services.AddSingleton<ITokenStore>(sp =>
{
    var cfg = sp.GetRequiredService<IOptions<AuthorityOptions>>().Value;
    return cfg.StoreBackend switch
    {
        "Postgres" => new PostgresTokenStore(
            sp.GetRequiredService<AuthorityPgDbContext>()),
        "Mongo" => new MongoTokenStore(
            sp.GetRequiredService<IMongoDatabase>()),
        _ => throw new InvalidOperationException("Unknown store backend")
    };
});

7. Minimal parity / sanity checks

Given you are still prototyping, keep this light.

Recommended:

  • A single test or console harness that:

    • Creates a small set of Token objects.
    • Writes via Mongo and via Postgres.
    • Reads back from each and compares a JSON representation of the domain object (ignoring DB-specific metadata).

This is not full data migration testing; it is a smoke check that both backends honor the same domain contract.
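
A sketch of that smoke check as an xUnit test, assuming the ITokenStore contract above; StoreFixture, TestTokens, and the Token.Id property are illustrative placeholders for whatever the service already has:

using System.Text.Json;
using System.Threading.Tasks;
using Xunit;

public sealed class TokenStoreParityTests : IClassFixture<StoreFixture>   // StoreFixture: hypothetical fixture exposing both backends
{
    private readonly ITokenStore _mongo;
    private readonly ITokenStore _postgres;

    public TokenStoreParityTests(StoreFixture fixture)
    {
        _mongo = fixture.MongoStore;
        _postgres = fixture.PostgresStore;
    }

    [Fact]
    public async Task Both_backends_roundtrip_the_same_domain_object()
    {
        var tokens = TestTokens.CreateSample(count: 25);   // hypothetical deterministic test-data helper

        foreach (var token in tokens)
        {
            await _mongo.SaveAsync(token);
            await _postgres.SaveAsync(token);
        }

        foreach (var token in tokens)
        {
            var fromMongo = await _mongo.GetByIdAsync(token.Id);
            var fromPostgres = await _postgres.GetByIdAsync(token.Id);

            // Compare the domain-level JSON projection, not storage documents/rows.
            Assert.Equal(Canonical(fromMongo), Canonical(fromPostgres));
        }
    }

    private static string Canonical(Token? token) => JsonSerializer.Serialize(token);
}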

8. “Done” definition for the first slice

For the chosen slice (e.g. Authority.Tokens):

  • ITokenStore exists, with Mongo and Postgres implementations.

  • A Postgres DbContext and first migration are present and runnable.

  • The service starts cleanly with StoreBackend=Postgres against an empty DB.

  • There is at least one automated or scripted sanity check that both backends behave equivalently for typical operations.

  • This is documented in docs/dev/authority-store-backends.md:

    • How to switch backends.
    • Current state: “Postgres is experimental; Mongo is default.”

G3 · DSSE + Rekor batch-size experiment

File: docs/attestation/dsse-rekor-batch-sizing.md · Applies to: Attestation / integrity pipeline for scanner reports.

1. Purpose

Determine a concrete default batch size for DSSE envelopes submitted to Rekor, balancing:

  • Reliability (few or no failures / retries).
  • Latency (per batch).
  • Envelope size (practical for transport and logging).

Outcome: a hard configuration value, e.g. DefaultDsseRekorBatchSize = 10, backed by measurement.

2. Scope

  • Artifact type: scanner report (e.g. scanner-report.json).
  • Environment: your current Rekor endpoint (or mirror), not the production critical path.
  • One small experiment tool, later reusable by attestation services.

3. Project structure

src/
  Experiments/
    StellaOps.Attest.Bench/
      Program.cs
      BenchConfig.cs
      BenchRunner.cs
      DsseSigner.cs
      RekorClient.cs
      ResultsCsvSink.cs
experiments/
  dsse_rekor/
    results.csv
docs/
  attestation/
    dsse-rekor-batch-sizing.md

4. CLI contract

Binary: stella-attest-bench

Example usage:

stella-attest-bench \
  --reports-dir ./artifacts/scanner-reports \
  --rekor-url https://rekor.lab.stella \
  --batch-sizes 1,5,10,25,50 \
  --out ./experiments/dsse_rekor/results.csv \
  --max-retries 3 \
  --timeout-ms 10000

Required flags:

  • --reports-dir
  • --rekor-url
  • --batch-sizes
  • --out

Optional:

  • --max-retries (default 3)
  • --timeout-ms (default 10000 ms)

5. Implementation guidelines

Core config:

public sealed class BenchConfig
{
    public string ReportsDirectory { get; init; } = default!;
    public Uri RekorUrl { get; init; } = default!;
    public int[] BatchSizes { get; init; } = Array.Empty<int>();
    public string OutputCsvPath { get; init; } = default!;
    public int MaxRetries { get; init; } = 3;
    public int TimeoutMs { get; init; } = 10_000;
}
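
A minimal way to fill that config from the CLI contract in section 4, hand-rolled to keep the tool dependency-free (swap in a proper argument parser if preferred):

using System;
using System.Linq;

public static class BenchConfigParser
{
    // Maps the section 4 flags onto BenchConfig; required flags throw, optional flags fall back to defaults.
    public static BenchConfig Parse(string[] args)
    {
        string? Get(string name) =>
            args.SkipWhile(a => a != name).Skip(1).FirstOrDefault();

        string Require(string name) =>
            Get(name) ?? throw new ArgumentException($"Missing required flag {name}");

        return new BenchConfig
        {
            ReportsDirectory = Require("--reports-dir"),
            RekorUrl = new Uri(Require("--rekor-url")),
            BatchSizes = Require("--batch-sizes").Split(',').Select(int.Parse).ToArray(),
            OutputCsvPath = Require("--out"),
            MaxRetries = int.TryParse(Get("--max-retries"), out var r) ? r : 3,
            TimeoutMs = int.TryParse(Get("--timeout-ms"), out var t) ? t : 10_000
        };
    }
}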

Interfaces:

public interface IDsseSigner
{
    byte[] WrapAndSign(byte[] payload, string payloadType);
}

public interface IRekorClient
{
    Task<RekorSubmitResult> SubmitAsync(
        IReadOnlyList<byte[]> envelopes,
        CancellationToken ct);
}

public interface IResultsSink
{
    Task AppendAsync(BatchMeasurement measurement, CancellationToken ct);
}

Measurement model:

public sealed class BatchMeasurement
{
    public DateTime TimestampUtc { get; init; }
    public int BatchSize { get; init; }
    public int EnvelopeCount { get; init; }
    public long TotalBytes { get; init; }
    public double AverageBytes { get; init; }
    public long LatencyMs { get; init; }
    public int SuccessCount { get; init; }
    public int FailureCount { get; init; }
    public int RetryCount { get; init; }
}

Runner outline:

  1. Enumerate report files in ReportsDirectory (e.g. *.json).

  2. For each batchSize:

    • Select up to batchSize reports.

    • For each report:

      • Read bytes.
      • Call IDsseSigner.WrapAndSign with payload type application/vnd.stellaops.scanner-report+json.
      • Track envelope sizes.
    • Start stopwatch.

    • Submit via IRekorClient.SubmitAsync with retry logic honoring MaxRetries and TimeoutMs.

    • Record latency, successes, failures, retries.

    • Write one BatchMeasurement row to results.csv.

CSV headers:

timestamp_utc,batch_size,envelopes_count,total_bytes,avg_bytes,latency_ms,successes,failures,retries
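
A sketch of the runner loop behind those steps, assuming the interfaces and BatchMeasurement model above; file selection and error handling are intentionally simple:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public sealed class BenchRunner
{
    private readonly IDsseSigner _signer;
    private readonly IRekorClient _rekor;
    private readonly IResultsSink _sink;

    public BenchRunner(IDsseSigner signer, IRekorClient rekor, IResultsSink sink)
        => (_signer, _rekor, _sink) = (signer, rekor, sink);

    public async Task RunAsync(BenchConfig config, CancellationToken ct)
    {
        // Deterministic report order so repeated runs use the same inputs.
        var reports = Directory.EnumerateFiles(config.ReportsDirectory, "*.json")
                               .OrderBy(p => p, StringComparer.Ordinal)
                               .ToList();

        foreach (var batchSize in config.BatchSizes)
        {
            // Wrap up to batchSize reports into DSSE envelopes and track their sizes.
            var envelopes = reports.Take(batchSize)
                                   .Select(File.ReadAllBytes)
                                   .Select(payload => _signer.WrapAndSign(payload, "application/vnd.stellaops.scanner-report+json"))
                                   .ToList();

            var stopwatch = Stopwatch.StartNew();
            var result = await _rekor.SubmitAsync(envelopes, ct);
            stopwatch.Stop();

            await _sink.AppendAsync(new BatchMeasurement
            {
                TimestampUtc = DateTime.UtcNow,
                BatchSize = batchSize,
                EnvelopeCount = envelopes.Count,
                TotalBytes = envelopes.Sum(e => (long)e.Length),
                AverageBytes = envelopes.Count == 0 ? 0 : envelopes.Average(e => (double)e.Length),
                LatencyMs = stopwatch.ElapsedMilliseconds,
                SuccessCount = result.SuccessCount,   // assumed shape of RekorSubmitResult (see client sketch below)
                FailureCount = result.FailureCount,
                RetryCount = result.RetryCount
            }, ct);
        }
    }
}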

Signer:

  • For the experiment, use a local dev key (e.g. Ed25519).
  • Make signing deterministic (no random salts that affect envelope size unexpectedly).
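
A signer sketch, assuming the standard DSSE pre-authentication encoding (PAE) and an ECDSA P-256 dev key as a stand-in for the Ed25519 key mentioned above; swap in whatever signing primitive your crypto stack actually provides:

using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public sealed class DsseSigner : IDsseSigner
{
    private readonly ECDsa _key;      // dev-only key; replace with Authority-managed key material later
    private readonly string _keyId;

    public DsseSigner(ECDsa key, string keyId) => (_key, _keyId) = (key, keyId);

    public byte[] WrapAndSign(byte[] payload, string payloadType)
    {
        // DSSE pre-authentication encoding: "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
        var prefix = Encoding.UTF8.GetBytes(
            $"DSSEv1 {Encoding.UTF8.GetByteCount(payloadType)} {payloadType} {payload.Length} ");
        var pae = prefix.Concat(payload).ToArray();

        // ECDSA itself is randomized, but P-256 signatures have a fixed size, so envelope size stays stable.
        var signature = _key.SignData(pae, HashAlgorithmName.SHA256);

        var envelope = new
        {
            payloadType,
            payload = Convert.ToBase64String(payload),
            signatures = new[] { new { keyid = _keyId, sig = Convert.ToBase64String(signature) } }
        };

        return JsonSerializer.SerializeToUtf8Bytes(envelope);
    }
}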

Rekor client:

  • For this experiment, you only need “accepted or not” plus HTTP status codes; inclusion proof verification is out of scope.
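
A correspondingly thin client sketch that posts each envelope and counts outcomes. The endpoint path and request body are assumptions: real Rekor expects a typed entry at POST /api/v1/log/entries, so adapt the request to whatever your Rekor or mirror instance accepts. RekorSubmitResult is defined here with an assumed shape:

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

public sealed record RekorSubmitResult(int SuccessCount, int FailureCount, int RetryCount);

public sealed class RekorClient : IRekorClient
{
    private readonly HttpClient _http;   // BaseAddress = --rekor-url, Timeout = --timeout-ms
    private readonly int _maxRetries;

    public RekorClient(HttpClient http, int maxRetries) => (_http, _maxRetries) = (http, maxRetries);

    public async Task<RekorSubmitResult> SubmitAsync(IReadOnlyList<byte[]> envelopes, CancellationToken ct)
    {
        int successes = 0, failures = 0, retries = 0;

        foreach (var envelope in envelopes)
        {
            for (var attempt = 1; ; attempt++)
            {
                using var content = new ByteArrayContent(envelope);
                content.Headers.ContentType = new MediaTypeHeaderValue("application/json");

                // Assumed submission path; adapt the body/entry type to the target Rekor instance.
                using var response = await _http.PostAsync("api/v1/log/entries", content, ct);

                if (response.IsSuccessStatusCode) { successes++; break; }
                if (attempt > _maxRetries) { failures++; break; }

                retries++;
                await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt), ct);   // simple linear backoff
            }
        }

        return new RekorSubmitResult(successes, failures, retries);
    }
}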

6. How to run and interpret

Execution steps:

  1. Prepare a folder with representative scanner reports:

    • At least 20–30 reports from different images, to approximate typical size variance.
  2. Ensure the Rekor environment is reachable and not rate-limited.

  3. Run the tool with a sweep like 1,5,10,25,50.

  4. Inspect results.csv:

    • For each batch size, look at:

      • Average latency.
      • Any nonzero failures or elevated retry counts.
      • Total and average bytes per envelope.

Decision rules:

  • Pick the largest batch size that:

    • Shows 0 failures and acceptable retry counts across multiple runs.
    • Keeps latency in a reasonable budget for your pipeline (e.g. under a few seconds per batch).
  • Define:

    • DefaultBatchSize = chosen safe value.
    • HardMaxBatchSize = first size where failures or unacceptable latency appear.
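
One way to pin the outcome in code so other modules cannot silently ignore it (names and values are suggestions, not an existing API):

using System;

// StellaOps.Attest (sketch): one authoritative place for the measured limits.
public static class AttestationBatchLimits
{
    // Placeholder values; replace with the numbers actually measured in results.csv.
    public const int DefaultDsseRekorBatchSize = 10;
    public const int HardMaxDsseRekorBatchSize = 25;

    public static int Clamp(int requested) => Math.Clamp(requested, 1, HardMaxDsseRekorBatchSize);
}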

Document in docs/attestation/dsse-rekor-batch-sizing.md:

  • Environment details (Rekor version, hardware, network).

  • Brief description of report corpus.

  • A summarized table from results.csv.

  • The chosen DefaultBatchSize and HardMaxBatchSize with a clear statement:

    • “All production attestation jobs must respect this default and must not exceed the hard max without a new measurement.”

7. “Done” definition

The experiment is “done” when:

  • StellaOps.Attest.Bench builds and runs with the specified CLI.

  • It produces a results.csv with one row per (run, batch_size) combination.

  • There is a written default:

    • DefaultDsseRekorBatchSize = X
    • plus rationale in dsse-rekor-batch-sizing.md.
  • Future attestation designs (e.g. scanner → ledger pipeline) are required to use that default unless they explicitly update the experiment and the document.


If you want, the next step can be to convert each of these into actual files (including a minimal csproj and Program.cs for the bench tool) so your team can copy-paste and start implementing.