feat: Add initial implementation of Vulnerability Resolver Jobs
- Created project for StellaOps.Scanner.Analyzers.Native.Tests with necessary dependencies.
- Documented roles and guidelines in AGENTS.md for the Scheduler module.
- Implemented IResolverJobService interface and InMemoryResolverJobService for handling resolver jobs.
- Added ResolverBacklogNotifier and ResolverBacklogService for monitoring job metrics.
- Developed API endpoints for managing resolver jobs and retrieving metrics.
- Defined models for resolver job requests and responses.
- Integrated dependency injection for resolver job services.
- Implemented ImpactIndexSnapshot for persisting impact index data.
- Introduced SignalsScoringOptions for configurable scoring weights in reachability scoring.
- Added unit tests for ReachabilityScoringService and RuntimeFactsIngestionService.
- Created dotnet-filter.sh script to handle command-line arguments for dotnet.
- Established nuget-prime project for managing package downloads.
@@ -0,0 +1,133 @@
Here's a compact, practical way to think about **embedding in-toto provenance attestations directly inside your event payloads** (instead of sidecar files), so your vuln/build graph stays temporally consistent.

---

### Why embed?

* **Atomicity:** build → publish → scan → VEX decisions share one event ID and clock; no dangling sidecars.
* **Replayability:** the event stream alone reproduces state (great for offline kits/audits).
* **Causal joins:** vulnerability findings can cite the exact provenance that led to an image/digest.

---

### Event shape (single, self-contained envelope)

```json
{
  "eventId": "01JDN2Q0YB8M…",
  "eventType": "build.provenance.v1",
  "occurredAt": "2025-11-13T10:22:31Z",
  "subject": {
    "artifactPurl": "pkg:docker/acme/api@sha256:…",
    "digest": {"sha256": "…"}
  },
  "provenance": {
    "kind": "in-toto-provenance",
    "dsse": {
      "payloadType": "application/vnd.in-toto+json",
      "payload": "<base64(in-toto Statement)>",
      "signatures": [{"keyid":"…","sig":"…"}]
    },
    "transparency": {
      "rekor": {"logIndex": 123456, "logID": "…", "entryUUID": "…"}
    }
  },
  "sig": {
    "envelope": "dsse",
    "alg": "Ed25519",
    "bundle": { "certChain": ["…"], "timestamp": "…" }
  },
  "meta": {
    "builderId": "https://builder.stella-ops.local/gha",
    "buildInvocationId": "gha-run-457812",
    "slsa": {"level": 3}
  }
}
```

**Notes**

* `provenance.dsse.payload` holds the raw in-toto Statement (type, subject, predicate).
* Keep both the **artifact digest** (subject) and the **statement subject** (inside the payload), and verify they match on ingest.

---

### DB model (Mongo-esque)

* `events` collection: one doc per event (above schema).
* **Compound index:** `{ "subject.digest.sha256": 1, "occurredAt": 1 }`
* **Causal index:** `{ "meta.buildInvocationId": 1 }`
* **Uniq guard:** `{ "eventId": 1 } unique`

---

### Ingest pipeline (deterministic)

1. **Verify DSSE:** check the signature and cert roots (or an offline trust bundle).
2. **Validate Statement:** subject digests, builder ID, predicateType.
3. **Upsert artifact node:** keyed by digest; attach `lastProvenanceEventId`.
4. **Append event:** write once; never mutate (event-sourced).
5. **Emit derived edges:** `(builderId) --built--> (artifact@digest)` with `occurredAt`.
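A minimal sketch of that pipeline in C#, using the `IEventIngestor` contract shown later in this note. `IDsseVerifier`, `InTotoStatement`, `IEventStore`, and `IGraphWriter` are hypothetical stand-ins for your real crypto, Mongo, and graph layers, not names fixed by the design:

```csharp
// Sketch only: the injected abstractions are assumed, not prescribed.
public sealed class EventIngestor : IEventIngestor
{
    private readonly IDsseVerifier _verifier; // step 1: DSSE signature + trust roots
    private readonly IEventStore _events;     // step 4: append-only event writer
    private readonly IGraphWriter _graph;     // steps 3 & 5: artifact nodes + edges

    public EventIngestor(IDsseVerifier verifier, IEventStore events, IGraphWriter graph)
        => (_verifier, _events, _graph) = (verifier, events, graph);

    public async ValueTask IngestAsync(EventEnvelope ev, CancellationToken ct)
    {
        // 1. Verify the DSSE envelope against configured roots / offline bundle.
        await _verifier.VerifyAsync(ev.Provenance.Dsse, ct);

        // 2. Validate: the statement subject digest must match the event subject.
        var statement = InTotoStatement.Decode(ev.Provenance.Dsse.Payload); // hypothetical decoder
        if (!statement.Subjects.Any(s => s.Sha256 == ev.Subject.Digest.Sha256))
            throw new InvalidOperationException("Statement subject does not match event subject.");

        // 3. Upsert the artifact node keyed by digest.
        await _graph.UpsertArtifactAsync(ev.Subject.Digest.Sha256, ev.EventId, ct);

        // 4. Append-only write; a duplicate eventId is rejected by the unique index.
        await _events.AppendAsync(ev, ct);

        // 5. Derived edge: (builderId) --built--> (artifact@digest).
        await _graph.AddBuiltEdgeAsync(ev.Meta.BuilderId, ev.Subject.Digest.Sha256, ev.OccurredAt, ct);
    }
}
```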
---

### Joining scans to provenance (temporal consistency)

* When a scan event arrives, resolve the **latest provenance event with `occurredAt ≤ scan.occurredAt`** for the same digest.
* Store an edge `(artifact@digest) --scannedWith--> (scanner@version)` with a **pointer to the provenance eventId** used for policy, as in the query sketch below.
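A sketch of that temporal join with the MongoDB C# driver; collection and field names follow the event schema above, nothing else is prescribed:

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Latest provenance event at or before the scan time, for the same digest.
// Served efficiently by the compound index { subject.digest.sha256: 1, occurredAt: 1 }.
public static async Task<BsonDocument?> ResolveProvenanceAsync(
    IMongoCollection<BsonDocument> events,
    string sha256,
    DateTime scanOccurredAt,
    CancellationToken ct)
{
    var filter = Builders<BsonDocument>.Filter.And(
        Builders<BsonDocument>.Filter.Eq("eventType", "build.provenance.v1"),
        Builders<BsonDocument>.Filter.Eq("subject.digest.sha256", sha256),
        Builders<BsonDocument>.Filter.Lte("occurredAt", scanOccurredAt));

    return await events.Find(filter)
        .Sort(Builders<BsonDocument>.Sort.Descending("occurredAt"))
        .Limit(1)
        .FirstOrDefaultAsync(ct);
}
```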
---

### Minimal .NET 10 contracts

```csharp
public sealed record DsseSig(string KeyId, string Sig);
public sealed record Digest(string Sha256);
public sealed record RekorRef(long LogIndex, string LogId, string EntryUuid);
public sealed record Transparency(RekorRef? Rekor);
public sealed record SigBundle(IReadOnlyList<string> CertChain, string Timestamp);
public sealed record SigMeta(string Envelope, string Alg, SigBundle? Bundle);
public sealed record Meta(string BuilderId, string BuildInvocationId, int SlsaLevel);

public sealed record DsseEnvelope(string PayloadType, string Payload, IReadOnlyList<DsseSig> Signatures);
public sealed record Provenance(string Kind, DsseEnvelope Dsse, Transparency? Transparency);
public sealed record EventSubject(string ArtifactPurl, Digest Digest);
public sealed record EventEnvelope(
    string EventId, string EventType, DateTimeOffset OccurredAt,
    EventSubject Subject, Provenance Provenance, SigMeta Sig, Meta Meta);

public interface IEventVerifier {
    ValueTask VerifyAsync(EventEnvelope ev, CancellationToken ct);
}
public interface IEventIngestor {
    ValueTask IngestAsync(EventEnvelope ev, CancellationToken ct); // verify -> validate -> append -> derive
}
```

---

### Policy hooks (VEX/Trust Algebra)

* **Rule:** "Only trust findings if the scan's referenced provenance has `builderId ∈ AllowedBuilders`, `SLSA ≥ 3`, and `time(scan) − time(prov) ≤ 24h`."
* **Effect:** drops stale/forged results and aligns all scoring to one timeline.
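As a sketch, that rule is just a predicate over the event pair; the `allowedBuilders` set and the 24-hour window are policy inputs, not fixed values:

```csharp
public static bool TrustFinding(
    EventEnvelope scan, EventEnvelope provenance, IReadOnlySet<string> allowedBuilders)
    => allowedBuilders.Contains(provenance.Meta.BuilderId)
       && provenance.Meta.SlsaLevel >= 3
       && scan.OccurredAt - provenance.OccurredAt <= TimeSpan.FromHours(24);
```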
---

### Migration from sidecars

1. **Dual-write** for one sprint: keep emitting sidecars, but also embed DSSE in events.
2. Add a **backfill job**: wrap historical sidecars into `build.provenance.v1` events (preserve original timestamps).
3. Flip **consumers** (scoring/VEX) to **require `provenance` in the event**; keep the sidecar reader only for legacy imports.

---

### Failure & edge cases

* **Oversized payloads:** gzip the DSSE payload; cap the event body (e.g., 512 KB) and store overflow in `provenance.ref` (a content-addressed blob) while **hash-linking** it in the event.
* **Multiple subjects:** keep the Statement intact; still key the event by the **primary digest** you care about, but validate all subjects.

---

### Quick checklist to ship

* [ ] Event schema & JSON Schema with strict types (no additionalProperties).
* [ ] DSSE + in-toto validators (offline trust bundles supported).
* [ ] Mongo indexes + append-only writer.
* [ ] Temporal join in the scanner consumer (≤ O(log n) via index).
* [ ] VEX rules referencing `event.meta` & `provenance.dsse`.
* [ ] Backfill task for legacy sidecars.
* [ ] Replay test: rebuild the graph from events only → identical results.

If you want, I can turn this into ready-to-drop **.proto + C# models**, plus a Mongo migration script and a tiny verifier service.
@@ -0,0 +1,103 @@
Here's a tight idea I think you'll like: **make every VEX "non-affected" verdict explain itself with provable, symbol-level evidence**—not just "package X isn't reachable," but "function `Foo::bar()` (the vulnerable sink) is never called in any admissible execution of image Y," backed by cryptographic provenance.

---

# Why this matters (quickly)

* **Trust:** auditors and customers can verify why you suppressed a CVE.
* **Quiet scanner:** fewer false alarms because decisions cite concrete call paths (or their absence).
* **Moat:** competitors stop at file/package reachability; you show **function-level** proof tied to in-toto attestations.

---

# Core concept (plain)

Blend two things:

1. **Deterministic symbol reachability** (per language): build minimal call graphs and mark whether the vulnerable symbol is callable from your app's entrypoints.
2. **in-toto-anchored provenance**: sign the *inputs and reasoning* (rules, SBOM slice, call-graph hash, evidence artifacts), so the verdict can be independently re-verified.

Result: each VEX decision is a **verifiable mini-proof**.

---

# What the evidence looks like (per CVE/component)

* **Symbol set:** canonical IDs of the vulnerable functions (e.g., `pkg@ver#Type::Method(sig)`).
* **Call-graph digest:** hash of the pruned call graph from app entrypoints to those symbols.
* **Evidence:**
  * Static: "No path from any entrypoint → {vuln symbols} (k=0)."
  * Optional runtime: sampled traces (EventPipe/JFR/eBPF) show **0 hits** on the symbols/guards.
* **Context:** build inputs (SBOM, lockfiles, compile units), framework models used, versions.
* **Attestation:** in-toto/DSSE signed bundle with a reproducible scan manifest.

---

# Minimal prototype this week (Scanner reachability scorer)

1. **Symbol mappers (MVP)**
   * .NET: read PDB + IL to enumerate `MethodDef` symbols; map NuGet pkg → assembly → methods.
   * JVM: JAR index + method table (via ASM); map Maven coords → classes → methods.
2. **Entrypoint discovery**
   * Docker CMD/ENTRYPOINT → process launch → managed main(s) (ASP.NET Program.Main, Spring Boot main).
3. **Shallow call graph** (no fancy points-to yet)
   * Direct calls + common framework handoffs (ASP.NET routing → controller; Spring @RequestMapping → handler).
4. **Vuln ↔ symbol alignment**
   * Heuristics: match GHSA/OSV "affected functions" or the patch diff to infer symbol names; fall back to a package-scope verdict flagged "symbol-inferred: false".
5. **Decision object**
   * `ReachabilityDecision.json` with: entrypoints, symbol set, path_count, notes, hashes (see the record sketch after this list).
6. **Attest**
   * Emit `reachability.intoto.jsonl` (subject = image digest + SBOM component + symbol digest). Cosign with your test key.
7. **VEX output**
   * OpenVEX statement reason: `component_not_present` or `vulnerable_code_not_in_execute_path`, with `justification_url` → a small, signed HTML report.
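A minimal C# shape for that decision object; the field names mirror the list above, and everything else is an assumption:

```csharp
public sealed record ReachabilityDecision(
    string CveId,
    string ComponentPurl,
    IReadOnlyList<string> Entrypoints,        // normalized entrypoint symbol IDs
    IReadOnlyList<string> VulnerableSymbols,  // canonical vulnerable-symbol IDs
    int PathCount,                            // 0 => eligible for "non-affected"
    bool SymbolInferred,                      // true when symbols came from heuristics
    string CallGraphDigest,                   // hash of the pruned call graph
    string Notes);
```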
---

# Data & schemas to add

* `Scanner.Reachability/`
  * `SymbolIndex` (pkg → assemblies/classes/methods)
  * `EntryPoints` (per image, normalized)
  * `CallGraphPruned` (edges + hash)
  * `Decision` (path_count, evidence, versions)
* `Authority`
  * Key management for DSSE; policy to **require** reachability evidence for "non-affected".

---

# Language roadmap (fast win → harder)

* **Week 1–2:** .NET + JVM shallow graphs + ASP.NET/Spring models.
* **Week 3–4:** Node/TS (TS compiler API), Go (SSA), Python (import graph + common web frameworks).
* **Stretch:** Rust/Swift (MIR/SIL summaries), native (symbols + coarse edges), shell (exec chain).

---

# Where to surface it (UX)

* VEX details panel: "Non-affected (0 call paths)" → expand → show entrypoints, a collapsed edge list, trace hit counts, and a "Verify attestation" button.
* CLI: `stella scan --explain CVE-XYZ --verify-attestation`.

---

# Guardrails

* If symbol mapping is uncertain, **do not** claim "non-affected"; downgrade to "under review" with a rationale.
* Cache symbol indexes per package version; keep the call graph pruned to entrypoint cones for speed.
* Everything reproducible: the **scan manifest** (feeds, rule versions, hashes) must recreate the same decision bit-for-bit.

---

If you want, I can draft the exact C# namespaces, interfaces, and the OpenVEX + in-toto payloads you can drop into `Scanner.Reachability` and `Authority.Attest`.
@@ -0,0 +1,488 @@
Here is a complete, implementation-ready sketch you can drop into your solution and tune.

I assume:

* ASP.NET Core Web API (.NET 10).
* EF `DbContext` with `DbSet<PolCensusList>`.
* Excel via **ClosedXML** (clean API, MIT-licensed, built on OpenXML).

---

## 1. NuGet packages

Add to the Web/API project:

```bash
dotnet add package ClosedXML
dotnet add package DocumentFormat.OpenXml
```

---

## 2. File repository abstraction

This matches your requirement: upload/download by `bucketId` + `fileId`, plus stream variants.

```csharp
public interface IFileRepository
{
    // Uploads a file identified by bucketId + fileId from a Stream
    Task UploadAsync(
        string bucketId,
        string fileId,
        Stream content,
        string contentType,
        CancellationToken cancellationToken = default);

    // Uploads a file from an in-memory buffer
    Task UploadAsync(
        string bucketId,
        string fileId,
        byte[] content,
        string contentType,
        CancellationToken cancellationToken = default);

    // Downloads a file as a Stream (caller is responsible for disposing)
    Task<Stream> DownloadAsStreamAsync(
        string bucketId,
        string fileId,
        CancellationToken cancellationToken = default);

    // Downloads a file as a byte[] buffer
    Task<byte[]> DownloadAsBytesAsync(
        string bucketId,
        string fileId,
        CancellationToken cancellationToken = default);
}
```

Example of a simple implementation over some `IFileStoreClient` (adjust to your FileStore API):

```csharp
public sealed class FileStoreRepository : IFileRepository
{
    private readonly IFileStoreClient _client;

    public FileStoreRepository(IFileStoreClient client)
    {
        _client = client;
    }

    public async Task UploadAsync(
        string bucketId,
        string fileId,
        Stream content,
        string contentType,
        CancellationToken cancellationToken = default)
    {
        // Example – adapt to your real client
        await _client.PutObjectAsync(
            bucketId: bucketId,
            objectId: fileId,
            content: content,
            contentType: contentType,
            cancellationToken: cancellationToken);
    }

    public async Task UploadAsync(
        string bucketId,
        string fileId,
        byte[] content,
        string contentType,
        CancellationToken cancellationToken = default)
    {
        await using var ms = new MemoryStream(content, writable: false);
        await UploadAsync(bucketId, fileId, ms, contentType, cancellationToken);
    }

    public async Task<Stream> DownloadAsStreamAsync(
        string bucketId,
        string fileId,
        CancellationToken cancellationToken = default)
    {
        // Must return a readable Stream ready for ClosedXML
        return await _client.GetObjectStreamAsync(
            bucketId: bucketId,
            objectId: fileId,
            cancellationToken: cancellationToken);
    }

    public async Task<byte[]> DownloadAsBytesAsync(
        string bucketId,
        string fileId,
        CancellationToken cancellationToken = default)
    {
        await using var stream = await DownloadAsStreamAsync(bucketId, fileId, cancellationToken);
        using var ms = new MemoryStream();
        await stream.CopyToAsync(ms, cancellationToken);
        return ms.ToArray();
    }
}
```

Register in DI:

```csharp
builder.Services.AddScoped<IFileRepository, FileStoreRepository>();
```

---
## 3. Import service for `PolCensusList` from Excel

To keep the controller thin, put the Excel parsing + EF work into a service.

Assumptions (adjust as needed):

* The file is an `.xlsx` with a header row.
* Data starts at row 2.
* Columns are:

| Excel column | Index | Property       |
| ------------ | ----- | -------------- |
| A            | 1     | CustPid        |
| B            | 2     | Gname          |
| C            | 3     | Sname          |
| D            | 4     | Fname          |
| E            | 5     | BirthDate      |
| F            | 6     | Gender         |
| G            | 7     | Bmi            |
| H            | 8     | Dependant      |
| I            | 9     | DependantOn    |
| J            | 10    | MemberAction   |
| K            | 11    | GrpCode        |
| L            | 12    | BeginDate      |
| M            | 13    | SrCustId       |
| N            | 14    | MemberPolicyId |
| O            | 15    | MemberAnnexId  |
| P            | 16    | ErrMsg         |

Other fields (`SrPolicyId`, `SrAnnexId`, `FileId`, `Tstamp`) are taken from parameters/system.

```csharp
using System.Globalization;
using ClosedXML.Excel;
using Microsoft.EntityFrameworkCore;

public interface IPolCensusImportService
{
    Task<int> ImportFromExcelAsync(
        string bucketId,
        string fileId,
        decimal srPolicyId,
        decimal srAnnexId,
        CancellationToken cancellationToken = default);
}

public sealed class PolCensusImportService : IPolCensusImportService
{
    // Date formats accepted for date cells stored as text – extend if needed
    private static readonly string[] DateFormats =
    {
        "dd.MM.yyyy",
        "dd/MM/yyyy",
        "yyyy-MM-dd",
        "M/d/yyyy",
    };

    private readonly SerdicaHealthContext _dbContext;
    private readonly IFileRepository _fileRepository;

    public PolCensusImportService(
        SerdicaHealthContext dbContext,
        IFileRepository fileRepository)
    {
        _dbContext = dbContext;
        _fileRepository = fileRepository;
    }

    public async Task<int> ImportFromExcelAsync(
        string bucketId,
        string fileId,
        decimal srPolicyId,
        decimal srAnnexId,
        CancellationToken cancellationToken = default)
    {
        await using var stream = await _fileRepository.DownloadAsStreamAsync(bucketId, fileId, cancellationToken);
        using var workbook = new XLWorkbook(stream);
        var worksheet = workbook.Worksheets.First();

        var now = DateTime.UtcNow;
        var entities = new List<PolCensusList>();

        const int headerRow = 1;
        var firstDataRow = headerRow + 1;

        for (var row = firstDataRow; ; row++)
        {
            var rowRange = worksheet.Row(row);
            if (rowRange.IsEmpty()) break; // stop on the first fully empty row

            // Minimal "empty row" check – no CustPid and no names => stop
            var custPidCell = rowRange.Cell(1);
            var gnameCell = rowRange.Cell(2);
            var snameCell = rowRange.Cell(3);

            if (custPidCell.IsEmpty() && gnameCell.IsEmpty() && snameCell.IsEmpty())
            {
                break;
            }

            var entity = new PolCensusList
            {
                // Non-null FK fields from parameters
                SrPolicyId = srPolicyId,
                SrAnnexId = srAnnexId,

                CustPid = custPidCell.GetString().Trim(),
                Gname = gnameCell.GetString().Trim(),
                Sname = snameCell.GetString().Trim(),
                Fname = rowRange.Cell(4).GetString().Trim(),
                BirthDate = GetDate(rowRange.Cell(5)),
                Gender = rowRange.Cell(6).GetString().Trim(),
                Bmi = GetDecimal(rowRange.Cell(7)),
                Dependant = rowRange.Cell(8).GetString().Trim(),
                DependantOn = rowRange.Cell(9).GetString().Trim(),
                MemberAction = rowRange.Cell(10).GetString().Trim(),
                GrpCode = rowRange.Cell(11).GetString().Trim(),
                BeginDate = GetNullableDate(rowRange.Cell(12)),
                SrCustId = GetNullableDecimal(rowRange.Cell(13)),
                MemberPolicyId = GetNullableDecimal(rowRange.Cell(14)),
                MemberAnnexId = GetNullableDecimal(rowRange.Cell(15)),
                ErrMsg = rowRange.Cell(16).GetString().Trim(),

                // Audit / technical fields
                Tstamp = now,
                FileId = fileId,

                // Attr* left null for now – can be mapped later if needed
            };

            entities.Add(entity);
        }

        await using var transaction = await _dbContext.Database.BeginTransactionAsync(cancellationToken);
        try
        {
            await _dbContext.PolCensusLists.AddRangeAsync(entities, cancellationToken);
            var affected = await _dbContext.SaveChangesAsync(cancellationToken);
            await transaction.CommitAsync(cancellationToken);
            return affected;
        }
        catch
        {
            await transaction.RollbackAsync(cancellationToken);
            throw;
        }
    }

    private static DateTime GetDate(IXLCell cell)
    {
        if (cell.DataType == XLDataType.DateTime &&
            cell.GetDateTime() != default)
        {
            return cell.GetDateTime().Date;
        }

        var raw = cell.GetString().Trim();
        if (string.IsNullOrEmpty(raw))
            throw new InvalidOperationException("BirthDate is required but empty.");

        if (DateTime.TryParseExact(raw, DateFormats,
                CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeLocal,
                out var dt))
        {
            return dt.Date;
        }

        if (DateTime.TryParse(raw, CultureInfo.CurrentCulture,
                DateTimeStyles.AssumeLocal, out var dt2))
        {
            return dt2.Date;
        }

        throw new FormatException($"Cannot parse date value '{raw}'.");
    }

    private static DateTime? GetNullableDate(IXLCell cell)
    {
        if (cell.IsEmpty()) return null;

        if (cell.DataType == XLDataType.DateTime &&
            cell.GetDateTime() != default)
        {
            return cell.GetDateTime().Date;
        }

        var raw = cell.GetString().Trim();
        if (string.IsNullOrEmpty(raw)) return null;

        if (DateTime.TryParseExact(raw, DateFormats,
                CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeLocal,
                out var dt))
        {
            return dt.Date;
        }

        if (DateTime.TryParse(raw, CultureInfo.CurrentCulture,
                DateTimeStyles.AssumeLocal, out var dt2))
        {
            return dt2.Date;
        }

        return null;
    }

    private static decimal GetDecimal(IXLCell cell)
    {
        if (cell.IsEmpty()) return 0m;

        if (cell.DataType == XLDataType.Number)
        {
            return (decimal)cell.GetDouble();
        }

        var raw = cell.GetString().Trim();
        if (string.IsNullOrEmpty(raw)) return 0m;

        // Try invariant and current culture
        if (decimal.TryParse(raw, NumberStyles.Any, CultureInfo.InvariantCulture, out var result))
            return result;

        if (decimal.TryParse(raw, NumberStyles.Any, CultureInfo.CurrentCulture, out result))
            return result;

        throw new FormatException($"Cannot parse decimal value '{raw}'.");
    }

    private static decimal? GetNullableDecimal(IXLCell cell)
    {
        if (cell.IsEmpty()) return null;

        if (cell.DataType == XLDataType.Number)
        {
            return (decimal)cell.GetDouble();
        }

        var raw = cell.GetString().Trim();
        if (string.IsNullOrEmpty(raw)) return null;

        if (decimal.TryParse(raw, NumberStyles.Any, CultureInfo.InvariantCulture, out var result))
            return result;

        if (decimal.TryParse(raw, NumberStyles.Any, CultureInfo.CurrentCulture, out result))
            return result;

        return null;
    }
}
```
Register the service:

```csharp
builder.Services.AddScoped<IPolCensusImportService, PolCensusImportService>();
```

---

## 4. API endpoint: import census data by `fileId`

Controller for the IPAL / Serdica health module:

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/health/census")]
public sealed class PolCensusController : ControllerBase
{
    private readonly IPolCensusImportService _importService;

    public PolCensusController(IPolCensusImportService importService)
    {
        _importService = importService;
    }

    /// <summary>
    /// Imports census members from an Excel file stored in FileStore.
    /// </summary>
    /// <param name="bucketId">Logical bucket / container in FileStore (e.g. "ipal-serdica-census").</param>
    /// <param name="fileId">Unique file identifier in FileStore.</param>
    /// <param name="srPolicyId">Serdica policy id (FK for PolCensusList).</param>
    /// <param name="srAnnexId">Serdica annex id (FK for PolCensusList).</param>
    [HttpPost("import/{bucketId}/{fileId}")]
    public async Task<IActionResult> ImportFromFile(
        string bucketId,
        string fileId,
        [FromQuery] decimal srPolicyId,
        [FromQuery] decimal srAnnexId,
        CancellationToken cancellationToken)
    {
        var importedCount = await _importService.ImportFromExcelAsync(
            bucketId,
            fileId,
            srPolicyId,
            srAnnexId,
            cancellationToken);

        return Ok(new
        {
            imported = importedCount,
            fileId,
            bucketId,
            srPolicyId,
            srAnnexId
        });
    }
}
```

---

## 5. Optional: upload endpoint using `IFileRepository`

If you want a simple upload entry point compatible with the repository (note: this action assumes the controller also receives an `IFileRepository` via constructor injection as `_fileRepository`):

```csharp
[HttpPost("upload")]
public async Task<IActionResult> UploadCensusFile(
    [FromForm] IFormFile file,
    [FromQuery] string bucketId,
    CancellationToken cancellationToken)
{
    if (file == null || file.Length == 0)
        return BadRequest("File is required.");

    var fileId = Guid.NewGuid().ToString("N");

    await using var stream = file.OpenReadStream();
    await _fileRepository.UploadAsync(
        bucketId,
        fileId,
        stream,
        file.ContentType ?? "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        cancellationToken);

    return Ok(new { fileId, bucketId });
}
```

---

If you tell me:

* the exact Excel column layout you will get from IPAL / Serdica, and
* whether `CensusId` is sequence-generated in Oracle or must be populated in code,

I can tighten the mapping + EF configuration so that it matches your schema 1:1 and is ready for production.
@@ -0,0 +1,131 @@
Here's a compact, implementation-ready blueprint to make your scanner's results quiet, explainable, and auditable end-to-end.

# Phase the "proof spine"

1. **SBOM-only → VEX-ready → Attested**
   * **SBOM (now):** generate SPDX 3.0.1 + CycloneDX 1.6 for every image/module. Include purls, CPEs (if available), license IDs, source URIs, and build metadata.
   * **VEX-ready (next):** normalize vuln inputs (OSV, GHSA, vendor feeds) to a single internal model; keep the fields needed for VEX (status, justification, impact, action, timestamp, issuer).
   * **Attest (then):** emit **in-toto/DSSE** attestations that bind: (a) SBOM digest, (b) ruleset version, (c) data sources & hashes, (d) VEX decisions. Log statement references in **Rekor** (or your mirror) for transparency.

# Explainability path (per alert)

For every surfaced finding, materialize:

* **Origin SBOM node** → component@version (with purl/CPE)
* **Match rule** → which matcher hit (name+version, range, CPE heuristics, source trust)
* **VEX gate** → decision with justification (e.g., affected/not_affected, component_not_present, configuration_needed)
* **Reachability trace** → static (call-graph path) and/or runtime (probe hits) to the vulnerable symbol(s)
* **Deterministic score** → numeric risk built from stable inputs (below)

Expose this as a single JSON object and a short, human-readable proof block in the UI/CLI.

# Smart-Diff (incremental analysis)

* **Change detector:** hash symbols/packages and dependency graphs; on new scans, diff against the prior state.
* **Selective re-analysis:** only re-parse/re-solve changed modules, lockfiles, or call-graph regions.
* **Memoized match & reachability:** cache vuln matches and reachability slices per (component, version, framework-model) key.

# Scoring (quiet by design)

Use stable, auditable inputs (a small scoring sketch follows this list):

* **Base:** CVSS v4.0 metrics (as provided by the source); fall back to v3.1 if v4 is missing.
* **Exploit maturity:** explicit flags when present (known exploited, PoC available, none).
* **Reachability boost/penalty:** function-level confirmation > package-level guess; runtime evidence > static-only.
* **Compensating controls:** WAF/feature flags/sandboxing recorded as gates that reduce surfaced priority (but never erase provenance).
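A sketch of that deterministic scoring in C#. The weights here are illustrative only; in the policy DSL below they would live in the `scoring.adjust` block of a signed policy bundle:

```csharp
public enum ExploitMaturity { None, Poc, KnownExploited }
public enum ReachabilityLevel { Unknown, PackageLevel, FunctionStatic, FunctionRuntime }

public static class DeterministicScore
{
    // Illustrative weights – real values belong in the signed policy bundle.
    public static double Compute(double cvssBase, ExploitMaturity exploit,
                                 ReachabilityLevel reach, bool compensatingControl)
    {
        var score = cvssBase;
        score += exploit switch
        {
            ExploitMaturity.KnownExploited => 1.0,
            ExploitMaturity.Poc => 0.8,
            _ => 0.0
        };
        score += reach switch
        {
            ReachabilityLevel.FunctionRuntime => 1.3,  // runtime evidence > static-only
            ReachabilityLevel.FunctionStatic => 1.1,
            ReachabilityLevel.PackageLevel => 0.3,
            _ => 0.0
        };
        if (compensatingControl) score -= 1.0; // gate lowers priority, never erases provenance
        return Math.Clamp(score, 0.0, 10.0);
    }
}
```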
# Minimal data contracts (copy-paste into your code)

**SBOM node (core):**

```json
{
  "purl": "pkg:npm/lodash@4.17.21",
  "hashes": [{"alg":"sha256","value":"..."}],
  "licenses": ["MIT"],
  "build": {"sourceUri":"git+https://...","commit":"..."},
  "attestations": [{"type":"intoto","subjectDigest":"sha256:..."}]
}
```

**Finding proof (per alert):**

```json
{
  "id": "FND-abc123",
  "component": {"purl":"pkg:maven/org.example/foo@1.2.3"},
  "vuln": {"id":"CVE-2024-XXXX","source":"OSV"},
  "matchRule": {"name":"purl-eq","details":{"range":"[1.2.0,1.2.5)"}},
  "vexGate": {"status":"affected","justification":"reachable_code_path"},
  "reachability": {
    "staticPath": ["Controller.handle","Service.parse","lib/vulnFunc"],
    "runtimeHits": [{"symbol":"lib/vulnFunc","count":37}]
  },
  "score": {"base":7.1,"exploit":"poc","reach":"function","final":8.4},
  "provenance": {
    "sbomDigest":"sha256:...",
    "ruleset":"signals-1.4.2",
    "feeds":[{"name":"OSV","etag":"..."}],
    "attRef":"rekor:sha256:..."
  }
}
```

# Services & where they live in Stella Ops

* **Sbomer:** Syft-backed generators (SPDX/CycloneDX) + DSSE signing.
* **Feedser/Concelier:** fetch & normalize vuln feeds (OSV/GHSA/vendor), maintain trust scores; the "preserve-prune source" rule stays.
* **Scanner.WebService:** orchestrates analyzers; run lattice algorithms here (per your standing rule).
* **Vexer/Excititor:** VEX issuance + policy evaluation (lattice gates).
* **Authority:** key management, DSSE signing, Rekor client (and mirror) endpoints.
* **Signals:** event-sourced store for proofs, reachability artifacts, and scoring outputs.

# Policies (tiny DSL sketch)

```yaml
version: 1
sources:
  - id: osv
    trust: 0.9
gates:
  - id: not-present
    when: component.present == false
    action: vex(status: not_affected, reason: component_not_present)
  - id: unreachable
    when: reachability.static == false and reachability.runtime == false
    action: vex(status: not_affected, reason: vulnerable_code_not_in_execute_path)
scoring:
  base: cvss.v4 or cvss.v3
  adjust:
    - if: exploit.maturity in ["known_exploited","poc"]
      add: 0.8
    - if: reachability.function_confirmed
      add: 1.1
    - if: gate == "not-present"
      subtract: 3.0
```

# Attestations & transparency (pragmatic path)

* **Produce** DSSE-wrapped in-toto statements for SBOM, ScanResult, and VEXBundle.
* **Record** statement digests in Rekor (or your **Proof-Market** mirror) with pointers back to your artifact store.
* **Bundle** offline kits with SBOM + VEX + attestations and a mini-Rekor log segment for air-gapped audits.

# UX: one-screen truth

* Table of findings with a **Final Score**, a **"Why?"** button expanding the 5-part proof chain, and **Fix** suggestions.
* Global toggles: *Show only reachable*, *Mute not-affected*, *Show deltas* (Smart-Diff), *Export VEX*.

# "Done next" checklist

* Wire Syft → SPDX/CycloneDX → DSSE emit → Rekor client.
* Normalize feeds to a single vuln model with trust weights.
* Implement the **FindingProof** schema and persist it in Signals.
* Add **Symbolizer + per-language reachability** stubs (even minimal) to populate the `reachability` fields.
* Ship VEX export (OpenVEX/CSAF) based on the current gates.
* Add Smart-Diff over SBOM + symbol-graph hashes.
* Surface the full proof chain in UI/CLI.

If you want, I can drop in concrete .NET 10 interfaces/classes for each component and a first pass of the Rekor/DSSE helpers next.
@@ -0,0 +1,102 @@
Here's a compact, plain-English plan to make your scanner **faster, quieter, and auditor-friendly** by (1) diff-aware rescans and (2) unified binary+source reachability—both drop-in for Stella Ops.

# Deterministic, diff-aware rescans (clean SBOM/VEX diffs)

**Goal:** Only recompute what changed; emit stable, minimal diffs reviewers can trust.

**Core ideas**

- **Per-layer SBOM artifacts (cacheable):** For each image layer `L#`, persist:
  - `sbom-L#.cdx.json` (CycloneDX), `hash(L#)`, `toolchain-hash`, `feeds-hash`.
  - **Symbol fingerprints** for each discovered file: `algo|path|size|mtime|xxh3|funcIDs[]`.
- **Slice recomputation:** On a new image `I'`, match layers via hashes; for changed layers or files, recompute *only* their call-graph slices and vuln joins.
- **Deterministic manifests:** Every scan writes a `scan.lock.json` (inputs, feed versions, rules, lattice policy hash, tool versions, clocks) so results are **replayable**.

**Minimal data model (Mongo)**

- `scan_runs(_id, imageDigest, inputsHash, policyHash, feedsHash, startedAt, finishedAt, parentRunId?)`
- `layer_sboms(scanRunId, layerDigest, sbomCid, symbolIndexCid, layerHash)`
- `file_symbols(scanRunId, path, fileHash, funcIDs[], lang, size, mtime)`
- `diffs(fromRunId, toRunId, kind: 'sbom'|'vex'|'reachability', stats, patch)` (store JSON Patch)

**Algorithm sketch**

1. Resolve base-image ancestry → map `old layer digest ↔ new layer digest`.
2. For unchanged layers: reuse `layer_sboms` + `file_symbols`.
3. For changed/added files: re-symbolize + re-analyze; restrict call-graph building to the **impacted SCCs**.
4. Re-join OSV/GHSA/vendor vulns → compute reachability deltas → emit a **stable JSON Patch**.

**CLI impact**

- `stella scan --deterministic --cache-dir ~/.stella/cache --emit-diff previousRunId`
- `stella diff --from <runA> --to <runB> --format jsonpatch|md`

---

# Unified binary + source reachability (function-level)

**Goal:** Decide "is the vulnerable function reachable/used here?" across native and managed code.

**Extraction**

- **Binary symbolizers:**
  - ELF: parse `.symtab`/`.dynsym`, DWARF (if present).
  - Mach-O/PE: export tables + DWARF/PDB (if present).
  - Build a **Canonical Symbol ID (CSID)**: `lang:pkg@ver!binary#file:function(signature)`; normalize C++/Rust mangling.
- **Source symbolizers:**
  - .NET (Roslyn+IL), JVM (bytecode), Go (SSA), Node/TS (TS AST), Python (AST), Rust (HIR/MIR if available).
- **Bindings join:** Map FFI edges (P/Invoke, cgo, JNI/JNA, N-API) → **cross-ecosystem call edges**:
  - .NET P/Invoke → DLL export CSID.
  - Java JNI → `Java_com_pkg_Class_Method` ↔ native export.
  - Node N-API → addon exports ↔ JS require() site.

**Reachability pipeline** (a small traversal sketch follows this list)

1. Build per-language call graphs (CG) with framework models (ASP.NET, Spring, Express, etc.).
2. Add FFI edges; merge into a **polyglot call graph**.
3. Mark **entrypoints** (container `CMD/ENTRYPOINT`, web handlers, cron, CLI verbs).
4. For each CVE → {pkg, version, affected symbols[]} mapping: **is any affected CSID on a path from an entrypoint?**
5. Output evidence:
   - `reachable: true|false|unknown`
   - shortest path (symbol list)
   - probes (optional): runtime samples (EventPipe/JFR/uprobes) hitting CSIDs
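A minimal BFS over the merged polyglot graph, assuming CSIDs are plain strings and the graph is an adjacency map — a sketch, not the real data structures:

```csharp
using System.Collections.Generic;

public static class Reachability
{
    // Returns the shortest entrypoint->target path, or null if no affected CSID is reachable.
    public static List<string>? ShortestVulnPath(
        IReadOnlyDictionary<string, IReadOnlyList<string>> callGraph, // CSID -> callees
        IEnumerable<string> entrypoints,
        IReadOnlySet<string> affectedCsids)
    {
        var parent = new Dictionary<string, string?>();
        var queue = new Queue<string>();
        foreach (var e in entrypoints)
            if (parent.TryAdd(e, null)) queue.Enqueue(e);

        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            if (affectedCsids.Contains(node))
            {
                // Walk parents back to the entrypoint; this is the evidence path.
                var path = new List<string>();
                for (string? n = node; n is not null; n = parent[n]) path.Add(n);
                path.Reverse();
                return path;
            }
            if (!callGraph.TryGetValue(node, out var callees)) continue;
            foreach (var callee in callees)
                if (parent.TryAdd(callee, node)) queue.Enqueue(callee);
        }
        return null; // no path => candidate for "not affected (unreachable)"
    }
}
```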
**Artifacts emitted**

- `symbols.csi.jsonl` (all CSIDs)
- `polyglot.cg.slices.json` (only the impacted SCCs, for diffs)
- `reach.vex.json` (OpenVEX/CSAF with function-level notes + confidence)

---

# What to build next (low-risk, high-impact)

- **[Week 1–2]** Per-layer caches + `scan.lock.json`; file symbol fingerprints (xxh3 + top-K funcIDs).
- **[Week 3–4]** ELF/PE/Mach-O symbolizer lib with CSIDs; .NET IL + P/Invoke mapper.
- **[Week 5–6]** Polyglot CG merge + entrypoint discovery from Docker metadata; JSON Patch diffs.
- **[Week 7+]** Runtime probes (opt-in) to boost confidence and suppress false positives.

---

# Tiny code seeds (C# hints)

**Symbol fingerprint (per file)**
```csharp
record SymbolFingerprint(
    string Algo, string Path, long Size, long MTimeUnix,
    string ContentHash, string[] FuncIds);
```

**Deterministic scan lock**
```csharp
record ScanLock(
    string FeedsHash, string RulesHash, string PolicyHash, string Toolchain,
    string ImageDigest, string[] LayerDigests, DateTimeOffset Clock,
    IDictionary<string,string> EnvPins);
```

**JSON Patch diff emit**
```csharp
// e.g., via the JsonDiffPatch.Net package (Newtonsoft JTokens);
// stable-sort object keys beforehand so diffs are deterministic.
var jdp = new JsonDiffPatchDotNet.JsonDiffPatch();
var patch = jdp.Diff(oldVexJson, newVexJson);
File.WriteAllText("vex.diff.json", patch?.ToString() ?? "{}");
```

---

If you want, I can turn this into:

- a **.proto** for the cache/index objects,
- a **Mongo schema + indexes** (including compound keys for fast layer reuse),
- and a **.NET 10** service skeleton (`StellaOps.Scanner.WebService`) with endpoints:
  `/scan`, `/diff/{from}/{to}`, `/reach/{runId}`.
@@ -0,0 +1,146 @@
Here's a fast, practical idea to speed up container scans: add a **hash-based SBOM layer cache** keyed by **(Docker layer digest + dependency-manifest checksum)** so identical inputs skip recomputation and only verify attestations.

---

### What this is (in plain words)

* **Layers are immutable.** Each image layer already has a content digest (e.g., `sha256:...`).
* **Dependency state is declarative.** Lockfiles/manifest files (NuGet `packages.lock.json`, `package-lock.json`, `poetry.lock`, `go.sum`, etc.) summarize deps.
* If both the **layer bytes** and the **manifest content** are identical to something we've scanned before, recomputing the SBOM/VEX is wasted work. We can **reuse** the previous result (plus a quick signature/attestation check).

---

### Cache key

```
CacheKey = SHA256(
  concat(
    LayerDigestCanonical,   // e.g., "sha256:abcd..."
    '\n',
    ManifestAlgo,           // e.g., "sha256"
    ':',
    ManifestChecksum        // hash of lockfile(s) inside the layer FS view
  )
)
```

* Optionally include toolchain IDs to prevent cross-version skew:
  * `SbomerVersion`, `ScannerRulesetVersion`, `FeedsSnapshotId` (OSV/NVD feed epoch), `PolicyBundleHash`.

---

### When it hits

* **Exact same layer + same manifests** → return the cached **SBOM component graph + vuln findings + VEX** and **re-verify** the **DSSE/in-toto attestation** and timestamps (freshness SLA).
* **Same layer, manifests absent** → fall back to byte-level heuristics (package index cache); lower confidence.

---

### Minimal .NET 10 sketch (Stella Ops)

```csharp
public sealed record LayerInput(
    string LayerDigest,       // "sha256:..."
    string? ManifestAlgo,     // "sha256"
    string? ManifestChecksum, // hex
    string SbomerVersion,
    string RulesetVersion,
    string FeedsSnapshotId,
    string PolicyBundleHash);

public static string ComputeCacheKey(LayerInput x)
{
    var s = string.Join("\n", new[]{
        x.LayerDigest,
        x.ManifestAlgo ?? "",
        x.ManifestChecksum ?? "",
        x.SbomerVersion,
        x.RulesetVersion,
        x.FeedsSnapshotId,
        x.PolicyBundleHash
    });
    using var sha = System.Security.Cryptography.SHA256.Create();
    return Convert.ToHexString(sha.ComputeHash(System.Text.Encoding.UTF8.GetBytes(s)));
}

public sealed class SbomCacheEntry
{
    public required string CacheKey { get; init; }
    public required byte[] CycloneDxJson { get; init; }    // gz if large
    public required byte[] VexJson { get; init; }
    public required byte[] AttestationDsse { get; init; }  // for re-verify
    public required DateTimeOffset ProducedAt { get; init; }
    public required string FeedsSnapshotId { get; init; }  // provenance
}
```

---

### Cache flow (Scanner)

1. **Before scan**
   * Extract manifest files from the union FS of the current layer.
   * Hash them (stable newline normalization).
   * Build a `LayerInput`; compute the `CacheKey`.
   * **Look up** `ISbomCache.Get(CacheKey)` (interface sketched below).
2. **Hit**
   * **Verify the attestation** (keys/policy), **check the feed epoch** is still within tolerance, **re-sign freshness** if policy allows.
   * Emit the cached SBOM/VEX downstream; mark provenance as "replayed".
3. **Miss**
   * Run the normal analyzers → SBOM → vuln match → VEX lattice.
   * Create the **in-toto/DSSE attestation**.
   * Store an `SbomCacheEntry` and **index it by**:
     * `CacheKey` (primary),
     * `LayerDigest` (secondary),
     * `(ecosystem, manifestChecksum)` for diagnostics.
4. **Invalidation**
   * Roll the cache on **FeedsSnapshotId** bumps or a **RulesetVersion** change.
   * TTL optional for emergency revocations; keep **attestation + provenance** for audit.
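The `ISbomCache` referenced above could look like this; only the `Get` lookup and the `SbomCacheEntry` shape are fixed by this note, the rest is an assumption:

```csharp
public interface ISbomCache
{
    // Lookup by the composite cache key; null on miss.
    Task<SbomCacheEntry?> GetAsync(string cacheKey, CancellationToken ct = default);

    // Store a freshly produced entry (append-only; existing keys are immutable).
    Task PutAsync(SbomCacheEntry entry, CancellationToken ct = default);

    // Bulk invalidation when the feed epoch or ruleset rolls.
    Task InvalidateByFeedsSnapshotAsync(string feedsSnapshotId, CancellationToken ct = default);
}
```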
---

### Storage options

* **Local:** content-addressed dir (`/var/lib/stellaops/sbom-cache/aa/bb/<cacheKey>.cjson.gz`).
* **Remote:** Redis or Mongo (GridFS) keyed by `cacheKey`; attach indexes on `LayerDigest`, `FeedsSnapshotId`.
* **OCI artifact:** push SBOM/VEX as OCI refs tied to the layer digest (helps multi-node CI).

---

### Attestation verification (quick)

* On hit: `Verify(AttestationDsse, Policy)`; ensure `subject.digest == LayerDigest` and that metadata (`FeedsSnapshotId`, tool versions) matches the required policy.
* Optional **freshness stamp:** a tiny, fast "verification attestation" you produce at replay time.

---

### Edge cases

* **Multi-manifest layers** (polyglot): combine checksums in a stable order (e.g., `SHA256(man1 + '\n' + man2 + ...)`).
* **Runtime-only diffs** (no manifest change): include a **package index snapshot hash** if you maintain one.
* **Reproducibility drift:** include the analyzer version & configuration knobs in the key so the cache never masks rule changes.

---

### Why this helps

* Cold scans compute once; subsequent builds (same base image + same lockfiles) **skip minutes of work**.
* Reproducibility becomes **measurable**: cache hit ratio per repo, per base image, per feed epoch.

---

### Quick tasks to add to Stella Ops

* [ ] Implement `LayerInput` + keying in `Scanner.WebService`.
* [ ] Add a **Manifest Harvester** step per ecosystem (NuGet, npm, pip/poetry, Go, Cargo).
* [ ] Add `ISbomCache` (local + Mongo/OCI backends) with metrics.
* [ ] Wire the **attestation re-verify** path on hits.
* [ ] Ship a **cache report**: hit/miss, time saved, reasons for miss (ruleset/feeds changed, manifest changed, new analyzer).

If you want, I can draft the actual C# interfaces (cache backend + verifier) and a tiny integration for your existing `Sbomer`/`Vexer` services next.
@@ -0,0 +1,224 @@
Here's a compact, implementation-ready plan to validate function-level reachability with a public, minimal CVE corpus—one runnable example per runtime (Go, .NET, Python, Rust). It gives you known vulnerable symbols, a tiny app that (optionally) calls them, and captured runtime traces to prove reachability.

---

# Corpus layout

```
stellaops-reach-corpus/
  README.md
  tooling/
    capture-dotnet-eventpipe.ps1
    capture-go-trace.sh
    capture-python-coverage.sh
    capture-rust-probe.sh
  go/
    CVE-YYYY-XXXX-min/
      go.mod
      vulner/pkg/vuln.go      // vulnerable symbol(s): func DoVuln()
      app/main.go             // calls or avoids DoVuln() via flag
      traces/                 // .out/.json from runtime
      EXPECT.yaml             // ground truth: reachable? call path?
  dotnet/
    CVE-YYYY-XXXX-min/
      src/VulnLib/VulnLib.cs  // public static string DoVuln(...)
      src/App/App.csproj
      src/App/Program.cs      // --reach / --no-reach
      traces/                 // .nettrace, EventPipe JSON, stack dumps
      EXPECT.yaml
  python/
    CVE-YYYY-XXXX-min/
      vuln/__init__.py        // def do_vuln()
      app.py                  // toggle call via env
      requirements.txt
      traces/coverage/        // coverage.xml + callgraph.json
      EXPECT.yaml
  rust/
    CVE-YYYY-XXXX-min/
      Cargo.toml
      src/lib.rs              // pub fn do_vuln()
      src/main.rs             // flag: --reach
      traces/                 // eBPF/usdt or log markers
      EXPECT.yaml
```

---

# EXPECT.yaml (shared contract)

```yaml
id: CVE-YYYY-XXXX
ecosystem: (go|dotnet|python|rust)
packages:
  - name: example.org/vulner
    version: 1.0.0
symbols:
  - fqname: example.org/vulner.DoVuln   # or Namespace.Class.Method, module.func
    kind: function
scenarios:
  - name: reach
    args: ["--reach"]
    expected:
      reachable: true
      call_paths:
        - ["app.main", "vulner.DoVuln"]
      runtime_hits: ">=1"
  - name: no_reach
    args: ["--no-reach"]
    expected:
      reachable: false
      call_paths: []
      runtime_hits: 0
artifacts:
  - sbom: sbom.cdx.json
  - trace: traces/reach.trace
notes: Minimal repro; avoid network/filesystem side effects.
```

---

# Minimal vulnerable symbol patterns

**Go**

`vulner/pkg/vuln.go`

```go
package vulner

func DoVuln(input string) string { return "vuln:" + input } // marker
```

`app/main.go`

```go
package main

import (
	"flag"
	"fmt"

	"example.org/vulner"
)

func main() {
	reach := flag.Bool("reach", false, "call vuln")
	flag.Parse()
	if *reach {
		fmt.Println(vulner.DoVuln("hit"))
	} else {
		fmt.Println("skip")
	}
}
```

**.NET (C# / .NET 10)**

`VulnLib/VulnLib.cs`

```csharp
namespace VulnLib;

public static class V {
    public static string DoVuln(string s) => "vuln:" + s; // marker
}
```

`App/Program.cs`

```csharp
using System;
using System.Linq;
using VulnLib;

var reach = args.Contains("--reach");
Console.WriteLine(reach ? V.DoVuln("hit") : "skip");
```

**Python**

`vuln/__init__.py`

```python
def do_vuln(s: str) -> str:
    return "vuln:" + s  # marker
```

`app.py`

```python
import os

from vuln import do_vuln

print(do_vuln("hit") if os.getenv("REACH") == "1" else "skip")
```

**Rust**

`src/lib.rs`

```rust
pub fn do_vuln(s: &str) -> String { format!("vuln:{s}") } // marker
```

`src/main.rs`

```rust
use std::env;

use vuln::do_vuln; // crate name "vuln" per Cargo.toml

fn main() {
    let reach = env::args().any(|a| a == "--reach");
    println!("{}", if reach { do_vuln("hit") } else { "skip".into() });
}
```

---

# Runtime trace capture (tiny, deterministic)

* **Go**: no `-toolexec` hooks or special GODEBUG settings required; use `runtime/trace` (or `pprof`) from a test.
  * `tooling/capture-go-trace.sh`: `go test ./... -run TestNoop && go test -run TestReach -trace=traces/reach.out`
* **.NET**: EventPipe
  * `dotnet-trace collect -p $PID --providers Microsoft-DotNETCore-SampleProfiler:0:5`
  * Or `dotnet-monitor collect --duration 5s --process-id ... --artifact-type traces`
* **Python**: `coverage run -m app` + `coverage xml -o traces/coverage/coverage.xml`
* **Rust**: simplest is log markers + `RUST_LOG` capture; optional: `perf record -g`, or USDT via `tracing` if you want call sites.

Each trace folder includes a short `trace.json` (normalized stack hits for the vulnerable symbol) produced by a tiny normalizer you ship in `tooling/` (sketched below).
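A sketch of that normalizer as a small C# top-level program. The input format (one stack frame symbol per line) and the `trace.json` shape are assumptions; only "normalized stack hits for the vulnerable symbol" is fixed by the contract above:

```csharp
// normalize-trace.cs – count stack-sample hits per target symbol, emit trace.json.
// Assumed input: a text file with one stack-frame symbol name per line.
using System;
using System.IO;
using System.Linq;
using System.Text.Json;

var stacksFile = args[0]; // e.g. traces/reach.stacks.txt (hypothetical path)
var symbols = args[1..];  // target fqnames, e.g. "vulner.DoVuln"

var hits = symbols.ToDictionary(s => s, _ => 0);
foreach (var frame in File.ReadLines(stacksFile))
    foreach (var s in symbols)
        if (frame.Contains(s, StringComparison.Ordinal))
            hits[s]++;

var doc = new
{
    schema = "trace/v1", // assumed schema tag
    hits = hits.Select(kv => new { symbol = kv.Key, count = kv.Value })
};
File.WriteAllText("trace.json",
    JsonSerializer.Serialize(doc, new JsonSerializerOptions { WriteIndented = true }));
```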
---

# SBOM & ground truth

For each example:

* Generate a CycloneDX SBOM (use the language's simplest generator or a tiny script) and include component + symbol annotations (e.g., `properties` with `symbol:fqname`).
* Keep versions pinned to avoid drift.

---

# Validation runner (one command)

`tooling/validate-all.sh`:

1. Build each example twice (reach / no_reach).
2. Capture the SBOM + runtime traces.
3. Emit a unified `results.json` with:
   * detected symbols from your Symbolizer
   * static call-graph reachability
   * runtime hit count per symbol
   * pass/fail vs `EXPECT.yaml`.

Exit non-zero on any mismatch → perfect for CI gates.

---

# Why this works as a public differentiator

* **Minimal & real:** one tiny, idiomatic app per runtime; a clear vulnerable symbol; two scenarios.
* **Auditable:** EXPECT.yaml + traces make results falsifiable.
* **Portable:** no network, no DB; runs in Docker or GitHub Actions.
* **Extensible:** add more CVEs by copying the template and swapping the "vulnerable symbol" (e.g., a path-traversal helper, an unsafe deserializer stub, a weak RNG wrapper).

---

# Next steps I can deliver immediately

* Bootstrap the repo with the above structure.
* Add the four first examples + scripts.
* Wire a single `validate-all` CLI to produce a JUnit-style report for your CI.

If you want, I'll generate the skeleton with ready-to-run code, EXPECTs, and the capture scripts tailored to your .NET 10 + Docker workflow.
@@ -0,0 +1,34 @@
Here's a quick, concrete proposal to **lock in a stable SBOM model for Stella Ops**: use **SPDX 3.0.1** as your canonical persistence schema and **CycloneDX 1.6** as the interchange "view," bridged by a deterministic transform.

**Why this pairing**

* **SPDX 3.0.1** gives you a rigorous, profile-based data model (Core/Security/AI/Build, etc.) with explicit **Relationship** semantics—ideal for long-lived storage and graph queries. ([SPDX][1])
* **CycloneDX 1.6** excels at exchange: widely adopted, supports **services/SaaSBOM**, **attestations (CDXA)**, **CBOM (crypto inventory)**, MLBOM, and more—perfect for producing portable BOMs for customers and regulators. ([CycloneDX][2])

**Target architecture (minimal)**

* **Persistence:** Store SBOMs as SPDX 3.0.1 (JSON-LD/RDF), normalized into your Mongo event-sourced graph; keep Relationship edges first-class. ([SPDX][1])
* **Interchange:** On export, render CycloneDX 1.6 (JSON/XML) including `components`, `services`, `dependencies`, `vulnerabilities`, and optional CBOM/CDXA blocks. ([SBOM Observer][3])
* **Deterministic transform:** Define a static mapping table (SPDX→CycloneDX) with sorted collections, stable UUID seeds, and normalized strings to guarantee byte-for-byte reproducibility across offline sites (see the exporter sketch below).
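A minimal sketch of what "deterministic" means here in code; `SpdxElement`/`CdxComponent` are placeholder records, not types from either spec's tooling:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public sealed record SpdxElement(string SpdxId, string Name, string Version);   // placeholder
public sealed record CdxComponent(string BomRef, string Name, string Version);  // placeholder

public static class DeterministicExport
{
    // Stable bom-ref: derived from content, not Guid.NewGuid(), so repeated
    // exports of the same SPDX graph are byte-identical.
    public static string StableBomRef(SpdxElement e)
    {
        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes($"{e.SpdxId}\n{e.Name}\n{e.Version}"));
        return "ref:" + Convert.ToHexString(bytes)[..16].ToLowerInvariant();
    }

    // Sorted collections: order is fixed by key, never by insertion or hash order.
    public static IReadOnlyList<CdxComponent> Map(IEnumerable<SpdxElement> elements) =>
        elements
            .OrderBy(e => e.SpdxId, StringComparer.Ordinal)
            .Select(e => new CdxComponent(StableBomRef(e), e.Name, e.Version))
            .ToList();
}
```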
**Quick win mapping examples**

* SPDX `Element` + `RelationshipType` → CycloneDX `dependencies` graph. ([SPDX][4])
* SPDX Security profile findings → CycloneDX `vulnerabilities` entries. ([SPDX][1])
* SPDX AI/Build profiles → CycloneDX MLBOM + CDXA attestations (build/provenance). ([SPDX][5])
* Crypto materials (keys/algos/policies) held in SPDX extensions or attributes → CycloneDX **CBOM** on export for policy checks (CNSA/NIST). ([CycloneDX][2])

**Governance & standards signal**

* SPDX 3.0.x is actively aligned with **OMG/ISO** submissions (a good long-term bet for storage). ([SPDX Lists][6])
* CycloneDX 1.6 is the current, actively enhanced interchange standard used across vendors and tooling. ([GitHub][7])

If you want, I'll draft the exact field-by-field mapping table (SPDX profile → CycloneDX section), plus a small .NET 10 library skeleton for the deterministic exporter.

[1]: https://spdx.github.io/spdx-spec/v3.0.1/ "SPDX Specification 3.0.1"
[2]: https://cyclonedx.org/news/cyclonedx-v1.6-released/ "CycloneDX v1.6 Released, Advances Software Supply ..."
[3]: https://sbom.observer/academy/learn/topics/cyclonedx "What is CycloneDX?"
[4]: https://spdx.github.io/spdx-spec/v3.0.1/model/Core/Vocabularies/RelationshipType/ "RelationshipType - SPDX Specification 3.0.1"
[5]: https://spdx.dev/wp-content/uploads/sites/31/2024/12/SPDX-3.0.1-1.pdf "SPDX Specification v3.0.1"
[6]: https://lists.spdx.org/g/Spdx-tech/topic/release_3_0_1_of_the_spdx/110308825 "Release 3.0.1 of the SPDX Specification"
[7]: https://github.com/CycloneDX/specification "CycloneDX/specification"
@@ -0,0 +1,132 @@
Here's a practical, plain-English game plan to validate three big Stella Ops claims—quiet scans, provenance, and diff-native CI—so you (and auditors/customers) can reproduce the results end-to-end.

---

# 1) "Explainably quiet by design"

**Goal:** Fewer false alarms, with every suppression justified (reachability/VEX), and every alert deduplicated and actionable.

**What to measure**

* **Noise rate:** total findings vs. actionable ones (has fix/KB/CWE + reachable or policy-relevant).
* **Dedup:** identical CVEs across layers/repos counted once.
* **Explainability:** % of findings with a clear path (package → symbol/function → evidence).
* **Suppression justifications:** % of suppressed items with a VEX reason (not affected, configuration, environment, reachability).

**A/B test setup**

* **Repos (representative mix):** .NET (ASP.NET app & library), JVM (Spring), Node/TS (Nest), Python (FastAPI), Go (CLI), container base images (Alpine, Debian, Ubuntu), and a known-noisy mono-repo.
* **Modes:** `baseline=no VEX/reach`, `quiet=reach+VEX+dedup`.
* **Metrics capture:** emit JSONL per repo with counts and examples.

**Minimal harness (pseudo)**

```bash
# baseline
stella scan repo --out baseline.jsonl --no-reach --no-vex --no-dedup
# quiet
stella scan repo --out quiet.jsonl --reach --vex openvex.json --dedup
stella explain --in quiet.jsonl --evidence callgraph,eventpipe --why > explain.md
stella metrics compare baseline.jsonl quiet.jsonl > ab_summary.md
```

**Pass criteria (suggested)**

* ≥50% reduction in non-actionable alerts.
* 100% of suppressions carry VEX + reason.
* ≥90% of actionable findings link to evidence (a reachable symbol or a policy gate).

---

# 2) "Provenance-first DevSecOps"

**Goal:** Ship a verifiable bundle anyone can check offline: SBOM + attestations + transparency-log proof.

**What to export**

* **SBOM:** CycloneDX 1.6 or SPDX 3.0.1.
* **Provenance attestation:** in-toto/DSSE (builder, materials, recipe, digest).
* **Signatures:** Sigstore (cosign) or regional crypto (pluggable).
* **Transparency log receipt:** Rekor (or mirror) inclusion proof.
* **Policy snapshot:** the exact policy/lattice and feed hashes used.
* **Repro manifest:** declarative inputs so scans are replayable.

**One-shot exporter**

```bash
stella bundle export \
  --sbom cyclonedx.json \
  --attest provenance.intoto.jsonl \
  --sig cosign.sig \
  --rekor-inclusion rekor.json \
  --policy policy.yml \
  --replay manifest.lock.json \
  --out stella-proof-bundle.tgz
```

**Independent verification (clean machine)**

```bash
stella bundle verify stella-proof-bundle.tgz \
  --check-sig --check-rekor --check-sbom --check-policy --replay
# Output should show digest matches, valid DSSE, Rekor inclusion, and replay parity.
```

**Pass criteria**

* All cryptographic checks pass offline.
* Replay produces a byte-identical findings set (or a diff limited to time-varying feeds pinned by hash).

---

# 3) "Diff-native CI for containers"

**Goal:** Rescan only what changed (layers/deps/policies) with equal detection parity and lower wall time.

**Test matrix**

* **Images:** a multistage app (runtime + deps), language runtimes (dotnet, jre, node, python), and a "fat" base (ubuntu:XX).
* **Changes:** Dockerfile ENV only, add/remove a package, patch an app DLL/JAR/JS, policy toggle.

**Runs**

```bash
# Full scan
time stella image scan myimg:old > full_old.json
time stella image scan myimg:new > full_new.json

# Diff-aware
time stella image scan myimg:new --diff-from myimg:old --cache .stella-cache > diff_new.json

stella parity check full_new.json diff_new.json > parity.md
```

**Metrics** (a small computation sketch follows this list)

* **Parity:** same actionable finding IDs (allowing dedup).
* **Speedup:** (full time) / (diff time).
* **Cache hit ratio:** reused layers/components.
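A sketch of those three metrics in C#; the finding-ID sets and timings would come from the JSON outputs above, whose shape is not prescribed here:

```csharp
public sealed record DiffMetrics(bool Parity, double Speedup, double CacheHitRatio);

public static class HarnessMetrics
{
    public static DiffMetrics Compute(
        IReadOnlySet<string> fullFindingIds, // actionable IDs from full_new.json
        IReadOnlySet<string> diffFindingIds, // actionable IDs from diff_new.json
        TimeSpan fullScan, TimeSpan diffScan,
        int reusedLayers, int totalLayers)
        => new(
            Parity: fullFindingIds.SetEquals(diffFindingIds),
            Speedup: fullScan.TotalSeconds / diffScan.TotalSeconds,
            CacheHitRatio: totalLayers == 0 ? 0 : (double)reusedLayers / totalLayers);
}
```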
**Pass criteria**

* 100% actionable parity on modified images.
* ≥3× faster on typical "small change" commits; no worse than a full scan on cache misses.

---

## What you'll publish (deliverables)

* `VALIDATION_PLAN.md` — the steps above with fixed seeds (image digests, repo SHAs).
* `harness/` — scripts to run the A/B and diff tests, export bundles, and verify.
* `results/YYYY-MM/` — raw JSONL, parity reports, timing tables, and a 1-page summary.
* `policy/` — the locked policy + feed hashes used in the runs.

---

## Nice-to-have extras

* **Reachability/VEX gallery:** a few "before/after" call graphs and suppression cards.
* **Auditor mode:** `stella audit open stella-proof-bundle.tgz` → a read-only UI that renders the SBOM, VEX, signatures, Rekor proof, and replay log.
* **CI examples:** GitLab/GitHub YAML snippets for full vs. diff jobs with caching.

If you want, I can spit out the repo-ready scaffold (folders, stub scripts, sample policies) tailored to your .NET 10 + Docker setup so you can run this tonight.