feat: Add Scanner CI runner and related artifacts

- Implemented `run-scanner-ci.sh` to build and run tests for the Scanner solution with a warmed NuGet cache.
- Created `excititor-vex-traces.json` dashboard for monitoring Excititor VEX observations.
- Added Docker Compose configuration for the OTLP span sink in `docker-compose.spansink.yml`.
- Configured OpenTelemetry collector in `otel-spansink.yaml` to receive and process traces.
- Developed `run-spansink.sh` script to run the OTLP span sink for Excititor traces.
- Introduced `FileSystemRiskBundleObjectStore` for storing risk bundle artifacts in the filesystem.
- Built `RiskBundleBuilder` for creating risk bundles with associated metadata and providers.
- Established `RiskBundleJob` to execute the risk bundle creation and storage process.
- Defined models for risk bundle inputs, entries, and manifests in `RiskBundleModels.cs`.
- Implemented signing functionality for risk bundle manifests with `HmacRiskBundleManifestSigner`.
- Created unit tests for `RiskBundleBuilder`, `RiskBundleJob`, and signing functionality to ensure correctness.
- Added filesystem artifact reader tests to validate manifest parsing and artifact listing.
- Included test manifests for egress scenarios in the task runner tests.
- Developed timeline query service tests to verify tenant and event ID handling.
StellaOps Bot
2025-11-30 19:12:35 +02:00
parent 17d45a6d30
commit 71e9a56cfd
92 changed files with 2596 additions and 387 deletions

@@ -13,6 +13,7 @@ Deterministic, reproducible benchmark for reachability analysis tools.
- `benchmark/truth/` — ground-truth labels (hidden/internal split optional).
- `benchmark/submissions/` — sample submissions and format reference.
- `tools/scorer/` — `rb-score` CLI and tests.
- `tools/build/` — `build_all.py` (run all cases) and `validate_builds.py` (run twice and compare hashes).
- `baselines/` — reference runners (Semgrep, CodeQL, Stella) with normalized outputs.
- `ci/` — deterministic CI workflows and scripts.
- `website/` — static site (leaderboard/docs/downloads).
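
`validate_builds.py` is described above as running each build twice and comparing hashes. A minimal sketch of that kind of check, assuming a build is a callable that writes into an output directory; `tree_digest` and `is_deterministic` are hypothetical names and are not taken from the repository:

```python
import hashlib
from pathlib import Path
from typing import Callable


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest over relative paths and contents of files under root."""
    h = hashlib.sha256()
    # Sort so the digest does not depend on filesystem enumeration order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def is_deterministic(build: Callable[[Path], None], out_a: Path, out_b: Path) -> bool:
    """Run the (hypothetical) build callable twice and compare the two output trees."""
    build(out_a)
    build(out_b)
    return tree_digest(out_a) == tree_digest(out_b)
```

Hashing relative paths together with file contents means a renamed or missing artifact changes the digest, not only a changed byte.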

@@ -28,6 +28,8 @@ python -m pip install -r requirements.txt
python -m unittest tests/test_scoring.py
```

Explainability tiers (task 513-009) are covered by `test_explainability_tiers` in `tests/test_scoring.py`.

## Notes
- Predictions for sinks not present in truth count as false positives (strict posture).
- Truth sinks with label `unknown` are ignored for FN/FP counting.
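
The two notes above can be sketched as a small counting routine. This is an illustration only, not the `rb-score` implementation: `count_confusion` is a hypothetical helper, the `unreachable` label is an assumption, and the sketch assumes only `reachable` predictions count as positives.

```python
from typing import Dict, Tuple


def count_confusion(truth: Dict[str, str], preds: Dict[str, str]) -> Tuple[int, int, int]:
    """Return (tp, fp, fn) for 'reachable' predictions against truth labels."""
    tp = fp = fn = 0
    for sink_id, pred in preds.items():
        label = truth.get(sink_id)  # None when the sink is absent from truth
        if label == "unknown":
            continue  # unknown truth sinks are ignored for FP counting
        if pred == "reachable":
            if label == "reachable":
                tp += 1
            else:
                fp += 1  # strict posture: sinks absent from truth land here too
    for sink_id, label in truth.items():
        # 'unknown' labels never reach this branch, so they are ignored for FN counting.
        if label == "reachable" and preds.get(sink_id) != "reachable":
            fn += 1
    return tp, fp, fn
```

For example, with truth `{"a": "reachable", "c": "unknown"}` and predictions marking `a`, `c`, and an extra sink `x` as reachable, the sketch yields one true positive, one false positive (`x`), and no false negatives.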

@@ -65,6 +65,37 @@ class TestScoring(unittest.TestCase):
        self.assertEqual(report.f1, 0.0)
        self.assertEqual(report.determinism_rate, 1.0)

    def test_explainability_tiers(self):
        # Build synthetic predictions to exercise explainability tiers 0-3
        preds = [
            {"sink_id": "a", "prediction": "reachable", "explain": {}},  # tier 0
            {"sink_id": "b", "prediction": "reachable", "explain": {"path": ["f1", "f2"]}},  # tier 1
            {"sink_id": "c", "prediction": "reachable", "explain": {"entry": "E", "path": ["f1", "f2", "f3"]}},  # tier 2
            {"sink_id": "d", "prediction": "reachable", "explain": {"guards": ["x"], "path": ["f1", "f2"]}},  # tier 3
        ]

        # Minimal truth to allow scoring
        truth_doc = {
            "version": "1.0.0",
            "cases": [
                {
                    "case_id": "case-1",
                    "sinks": [
                        {"sink_id": s, "label": "reachable"} for s in ["a", "b", "c", "d"]
                    ],
                }
            ],
        }

        submission = {
            "version": "1.0.0",
            "tool": {"name": "t", "version": "1"},
            "run": {"platform": "x"},
            "cases": [{"case_id": "case-1", "sinks": preds}],
        }

        report = rb_score.score(truth_doc, submission)

        # explainability average should be (0+1+2+3)/4 = 1.5
        self.assertAlmostEqual(report.explain_avg, 1.5, places=4)


if __name__ == "__main__":
    unittest.main()
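
The tier comments in the test above suggest how an `explain` object might map to a tier. The sketch below is one mapping consistent with those four fixtures; `explain_tier` and `explain_average` are hypothetical names, and the real rules in `rb_score` (for example any path-length thresholds) may differ.

```python
from typing import Any, Dict, List


def explain_tier(explain: Dict[str, Any]) -> int:
    """Map an explain object to a tier 0-3 (assumed rules, inferred from the test fixtures)."""
    if explain.get("guards"):
        return 3  # guards present
    if explain.get("entry") and explain.get("path"):
        return 2  # entry point plus a path
    if explain.get("path"):
        return 1  # path only
    return 0  # no explanation


def explain_average(sinks: List[Dict[str, Any]]) -> float:
    """Average the tier over all predicted sinks; 0.0 for an empty list."""
    if not sinks:
        return 0.0
    return sum(explain_tier(s.get("explain", {})) for s in sinks) / len(sinks)
```

Under these assumed rules the four fixtures score 0, 1, 2, and 3, which matches the 1.5 average the test asserts.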