feat: Add VEX Lens CI and Load Testing Plan
- Introduced a comprehensive CI job structure for VEX Lens, including build, test, linting, and load testing.
- Defined load test parameters and SLOs for VEX Lens API and Issuer Directory.
- Created Grafana dashboards and alerting mechanisms for monitoring API performance and error rates.
- Established offline posture guidelines for CI jobs and load testing.

feat: Implement deterministic projection verification script
- Added `verify_projection.sh` script for verifying the integrity of projection exports against expected hashes.
- Ensured robust error handling for missing files and hash mismatches.

feat: Develop Vuln Explorer CI and Ops Plan
- Created CI jobs for Vuln Explorer, including build, test, and replay verification.
- Implemented backup and disaster recovery strategies for MongoDB and Redis.
- Established Merkle anchoring verification and automation for ledger projector.

feat: Introduce EventEnvelopeHasher for hashing event envelopes
- Implemented `EventEnvelopeHasher` to compute SHA256 hashes for event envelopes.

feat: Add Risk Store and Dashboard components
- Developed `RiskStore` for managing risk data and state.
- Created `RiskDashboardComponent` for displaying risk profiles with filtering capabilities.
- Implemented unit tests for `RiskStore` and `RiskDashboardComponent`.

feat: Enhance Vulnerability Detail Component
- Developed `VulnerabilityDetailComponent` for displaying detailed information about vulnerabilities.
- Implemented error handling for missing vulnerability IDs and loading failures.
ops/devops/docker/Dockerfile.hardened.template (new file, 53 lines)
@@ -0,0 +1,53 @@
# syntax=docker/dockerfile:1.7
# Hardened multi-stage template for StellaOps services
# Parameters are build-time ARGs so this file can be re-used across services.

ARG SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:10.0-bookworm-slim
ARG RUNTIME_IMAGE=mcr.microsoft.com/dotnet/aspnet:10.0-bookworm-slim
ARG APP_PROJECT=src/Service/Service.csproj
ARG CONFIGURATION=Release
ARG PUBLISH_DIR=/app/publish
ARG APP_USER=stella
ARG APP_UID=10001
ARG APP_GID=10001
ARG APP_PORT=8080

FROM ${SDK_IMAGE} AS build
# Re-declare the ARGs used inside this stage; ARGs defined before the first FROM
# are only in scope for FROM lines unless repeated here.
ARG APP_PROJECT
ARG CONFIGURATION
ARG PUBLISH_DIR
ENV DOTNET_CLI_TELEMETRY_OPTOUT=1 \
    DOTNET_NOLOGO=1 \
    SOURCE_DATE_EPOCH=1704067200
WORKDIR /src
# Expect restore sources to be available offline via local-nugets/
COPY . .
RUN dotnet restore ${APP_PROJECT} --packages /src/local-nugets && \
    dotnet publish ${APP_PROJECT} -c ${CONFIGURATION} -o ${PUBLISH_DIR} \
      /p:UseAppHost=true /p:PublishTrimmed=false

FROM ${RUNTIME_IMAGE} AS runtime
# Re-declare the ARGs used inside this stage (same reason as above).
ARG APP_USER
ARG APP_UID
ARG APP_GID
ARG APP_PORT
ARG PUBLISH_DIR
# Create non-root user/group with stable ids for auditability
RUN groupadd -r -g ${APP_GID} ${APP_USER} && \
    useradd -r -u ${APP_UID} -g ${APP_GID} -d /var/lib/${APP_USER} ${APP_USER} && \
    mkdir -p /app /var/lib/${APP_USER} /var/run/${APP_USER} /tmp && \
    chown -R ${APP_UID}:${APP_GID} /app /var/lib/${APP_USER} /var/run/${APP_USER} /tmp

WORKDIR /app
COPY --from=build --chown=${APP_UID}:${APP_GID} ${PUBLISH_DIR}/ ./
# Ship healthcheck helper; callers may override with their own script
COPY --chown=${APP_UID}:${APP_GID} ops/devops/docker/healthcheck.sh /usr/local/bin/healthcheck.sh

ENV ASPNETCORE_URLS=http://+:${APP_PORT} \
    DOTNET_EnableDiagnostics=0 \
    DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1 \
    COMPlus_EnableDiagnostics=0

USER ${APP_UID}:${APP_GID}
EXPOSE ${APP_PORT}
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
  CMD /usr/local/bin/healthcheck.sh

# Harden filesystem; deploys should also set readOnlyRootFilesystem true
RUN chmod 500 /app && \
    find /app -maxdepth 1 -type f -exec chmod 400 {} \; && \
    find /app -maxdepth 1 -type d -exec chmod 500 {} \;

ENTRYPOINT ["./StellaOps.Service"]
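For reference, a per-service CI job would consume this template by passing build args. A minimal sketch follows; the project path, registry, and tag are illustrative placeholders, not values defined by this change:

```bash
# Sketch: build one service from the shared hardened template (values are placeholders).
# Note: the template's ENTRYPOINT expects the published app host to be named
# StellaOps.Service; adjust the entrypoint or project per service as needed.
docker build \
  --file ops/devops/docker/Dockerfile.hardened.template \
  --build-arg APP_PROJECT=src/Policy/StellaOps.Policy/StellaOps.Policy.csproj \
  --build-arg APP_PORT=8080 \
  --tag registry.local/stellaops/policy:dev \
  .
```
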
ops/devops/docker/base-image-guidelines.md (new file, 68 lines)
@@ -0,0 +1,68 @@
# Docker hardening blueprint (DOCKER-44-001)

Use this template for core services (API, Console, Orchestrator, Task Runner, Concelier, Excititor, Policy, Notify, Export, AdvisoryAI).

The reusable multi-stage scaffold lives at `ops/devops/docker/Dockerfile.hardened.template` and expects:
- .NET 10 SDK/runtime images provided via offline mirror (`SDK_IMAGE` / `RUNTIME_IMAGE`).
- `APP_PROJECT` path to the service csproj.
- `healthcheck.sh` copied from `ops/devops/docker/` (already referenced by the template).

Copy the template next to the service and set build args in CI (per-service matrix) to avoid maintaining divergent Dockerfiles.

```Dockerfile
# syntax=docker/dockerfile:1.7
ARG SDK_IMAGE=mcr.microsoft.com/dotnet/sdk:10.0-bookworm-slim
ARG RUNTIME_IMAGE=mcr.microsoft.com/dotnet/aspnet:10.0-bookworm-slim
ARG APP_PROJECT=src/Service/Service.csproj
ARG CONFIGURATION=Release
ARG APP_USER=stella
ARG APP_UID=10001
ARG APP_GID=10001
ARG APP_PORT=8080

FROM ${SDK_IMAGE} AS build
# Pre-FROM ARGs must be re-declared to be visible inside a stage.
ARG APP_PROJECT
ARG CONFIGURATION
ENV DOTNET_CLI_TELEMETRY_OPTOUT=1 DOTNET_NOLOGO=1 SOURCE_DATE_EPOCH=1704067200
WORKDIR /src
COPY . .
RUN dotnet restore ${APP_PROJECT} --packages /src/local-nugets && \
    dotnet publish ${APP_PROJECT} -c ${CONFIGURATION} -o /app/publish /p:UseAppHost=true /p:PublishTrimmed=false

FROM ${RUNTIME_IMAGE} AS runtime
ARG APP_USER
ARG APP_UID
ARG APP_GID
ARG APP_PORT
RUN groupadd -r -g ${APP_GID} ${APP_USER} && \
    useradd -r -u ${APP_UID} -g ${APP_GID} -d /var/lib/${APP_USER} ${APP_USER}
WORKDIR /app
COPY --from=build --chown=${APP_UID}:${APP_GID} /app/publish/ ./
COPY --chown=${APP_UID}:${APP_GID} ops/devops/docker/healthcheck.sh /usr/local/bin/healthcheck.sh
ENV ASPNETCORE_URLS=http://+:${APP_PORT} \
    DOTNET_EnableDiagnostics=0 \
    DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1 \
    COMPlus_EnableDiagnostics=0
USER ${APP_UID}:${APP_GID}
EXPOSE ${APP_PORT}
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 CMD /usr/local/bin/healthcheck.sh
RUN chmod 500 /app && find /app -maxdepth 1 -type f -exec chmod 400 {} \; && find /app -maxdepth 1 -type d -exec chmod 500 {} \;
ENTRYPOINT ["./StellaOps.Service"]
```

Build stage (per service) should:
- Use `mcr.microsoft.com/dotnet/sdk:10.0-bookworm-slim` (or mirror) with `DOTNET_CLI_TELEMETRY_OPTOUT=1`.
- Restore from `local-nugets/` (offline) and run `dotnet publish -c Release -o /app/out`.
- Set `SOURCE_DATE_EPOCH` to freeze timestamps.

Required checks:
- No `root` user in final image.
- `CAP_NET_RAW` dropped (default with non-root).
- Read-only rootfs enforced at deploy time (`securityContext.readOnlyRootFilesystem: true` in Helm/Compose).
- Health endpoints exposed: `/health/liveness`, `/health/readiness`, `/version`, `/metrics`.
- Image SBOM generated (syft) in pipeline; attach cosign attestations (see DOCKER-44-002).
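A minimal sketch of how the build-time parts of these required checks could be asserted in CI against a built image (the image reference is a placeholder; the read-only rootfs itself is enforced at deploy time, so only image properties are inspected here):

```bash
#!/usr/bin/env bash
# Sketch: assert build-time hardening properties of an image.
set -euo pipefail
IMAGE=${1:?"image ref required"}   # e.g. registry.local/stellaops/policy:dev (placeholder)

# The final image must not run as root.
user=$(docker inspect --format '{{.Config.User}}' "$IMAGE")
if [[ -z "$user" || "$user" == "root" || "$user" == "0" ]]; then
  echo "[error] image runs as root (USER='${user}')" >&2
  exit 1
fi

# The template bakes in a HEALTHCHECK; fail if it is missing.
if [[ "$(docker inspect --format '{{if .Config.Healthcheck}}present{{end}}' "$IMAGE")" != "present" ]]; then
  echo "[error] no HEALTHCHECK defined in image" >&2
  exit 1
fi
echo "[ok] non-root user '${user}' and HEALTHCHECK present"
```
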
SBOM & attestation helper (DOCKER-44-002):
- Script: `ops/devops/docker/sbom_attest.sh <image> [out-dir] [cosign-key]`
- Emits SPDX (`*.spdx.json`) and CycloneDX (`*.cdx.json`) with `SOURCE_DATE_EPOCH` pinned for reproducibility.
- Attaches both as cosign attestations (`--type spdx` / `--type cyclonedx`); supports keyless when `COSIGN_EXPERIMENTAL=1` or explicit PEM key.
- Integrate in CI after image build/push; keep registry creds offline-friendly (use local registry mirror during air-gapped builds).

Health endpoint verification (DOCKER-44-003):
- Script: `ops/devops/docker/verify_health_endpoints.sh <image> [port]` spins up a container, checks `/health/liveness`, `/health/readiness`, `/version`, `/metrics`, and warns if `/capabilities.merge` is not `false` (for Concelier/Excititor).
- Run in CI after publishing the image; requires `docker` and `curl` (or `wget`).
- Endpoint contract and ASP.NET wiring examples live in `ops/devops/docker/health-endpoints.md`; service owners should copy the snippet and ensure readiness checks cover DB/cache/bus.
ops/devops/docker/health-endpoints.md (new file, 44 lines)
@@ -0,0 +1,44 @@
# Health & capability endpoint contract (DOCKER-44-003)

Target services: API, Console, Orchestrator, Task Runner, Concelier, Excititor, Policy, Notify, Export, AdvisoryAI.

## HTTP paths
- `GET /health/liveness` — fast, dependency-free check; returns `200` and minimal body.
- `GET /health/readiness` — may hit critical deps (DB, bus, cache); returns `503` when not ready.
- `GET /version` — static payload with `service`, `version`, `commit`, `buildTimestamp` (ISO-8601 UTC), `source` (channel).
- `GET /metrics` — Prometheus text exposition; reuse existing instrumentation.
- `GET /capabilities` — if present for Concelier/Excititor, must include `"merge": false`.

## Minimal ASP.NET 10 wiring (per service)
```csharp
var builder = WebApplication.CreateBuilder(args);
// health checks; add real checks as needed
builder.Services.AddHealthChecks();
var app = builder.Build();

app.MapHealthChecks("/health/liveness", new() { Predicate = _ => false });
app.MapHealthChecks("/health/readiness");

app.MapGet("/version", () => Results.Json(new {
    service = "StellaOps.Policy", // override per service
    version = ThisAssembly.AssemblyInformationalVersion,
    commit = ThisAssembly.Git.Commit,
    buildTimestamp = ThisAssembly.Git.CommitDate.UtcDateTime,
    source = Environment.GetEnvironmentVariable("STELLA_CHANNEL") ?? "edge"
}));

app.UseHttpMetrics();
app.MapMetrics();

app.Run();
```
- Ensure `ThisAssembly.*` source generators are enabled or substitute build vars.
- Keep `/health/liveness` lightweight; `/health/readiness` should test critical dependencies (Mongo, Redis, message bus) with timeouts.
- When adding `/capabilities`, explicitly emit `merge = false` for Concelier/Excititor.

## CI verification
- After publishing an image, run `ops/devops/docker/verify_health_endpoints.sh <image> [port]`.
- CI should fail if any required endpoint is missing or non-200.

## Deployment
- Helm/Compose should set `readOnlyRootFilesystem: true` and wire readiness/liveness probes to these paths/port.
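To complement the CI verification above, a quick manual spot-check of the contract against a locally running service might look like the following; the port and the use of `jq` are assumptions (the CI script only asserts status codes):

```bash
#!/usr/bin/env bash
# Sketch: manual spot-check of the endpoint contract on a locally running service.
set -euo pipefail
BASE="http://127.0.0.1:${PORT:-8080}"

curl -fsS "${BASE}/health/liveness" >/dev/null && echo "liveness ok"
curl -fsS "${BASE}/health/readiness" >/dev/null && echo "readiness ok"

# /version must carry the documented fields.
curl -fsS "${BASE}/version" \
  | jq -e '.service and .version and .commit and .buildTimestamp and .source' >/dev/null \
  && echo "version payload ok"

# For Concelier/Excititor: merge must be false when /capabilities exists.
if caps=$(curl -fsS "${BASE}/capabilities" 2>/dev/null); then
  echo "$caps" | jq -e '.merge == false' >/dev/null || echo "[warn] merge flag is not false" >&2
fi
```
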
ops/devops/docker/healthcheck.sh (new file, 24 lines)
@@ -0,0 +1,24 @@
#!/bin/sh
set -eu
HOST="${HEALTH_HOST:-127.0.0.1}"
PORT="${HEALTH_PORT:-8080}"
LIVENESS_PATH="${LIVENESS_PATH:-/health/liveness}"
READINESS_PATH="${READINESS_PATH:-/health/readiness}"
USER_AGENT="stellaops-healthcheck"

fetch() {
  target_path="$1"
  # BusyBox wget is available in Alpine; curl not assumed.
  wget -qO- "http://${HOST}:${PORT}${target_path}" \
    --header="User-Agent: ${USER_AGENT}" \
    --timeout="${HEALTH_TIMEOUT:-4}" >/dev/null
}

fail=0
if ! fetch "$LIVENESS_PATH"; then
  fail=1
fi
if ! fetch "$READINESS_PATH"; then
  fail=1
fi
exit "$fail"
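As a usage note, the defaults can be overridden per service at run time and the helper exercised directly; the image name, port, and path below are illustrative only:

```bash
# Sketch: run a service with non-default health settings and exercise the baked-in helper.
docker run -d --name policy-smoke \
  -e HEALTH_PORT=9090 -e READINESS_PATH=/health/ready \
  registry.local/stellaops/policy:dev        # image name is a placeholder
docker exec policy-smoke /usr/local/bin/healthcheck.sh && echo "healthcheck passed"
docker rm -f policy-smoke
```
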
ops/devops/docker/sbom_attest.sh (new file, 48 lines)
@@ -0,0 +1,48 @@
#!/usr/bin/env bash
# Deterministic SBOM + attestation helper for DOCKER-44-002
# Usage: ./sbom_attest.sh <image-ref> [output-dir] [cosign-key]
# - image-ref: fully qualified image (e.g., ghcr.io/stellaops/policy:1.2.3)
# - output-dir: defaults to ./sbom
# - cosign-key: path to cosign key (PEM). If omitted, uses keyless if allowed (COSIGN_EXPERIMENTAL=1)

set -euo pipefail
IMAGE_REF=${1:?"image ref required"}
OUT_DIR=${2:-sbom}
COSIGN_KEY=${3:-}

mkdir -p "${OUT_DIR}"

# Normalize filename (replace / and : with _)
name_safe() {
  echo "$1" | tr '/:' '__'
}

BASENAME=$(name_safe "${IMAGE_REF}")
SPDX_JSON="${OUT_DIR}/${BASENAME}.spdx.json"
CDX_JSON="${OUT_DIR}/${BASENAME}.cdx.json"
ATTESTATION="${OUT_DIR}/${BASENAME}.sbom.att"

# Freeze timestamps for reproducibility
export SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-1704067200}

# Generate SPDX JSON (syft output is stable and offline-friendly)
syft "${IMAGE_REF}" -o spdx-json > "${SPDX_JSON}"
# Generate CycloneDX JSON
syft "${IMAGE_REF}" -o cyclonedx-json > "${CDX_JSON}"

# Attach SBOMs as cosign attestations (one per format)
export COSIGN_EXPERIMENTAL=${COSIGN_EXPERIMENTAL:-1}
COSIGN_ARGS=("attest" "--predicate" "${SPDX_JSON}" "--type" "spdx" "${IMAGE_REF}")
if [[ -n "${COSIGN_KEY}" ]]; then
  COSIGN_ARGS+=("--key" "${COSIGN_KEY}")
fi
cosign "${COSIGN_ARGS[@]}"

COSIGN_ARGS=("attest" "--predicate" "${CDX_JSON}" "--type" "cyclonedx" "${IMAGE_REF}")
if [[ -n "${COSIGN_KEY}" ]]; then
  COSIGN_ARGS+=("--key" "${COSIGN_KEY}")
fi
cosign "${COSIGN_ARGS[@]}"

echo "SBOMs written to ${SPDX_JSON} and ${CDX_JSON}" >&2
echo "Attestations pushed for ${IMAGE_REF}" >&2
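After the attestations are pushed, a CI gate or an operator can check them. A minimal key-based sketch follows; the public key path and image reference are placeholders, and a keyless flow would instead need the certificate identity flags:

```bash
# Sketch: verify the attached SBOM attestations for an image (key-based flow).
IMAGE=registry.local/stellaops/policy:1.2.3   # placeholder
cosign verify-attestation --key cosign.pub --type spdx      "$IMAGE" >/dev/null && echo "spdx attestation ok"
cosign verify-attestation --key cosign.pub --type cyclonedx "$IMAGE" >/dev/null && echo "cyclonedx attestation ok"
```
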
ops/devops/docker/verify_health_endpoints.sh (new file, 70 lines)
@@ -0,0 +1,70 @@
#!/usr/bin/env bash
# Smoke-check /health and capability endpoints for a built image (DOCKER-44-003)
# Usage: ./verify_health_endpoints.sh <image-ref> [port]
# Requires: docker, curl or wget
set -euo pipefail
IMAGE=${1:?"image ref required"}
PORT=${2:-8080}
CONTAINER_NAME="healthcheck-$$"
TIMEOUT=30
SLEEP=1

have_curl=1
if ! command -v curl >/dev/null 2>&1; then
  have_curl=0
fi

req() {
  local path=$1
  local url="http://127.0.0.1:${PORT}${path}"
  if [[ $have_curl -eq 1 ]]; then
    curl -fsS --max-time 3 "$url" >/dev/null
  else
    wget -qO- --timeout=3 "$url" >/dev/null
  fi
}

fetch_body() {
  local url="http://127.0.0.1:${PORT}$1"
  if [[ $have_curl -eq 1 ]]; then
    curl -fsS --max-time 3 "$url" 2>/dev/null || true
  else
    wget -qO- --timeout=3 "$url" 2>/dev/null || true
  fi
}

cleanup() {
  docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true
}
trap cleanup EXIT

echo "[info] starting container ${IMAGE} on port ${PORT}" >&2
cleanup
if ! docker run -d --rm --name "$CONTAINER_NAME" -p "${PORT}:${PORT}" "$IMAGE" >/dev/null; then
  echo "[error] failed to start image ${IMAGE}" >&2
  exit 1
fi

# wait for readiness
start=$(date +%s)
while true; do
  if req /health/liveness 2>/dev/null; then break; fi
  now=$(date +%s)
  if (( now - start > TIMEOUT )); then
    echo "[error] liveness endpoint did not come up in ${TIMEOUT}s" >&2
    exit 1
  fi
  sleep $SLEEP
done

# verify endpoints
fail=0
for path in /health/liveness /health/readiness /version /metrics; do
  if ! req "$path"; then
    echo "[error] missing or failing ${path}" >&2
    fail=1
  fi
done

# capability endpoint optional; if present ensure merge=false for Concelier/Excititor
if req /capabilities 2>/dev/null; then
  # Fetch the body with whichever HTTP client is available (not just curl).
  body="$(fetch_body /capabilities)"
  if ! echo "$body" | grep -q '"merge"[[:space:]]*:[[:space:]]*false'; then
    echo "[warn] /capabilities present but merge flag not false" >&2
  fi
fi

exit $fail
ops/devops/secrets/surface-secrets-provisioning.md (new file, 74 lines)
@@ -0,0 +1,74 @@
# Surface.Secrets provisioning playbook (OPS-SECRETS-01)

Audience: DevOps/Ops teams shipping Scanner/Zastava/Orchestrator bundles.
Scope: how to provision secrets for the `StellaOps.Scanner.Surface.Secrets` providers across Kubernetes, Docker Compose, and Offline Kit.

## Secret types (handles only)
- Registry pull creds (CAS / OCI / private feeds)
- CAS/attestation tokens
- TLS client certs for Surface.FS / RustFS (optional)
- Feature flag/token bundles used by Surface.Validation (non-sensitive payloads still go through handles)

All values are referenced via `secret://` handles inside service configs; plaintext never enters configs or SBOMs.

## Provider matrix
| Environment | Provider | Location | Notes |
| --- | --- | --- | --- |
| Kubernetes | `kubernetes` | Namespace-scoped `Secret` objects | Mount-free: providers read via API using service account; RBAC must allow `get/list` on the secret names. |
| Compose (connected) | `file` | Host-mounted path (e.g., `/etc/stellaops/secrets`) | Keep per-tenant subfolders; chmod 700 root; avoid embedding in images. |
| Airgap/Offline Kit | `file` | Unpacked bundle `surface-secrets/<tenant>/...` | Bundled as encrypted payloads; decrypt/unpack to the expected directory before first boot. |
| Tests | `inline` | Environment variables or minimal inline JSON | Only for unit/system tests; disable in prod (`SCANNER_SURFACE_SECRETS_ALLOW_INLINE=false`). |

## Kubernetes workflow
1) Namespace: choose one per environment (e.g., `stellaops-prod`).
2) Secret layout: one K8s Secret per tenant+component to keep RBAC narrow.
```
apiVersion: v1
kind: Secret
metadata:
  name: scanner-secrets-default
  namespace: stellaops-prod
stringData:
  registry.json: |
    { "type": "registry", "name": "default", "username": "svc", "password": "********", "scopes": ["stella/*"] }
  cas.json: |
    { "type": "cas-token", "name": "default", "token": "********" }
```
3) RBAC: service accounts for Scanner Worker/WebService and Zastava Observer/Webhook need `get/list` on these secrets.
4) Values: set in Helm via `surface.secrets.provider=kubernetes` and `surface.secrets.namespace=<ns>` (already templated in `values*.yaml`).

## Compose workflow
1) Create secrets directory (default `/etc/stellaops/secrets`).
2) Layout per schema (see `docs/modules/scanner/design/surface-secrets-schema.md`):
```
/etc/stellaops/secrets/
  tenants/default/registry/default.json
  tenants/default/cas/default.json
```
3) Set env in `.env` files:
```
SCANNER_SURFACE_SECRETS_PROVIDER=file
SCANNER_SURFACE_SECRETS_ROOT=/etc/stellaops/secrets
SCANNER_SURFACE_SECRETS_NAMESPACE=
SCANNER_SURFACE_SECRETS_ALLOW_INLINE=false
ZASTAVA_SURFACE_SECRETS_PROVIDER=${SCANNER_SURFACE_SECRETS_PROVIDER}
ZASTAVA_SURFACE_SECRETS_ROOT=${SCANNER_SURFACE_SECRETS_ROOT}
```
4) Ensure docker-compose mounts the secrets path read-only to the services that need it.

## Offline Kit workflow
- The offline kit already ships encrypted `surface-secrets` bundles (see `docs/24_OFFLINE_KIT.md`).
- Operators must: (a) decrypt using the provided key, (b) place contents under `/etc/stellaops/secrets` (or override `*_SURFACE_SECRETS_ROOT`), (c) keep permissions 700/600.
- Set `*_SURFACE_SECRETS_PROVIDER=file` and root path envs as in Compose; Kubernetes provider is not available offline.
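For the Compose and Offline Kit paths, a small provisioning sketch under the stated defaults; the tenant name is a placeholder and `install` is only used here to create directories/files with the documented 700/600 modes:

```bash
#!/usr/bin/env bash
# Sketch: create the file-provider layout with locked-down permissions.
set -euo pipefail
ROOT=/etc/stellaops/secrets
TENANT=default   # placeholder tenant

install -d -m 700 "$ROOT" "$ROOT/tenants" "$ROOT/tenants/$TENANT" \
  "$ROOT/tenants/$TENANT/registry" "$ROOT/tenants/$TENANT/cas"
install -m 600 /dev/null "$ROOT/tenants/$TENANT/registry/default.json"
install -m 600 /dev/null "$ROOT/tenants/$TENANT/cas/default.json"

# Populate the JSON payloads per docs/modules/scanner/design/surface-secrets-schema.md,
# then point the services at the root:
#   SCANNER_SURFACE_SECRETS_PROVIDER=file
#   SCANNER_SURFACE_SECRETS_ROOT=$ROOT
```
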
## Validation & observability
- Surface.Validation will fail readiness if required secrets are missing or malformed.
- Metrics/Logs: look for `surface.secrets.*` issue codes; readiness should fail on `Error` severities.
- For CI smoke: run service with `SURFACE_SECRETS_ALLOW_INLINE=true` and inject test secrets via env for deterministic integration tests.

## Quick checklist
- [ ] Provider selected per environment (`kubernetes`/`file`/`inline`)
- [ ] Secrets directory or namespace populated per schema
- [ ] RBAC (K8s) or file permissions (Compose/offline) locked down
- [ ] Env variables set for both Scanner (`SCANNER_*`) and Zastava (`ZASTAVA_*` prefixes)
- [ ] Readiness wired to Surface.Validation so missing secrets block rollout
ops/devops/telemetry/README.md (new file, 33 lines)
@@ -0,0 +1,33 @@
# Telemetry bundle verifier

Files:
- `verify-telemetry-bundle.sh`: offline verifier (checksums + optional JSON schema)
- `tests/sample-bundle/telemetry-bundle.json`: sample manifest
- `tests/sample-bundle/telemetry-bundle.sha256`: checksum list for sample bundle
- `tests/telemetry-bundle.tar`: deterministic sample bundle (ustar, mtime=0, owner/group 0)
- `tests/run-schema-tests.sh`: validates sample config against config schema
- `tests/ci-run.sh`: runs schema test + bundle verifier (use in CI)

Dependencies for full validation:
- `python` with `jsonschema` installed (`pip install jsonschema`)
- `tar`, `sha256sum`

Deterministic TAR flags used for sample bundle:
`tar --mtime=@0 --owner=0 --group=0 --numeric-owner --format=ustar`

Exit codes:
- 0 success
- 21 missing manifest/checksums
- 22 checksum mismatch
- 23 schema validation failed
- 64 usage error

Quick check:
```bash
./verify-telemetry-bundle.sh tests/telemetry-bundle.tar
```

CI suggestion:
```bash
ops/devops/telemetry/tests/ci-run.sh
```
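Since the verifier reports failure modes through the exit codes listed above, a CI wrapper could branch on them. A minimal sketch (the bundle path argument is illustrative; run from the repository root):

```bash
#!/usr/bin/env bash
# Sketch: map the verifier's documented exit codes to readable CI messages.
set -uo pipefail
ops/devops/telemetry/verify-telemetry-bundle.sh "${1:-ops/devops/telemetry/tests/telemetry-bundle.tar}"
rc=$?
case "$rc" in
  0)  echo "bundle ok" ;;
  21) echo "missing manifest/checksums" >&2 ;;
  22) echo "checksum mismatch" >&2 ;;
  23) echo "schema validation failed" >&2 ;;
  64) echo "usage error" >&2 ;;
  *)  echo "unexpected verifier exit code: $rc" >&2 ;;
esac
exit "$rc"
```
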
ops/devops/telemetry/tests/ci-run.sh (new file, 7 lines)
@@ -0,0 +1,7 @@
#!/usr/bin/env bash
set -euo pipefail
# Repo root: this script lives at ops/devops/telemetry/tests/, four levels below the root.
ROOT="$(cd "$(dirname "$0")/../../../.." && pwd)"
cd "$ROOT"
SCHEMA="$ROOT/docs/modules/telemetry/schemas/telemetry-bundle.schema.json"

"$ROOT/ops/devops/telemetry/tests/run-schema-tests.sh"
TELEMETRY_BUNDLE_SCHEMA="$SCHEMA" "$ROOT/ops/devops/telemetry/verify-telemetry-bundle.sh" "$ROOT/ops/devops/telemetry/tests/telemetry-bundle.tar"
ops/devops/telemetry/tests/config-valid.json (new file, 35 lines)
@@ -0,0 +1,35 @@
{
  "schemaVersion": "1.0.0",
  "hashAlgorithm": "sha256",
  "profiles": [
    {
      "name": "default",
      "description": "default profile",
      "collectorVersion": "otelcol/1.0.0",
      "cryptoProfile": "fips",
      "sealedMode": false,
      "allowlistedEndpoints": ["http://localhost:4318"],
      "exporters": [
        {
          "type": "otlp",
          "endpoint": "http://localhost:4318",
          "protocol": "http",
          "compression": "none",
          "enabled": true
        }
      ],
      "redactionPolicyUri": "https://example.com/redaction-policy.json",
      "sampling": {
        "strategy": "traceidratio",
        "seed": "0000000000000001",
        "rules": [
          {"match": "service.name == 'api'", "priority": 10, "sampleRate": 0.2}
        ]
      },
      "tenantRouting": {
        "attribute": "tenant.id",
        "quotasPerTenant": {"tenant-a": 1000}
      }
    }
  ]
}
ops/devops/telemetry/tests/make-sample.sh (new file, 9 lines)
@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/../" && pwd)"
BUNDLE_DIR="$ROOT/tests/sample-bundle"
mkdir -p "$BUNDLE_DIR"
cp "$ROOT/tests/manifest-valid.json" "$BUNDLE_DIR/telemetry-bundle.json"
(cd "$BUNDLE_DIR" && sha256sum telemetry-bundle.json > telemetry-bundle.sha256)
tar --mtime=@0 --owner=0 --group=0 --numeric-owner --format=ustar -C "$BUNDLE_DIR" -cf "$ROOT/tests/telemetry-bundle.tar" telemetry-bundle.json telemetry-bundle.sha256
echo "Wrote sample bundle to $ROOT/tests/telemetry-bundle.tar"
ops/devops/telemetry/tests/run-schema-tests.sh (new file, 19 lines)
@@ -0,0 +1,19 @@
#!/usr/bin/env bash
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/../../" && pwd)"
if ! command -v python >/dev/null 2>&1; then
  echo "python not found" >&2; exit 127
fi
if ! python -c 'import jsonschema' >/dev/null 2>&1; then
  echo "python jsonschema module not installed" >&2; exit 127
fi
# Note: the paths below are repo-root relative; invoke from the repository root (as ci-run.sh does).
python - <<'PY'
import json, pathlib
from jsonschema import validate
root = pathlib.Path('ops/devops/telemetry/tests')
config = json.loads((root / 'config-valid.json').read_text())
schema = json.loads(pathlib.Path('docs/modules/telemetry/schemas/telemetry-config.schema.json').read_text())
validate(config, schema)
print('telemetry-config schema ok')
PY
@@ -0,0 +1,26 @@
{
  "schemaVersion": "1.0.0",
  "bundleId": "00000000-0000-0000-0000-000000000001",
  "createdAt": "2025-12-01T00:00:00Z",
  "profileHash": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "collectorVersion": "otelcol/1.0.0",
  "sealedMode": true,
  "redactionManifest": "redaction-manifest.json",
  "manifestHashAlgorithm": "sha256",
  "timeAnchor": {
    "type": "rfc3161",
    "value": "dummy-token"
  },
  "artifacts": [
    {
      "path": "logs.ndjson",
      "sha256": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
      "mediaType": "application/x-ndjson",
      "size": 123
    }
  ],
  "dsseEnvelope": {
    "hash": "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc",
    "location": "bundle.dsse.json"
  }
}
@@ -0,0 +1 @@
6e3fedbf183aece5dfa14a90ebce955e2887d36747c424e628dc2cc03bcb0ed3 telemetry-bundle.json
ops/devops/telemetry/tests/telemetry-bundle.tar (new binary file, not shown)
ops/devops/telemetry/verify-telemetry-bundle.sh (modified)
@@ -9,8 +9,11 @@ set -euo pipefail
# 23 schema validation failed

BUNDLE=${1:-}
SCHEMA_PATH=${TELEMETRY_BUNDLE_SCHEMA:-}

if [[ -z "$BUNDLE" ]]; then
  echo "Usage: $0 path/to/telemetry-bundle.tar" >&2
  echo "Optional: set TELEMETRY_BUNDLE_SCHEMA=/abs/path/to/telemetry-bundle.schema.json" >&2
  exit 64
fi

@@ -38,9 +41,13 @@ popd >/dev/null

# JSON schema validation (optional if jsonschema not present).
if command -v python >/dev/null 2>&1; then
  SCHEMA_DIR="$(cd "$(dirname "$0")/../../docs/modules/telemetry/schemas" && pwd)"
  SCHEMA_FILE="$SCHEMA_DIR/telemetry-bundle.schema.json"
  if [[ -f "$SCHEMA_FILE" ]]; then
  SCHEMA_FILE="$SCHEMA_PATH"
  if [[ -z "$SCHEMA_FILE" ]]; then
    SCHEMA_DIR="$(cd "$(dirname "$0")/../../docs/modules/telemetry/schemas" 2>/dev/null || echo "")"
    SCHEMA_FILE="$SCHEMA_DIR/telemetry-bundle.schema.json"
  fi

  if [[ -n "$SCHEMA_FILE" && -f "$SCHEMA_FILE" ]]; then
    python - "$MANIFEST" "$SCHEMA_FILE" <<'PY'
import json, sys
from jsonschema import validate, Draft202012Validator
ops/devops/vex/vex-ci-loadtest-plan.md (new file, 54 lines)
@@ -0,0 +1,54 @@
# VEX Lens CI + Load/Obs Plan (DEVOPS-VEX-30-001)

Scope: CI jobs, load/perf tests, dashboards, and alerts for VEX Lens API and Issuer Directory.
Assumptions: offline-friendly mirrors available; VEX Lens uses Mongo + Redis; Issuer Directory uses Mongo + OIDC.

## CI Jobs (Gitea workflow template)
- `build-vex`: dotnet restore/build for `src/VexLens/StellaOps.VexLens`, cache `local-nugets/`, set `DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1`.
- `test-vex`: `dotnet test` VexLens and Issuer Directory tests with `DOTNET_DISABLE_BUILTIN_GRAPH=1` to avoid graph fan-out; publish TRX + coverage.
- `lint-spec`: validate VEX OpenAPI/JSON schema snapshots (run `dotnet tool run spec-validation`).
- `sbom+attest`: reuse `ops/devops/docker/sbom_attest.sh` after image build; push attestations.
- `loadtest`: run k6 (or oha) scenario against ephemeral stack via compose profile:
  - startup with Mongo/Redis fixtures from `samples/vex/fixtures/*.json`.
  - endpoints: `/vex/entries?tenant=…`, `/issuer-directory/issuers`, `/issuer-directory/statistics`.
  - SLOs: p95 < 250ms for reads, error rate < 0.5%.
  - artifacts: `results.json` + Prometheus remote-write if enabled.

## Load Test Shape (k6 sketch)
- 5 min ramp to 200 VUs, 10 min steady, 2 min ramp-down.
- Mix: 70% list queries (pagination), 20% filtered queries (product, severity), 10% issuer stats.
- Headers: tenant header (`X-StellaOps-Tenant`), auth token from seeded issuer.
- Fixtures: seed 100k VEX statements, 5k issuers, mixed disputed/verified statuses.
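The ramp profile above could be driven from the vendored binary roughly as follows; the scenario script path, base URL, tenant, and token variable are placeholders (the real k6 scenario is not part of this change), and the `--stage` flags simply mirror the documented ramp:

```bash
# Sketch: run the documented ramp with the vendored k6 binary (all values are placeholders).
BASE_URL=http://localhost:8447 TENANT=tenant-default TOKEN="$(cat .secrets/vex-token)" \
  tools/k6/k6 run \
    --stage 5m:200 --stage 10m:200 --stage 2m:0 \
    --summary-export ops/devops/vex/results.json \
    ops/devops/vex/k6/vex_read_mix.js
```
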
## Dashboards (Grafana)
Panels to add under folder `StellaOps / VEX`:
- API latency: p50/p95/p99 for `/vex/entries`, `/issuer-directory/*`.
- Error rates by status code and tenant.
- Query volume and cache hit rate (Redis, if used).
- Mongo metrics: `mongodb_driver_commands_seconds` (p95), connection pool usage.
- Background jobs: ingestion/GC queue latency and failures.

## Alerts
- `vex_api_latency_p95_gt_250ms` for 5m.
- `vex_api_error_rate_gt_0.5pct` for 5m.
- `issuer_directory_cache_miss_rate_gt_20pct` for 15m (if cache enabled).
- `mongo_pool_exhausted` when pool usage > 90% for 5m.

## Offline / air-gap posture
- Use mirrored images and `local-nugets/` only; no outbound fetch in CI jobs.
- k6 binary vendored under `tools/k6/` (add to cache) or use `oha` from `tools/oha/`.
- Load test fixtures stored in repo under `samples/vex/fixtures/` to avoid network pulls.

## How to run locally
```
# build and test
DOTNET_DISABLE_BUILTIN_GRAPH=1 dotnet test src/VexLens/StellaOps.VexLens.Tests/StellaOps.VexLens.Tests.csproj
# run loadtest (requires docker + k6)
make -f ops/devops/Makefile vex-loadtest
```

## Evidence to attach
- TRX + coverage
- k6 `results.json`/`summary.txt`
- Grafana dashboard JSON export (`dashboards/vex/*.json`)
- Alert rules file (`ops/devops/vex/alerts.yaml` when created)
ops/devops/vuln/verify_projection.sh (new file, 25 lines)
@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Deterministic projection verification for DEVOPS-VULN-29-001/002
# Usage: ./verify_projection.sh [projection-export.json] [expected-hash-file]
set -euo pipefail
PROJECTION=${1:-samples/vuln/events/projection.json}
EXPECTED_HASH_FILE=${2:-ops/devops/vuln/expected_projection.sha256}

if [[ ! -f "$PROJECTION" ]]; then
  echo "projection file not found: $PROJECTION" >&2
  exit 1
fi
if [[ ! -f "$EXPECTED_HASH_FILE" ]]; then
  echo "expected hash file not found: $EXPECTED_HASH_FILE" >&2
  exit 1
fi

calc_hash=$(sha256sum "$PROJECTION" | awk '{print $1}')
expected_hash=$(cut -d' ' -f1 "$EXPECTED_HASH_FILE")

if [[ "$calc_hash" != "$expected_hash" ]]; then
  echo "mismatch: projection hash $calc_hash expected $expected_hash" >&2
  exit 2
fi

echo "projection hash matches ($calc_hash)" >&2
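When the canonical projection export changes intentionally, the pinned hash has to be refreshed from the fixture; a one-line sketch using the same default paths as the script above:

```bash
# Regenerate the pinned hash after an intentional projection change, then re-run the verifier.
sha256sum samples/vuln/events/projection.json > ops/devops/vuln/expected_projection.sha256
ops/devops/vuln/verify_projection.sh
```
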
ops/devops/vuln/vuln-explorer-ci-plan.md (new file, 43 lines)
@@ -0,0 +1,43 @@
# Vuln Explorer CI + Ops Plan (DEVOPS-VULN-29-001)

Scope: CI jobs, backup/DR, Merkle anchoring monitoring, and verification automation for the Vuln Explorer ledger projector and API.
Assumptions: Vuln Explorer API uses MongoDB + Redis; ledger projector performs replay into materialized views; Merkle tree anchoring to transparency log.

## CI Jobs
- `build-vuln`: dotnet restore/build for `src/VulnExplorer/StellaOps.VulnExplorer.Api` and projector; use `DOTNET_DISABLE_BUILTIN_GRAPH=1` and `local-nugets/`.
- `test-vuln`: focused tests with `dotnet test src/VulnExplorer/__Tests/...` and `--filter Category!=GraphHeavy`; publish TRX + coverage.
- `replay-smoke`: run projector against fixture event log (`samples/vuln/events/replay.ndjson`) and assert deterministic materialized view hash; fail on divergence.
- `sbom+attest`: reuse `ops/devops/docker/sbom_attest.sh` post-build.

## Backup & DR
- Mongo: enable point-in-time snapshots (if available) or nightly `mongodump` of the `vuln_explorer` db; store in object storage with retention 30d (see the backup sketch at the end of this plan).
- Redis (if used for cache): not authoritative; no backup required.
- Replay-first recovery: keep latest event log snapshot in `release artifacts`; replay task rehydrates materialized views.

## Merkle Anchoring Verification
- Monitor projector metrics: `ledger_projection_lag_seconds`, `ledger_projection_errors_total`.
- Add periodic job `verify-merkle`: fetch latest Merkle root from projector state, cross-check against transparency log (`rekor` or configured log) using `cosign verify-tree` or custom verifier.
- Alert when last anchored root age > 15m or mismatch detected.

## Verification Automation
- Script `ops/devops/vuln/verify_projection.sh` (added in this change) should:
  - Run projector against fixture events and compute hash of materialized view snapshot (`sha256sum` over canonical JSON export).
  - Compare with expected hash stored in `ops/devops/vuln/expected_projection.sha256`.
  - Exit non-zero on mismatch.

## Fixtures
- Store deterministic replay fixture under `samples/vuln/events/replay.ndjson` (generated offline, includes mixed tenants, disputed findings, remediation states).
- Export canonical projection snapshot to `samples/vuln/events/projection.json` and hash to `ops/devops/vuln/expected_projection.sha256`.

## Dashboards / Alerts (DEVOPS-VULN-29-002/003)
- Dashboard panels: projection lag, replay throughput, API latency (`/findings`, `/findings/{id}`), query budget enforcement hits, and Merkle anchoring status.
- Alerts: `vuln_projection_lag_gt_60s`, `vuln_projection_error_rate_gt_1pct`, `vuln_api_latency_p95_gt_300ms`, `merkle_anchor_stale_gt_15m`.

## Offline posture
- CI and verification use in-repo fixtures; no external downloads.
- Use mirrored images and `local-nugets/` for all builds/tests.

## Local run
```
DOTNET_DISABLE_BUILTIN_GRAPH=1 dotnet test src/VulnExplorer/__Tests/StellaOps.VulnExplorer.Api.Tests/StellaOps.VulnExplorer.Api.Tests.csproj --filter Category!=GraphHeavy
```
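A minimal nightly-backup sketch for the Backup & DR section above, assuming `mongodump` and an S3-compatible `aws` CLI are available in the ops toolbox; the connection URI and bucket path are placeholders:

```bash
#!/usr/bin/env bash
# Sketch: nightly mongodump of the vuln_explorer db, shipped to object storage.
# MONGO_URI and the bucket are placeholders; the 30d retention is best handled
# by a bucket lifecycle rule rather than by this script.
set -euo pipefail
STAMP=$(date -u +%Y%m%dT%H%M%SZ)
ARCHIVE="vuln_explorer-${STAMP}.archive.gz"

mongodump --uri "${MONGO_URI:?set MONGO_URI}" --db vuln_explorer \
  --archive="/tmp/${ARCHIVE}" --gzip
aws s3 cp "/tmp/${ARCHIVE}" "s3://stellaops-backups/vuln-explorer/${ARCHIVE}"
rm -f "/tmp/${ARCHIVE}"
```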