This commit is contained in:
StellaOps Bot
2025-11-23 23:40:10 +02:00
parent c13355923f
commit 029002ad05
93 changed files with 2160 additions and 285 deletions


@@ -0,0 +1,35 @@
# Export Center Helm Overlays (DEPLOY-EXPORT-35-001)
## Values files (download-only)
- `deploy/helm/stellaops/values-export.yaml` (add) with:
  - `exportcenter:`
    - `image.repository`: `registry.stella-ops.org/export-center`
    - `image.tag`: set via pipeline
    - `objectStorage.endpoint`: `http://minio:9000`
    - `objectStorage.bucket`: `export-prod`
    - `objectStorage.accessKeySecret`: `exportcenter-minio`
    - `objectStorage.secretKeySecret`: `exportcenter-minio`
    - `signing.kmsKey`: `exportcenter-kms`
    - `signing.kmsRegion`: `us-east-1`
    - `dsse.enabled`: true
## Secrets
- KMS signing: create secret `exportcenter-kms` with JSON key material (KMS provider specific). Example: `ops/deployment/export/secrets-example.yaml`.
- MinIO creds: `exportcenter-minio` with `accesskey`, `secretkey` keys (see example manifest).
## Rollout
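- `helm upgrade --install export-center deploy/helm/stellaops -f deploy/helm/stellaops/values-export.yaml --set exportcenter.image.tag=$TAG` (the tag is nested under `exportcenter:` in the values file, so the `--set` path must include that prefix)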
- Pre-flight: `helm template ...` and `helm lint`.
- Post: verify readiness with `kubectl rollout status deploy/export-center`, then `curl` the service's `/healthz` endpoint.
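
The rollout steps above could be chained in a small wrapper like the sketch below. It is an illustration, not the pipeline's actual script: the `run` helper, the `DRY_RUN` switch, and the `HEALTH_URL` argument are hypothetical, and the `exportcenter.image.tag` path assumes the tag lives under the `exportcenter:` block listed in the values file.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the command when DRY_RUN=1, otherwise execute it (hypothetical helper).
run() {
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# rollout_export_center TAG [HEALTH_URL]
# HEALTH_URL is an assumption: the runbook only names the /healthz path.
rollout_export_center() {
  local tag="$1" health_url="${2:-}"
  local chart="deploy/helm/stellaops"
  local values="deploy/helm/stellaops/values-export.yaml"
  local tag_arg="exportcenter.image.tag=${tag}"

  # Pre-flight: lint and render before touching the cluster.
  run helm lint "$chart" -f "$values" --set "$tag_arg"
  run helm template export-center "$chart" -f "$values" --set "$tag_arg"

  # Rollout, then wait for the deployment to become ready.
  run helm upgrade --install export-center "$chart" -f "$values" --set "$tag_arg"
  run kubectl rollout status deploy/export-center

  # Health probe only when a URL was supplied.
  if [[ -n "$health_url" ]]; then
    run curl -fsS "$health_url"
  fi
}
```

Calling `DRY_RUN=1 rollout_export_center "$TAG"` prints the commands without running them, which is a cheap way to review the sequence in a pipeline log first.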
## Rollback
- `helm rollback export-center <rev>`; ensure previous tag exists.
## Required artefacts
- Signed images + provenance (from release pipeline).
- SBOM attached via registry (cosign attestations acceptable).
## Acceptance
- Overlay renders without missing values.
- Secrets documented and referenced in template.
- Rollout/rollback steps documented.


@@ -0,0 +1,15 @@
apiVersion: v1
kind: Secret
metadata:
  name: exportcenter-minio
stringData:
  accesskey: REPLACE_ME
  secretkey: REPLACE_ME
---
apiVersion: v1
kind: Secret
metadata:
  name: exportcenter-kms
stringData:
  key.json: |
    {"kmsProvider":"awskms","keyId":"arn:aws:kms:...","region":"us-east-1"}


@@ -0,0 +1,28 @@
# Notifier Helm Overlays (DEPLOY-NOTIFY-38-001)
## Values file
- `deploy/helm/stellaops/values-notify.yaml` (added) with:
  - `notify:`
    - `image.repository`: `registry.stella-ops.org/notify`
    - `image.tag`: set by pipeline
    - `smtp.host`, `smtp.port`, `smtp.usernameSecret`, `smtp.passwordSecret`
    - `webhook.allowedHosts`: list
    - `chat.webhookSecret`: secret name for chat tokens
    - `tls.secretName`: optional ingress cert
## Secrets
- SMTP creds secret `notify-smtp` with keys `username`, `password` (see `ops/deployment/notify/secrets-example.yaml`).
- Chat/webhook secret `notify-chat` with key `token` (see example manifest).
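
To keep real credentials out of the repository, the two secrets can be rendered from environment variables at deploy time instead of editing the example manifest. The sketch below assumes hypothetical variable names (`SMTP_USERNAME`, `SMTP_PASSWORD`, `CHAT_TOKEN`); only the secret names and keys come from this document.

```shell
#!/usr/bin/env bash
set -euo pipefail

# render_notify_secrets: print both Secret manifests to stdout.
# The environment variable names are assumptions for illustration.
# Values containing double quotes or backslashes would need YAML escaping.
render_notify_secrets() {
  : "${SMTP_USERNAME:?set SMTP_USERNAME}"
  : "${SMTP_PASSWORD:?set SMTP_PASSWORD}"
  : "${CHAT_TOKEN:?set CHAT_TOKEN}"
  cat <<YAML
apiVersion: v1
kind: Secret
metadata:
  name: notify-smtp
stringData:
  username: "${SMTP_USERNAME}"
  password: "${SMTP_PASSWORD}"
---
apiVersion: v1
kind: Secret
metadata:
  name: notify-chat
stringData:
  token: "${CHAT_TOKEN}"
YAML
}
```

A typical use would be `render_notify_secrets | kubectl apply -n <namespace> -f -`, so the rendered manifest never lands on disk.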
## Rollout
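- `helm upgrade --install notify deploy/helm/stellaops -f deploy/helm/stellaops/values-notify.yaml --set notify.image.tag=$TAG` (the tag is nested under `notify:` in the values file, so the `--set` path must include that prefix)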
- Pre-flight: `helm lint`, `helm template`.
- Post: `kubectl rollout status deploy/notify`, then `curl` the service's `/healthz` endpoint.
## Rollback
- `helm rollback notify <rev>`; confirm previous image tag exists.
## Acceptance
- Overlay renders without missing values.
- Secrets documented and referenced.
- Rollout/rollback steps documented.


@@ -0,0 +1,14 @@
apiVersion: v1
kind: Secret
metadata:
  name: notify-smtp
stringData:
  username: REPLACE_ME
  password: REPLACE_ME
---
apiVersion: v1
kind: Secret
metadata:
  name: notify-chat
stringData:
  token: REPLACE_ME

ops/devops/aoc/aoc-ci.md Normal file

@@ -0,0 +1,25 @@
# AOC Analyzer CI Contract (DEVOPS-AOC-19-001)
## Scope
Integrate AOC Roslyn analyzer and guard tests into CI to block banned writes in ingestion projects.
## Steps
1) Restore & build analyzers
- `dotnet restore src/Aoc/__Analyzers/StellaOps.Aoc.Analyzers/StellaOps.Aoc.Analyzers.csproj`
- `dotnet build src/Aoc/__Analyzers/StellaOps.Aoc.Analyzers/StellaOps.Aoc.Analyzers.csproj -c Release`
2) Run analyzer on ingestion projects (Authority/Concelier/Excititor ingest paths)
- `dotnet build src/Concelier/StellaOps.Concelier.Ingestion/StellaOps.Concelier.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
- `dotnet build src/Authority/StellaOps.Authority.Ingestion/StellaOps.Authority.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
- `dotnet build src/Excititor/StellaOps.Excititor.Ingestion/StellaOps.Excititor.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
3) Guard tests
- `dotnet test src/Aoc/__Tests/StellaOps.Aoc.Analyzers.Tests/StellaOps.Aoc.Analyzers.Tests.csproj -c Release`
4) Artefacts
- Upload `.artifacts/aoc-analyzer.log` and test TRX.
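
Step 2 and the log named in step 4 could be tied together roughly as follows. This is a sketch under assumptions: the build commands above do not say how `.artifacts/aoc-analyzer.log` is produced, so piping through `tee` is one possible approach, and the `DOTNET` override exists only so the loop can be dry-run with `echo`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Ingestion projects listed in step 2.
AOC_PROJECTS=(
  "src/Concelier/StellaOps.Concelier.Ingestion/StellaOps.Concelier.Ingestion.csproj"
  "src/Authority/StellaOps.Authority.Ingestion/StellaOps.Authority.Ingestion.csproj"
  "src/Excititor/StellaOps.Excititor.Ingestion/StellaOps.Excititor.Ingestion.csproj"
)

# run_aoc_analyzers [LOG]: build each project with analyzers enabled and
# append all output to LOG. With pipefail set, a failing build still fails
# the function even though its output is piped through tee.
run_aoc_analyzers() {
  local log="${1:-.artifacts/aoc-analyzer.log}"
  mkdir -p "$(dirname "$log")"
  : > "$log"
  local proj
  for proj in "${AOC_PROJECTS[@]}"; do
    # DOTNET is a hypothetical override; it defaults to the real CLI.
    "${DOTNET:-dotnet}" build "$proj" -c Release \
      /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true \
      /p:ContinuousIntegrationBuild=true 2>&1 | tee -a "$log"
  done
}
```

Running `DOTNET=echo run_aoc_analyzers /tmp/aoc.log` lists the three build invocations without compiling anything.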
## Determinism/Offline
- Use local feeds (`local-nugets/`); no external fetches post-restore.
- Build with `/p:ContinuousIntegrationBuild=true`.
## Acceptance
- CI fails on any analyzer warning in ingestion projects.
- Tests pass; artefacts uploaded.


@@ -0,0 +1,22 @@
# AOC Verify Stage (DEVOPS-AOC-19-002)
## Purpose
Add CI stage to run `stella aoc verify --since <commit>` against seeded Mongo snapshots for Concelier + Excititor, publishing violation reports.
## Inputs
- `STAGING_MONGO_URI` (read-only snapshot).
- Optional `AOC_VERIFY_SINCE` (defaults to `HEAD~1`).
## Steps
1) Seed snapshot (if needed)
- Restore snapshot into local Mongo or point to read-only staging snapshot.
2) Run verify
- `dotnet run --project src/Aoc/StellaOps.Aoc.Cli -- verify --since "${AOC_VERIFY_SINCE:-HEAD~1}" --mongo "$STAGING_MONGO_URI" --output .artifacts/aoc-verify.json`
3) Fail on violations
- Parse `.artifacts/aoc-verify.json`; if `violations > 0`, fail with summary.
4) Publish artifacts
- Upload `.artifacts/aoc-verify.json` and `.artifacts/aoc-verify.ndjson` (per-violation).
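
The fail-on-violations gate in step 3 might look like the sketch below. The schema of `aoc-verify.json` is not shown in this document, so the sketch counts lines in the per-violation NDJSON file instead, assuming one violation per non-empty line; the function name is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# aoc_verify_gate [REPORT]
# Returns 0 when the report is empty, 1 when it lists violations,
# 2 when the report is missing (a missing file should not pass as clean).
aoc_verify_gate() {
  local report="${1:-.artifacts/aoc-verify.ndjson}"
  if [[ ! -f "$report" ]]; then
    echo "::error::violation report $report not found" >&2
    return 2
  fi
  local count
  # grep -c exits 1 when nothing matches, hence the || true.
  count=$(grep -c . "$report" || true)
  if (( count > 0 )); then
    echo "::error::AOC verify found $count violation(s); see $report" >&2
    return 1
  fi
  echo "AOC verify clean"
}
```

Treating a missing report as a distinct failure keeps a crashed verify run from being mistaken for a clean one.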
## Acceptance
- Stage fails when violations exist; passes clean otherwise.
- Artifacts attached for auditing.


@@ -0,0 +1,51 @@
#!/usr/bin/env bash
set -euo pipefail
# Smoke tests for Trivy compatibility and OCI distribution for Export Center.
ROOT=${ROOT:-$(cd "$(dirname "$0")/../.." && pwd)}
ARTifacts=${ARTifacts:-$ROOT/out/export-smoke}
mkdir -p "$ARTifacts"
# 1) Trivy DB import compatibility
TRIVY_VERSION="0.52.2"
TRIVY_BIN="$ARTifacts/trivy"
if [[ ! -x "$TRIVY_BIN" ]]; then
  curl -fsSL "https://github.com/aquasecurity/trivy/releases/download/v${TRIVY_VERSION}/trivy_${TRIVY_VERSION}_Linux-64bit.tar.gz" -o "$ARTifacts/trivy.tgz"
  tar -xzf "$ARTifacts/trivy.tgz" -C "$ARTifacts" trivy
fi
"$TRIVY_BIN" module db import --help > "$ARTifacts/trivy-import-help.txt"
# 2) OCI distribution check (local registry)
REGISTRY_PORT=${REGISTRY_PORT:-5005}
REGISTRY_DIR="$ARTifacts/registry"
mkdir -p "$REGISTRY_DIR"
podman run --rm -d -p "${REGISTRY_PORT}:5000" --name export-registry -v "$REGISTRY_DIR":/var/lib/registry registry:2
trap 'podman rm -f export-registry >/dev/null 2>&1 || true' EXIT
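# Poll the registry API instead of relying on a fixed sleep
for _ in {1..20}; do
  curl -sf "http://localhost:${REGISTRY_PORT}/v2/" >/dev/null && break
  sleep 0.5
done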
echo '{"schemaVersion":2,"manifests":[]}' > "$ARTifacts/empty-index.json"
DIGEST=$(sha256sum "$ARTifacts/empty-index.json" | awk '{print $1}')
mkdir -p "$ARTifacts/blobs/sha256"
cp "$ARTifacts/empty-index.json" "$ARTifacts/blobs/sha256/$DIGEST"
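# Push the config blob, then the manifest, via the registry HTTP API
printf '{}' > "$ARTifacts/config.json"
CONFIG_DIGEST="sha256:$(sha256sum "$ARTifacts/config.json" | awk '{print $1}')"
cat > "$ARTifacts/manifest.json" <<JSON
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "size": 2,
    "digest": "${CONFIG_DIGEST}"
  },
  "layers": []
}
JSON
MAN_DIGEST=$(sha256sum "$ARTifacts/manifest.json" | awk '{print $1}')
REGISTRY_URL="http://localhost:${REGISTRY_PORT}/v2/export-smoke"
# Upload sessions are opened with POST; the session URL is returned in the Location header, not the body
UPLOAD_URL=$(curl -sSf -D - -o /dev/null -X POST "${REGISTRY_URL}/blobs/uploads/" | awk 'tolower($1) == "location:" {print $2}' | tr -d '\r\n')
# Location may be relative and may already carry a query string
[[ "$UPLOAD_URL" == /* ]] && UPLOAD_URL="http://localhost:${REGISTRY_PORT}${UPLOAD_URL}"
[[ "$UPLOAD_URL" == *\?* ]] && SEP='&' || SEP='?'
curl -sSf -X PUT "${UPLOAD_URL}${SEP}digest=${CONFIG_DIGEST}" -H 'Content-Type: application/octet-stream' --data-binary "@$ARTifacts/config.json"
# Manifests are pushed to the manifests endpoint, not uploaded as blobs
curl -sSf -X PUT "${REGISTRY_URL}/manifests/sha256:${MAN_DIGEST}" -H 'Content-Type: application/vnd.oci.image.manifest.v1+json' --data-binary "@$ARTifacts/manifest.json"
curl -sSf -H 'Accept: application/vnd.oci.image.manifest.v1+json' "${REGISTRY_URL}/manifests/sha256:${MAN_DIGEST}" -o "$ARTifacts/manifest.pull.json"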
echo "trivy smoke + oci registry ok" > "$ARTifacts/result.txt"


@@ -0,0 +1,32 @@
# LNM Backfill Plan (DEVOPS-LNM-22-001)
## Goal
Run staging backfill for advisory observations/linksets, validate counts/conflicts, and document rollout steps for production.
## Prereqs
- Concelier API CCLN0102 available (advisory/linkset endpoints stable).
- Staging Mongo snapshot taken (pre-backfill) and stored at `s3://staging-backups/concelier-pre-lnmbf.gz`.
- NATS/Redis staging brokers reachable.
## Steps
1) Seed snapshot
- Restore staging Mongo from pre-backfill snapshot.
2) Run backfill job
- `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=observations --batch-size=500 --max-conflicts=0`
- `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=linksets --batch-size=500 --max-conflicts=0`
3) Validate counts
- Compare `advisory_observations_total` and `linksets_total` vs expected inventory; export to `.artifacts/lnm-counts.json`.
- Check conflict log `.artifacts/lnm-conflicts.ndjson` (must be empty).
4) Events/NATS smoke
- Ensure `concelier.lnm.backfill.completed` emitted; verify Redis/NATS queues drained.
5) Roll-forward checklist
- Promote batch size to 2000 for prod, keep `--max-conflicts=0`.
- Schedule maintenance window, ensure snapshot available for rollback.
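
The validation in step 3 can be expressed as a single check that fails when counts drift or the conflict log is non-empty. The sketch below is illustrative only: the function name and the way the expected inventory is supplied are assumptions, since this plan does not say where the expected numbers come from.

```shell
#!/usr/bin/env bash
set -euo pipefail

# validate_lnm_backfill OBS LNK WANT_OBS WANT_LNK [CONFLICTS_FILE]
# Reports every mismatch before returning, so one run shows all problems.
validate_lnm_backfill() {
  local obs="$1" lnk="$2" want_obs="$3" want_lnk="$4"
  local conflicts="${5:-.artifacts/lnm-conflicts.ndjson}"
  local rc=0
  if [[ "$obs" != "$want_obs" ]]; then
    echo "observations: got $obs, expected $want_obs" >&2
    rc=1
  fi
  if [[ "$lnk" != "$want_lnk" ]]; then
    echo "linksets: got $lnk, expected $want_lnk" >&2
    rc=1
  fi
  # -s is true when the file exists and has a size greater than zero.
  if [[ -s "$conflicts" ]]; then
    echo "conflict log $conflicts is not empty" >&2
    rc=1
  fi
  return "$rc"
}
```

With `jq` available, the observed values could be read from the counts file, for example `validate_lnm_backfill "$(jq .observations .artifacts/lnm-counts.json)" "$(jq .linksets .artifacts/lnm-counts.json)" "$EXPECTED_OBS" "$EXPECTED_LNK"`, where `EXPECTED_OBS` and `EXPECTED_LNK` are hypothetical inputs.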
## Outputs
- `.artifacts/lnm-counts.json`
- `.artifacts/lnm-conflicts.ndjson` (empty)
- Log of job runtime + throughput.
## Acceptance
- Zero conflicts; counts match expected; events emitted; rollback plan documented.


@@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
ROOT=${ROOT:-$(cd "$(dirname "$0")/../.." && pwd)}
ARTifacts=${ARTifacts:-$ROOT/.artifacts}
COUNTS=$ARTifacts/lnm-counts.json
CONFLICTS=$ARTifacts/lnm-conflicts.ndjson
mkdir -p "$ARTifacts"
mongoexport --uri "${STAGING_MONGO_URI:?set STAGING_MONGO_URI}" --collection advisoryObservations --db concelier --type=json --query '{}' --out "$ARTifacts/obs.json" >/dev/null
mongoexport --uri "${STAGING_MONGO_URI:?set STAGING_MONGO_URI}" --collection linksets --db concelier --type=json --query '{}' --out "$ARTifacts/linksets.json" >/dev/null
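# mongoexport writes one document per line, so slurp (-s) the stream into an array before counting
OBS=$(jq -s length "$ARTifacts/obs.json")
LNK=$(jq -s length "$ARTifacts/linksets.json")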
cat > "$COUNTS" <<JSON
{
"observations": $OBS,
"linksets": $LNK,
"timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
JSON
touch "$CONFLICTS"
echo "Counts written to $COUNTS; conflicts at $CONFLICTS"


@@ -0,0 +1,11 @@
#!/usr/bin/env bash
set -euo pipefail
DASHBOARD=${1:-ops/devops/lnm/metrics-dashboard.json}
jq . "$DASHBOARD" >/dev/null
REQUIRED=("advisory_observations_total" "linksets_total" "ingest_api_latency_seconds_bucket" "lnm_backfill_processed_total")
for metric in "${REQUIRED[@]}"; do
  if ! grep -q "$metric" "$DASHBOARD"; then
    echo "::error::metric $metric missing from dashboard"; exit 1
  fi
done
echo "dashboard metrics present"


@@ -0,0 +1,9 @@
{
"title": "LNM Backfill Metrics",
"panels": [
{"type": "stat", "title": "Observations", "targets": [{"expr": "advisory_observations_total"}]},
{"type": "stat", "title": "Linksets", "targets": [{"expr": "linksets_total"}]},
{"type": "graph", "title": "Ingest→API latency p95", "targets": [{"expr": "histogram_quantile(0.95, rate(ingest_api_latency_seconds_bucket[5m]))"}]},
{"type": "graph", "title": "Backfill throughput", "targets": [{"expr": "rate(lnm_backfill_processed_total[5m])"}]}
]
}


@@ -0,0 +1,20 @@
# VEX Backfill Plan (DEVOPS-LNM-22-002)
## Goal
Run VEX observation/linkset backfill with monitoring, ensure events flow via NATS/Redis, and capture run artifacts.
## Steps
1) Pre-checks
- Confirm DEVOPS-LNM-22-001 counts baseline (`.artifacts/lnm-counts.json`).
- Ensure `STAGING_MONGO_URI`, `NATS_URL`, `REDIS_URL` available (read-only or test brokers).
2) Run VEX backfill
- `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=vex --batch-size=500 --max-conflicts=0 --mongo "$STAGING_MONGO_URI" --nats "$NATS_URL" --redis "$REDIS_URL"`
3) Metrics capture
- Export per-run metrics to `.artifacts/vex-backfill-metrics.json` (duration, processed, conflicts, events emitted).
4) Event verification
- Subscribe to `concelier.vex.backfill.completed` and `concelier.linksets.vex.upserted`; ensure queues drained.
5) Roll-forward checklist
- Increase batch size to 2000 for prod; keep conflicts = 0; schedule maintenance window.
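
The metrics capture in step 3 could be as small as the sketch below. The plan names the four quantities but not the file layout, so the JSON field names (`durationSeconds`, `processed`, `conflicts`, `eventsEmitted`) and the function name are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_vex_metrics OUT DURATION_SECONDS PROCESSED CONFLICTS EVENTS
# Writes a single JSON object; all four values must be integers.
write_vex_metrics() {
  local out="$1" duration="$2" processed="$3" conflicts="$4" events="$5"
  mkdir -p "$(dirname "$out")"
  printf '{"durationSeconds": %d, "processed": %d, "conflicts": %d, "eventsEmitted": %d, "timestamp": "%s"}\n' \
    "$duration" "$processed" "$conflicts" "$events" \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$out"
}
```

Bash's built-in `SECONDS` counter is one way to obtain the duration: reset it with `SECONDS=0` before the backfill command and pass `"$SECONDS"` afterwards. How the processed, conflict, and event counts are obtained depends on what the backfill job reports, which this plan does not specify.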
## Acceptance
- Zero conflicts; events observed; metrics file present; rollback plan documented.