work
35
ops/deployment/export/helm-overlays.md
Normal file
@@ -0,0 +1,35 @@
# Export Center Helm Overlays (DEPLOY-EXPORT-35-001)

## Values files (download-only)
- `deploy/helm/stellaops/values-export.yaml` (add) with:
  - `exportcenter:`
    - `image.repository`: `registry.stella-ops.org/export-center`
    - `image.tag`: set via pipeline
    - `objectStorage.endpoint`: `http://minio:9000`
    - `objectStorage.bucket`: `export-prod`
    - `objectStorage.accessKeySecret`: `exportcenter-minio`
    - `objectStorage.secretKeySecret`: `exportcenter-minio`
    - `signing.kmsKey`: `exportcenter-kms`
    - `signing.kmsRegion`: `us-east-1`
    - `dsse.enabled`: true

## Secrets
- KMS signing: create secret `exportcenter-kms` with JSON key material (KMS-provider specific). Example: `ops/deployment/export/secrets-example.yaml`.
- MinIO creds: create secret `exportcenter-minio` with keys `accesskey` and `secretkey` (see example manifest).

## Rollout
- `helm upgrade --install export-center deploy/helm/stellaops -f deploy/helm/stellaops/values-export.yaml --set image.tag=$TAG`
- Pre-flight: `helm template ...` and `helm lint`.
- Post: verify readiness with `kubectl rollout status deploy/export-center`, then `curl` the service's `/healthz` endpoint.

## Rollback
- `helm rollback export-center <rev>`; ensure the previous image tag still exists.

## Required artefacts
- Signed images + provenance (from release pipeline).
- SBOM attached via registry (cosign attestations acceptable).

## Acceptance
- Overlay renders without missing values.
- Secrets documented and referenced in template.
- Rollout/rollback steps documented.
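
As a rough illustration of the values list above, the sketch below writes an overlay containing the listed keys and checks that none is missing. It assumes the dotted names map to nested YAML keys under `exportcenter:`, which the chart itself would have to confirm. The helper names `write_values_export` and `check_values_export` are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write the overlay, assuming dotted keys nest under `exportcenter:`.
write_values_export() {
  local out=$1 tag=${2:-CHANGEME}
  cat > "$out" <<YAML
exportcenter:
  image:
    repository: registry.stella-ops.org/export-center
    tag: "${tag}"   # normally set via pipeline
  objectStorage:
    endpoint: http://minio:9000
    bucket: export-prod
    accessKeySecret: exportcenter-minio
    secretKeySecret: exportcenter-minio
  signing:
    kmsKey: exportcenter-kms
    kmsRegion: us-east-1
  dsse:
    enabled: true
YAML
}

# Hypothetical check: every leaf key from the list above must appear in the file.
check_values_export() {
  local file=$1 key missing=0
  for key in repository tag endpoint bucket accessKeySecret secretKeySecret kmsKey kmsRegion enabled; do
    if ! grep -Eq "^[[:space:]]+${key}:" "$file"; then
      echo "missing key: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible invocation is `write_values_export /tmp/values-export.yaml "$TAG" && check_values_export /tmp/values-export.yaml`. This only checks that key names are present; `helm template` remains the real test that the overlay renders.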
15
ops/deployment/export/secrets-example.yaml
Normal file
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Secret
metadata:
  name: exportcenter-minio
stringData:
  accesskey: REPLACE_ME
  secretkey: REPLACE_ME
---
apiVersion: v1
kind: Secret
metadata:
  name: exportcenter-kms
stringData:
  key.json: |
    {"kmsProvider":"awskms","keyId":"arn:aws:kms:...","region":"us-east-1"}
28
ops/deployment/notify/helm-overlays.md
Normal file
@@ -0,0 +1,28 @@
# Notifier Helm Overlays (DEPLOY-NOTIFY-38-001)

## Values file
- `deploy/helm/stellaops/values-notify.yaml` (added) with:
  - `notify:`
    - `image.repository`: `registry.stella-ops.org/notify`
    - `image.tag`: set by pipeline
    - `smtp.host`, `smtp.port`, `smtp.usernameSecret`, `smtp.passwordSecret`
    - `webhook.allowedHosts`: list
    - `chat.webhookSecret`: secret name for chat tokens
    - `tls.secretName`: optional ingress cert

## Secrets
- SMTP creds: secret `notify-smtp` with keys `username` and `password` (see `ops/deployment/notify/secrets-example.yaml`).
- Chat/webhook: secret `notify-chat` with key `token` (see example manifest).

## Rollout
- `helm upgrade --install notify deploy/helm/stellaops -f deploy/helm/stellaops/values-notify.yaml --set image.tag=$TAG`
- Pre-flight: `helm lint`, `helm template`.
- Post: verify readiness with `kubectl rollout status deploy/notify`, then `curl` the service's `/healthz` endpoint.

## Rollback
- `helm rollback notify <rev>`; confirm the previous image tag still exists.

## Acceptance
- Overlay renders without missing values.
- Secrets documented and referenced.
- Rollout/rollback steps documented.
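
The rollout steps above could be strung together roughly as follows. This is a sketch rather than the pipeline's actual script: the function names and the `DRY_RUN` switch are hypothetical, while the `helm` and `kubectl` invocations are the ones listed in this file.

```shell
#!/usr/bin/env bash
set -euo pipefail

CHART=deploy/helm/stellaops
VALUES=deploy/helm/stellaops/values-notify.yaml

# Hypothetical switch: with DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Pre-flight, install/upgrade, then wait for the deployment to become ready.
rollout_notify() {
  local tag=$1
  run helm lint "$CHART" -f "$VALUES"
  run helm template notify "$CHART" -f "$VALUES" --set "image.tag=${tag}"
  run helm upgrade --install notify "$CHART" -f "$VALUES" --set "image.tag=${tag}"
  run kubectl rollout status deploy/notify
}
```

Calling `rollout_notify "$TAG"` with the default `DRY_RUN=1` prints the four commands in order, which makes the sequence easy to review before running it with `DRY_RUN=0`.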
14
ops/deployment/notify/secrets-example.yaml
Normal file
@@ -0,0 +1,14 @@
apiVersion: v1
kind: Secret
metadata:
  name: notify-smtp
stringData:
  username: REPLACE_ME
  password: REPLACE_ME
---
apiVersion: v1
kind: Secret
metadata:
  name: notify-chat
stringData:
  token: REPLACE_ME
25
ops/devops/aoc/aoc-ci.md
Normal file
@@ -0,0 +1,25 @@
# AOC Analyzer CI Contract (DEVOPS-AOC-19-001)

## Scope
Integrate the AOC Roslyn analyzer and guard tests into CI to block banned writes in ingestion projects.

## Steps
1) Restore & build analyzers
   - `dotnet restore src/Aoc/__Analyzers/StellaOps.Aoc.Analyzers/StellaOps.Aoc.Analyzers.csproj`
   - `dotnet build src/Aoc/__Analyzers/StellaOps.Aoc.Analyzers/StellaOps.Aoc.Analyzers.csproj -c Release`
2) Run analyzer on ingestion projects (Authority/Concelier/Excititor ingest paths)
   - `dotnet build src/Concelier/StellaOps.Concelier.Ingestion/StellaOps.Concelier.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
   - `dotnet build src/Authority/StellaOps.Authority.Ingestion/StellaOps.Authority.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
   - `dotnet build src/Excititor/StellaOps.Excititor.Ingestion/StellaOps.Excititor.Ingestion.csproj -c Release /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true`
3) Guard tests
   - `dotnet test src/Aoc/__Tests/StellaOps.Aoc.Analyzers.Tests/StellaOps.Aoc.Analyzers.Tests.csproj -c Release`
4) Artefacts
   - Upload `.artifacts/aoc-analyzer.log` and the test TRX results.

## Determinism/Offline
- Use local feeds (`local-nugets/`); no external fetches post-restore.
- Build with `/p:ContinuousIntegrationBuild=true`.

## Acceptance
- CI fails on any analyzer warning in ingestion projects.
- Tests pass; artefacts uploaded.
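
Step 2 and the log upload in step 4 might be wired together along these lines. It is a sketch under assumptions: the function name `run_aoc_analyzers` and the `DOTNET` override are hypothetical, and the document does not say how `.artifacts/aoc-analyzer.log` is actually produced, so capturing build output with `tee` is only one plausible approach.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Ingestion projects named in step 2.
PROJECTS=(
  src/Concelier/StellaOps.Concelier.Ingestion/StellaOps.Concelier.Ingestion.csproj
  src/Authority/StellaOps.Authority.Ingestion/StellaOps.Authority.Ingestion.csproj
  src/Excititor/StellaOps.Excititor.Ingestion/StellaOps.Excititor.Ingestion.csproj
)

# Hypothetical wrapper: build each project with analyzers on and warnings as errors,
# appending all output to the log. DOTNET can be overridden (for example in tests).
run_aoc_analyzers() {
  local log=${1:-.artifacts/aoc-analyzer.log} proj
  mkdir -p "$(dirname "$log")"
  : > "$log"
  for proj in "${PROJECTS[@]}"; do
    # pipefail makes a failed build fail the pipeline even though tee succeeds.
    "${DOTNET:-dotnet}" build "$proj" -c Release \
      /p:RunAnalyzers=true /p:TreatWarningsAsErrors=true \
      /p:ContinuousIntegrationBuild=true 2>&1 | tee -a "$log" || return 1
  done
}
```

Stopping at the first failing project keeps the log short; a CI job that wants the full picture could collect failures and return non-zero at the end instead.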
22
ops/devops/aoc/aoc-verify-stage.md
Normal file
@@ -0,0 +1,22 @@
# AOC Verify Stage (DEVOPS-AOC-19-002)

## Purpose
Add a CI stage that runs `stella aoc verify --since <commit>` against seeded Mongo snapshots for Concelier and Excititor and publishes violation reports.

## Inputs
- `STAGING_MONGO_URI` (read-only snapshot).
- Optional `AOC_VERIFY_SINCE` (defaults to `HEAD~1`).

## Steps
1) Seed snapshot (if needed)
   - Restore the snapshot into a local Mongo instance, or point to the read-only staging snapshot.
2) Run verify
   - `dotnet run --project src/Aoc/StellaOps.Aoc.Cli -- verify --since ${AOC_VERIFY_SINCE:-HEAD~1} --mongo $STAGING_MONGO_URI --output .artifacts/aoc-verify.json`
3) Fail on violations
   - Parse `.artifacts/aoc-verify.json`; if `violations > 0`, fail with a summary.
4) Publish artifacts
   - Upload `.artifacts/aoc-verify.json` and `.artifacts/aoc-verify.ndjson` (per-violation).

## Acceptance
- Stage fails when violations exist; passes clean otherwise.
- Artifacts attached for auditing.
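
The "fail on violations" step could look something like the sketch below. The report schema is not given in this file, so the code assumes a top-level numeric `violations` field; the function names are hypothetical. `grep` is used only to keep the sketch dependency-free, and a JSON-aware tool such as `jq` would be the more robust choice.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the value of the assumed top-level "violations" field.
aoc_violation_count() {
  local report=$1 count
  count=$(grep -Eo '"violations"[[:space:]]*:[[:space:]]*[0-9]+' "$report" | head -n 1 | grep -Eo '[0-9]+$' || true)
  if [[ -z "$count" ]]; then
    echo "no numeric \"violations\" field in $report" >&2
    return 2
  fi
  echo "$count"
}

# Hypothetical gate: 0 = clean, 1 = violations found, 2 = report unreadable.
aoc_gate() {
  local report=$1 count
  count=$(aoc_violation_count "$report") || return 2
  if (( count > 0 )); then
    echo "AOC verify: ${count} violation(s), see ${report}" >&2
    return 1
  fi
  echo "AOC verify: clean"
}
```

In a CI job this would run as `aoc_gate .artifacts/aoc-verify.json` after the verify step. Treating an unreadable report as a distinct failure avoids passing the stage when the verify step produced no output.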
51
ops/devops/export/trivy-smoke.sh
Normal file
@@ -0,0 +1,51 @@
#!/usr/bin/env bash
set -euo pipefail
# Smoke tests for Trivy compatibility and OCI distribution for Export Center.
# The script lives in ops/devops/export/, so the repository root is three levels up.
ROOT=${ROOT:-$(cd "$(dirname "$0")/../../.." && pwd)}
ARTifacts=${ARTifacts:-$ROOT/out/export-smoke}
mkdir -p "$ARTifacts"

# 1) Trivy DB import compatibility
TRIVY_VERSION="0.52.2"
TRIVY_BIN="$ARTifacts/trivy"
if [[ ! -x "$TRIVY_BIN" ]]; then
  curl -fsSL "https://github.com/aquasecurity/trivy/releases/download/v${TRIVY_VERSION}/trivy_${TRIVY_VERSION}_Linux-64bit.tar.gz" -o "$ARTifacts/trivy.tgz"
  tar -xzf "$ARTifacts/trivy.tgz" -C "$ARTifacts" trivy
fi
"$TRIVY_BIN" module db import --help > "$ARTifacts/trivy-import-help.txt"

# 2) OCI distribution check (local registry)
REGISTRY_PORT=${REGISTRY_PORT:-5005}
REGISTRY_DIR="$ARTifacts/registry"
REPO_URL="http://localhost:${REGISTRY_PORT}/v2/export-smoke"
mkdir -p "$REGISTRY_DIR"
podman run --rm -d -p "${REGISTRY_PORT}:5000" --name export-registry -v "$REGISTRY_DIR":/var/lib/registry registry:2
trap 'podman rm -f export-registry >/dev/null 2>&1 || true' EXIT
sleep 2

echo '{"schemaVersion":2,"manifests":[]}' > "$ARTifacts/empty-index.json"
DIGEST=$(sha256sum "$ARTifacts/empty-index.json" | awk '{print $1}')
mkdir -p "$ARTifacts/blobs/sha256"
cp "$ARTifacts/empty-index.json" "$ARTifacts/blobs/sha256/$DIGEST"

# Empty config blob ("{}", 2 bytes) referenced by the manifest; its digest is computed, not hard-coded.
printf '{}' > "$ARTifacts/config.json"
CFG_DIGEST=$(sha256sum "$ARTifacts/config.json" | awk '{print $1}')

# Push blob and manifest via curl
cat > "$ARTifacts/manifest.json" <<JSON
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "size": 2,
    "digest": "sha256:${CFG_DIGEST}"
  },
  "layers": []
}
JSON
MAN_DIGEST=$(sha256sum "$ARTifacts/manifest.json" | awk '{print $1}')

# Start a blob upload session with POST; the session URL comes back in the Location header, not the body.
curl -sSf -X POST "${REPO_URL}/blobs/uploads/" -H 'Content-Length: 0' -D "$ARTifacts/upload-headers.txt" -o /dev/null
UPLOAD_URL=$(awk 'tolower($1) == "location:" {print $2}' "$ARTifacts/upload-headers.txt" | tr -d '\r\n')
[[ "$UPLOAD_URL" == http* ]] || UPLOAD_URL="http://localhost:${REGISTRY_PORT}${UPLOAD_URL}"
# The Location may already carry a query string, so pick the matching separator.
[[ "$UPLOAD_URL" == *\?* ]] && SEP='&' || SEP='?'
curl -sSf -X PUT "${UPLOAD_URL}${SEP}digest=sha256:${CFG_DIGEST}" -H 'Content-Type: application/octet-stream' --data-binary "@$ARTifacts/config.json"

# Push the manifest to the manifests endpoint by digest, then pull it back.
curl -sSf -X PUT "${REPO_URL}/manifests/sha256:${MAN_DIGEST}" -H 'Content-Type: application/vnd.oci.image.manifest.v1+json' --data-binary "@$ARTifacts/manifest.json"
curl -sSf -H 'Accept: application/vnd.oci.image.manifest.v1+json' "${REPO_URL}/manifests/sha256:${MAN_DIGEST}" -o "$ARTifacts/manifest.pull.json"
echo "trivy smoke + oci registry ok" > "$ARTifacts/result.txt"
32
ops/devops/lnm/backfill-plan.md
Normal file
@@ -0,0 +1,32 @@
# LNM Backfill Plan (DEVOPS-LNM-22-001)

## Goal
Run the staging backfill for advisory observations/linksets, validate counts and conflicts, and document rollout steps for production.

## Prereqs
- Concelier API CCLN0102 available (advisory/linkset endpoints stable).
- Staging Mongo snapshot taken (pre-backfill) and stored at `s3://staging-backups/concelier-pre-lnmbf.gz`.
- NATS/Redis staging brokers reachable.

## Steps
1) Seed snapshot
   - Restore staging Mongo from the pre-backfill snapshot.
2) Run backfill job
   - `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=observations --batch-size=500 --max-conflicts=0`
   - `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=linksets --batch-size=500 --max-conflicts=0`
3) Validate counts
   - Compare `advisory_observations_total` and `linksets_total` against the expected inventory; export to `.artifacts/lnm-counts.json`.
   - Check the conflict log `.artifacts/lnm-conflicts.ndjson` (must be empty).
4) Events/NATS smoke
   - Ensure `concelier.lnm.backfill.completed` is emitted; verify Redis/NATS queues have drained.
5) Roll-forward checklist
   - Raise the batch size to 2000 for prod; keep `--max-conflicts=0`.
   - Schedule a maintenance window; ensure the snapshot is available for rollback.

## Outputs
- `.artifacts/lnm-counts.json`
- `.artifacts/lnm-conflicts.ndjson` (empty)
- Log of job runtime + throughput.

## Acceptance
- Zero conflicts; counts match expected; events emitted; rollback plan documented.
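
The "conflict log must be empty" check in step 3 could be enforced with something like the following. This is a sketch: the function name `check_conflicts` is hypothetical, and it assumes the log holds one conflict record per line (NDJSON), with blank lines not counted.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical check: the conflict log must exist and contain no records.
# Returns 0 when empty, 1 when conflicts are present, 2 when the log is missing.
check_conflicts() {
  local log=${1:-.artifacts/lnm-conflicts.ndjson} n
  if [[ ! -f "$log" ]]; then
    echo "conflict log not found: $log" >&2
    return 2
  fi
  # grep -c prints 0 and exits 1 when nothing matches, hence the `|| true`.
  n=$(grep -c '[^[:space:]]' "$log" || true)
  if (( n > 0 )); then
    echo "backfill produced ${n} conflict(s); first record:" >&2
    grep -m 1 '[^[:space:]]' "$log" >&2
    return 1
  fi
  echo "no conflicts"
}
```

A missing log is reported separately from an empty one, so a backfill job that never ran is not mistaken for a clean run.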
24
ops/devops/lnm/backfill-validation.sh
Normal file
@@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
# The script lives in ops/devops/lnm/, so the repository root is three levels up.
ROOT=${ROOT:-$(cd "$(dirname "$0")/../../.." && pwd)}
ARTifacts=${ARTifacts:-$ROOT/.artifacts}
COUNTS="$ARTifacts/lnm-counts.json"
CONFLICTS="$ARTifacts/lnm-conflicts.ndjson"
mkdir -p "$ARTifacts"

# --jsonArray makes mongoexport write a single JSON array, so `jq length` yields the document count.
mongoexport --uri "${STAGING_MONGO_URI:?set STAGING_MONGO_URI}" --collection advisoryObservations --db concelier --type=json --jsonArray --query '{}' --out "$ARTifacts/obs.json" >/dev/null
mongoexport --uri "${STAGING_MONGO_URI:?set STAGING_MONGO_URI}" --collection linksets --db concelier --type=json --jsonArray --query '{}' --out "$ARTifacts/linksets.json" >/dev/null

OBS=$(jq length "$ARTifacts/obs.json")
LNK=$(jq length "$ARTifacts/linksets.json")

cat > "$COUNTS" <<JSON
{
  "observations": $OBS,
  "linksets": $LNK,
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
JSON

# Ensure the conflict log exists (without truncating one the backfill job may have written).
touch "$CONFLICTS"
echo "Counts written to $COUNTS; conflicts at $CONFLICTS"
11
ops/devops/lnm/metrics-ci-check.sh
Normal file
@@ -0,0 +1,11 @@
#!/usr/bin/env bash
set -euo pipefail
DASHBOARD=${1:-ops/devops/lnm/metrics-dashboard.json}
# Fail fast if the dashboard is not valid JSON.
jq . "$DASHBOARD" >/dev/null
REQUIRED=("advisory_observations_total" "linksets_total" "ingest_api_latency_seconds_bucket" "lnm_backfill_processed_total")
for metric in "${REQUIRED[@]}"; do
  # -F matches the metric name literally rather than as a regex.
  if ! grep -qF -- "$metric" "$DASHBOARD"; then
    echo "::error::metric $metric missing from dashboard"; exit 1
  fi
done
echo "dashboard metrics present"
9
ops/devops/lnm/metrics-dashboard.json
Normal file
@@ -0,0 +1,9 @@
{
  "title": "LNM Backfill Metrics",
  "panels": [
    {"type": "stat", "title": "Observations", "targets": [{"expr": "advisory_observations_total"}]},
    {"type": "stat", "title": "Linksets", "targets": [{"expr": "linksets_total"}]},
    {"type": "graph", "title": "Ingest→API latency p95", "targets": [{"expr": "histogram_quantile(0.95, rate(ingest_api_latency_seconds_bucket[5m]))"}]},
    {"type": "graph", "title": "Backfill throughput", "targets": [{"expr": "rate(lnm_backfill_processed_total[5m])"}]}
  ]
}
20
ops/devops/lnm/vex-backfill-plan.md
Normal file
@@ -0,0 +1,20 @@
# VEX Backfill Plan (DEVOPS-LNM-22-002)

## Goal
Run the VEX observation/linkset backfill with monitoring, ensure events flow via NATS/Redis, and capture run artifacts.

## Steps
1) Pre-checks
   - Confirm the DEVOPS-LNM-22-001 counts baseline (`.artifacts/lnm-counts.json`).
   - Ensure `STAGING_MONGO_URI`, `NATS_URL`, `REDIS_URL` are available (read-only or test brokers).
2) Run VEX backfill
   - `dotnet run --project src/Concelier/StellaOps.Concelier.Backfill -- --mode=vex --batch-size=500 --max-conflicts=0 --mongo $STAGING_MONGO_URI --nats $NATS_URL --redis $REDIS_URL`
3) Metrics capture
   - Export per-run metrics to `.artifacts/vex-backfill-metrics.json` (duration, processed, conflicts, events emitted).
4) Event verification
   - Subscribe to `concelier.vex.backfill.completed` and `concelier.linksets.vex.upserted`; ensure queues have drained.
5) Roll-forward checklist
   - Increase the batch size to 2000 for prod; keep `--max-conflicts=0`; schedule a maintenance window.

## Acceptance
- Zero conflicts; events observed; metrics file present; rollback plan documented.
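
The metrics capture in step 3 might produce a file shaped like the one written below. The plan lists only the four quantities, so the JSON field names, the `timestamp` field and the function name `write_vex_metrics` are all assumptions made for illustration; how the backfill job reports these numbers is not specified here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical writer: field names are assumptions based on the four quantities
# listed in step 3 (duration, processed, conflicts, events emitted).
write_vex_metrics() {
  local out=$1 duration=$2 processed=$3 conflicts=$4 events=$5 v
  for v in "$duration" "$processed" "$conflicts" "$events"; do
    # Reject anything that would produce invalid JSON numbers.
    [[ "$v" =~ ^[0-9]+$ ]] || { echo "not a non-negative integer: $v" >&2; return 2; }
  done
  mkdir -p "$(dirname "$out")"
  cat > "$out" <<JSON
{
  "durationSeconds": $duration,
  "processed": $processed,
  "conflicts": $conflicts,
  "eventsEmitted": $events,
  "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
JSON
}
```

For example, `write_vex_metrics .artifacts/vex-backfill-metrics.json 120 5000 0 5000` records a two-minute run with no conflicts. Validating the inputs up front keeps a malformed value from silently producing an unparseable metrics file.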