17 · Security Hardening Guide — StellaOps

(v2.0, 12 Jul 2025)

Audience: Site-reliability and platform teams deploying the open-source Core in production or restricted networks.


0 Table of Contents

  1. Threat model (summary)
  2. Host-OS baseline
  3. Container & runtime hardening
  4. Network-plane guidance
  5. Secrets & key management
  6. Image, SBOM & plugin supply-chain controls
  7. Logging, monitoring & audit
  8. Update & patch strategy
  9. Incident-response workflow
  10. Pen-testing & continuous assurance
  11. Vulnerability disclosure & contact
  12. Change log

1 Threat model (summary)

| Asset | Threats | Mitigations |
|-------|---------|-------------|
| SBOMs & scan results | Disclosure, tamper | TLS-in-transit, read-only Redis volume, RBAC, Cosign-verified plugins |
| Backend container | RCE, code-injection | Distroless image, non-root UID, read-only FS, seccomp + CAP_DROP:ALL |
| Update artefacts | Supply-chain attack | Cosign-signed images & SBOMs, enforced by admission controller |
| Admin credentials | Phishing, brute force | OAuth 2.0 with 12 h token TTL, optional mTLS |

2 Host-OS baseline checklist

| Item | Recommended setting |
|------|---------------------|
| OS | Ubuntu 22.04 LTS (kernel 5.15) or Alma 9 |
| Patches | unattended-upgrades or vendor-equivalent enabled |
| Filesystem | noexec,nosuid on /tmp, /var/tmp |
| Docker Engine | v24.*, API socket root-owned (0660) |
| Auditd | Watch /etc/docker, /usr/bin/docker* and Compose files |
| Time sync | chrony or systemd-timesyncd |

3 Container & runtime hardening

3.1 Docker Compose reference (compose-core.yml)

services:
  backend:
    image: ghcr.io/stellaops/backend:1.5.0
    user: "101:101"              # nonroot
    read_only: true
    security_opt:
      - "no-new-privileges:true"
      - "seccomp:./seccomp-backend.json"
    cap_drop: [ALL]
    tmpfs:
      - /tmp:size=64m,noexec,nosuid   # "nosymlink" is not a valid tmpfs option; noexec matches the host /tmp baseline
    environment:
      - ASPNETCORE_URLS=https://+:8080
      - TLSPROVIDER=OpenSslGost
    depends_on: [redis]
    networks: [core-net]
    healthcheck:
      test: ["CMD", "wget", "-qO-", "https://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7.2-alpine
    command: ["redis-server", "--requirepass", "${REDIS_PASS}", "--rename-command", "FLUSHALL", ""]
    user: "redis"
    read_only: true
    cap_drop: [ALL]
    tmpfs:
      - /data
    networks: [core-net]

networks:
  core-net:
    driver: bridge

No dedicated “Redis” or “Mongo” subnets are declared; the single bridge network suffices for the default stack.

3.2 Kubernetes deployment highlights

  • Use a separate NetworkPolicy that only allows egress from backend to Redis :6379.
  • securityContext: runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation: false, drop all capabilities.
  • PodDisruptionBudget of minAvailable: 1.
  • Optionally add CosignVerified=true label enforced by an admission controller (e.g. Kyverno or Connaisseur).
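
The Kubernetes settings above can be sketched as three manifests. This is a minimal illustration, not the project's shipped manifests: the `app: backend` and `app: redis` labels, the object names, the replica count, and the in-memory `/tmp` volume are assumptions chosen to mirror the Compose reference.

```yaml
# Egress from backend pods is limited to Redis on TCP 6379.
# Note: this also blocks DNS; add a rule for the cluster DNS service
# if the backend resolves Redis by service name.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-egress-redis      # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: backend                # assumed label
  policyTypes: [Egress]
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: redis          # assumed label
      ports:
        - protocol: TCP
          port: 6379
---
# Keep at least one backend pod running during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: backend-pdb               # hypothetical name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: backend
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 2                     # assumed; more than minAvailable so nodes can drain
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: ghcr.io/stellaops/backend:1.5.0
          securityContext:
            runAsNonRoot: true
            runAsUser: 101        # same UID as the Compose reference
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp               # writable scratch space on a read-only root FS
          emptyDir:
            medium: Memory
            sizeLimit: 64Mi
```

NetworkPolicy objects only take effect when the cluster's CNI plugin enforces them, so it is worth confirming that before relying on the egress rule.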

4 Network-plane guidance

| Plane | Recommendation |
|-------|----------------|
| North-south | Terminate TLS 1.2+ (OpenSSL-GOST default). Use Let's Encrypt or internal CA. |
| East-west | Compose bridge or K8s ClusterIP only; no public Redis/Mongo ports. |
| Ingress controller | Limit methods to GET, POST, PATCH (no TRACE). |
| Rate-limits | 40 rps default; tune ScannerPool.Workers and ingress limit-req to match. |
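
For clusters that use the ingress-nginx controller, the TLS floor and the 40 rps default could be expressed roughly as follows. This is a sketch under assumptions: the host name, the TLS secret name, the `backend` Service, and the ConfigMap name and namespace (the ingress-nginx defaults) are illustrative, and method filtering is not shown.

```yaml
# Controller-wide TLS floor (ingress-nginx ConfigMap key).
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # default name in a stock install; adjust as needed
  namespace: ingress-nginx
data:
  ssl-protocols: "TLSv1.2 TLSv1.3"
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: stella-backend             # hypothetical name
  annotations:
    # Requests per second allowed per client IP.
    nginx.ingress.kubernetes.io/limit-rps: "40"
    # The backend listens on HTTPS (ASPNETCORE_URLS=https://+:8080).
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  tls:
    - hosts: [stella.example.internal]   # hypothetical host
      secretName: stella-tls             # hypothetical TLS secret
  rules:
    - host: stella.example.internal
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend            # assumed Service name
                port:
                  number: 8080
```

Because `limit-rps` is applied per client IP, the aggregate load reaching the scanner can exceed 40 rps; ScannerPool.Workers should be sized with that in mind.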

5 Secrets & key management

| Secret | Storage | Rotation |
|--------|---------|----------|
| Client-JWT (offline) | /var/lib/stella/tokens/client.jwt (root:600) | 30 days (provided by each OUK) |
| REDIS_PASS | Docker/K8s secret | 90 days |
| OAuth signing key | /keys/jwt.pem (read-only mount) | 180 days |
| Cosign public key | /keys/cosign.pub baked into image | change on every major release |
| Trivy DB mirror token (if remote) | Secret + read-only | 30 days |

Never bake secrets into images; always inject at runtime.
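
On Kubernetes, runtime injection might look like the fragment below. It is a sketch only: the Secret names `stella-redis` and `stella-oauth-key`, their keys, and the pod names are hypothetical, and the Redis flags are trimmed to the password for brevity.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
    - name: redis
      image: redis:7.2-alpine
      env:
        - name: REDIS_PASS
          valueFrom:
            secretKeyRef:
              name: stella-redis        # hypothetical Secret
              key: password
      # Kubernetes expands $(VAR) references in command/args from the container env.
      command: ["redis-server", "--requirepass", "$(REDIS_PASS)"]
---
apiVersion: v1
kind: Pod
metadata:
  name: backend
spec:
  containers:
    - name: backend
      image: ghcr.io/stellaops/backend:1.5.0
      volumeMounts:
        - name: oauth-signing-key
          mountPath: /keys/jwt.pem      # path from the table above
          subPath: jwt.pem              # keeps the image's /keys/cosign.pub visible
          readOnly: true
  volumes:
    - name: oauth-signing-key
      secret:
        secretName: stella-oauth-key    # hypothetical Secret
```

A `subPath` mount is not refreshed when the Secret changes, so rotating the signing key under this layout requires a pod restart.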

Operational tip: schedule a cron job that reminds ops 5 days before client.jwt expiry. The backend also emits a Prometheus metric, stella_quota_token_days_remaining.
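
As an alternative to the cron reminder, the same 5-day warning could come from a Prometheus alerting rule on that metric. The group name, alert name, `for` duration, and severity label below are assumptions; only the metric name comes from this guide.

```yaml
groups:
  - name: stella-secrets                 # hypothetical group name
    rules:
      - alert: StellaClientJwtExpiringSoon
        # Fires once fewer than 5 days remain on client.jwt.
        expr: stella_quota_token_days_remaining < 5
        for: 1h                          # assumed; avoids firing on a single bad scrape
        labels:
          severity: warning
        annotations:
          summary: "client.jwt expires in less than 5 days"
```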

6 Image, SBOM & plugin supply-chain controls

  • Images: Pull by digest, not latest; verify:
      cosign verify ghcr.io/stellaops/backend@sha256:<DIGEST> \
        --key https://stella-ops.org/keys/cosign.pub
  • SBOM: Each release ships an SPDX file; store alongside images for audit.
  • Third-party plugins: Place in /plugins/; backend will:
      • Validate Cosign signature.
      • Check [StellaPluginVersion("major.minor")].
      • Refuse to start if either check fails and Security.DisablePluginUnsigned=false (default).
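
On Kubernetes, the image-signature check can also be enforced at admission time. The sketch below assumes Kyverno is the admission controller and follows its ClusterPolicy `verifyImages` schema as commonly documented. The policy and rule names are hypothetical, the key body is a placeholder to be filled with the contents of cosign.pub, and field names should be checked against the Kyverno version in use.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-stellaops-images          # hypothetical name
spec:
  validationFailureAction: Enforce       # reject pods whose images fail verification
  webhookTimeoutSeconds: 30
  rules:
    - name: require-cosign-signature     # hypothetical rule name
      match:
        any:
          - resources:
              kinds: [Pod]
      verifyImages:
        - imageReferences:
            - "ghcr.io/stellaops/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <contents of cosign.pub>
                      -----END PUBLIC KEY-----
```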

7 Logging, monitoring & audit

| Control | Implementation |
|---------|----------------|
| Log format | Serilog JSON; ship via Fluent Bit to ELK or Loki |
| Metrics | Prometheus /metrics endpoint; default Grafana dashboard in infra/ |
| Audit events | Redis stream audit; export daily to SIEM |
| Alert rules | Feed age ≥ 48 h, P95 wall-time > 5 s, Redis used memory > 75 % |
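
The three alert thresholds could be written as Prometheus rules along these lines. The two `stella_*` metric names are hypothetical placeholders, since this guide does not name the backend's feed or scan metrics, and the Redis metrics assume the commonly used redis_exporter. Substitute whatever the `/metrics` endpoint actually exposes.

```yaml
groups:
  - name: stella-core-alerts             # hypothetical group name
    rules:
      - alert: StellaFeedStale
        # Hypothetical metric: Unix timestamp of the last successful feed merge.
        expr: time() - stella_feed_last_success_timestamp_seconds >= 48 * 3600
        labels:
          severity: warning
      - alert: StellaScanWallTimeP95High
        # Hypothetical histogram of scan wall-time in seconds.
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(stella_scan_duration_seconds_bucket[5m]))
          ) > 5
        for: 10m
        labels:
          severity: warning
      - alert: RedisMemoryHigh
        # redis_exporter metrics; the second clause skips instances with no maxmemory set.
        expr: |
          (redis_memory_used_bytes / redis_memory_max_bytes > 0.75)
          and (redis_memory_max_bytes > 0)
        for: 10m
        labels:
          severity: warning
```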

8 Update & patch strategy

| Layer | Cadence | Method |
|-------|---------|--------|
| Backend & CLI images | Monthly or CVE-driven | docker pull + docker compose up -d |
| Trivy DB | 24 h cron via FeedMerger | configurable (FeedMerger.Cron) |
| Docker Engine | vendor LTS | distro package manager |
| Host OS | security repos enabled | unattended-upgrades |

9 Incident-response workflow

  • Detect: PagerDuty alert from Prometheus or SIEM.
  • Contain: Stop affected Backend container; isolate Redis RDB snapshot.
  • Eradicate: Pull verified images, redeploy, rotate secrets.
  • Recover: Restore RDB, replay SBOMs if history lost.
  • Review: Post-mortem within 72 h; create follow-up issues.
  • Escalate P1 incidents to <security@stellaops.org> (24×7).

10 Pen-testing & continuous assurance

| Control | Frequency | Tool |
|---------|-----------|------|
| OWASP ZAP baseline | Each merge to main | GitHub Action zap-baseline-scan |
| Dependency scanning | Pull request | Trivy FS + GitHub Dependabot |
| External red-team | Annual or before GA | 3rd-party CREST-accredited vendor |
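
The first two rows could be wired into a GitHub Actions workflow roughly as shown. This is a sketch rather than the project's actual pipeline: the workflow name, the target URL, the severity filter, and the pinned action versions are assumptions to replace with values you have vetted.

```yaml
name: security-assurance                 # hypothetical workflow name
on:
  push:
    branches: [main]
  pull_request:

jobs:
  zap-baseline-scan:
    # Runs on each merge to main.
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: zaproxy/action-baseline@v0.12.0
        with:
          target: "https://staging.stella.example"   # hypothetical test deployment

  trivy-fs:
    # Runs on every pull request.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/trivy-action@0.28.0
        with:
          scan-type: fs
          scan-ref: .
          severity: HIGH,CRITICAL        # assumed threshold
          exit-code: "1"                 # fail the job when findings match
```

Dependabot is configured separately through `.github/dependabot.yml` and is not part of this workflow.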

11 Vulnerability disclosure & contact

  • Preferred channel: security@stellaops.org (GPG key on website).
  • Coordinated disclosure reward: public credit and swag (no monetary bounty at this time).

12 Change log

| Version | Date | Notes |
|---------|------|-------|
| v2.0 | 2025-07-12 | Full overhaul: host-OS baseline, supply-chain signing, removal of unnecessary subnets, role-based contact email, K8s guidance. |
| v1.1 | 2025-07-09 | Minor fence fixes. |
| v1.0 | 2025-07-09 | Original draft. |