Files
git.stella-ops.org/docs/17_SECURITY_HARDENING_GUIDE.md

187 lines
8.8 KiB
Markdown
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# 17 · Security Hardening Guide — **StellaOps**
*(v2.0  12Jul2025)*
> **Audience** — Sitereliability and platform teams deploying **the opensource Core** in production or restricted networks.
---
## 0TableofContents
1. Threat model (summary)
2. HostOS baseline
3. Container & runtime hardening
4. Networkplane guidance
5. Secrets & key management
6. Image, SBOM & plugin supplychain controls
7. Logging, monitoring & audit
8. Update & patch strategy
9. Incidentresponse workflow
10. Pentesting & continuous assurance
11. Contacts & vulnerability disclosure
12. Change log
---
## 1Threat model (summary)
| Asset | Threats | Mitigations |
| -------------------- | --------------------- | ---------------------------------------------------------------------- |
| SBOMs & scan results | Disclosure, tamper | TLSintransit, readonly Redis volume, RBAC, Cosignverified plugins |
| Backend container | RCE, codeinjection | Distroless image, nonroot UID, readonly FS, seccomp + `CAP_DROP:ALL` |
| Update artefacts | Supplychain attack | Cosignsigned images & SBOMs, enforced by admission controller |
| Admin credentials | Phishing, brute force | OAuth 2.0 with 12h token TTL, optional mTLS |
---
## 2HostOS baseline checklist
| Item | Recommended setting |
| ------------- | --------------------------------------------------------- |
| OS | Ubuntu22.04LTS (kernel5.15) or Alma9 |
| Patches | `unattendedupgrades` or vendorequivalent enabled |
| Filesystem | `noexec,nosuid` on `/tmp`, `/var/tmp` |
| Docker Engine | v24.*, API socket rootowned (`0660`) |
| Auditd | Watch `/etc/docker`, `/usr/bin/docker*` and Compose files |
| Time sync | `chrony` or `systemdtimesyncd` |
---
## 3Container & runtime hardening
### 3.1Docker Compose reference (`compose-core.yml`)
```yaml
services:
backend:
image: ghcr.io/stellaops/backend:1.5.0
user: "101:101" # nonroot
read_only: true
security_opt:
- "no-new-privileges:true"
- "seccomp:./seccomp-backend.json"
cap_drop: [ALL]
tmpfs:
- /tmp:size=64m,exec,nosymlink
environment:
- ASPNETCORE_URLS=https://+:8080
- TLSPROVIDER=OpenSslGost
depends_on: [redis]
networks: [core-net]
healthcheck:
test: ["CMD", "wget", "-qO-", "https://localhost:8080/health"]
interval: 30s
timeout: 5s
retries: 5
redis:
image: redis:7.2-alpine
command: ["redis-server", "--requirepass", "${REDIS_PASS}", "--rename-command", "FLUSHALL", ""]
user: "redis"
read_only: true
cap_drop: [ALL]
tmpfs:
- /data
networks: [core-net]
networks:
core-net:
driver: bridge
```
No dedicated “Redis” or “Mongo” subnets are declared; the single bridge network suffices for the default stack.
### 3.2Kubernetes deployment highlights
Use a separate NetworkPolicy that only allows egress from backend to Redis :6379.
securityContext: runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation: false, drop all capabilities.
PodDisruptionBudget of minAvailable: 1.
Optionally add CosignVerified=true label enforced by an admission controller (e.g. Kyverno or Connaisseur).
## 4Networkplane guidance
| Plane | Recommendation |
| ------------------ | -------------------------------------------------------------------------- |
| Northsouth | Terminate TLS 1.2+ (OpenSSLGOST default). Use LetsEncrypt or internal CA. |
| Eastwest | Compose bridge or K8s ClusterIP only; no public Redis/Mongo ports. |
| Ingress controller | Limit methods to GET, POST, PATCH (no TRACE). |
| Ratelimits | 40 rps default; tune ScannerPool.Workers and ingress limitreq to match. |
## 5Secrets & key management
| Secret | Storage | Rotation |
| --------------------------------- | ---------------------------------- | ----------------------------- |
| **ClientJWT (offline)** | `/var/lib/stella/tokens/client.jwt` (root:600) | **30days** provided by each OUK |
| REDIS_PASS | Docker/K8s secret | 90days |
| OAuth signing key | /keys/jwt.pem (readonly mount) | 180days |
| Cosign public key | /keys/cosign.pub baked into image; | change on every major release |
| Trivy DB mirror token (if remote) | Secret + readonly | 30days |
Never bake secrets into images; always inject at runtime.
> **Operational tip:** schedule a cron reminding ops 5days before
> `client.jwt` expiry. The backend also emits a Prometheus metric
> `stella_quota_token_days_remaining`.
## 6Image, SBOM & plugin supplychain controls
* Images — Pull by digest not latest; verify:
```bash
cosign verify ghcr.io/stellaops/backend@sha256:<DIGEST> \
--key https://stella-ops.org/keys/cosign.pub
```
* SBOM — Each release ships an SPDX file; store alongside images for audit.
* Thirdparty plugins — Place in /plugins/; backend will:
* Validate Cosign signature.
* Check [StellaPluginVersion("major.minor")].
* Refuse to start if Security.DisablePluginUnsigned=false (default).
## 7Logging, monitoring & audit
| Control | Implementation |
| ------------ | ----------------------------------------------------------------- |
| Log format | Serilog JSON; ship via FluentBit to ELK or Loki |
| Metrics | Prometheus /metrics endpoint; default Grafana dashboard in infra/ |
| Audit events | Redis stream audit; export daily to SIEM |
| Alert rules | Feed age 48h, P95 walltime>5s, Redis used memory>75% |
## 8Update & patch strategy
| Layer | Cadence | Method |
| -------------------- | -------------------------------------------------------- | ------------------------------ |
| Backend & CLI images | Monthly or CVEdriven docker pull + docker compose up -d |
| Trivy DB | 24h cron via FeedMerger | configurable (FeedMerger.Cron) |
| Docker Engine | vendor LTS | distro package manager |
| Host OS | security repos enabled | unattendedupgrades |
## 9Incidentresponse workflow
* Detect — PagerDuty alert from Prometheus or SIEM.
* Contain — Stop affected Backend container; isolate Redis RDB snapshot.
* Eradicate — Pull verified images, redeploy, rotate secrets.
* Recover — Restore RDB, replay SBOMs if history lost.
* Review — Postmortem within 72h; create followup issues.
* Escalate P1 incidents to <security@stellaops.org> (24×7).
## 10Pentesting & continuous assurance
| Control | Frequency | Tool |
|-------------------|-------------------|
| OWASP | ZAP baseline | Each merge to main GitHub Action zap-baseline-scan |
| Dependency scanning | Pull request | Trivy FS + GitHub Dependabot |
| External redteam | Annual or before GA | 3rdparty CRESTaccredited vendor |
## 11Vulnerability disclosure & contact
* Preferred channel: security@stellaops.org (GPG key on website).
* Coordinated disclosure reward: public credit and swag (no monetary bounty at this time).
## 12Change log
| Version | Date | Notes |
| ------- | ---------- | -------------------------------------------------------------------------------------------------------------------------------- |
| v2.0 | 20250712 | Full overhaul: hostOS baseline, supplychain signing, removal of unnecessary subnets, rolebased contact email, K8s guidance. |
| v1.1 | 20250709 | Minor fence fixes. |
| v1.0 | 20250709 | Original draft. |