Agent Security Model

Overview

Agents are trusted components that execute deployment tasks on targets. Their security model ensures:

  • Strong identity through mTLS certificates
  • Minimal privilege through scoped task credentials
  • Audit trail through signed task receipts
  • Isolation through process sandboxing

Agent Registration Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                    AGENT REGISTRATION FLOW                                  │
│                                                                             │
│  1. Admin generates registration token (one-time use)                       │
│     POST /api/v1/admin/agent-tokens                                        │
│     Response: { token: "reg_xxx", expiresAt: "..." }                       │
│                                                                             │
│  2. Agent starts with registration token                                    │
│     ./stella-agent --register --token=reg_xxx                              │
│                                                                             │
│  3. Agent requests mTLS certificate                                         │
│     POST /api/v1/agents/register                                           │
│     Headers: X-Registration-Token: reg_xxx                                 │
│     Body: { name, version, capabilities, csr }                             │
│     Response: { agentId, certificate, caCertificate }                      │
│                                                                             │
│  4. Agent establishes mTLS connection                                       │
│     Uses issued certificate for all subsequent requests                    │
│                                                                             │
│  5. Agent requests short-lived JWT for task execution                       │
│     POST /api/v1/agents/token (over mTLS)                                  │
│     Response: { token, expiresIn: 3600 }                                   │
│                                                                             │
│  6. Agent refreshes token before expiration                                 │
│     Token refresh only over mTLS connection                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
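
Steps 3 and 6 of the flow can be sketched as two small helpers. This is a minimal illustration, not the agent's actual code: the endpoint path, header name, and body fields come from the flow above, while the `RegistrationInfo` and `RequestSpec` types, the function names, and the 300-second refresh margin are assumptions made for the example.

```typescript
// Field names follow the registration flow above; the types are assumptions.
interface RegistrationInfo {
  name: string;
  version: string;
  capabilities: { [capability: string]: unknown };
  csr: string; // PEM-encoded certificate signing request
}

interface RequestSpec {
  method: "POST";
  path: string;
  headers: { [header: string]: string };
  body: string;
}

// Step 3: present the one-time token and a CSR to obtain a certificate.
function buildRegistrationRequest(token: string, info: RegistrationInfo): RequestSpec {
  if (token.length === 0) {
    throw new Error("registration token must not be empty");
  }
  return {
    method: "POST",
    path: "/api/v1/agents/register",
    headers: {
      "Content-Type": "application/json",
      "X-Registration-Token": token,
    },
    body: JSON.stringify(info),
  };
}

// Step 6: refresh once the JWT is within `marginSeconds` of expiry.
// The 300-second default margin is a hypothetical value.
function shouldRefreshToken(
  issuedAtMs: number,
  expiresInSeconds: number,
  nowMs: number,
  marginSeconds: number = 300,
): boolean {
  const expiresAtMs = issuedAtMs + expiresInSeconds * 1000;
  return nowMs >= expiresAtMs - marginSeconds * 1000;
}
```

Building the request as plain data keeps the sketch independent of any HTTP client; the caller would send it once over TLS for registration and use the mTLS connection for everything after step 4.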

mTLS Communication

All agent-to-core communication uses mutual TLS:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    AGENT COMMUNICATION SECURITY                             │
│                                                                             │
│  ┌──────────────┐                          ┌──────────────┐                │
│  │    AGENT     │                          │  STELLA CORE │                │
│  └──────┬───────┘                          └──────┬───────┘                │
│         │                                         │                         │
│         │  mTLS (mutual TLS)                      │                         │
│         │  - Agent cert signed by Stella CA       │                         │
│         │  - Server cert verified by Agent        │                         │
│         │  - TLS 1.3 only                         │                         │
│         │  - Perfect forward secrecy              │                         │
│         │◄────────────────────────────────────────►│                         │
│         │                                         │                         │
│         │  Encrypted payload                      │                         │
│         │  - Task payloads encrypted with         │                         │
│         │    agent-specific key                   │                         │
│         │  - Logs encrypted in transit            │                         │
│         │◄────────────────────────────────────────►│                         │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

TLS Requirements

| Requirement | Value |
|-------------|-------|
| Protocol | TLS 1.3 only |
| Cipher Suites | TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256 |
| Key Exchange | ECDHE with P-384 or X25519 |
| Certificate Key | RSA 4096-bit or ECDSA P-384 |
| Certificate Validity | 90 days (auto-renewed) |
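
The requirements above can be expressed as a small policy check. The sketch below is illustrative only: the constants are copied from the table, while `SessionSummary`, `findTlsViolations`, and the `TLSv1.3` spelling of the protocol string are assumptions about how a session might be reported.

```typescript
// Values copied from the requirements table.
const REQUIRED_PROTOCOL = "TLSv1.3";
const ALLOWED_CIPHER_SUITES: string[] = [
  "TLS_AES_256_GCM_SHA384",
  "TLS_CHACHA20_POLY1305_SHA256",
];
const MAX_CERT_VALIDITY_DAYS = 90;

// Hypothetical summary of a negotiated session and the peer certificate.
interface SessionSummary {
  protocol: string;
  cipherSuite: string;
  certValidityDays: number;
}

// Returns one message per violated requirement; an empty array means compliant.
function findTlsViolations(session: SessionSummary): string[] {
  const violations: string[] = [];
  if (session.protocol !== REQUIRED_PROTOCOL) {
    violations.push(`protocol ${session.protocol} is not ${REQUIRED_PROTOCOL}`);
  }
  if (ALLOWED_CIPHER_SUITES.indexOf(session.cipherSuite) === -1) {
    violations.push(`cipher suite ${session.cipherSuite} is not allowed`);
  }
  if (session.certValidityDays > MAX_CERT_VALIDITY_DAYS) {
    violations.push(
      `certificate validity ${session.certValidityDays}d exceeds ${MAX_CERT_VALIDITY_DAYS}d`,
    );
  }
  return violations;
}
```

Returning a list rather than a boolean lets the caller log every failed requirement at once instead of only the first.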

Certificate Management

Certificate Structure

interface AgentCertificate {
  subject: {
    CN: string;        // Agent name
    O: string;         // "Stella Ops"
    OU: string;        // Tenant ID
  };
  serialNumber: string;
  issuer: string;      // Stella CA
  validFrom: DateTime;
  validTo: DateTime;
  extensions: {
    keyUsage: ["digitalSignature", "keyEncipherment"];
    extendedKeyUsage: ["clientAuth"];
    subjectAltName: string[];  // Agent ID as URI
  };
}

Certificate Renewal

Agents automatically renew certificates before expiration:

  1. Agent detects certificate expiring within 30 days
  2. Agent generates new CSR with same identity
  3. Agent submits renewal request over existing mTLS connection
  4. Authority issues new certificate
  5. Agent transitions to new certificate seamlessly
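
The trigger in step 1 amounts to a date comparison. A minimal sketch, assuming the agent knows its certificate's `validTo` time; the function and state names are hypothetical, and what an agent should do with an already expired certificate is not specified here.

```typescript
const RENEWAL_WINDOW_DAYS = 30; // from step 1 above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type RenewalState = "valid" | "renew" | "expired";

// Classifies a certificate by how much of its validity period remains.
function renewalState(
  validTo: Date,
  now: Date,
  windowDays: number = RENEWAL_WINDOW_DAYS,
): RenewalState {
  const remainingMs = validTo.getTime() - now.getTime();
  if (remainingMs <= 0) {
    return "expired"; // the existing mTLS connection can no longer be relied on
  }
  if (remainingMs <= windowDays * MS_PER_DAY) {
    return "renew"; // inside the renewal window: generate a CSR and submit it
  }
  return "valid";
}
```

With 90-day certificates and a 30-day window, an agent that checks periodically has roughly a month of retries before the old certificate lapses.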

Secrets Management

Secrets are NEVER stored in the Stella database. Only vault references are stored.

┌─────────────────────────────────────────────────────────────────────────────┐
│                    SECRETS FLOW (NEVER STORED IN DB)                        │
│                                                                             │
│  ┌──────────────┐        ┌──────────────┐        ┌──────────────┐          │
│  │    VAULT     │        │  STELLA CORE │        │    AGENT     │          │
│  │  (Source)    │        │  (Broker)    │        │  (Consumer)  │          │
│  └──────┬───────┘        └──────┬───────┘        └──────┬───────┘          │
│         │                       │                       │                   │
│         │                       │ Task requires secret  │                   │
│         │                       │                       │                   │
│         │ Fetch with service   │                       │                   │
│         │ account token        │                       │                   │
│         │◄───────────────────────                       │                   │
│         │                       │                       │                   │
│         │ Return secret        │                       │                   │
│         │ (wrapped, short TTL) │                       │                   │
│         │────────────────────────►                       │                   │
│         │                       │                       │                   │
│         │                       │ Embed in task payload │                   │
│         │                       │ (encrypted)          │                   │
│         │                       │────────────────────────►                   │
│         │                       │                       │                   │
│         │                       │                       │ Decrypt           │
│         │                       │                       │ Use for task      │
│         │                       │                       │ Discard           │
│                                                                             │
│  Rules:                                                                     │
│  - Secrets NEVER stored in Stella database                                 │
│  - Only Vault references stored                                            │
│  - Secrets fetched at execution time only                                  │
│  - Secrets not logged (masked in logs)                                     │
│  - Secrets not persisted in agent memory beyond task scope                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
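
On the agent side, the rules above suggest a pattern in which a secret exists only inside the call that uses it and every log line passes through a mask. The following is a rough sketch under stated assumptions: `SecretRef`, `maskSecrets`, and `withSecret` are hypothetical names, the fetcher is synchronous for brevity, and the `***` mask is an arbitrary choice.

```typescript
// Hypothetical reference type: only this, never the value, would be stored.
interface SecretRef {
  vaultPath: string;
  key: string;
}

// Replaces every occurrence of each secret value before a message is logged.
function maskSecrets(message: string, secretValues: string[]): string {
  let masked = message;
  for (const value of secretValues) {
    if (value.length === 0) continue; // an empty value would match everywhere
    masked = masked.split(value).join("***");
  }
  return masked;
}

// Fetches the secret at execution time, exposes it only to `use`, and routes
// all logging through the mask. The secret is not stored anywhere else.
function withSecret<T>(
  ref: SecretRef,
  fetchSecret: (ref: SecretRef) => string,
  logSink: string[],
  use: (secret: string, log: (message: string) => void) => T,
): T {
  const secret = fetchSecret(ref);
  const log = (message: string): void => {
    logSink.push(maskSecrets(message, [secret]));
  };
  return use(secret, log);
}
```

JavaScript strings cannot be overwritten in place, so this only limits the secret's scope; an implementation that needs to wipe memory would more likely hold the value in a byte buffer and zero it after use.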

Task Security

Task Assignment

interface AgentTask {
  id: UUID;
  type: TaskType;
  targetId: UUID;
  payload: TaskPayload;
  credentials: EncryptedCredentials;  // Encrypted with agent's public key
  timeout: number;
  priority: TaskPriority;
  idempotencyKey: string;
  assignedAt: DateTime;
  expiresAt: DateTime;
}

Credential Scoping

Task credentials are:

  • Scoped to specific target only
  • Valid only for task duration
  • Encrypted with agent's public key
  • Logged when accessed (without values)
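
The target and duration checks, together with the value-free access log, can be sketched as one guard function. The `ScopedCredential` and `AccessLogEntry` shapes and the function name are assumptions for illustration; encryption is left out of this sketch.

```typescript
// Hypothetical shapes; the validity window mirrors assignedAt/expiresAt on a task.
interface ScopedCredential {
  targetId: string;
  validFrom: Date;
  validTo: Date;
}

// Records that an access happened and its outcome, never the credential value.
interface AccessLogEntry {
  targetId: string;
  at: Date;
  allowed: boolean;
}

function checkCredentialUse(
  credential: ScopedCredential,
  targetId: string,
  now: Date,
  accessLog: AccessLogEntry[],
): boolean {
  const inScope = credential.targetId === targetId;
  const inWindow =
    now.getTime() >= credential.validFrom.getTime() &&
    now.getTime() <= credential.validTo.getTime();
  const allowed = inScope && inWindow;
  accessLog.push({ targetId, at: now, allowed });
  return allowed;
}
```

Denied attempts are logged as well, which makes use of a credential against the wrong target visible in the audit trail.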

Task Execution Isolation

Agents execute each task in its own isolated context:

interface TaskExecutionContext {
  // Process isolation
  workingDirectory: string;      // Unique per task
  processUser: string;           // Non-root user
  networkNamespace: string;      // If network isolation enabled

  // Resource limits
  memoryLimit: number;           // Bytes
  cpuLimit: number;              // Millicores
  diskLimit: number;             // Bytes
  networkEgress: string[];       // Allowed destinations

  // Cleanup
  cleanupOnComplete: boolean;
  cleanupTimeout: number;
}
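
An agent could sanity-check a context before starting a task. The sketch below repeats the interface so that it is self-contained; `findContextViolations` and `isEgressAllowed` are hypothetical helpers, and exact-string matching of egress destinations is an assumption, since the matching rules are not specified here.

```typescript
// Same fields as TaskExecutionContext above, repeated for self-containment.
interface TaskExecutionContext {
  workingDirectory: string;
  processUser: string;
  networkNamespace: string;
  memoryLimit: number; // bytes
  cpuLimit: number; // millicores
  diskLimit: number; // bytes
  networkEgress: string[];
  cleanupOnComplete: boolean;
  cleanupTimeout: number;
}

// Returns one message per problem; an empty array means the context is usable.
function findContextViolations(ctx: TaskExecutionContext): string[] {
  const violations: string[] = [];
  if (ctx.processUser === "root") {
    violations.push("processUser must be a non-root user");
  }
  if (ctx.workingDirectory.length === 0) {
    violations.push("workingDirectory must be set");
  }
  if (!(ctx.memoryLimit > 0)) violations.push("memoryLimit must be positive");
  if (!(ctx.cpuLimit > 0)) violations.push("cpuLimit must be positive");
  if (!(ctx.diskLimit > 0)) violations.push("diskLimit must be positive");
  return violations;
}

// Assumed semantics: a destination is allowed only if it is listed verbatim.
function isEgressAllowed(ctx: TaskExecutionContext, destination: string): boolean {
  return ctx.networkEgress.indexOf(destination) !== -1;
}
```

An empty `networkEgress` list denies all outbound traffic under these assumed semantics, which is the safer default for a deployment task that only talks to its target.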

Agent Capabilities

Agents declare capabilities that determine what tasks they can execute:

interface AgentCapabilities {
  docker?: DockerCapability;
  compose?: ComposeCapability;
  ssh?: SshCapability;
  winrm?: WinrmCapability;
  ecs?: EcsCapability;
  nomad?: NomadCapability;
}

interface DockerCapability {
  version: string;
  apiVersion: string;
  runtimes: string[];
  registryAuth: boolean;
}

interface ComposeCapability {
  version: string;
  fileFormats: string[];
}
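
Task routing can then compare what a task needs with what an agent declared. A minimal sketch: the capability names come from `AgentCapabilities` above, but the simplified `DeclaredCapabilities` type, the helper names, and the idea that a task lists the capabilities it requires are assumptions.

```typescript
type CapabilityName = "docker" | "compose" | "ssh" | "winrm" | "ecs" | "nomad";

// Simplified stand-in for AgentCapabilities: only presence matters here.
type DeclaredCapabilities = { [K in CapabilityName]?: { version?: string } };

// Lists the required capabilities the agent did not declare.
function missingCapabilities(
  declared: DeclaredCapabilities,
  required: CapabilityName[],
): CapabilityName[] {
  return required.filter((name) => declared[name] === undefined);
}

// An agent is eligible only when nothing required is missing.
function canExecute(declared: DeclaredCapabilities, required: CapabilityName[]): boolean {
  return missingCapabilities(declared, required).length === 0;
}
```

For example, an agent that declares `docker` and `compose` would be eligible for a task requiring both, and `missingCapabilities` would report `ssh` for a task that also needs SSH access.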

Heartbeat Protocol

interface AgentHeartbeat {
  agentId: UUID;
  timestamp: DateTime;
  status: "healthy" | "degraded";
  resourceUsage: {
    cpuPercent: number;
    memoryPercent: number;
    diskPercent: number;
    networkRxBytes: number;
    networkTxBytes: number;
  };
  activeTaskCount: number;
  completedTasks: number;
  failedTasks: number;
  errors: string[];
  signature: string;  // HMAC of heartbeat data
}

Heartbeat Validation

  1. Verify signature matches expected HMAC
  2. Check timestamp is within acceptable skew (30s)
  3. Update agent status based on heartbeat content
  4. Trigger alerts if heartbeat missing for >90s
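
Steps 1, 2, and 4 can be sketched as follows. The 30-second skew and 90-second alert threshold come from the list above; everything else is an assumption: the `SignedHeartbeat` shape (in particular a `payload` string holding the exact bytes that were signed), the verdict names, and the injected `computeHmac` function, which in a Node.js service might be backed by `crypto.createHmac` with a per-agent key.

```typescript
const MAX_SKEW_MS = 30 * 1000; // step 2
const MISSING_AFTER_MS = 90 * 1000; // step 4

// Hypothetical wire shape: `payload` is the serialized heartbeat that was signed.
interface SignedHeartbeat {
  payload: string;
  timestamp: string; // ISO 8601
  signature: string;
}

type HeartbeatVerdict = "accepted" | "bad-signature" | "clock-skew";

// Compares two strings without returning early on the first differing character.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

function validateHeartbeat(
  heartbeat: SignedHeartbeat,
  computeHmac: (payload: string) => string,
  nowMs: number,
): HeartbeatVerdict {
  if (!constantTimeEqual(heartbeat.signature, computeHmac(heartbeat.payload))) {
    return "bad-signature";
  }
  const skewMs = Math.abs(nowMs - Date.parse(heartbeat.timestamp));
  if (isNaN(skewMs) || skewMs > MAX_SKEW_MS) {
    return "clock-skew"; // also covers an unparseable timestamp
  }
  return "accepted";
}

// Step 4: true once no heartbeat has been seen for more than 90 seconds.
function isHeartbeatMissing(lastSeenMs: number, nowMs: number): boolean {
  return nowMs - lastSeenMs > MISSING_AFTER_MS;
}
```

The signature is checked before the timestamp so that unauthenticated input is rejected first; the timestamp check then limits how long a captured heartbeat could be replayed.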

Agent Revocation

When an agent is compromised or decommissioned:

  1. Certificate added to CRL (Certificate Revocation List)
  2. All pending tasks for agent cancelled
  3. Agent removed from target assignments
  4. Audit event logged
  5. New agent can be registered with same name (new identity)
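
Steps 1 to 4 can be modelled as a single state transition. This is only a sketch over an in-memory structure; `OrchestratorState`, its fields, and `revokeAgent` are hypothetical and do not describe how Stella persists this data.

```typescript
interface PendingTask {
  id: string;
  agentId: string;
  status: "pending" | "cancelled";
}

// Hypothetical in-memory view of the state touched by a revocation.
interface OrchestratorState {
  crl: string[]; // revoked certificate serial numbers
  tasks: PendingTask[];
  targetAssignments: { [targetId: string]: string[] }; // targetId -> agent IDs
  auditLog: string[];
}

// Returns a new state with the agent revoked; the input state is not mutated.
function revokeAgent(
  state: OrchestratorState,
  agentId: string,
  certSerial: string,
  reason: "compromised" | "decommissioned",
): OrchestratorState {
  const targetAssignments: { [targetId: string]: string[] } = {};
  for (const targetId of Object.keys(state.targetAssignments)) {
    const agents = state.targetAssignments[targetId] || [];
    targetAssignments[targetId] = agents.filter((id) => id !== agentId); // step 3
  }
  return {
    crl: state.crl.indexOf(certSerial) === -1 ? [...state.crl, certSerial] : state.crl, // step 1
    tasks: state.tasks.map((task) =>
      task.agentId === agentId && task.status === "pending"
        ? { ...task, status: "cancelled" as const } // step 2
        : task,
    ),
    targetAssignments,
    auditLog: [
      ...state.auditLog,
      `agent ${agentId} revoked (${reason}), certificate ${certSerial}`, // step 4
    ],
  };
}
```

Because revocation is keyed on the certificate serial number rather than the agent name, a replacement agent registered under the same name receives a new certificate and is unaffected by the CRL entry.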

Security Checklist

| Control | Implementation |
|---------|----------------|
| Identity | mTLS certificates signed by internal CA |
| Authentication | Certificate-based + short-lived JWT |
| Authorization | Task-scoped credentials |
| Encryption | TLS 1.3 for transport, envelope encryption for secrets |
| Isolation | Process sandboxing, resource limits |
| Audit | All task assignments and completions logged |
| Revocation | CRL for compromised agents |
| Secret handling | Vault integration, no persistence |
