Agent Security Model
Overview
Agents are trusted components that execute deployment tasks on targets. Their security model ensures:
- Strong identity through mTLS certificates
- Minimal privilege through scoped task credentials
- Audit trail through signed task receipts
- Isolation through process sandboxing
Agent Registration Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENT REGISTRATION FLOW │
│ │
│ 1. Admin generates registration token (one-time use) │
│ POST /api/v1/admin/agent-tokens │
│ Response: { token: "reg_xxx", expiresAt: "..." } │
│ │
│ 2. Agent starts with registration token │
│ ./stella-agent --register --token=reg_xxx │
│ │
│ 3. Agent requests mTLS certificate │
│ POST /api/v1/agents/register │
│ Headers: X-Registration-Token: reg_xxx │
│ Body: { name, version, capabilities, csr } │
│ Response: { agentId, certificate, caCertificate } │
│ │
│ 4. Agent establishes mTLS connection │
│ Uses issued certificate for all subsequent requests │
│ │
│ 5. Agent requests short-lived JWT for task execution │
│ POST /api/v1/agents/token (over mTLS) │
│ Response: { token, expiresIn: 3600 } │
│ │
│ 6. Agent refreshes token before expiration │
│ Token refresh only over mTLS connection │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
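Steps 5 and 6 imply that the agent schedules its refresh some time before the `expiresIn` deadline. A minimal sketch of that timing decision follows. It assumes `expiresIn` is in seconds and that refreshing at 80% of the lifetime is acceptable; the helper name `refreshDelayMs` and the ratio are illustrative and not part of the Stella API.

```typescript
// Shape of the response in step 5; `expiresIn` is assumed to be in seconds.
interface TokenResponse {
  token: string;
  expiresIn: number;
}

// Hypothetical helper: how long to wait before refreshing the task JWT.
// Refreshing at a fraction of the lifetime leaves headroom for retries.
function refreshDelayMs(response: TokenResponse, refreshRatio = 0.8): number {
  if (!(response.expiresIn > 0)) {
    throw new RangeError("expiresIn must be positive");
  }
  if (!(refreshRatio > 0 && refreshRatio < 1)) {
    throw new RangeError("refreshRatio must be between 0 and 1 (exclusive)");
  }
  return Math.floor(response.expiresIn * refreshRatio * 1000);
}
```

An agent could pass the result to a timer and request a new token over the existing mTLS connection when it fires.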
mTLS Communication
All agent-to-core communication uses mutual TLS:
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENT COMMUNICATION SECURITY │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ AGENT │ │ STELLA CORE │ │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ │ mTLS (mutual TLS) │ │
│ │ - Agent cert signed by Stella CA │ │
│ │ - Server cert verified by Agent │ │
│ │ - TLS 1.3 only │ │
│ │ - Perfect forward secrecy │ │
│ │◄────────────────────────────────────────►│ │
│ │ │ │
│ │ Encrypted payload │ │
│ │ - Task payloads encrypted with │ │
│ │ agent-specific key │ │
│ │ - Logs encrypted in transit │ │
│ │◄────────────────────────────────────────►│ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
TLS Requirements
| Requirement | Value |
|---|---|
| Protocol | TLS 1.3 only |
| Cipher Suites | TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256 |
| Key Exchange | ECDHE with P-384 or X25519 |
| Certificate Key | RSA 4096-bit or ECDSA P-384 |
| Certificate Validity | 90 days (auto-renewed) |
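As an illustration, the table above can be expressed as client connection options. The sketch below uses the option names accepted by Node.js `tls.connect` (`minVersion`, `maxVersion`, `ciphers`, `ecdhCurve`, `rejectUnauthorized`, `cert`, `key`, `ca`). That the agent runs on Node.js is an assumption, and `buildAgentTlsOptions` is a hypothetical helper.

```typescript
// Field names mirror the options accepted by Node.js `tls.connect`.
// Assumption: the agent is a Node.js process; the source does not say so.
interface AgentTlsOptions {
  minVersion: string;
  maxVersion: string;
  ciphers: string;
  ecdhCurve: string;
  rejectUnauthorized: boolean;
  cert: string;
  key: string;
  ca: string;
}

// Hypothetical helper: build options from PEM-encoded material.
function buildAgentTlsOptions(certPem: string, keyPem: string, caPem: string): AgentTlsOptions {
  return {
    minVersion: "TLSv1.3", // pinning both bounds yields "TLS 1.3 only"
    maxVersion: "TLSv1.3",
    ciphers: ["TLS_AES_256_GCM_SHA384", "TLS_CHACHA20_POLY1305_SHA256"].join(":"),
    ecdhCurve: "X25519:P-384",
    rejectUnauthorized: true, // the agent must verify the server certificate
    cert: certPem, // agent certificate issued at registration
    key: keyPem,   // private key matching the CSR
    ca: caPem,     // Stella CA certificate returned at registration
  };
}
```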
Certificate Management
Certificate Structure
interface AgentCertificate {
  subject: {
    CN: string; // Agent name
    O: string; // "Stella Ops"
    OU: string; // Tenant ID
  };
  serialNumber: string;
  issuer: string; // Stella CA
  validFrom: DateTime;
  validTo: DateTime;
  extensions: {
    keyUsage: ["digitalSignature", "keyEncipherment"];
    extendedKeyUsage: ["clientAuth"];
    subjectAltName: string[]; // Agent ID as URI
  };
}
Certificate Renewal
Agents automatically renew certificates before expiration:
- Agent detects certificate expiring within 30 days
- Agent generates new CSR with same identity
- Agent submits renewal request over existing mTLS connection
- Stella CA issues the new certificate
- Agent transitions to new certificate seamlessly
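The first step, detecting that renewal is due, reduces to a date comparison. A minimal sketch, assuming the agent can read `validTo` from its current certificate; the function name `needsRenewal` is hypothetical, while the 30-day window comes from the list above.

```typescript
const RENEWAL_WINDOW_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when the certificate expires within the renewal window
// (or has already expired).
function needsRenewal(
  validTo: Date,
  now: Date = new Date(),
  windowDays: number = RENEWAL_WINDOW_DAYS,
): boolean {
  const remainingMs = validTo.getTime() - now.getTime();
  return remainingMs <= windowDays * MS_PER_DAY;
}
```

With 90-day certificates, this check starts returning true roughly 60 days after issuance.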
Secrets Management
Secrets are NEVER stored in the Stella database. Only vault references are stored.
┌─────────────────────────────────────────────────────────────────────────────┐
│ SECRETS FLOW (NEVER STORED IN DB) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ VAULT │ │ STELLA CORE │ │ AGENT │ │
│ │ (Source) │ │ (Broker) │ │ (Consumer) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ │ │ Task requires secret │ │
│ │ │ │ │
│ │ Fetch with service │ │ │
│ │ account token │ │ │
│ │◄─────────────────────── │ │
│ │ │ │ │
│ │ Return secret │ │ │
│ │ (wrapped, short TTL) │ │ │
│ │────────────────────────► │ │
│ │ │ │ │
│ │ │ Embed in task payload │ │
│ │ │ (encrypted) │ │
│ │ │────────────────────────► │
│ │ │ │ │
│ │ │ │ Decrypt │
│ │ │ │ Use for task │
│ │ │ │ Discard │
│ │
│ Rules: │
│ - Secrets NEVER stored in Stella database │
│ - Only Vault references stored │
│ - Secrets fetched at execution time only │
│ - Secrets not logged (masked in logs) │
│ - Secrets not persisted in agent memory beyond task scope │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
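The rule that secrets are masked in logs can be sketched as a filter applied to each log line before it is written. This is an illustration only: `maskSecrets` is a hypothetical helper, and it assumes the agent knows the plaintext values of the secrets in scope for the current task.

```typescript
// Replace every occurrence of each known secret value with a mask.
function maskSecrets(line: string, secrets: readonly string[], mask = "***"): string {
  // Mask longer secrets first so that a secret which contains another
  // secret as a substring is not left partially visible.
  const ordered = [...secrets].sort((a, b) => b.length - a.length);
  let masked = line;
  for (const secret of ordered) {
    if (secret.length === 0) continue; // an empty "secret" would match everywhere
    masked = masked.split(secret).join(mask);
  }
  return masked;
}
```

Exact-match masking does not catch encoded or transformed copies of a secret (for example Base64), so it complements the rule of not logging secrets in the first place.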
Task Security
Task Assignment
interface AgentTask {
  id: UUID;
  type: TaskType;
  targetId: UUID;
  payload: TaskPayload;
  credentials: EncryptedCredentials; // Encrypted with agent's public key
  timeout: number;
  priority: TaskPriority;
  idempotencyKey: string;
  assignedAt: DateTime;
  expiresAt: DateTime;
}
Credential Scoping
Task credentials are:
- Scoped to specific target only
- Valid only for task duration
- Encrypted with agent's public key
- Logged when accessed (without values)
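The first two properties can be checked before a credential is ever decrypted. The sketch below assumes that "task duration" corresponds to the `assignedAt`..`expiresAt` window of the task; the `CredentialGrant` type and `isGrantUsable` function are hypothetical.

```typescript
// Hypothetical view of the scoping metadata that accompanies task credentials.
interface CredentialGrant {
  targetId: string;
  assignedAt: Date;
  expiresAt: Date;
}

// A grant is usable only for its own target and only inside its time window.
function isGrantUsable(grant: CredentialGrant, targetId: string, now: Date): boolean {
  const t = now.getTime();
  return (
    grant.targetId === targetId &&
    t >= grant.assignedAt.getTime() &&
    t < grant.expiresAt.getTime()
  );
}
```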
Task Execution Isolation
Agents execute each task in an isolated context:
interface TaskExecutionContext {
  // Process isolation
  workingDirectory: string; // Unique per task
  processUser: string; // Non-root user
  networkNamespace: string; // If network isolation enabled

  // Resource limits
  memoryLimit: number; // Bytes
  cpuLimit: number; // Millicores
  diskLimit: number; // Bytes
  networkEgress: string[]; // Allowed destinations

  // Cleanup
  cleanupOnComplete: boolean;
  cleanupTimeout: number;
}
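An agent could sanity-check a context before starting a task. The following sketch covers only the constraints stated in the comments above (non-root user, numeric limits). The `IsolationSettings` subset and `findIsolationViolations` are hypothetical, and comparing against the literal user name `root` is a simplification.

```typescript
// Subset of TaskExecutionContext that this check needs.
interface IsolationSettings {
  processUser: string;
  memoryLimit: number; // Bytes
  cpuLimit: number;    // Millicores
  diskLimit: number;   // Bytes
}

// Returns a list of human-readable violations; an empty list means the
// settings pass these basic checks.
function findIsolationViolations(ctx: IsolationSettings): string[] {
  const violations: string[] = [];
  if (ctx.processUser === "root") {
    violations.push("processUser must not be root");
  }
  const limits: [string, number][] = [
    ["memoryLimit", ctx.memoryLimit],
    ["cpuLimit", ctx.cpuLimit],
    ["diskLimit", ctx.diskLimit],
  ];
  for (const [name, value] of limits) {
    // `!(value > 0)` also rejects NaN.
    if (!(value > 0)) violations.push(`${name} must be a positive number`);
  }
  return violations;
}
```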
Agent Capabilities
Agents declare capabilities that determine what tasks they can execute:
interface AgentCapabilities {
  docker?: DockerCapability;
  compose?: ComposeCapability;
  ssh?: SshCapability;
  winrm?: WinrmCapability;
  ecs?: EcsCapability;
  nomad?: NomadCapability;
}

interface DockerCapability {
  version: string;
  apiVersion: string;
  runtimes: string[];
  registryAuth: boolean;
}

interface ComposeCapability {
  version: string;
  fileFormats: string[];
}
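Because every capability is optional, a scheduler can select agents by checking which keys are present. A minimal sketch, assuming each task maps to exactly one required capability; how `TaskType` maps to a capability is not specified in this document, so the caller passes the capability name directly. `eligibleAgents` and the simplified types are hypothetical.

```typescript
type CapabilityName = "docker" | "compose" | "ssh" | "winrm" | "ecs" | "nomad";

// Simplified stand-in for AgentCapabilities: presence of a key is what matters here.
type DeclaredCapabilities = Partial<Record<CapabilityName, object>>;

interface RegisteredAgent {
  agentId: string;
  capabilities: DeclaredCapabilities;
}

// Return the IDs of agents that declared the required capability.
function eligibleAgents(agents: RegisteredAgent[], required: CapabilityName): string[] {
  return agents
    .filter((agent) => agent.capabilities[required] !== undefined)
    .map((agent) => agent.agentId);
}
```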
Heartbeat Protocol
interface AgentHeartbeat {
  agentId: UUID;
  timestamp: DateTime;
  status: "healthy" | "degraded";
  resourceUsage: {
    cpuPercent: number;
    memoryPercent: number;
    diskPercent: number;
    networkRxBytes: number;
    networkTxBytes: number;
  };
  activeTaskCount: number;
  completedTasks: number;
  failedTasks: number;
  errors: string[];
  signature: string; // HMAC of heartbeat data
}
Heartbeat Validation
- Verify signature matches expected HMAC
- Check timestamp is within acceptable skew (30s)
- Update agent status based on heartbeat content
- Trigger an alert if no heartbeat arrives for more than 90s
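The signature, skew, and liveness checks above can be sketched with Node.js `node:crypto`. Several details are assumptions because the document does not specify them: HMAC-SHA256 as the algorithm, hex encoding of the signature, a shared per-agent key, and a reduced set of signed fields joined with newlines. All type and function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_SKEW_MS = 30_000;      // acceptable clock skew (30s)
const MISSING_AFTER_MS = 90_000; // alert threshold (>90s)

// Reduced heartbeat for illustration; the real AgentHeartbeat has more fields.
interface HeartbeatEnvelope {
  agentId: string;
  timestamp: string; // ISO 8601
  status: "healthy" | "degraded";
  signature: string; // hex-encoded HMAC (assumed encoding)
}

// Assumed canonical form: fixed field order, newline separated.
function signHeartbeat(hb: Omit<HeartbeatEnvelope, "signature">, key: string): string {
  const input = [hb.agentId, hb.timestamp, hb.status].join("\n");
  return createHmac("sha256", key).update(input).digest("hex");
}

type HeartbeatVerdict = "ok" | "bad-signature" | "clock-skew";

function validateHeartbeat(hb: HeartbeatEnvelope, key: string, nowMs: number): HeartbeatVerdict {
  const expected = Buffer.from(signHeartbeat(hb, key), "hex");
  const actual = Buffer.from(hb.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return "bad-signature";
  }
  const skewMs = Math.abs(nowMs - Date.parse(hb.timestamp));
  if (Number.isNaN(skewMs) || skewMs > MAX_SKEW_MS) {
    return "clock-skew";
  }
  return "ok";
}

// True when the last accepted heartbeat is older than the alert threshold.
function isHeartbeatMissing(lastSeenMs: number, nowMs: number): boolean {
  return nowMs - lastSeenMs > MISSING_AFTER_MS;
}
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature were correct.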
Agent Revocation
When an agent is compromised or decommissioned:
- Certificate added to CRL (Certificate Revocation List)
- All pending tasks for agent cancelled
- Agent removed from target assignments
- Audit event logged
- New agent can be registered with same name (new identity)
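The revocation steps can be modelled as a single state transition. The sketch below uses an in-memory stand-in for the control plane; `ControlPlaneState`, `revokeAgent`, and the audit message format are all hypothetical and only illustrate the order of effects listed above.

```typescript
// Hypothetical in-memory model of the state touched by revocation.
interface ControlPlaneState {
  revokedSerials: string[]; // stands in for the CRL
  pendingTasks: { taskId: string; agentId: string }[];
  targetAssignments: { targetId: string; agentId: string }[];
  auditLog: string[];
}

// Returns a new state; the input is left unmodified.
function revokeAgent(state: ControlPlaneState, agentId: string, certSerial: string): ControlPlaneState {
  const remainingTasks = state.pendingTasks.filter((t) => t.agentId !== agentId);
  const cancelled = state.pendingTasks.length - remainingTasks.length;
  const alreadyRevoked = state.revokedSerials.indexOf(certSerial) >= 0;
  return {
    // 1. Add the certificate to the revocation list (idempotent).
    revokedSerials: alreadyRevoked ? state.revokedSerials : [...state.revokedSerials, certSerial],
    // 2. Cancel all pending tasks for the agent.
    pendingTasks: remainingTasks,
    // 3. Remove the agent from target assignments.
    targetAssignments: state.targetAssignments.filter((a) => a.agentId !== agentId),
    // 4. Record an audit event.
    auditLog: [
      ...state.auditLog,
      `agent.revoked agentId=${agentId} serial=${certSerial} cancelledTasks=${cancelled}`,
    ],
  };
}
```

Registering a replacement agent under the same name then goes through the normal registration flow and yields a new agent ID and certificate.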
Security Checklist
| Control | Implementation |
|---|---|
| Identity | mTLS certificates signed by internal CA |
| Authentication | Certificate-based + short-lived JWT |
| Authorization | Task-scoped credentials |
| Encryption | TLS 1.3 for transport, envelope encryption for secrets |
| Isolation | Process sandboxing, resource limits |
| Audit | All task assignments and completions logged |
| Revocation | CRL for compromised agents |
| Secret handling | Vault integration, no persistence |