release orchestrator pivot, architecture and planning
286
docs/modules/release-orchestrator/security/agent-security.md
Normal file
@@ -0,0 +1,286 @@
|
||||
# Agent Security Model
|
||||
|
||||
## Overview

Agents are trusted components that execute deployment tasks on targets. Their security model ensures:

- Strong identity through mTLS certificates
- Minimal privilege through scoped task credentials
- Audit trail through signed task receipts
- Isolation through process sandboxing

## Agent Registration Flow
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ AGENT REGISTRATION FLOW │
|
||||
│ │
|
||||
│ 1. Admin generates registration token (one-time use) │
|
||||
│ POST /api/v1/admin/agent-tokens │
|
||||
│ Response: { token: "reg_xxx", expiresAt: "..." } │
|
||||
│ │
|
||||
│ 2. Agent starts with registration token │
|
||||
│ ./stella-agent --register --token=reg_xxx │
|
||||
│ │
|
||||
│ 3. Agent requests mTLS certificate │
|
||||
│ POST /api/v1/agents/register │
|
||||
│ Headers: X-Registration-Token: reg_xxx │
|
||||
│ Body: { name, version, capabilities, csr } │
|
||||
│ Response: { agentId, certificate, caCertificate } │
|
||||
│ │
|
||||
│ 4. Agent establishes mTLS connection │
|
||||
│ Uses issued certificate for all subsequent requests │
|
||||
│ │
|
||||
│ 5. Agent requests short-lived JWT for task execution │
|
||||
│ POST /api/v1/agents/token (over mTLS) │
|
||||
│ Response: { token, expiresIn: 3600 } │
|
||||
│ │
|
||||
│ 6. Agent refreshes token before expiration │
|
||||
│ Token refresh only over mTLS connection │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
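
The one-time property of the registration token (steps 1 and 3) can be sketched as follows. This is a minimal illustration, not the Authority's implementation: `RegistrationTokenStore`, `RedeemResult`, and the in-memory map are hypothetical names, and a real store would need persistence and atomic updates.

```typescript
// Hypothetical record for a registration token issued by an admin.
interface RegistrationToken {
  token: string;
  expiresAt: number; // epoch milliseconds
  consumed: boolean;
}

type RedeemResult = "ok" | "unknown_token" | "expired" | "already_used";

class RegistrationTokenStore {
  private readonly tokens: Record<string, RegistrationToken> = {};

  // Step 1: an admin issues a token with a fixed time-to-live.
  issue(token: string, ttlMs: number, now: number): RegistrationToken {
    const record: RegistrationToken = { token, expiresAt: now + ttlMs, consumed: false };
    this.tokens[token] = record;
    return record;
  }

  // Step 3: the token is checked and consumed in one call, so a second use fails.
  redeem(token: string, now: number): RedeemResult {
    const record = this.tokens[token];
    if (record === undefined) return "unknown_token";
    if (record.consumed) return "already_used";
    if (now >= record.expiresAt) return "expired";
    record.consumed = true;
    return "ok";
  }
}

const store = new RegistrationTokenStore();
store.issue("reg_example", 60000, 0);
const firstUse = store.redeem("reg_example", 1000);
const secondUse = store.redeem("reg_example", 2000);
```

Consuming the token in the same call that validates it is what makes a replayed registration request fail, even inside the token's validity window.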
|
||||
|
||||
## mTLS Communication
|
||||
|
||||
All agent-to-core communication uses mutual TLS:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ AGENT COMMUNICATION SECURITY │
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ AGENT │ │ STELLA CORE │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │
|
||||
│ │ mTLS (mutual TLS) │ │
|
||||
│ │ - Agent cert signed by Stella CA │ │
|
||||
│ │ - Server cert verified by Agent │ │
|
||||
│ │ - TLS 1.3 only │ │
|
||||
│ │ - Perfect forward secrecy │ │
|
||||
│ │◄────────────────────────────────────────►│ │
|
||||
│ │ │ │
|
||||
│ │ Encrypted payload │ │
|
||||
│ │ - Task payloads encrypted with │ │
|
||||
│ │ agent-specific key │ │
|
||||
│ │ - Logs encrypted in transit │ │
|
||||
│ │◄────────────────────────────────────────►│ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### TLS Requirements

| Requirement | Value |
|-------------|-------|
| Protocol | TLS 1.3 only |
| Cipher Suites | `TLS_AES_256_GCM_SHA384`, `TLS_CHACHA20_POLY1305_SHA256` |
| Key Exchange | ECDHE with P-384 or X25519 |
| Certificate Key | RSA 4096-bit or ECDSA P-384 |
| Certificate Validity | 90 days (auto-renewed) |

## Certificate Management
|
||||
|
||||
### Certificate Structure
|
||||
|
||||
```typescript
|
||||
interface AgentCertificate {
|
||||
subject: {
|
||||
CN: string; // Agent name
|
||||
O: string; // "Stella Ops"
|
||||
OU: string; // Tenant ID
|
||||
};
|
||||
serialNumber: string;
|
||||
issuer: string; // Stella CA
|
||||
validFrom: DateTime;
|
||||
validTo: DateTime;
|
||||
extensions: {
|
||||
keyUsage: ["digitalSignature", "keyEncipherment"];
|
||||
extendedKeyUsage: ["clientAuth"];
|
||||
subjectAltName: string[]; // Agent ID as URI
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### Certificate Renewal

Agents renew their certificates automatically before expiration:

1. The agent detects that its certificate expires within 30 days.
2. The agent generates a new CSR with the same identity.
3. The agent submits the renewal request over its existing mTLS connection.
4. The Authority issues a new certificate.
5. The agent transitions to the new certificate seamlessly.

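The renewal trigger in step 1 amounts to a date comparison. A minimal sketch, assuming the 30-day window above and the 90-day validity from the TLS requirements; `needsRenewal` and the constant names are hypothetical:

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Renewal window taken from step 1 of the renewal procedure.
const RENEWAL_WINDOW_DAYS = 30;

// Hypothetical helper: true when the certificate should be renewed now.
// An already expired certificate also returns true.
function needsRenewal(validTo: Date, now: Date, windowDays: number = RENEWAL_WINDOW_DAYS): boolean {
  const remainingMs = validTo.getTime() - now.getTime();
  return remainingMs <= windowDays * MS_PER_DAY;
}

// A certificate issued with the documented 90-day validity.
const issued = new Date("2025-01-01T00:00:00Z");
const validTo = new Date(issued.getTime() + 90 * MS_PER_DAY);

const day10 = new Date(issued.getTime() + 10 * MS_PER_DAY); // 80 days remaining
const day65 = new Date(issued.getTime() + 65 * MS_PER_DAY); // 25 days remaining
const renewEarly = needsRenewal(validTo, day10);
const renewLate = needsRenewal(validTo, day65);
```

With a 90-day validity and a 30-day window, renewal becomes due on day 60, which leaves a month to retry if the Authority is unreachable.
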
## Secrets Management
|
||||
|
||||
Secrets are NEVER stored in the Stella database. Only vault references are stored.
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ SECRETS FLOW (NEVER STORED IN DB) │
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ VAULT │ │ STELLA CORE │ │ AGENT │ │
|
||||
│ │ (Source) │ │ (Broker) │ │ (Consumer) │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │ │
|
||||
│ │ │ Task requires secret │ │
|
||||
│ │ │ │ │
|
||||
│ │ Fetch with service │ │ │
|
||||
│ │ account token │ │ │
|
||||
│ │◄─────────────────────── │ │
|
||||
│ │ │ │ │
|
||||
│ │ Return secret │ │ │
|
||||
│ │ (wrapped, short TTL) │ │ │
|
||||
│ │────────────────────────► │ │
|
||||
│ │ │ │ │
|
||||
│ │ │ Embed in task payload │ │
|
||||
│ │ │ (encrypted) │ │
|
||||
│ │ │────────────────────────► │
|
||||
│ │ │ │ │
|
||||
│ │ │ │ Decrypt │
|
||||
│ │ │ │ Use for task │
|
||||
│ │ │ │ Discard │
|
||||
│ │
|
||||
│ Rules: │
|
||||
│ - Secrets NEVER stored in Stella database │
|
||||
│ - Only Vault references stored │
|
||||
│ - Secrets fetched at execution time only │
|
||||
│ - Secrets not logged (masked in logs) │
|
||||
│ - Secrets not persisted in agent memory beyond task scope │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
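
On the agent side, the "use, then discard" and "masked in logs" rules might look like the sketch below. Both helpers are hypothetical. Zeroing a byte buffer is best-effort only: JavaScript strings are immutable and garbage-collected, so a secret that has been converted to a string cannot be wiped this way. `withSecret` also assumes a synchronous callback.

```typescript
// Hypothetical helper: runs `use` with the decrypted secret, then overwrites the buffer.
function withSecret<T>(secret: Uint8Array, use: (secret: Uint8Array) => T): T {
  try {
    return use(secret);
  } finally {
    // Best-effort discard: zero the bytes even if `use` throws.
    for (let i = 0; i < secret.length; i++) {
      secret[i] = 0;
    }
  }
}

// Hypothetical log helper: replaces every occurrence of a known secret value with a mask.
function maskSecrets(line: string, secretValues: string[]): string {
  let masked = line;
  for (const value of secretValues) {
    if (value.length > 0) {
      masked = masked.split(value).join("***");
    }
  }
  return masked;
}

const secretBytes = new Uint8Array([115, 51, 99, 114, 51, 116]); // bytes of "s3cr3t"
const observedLength = withSecret(secretBytes, (s) => s.length);
const logLine = maskSecrets("docker login -p s3cr3t registry.example", ["s3cr3t"]);
```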
|
||||
|
||||
## Task Security
|
||||
|
||||
### Task Assignment
|
||||
|
||||
```typescript
|
||||
interface AgentTask {
|
||||
id: UUID;
|
||||
type: TaskType;
|
||||
targetId: UUID;
|
||||
payload: TaskPayload;
|
||||
credentials: EncryptedCredentials; // Encrypted with agent's public key
|
||||
timeout: number;
|
||||
priority: TaskPriority;
|
||||
idempotencyKey: string;
|
||||
assignedAt: DateTime;
|
||||
expiresAt: DateTime;
|
||||
}
|
||||
```
|
||||
|
||||
### Credential Scoping

Task credentials are:

- Scoped to a single target
- Valid only for the duration of the task
- Encrypted with the agent's public key
- Logged on access (values are never logged)

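A sketch of how the target and time scoping could be enforced before a credential is used. `ScopedCredential`, `checkCredential`, and `accessLogEntry` are hypothetical; the source states only that credentials are target-scoped, task-limited, and logged without values.

```typescript
// Hypothetical shape: a credential bound to one target and one task window.
interface ScopedCredential {
  targetId: string;
  notBefore: number; // epoch ms, task assignment time
  notAfter: number;  // epoch ms, task expiry
}

type CredentialCheck = "ok" | "wrong_target" | "not_yet_valid" | "expired";

function checkCredential(cred: ScopedCredential, targetId: string, now: number): CredentialCheck {
  if (cred.targetId !== targetId) return "wrong_target";
  if (now < cred.notBefore) return "not_yet_valid";
  if (now > cred.notAfter) return "expired";
  return "ok";
}

// Access is logged by reference only; the credential value never appears in the entry.
function accessLogEntry(taskId: string, cred: ScopedCredential, result: CredentialCheck): string {
  return "task=" + taskId + " target=" + cred.targetId + " credential_access=" + result;
}

const cred: ScopedCredential = { targetId: "target-a", notBefore: 1000, notAfter: 5000 };
const inScope = checkCredential(cred, "target-a", 2000);
const otherTarget = checkCredential(cred, "target-b", 2000);
const afterTask = checkCredential(cred, "target-a", 6000);
const entry = accessLogEntry("task-1", cred, inScope);
```
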
### Task Execution Isolation
|
||||
|
||||
Agents execute tasks with isolation:
|
||||
```typescript
|
||||
interface TaskExecutionContext {
|
||||
// Process isolation
|
||||
workingDirectory: string; // Unique per task
|
||||
processUser: string; // Non-root user
|
||||
networkNamespace: string; // If network isolation enabled
|
||||
|
||||
// Resource limits
|
||||
memoryLimit: number; // Bytes
|
||||
cpuLimit: number; // Millicores
|
||||
diskLimit: number; // Bytes
|
||||
networkEgress: string[]; // Allowed destinations
|
||||
|
||||
// Cleanup
|
||||
cleanupOnComplete: boolean;
|
||||
cleanupTimeout: number;
|
||||
}
|
||||
```
|
||||
|
||||
## Agent Capabilities
|
||||
|
||||
Agents declare capabilities that determine what tasks they can execute:
|
||||
|
||||
```typescript
|
||||
interface AgentCapabilities {
|
||||
docker?: DockerCapability;
|
||||
compose?: ComposeCapability;
|
||||
ssh?: SshCapability;
|
||||
winrm?: WinrmCapability;
|
||||
ecs?: EcsCapability;
|
||||
nomad?: NomadCapability;
|
||||
}
|
||||
|
||||
interface DockerCapability {
|
||||
version: string;
|
||||
apiVersion: string;
|
||||
runtimes: string[];
|
||||
registryAuth: boolean;
|
||||
}
|
||||
|
||||
interface ComposeCapability {
|
||||
version: string;
|
||||
fileFormats: string[];
|
||||
}
|
||||
```
|
||||
|
||||
## Heartbeat Protocol
|
||||
|
||||
```typescript
|
||||
interface AgentHeartbeat {
|
||||
agentId: UUID;
|
||||
timestamp: DateTime;
|
||||
status: "healthy" | "degraded";
|
||||
resourceUsage: {
|
||||
cpuPercent: number;
|
||||
memoryPercent: number;
|
||||
diskPercent: number;
|
||||
networkRxBytes: number;
|
||||
networkTxBytes: number;
|
||||
};
|
||||
activeTaskCount: number;
|
||||
completedTasks: number;
|
||||
failedTasks: number;
|
||||
errors: string[];
|
||||
signature: string; // HMAC of heartbeat data
|
||||
}
|
||||
```
|
||||
|
||||
### Heartbeat Validation

1. Verify that the signature matches the expected HMAC.
2. Check that the timestamp is within the acceptable clock skew (30 s).
3. Update the agent status based on the heartbeat content.
4. Trigger alerts if no heartbeat arrives for more than 90 s.

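Steps 1, 2, and 4 can be sketched with Node's `crypto` module. The source does not specify the HMAC algorithm, the key distribution, or the canonical byte form of a heartbeat, so HMAC-SHA256, the hex encoding, the `|`-joined field order, and every function name below are assumptions made for illustration.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_SKEW_MS = 30000;      // step 2: acceptable clock skew
const MISSING_AFTER_MS = 90000; // step 4: alert threshold

// Reduced heartbeat with only the fields this sketch signs.
interface SignedHeartbeat {
  agentId: string;
  timestamp: number; // epoch ms
  status: "healthy" | "degraded";
  signature: string; // hex-encoded HMAC-SHA256 (assumed encoding)
}

// Assumed canonical form: the signed fields joined with "|".
function canonicalize(hb: Omit<SignedHeartbeat, "signature">): string {
  return [hb.agentId, String(hb.timestamp), hb.status].join("|");
}

function signHeartbeat(hb: Omit<SignedHeartbeat, "signature">, key: string): string {
  return createHmac("sha256", key).update(canonicalize(hb)).digest("hex");
}

type HeartbeatCheck = "ok" | "bad_signature" | "clock_skew";

function validateHeartbeat(hb: SignedHeartbeat, key: string, now: number): HeartbeatCheck {
  const expected = Buffer.from(signHeartbeat(hb, key), "hex");
  const actual = Buffer.from(hb.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return "bad_signature";
  }
  if (Math.abs(now - hb.timestamp) > MAX_SKEW_MS) return "clock_skew";
  return "ok";
}

// Step 4: true once the gap since the last heartbeat exceeds 90 s.
function isMissing(lastSeen: number, now: number): boolean {
  return now - lastSeen > MISSING_AFTER_MS;
}

const key = "shared-secret-for-demo";
const body = { agentId: "agent-1", timestamp: 1000000, status: "healthy" as const };
const good: SignedHeartbeat = { ...body, signature: signHeartbeat(body, key) };
const fresh = validateHeartbeat(good, key, 1005000);
```

The signature is checked before the timestamp so that an unauthenticated sender learns nothing about the server clock.
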
## Agent Revocation

When an agent is compromised or decommissioned:

1. Its certificate is added to the CRL (Certificate Revocation List).
2. All pending tasks for the agent are cancelled.
3. The agent is removed from target assignments.
4. An audit event is logged.
5. A new agent can be registered under the same name; it receives a new identity.

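Steps 1 to 4 could be applied together roughly as below. This is a sketch over hypothetical in-memory state (`OrchestratorState`); the real CRL, task queue, and assignment storage are not described in the source, and a production version would run the steps in one transaction.

```typescript
interface PendingTask {
  id: string;
  agentId: string;
  status: "pending" | "cancelled";
}

// Hypothetical in-memory stand-in for the CRL, task queue, and assignment tables.
interface OrchestratorState {
  crl: string[];                               // revoked certificate serial numbers
  tasks: PendingTask[];
  targetAssignments: Record<string, string[]>; // targetId -> agentIds
  auditLog: string[];
}

function revokeAgent(state: OrchestratorState, agentId: string, certSerial: string, reason: string): void {
  // 1. Add the certificate to the CRL (idempotent).
  if (state.crl.indexOf(certSerial) === -1) state.crl.push(certSerial);

  // 2. Cancel all pending tasks for the agent.
  for (const task of state.tasks) {
    if (task.agentId === agentId && task.status === "pending") task.status = "cancelled";
  }

  // 3. Remove the agent from every target assignment.
  for (const targetId of Object.keys(state.targetAssignments)) {
    const assigned = state.targetAssignments[targetId] ?? [];
    state.targetAssignments[targetId] = assigned.filter((id) => id !== agentId);
  }

  // 4. Record an audit event.
  state.auditLog.push("agent.revoked agent=" + agentId + " serial=" + certSerial + " reason=" + reason);
}

const state: OrchestratorState = {
  crl: [],
  tasks: [
    { id: "t1", agentId: "agent-1", status: "pending" },
    { id: "t2", agentId: "agent-2", status: "pending" },
  ],
  targetAssignments: { "target-a": ["agent-1", "agent-2"] },
  auditLog: [],
};
revokeAgent(state, "agent-1", "serial-01", "decommissioned");
```
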
## Security Checklist

| Control | Implementation |
|---------|----------------|
| Identity | mTLS certificates signed by internal CA |
| Authentication | Certificate-based + short-lived JWT |
| Authorization | Task-scoped credentials |
| Encryption | TLS 1.3 for transport, envelope encryption for secrets |
| Isolation | Process sandboxing, resource limits |
| Audit | All task assignments and completions logged |
| Revocation | CRL for compromised agents |
| Secret handling | Vault integration, no persistence |

## References

- [Security Overview](overview.md)
- [Authentication & Authorization](auth.md)
- [Threat Model](threat-model.md)

305
docs/modules/release-orchestrator/security/auth.md
Normal file
@@ -0,0 +1,305 @@
|
||||
# Authentication & Authorization
|
||||
|
||||
## Authentication Methods
|
||||
|
||||
### OAuth 2.0 for Human Users
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────────────────┐
|
||||
│ OAUTH 2.0 AUTHORIZATION CODE FLOW │
|
||||
│ │
|
||||
│ ┌──────────┐ ┌──────────────┐ │
|
||||
│ │ Browser │ │ Authority │ │
|
||||
│ └────┬─────┘ └──────┬───────┘ │
|
||||
│ │ │ │
|
||||
│ │ 1. Login request │ │
|
||||
│ │ ────────────────────────────────────► │ │
|
||||
│ │ │ │
|
||||
│ │ 2. Redirect to IdP │ │
|
||||
│ │ ◄──────────────────────────────────── │ │
|
||||
│ │ │ │
|
||||
│ │ 3. User authenticates at IdP │ │
|
||||
│ │ ─────────────────────────────────► │ │
|
||||
│ │ │ │
|
||||
│ │ 4. IdP callback with code │ │
|
||||
│ │ ◄──────────────────────────────────── │ │
|
||||
│ │ │ │
|
||||
│ │ 5. Exchange code for tokens │ │
|
||||
│ │ ────────────────────────────────────► │ │
|
||||
│ │ │ │
|
||||
│ │ 6. Access token + refresh token │ │
|
||||
│ │ ◄──────────────────────────────────── │ │
|
||||
│ │ │ │
|
||||
└──────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### mTLS for Agents

Agents authenticate using mutual TLS with certificates issued by Stella's internal CA.

**Registration Flow:**

1. An admin generates a one-time registration token.
2. The agent starts with the registration token.
3. The agent submits a CSR (Certificate Signing Request).
4. The Authority issues a certificate signed by the Stella CA.
5. The agent uses the certificate for all subsequent requests.

### API Keys for Service-to-Service

External services can use API keys for programmatic access:

- Keys are tenant-scoped
- Keys can have restricted permissions
- Keys can have expiration dates
- Key usage is audited

## JWT Token Structure
|
||||
|
||||
### Access Token Claims
|
||||
|
||||
```typescript
|
||||
interface AccessTokenClaims {
|
||||
// Standard claims
|
||||
iss: string; // "https://authority.stella.local"
|
||||
sub: string; // User ID
|
||||
aud: string[]; // ["stella-api"]
|
||||
exp: number; // Expiration timestamp
|
||||
iat: number; // Issued at timestamp
|
||||
jti: string; // Unique token ID
|
||||
|
||||
// Custom claims
|
||||
tenant_id: string;
|
||||
roles: string[];
|
||||
permissions: Permission[];
|
||||
email?: string;
|
||||
name?: string;
|
||||
}
|
||||
```
|
||||
|
||||
### Token Lifetimes

| Token Type | Lifetime | Refresh |
|------------|----------|---------|
| Access Token | 15 minutes | Via refresh token |
| Refresh Token | 7 days | Rotated on use |
| Agent Token | 1 hour | Via mTLS connection |
| API Key | Configurable | Not refreshed |

## Authorization Model
|
||||
|
||||
### Resource Types
|
||||
|
||||
```typescript
|
||||
type ResourceType =
|
||||
| "environment"
|
||||
| "release"
|
||||
| "promotion"
|
||||
| "target"
|
||||
| "agent"
|
||||
| "workflow"
|
||||
| "plugin"
|
||||
| "integration"
|
||||
| "evidence";
|
||||
```
|
||||
|
||||
### Action Types
|
||||
|
||||
```typescript
|
||||
type ActionType =
|
||||
| "create"
|
||||
| "read"
|
||||
| "update"
|
||||
| "delete"
|
||||
| "execute"
|
||||
| "approve"
|
||||
| "deploy"
|
||||
| "rollback";
|
||||
```
|
||||
|
||||
### Permission Structure
|
||||
|
||||
```typescript
|
||||
interface Permission {
|
||||
resource: ResourceType;
|
||||
action: ActionType;
|
||||
scope?: PermissionScope;
|
||||
conditions?: Condition[];
|
||||
}
|
||||
|
||||
type PermissionScope =
|
||||
| "*" // All resources
|
||||
| { environmentId: UUID } // Specific environment
|
||||
| { labels: Record<string, string> }; // Label-based
|
||||
```
|
||||
|
||||
### Built-in Roles

| Role | Description | Key Permissions |
|------|-------------|-----------------|
| `admin` | Full access | All permissions |
| `release_manager` | Manage releases and promotions | Create releases, request promotions |
| `deployer` | Execute deployments | Approve promotions (where allowed), view releases |
| `approver` | Approve promotions | Approve promotions (SoD respected) |
| `viewer` | Read-only access | Read all resources |
| `agent` | Agent service account | Execute deployment tasks |

### Role Definitions
|
||||
|
||||
```typescript
|
||||
const roles = {
|
||||
admin: {
|
||||
permissions: [
|
||||
{ resource: "*", action: "*" }
|
||||
]
|
||||
},
|
||||
release_manager: {
|
||||
permissions: [
|
||||
{ resource: "release", action: "create" },
|
||||
{ resource: "release", action: "read" },
|
||||
{ resource: "release", action: "update" },
|
||||
{ resource: "promotion", action: "create" },
|
||||
{ resource: "promotion", action: "read" },
|
||||
{ resource: "environment", action: "read" },
|
||||
{ resource: "workflow", action: "read" },
|
||||
{ resource: "workflow", action: "execute" }
|
||||
]
|
||||
},
|
||||
deployer: {
|
||||
permissions: [
|
||||
{ resource: "release", action: "read" },
|
||||
{ resource: "promotion", action: "read" },
|
||||
{ resource: "promotion", action: "approve" },
|
||||
{ resource: "environment", action: "read" },
|
||||
{ resource: "target", action: "read" },
|
||||
{ resource: "agent", action: "read" }
|
||||
]
|
||||
},
|
||||
approver: {
|
||||
permissions: [
|
||||
{ resource: "promotion", action: "read" },
|
||||
{ resource: "promotion", action: "approve" },
|
||||
{ resource: "release", action: "read" },
|
||||
{ resource: "environment", action: "read" }
|
||||
]
|
||||
},
|
||||
viewer: {
|
||||
permissions: [
|
||||
{ resource: "*", action: "read" }
|
||||
]
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
## Environment-Scoped Permissions
|
||||
|
||||
Permissions can be scoped to specific environments:
|
||||
|
||||
```typescript
|
||||
// User can approve promotions only to staging
|
||||
{
|
||||
resource: "promotion",
|
||||
action: "approve",
|
||||
scope: { environmentId: "staging-env-id" }
|
||||
}
|
||||
|
||||
// User can deploy only to targets with specific labels
|
||||
{
|
||||
resource: "target",
|
||||
action: "deploy",
|
||||
scope: { labels: { "tier": "frontend" } }
|
||||
}
|
||||
```
|
||||
|
||||
## Separation of Duties (SoD)

When SoD is enabled for an environment:

- The user who requested a promotion cannot approve it
- The user who created a release cannot be the sole approver
- Approval records include SoD verification status

```typescript
|
||||
interface ApprovalValidation {
|
||||
promotionId: UUID;
|
||||
approverId: UUID;
|
||||
requesterId: UUID;
|
||||
sodRequired: boolean;
|
||||
sodSatisfied: boolean;
|
||||
validationResult: "valid" | "self_approval_denied" | "sod_violation";
|
||||
}
|
||||
```
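
One way the SoD rules could produce an `ApprovalValidation` record is sketched below. The mapping of rules to result codes is an assumption: self-approval by the requester yields `self_approval_denied`, and a release creator acting as sole approver yields `sod_violation`. `ApprovalRequest`, `releaseCreatorId`, and `otherApproverIds` are hypothetical inputs not defined in the source.

```typescript
type UUID = string;
type ValidationResult = "valid" | "self_approval_denied" | "sod_violation";

interface ApprovalValidation {
  promotionId: UUID;
  approverId: UUID;
  requesterId: UUID;
  sodRequired: boolean;
  sodSatisfied: boolean;
  validationResult: ValidationResult;
}

// Hypothetical input: everything needed to evaluate one approval attempt.
interface ApprovalRequest {
  promotionId: UUID;
  approverId: UUID;
  requesterId: UUID;
  releaseCreatorId: UUID;
  otherApproverIds: UUID[]; // approvals already recorded for this promotion
  sodRequired: boolean;
}

function validateApproval(req: ApprovalRequest): ApprovalValidation {
  let result: ValidationResult = "valid";
  if (req.sodRequired) {
    // Approvals from anyone other than the release creator.
    const independent = req.otherApproverIds.filter((id) => id !== req.releaseCreatorId);
    if (req.approverId === req.requesterId) {
      // Rule 1: the requester cannot approve their own promotion.
      result = "self_approval_denied";
    } else if (req.approverId === req.releaseCreatorId && independent.length === 0) {
      // Rule 2: the release creator cannot be the sole approver.
      result = "sod_violation";
    }
  }
  return {
    promotionId: req.promotionId,
    approverId: req.approverId,
    requesterId: req.requesterId,
    sodRequired: req.sodRequired,
    // Vacuously true when SoD is not required for the environment.
    sodSatisfied: result === "valid",
    validationResult: result,
  };
}

const selfApproval = validateApproval({
  promotionId: "promo-1",
  approverId: "alice",
  requesterId: "alice",
  releaseCreatorId: "bob",
  otherApproverIds: [],
  sodRequired: true,
});
```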
|
||||
|
||||
## Permission Checking Algorithm

```typescript
async function checkPermission(
  userId: UUID,
  resource: ResourceType,
  action: ActionType,
  resourceId?: UUID
): Promise<boolean> {
  // 1. Get user's roles and direct permissions
  const userRoles = await getUserRoles(userId);
  const userPermissions = await getUserPermissions(userId);

  // 2. Expand role permissions. A role without an entry in `roles`
  //    (for example `agent`) contributes no permissions instead of throwing.
  const rolePermissions = userRoles.flatMap(r => roles[r]?.permissions ?? []);
  const allPermissions = [...rolePermissions, ...userPermissions];

  // 3. Grant on the first permission whose resource, action, scope,
  //    and conditions all match
  for (const perm of allPermissions) {
    if (matchesResource(perm.resource, resource) &&
        matchesAction(perm.action, action) &&
        matchesScope(perm.scope, resourceId) &&
        evaluateConditions(perm.conditions)) {
      return true;
    }
  }

  // 4. Deny by default
  return false;
}

// "*" is a wildcard that matches every resource type
function matchesResource(pattern: string, resource: string): boolean {
  return pattern === "*" || pattern === resource;
}

// "*" is a wildcard that matches every action
function matchesAction(pattern: string, action: string): boolean {
  return pattern === "*" || pattern === action;
}
```

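`checkPermission` calls `matchesScope` without defining it. One possible reading is sketched below. It departs from the call shown in the source: instead of a bare `resourceId`, it takes a hypothetical `ResourceContext` holding the environment and labels of the resource, on the assumption that the caller resolves these from the ID first.

```typescript
type UUID = string;

type PermissionScope =
  | "*"
  | { environmentId: UUID }
  | { labels: Record<string, string> };

// Hypothetical lookup result: attributes of the resource being accessed.
interface ResourceContext {
  environmentId?: UUID;
  labels?: Record<string, string>;
}

function matchesScope(scope: PermissionScope | undefined, ctx: ResourceContext): boolean {
  // A missing scope or "*" applies to every resource.
  if (scope === undefined || scope === "*") return true;

  // Environment scope: the resource must belong to that environment.
  if ("environmentId" in scope) return scope.environmentId === ctx.environmentId;

  // Label scope: every label in the scope must be present with the same value.
  const required = scope.labels;
  const actual = ctx.labels ?? {};
  return Object.keys(required).every((name) => actual[name] === required[name]);
}

const frontendOnly: PermissionScope = { labels: { tier: "frontend" } };
const allowed = matchesScope(frontendOnly, { labels: { tier: "frontend", region: "eu" } });
const denied = matchesScope(frontendOnly, { labels: { tier: "backend" } });
```

Label matching is a subset test here, so extra labels on the resource do not prevent a match. Whether the real implementation uses subset or exact matching is not stated in the source.
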
## API Authorization Headers
|
||||
|
||||
All API requests require:
|
||||
```http
|
||||
Authorization: Bearer <access_token>
|
||||
```
|
||||
|
||||
For agent requests (over mTLS):
|
||||
```http
|
||||
X-Agent-Id: <agent_id>
|
||||
Authorization: Bearer <agent_token>
|
||||
```
|
||||
|
||||
## Permission Denied Response
|
||||
|
||||
```json
|
||||
{
|
||||
"success": false,
|
||||
"error": {
|
||||
"code": "PERMISSION_DENIED",
|
||||
"message": "User does not have permission to approve promotions to production",
|
||||
"details": {
|
||||
"resource": "promotion",
|
||||
"action": "approve",
|
||||
"scope": { "environmentId": "prod-env-id" },
|
||||
"requiredRoles": ["admin", "approver"],
|
||||
"userRoles": ["viewer"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## References

- [Security Overview](overview.md)
- [Agent Security](agent-security.md)
- [Authority Module](../../../authority/architecture.md)

281
docs/modules/release-orchestrator/security/overview.md
Normal file
@@ -0,0 +1,281 @@
|
||||
# Security Architecture Overview
|
||||
|
||||
## Security Principles

| Principle | Implementation |
|-----------|----------------|
| **Defense in depth** | Multiple layers: network, auth, authz, audit |
| **Least privilege** | Role-based access; minimal permissions |
| **Zero trust** | All requests authenticated; mTLS for agents |
| **Secrets hygiene** | Secrets in vault; never in DB; ephemeral injection |
| **Audit everything** | All mutations logged; evidence trail |
| **Immutable evidence** | Evidence packets append-only; cryptographically signed |

## Authentication Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ AUTHENTICATION ARCHITECTURE │
|
||||
│ │
|
||||
│ Human Users Service/Agent │
|
||||
│ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ Browser │ │ Agent │ │
|
||||
│ └────┬─────┘ └────┬─────┘ │
|
||||
│ │ │ │
|
||||
│ │ OAuth 2.0 │ mTLS + JWT │
|
||||
│ │ Authorization Code │ │
|
||||
│ ▼ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ AUTHORITY MODULE │ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
|
||||
│ │ │ OAuth 2.0 │ │ mTLS │ │ API Key │ │ │
|
||||
│ │ │ Provider │ │ Validator │ │ Validator │ │ │
|
||||
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
|
||||
│ │ │ TOKEN ISSUER │ │ │
|
||||
│ │ │ - Short-lived JWT (15 min) │ │ │
|
||||
│ │ │ - Contains: user_id, tenant_id, roles, permissions │ │ │
|
||||
│ │ │ - Signed with RS256 │ │ │
|
||||
│ │ └─────────────────────────────────────────────────────────────┘ │ │
|
||||
│ └──────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌──────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ API GATEWAY │ │
|
||||
│ │ │ │
|
||||
│ │ - Validate JWT signature │ │
|
||||
│ │ - Check token expiration │ │
|
||||
│ │ - Extract tenant context │ │
|
||||
│ │ - Enforce rate limits │ │
|
||||
│ └──────────────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Authorization Model
|
||||
|
||||
### Permission Structure
|
||||
|
||||
```typescript
|
||||
interface Permission {
|
||||
resource: ResourceType;
|
||||
action: ActionType;
|
||||
scope?: ScopeType;
|
||||
conditions?: Condition[];
|
||||
}
|
||||
|
||||
type ResourceType =
|
||||
| "environment"
|
||||
| "release"
|
||||
| "promotion"
|
||||
| "target"
|
||||
| "agent"
|
||||
| "workflow"
|
||||
| "plugin"
|
||||
| "integration"
|
||||
| "evidence";
|
||||
|
||||
type ActionType =
|
||||
| "create"
|
||||
| "read"
|
||||
| "update"
|
||||
| "delete"
|
||||
| "execute"
|
||||
| "approve"
|
||||
| "deploy"
|
||||
| "rollback";
|
||||
|
||||
type ScopeType =
|
||||
| "*" // All resources
|
||||
| { environmentId: UUID } // Specific environment
|
||||
| { labels: Record<string, string> }; // Label-based
|
||||
```
|
||||
|
||||
### Role Definitions

| Role | Permissions |
|------|-------------|
| `admin` | All permissions on all resources |
| `release_manager` | Full access to releases, promotions; read environments/targets |
| `deployer` | Read releases; create/read promotions; read targets |
| `approver` | Read/approve promotions |
| `viewer` | Read-only access to all resources |

### Environment-Scoped Roles
|
||||
|
||||
Roles can be scoped to specific environments:
|
||||
|
||||
```typescript
|
||||
// Example: Production deployer can only deploy to production
|
||||
const prodDeployer = {
|
||||
role: "deployer",
|
||||
scope: { environmentId: "prod-environment-uuid" }
|
||||
};
|
||||
```
|
||||
|
||||
## Policy Enforcement Points
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ POLICY ENFORCEMENT POINTS │
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ API LAYER (PEP 1) │ │
|
||||
│ │ - Authenticate request │ │
|
||||
│ │ - Check resource-level permissions │ │
|
||||
│ │ - Enforce tenant isolation │ │
|
||||
│ └─────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ SERVICE LAYER (PEP 2) │ │
|
||||
│ │ - Check business-level permissions │ │
|
||||
│ │ - Validate separation of duties │ │
|
||||
│ │ - Enforce approval policies │ │
|
||||
│ └─────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ DECISION ENGINE (PEP 3) │ │
|
||||
│ │ - Evaluate security gates │ │
|
||||
│ │ - Evaluate custom OPA policies │ │
|
||||
│ │ - Produce signed decision records │ │
|
||||
│ └─────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ DATA LAYER (PEP 4) │ │
|
||||
│ │ - Row-level security (tenant_id) │ │
|
||||
│ │ - Append-only enforcement (evidence) │ │
|
||||
│ │ - Encryption at rest │ │
|
||||
│ └─────────────────────────────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Agent Security Model

See [Agent Security](agent-security.md) for the detailed agent security architecture.

Key features:

- mTLS authentication with CA-signed certificates
- One-time registration tokens
- Short-lived JWT for task execution
- Encrypted task payloads
- Scoped credentials per task

## Secrets Management
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ SECRETS FLOW (NEVER STORED IN DB) │
|
||||
│ │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ VAULT │ │ STELLA CORE │ │ AGENT │ │
|
||||
│ │ (Source) │ │ (Broker) │ │ (Consumer) │ │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │ │
|
||||
│ │ │ Task requires secret │ │
|
||||
│ │ │ │ │
|
||||
│ │ Fetch with service │ │ │
|
||||
│ │ account token │ │ │
|
||||
│ │◄─────────────────────── │ │
|
||||
│ │ │ │ │
|
||||
│ │ Return secret │ │ │
|
||||
│ │ (wrapped, short TTL) │ │ │
|
||||
│ │───────────────────────► │ │
|
||||
│ │ │ │ │
|
||||
│ │ │ Embed in task payload │ │
|
||||
│ │ │ (encrypted) │ │
|
||||
│ │ │───────────────────────► │
|
||||
│ │ │ │ │
|
||||
│ │ │ │ Decrypt │
|
||||
│ │ │ │ Use for task │
|
||||
│ │ │ │ Discard │
|
||||
│ │
|
||||
│ Rules: │
|
||||
│ - Secrets NEVER stored in Stella database │
|
||||
│ - Only Vault references stored │
|
||||
│ - Secrets fetched at execution time only │
|
||||
│ - Secrets not logged (masked in logs) │
|
||||
│ - Secrets not persisted in agent memory beyond task scope │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Threat Model

| Threat | Attack Vector | Mitigation |
|--------|---------------|------------|
| **Credential theft** | Database breach | Secrets never in DB; only vault refs |
| **Token replay** | Stolen JWT | Short-lived tokens (15 min); refresh tokens rotated |
| **Agent impersonation** | Fake agent | mTLS with CA-signed certs; registration token one-time |
| **Digest tampering** | Modified image | Digest verification at pull time; mismatch = failure |
| **Evidence tampering** | Modified audit records | Append-only table; cryptographic signing |
| **Privilege escalation** | Compromised account | Role-based access; SoD enforcement; audit logs |
| **Supply chain attack** | Malicious plugin | Plugin sandbox; capability declarations; review process |
| **Lateral movement** | Compromised target | Short-lived task credentials; scoped permissions |
| **Data exfiltration** | Log/artifact theft | Encryption at rest; network segmentation |
| **Denial of service** | Resource exhaustion | Rate limiting; resource quotas; circuit breakers |

## Audit Trail
|
||||
|
||||
### Audit Event Structure
|
||||
|
||||
```typescript
|
||||
interface AuditEvent {
|
||||
id: UUID;
|
||||
timestamp: DateTime;
|
||||
tenantId: UUID;
|
||||
|
||||
// Actor
|
||||
actorType: "user" | "agent" | "system" | "plugin";
|
||||
actorId: UUID;
|
||||
actorName: string;
|
||||
actorIp?: string;
|
||||
|
||||
// Action
|
||||
action: string; // "promotion.approved", "deployment.started"
|
||||
resource: string; // "promotion"
|
||||
resourceId: UUID;
|
||||
|
||||
// Context
|
||||
environmentId?: UUID;
|
||||
releaseId?: UUID;
|
||||
promotionId?: UUID;
|
||||
|
||||
// Details
|
||||
before?: object; // State before (for updates)
|
||||
after?: object; // State after
|
||||
metadata?: object; // Additional context
|
||||
|
||||
// Integrity
|
||||
previousEventHash: string; // Hash chain for tamper detection
|
||||
eventHash: string;
|
||||
}
|
||||
```
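
The `previousEventHash` / `eventHash` pair forms a hash chain. A minimal sketch of appending to and verifying such a chain follows. SHA-256, the JSON serialization of the hashed fields, the all-zero genesis value, and the reduced `ChainedEvent` shape are assumptions; the source does not say which fields are covered by the hash.

```typescript
import { createHash } from "node:crypto";

// Reduced event: only the fields this sketch includes in the hash.
interface ChainedEvent {
  id: string;
  action: string;
  previousEventHash: string;
  eventHash: string;
}

// Assumed previousEventHash for the first event in a chain.
const GENESIS_HASH = "0".repeat(64);

function hashEvent(id: string, action: string, previousEventHash: string): string {
  // Assumed serialization: a JSON array of the hashed fields in fixed order.
  const payload = JSON.stringify([id, action, previousEventHash]);
  return createHash("sha256").update(payload).digest("hex");
}

function appendEvent(chain: ChainedEvent[], id: string, action: string): ChainedEvent {
  const last = chain[chain.length - 1];
  const previousEventHash = last !== undefined ? last.eventHash : GENESIS_HASH;
  const event: ChainedEvent = {
    id,
    action,
    previousEventHash,
    eventHash: hashEvent(id, action, previousEventHash),
  };
  chain.push(event);
  return event;
}

// Returns the index of the first broken link, or -1 when the whole chain verifies.
function findBreak(chain: ChainedEvent[]): number {
  let expectedPrevious = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const event = chain[i];
    if (event === undefined) return i;
    if (event.previousEventHash !== expectedPrevious) return i;
    if (event.eventHash !== hashEvent(event.id, event.action, event.previousEventHash)) return i;
    expectedPrevious = event.eventHash;
  }
  return -1;
}

const chain: ChainedEvent[] = [];
appendEvent(chain, "e1", "promotion.requested");
appendEvent(chain, "e2", "promotion.approved");
appendEvent(chain, "e3", "deployment.started");
const intact = findBreak(chain);
```

Because each hash covers the previous one, editing or deleting an earlier event invalidates every later link, which is what makes tampering detectable without trusting the database.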
|
||||
|
||||
### Audited Operations

| Category | Operations |
|----------|------------|
| **Authentication** | Login, logout, token refresh, failed attempts |
| **Authorization** | Permission denied events |
| **Environments** | Create, update, delete, freeze window changes |
| **Releases** | Create, deprecate, archive |
| **Promotions** | Request, approve, reject, cancel |
| **Deployments** | Start, complete, fail, rollback |
| **Targets** | Register, update, delete, health changes |
| **Agents** | Register, heartbeat gaps, capability changes |
| **Integrations** | Create, update, delete, test |
| **Plugins** | Enable, disable, config changes |
| **Evidence** | Create (never update/delete) |

## References

- [Authentication & Authorization](auth.md)
- [Agent Security](agent-security.md)
- [Threat Model](threat-model.md)
- [Audit Trail](audit-trail.md)

207
docs/modules/release-orchestrator/security/threat-model.md
Normal file
@@ -0,0 +1,207 @@
|
||||
# Threat Model
|
||||
|
||||
## Overview

This document identifies threats to the Release Orchestrator and their mitigations.

## Threat Categories
|
||||
|
||||
### T1: Credential Theft

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker gains access to credentials through database breach |
| **Attack Vector** | SQL injection, database backup theft, insider threat |
| **Assets at Risk** | Registry credentials, vault tokens, SSH keys |
| **Mitigation** | Secrets NEVER stored in database; only vault references stored |
| **Detection** | Anomalous vault access patterns, failed authentication attempts |

### T2: Token Replay

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker captures and reuses valid JWT tokens |
| **Attack Vector** | Man-in-the-middle, log file exposure, memory dump |
| **Assets at Risk** | User sessions, API access |
| **Mitigation** | Short-lived tokens (15 min), refresh token rotation, TLS everywhere |
| **Detection** | Token used from unusual IP, concurrent sessions |

### T3: Agent Impersonation

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker registers fake agent to receive deployment tasks |
| **Attack Vector** | Stolen registration token, certificate forgery |
| **Assets at Risk** | Deployment credentials, target access |
| **Mitigation** | One-time registration tokens, mTLS with CA-signed certs |
| **Detection** | Registration from unexpected network, capability mismatch |

### T4: Digest Tampering

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker modifies a container image after release creation |
| **Attack Vector** | Registry compromise, man-in-the-middle at pull time |
| **Assets at Risk** | Application integrity, supply chain |
| **Mitigation** | Digest verification at pull time; a mismatch fails the deployment |
| **Detection** | Pull failures due to digest mismatch |

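
The pull-time check in the T4 mitigation can be sketched as follows. This is a minimal illustration, assuming digests are pinned in the `sha256:<hex>` form; `verify_digest` and `DigestMismatchError` are hypothetical names, not part of the Stella codebase.

```python
import hashlib
import hmac


class DigestMismatchError(Exception):
    """Raised when pulled content does not match the digest pinned in the release."""


def verify_digest(content: bytes, expected_digest: str) -> None:
    """Fail loudly unless `content` hashes to `expected_digest` (assumed 'sha256:<hex>')."""
    algorithm, _, expected_hex = expected_digest.partition(":")
    if algorithm != "sha256" or not expected_hex:
        raise ValueError(f"unsupported digest format: {expected_digest!r}")
    actual_hex = hashlib.sha256(content).hexdigest()
    # Constant-time comparison; a mismatch must abort the deployment.
    if not hmac.compare_digest(actual_hex, expected_hex.lower()):
        raise DigestMismatchError(
            f"expected sha256:{expected_hex}, got sha256:{actual_hex}"
        )
```

A caller would treat `DigestMismatchError` as a hard deployment failure and surface it as the "digest mismatch" detection signal rather than retrying the pull.
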
### T5: Evidence Tampering

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker modifies audit records to hide malicious activity |
| **Attack Vector** | Database admin access, SQL injection |
| **Assets at Risk** | Audit integrity, compliance |
| **Mitigation** | Append-only table, cryptographic signing, no UPDATE/DELETE |
| **Detection** | Signature verification failure, hash chain break |

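
The two T5 detection signals (signature verification failure and hash chain break) can be illustrated with a small in-memory chain. This is a sketch under stated assumptions: the record fields (`payload`, `prev_hash`, `hash`, `signature`) are hypothetical, and an HMAC stands in for whatever signing scheme the evidence store actually uses.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes identically.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def _sign(entry_hash: str, key: bytes) -> str:
    return hmac.new(key, entry_hash.encode("utf-8"), hashlib.sha256).hexdigest()


def append_record(chain: list, payload: dict, key: bytes) -> None:
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry_hash = _entry_hash(prev_hash, payload)
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash,
        "signature": _sign(entry_hash, key),
    })


def verify_chain(chain: list, key: bytes):
    """Return the index of the first invalid record, or None if the chain is intact."""
    prev_hash = GENESIS
    for index, record in enumerate(chain):
        expected_hash = _entry_hash(prev_hash, record["payload"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected_hash:
            return index  # hash chain break
        if not hmac.compare_digest(record["signature"], _sign(expected_hash, key)):
            return index  # signature verification failure
        prev_hash = record["hash"]
    return None
```

Because each hash covers its predecessor, editing or deleting any record invalidates every record after it, so a verifier only needs to walk the chain once to locate the first point of tampering.
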
### T6: Privilege Escalation

| Aspect | Description |
|--------|-------------|
| **Threat** | User gains permissions beyond their role |
| **Attack Vector** | Role assignment exploit, permission bypass |
| **Assets at Risk** | Environment access, approval authority |
| **Mitigation** | Role-based access, SoD enforcement, audit logs |
| **Detection** | Unusual permission patterns, SoD violation attempts |

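
As an illustration of how a role check and separation of duties (SoD) combine, the sketch below rejects self-approval even when the user holds the right role. The `Promotion` type, the `approver` role name, and `SodViolation` are assumptions made for this example only.

```python
from dataclasses import dataclass, field


class SodViolation(Exception):
    """Raised when the requester of a promotion tries to approve it."""


@dataclass
class Promotion:
    release_id: str
    requested_by: str
    required_approvals: int = 1
    approvers: set = field(default_factory=set)


def approve(promotion: Promotion, user: str, roles: set) -> bool:
    """Record an approval; return True once enough distinct approvers have signed off."""
    # RBAC: the user must hold the (hypothetical) "approver" role.
    if "approver" not in roles:
        raise PermissionError(f"{user} lacks the approver role")
    # SoD: the requester may never approve their own promotion.
    if user == promotion.requested_by:
        raise SodViolation(f"{user} requested {promotion.release_id} and cannot approve it")
    promotion.approvers.add(user)
    return len(promotion.approvers) >= promotion.required_approvals
```

Raising a distinct exception for the SoD case makes it straightforward to log the attempt separately, which is what the "SoD violation attempts" detection signal relies on.
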
### T7: Supply Chain Attack

| Aspect | Description |
|--------|-------------|
| **Threat** | Malicious plugin injected into workflow |
| **Attack Vector** | Plugin repository compromise, typosquatting |
| **Assets at Risk** | All environments, all credentials |
| **Mitigation** | Plugin sandbox, capability declarations, signed manifests |
| **Detection** | Unexpected network egress, resource anomalies |

### T8: Lateral Movement

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker uses a compromised target to access others |
| **Attack Vector** | Target compromise, credential reuse |
| **Assets at Risk** | Other targets, environments |
| **Mitigation** | Short-lived task credentials, scoped permissions |
| **Detection** | Cross-target credential use, unexpected connections |

### T9: Data Exfiltration

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker extracts logs, artifacts, or configuration |
| **Attack Vector** | API abuse, log aggregator compromise |
| **Assets at Risk** | Application data, deployment configurations |
| **Mitigation** | Encryption at rest, network segmentation, audit logging |
| **Detection** | Large data transfers, unusual API patterns |

### T10: Denial of Service

| Aspect | Description |
|--------|-------------|
| **Threat** | Attacker exhausts resources to prevent deployments |
| **Attack Vector** | API flooding, workflow loop, agent task spam |
| **Assets at Risk** | Service availability |
| **Mitigation** | Rate limiting, resource quotas, circuit breakers |
| **Detection** | Resource exhaustion alerts, traffic spikes |

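
The document does not say which rate-limiting algorithm is used. As one common option, a token bucket could look like the sketch below; the `TokenBucket` class and its parameters are illustrative assumptions, and the clock is injectable so the behavior can be tested deterministically.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return False when the caller should be throttled."""
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Keeping one bucket per API client or per agent limits the effect of a single flooding source, and a sustained run of `False` results is a natural input to the traffic-spike alerts listed under detection.
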
## STRIDE Analysis

| Category | Threats | Primary Mitigations |
|----------|---------|---------------------|
| **Spoofing** | T3 Agent Impersonation | mTLS, registration tokens |
| **Tampering** | T4 Digest Tampering, T5 Evidence Tampering | Digest verification, append-only tables |
| **Repudiation** | T5 Evidence Tampering (audit records altered) | Signed evidence packets |
| **Information Disclosure** | T1 Credential Theft, T9 Data Exfiltration | Vault integration, encryption |
| **Denial of Service** | T10 Denial of Service (resource exhaustion) | Rate limits, quotas |
| **Elevation of Privilege** | T6 Privilege Escalation | RBAC, SoD enforcement |

## Trust Boundaries

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              TRUST BOUNDARIES                               │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  PUBLIC NETWORK (Untrusted)                                           │  │
│  │                                                                       │  │
│  │  Internet, External Users, External Services                          │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                      │                                      │
│                                      │ TLS + Authentication                 │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  DMZ (Semi-trusted)                                                   │  │
│  │                                                                       │  │
│  │  API Gateway, Webhook Gateway                                         │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                      │                                      │
│                                      │ Internal mTLS                        │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  INTERNAL NETWORK (Trusted)                                           │  │
│  │                                                                       │  │
│  │  Stella Core Services, Database, Internal Vault                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                      │                                      │
│                                      │ Agent mTLS                           │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  DEPLOYMENT NETWORK (Controlled)                                      │  │
│  │                                                                       │  │
│  │  Agents, Targets                                                      │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Data Classification

| Classification | Examples | Protection Requirements |
|---------------|----------|------------------------|
| **Critical** | Vault credentials, signing keys | Hardware security, minimal access |
| **Sensitive** | User tokens, agent certificates | Encryption, access logging |
| **Internal** | Release configs, workflow definitions | Encryption at rest |
| **Public** | API documentation, release names | Integrity protection |

## Security Controls Summary

| Control | Implementation | Threats Addressed |
|---------|----------------|-------------------|
| mTLS | Agent communication | T3 |
| Short-lived tokens | 15-minute access tokens | T2 |
| Vault integration | No secrets in the database | T1 |
| Digest verification | Pull-time validation | T4 |
| Append-only tables | Evidence immutability | T5 |
| RBAC + SoD | Permission enforcement | T6 |
| Plugin sandbox | Resource limits, capability control | T7 |
| Scoped credentials | Task-specific access | T8 |
| Encryption | At rest and in transit | T9 |
| Rate limiting | API and resource quotas | T10 |

## Incident Response

### Detection Signals

| Signal | Indicates | Response |
|--------|-----------|----------|
| Digest mismatch at pull | T4 Tampering | Halt deployment, investigate registry |
| Evidence signature failure | T5 Tampering | Preserve logs, forensic analysis |
| Unusual agent registration | T3 Impersonation | Revoke agent, review access |
| SoD violation attempt | T6 Escalation | Block action, alert admin |
| Plugin network egress | T7 Supply chain | Isolate plugin, review manifest |

### Response Procedures

1. **Contain** - Isolate affected component (revoke token, disable agent)
2. **Investigate** - Collect logs, evidence packets, audit trail
3. **Remediate** - Patch vulnerability, rotate credentials
4. **Recover** - Restore service, verify integrity
5. **Report** - Document incident, update threat model

## References

- [Security Overview](overview.md)
- [Agent Security](agent-security.md)
- [Audit Trail](audit-trail.md)