Add release orchestrator docs and fill sprint gaps

2026-01-11 01:05:17 +02:00
parent d58c093887
commit a62974a8c2
37 changed files with 6061 additions and 0 deletions


@@ -0,0 +1,403 @@
# Agent-Based Deployment

> Agent-based deployment using Docker and Compose agents for executing tasks on targets.

**Status:** Planned (not yet implemented)
**Source:** [Architecture Advisory Section 10.3](../../../product/advisories/09-Jan-2026%20-%20Stella%20Ops%20Orchestrator%20Architecture.md)
**Related Modules:** [Agents Module](../modules/agents.md), [Deploy Orchestrator](../modules/deploy-orchestrator.md)
**Sprints:** [108_002 Docker Agent](../../../../implplan/SPRINT_20260110_108_002_AGENTS_docker.md), [108_003 Compose Agent](../../../../implplan/SPRINT_20260110_108_003_AGENTS_compose.md)

## Overview

Agent-based deployment uses lightweight agents installed on target hosts to execute deployment tasks. Agents communicate with the orchestrator over mTLS and receive tasks through heartbeat polling or WebSocket streams.

---
## Agent Task Protocol
### Task Payload Structure
```typescript
// Task assignment (Core -> Agent)
interface AgentTask {
  id: UUID;
  type: TaskType;
  targetId: UUID;
  payload: TaskPayload;
  credentials: EncryptedCredentials;
  timeout: number;
  priority: TaskPriority;
  idempotencyKey: string;
  assignedAt: DateTime;
  expiresAt: DateTime;
}

type TaskType =
  | "deploy"
  | "rollback"
  | "health-check"
  | "inspect"
  | "execute-command"
  | "upload-files"
  | "write-sticker"
  | "read-sticker";

interface DeployTaskPayload {
  image: string;
  digest: string;
  config: DeployConfig;
  artifacts: ArtifactReference[];
  previousDigest?: string;
  hooks: {
    preDeploy?: HookConfig;
    postDeploy?: HookConfig;
  };
}
```
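The `idempotencyKey` and `expiresAt` fields suggest that an agent should refuse stale or repeated assignments before doing any work. The following is a minimal sketch of such a gate, not the planned implementation. `TaskEnvelope`, `TaskGate`, and `AcceptDecision` are hypothetical names, and timestamps are modeled as epoch milliseconds rather than the `DateTime` type used above.

```typescript
// Hypothetical subset of AgentTask; only the fields the gate needs.
interface TaskEnvelope {
  id: string;
  idempotencyKey: string;
  expiresAt: number; // epoch milliseconds (assumption; the source uses DateTime)
}

type AcceptDecision = "execute" | "skip-duplicate" | "skip-expired";

// Decides whether an incoming task should run on this agent.
class TaskGate {
  // In-memory only; a real agent would likely persist keys across restarts.
  private readonly seenKeys = new Set<string>();

  decide(task: TaskEnvelope, nowMs: number): AcceptDecision {
    // Expired tasks are dropped even if they were never seen before.
    if (nowMs >= task.expiresAt) {
      return "skip-expired";
    }
    // A repeated idempotency key means the task was already accepted.
    if (this.seenKeys.has(task.idempotencyKey)) {
      return "skip-duplicate";
    }
    this.seenKeys.add(task.idempotencyKey);
    return "execute";
  }
}
```

Checking expiry before recording the key keeps an expired delivery from blocking a later, valid retry that reuses the same key.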
### Task Result Structure
```typescript
// Task result (Agent -> Core)
interface TaskResult {
  taskId: UUID;
  success: boolean;
  startedAt: DateTime;
  completedAt: DateTime;

  // Success details
  outputs?: Record<string, any>;
  artifacts?: ArtifactReference[];

  // Failure details
  error?: string;
  errorType?: string;
  retriable?: boolean;

  // Logs
  logs: string;

  // Metrics
  metrics: {
    pullDurationMs?: number;
    deployDurationMs?: number;
    healthCheckDurationMs?: number;
  };
}
```
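The failure fields (`error`, `errorType`, `retriable`) have to be derived from whatever the agent caught. One way to do that is sketched below under stated assumptions: `toFailureFields` and `RETRIABLE_ERROR_TYPES` are hypothetical, and the set of retriable error names is illustrative only, since the source does not say which failures are safe to retry.

```typescript
// Hypothetical list of error names considered safe to retry.
const RETRIABLE_ERROR_TYPES = new Set<string>(["TimeoutError", "NetworkError"]);

// The failure-related subset of TaskResult.
interface FailureFields {
  success: false;
  error: string;
  errorType: string;
  retriable: boolean;
}

// Maps any thrown value to the failure fields of a task result.
function toFailureFields(err: unknown): FailureFields {
  // Non-Error throwables (strings, objects) get a generic type name.
  const errorType = err instanceof Error ? err.name : "UnknownError";
  const message = err instanceof Error ? err.message : String(err);
  return {
    success: false,
    error: message,
    errorType,
    retriable: RETRIABLE_ERROR_TYPES.has(errorType),
  };
}
```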
---
## Docker Agent Implementation
The Docker agent deploys single containers to Docker hosts with digest verification.
### Docker Agent Capabilities
- Pull images with digest verification
- Create and start containers
- Stop and remove containers
- Health check monitoring
- Version sticker management
- Rollback to previous container
### Deploy Task Flow
```typescript
class DockerAgent implements TargetExecutor {
  private docker: Docker;

  async deploy(task: DeployTaskPayload): Promise<DeployResult> {
    const { image, digest, config, previousDigest } = task;
    const containerName = config.containerName;

    // 1. Pull image and verify digest
    this.log(`Pulling image ${image}@${digest}`);
    await this.docker.pull(image, { digest });
    const pulledDigest = await this.getImageDigest(image);
    if (pulledDigest !== digest) {
      throw new DigestMismatchError(
        `Expected digest ${digest}, got ${pulledDigest}. Possible tampering detected.`
      );
    }

    // 2. Run pre-deploy hook
    if (task.hooks?.preDeploy) {
      await this.runHook(task.hooks.preDeploy, "pre-deploy");
    }

    // 3. Stop and rename existing container
    const existingContainer = await this.findContainer(containerName);
    if (existingContainer) {
      this.log(`Stopping existing container ${containerName}`);
      await existingContainer.stop({ t: 10 });
      await existingContainer.rename(`${containerName}-previous-${Date.now()}`);
    }

    // 4. Create new container
    this.log(`Creating container ${containerName} from ${image}@${digest}`);
    const container = await this.docker.createContainer({
      name: containerName,
      Image: `${image}@${digest}`, // Always use digest, not tag
      Env: this.buildEnvVars(config.environment),
      HostConfig: {
        PortBindings: this.buildPortBindings(config.ports),
        Binds: this.buildBindMounts(config.volumes),
        RestartPolicy: { Name: config.restartPolicy || "unless-stopped" },
        Memory: config.memoryLimit,
        CpuQuota: config.cpuLimit,
      },
      Labels: {
        "stella.release.id": config.releaseId,
        "stella.release.name": config.releaseName,
        "stella.digest": digest,
        "stella.deployed.at": new Date().toISOString(),
      },
    });

    // 5. Start container
    this.log(`Starting container ${containerName}`);
    await container.start();

    // 6. Wait for container to be healthy (if health check configured)
    if (config.healthCheck) {
      this.log(`Waiting for container health check`);
      const healthy = await this.waitForHealthy(container, config.healthCheck.timeout);
      if (!healthy) {
        // Rollback to previous container
        await this.rollbackContainer(containerName, existingContainer);
        throw new HealthCheckFailedError(`Container ${containerName} failed health check`);
      }
    }

    // 7. Run post-deploy hook
    if (task.hooks?.postDeploy) {
      await this.runHook(task.hooks.postDeploy, "post-deploy");
    }

    // 8. Cleanup previous container
    if (existingContainer && config.cleanupPrevious !== false) {
      this.log(`Removing previous container`);
      await existingContainer.remove({ force: true });
    }

    return {
      success: true,
      containerId: container.id,
      previousDigest: previousDigest,
      logs: this.getLogs(),
      durationMs: this.getDuration(),
    };
  }
}
```
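The deploy flow calls `buildEnvVars` and `buildPortBindings` without showing them. A possible shape for both is sketched here. The output formats follow the Docker Engine API (`Env` as `KEY=value` strings, `PortBindings` keyed by `port/protocol`), while the `PortMapping` input type is an assumption, because the source does not define `config.ports`.

```typescript
// Hypothetical input shape for one published port.
interface PortMapping {
  containerPort: number;
  hostPort: number;
  protocol?: "tcp" | "udp";
}

// Docker Engine API shape for HostConfig.PortBindings.
type PortBindings = Record<string, Array<{ HostPort: string }>>;

// Converts a key/value map into the "KEY=value" list Docker expects.
function buildEnvVars(environment: Record<string, string> = {}): string[] {
  return Object.entries(environment).map(([key, value]) => `${key}=${value}`);
}

// Groups host ports under their "containerPort/protocol" key.
function buildPortBindings(ports: PortMapping[] = []): PortBindings {
  const bindings: PortBindings = {};
  for (const { containerPort, hostPort, protocol = "tcp" } of ports) {
    const key = `${containerPort}/${protocol}`;
    const list = bindings[key] ?? [];
    // The Engine API expects HostPort as a string, not a number.
    list.push({ HostPort: String(hostPort) });
    bindings[key] = list;
  }
  return bindings;
}
```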
### Rollback Implementation
```typescript
async rollback(task: RollbackTaskPayload): Promise<DeployResult> {
  const { containerName, targetDigest } = task;

  // Find previous container or use specified digest
  if (targetDigest) {
    // Deploy specific digest
    return this.deploy({
      ...task,
      digest: targetDigest,
    });
  }

  // Find and restore previous container
  const previousContainer = await this.findContainer(`${containerName}-previous-*`);
  if (!previousContainer) {
    throw new RollbackError(`No previous container found for ${containerName}`);
  }

  // Stop current, rename, start previous
  const currentContainer = await this.findContainer(containerName);
  if (currentContainer) {
    await currentContainer.stop({ t: 10 });
    await currentContainer.rename(`${containerName}-failed-${Date.now()}`);
  }

  await previousContainer.rename(containerName);
  await previousContainer.start();

  return {
    success: true,
    containerId: previousContainer.id,
    logs: this.getLogs(),
    durationMs: this.getDuration(),
  };
}
```
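The rollback looks up `${containerName}-previous-*`, which can match several containers when earlier cleanups were skipped. A sketch of how the most recent one could be chosen from a list of names follows. `pickLatestPrevious` is a hypothetical helper, and it assumes the suffix is the `Date.now()` timestamp written by the deploy flow.

```typescript
// Returns the name of the newest "<containerName>-previous-<timestamp>"
// container, or undefined when none matches.
function pickLatestPrevious(names: string[], containerName: string): string | undefined {
  const prefix = `${containerName}-previous-`;
  let best: { name: string; ts: number } | undefined;

  for (const name of names) {
    if (!name.startsWith(prefix)) continue;

    // Only accept purely numeric suffixes (the Date.now() timestamp).
    const suffix = name.slice(prefix.length);
    if (!/^\d+$/.test(suffix)) continue;

    const ts = Number(suffix);
    if (!best || ts > best.ts) {
      best = { name, ts };
    }
  }

  return best?.name;
}
```

Comparing timestamps numerically avoids the ordering errors a plain string sort would produce for suffixes of different lengths.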
### Version Sticker Management
```typescript
async writeSticker(sticker: VersionSticker): Promise<void> {
  const stickerPath = this.config.stickerPath || "/var/stella/version.json";
  const stickerContent = JSON.stringify(sticker, null, 2);

  // Write to host filesystem or container volume
  if (this.config.stickerLocation === "volume") {
    // Write to shared volume. The JSON is base64-encoded first so that quotes
    // in the sticker cannot break out of the shell command.
    // Note: stickerPath must resolve inside the /var/stella mount below.
    const encoded = Buffer.from(stickerContent, "utf-8").toString("base64");
    await this.docker.run("alpine", [
      "sh", "-c",
      `echo '${encoded}' | base64 -d > ${stickerPath}`
    ], {
      HostConfig: {
        Binds: [`${this.config.stickerVolume}:/var/stella`]
      }
    });
  } else {
    // Write directly to host
    fs.writeFileSync(stickerPath, stickerContent);
  }
}
```
---
## Compose Agent Implementation
The Compose agent deploys multi-container applications defined in Docker Compose files.
### Compose Agent Capabilities
- Pull images for all services
- Verify digests for all services
- Deploy using compose lock files
- Health check all services
- Rollback to previous deployment
- Version sticker management
### Deploy Task Flow
```typescript
class ComposeAgent implements TargetExecutor {
  async deploy(task: DeployTaskPayload): Promise<DeployResult> {
    const { artifacts, config } = task;
    const deployDir = config.deploymentDirectory;

    // 1. Write compose lock file
    const composeLock = artifacts.find(a => a.type === "compose_lock");
    if (!composeLock) {
      throw new Error("Deploy task is missing a compose_lock artifact");
    }
    const composeContent = await this.fetchArtifact(composeLock);
    const composePath = path.join(deployDir, "compose.stella.lock.yml");
    await fs.writeFile(composePath, composeContent);

    // 2. Write any additional config files
    for (const artifact of artifacts.filter(a => a.type === "config")) {
      const content = await this.fetchArtifact(artifact);
      await fs.writeFile(path.join(deployDir, artifact.name), content);
    }

    // 3. Run pre-deploy hook
    if (task.hooks?.preDeploy) {
      await this.runHook(task.hooks.preDeploy, deployDir);
    }

    // 4. Pull images
    this.log("Pulling images...");
    const pullResult = await this.runCompose(deployDir, ["pull"]);
    if (!pullResult.success) {
      throw new Error(`Failed to pull images: ${pullResult.stderr}`);
    }

    // 5. Verify digests
    await this.verifyDigests(composePath, config.expectedDigests);

    // 6. Deploy
    this.log("Deploying services...");
    const upResult = await this.runCompose(deployDir, [
      "up", "-d",
      "--remove-orphans",
      "--force-recreate"
    ]);
    if (!upResult.success) {
      throw new Error(`Failed to deploy: ${upResult.stderr}`);
    }

    // 7. Wait for services to be healthy
    if (config.healthCheck) {
      this.log("Waiting for services to be healthy...");
      const healthy = await this.waitForServicesHealthy(
        deployDir,
        config.healthCheck.timeout
      );
      if (!healthy) {
        // Rollback
        await this.rollbackToBackup(deployDir);
        throw new HealthCheckFailedError("Services failed health check");
      }
    }

    // 8. Run post-deploy hook
    if (task.hooks?.postDeploy) {
      await this.runHook(task.hooks.postDeploy, deployDir);
    }

    // 9. Write version sticker
    await this.writeSticker(config.sticker, deployDir);

    return {
      success: true,
      logs: this.getLogs(),
      durationMs: this.getDuration(),
    };
  }
}
```
### Digest Verification
```typescript
private async verifyDigests(
  composePath: string,
  expectedDigests: Record<string, string>
): Promise<void> {
  const composeContent = yaml.parse(await fs.readFile(composePath, "utf-8"));

  for (const [service, expectedDigest] of Object.entries(expectedDigests)) {
    const serviceConfig = composeContent.services[service];
    if (!serviceConfig) {
      throw new Error(`Service ${service} not found in compose file`);
    }

    // A service without an image (for example, build-only) is also unpinned
    const image = serviceConfig.image;
    if (!image || !image.includes("@sha256:")) {
      throw new Error(`Service ${service} image not pinned to digest: ${image}`);
    }

    const actualDigest = image.split("@")[1];
    if (actualDigest !== expectedDigest) {
      throw new DigestMismatchError(
        `Service ${service}: expected ${expectedDigest}, got ${actualDigest}`
      );
    }
  }
}
```
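The check above only tests that the image string contains `@sha256:`. A stricter parser could also validate the digest itself, as in the sketch below. `parsePinnedImage` and `PinnedImage` are hypothetical names. The pattern relies on a sha256 digest being 64 lowercase hexadecimal characters.

```typescript
// A sha256 digest is the algorithm prefix plus 64 lowercase hex characters.
const DIGEST_PATTERN = /^sha256:[a-f0-9]{64}$/;

interface PinnedImage {
  repository: string; // everything before "@", possibly including a tag
  digest: string;     // "sha256:<hex>"
}

// Splits "repo[:tag]@sha256:<hex>" and rejects unpinned or malformed references.
function parsePinnedImage(image: string): PinnedImage {
  const at = image.lastIndexOf("@");
  if (at <= 0) {
    throw new Error(`Image not pinned to a digest: ${image}`);
  }

  const repository = image.slice(0, at);
  const digest = image.slice(at + 1);
  if (!DIGEST_PATTERN.test(digest)) {
    throw new Error(`Malformed sha256 digest in image: ${image}`);
  }

  return { repository, digest };
}
```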
---
## Security Considerations
1. **Digest Verification:** All deployments verify image digests before execution
2. **Credential Encryption:** Credentials are encrypted in transit and at rest
3. **mTLS Communication:** All agent-server communication uses mutual TLS
4. **Hook Sandboxing:** Pre/post-deploy hooks run in isolated environments
5. **Audit Logging:** All deployment actions are logged with actor context
---
## See Also
- [Agents Module](../modules/agents.md)
- [Agent Security](../security/agent-security.md)
- [Deployment Orchestrator](../modules/deploy-orchestrator.md)
- [Agentless Deployment](agentless.md)


@@ -0,0 +1,427 @@
# Agentless Deployment (SSH/WinRM)

> Agentless deployment using SSH and WinRM for remote execution without installing agents.

**Status:** Planned (not yet implemented)
**Source:** [Architecture Advisory Section 10.4](../../../product/advisories/09-Jan-2026%20-%20Stella%20Ops%20Orchestrator%20Architecture.md)
**Related Modules:** [Agents Module](../modules/agents.md), [Deploy Orchestrator](../modules/deploy-orchestrator.md)
**Sprints:** [108_004 SSH Agent](../../../../implplan/SPRINT_20260110_108_004_AGENTS_ssh.md), [108_005 WinRM Agent](../../../../implplan/SPRINT_20260110_108_005_AGENTS_winrm.md)

## Overview

Agentless deployment enables deployment to targets without requiring a pre-installed agent. The orchestrator connects directly to targets using SSH (Linux/Unix) or WinRM (Windows) to execute deployment commands.

---
## SSH Remote Executor
### Capabilities
- SSH key-based authentication
- File transfer via SFTP
- Remote command execution
- Docker operations over SSH
- Script execution
- Backup and rollback
### Connection Management
```typescript
class SSHRemoteExecutor implements TargetExecutor {
  private ssh: SSHClient;

  async connect(config: SSHConnectionConfig): Promise<void> {
    const privateKey = await this.secrets.getSecret(config.privateKeyRef);

    this.ssh = new SSHClient();
    await this.ssh.connect({
      host: config.host,
      port: config.port || 22,
      username: config.username,
      privateKey: privateKey.value,
      readyTimeout: config.connectionTimeout || 30000,
      keepaliveInterval: 10000,
    });
  }
}
```
### Deploy Task Flow
```typescript
async deploy(task: DeployTaskPayload): Promise<DeployResult> {
  const { artifacts, config } = task;
  const deployDir = config.deploymentDirectory;

  try {
    // 1. Ensure deployment directory exists
    await this.exec(`mkdir -p ${deployDir}`);
    await this.exec(`mkdir -p ${deployDir}/.stella-backup`);

    // 2. Backup current deployment
    await this.exec(`cp -r ${deployDir}/* ${deployDir}/.stella-backup/ 2>/dev/null || true`);

    // 3. Upload artifacts
    // Remote paths are POSIX, so join with path.posix regardless of the
    // operating system the orchestrator itself runs on.
    for (const artifact of artifacts) {
      const content = await this.fetchArtifact(artifact);
      const remotePath = path.posix.join(deployDir, artifact.name);
      await this.uploadFile(content, remotePath);
    }

    // 4. Run pre-deploy hook
    if (task.hooks?.preDeploy) {
      await this.runRemoteHook(task.hooks.preDeploy, deployDir);
    }

    // 5. Execute deployment script
    const deployScript = artifacts.find(a => a.type === "deploy_script");
    if (deployScript) {
      const scriptPath = path.posix.join(deployDir, deployScript.name);
      await this.exec(`chmod +x ${scriptPath}`);
      const result = await this.exec(scriptPath, {
        cwd: deployDir,
        timeout: config.deploymentTimeout,
        env: config.environment,
      });
      if (result.exitCode !== 0) {
        throw new DeploymentError(`Deploy script failed: ${result.stderr}`);
      }
    }

    // 6. Run post-deploy hook
    if (task.hooks?.postDeploy) {
      await this.runRemoteHook(task.hooks.postDeploy, deployDir);
    }

    // 7. Health check
    if (config.healthCheck) {
      const healthy = await this.runHealthCheck(config.healthCheck);
      if (!healthy) {
        await this.rollback(task);
        throw new HealthCheckFailedError("Health check failed");
      }
    }

    // 8. Write version sticker
    await this.writeSticker(config.sticker, deployDir);

    // 9. Cleanup backup
    await this.exec(`rm -rf ${deployDir}/.stella-backup`);

    return {
      success: true,
      logs: this.getLogs(),
      durationMs: this.getDuration(),
    };
  } finally {
    this.ssh.end();
  }
}
```
### Command Execution
```typescript
private async exec(
  command: string,
  options?: ExecOptions
): Promise<CommandResult> {
  return new Promise((resolve, reject) => {
    const timeout = options?.timeout || 60000;
    let stdout = "";
    let stderr = "";

    // Assumption: the SSH client's exec() has no cwd option (true of common
    // Node clients such as ssh2), so change directory inside the command.
    const fullCommand = options?.cwd
      ? `cd ${options.cwd} && ${command}`
      : command;

    // env is forwarded to the server, which may ignore variables that its
    // configuration does not accept (AcceptEnv in OpenSSH).
    this.ssh.exec(fullCommand, { env: options?.env }, (err, stream) => {
      if (err) {
        reject(err);
        return;
      }

      const timer = setTimeout(() => {
        stream.close();
        reject(new TimeoutError(`Command timed out after ${timeout}ms`));
      }, timeout);

      stream.on("data", (data: Buffer) => {
        stdout += data.toString();
        this.log(data.toString());
      });

      stream.stderr.on("data", (data: Buffer) => {
        stderr += data.toString();
        this.log(`[stderr] ${data.toString()}`);
      });

      stream.on("close", (code: number) => {
        clearTimeout(timer);
        resolve({ exitCode: code, stdout, stderr });
      });
    });
  });
}
```
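The SSH snippets in this document interpolate values such as `deployDir` directly into shell commands, which breaks on paths containing spaces and allows command injection if a value is attacker-controlled. A common mitigation is to single-quote every interpolated value, sketched here. `shellQuote` is a hypothetical helper and assumes a POSIX shell on the target.

```typescript
// Wraps a value in single quotes for a POSIX shell. Inside single quotes
// nothing is expanded, so the only character needing care is the single
// quote itself: close the string, emit an escaped quote, reopen the string.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}
```

With this helper, a command such as ``mkdir -p ${deployDir}`` would be written as ``mkdir -p ${shellQuote(deployDir)}``.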
### File Upload via SFTP
```typescript
private async uploadFile(content: Buffer | string, remotePath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    this.ssh.sftp((err, sftp) => {
      if (err) {
        reject(err);
        return;
      }

      const writeStream = sftp.createWriteStream(remotePath);
      writeStream.on("close", () => resolve());
      writeStream.on("error", reject);
      writeStream.end(content);
    });
  });
}
```
### Rollback
```typescript
async rollback(task: RollbackTaskPayload): Promise<DeployResult> {
  const deployDir = task.config.deploymentDirectory;

  // Restore from backup
  await this.exec(`rm -rf ${deployDir}/*`);
  await this.exec(`cp -r ${deployDir}/.stella-backup/* ${deployDir}/`);

  // Re-run deployment from backup
  const deployScript = path.join(deployDir, "deploy.sh");
  await this.exec(deployScript, { cwd: deployDir });

  return {
    success: true,
    logs: this.getLogs(),
    durationMs: this.getDuration(),
  };
}
```
---
## WinRM Remote Executor
### Capabilities
- NTLM/Kerberos authentication
- PowerShell script execution
- File transfer via base64 encoding
- Windows container operations
- Windows service management
### Connection Management
```typescript
class WinRMRemoteExecutor implements TargetExecutor {
  private winrm: WinRMClient;

  async connect(config: WinRMConnectionConfig): Promise<void> {
    const credential = await this.secrets.getSecret(config.credentialRef);

    this.winrm = new WinRMClient({
      host: config.host,
      // WinRM listens on 5986 for HTTPS and 5985 for HTTP by default
      port: config.port || (config.useHttps ? 5986 : 5985),
      username: credential.username,
      password: credential.password,
      protocol: config.useHttps ? "https" : "http",
      authentication: config.authType || "ntlm", // ntlm, kerberos, basic
    });

    await this.winrm.openShell();
  }
}
```
### Deploy Task Flow
```typescript
async deploy(task: DeployTaskPayload): Promise<DeployResult> {
  const { artifacts, config } = task;
  const deployDir = config.deploymentDirectory;

  try {
    // 1. Ensure deployment directory exists
    await this.execPowerShell(`
      if (-not (Test-Path "${deployDir}")) {
        New-Item -ItemType Directory -Path "${deployDir}" -Force
      }
      if (-not (Test-Path "${deployDir}\\.stella-backup")) {
        New-Item -ItemType Directory -Path "${deployDir}\\.stella-backup" -Force
      }
    `);

    // 2. Backup current deployment
    await this.execPowerShell(`
      Get-ChildItem "${deployDir}" -Exclude ".stella-backup" |
        Copy-Item -Destination "${deployDir}\\.stella-backup" -Recurse -Force
    `);

    // 3. Upload artifacts
    for (const artifact of artifacts) {
      const content = await this.fetchArtifact(artifact);
      const remotePath = `${deployDir}\\${artifact.name}`;
      await this.uploadFile(content, remotePath);
    }

    // 4. Run pre-deploy hook
    if (task.hooks?.preDeploy) {
      await this.runRemoteHook(task.hooks.preDeploy, deployDir);
    }

    // 5. Execute deployment script
    const deployScript = artifacts.find(a => a.type === "deploy_script");
    if (deployScript) {
      const scriptPath = `${deployDir}\\${deployScript.name}`;
      const result = await this.execPowerShell(`
        Set-Location "${deployDir}"
        & "${scriptPath}"
        exit $LASTEXITCODE
      `, { timeout: config.deploymentTimeout });

      if (result.exitCode !== 0) {
        throw new DeploymentError(`Deploy script failed: ${result.stderr}`);
      }
    }

    // 6. Run post-deploy hook
    if (task.hooks?.postDeploy) {
      await this.runRemoteHook(task.hooks.postDeploy, deployDir);
    }

    // 7. Health check
    if (config.healthCheck) {
      const healthy = await this.runHealthCheck(config.healthCheck);
      if (!healthy) {
        await this.rollback(task);
        throw new HealthCheckFailedError("Health check failed");
      }
    }

    // 8. Write version sticker
    await this.writeSticker(config.sticker, deployDir);

    // 9. Cleanup backup
    await this.execPowerShell(`
      Remove-Item -Path "${deployDir}\\.stella-backup" -Recurse -Force
    `);

    return {
      success: true,
      logs: this.getLogs(),
      durationMs: this.getDuration(),
    };
  } finally {
    this.winrm.closeShell();
  }
}
```
### PowerShell Execution
```typescript
private async execPowerShell(
  script: string,
  options?: ExecOptions
): Promise<CommandResult> {
  // -EncodedCommand expects base64 over UTF-16LE text
  const encoded = Buffer.from(script, "utf16le").toString("base64");
  return this.winrm.runCommand(
    `powershell -EncodedCommand ${encoded}`,
    { timeout: options?.timeout || 60000 }
  );
}
```
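Encoding the script protects it from the outer command line, but values interpolated into the script (for example `"${deployDir}"`) are still parsed by PowerShell, where double-quoted strings expand `$` expressions. A sketch of two small helpers follows. `psQuote` and `encodePowerShell` are hypothetical names. The first builds a single-quoted PowerShell literal, and the second isolates the encoding step so it can be checked on its own.

```typescript
// Builds a single-quoted PowerShell string literal. Single-quoted strings
// are not expanded, and an embedded quote is written as two quotes.
function psQuote(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Encodes a script for "powershell -EncodedCommand", which expects
// base64 over the UTF-16LE bytes of the script text.
function encodePowerShell(script: string): string {
  return Buffer.from(script, "utf16le").toString("base64");
}
```

A path check could then be written as ``Test-Path -LiteralPath ${psQuote(deployDir)}`` before being passed through `encodePowerShell`.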
### File Upload
```typescript
private async uploadFile(content: Buffer | string, remotePath: string): Promise<void> {
  // Use PowerShell to write file content
  const base64Content = Buffer.from(content).toString("base64");
  await this.execPowerShell(`
    $bytes = [Convert]::FromBase64String("${base64Content}")
    [IO.File]::WriteAllBytes("${remotePath}", $bytes)
  `);
}
```
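Embedding the whole file in a single command only works for small files, because the encoded script has to fit within the command-line length limit on the target. A sketch of splitting the content into independently decodable chunks is shown below. `chunkBase64` is a hypothetical helper and the 3 KiB default is an illustrative value, not a documented limit. Each chunk would be sent in its own command and appended to the remote file.

```typescript
// Splits content into base64 chunks that can be decoded one at a time.
// chunkBytes must be a multiple of 3 so that no chunk except the last
// carries base64 padding, which keeps every chunk independently decodable
// and makes the concatenated chunks equal to the base64 of the whole input.
function chunkBase64(content: Buffer | string, chunkBytes: number = 3 * 1024): string[] {
  if (chunkBytes <= 0 || chunkBytes % 3 !== 0) {
    throw new Error("chunkBytes must be a positive multiple of 3");
  }

  const data = typeof content === "string" ? Buffer.from(content, "utf-8") : content;
  const chunks: string[] = [];
  for (let offset = 0; offset < data.length; offset += chunkBytes) {
    chunks.push(data.subarray(offset, offset + chunkBytes).toString("base64"));
  }
  return chunks;
}
```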
---
## Security Considerations
### SSH Security
1. **Key-Based Authentication:** Always use SSH keys, never passwords
2. **Key Rotation:** Regularly rotate SSH keys
3. **Bastion Hosts:** Use jump hosts for network isolation
4. **Connection Timeouts:** Enforce strict connection timeouts
5. **Known Hosts:** Verify host fingerprints
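Item 5 can be made concrete by pinning each target's host key fingerprint and comparing it during the handshake. The sketch below uses Node's built-in `crypto` module. `fingerprintSha256` and `isKnownHost` are hypothetical helpers, and how the raw host key reaches them depends on the SSH client library, which this document does not specify.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Formats a host key the way OpenSSH prints SHA256 fingerprints:
// "SHA256:" followed by unpadded base64 of the key's SHA-256 hash.
function fingerprintSha256(hostKey: Buffer): string {
  const digest = createHash("sha256").update(hostKey).digest("base64");
  return "SHA256:" + digest.replace(/=+$/, "");
}

// Compares the presented key against a pinned fingerprint.
function isKnownHost(hostKey: Buffer, pinnedFingerprint: string): boolean {
  const actual = Buffer.from(fingerprintSha256(hostKey), "utf-8");
  const expected = Buffer.from(pinnedFingerprint, "utf-8");
  // timingSafeEqual throws on length mismatch, so check length first.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

The pinned value could be stored alongside `privateKeyRef` in the target configuration, so that a changed host key fails the connection instead of being silently accepted.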
### WinRM Security
1. **HTTPS Required:** Always use WinRM over HTTPS in production
2. **Certificate Validation:** Validate server certificates
3. **Kerberos Preferred:** Use Kerberos when available, NTLM as fallback
4. **Credential Protection:** Store credentials in vault
5. **Session Cleanup:** Always close sessions after use
---
## Configuration Examples
### SSH Target Configuration
```yaml
target:
  name: web-server-01
  type: ssh
  connection:
    host: 192.168.1.100
    port: 22
    username: deploy
    privateKeyRef: vault://ssh-keys/deploy-key
  deployment:
    directory: /opt/myapp
  healthCheck:
    command: curl -f http://localhost:8080/health
    timeout: 30
```
### WinRM Target Configuration
```yaml
target:
  name: windows-server-01
  type: winrm
  connection:
    host: 192.168.1.200
    port: 5986
    useHttps: true
    authType: kerberos
    credentialRef: vault://windows-creds/deploy-user
  deployment:
    directory: C:\Apps\MyApp
  healthCheck:
    command: Invoke-WebRequest -Uri http://localhost:8080/health -UseBasicParsing
    timeout: 30
```
---
## See Also
- [Agent-Based Deployment](agent-based.md)
- [Agents Module](../modules/agents.md)
- [Deployment Orchestrator](../modules/deploy-orchestrator.md)
- [Security Overview](../security/overview.md)