master a62974a8c2 add release orchestrator docs and sprints gaps fills

2026-01-11 01:05:17 +02:00

A/B Release Models

Two models for A/B releases: target-group based and router-based traffic splitting.

Status: Planned (not yet implemented)
Source: Architecture Advisory Section 11.2
Related Modules: Progressive Delivery Module, Traffic Router
Sprint: 110_001 A/B Release Manager

Overview

Stella Ops supports two distinct models for A/B releases:

Target-Group A/B: Scale different target groups to shift workload
Router-Based A/B: Use traffic routers to split requests between variations

Each model has different use cases, trade-offs, and implementation requirements.

Model 1: Target-Group A/B

Target-group A/B splits workload by scaling different groups of targets up or down. It is suitable for worker services, background processors, and other scenarios where sticky sessions are not required.

Configuration

interface TargetGroupABConfig {
  type: "target-group";

  // Group definitions
  groupA: {
    targetGroupId: UUID;
    labels?: Record<string, string>;
  };
  groupB: {
    targetGroupId: UUID;
    labels?: Record<string, string>;
  };

  // Rollout by scaling groups
  rolloutStrategy: {
    type: "scale-groups";
    stages: ScaleStage[];
  };
}

interface ScaleStage {
  name: string;
  groupAPercentage: number;   // Percentage of group A targets active
  groupBPercentage: number;   // Percentage of group B targets active
  duration?: number;          // Auto-advance after duration (seconds)
  healthThreshold?: number;   // Required health % to advance
  requireApproval?: boolean;
}

Example: Worker Service Canary

const workerCanaryConfig: TargetGroupABConfig = {
  type: "target-group",
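  // targetGroupId is required by TargetGroupABConfig; the IDs below are placeholders.
  groupA: { targetGroupId: "<group-a-uuid>", labels: { "worker-group": "A" } },
  groupB: { targetGroupId: "<group-b-uuid>", labels: { "worker-group": "B" } },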
  rolloutStrategy: {
    type: "scale-groups",
    stages: [
      // Stage 1: 100% A, 10% B (canary)
      { name: "canary", groupAPercentage: 100, groupBPercentage: 10,
        duration: 300, healthThreshold: 95 },
      // Stage 2: 100% A, 50% B
      { name: "expand", groupAPercentage: 100, groupBPercentage: 50,
        duration: 600, healthThreshold: 95 },
      // Stage 3: 50% A, 100% B
      { name: "shift", groupAPercentage: 50, groupBPercentage: 100,
        duration: 600, healthThreshold: 95 },
      // Stage 4: 0% A, 100% B (complete)
      { name: "complete", groupAPercentage: 0, groupBPercentage: 100,
        requireApproval: true },
    ],
  },
};
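To make the stage percentages concrete, the sketch below converts a stage into the number of targets to keep active in each group. The helper name `activeTargetsForStage`, the `StageScale` and `ActiveTargets` types, and the round-up policy are assumptions for illustration, not part of the planned API.

```typescript
// Hypothetical helper, not part of the documented interfaces.
// Only the two percentage fields of ScaleStage are needed here.
interface StageScale {
  groupAPercentage: number;
  groupBPercentage: number;
}

interface ActiveTargets {
  activeA: number;
  activeB: number;
}

// Assumption: round up, so any non-zero percentage keeps at least one target active.
function activeTargetsForStage(
  stage: StageScale,
  totalA: number,
  totalB: number,
): ActiveTargets {
  const scale = (pct: number, total: number): number => {
    if (!(pct >= 0 && pct <= 100)) {
      throw new RangeError(`percentage out of range: ${pct}`);
    }
    // Multiply before dividing to avoid fractional rounding noise.
    return Math.ceil((pct * total) / 100);
  };
  return {
    activeA: scale(stage.groupAPercentage, totalA),
    activeB: scale(stage.groupBPercentage, totalB),
  };
}

// The "canary" stage above, applied to two groups of 20 workers each.
const canaryCounts = activeTargetsForStage(
  { groupAPercentage: 100, groupBPercentage: 10 },
  20,
  20,
);
```

Rounding up is one possible policy; rounding down would let a small canary percentage resolve to zero active targets in a small group.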

Use Cases

Background job processors
Worker services without external traffic
Infrastructure-level splitting
Static traffic distribution
Hardware-based variants

Model 2: Router-Based A/B

Router-based A/B uses traffic routers (Nginx, HAProxy, ALB) to split incoming requests between variations. It is suitable for APIs, web services, and other scenarios that require sticky sessions.

Configuration

interface RouterBasedABConfig {
  type: "router-based";

  // Router integration
  routerIntegrationId: UUID;

  // Upstream configuration
  upstreamName: string;
  variationA: {
    targets: string[];
    serviceName?: string;
  };
  variationB: {
    targets: string[];
    serviceName?: string;
  };

  // Traffic split configuration
  trafficSplit: TrafficSplitConfig;

  // Rollout strategy
  rolloutStrategy: RouterRolloutStrategy;
}

interface TrafficSplitConfig {
  type: "weight" | "header" | "cookie" | "tenant" | "composite";

  // Weight-based (percentage)
  weights?: { A: number; B: number };

  // Header-based
  headerName?: string;
  headerValueA?: string;
  headerValueB?: string;

  // Cookie-based
  cookieName?: string;
  cookieValueA?: string;
  cookieValueB?: string;

  // Tenant-based (by host/path)
  tenantRules?: TenantRule[];
}
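Because every strategy-specific field in TrafficSplitConfig is optional, a config can name a `type` without supplying the fields that type needs. The sketch below shows one way such a check might look. `validateTrafficSplit` is a hypothetical name, the rule that weights sum to 100 is an assumption, and `TenantRule` is left opaque because this document does not define it.

```typescript
// TenantRule is referenced by the config but not defined in this document,
// so it is treated as opaque here.
type TenantRule = unknown;

interface TrafficSplitConfig {
  type: "weight" | "header" | "cookie" | "tenant" | "composite";
  weights?: { A: number; B: number };
  headerName?: string;
  headerValueA?: string;
  headerValueB?: string;
  cookieName?: string;
  cookieValueA?: string;
  cookieValueB?: string;
  tenantRules?: TenantRule[];
}

// Hypothetical validator: returns a list of problems, empty when the config is usable.
function validateTrafficSplit(cfg: TrafficSplitConfig): string[] {
  const problems: string[] = [];
  const need = (value: string | undefined, field: string): void => {
    if (value === undefined || value === "") {
      problems.push(`${cfg.type} split requires ${field}`);
    }
  };

  switch (cfg.type) {
    case "weight":
      if (!cfg.weights) {
        problems.push("weight split requires weights");
      } else if (
        cfg.weights.A < 0 ||
        cfg.weights.B < 0 ||
        cfg.weights.A + cfg.weights.B !== 100
      ) {
        // Assumption: weights are non-negative percentages that sum to 100.
        problems.push("weights must be non-negative and sum to 100");
      }
      break;
    case "header":
      need(cfg.headerName, "headerName");
      need(cfg.headerValueA, "headerValueA");
      need(cfg.headerValueB, "headerValueB");
      break;
    case "cookie":
      need(cfg.cookieName, "cookieName");
      need(cfg.cookieValueA, "cookieValueA");
      need(cfg.cookieValueB, "cookieValueB");
      break;
    case "tenant":
      if (!cfg.tenantRules || cfg.tenantRules.length === 0) {
        problems.push("tenant split requires at least one tenant rule");
      }
      break;
    case "composite":
      // Composite semantics are not specified in this document; nothing is checked.
      break;
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller report every missing field at once.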

Rollout Strategy

interface RouterRolloutStrategy {
  type: "manual" | "time-based" | "health-based" | "composite";
  stages: RouterRolloutStage[];
}

interface RouterRolloutStage {
  name: string;
  trafficPercentageB: number;     // % of traffic to variation B

  // Advancement criteria
  duration?: number;              // Auto-advance after duration (seconds)
  healthThreshold?: number;       // Required health %
  errorRateThreshold?: number;    // Max error rate %
  latencyThreshold?: number;      // Max p99 latency ms
  requireApproval?: boolean;

  // Optional: specific routing rules for this stage
  routingOverrides?: TrafficSplitConfig;
}
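The advancement criteria above can be read as a gate that must pass before the rollout moves to the next stage. The following is a minimal sketch of such a gate. The `StageObservation` shape, the `canAdvance` name, and the choice to treat every specified criterion as mandatory are assumptions; the document does not say how criteria combine.

```typescript
// Advancement fields copied from RouterRolloutStage.
interface StageCriteria {
  duration?: number;           // seconds
  healthThreshold?: number;    // required health %
  errorRateThreshold?: number; // max error rate %
  latencyThreshold?: number;   // max p99 latency ms
  requireApproval?: boolean;
}

// Hypothetical snapshot of what has been observed for the current stage.
interface StageObservation {
  elapsedSeconds: number;
  healthPercent: number;
  errorRatePercent: number;
  p99LatencyMs: number;
  approved: boolean;
}

interface AdvanceDecision {
  advance: boolean;
  reason?: string; // set only when advance is false
}

// Assumption: every criterion that is present must be satisfied.
function canAdvance(c: StageCriteria, o: StageObservation): AdvanceDecision {
  if (c.duration !== undefined && o.elapsedSeconds < c.duration) {
    return { advance: false, reason: "stage duration has not elapsed" };
  }
  if (c.healthThreshold !== undefined && o.healthPercent < c.healthThreshold) {
    return { advance: false, reason: "health below threshold" };
  }
  if (c.errorRateThreshold !== undefined && o.errorRatePercent > c.errorRateThreshold) {
    return { advance: false, reason: "error rate above threshold" };
  }
  if (c.latencyThreshold !== undefined && o.p99LatencyMs > c.latencyThreshold) {
    return { advance: false, reason: "p99 latency above threshold" };
  }
  if (c.requireApproval === true && !o.approved) {
    return { advance: false, reason: "waiting for approval" };
  }
  return { advance: true };
}
```

A stage with no criteria at all would advance immediately under this sketch; a real implementation may prefer to reject such a stage at configuration time.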

Example: API Canary with Health-Based Advancement

const apiCanaryConfig: RouterBasedABConfig = {
  type: "router-based",
  routerIntegrationId: "nginx-prod",
  upstreamName: "api-backend",
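  // `targets` is required by RouterBasedABConfig; it is left empty here on the
  // assumption that targets are resolved from serviceName.
  variationA: { targets: [], serviceName: "api-v1" },
  variationB: { targets: [], serviceName: "api-v2" },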
  trafficSplit: { type: "weight", weights: { A: 100, B: 0 } },
  rolloutStrategy: {
    type: "health-based",
    stages: [
      { name: "canary-10", trafficPercentageB: 10,
        duration: 300, healthThreshold: 99, errorRateThreshold: 1 },
      { name: "canary-25", trafficPercentageB: 25,
        duration: 600, healthThreshold: 99, errorRateThreshold: 1 },
      { name: "canary-50", trafficPercentageB: 50,
        duration: 900, healthThreshold: 99, errorRateThreshold: 1 },
      { name: "promote", trafficPercentageB: 100,
        requireApproval: true },
    ],
  },
};
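Each stage in this example only states `trafficPercentageB`, while the router is configured with an A/B weight pair. A small sketch of that conversion follows. `weightsForStage` is a hypothetical helper, and it assumes variation A always receives the remainder of the traffic.

```typescript
interface SplitWeights {
  A: number;
  B: number;
}

// Hypothetical helper: derive router weights from a stage's trafficPercentageB.
// Assumption: A receives whatever share B does not.
function weightsForStage(trafficPercentageB: number): SplitWeights {
  if (!(trafficPercentageB >= 0 && trafficPercentageB <= 100)) {
    throw new RangeError(`trafficPercentageB out of range: ${trafficPercentageB}`);
  }
  return { A: 100 - trafficPercentageB, B: trafficPercentageB };
}

// Weights for the four stages of the example above, in order.
const stagePercentages: number[] = [10, 25, 50, 100];
const weightPlan: SplitWeights[] = stagePercentages.map((p) => weightsForStage(p));
```

For the stages above this yields 90/10, 75/25, 50/50, and 0/100.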

Use Cases

API services with external traffic
Web applications with user sessions
Dynamic traffic distribution
User-based variants (A/B testing)
Feature flags and gradual rollouts

Routing Strategies

Weight-Based Routing

Splits traffic by percentage across variations.

trafficSplit:
  type: weight
  weights:
    A: 90
    B: 10
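In practice the router (Nginx, HAProxy, ALB) performs the split itself, but the selection logic can be sketched as a weighted pick. `pickByWeight` and the `roll` parameter are illustrative names; `roll` is injectable so the behaviour can be checked without randomness.

```typescript
type Variation = "A" | "B";

// Illustrative weighted pick. `roll` is expected to be in [0, 1).
function pickByWeight(
  weights: { A: number; B: number },
  roll: number = Math.random(),
): Variation {
  const total = weights.A + weights.B;
  if (weights.A < 0 || weights.B < 0 || total <= 0) {
    throw new RangeError("weights must be non-negative with a positive sum");
  }
  // Scale the roll to the total so weights need not sum to exactly 100.
  return roll * total < weights.A ? "A" : "B";
}

// With the 90/10 split above, a roll of 0.5 lands in A's share and 0.95 in B's.
const midRoll = pickByWeight({ A: 90, B: 10 }, 0.5);
const highRoll = pickByWeight({ A: 90, B: 10 }, 0.95);
```

A per-request random pick like this is not sticky: the same client may see both variations across requests.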

Header-Based Routing

Routes based on request header values.

trafficSplit:
  type: header
  headerName: X-Feature-Flag
  headerValueA: "control"
  headerValueB: "experiment"
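The matching rule implied by this configuration can be sketched as follows. `variationFromHeaders` is a hypothetical function, and falling back to variation A when the header is missing or unrecognised is an assumption, since the document does not specify a default.

```typescript
type Variation = "A" | "B";

interface HeaderSplit {
  headerName: string;
  headerValueA: string;
  headerValueB: string;
}

// Hypothetical resolver for a header-based split.
// Header names are compared case-insensitively, as HTTP header names are.
function variationFromHeaders(
  split: HeaderSplit,
  headers: Record<string, string>,
  fallback: Variation = "A", // assumption: default to the control variation
): Variation {
  const wanted = split.headerName.toLowerCase();
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() !== wanted) continue;
    const value = headers[name];
    if (value === split.headerValueA) return "A";
    if (value === split.headerValueB) return "B";
  }
  return fallback;
}

const featureFlagSplit: HeaderSplit = {
  headerName: "X-Feature-Flag",
  headerValueA: "control",
  headerValueB: "experiment",
};
const chosen = variationFromHeaders(featureFlagSplit, { "x-feature-flag": "experiment" });
```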

Cookie-Based Routing

Routes based on cookie values for sticky sessions.

trafficSplit:
  type: cookie
  cookieName: ab_variation
  cookieValueA: "A"
  cookieValueB: "B"
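Stickiness comes from reading the variation back out of the request's Cookie header on each request. The sketch below shows that lookup under simple assumptions: `parseCookies` and `variationFromCookie` are hypothetical helpers, cookie values are compared verbatim without decoding, and a missing or unknown cookie yields `undefined` so the caller can assign a variation by some other rule.

```typescript
type Variation = "A" | "B";

interface CookieSplit {
  cookieName: string;
  cookieValueA: string;
  cookieValueB: string;
}

// Minimal Cookie header parser: "k1=v1; k2=v2" -> { k1: "v1", k2: "v2" }.
function parseCookies(header: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue; // skip fragments without a value
    out[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return out;
}

// Returns the pinned variation, or undefined when the client is not yet pinned.
function variationFromCookie(
  split: CookieSplit,
  cookieHeader: string | undefined,
): Variation | undefined {
  if (!cookieHeader) return undefined;
  const value = parseCookies(cookieHeader)[split.cookieName];
  if (value === split.cookieValueA) return "A";
  if (value === split.cookieValueB) return "B";
  return undefined;
}

const abCookieSplit: CookieSplit = {
  cookieName: "ab_variation",
  cookieValueA: "A",
  cookieValueB: "B",
};
const pinned = variationFromCookie(abCookieSplit, "session=xyz; ab_variation=B");
```

When the result is `undefined`, a caller could pick a variation by weight and then set the cookie so later requests stay on the same variation.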

Comparison Matrix

Aspect	Target-Group A/B	Router-Based A/B
Traffic Control	By scaling targets	By routing rules
Sticky Sessions	Not supported	Supported
Granularity	Target-level	Request-level
External Traffic	Not required	Required
Infrastructure	Target groups	Traffic router
Use Case	Workers, batch jobs	APIs, web apps
Rollback Speed	Slower (scaling)	Immediate (routing)
