# Break-Glass Account Runbook

This runbook documents emergency access procedures using the break-glass account system when standard authentication is unavailable.

> **Sprint:** SPRINT_20260112_018_AUTH_local_rbac_fallback

## Overview

Break-glass accounts provide emergency administrative access when:

- PostgreSQL database is unavailable
- OIDC/OAuth2 identity provider is unreachable
- Authority service is degraded
- Network isolation prevents standard authentication

Break-glass access is fully audited and time-limited by design.

## When to Use Break-Glass Access

| Scenario | Standard Auth | Break-Glass |
|----------|---------------|-------------|
| Database maintenance | N/A | Use |
| IdP outage | Unavailable | Use |
| Network partition | Unavailable | Use |
| Routine operations | Available | Do NOT use |
| Security incident response | May be unavailable | Use with incident code |

**CRITICAL:** Break-glass access should only be used when standard authentication is genuinely unavailable. All usage is logged and auditable.

## Prerequisites

### Configuration Requirements

Break-glass must be explicitly enabled in local policy:

```yaml
# /etc/stellaops/authority/local-policy.yaml
breakGlass:
  enabled: true
  sessionTimeoutMinutes: 15
  maxExtensions: 2
  allowedReasonCodes:
    - database_maintenance
    - idp_outage
    - network_partition
    - security_incident
    - disaster_recovery
  accounts:
    - id: "break-glass-admin"
      passwordHash: "$argon2id$v=19$m=65536,t=3,p=4$..."
      roles: ["admin"]
```

### Password Hash Generation

Generate password hashes using Argon2id:

```bash
# Using argon2 CLI tool
# (-m 16 sets memory to 2^16 KiB = 65536 KiB, matching m=65536 in the hash above)
echo -n "your-secure-password" | argon2 "$(openssl rand -base64 16)" -id -t 3 -m 16 -p 4 -l 32 -e

# Or using stella CLI
stella auth hash-password --algorithm argon2id
```

## Break-Glass Login Procedure

### Step 1: Verify Standard Auth is Unavailable

Before using break-glass, confirm standard authentication is genuinely unavailable:

```bash
# Check Authority health
curl -s https://authority.example.com/health | jq .

# Check OIDC endpoint
curl -s https://idp.example.com/.well-known/openid-configuration

# Check database connectivity
stella doctor check --component postgres
```

### Step 2: Access Break-Glass Login

Navigate to the break-glass endpoint:

```
https://authority.example.com/break-glass/login
```

Or use the CLI:

```bash
stella auth break-glass login \
  --account break-glass-admin \
  --reason database_maintenance
```

### Step 3: Provide Credentials and Reason

| Field | Description | Required |
|-------|-------------|----------|
| Account ID | Break-glass account identifier | Yes |
| Password | Account password | Yes |
| Reason Code | Pre-approved reason code | Yes |
| Reason Details | Free-text explanation | Recommended |

**Approved Reason Codes:**

| Code | Description |
|------|-------------|
| `database_maintenance` | Scheduled or emergency database work |
| `idp_outage` | Identity provider unavailable |
| `network_partition` | Network connectivity issues |
| `security_incident` | Active security incident response |
| `disaster_recovery` | DR/BCP activation |

### Step 4: Session Created

On successful authentication:

- Session token issued with limited TTL (default: 15 minutes)
- Audit event logged: `breakglass.session.created`
- All subsequent actions are tagged with break-glass context

## Session Management

### Session Timeout

Break-glass sessions have strict time limits:

| Setting | Default | Description |
|---------|---------|-------------|
| `sessionTimeoutMinutes` | 15 | Session lifetime |
| `maxExtensions` | 2 | Maximum session extensions |
| Extension period | 15 min | Time added per extension |

### Extending a Session

If additional time is needed:

```bash
# CLI
stella auth break-glass extend \
  --session-id <session-id> \
  --reason "database migration still running"

# UI
# Click "Extend Session" button in break-glass banner
```

Extension requires:

1. Re-entering password
2. Providing extension reason
3. Not exceeding `maxExtensions` limit

### Session Termination

Sessions end when:

- User explicitly logs out
- Session timeout expires
- Max extensions reached
- Administrator force-terminates

```bash
# Explicit logout
stella auth break-glass logout --session-id <session-id>

# Force terminate (admin)
stella auth break-glass terminate --session-id <session-id> --reason "normal auth restored"
```

## Audit Trail

### Audit Events

All break-glass activity is logged:

| Event | Description |
|-------|-------------|
| `breakglass.session.created` | Session started |
| `breakglass.session.extended` | Session extended |
| `breakglass.session.terminated` | User logout |
| `breakglass.session.expired` | Timeout reached |
| `breakglass.auth.failed` | Authentication failed |
| `breakglass.reason.invalid` | Invalid reason code |
| `breakglass.extensions.exceeded` | Max extensions reached |

### Audit Event Structure

```json
{
  "eventType": "breakglass.session.created",
  "timestamp": "2026-01-16T10:30:00Z",
  "accountId": "break-glass-admin",
  "sessionId": "bg-sess-abc123",
  "reasonCode": "database_maintenance",
  "reasonDetails": "PostgreSQL major version upgrade",
  "sourceIp": "10.0.1.50",
  "userAgent": "stella-cli/2027.Q1"
}
```

### Querying Audit Logs

```bash
# List all break-glass events
stella audit query --event-type "breakglass.*" --since "24h"

# Export for compliance
stella audit export \
  --event-type "breakglass.*" \
  --start 2026-01-01 \
  --end 2026-01-31 \
  --format json \
  --output break-glass-audit-jan2026.json
```

## Fallback Policy Store

### Automatic Failover

When PostgreSQL becomes unavailable:

1. Authority detects health check failures
2. After `failureThreshold` (default: 3) consecutive failures
3. Authority switches to local policy store
4. Mode changes to `Fallback`
5. Event logged: `authority.mode.changed`

### Policy Store Modes

| Mode | Description | Available Features |
|------|-------------|-------------------|
| `Primary` | PostgreSQL available | Full RBAC, user management |
| `Fallback` | Using local policy | Break-glass only |
| `Degraded` | Both stores degraded | Emergency access only |

### Recovery

When PostgreSQL recovers:

1. Health checks pass
2. After `minFallbackDurationMs` (default: 30s) cooldown
3. Authority switches back to Primary
4. Fallback sessions can continue until expiry

## Security Considerations

### Password Policy

Break-glass account passwords should:

- Be at least 20 characters long
- Include uppercase and lowercase letters, numbers, and symbols
- Be stored securely (HSM, Vault, split custody)
- Be rotated on a schedule (quarterly recommended)

### Access Control

- Limit break-glass accounts to essential personnel
- Use separate accounts per operator when possible
- Review access list quarterly
- Disable unused accounts immediately

### Monitoring

Set up alerts for break-glass activity:

```yaml
# Alert rule example
- alert: BreakGlassSessionCreated
  # increase() over a window fires only on newly created sessions; comparing
  # the raw counter (> 0) would keep firing after the first session ever created.
  # The 5m window is an example value; tune it to your scrape interval.
  expr: increase(stellaops_breakglass_sessions_created_total[5m]) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Break-glass session created
    description: A break-glass session was created. Verify this is expected.
```

## Troubleshooting

### Login Failures

| Error | Cause | Resolution |
|-------|-------|------------|
| `invalid_credentials` | Wrong password | Verify password |
| `invalid_reason_code` | Reason not in allowed list | Use approved reason code |
| `account_disabled` | Account explicitly disabled | Contact administrator |
| `break_glass_disabled` | Feature disabled in config | Enable in local-policy.yaml |

### Session Issues

| Issue | Cause | Resolution |
|-------|-------|------------|
| Session expired immediately | Clock skew | Sync server time |
| Cannot extend | Max extensions reached | Log out and re-authenticate |
| Actions failing | Insufficient roles | Verify account has required roles |

### Policy Store Issues

```bash
# Check policy store status
stella doctor check --component authority

# Verify local policy file
stella auth policy validate --file /etc/stellaops/authority/local-policy.yaml

# Force reload policy
stella auth policy reload
```

## Compliance Notes

Break-glass usage must be:

- Documented in incident reports
- Reviewed during security audits
- Reported in compliance dashboards
- Justified for each session

Retain audit logs for:

- SOC 2: 1 year minimum
- HIPAA: 6 years
- PCI-DSS: 1 year
- Internal policy: As defined

## Related Documentation

- [Local RBAC Policy Schema](../modules/authority/local-policy-schema.md)
- [Authority Architecture](../modules/authority/architecture.md)
- [Offline Operations](../operations/airgap-operations-runbook.md)
- [Audit System](../modules/audit/architecture.md)
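
## Appendix: Session Limit Sketch

The session limits under Session Management and the reason-code allow-list under Prerequisites can be sanity-checked with a small script before an incident. The following is a minimal sketch in plain bash, not part of the `stella` CLI: the function names `max_session_minutes` and `is_allowed_reason` are hypothetical, and the values are hard-coded from the defaults documented above rather than read from `local-policy.yaml`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for planning break-glass work; not part of the stella CLI.
# All values mirror the documented defaults and are assumptions if your
# local-policy.yaml differs.

# Reason codes copied from allowedReasonCodes in the example policy.
ALLOWED_REASON_CODES=(
  database_maintenance
  idp_outage
  network_partition
  security_incident
  disaster_recovery
)

# Return 0 if the given reason code is in the allow-list, 1 otherwise.
is_allowed_reason() {
  local candidate="$1" code
  for code in "${ALLOWED_REASON_CODES[@]}"; do
    if [ "$code" = "$candidate" ]; then
      return 0
    fi
  done
  return 1
}

# Print the longest possible session lifetime in minutes:
# the initial timeout plus every permitted extension.
# Usage: max_session_minutes <sessionTimeoutMinutes> <maxExtensions> <extensionMinutes>
max_session_minutes() {
  local timeout="$1" extensions="$2" extension_minutes="$3"
  echo $(( timeout + extensions * extension_minutes ))
}

# With the documented defaults: 15 + 2 * 15
max_session_minutes 15 2 15   # prints 45
```

With the defaults, a single break-glass session can last at most 45 minutes, so work expected to run longer should plan for a fresh login with a new reason entry rather than rely on extensions.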