Break-Glass Account Runbook
This runbook documents emergency access procedures using the break-glass account system when standard authentication is unavailable.
Sprint: SPRINT_20260112_018_AUTH_local_rbac_fallback
Overview
Break-glass accounts provide emergency administrative access when:
- PostgreSQL database is unavailable
- OIDC/OAuth2 identity provider is unreachable
- Authority service is degraded
- Network isolation prevents standard authentication
Break-glass access is fully audited and time-limited by design.
When to Use Break-Glass Access
| Scenario | Standard Auth | Break-Glass |
|---|---|---|
| Database maintenance | N/A | Use |
| IdP outage | Unavailable | Use |
| Network partition | Unavailable | Use |
| Routine operations | Available | Do NOT use |
| Security incident response | May be unavailable | Use with incident code |
CRITICAL: Break-glass access should only be used when standard authentication is genuinely unavailable. All usage is logged and auditable.
Prerequisites
Configuration Requirements
Break-glass must be explicitly enabled in local policy:
# /etc/stellaops/authority/local-policy.yaml
breakGlass:
  enabled: true
  sessionTimeoutMinutes: 15
  maxExtensions: 2
  allowedReasonCodes:
    - database_maintenance
    - idp_outage
    - network_partition
    - security_incident
    - disaster_recovery
  accounts:
    - id: "break-glass-admin"
      passwordHash: "$argon2id$v=19$m=65536,t=3,p=4$..."
      roles: ["admin"]
Password Hash Generation
Generate password hashes using Argon2id:
# Using argon2 CLI tool
echo -n "your-secure-password" | argon2 $(openssl rand -base64 16) -id -t 3 -m 16 -p 4 -l 32 -e
# Or using stella CLI
stella auth hash-password --algorithm argon2id
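When checking a generated hash against the configuration, it helps to read the cost parameters back out of the encoded string. Note that the reference argon2 CLI takes memory as a power of two, so `-m 16` means 2^16 = 65536 KiB, which matches `m=65536` in the example hash. The following is a minimal sketch; the helper name `parse_argon2_params` is hypothetical and it only inspects the parameter segment, it does not verify a password.

```python
# Hypothetical helper: read variant, version and cost parameters from an
# encoded Argon2 string such as "$argon2id$v=19$m=65536,t=3,p=4$...".

def parse_argon2_params(encoded):
    """Return a dict with the variant, version, and m/t/p cost parameters."""
    parts = encoded.split("$")
    # The string starts with "$", so parts[0] is empty.
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not an encoded Argon2 hash")
    variant = parts[1]
    version = int(parts[2].split("=", 1)[1])
    costs = {}
    for item in parts[3].split(","):
        key, value = item.split("=", 1)
        costs[key] = int(value)
    return {"variant": variant, "version": version, **costs}


def cli_memory_kib(exponent):
    """Memory in KiB for the argon2 CLI's -m flag, which is a log2 value."""
    return 2 ** exponent


params = parse_argon2_params("$argon2id$v=19$m=65536,t=3,p=4$...")
print(params["variant"], params["m"] == cli_memory_kib(16))
```

A mismatch between the parsed values and the flags used at generation time usually means the hash was produced with different settings than intended.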
Break-Glass Login Procedure
Step 1: Verify Standard Auth is Unavailable
Before using break-glass, confirm standard authentication is genuinely unavailable:
# Check Authority health
curl -s https://authority.example.com/health | jq .
# Check OIDC endpoint
curl -s https://idp.example.com/.well-known/openid-configuration
# Check database connectivity
stella doctor check --component postgres
Step 2: Access Break-Glass Login
Navigate to the break-glass endpoint:
https://authority.example.com/break-glass/login
Or use the CLI:
stella auth break-glass login \
--account break-glass-admin \
--reason database_maintenance
Step 3: Provide Credentials and Reason
| Field | Description | Required |
|---|---|---|
| Account ID | Break-glass account identifier | Yes |
| Password | Account password | Yes |
| Reason Code | Pre-approved reason code | Yes |
| Reason Details | Free-text explanation | Recommended |
Approved Reason Codes:
| Code | Description |
|---|---|
| `database_maintenance` | Scheduled or emergency database work |
| `idp_outage` | Identity provider unavailable |
| `network_partition` | Network connectivity issues |
| `security_incident` | Active security incident response |
| `disaster_recovery` | DR/BCP activation |
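The feature-flag and reason-code checks implied by the configuration can be sketched in Python. This is an illustration only, not Authority's actual implementation: the function name `precheck_login`, the dict layout (mirroring the local-policy.yaml example), the order of the checks, and the choice to report an unknown account as `invalid_credentials` are all assumptions. Password verification is deliberately left out.

```python
# Hypothetical pre-check for a break-glass login request.
# The policy dict mirrors the local-policy.yaml example; check order is an assumption.

POLICY = {
    "breakGlass": {
        "enabled": True,
        "allowedReasonCodes": [
            "database_maintenance",
            "idp_outage",
            "network_partition",
            "security_incident",
            "disaster_recovery",
        ],
        "accounts": [{"id": "break-glass-admin", "roles": ["admin"]}],
    }
}


def precheck_login(policy, account_id, reason_code):
    """Return an error code string, or None if the request may proceed."""
    bg = policy.get("breakGlass", {})
    if not bg.get("enabled", False):
        return "break_glass_disabled"
    if reason_code not in bg.get("allowedReasonCodes", []):
        return "invalid_reason_code"
    known_ids = {account["id"] for account in bg.get("accounts", [])}
    if account_id not in known_ids:
        # Assumption: an unknown account is reported like a wrong password.
        return "invalid_credentials"
    return None


print(precheck_login(POLICY, "break-glass-admin", "idp_outage"))
print(precheck_login(POLICY, "break-glass-admin", "routine_work"))
```

Rejecting an unlisted reason code before touching credentials keeps the failure mode cheap and easy to audit, though the real service may order these checks differently.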
Step 4: Session Created
On successful authentication:
- Session token issued with limited TTL (default: 15 minutes)
- Audit event logged: `breakglass.session.created`
- All subsequent actions are tagged with break-glass context
Session Management
Session Timeout
Break-glass sessions have strict time limits:
| Setting | Default | Description |
|---|---|---|
| `sessionTimeoutMinutes` | 15 | Session lifetime |
| `maxExtensions` | 2 | Maximum session extensions |
| Extension period | 15 min | Time added per extension |
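With the defaults above, a session that uses every extension lasts at most 15 + 2 × 15 = 45 minutes. The sketch below works that out for a given start time. It assumes each extension is added to the end of the current lifetime rather than restarting the clock at the moment of extension; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def latest_possible_expiry(started_at, session_timeout_minutes=15,
                           max_extensions=2, extension_minutes=15):
    """Latest time a session can still be valid if every extension is used.

    Assumption: each extension adds extension_minutes to the current expiry.
    """
    total_minutes = session_timeout_minutes + max_extensions * extension_minutes
    return started_at + timedelta(minutes=total_minutes)


start = datetime(2026, 1, 16, 10, 30, tzinfo=timezone.utc)
print(latest_possible_expiry(start).isoformat())
```

Work that cannot finish inside this window needs a fresh login with a new reason, so plan long maintenance tasks accordingly.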
Extending a Session
If additional time is needed:
# CLI
stella auth break-glass extend \
--session-id <session-id> \
--reason "database migration still running"
# UI
# Click "Extend Session" button in break-glass banner
Extension requires:
- Re-entering password
- Providing extension reason
- Not exceeding the `maxExtensions` limit
Session Termination
Sessions end when:
- User explicitly logs out
- Session timeout expires
- Max extensions reached
- Administrator force-terminates
# Explicit logout
stella auth break-glass logout --session-id <session-id>
# Force terminate (admin)
stella auth break-glass terminate --session-id <session-id> --reason "normal auth restored"
Audit Trail
Audit Events
All break-glass activity is logged:
| Event | Description |
|---|---|
| `breakglass.session.created` | Session started |
| `breakglass.session.extended` | Session extended |
| `breakglass.session.terminated` | User logout |
| `breakglass.session.expired` | Timeout reached |
| `breakglass.auth.failed` | Authentication failed |
| `breakglass.reason.invalid` | Invalid reason code |
| `breakglass.extensions.exceeded` | Max extensions reached |
Audit Event Structure
{
  "eventType": "breakglass.session.created",
  "timestamp": "2026-01-16T10:30:00Z",
  "accountId": "break-glass-admin",
  "sessionId": "bg-sess-abc123",
  "reasonCode": "database_maintenance",
  "reasonDetails": "PostgreSQL major version upgrade",
  "sourceIp": "10.0.1.50",
  "userAgent": "stella-cli/2027.Q1"
}
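When consuming these events in a script, a small structural check catches truncated or malformed records early. The sketch below parses one event and converts its timestamp to a timezone-aware value. Which fields are mandatory is an assumption based on the `breakglass.session.created` example; other event types may carry a different set.

```python
import json
from datetime import datetime

# Assumption: these fields are present on every session.created event.
REQUIRED_FIELDS = ("eventType", "timestamp", "accountId", "sessionId", "reasonCode")


def check_event(raw):
    """Parse one audit event and fail loudly if expected fields are missing."""
    event = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if missing:
        raise ValueError("audit event missing fields: " + ", ".join(missing))
    # Replace the trailing "Z" so older Python versions can parse the UTC suffix.
    event["timestamp"] = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    return event


SAMPLE = """{
  "eventType": "breakglass.session.created",
  "timestamp": "2026-01-16T10:30:00Z",
  "accountId": "break-glass-admin",
  "sessionId": "bg-sess-abc123",
  "reasonCode": "database_maintenance"
}"""

print(check_event(SAMPLE)["timestamp"].isoformat())
```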
Querying Audit Logs
# List all break-glass events
stella audit query --event-type "breakglass.*" --since "24h"
# Export for compliance
stella audit export \
--event-type "breakglass.*" \
--start 2026-01-01 \
--end 2026-01-31 \
--format json \
--output break-glass-audit-jan2026.json
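For a compliance review it is often enough to count sessions per reason code and per account. The following sketch assumes the exported file is a JSON array of objects shaped like the audit event shown earlier; the actual export layout is not specified here, so adjust the loader if the file uses another structure (for example one JSON object per line). The function names are hypothetical.

```python
import json
from collections import Counter


def summarize_sessions(events):
    """Count created break-glass sessions by reason code and by account."""
    created = [e for e in events if e.get("eventType") == "breakglass.session.created"]
    return {
        "sessions": len(created),
        "by_reason": dict(Counter(e.get("reasonCode") for e in created)),
        "by_account": dict(Counter(e.get("accountId") for e in created)),
    }


def summarize_export(path):
    """Load an export file (assumed: a JSON array of events) and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_sessions(json.load(handle))


example = [
    {"eventType": "breakglass.session.created", "accountId": "break-glass-admin",
     "reasonCode": "idp_outage"},
    {"eventType": "breakglass.session.extended", "accountId": "break-glass-admin"},
    {"eventType": "breakglass.session.created", "accountId": "break-glass-admin",
     "reasonCode": "database_maintenance"},
]
print(summarize_sessions(example))
```

Usage against a real export would look like `summarize_export("break-glass-audit-jan2026.json")`.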
Fallback Policy Store
Automatic Failover
When PostgreSQL becomes unavailable:
- Authority detects health check failures
- After `failureThreshold` (default: 3) consecutive failures, Authority switches to the local policy store
- Mode changes to `Fallback`
- Event logged: `authority.mode.changed`
Policy Store Modes
| Mode | Description | Available Features |
|---|---|---|
| `Primary` | PostgreSQL available | Full RBAC, user management |
| `Fallback` | Using local policy | Break-glass only |
| `Degraded` | Both degraded | Emergency access only |
Recovery
When PostgreSQL recovers:
- Health checks pass
- After the `minFallbackDurationMs` (default: 30s) cooldown, Authority switches back to `Primary`
- Fallback sessions can continue until expiry
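The failover and recovery rules can be modelled as a small state machine. This is a sketch under stated assumptions, not Authority's code: the class name is hypothetical, the cooldown is measured from the moment Fallback mode was entered, and the `Degraded` mode is left out because its entry conditions are not described here.

```python
class PolicyStoreModeTracker:
    """Toy model of the Primary/Fallback switch driven by health checks."""

    def __init__(self, failure_threshold=3, min_fallback_duration_ms=30_000):
        self.failure_threshold = failure_threshold
        self.min_fallback_duration_ms = min_fallback_duration_ms
        self.mode = "Primary"
        self.consecutive_failures = 0
        self.fallback_since_ms = None

    def observe(self, healthy, now_ms):
        """Feed one health check result and return the resulting mode."""
        if self.mode == "Primary":
            if healthy:
                self.consecutive_failures = 0
            else:
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.failure_threshold:
                    self.mode = "Fallback"
                    self.fallback_since_ms = now_ms
        else:
            # Assumption: the cooldown counts from entry into Fallback mode.
            elapsed = now_ms - self.fallback_since_ms
            if healthy and elapsed >= self.min_fallback_duration_ms:
                self.mode = "Primary"
                self.consecutive_failures = 0
                self.fallback_since_ms = None
        return self.mode


tracker = PolicyStoreModeTracker()
for t in (0, 1_000, 2_000):
    print(t, tracker.observe(False, t))
print(10_000, tracker.observe(True, 10_000))
print(32_000, tracker.observe(True, 32_000))
```

The cooldown prevents the mode from flapping when PostgreSQL recovers briefly and then fails again.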
Security Considerations
Password Policy
Break-glass account passwords should:
- Be at least 20 characters
- Include upper, lower, numbers, symbols
- Be stored securely (HSM, Vault, split custody)
- Be rotated on a schedule (quarterly recommended)
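The length and character-class rules above are easy to check before a password is hashed. The sketch below is a hypothetical helper, treating "symbols" as the ASCII punctuation set from Python's `string` module; storage and rotation requirements cannot be verified this way.

```python
import string


def password_policy_violations(password, min_length=20):
    """Return a list of human-readable policy violations (empty if compliant)."""
    violations = []
    if len(password) < min_length:
        violations.append("shorter than %d characters" % min_length)
    if not any(c.isupper() for c in password):
        violations.append("no uppercase letter")
    if not any(c.islower() for c in password):
        violations.append("no lowercase letter")
    if not any(c.isdigit() for c in password):
        violations.append("no digit")
    # Assumption: "symbols" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        violations.append("no symbol")
    return violations


print(password_policy_violations("Correct-Horse-Battery-9-Staple!"))
print(password_policy_violations("short"))
```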
Access Control
- Limit break-glass accounts to essential personnel
- Use separate accounts per operator when possible
- Review access list quarterly
- Disable unused accounts immediately
Monitoring
Set up alerts for break-glass activity:
# Alert rule example
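# increase() over a window fires only on new sessions; a bare "counter > 0"
# would keep firing once any session had ever been created.
# The 5m window is a suggested value; tune it to your scrape interval.
- alert: BreakGlassSessionCreated
  expr: increase(stellaops_breakglass_sessions_created_total[5m]) > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Break-glass session created
    description: A break-glass session was created. Verify this is expected.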
Troubleshooting
Login Failures
| Error | Cause | Resolution |
|---|---|---|
| `invalid_credentials` | Wrong password | Verify password |
| `invalid_reason_code` | Reason not in allowed list | Use approved reason code |
| `account_disabled` | Account explicitly disabled | Contact administrator |
| `break_glass_disabled` | Feature disabled in config | Enable in local-policy.yaml |
Session Issues
| Issue | Cause | Resolution |
|---|---|---|
| Session expired immediately | Clock skew | Sync server time |
| Cannot extend | Max extensions reached | Log out and re-authenticate |
| Actions failing | Insufficient roles | Verify account has required roles |
Policy Store Issues
# Check policy store status
stella doctor check --component authority
# Verify local policy file
stella auth policy validate --file /etc/stellaops/authority/local-policy.yaml
# Force reload policy
stella auth policy reload
Compliance Notes
Break-glass usage must be:
- Documented in incident reports
- Reviewed during security audits
- Reported in compliance dashboards
- Justified for each session
Retain audit logs for:
- SOC 2: 1 year minimum
- HIPAA: 6 years
- PCI-DSS: 1 year
- Internal policy: As defined