# Break-Glass Account Runbook

This runbook documents emergency access procedures using the break-glass account system when standard authentication is unavailable.

> **Sprint:** SPRINT_20260112_018_AUTH_local_rbac_fallback

## Overview

Break-glass accounts provide emergency administrative access when:

- PostgreSQL database is unavailable
- OIDC/OAuth2 identity provider is unreachable
- Authority service is degraded
- Network isolation prevents standard authentication

Break-glass access is fully audited and time-limited by design.

## When to Use Break-Glass Access

| Scenario | Standard Auth | Break-Glass |
|----------|---------------|-------------|
| Database maintenance | N/A | Use |
| IdP outage | Unavailable | Use |
| Network partition | Unavailable | Use |
| Routine operations | Available | Do NOT use |
| Security incident response | May be unavailable | Use with incident code |

**CRITICAL:** Break-glass access should only be used when standard authentication is genuinely unavailable. All usage is logged and auditable.

## Prerequisites

### Configuration Requirements

Break-glass must be explicitly enabled in local policy:

```yaml
# /etc/stellaops/authority/local-policy.yaml
breakGlass:
  enabled: true
  sessionTimeoutMinutes: 15
  maxExtensions: 2
  allowedReasonCodes:
    - database_maintenance
    - idp_outage
    - network_partition
    - security_incident
    - disaster_recovery
  accounts:
    - id: "break-glass-admin"
      passwordHash: "$argon2id$v=19$m=65536,t=3,p=4$..."
      roles: ["admin"]
```

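Before relying on this file in an emergency, it can help to sanity-check its contents. The following is a minimal sketch in Python, assuming the policy has already been parsed into a dictionary with the same keys as the YAML above. The function name `check_break_glass_policy` and the particular checks are hypothetical illustrations and are not part of the `stella` tooling.

```python
def check_break_glass_policy(policy: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    bg = policy.get("breakGlass", {})
    if not bg.get("enabled", False):
        problems.append("breakGlass.enabled is not true")
    if bg.get("sessionTimeoutMinutes", 0) <= 0:
        problems.append("sessionTimeoutMinutes must be positive")
    if bg.get("maxExtensions", 0) < 0:
        problems.append("maxExtensions must not be negative")
    if not bg.get("allowedReasonCodes"):
        problems.append("allowedReasonCodes is empty")
    accounts = bg.get("accounts", [])
    if not accounts:
        problems.append("no break-glass accounts defined")
    for account in accounts:
        account_id = account.get("id", "<missing id>")
        if not account.get("passwordHash", "").startswith("$argon2id$"):
            problems.append(f"{account_id}: passwordHash is not an Argon2id hash")
        if not account.get("roles"):
            problems.append(f"{account_id}: no roles assigned")
    return problems


# Dictionary form of the example configuration (hash truncated as in the YAML).
example_policy = {
    "breakGlass": {
        "enabled": True,
        "sessionTimeoutMinutes": 15,
        "maxExtensions": 2,
        "allowedReasonCodes": ["database_maintenance", "idp_outage"],
        "accounts": [
            {
                "id": "break-glass-admin",
                "passwordHash": "$argon2id$v=19$m=65536,t=3,p=4$...",
                "roles": ["admin"],
            }
        ],
    }
}
```

Calling `check_break_glass_policy(example_policy)` returns an empty list for the example; a policy with `enabled: false` or no accounts yields one message per problem.
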
### Password Hash Generation

Generate password hashes using Argon2id:

```bash
# Using argon2 CLI tool (-m 16 requests 2^16 KiB = 65536 KiB of memory).
# The salt is quoted so the shell passes it as a single argument.
echo -n "your-secure-password" | argon2 "$(openssl rand -base64 16)" -id -t 3 -m 16 -p 4 -l 32 -e

# Or using stella CLI
stella auth hash-password --algorithm argon2id
```

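The encoded hash carries its own cost parameters, so a generated hash can be checked against the values expected in `local-policy.yaml`. The sketch below, with the hypothetical helper name `parse_argon2_params`, splits an encoded Argon2 string into its fields. It assumes the `$variant$v=..$m=..,t=..,p=..$salt$hash` layout shown in the configuration example and does not verify the password itself.

```python
def parse_argon2_params(encoded: str) -> dict:
    """Split an encoded Argon2 hash into variant, version, and cost parameters."""
    parts = encoded.split("$")
    # parts[0] is empty because the encoded string starts with "$".
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not an encoded Argon2 hash")
    variant = parts[1]
    version = int(parts[2].removeprefix("v="))
    costs = dict(item.split("=", 1) for item in parts[3].split(","))
    return {
        "variant": variant,
        "version": version,
        "memory_kib": int(costs["m"]),
        "iterations": int(costs["t"]),
        "parallelism": int(costs["p"]),
    }


# The truncated hash from the configuration example is enough to read the parameters.
params = parse_argon2_params("$argon2id$v=19$m=65536,t=3,p=4$...")
```

Here `params["memory_kib"]` is 65536, which matches the `-m 16` flag of the `argon2` command because that flag is the base-2 logarithm of the memory size in KiB.
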
## Break-Glass Login Procedure

### Step 1: Verify Standard Auth is Unavailable

Before using break-glass, confirm standard authentication is genuinely unavailable:

```bash
# Check Authority health
curl -s https://authority.example.com/health | jq .

# Check OIDC endpoint
curl -s https://idp.example.com/.well-known/openid-configuration

# Check database connectivity
stella doctor check --component postgres
```

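The two HTTP checks above can also be scripted so the result is recorded alongside the break-glass justification. This is a rough sketch using only the Python standard library; the helper names `endpoint_is_up` and `failed_checks` are hypothetical, and the database check is left out because it goes through the `stella` CLI.

```python
import urllib.request


def endpoint_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are refused
        # connections and socket timeouts.
        return False


def failed_checks(urls: dict[str, str], probe=endpoint_is_up) -> list[str]:
    """Return the names of the checks whose probe did not succeed."""
    return [name for name, url in urls.items() if not probe(url)]


# Endpoints taken from the commands above.
CHECKS = {
    "authority": "https://authority.example.com/health",
    "idp": "https://idp.example.com/.well-known/openid-configuration",
}
```

A non-empty result from `failed_checks(CHECKS)` names the dependencies that are down; an empty result suggests standard authentication should be tried first.
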
### Step 2: Access Break-Glass Login

Navigate to the break-glass endpoint:

```
https://authority.example.com/break-glass/login
```

Or use the CLI:

```bash
stella auth break-glass login \
  --account break-glass-admin \
  --reason database_maintenance
```

### Step 3: Provide Credentials and Reason

| Field | Description | Required |
|-------|-------------|----------|
| Account ID | Break-glass account identifier | Yes |
| Password | Account password | Yes |
| Reason Code | Pre-approved reason code | Yes |
| Reason Details | Free-text explanation | Recommended |

**Approved Reason Codes:**

| Code | Description |
|------|-------------|
| `database_maintenance` | Scheduled or emergency database work |
| `idp_outage` | Identity provider unavailable |
| `network_partition` | Network connectivity issues |
| `security_incident` | Active security incident response |
| `disaster_recovery` | DR/BCP activation |

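The field rules in the two tables above can be expressed as a small pre-flight check, for example in a wrapper script that collects the inputs before calling the CLI. This is an illustrative sketch only: `validate_login_request` is a hypothetical name, and the real validation happens in Authority.

```python
# Reason codes copied from the approved list above.
ALLOWED_REASON_CODES = {
    "database_maintenance",
    "idp_outage",
    "network_partition",
    "security_incident",
    "disaster_recovery",
}


def validate_login_request(
    account_id: str, password: str, reason_code: str, reason_details: str = ""
) -> list[str]:
    """Return a list of validation errors; an empty list means the request looks complete."""
    errors = []
    if not account_id:
        errors.append("account ID is required")
    if not password:
        errors.append("password is required")
    if reason_code not in ALLOWED_REASON_CODES:
        errors.append("invalid_reason_code")
    # reason_details is recommended but optional, so it is never rejected here.
    return errors
```
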
### Step 4: Session Created

On successful authentication:

- Session token issued with limited TTL (default: 15 minutes)
- Audit event logged: `breakglass.session.created`
- All subsequent actions are tagged with break-glass context

## Session Management

### Session Timeout

Break-glass sessions have strict time limits:

| Setting | Default | Description |
|---------|---------|-------------|
| `sessionTimeoutMinutes` | 15 | Session lifetime |
| `maxExtensions` | 2 | Maximum session extensions |
| Extension period | 15 min | Time added per extension |

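Taken together, the defaults in this table bound how long a single session can last. The short calculation below assumes each extension adds the full extension period to the session; `max_session_minutes` is a hypothetical helper name.

```python
def max_session_minutes(
    session_timeout_minutes: int = 15,
    max_extensions: int = 2,
    extension_minutes: int = 15,
) -> int:
    """Upper bound on one session: the initial lifetime plus every allowed extension."""
    return session_timeout_minutes + max_extensions * extension_minutes


print(max_session_minutes())  # 15 + 2 * 15 = 45
```
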
### Extending a Session

If additional time is needed:

```bash
# CLI
stella auth break-glass extend \
  --session-id <session-id> \
  --reason "database migration still running"

# UI
# Click "Extend Session" button in break-glass banner
```

Extension requires:

1. Re-entering password
2. Providing extension reason
3. Not exceeding `maxExtensions` limit

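The three extension requirements can be modelled as guards on a session object. The sketch below is an assumption-laden illustration rather than Authority's implementation: the class `BreakGlassSession`, its fields, and the exception types are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class BreakGlassSession:
    expires_at: datetime
    max_extensions: int = 2
    extensions_used: int = 0

    def extend(
        self,
        reason: str,
        password_ok: bool,
        extension: timedelta = timedelta(minutes=15),
    ) -> None:
        """Apply one extension if all three requirements are met."""
        if not password_ok:
            raise PermissionError("password re-entry failed")
        if not reason.strip():
            raise ValueError("an extension reason is required")
        if self.extensions_used >= self.max_extensions:
            raise RuntimeError("maxExtensions limit reached")
        self.expires_at += extension
        self.extensions_used += 1


# Example: a session created at 10:30 UTC with the default 15-minute lifetime.
created = datetime(2026, 1, 16, 10, 30, tzinfo=timezone.utc)
session = BreakGlassSession(expires_at=created + timedelta(minutes=15))
session.extend("database migration still running", password_ok=True)
```

After the single extension in the example, `session.expires_at` is 11:00 UTC and one extension remains.
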
### Session Termination

Sessions end when:

- User explicitly logs out
- Session timeout expires
- Max extensions reached
- Administrator force-terminates

```bash
# Explicit logout
stella auth break-glass logout --session-id <session-id>

# Force terminate (admin)
stella auth break-glass terminate --session-id <session-id> --reason "normal auth restored"
```

## Audit Trail

### Audit Events

All break-glass activity is logged:

| Event | Description |
|-------|-------------|
| `breakglass.session.created` | Session started |
| `breakglass.session.extended` | Session extended |
| `breakglass.session.terminated` | User logout |
| `breakglass.session.expired` | Timeout reached |
| `breakglass.auth.failed` | Authentication failed |
| `breakglass.reason.invalid` | Invalid reason code |
| `breakglass.extensions.exceeded` | Max extensions reached |

### Audit Event Structure

```json
{
  "eventType": "breakglass.session.created",
  "timestamp": "2026-01-16T10:30:00Z",
  "accountId": "break-glass-admin",
  "sessionId": "bg-sess-abc123",
  "reasonCode": "database_maintenance",
  "reasonDetails": "PostgreSQL major version upgrade",
  "sourceIp": "10.0.1.50",
  "userAgent": "stella-cli/2027.Q1"
}
```

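When events like this are processed downstream, it is worth checking their shape before trusting them. The sketch below parses the sample event and converts its timestamp; which fields are treated as required is an assumption made for illustration, and `parse_audit_event` is a hypothetical helper.

```python
import json
from datetime import datetime

# Assumed minimum set of fields; the schema does not state which are mandatory.
REQUIRED_FIELDS = ("eventType", "timestamp", "accountId", "sessionId", "reasonCode")

SAMPLE = """
{
  "eventType": "breakglass.session.created",
  "timestamp": "2026-01-16T10:30:00Z",
  "accountId": "break-glass-admin",
  "sessionId": "bg-sess-abc123",
  "reasonCode": "database_maintenance",
  "reasonDetails": "PostgreSQL major version upgrade",
  "sourceIp": "10.0.1.50",
  "userAgent": "stella-cli/2027.Q1"
}
"""


def parse_audit_event(raw: str) -> dict:
    """Parse one audit event and turn its timestamp into a timezone-aware datetime."""
    event = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if missing:
        raise ValueError(f"audit event is missing fields: {', '.join(missing)}")
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalise it.
    event["timestamp"] = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    return event


event = parse_audit_event(SAMPLE)
```
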
### Querying Audit Logs

```bash
# List all break-glass events
stella audit query --event-type "breakglass.*" --since "24h"

# Export for compliance
stella audit export \
  --event-type "breakglass.*" \
  --start 2026-01-01 \
  --end 2026-01-31 \
  --format json \
  --output break-glass-audit-jan2026.json
```

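An exported file can then be summarised for a compliance review. The sketch below assumes the export is a JSON array of events shaped like the sample in Audit Event Structure; that layout is not confirmed here, so adjust the loading step to the actual format. `count_events` is a hypothetical helper.

```python
from collections import Counter
from fnmatch import fnmatchcase


def count_events(events: list[dict], pattern: str = "breakglass.*") -> Counter:
    """Count events per type, keeping only event types that match the pattern."""
    return Counter(
        event["eventType"]
        for event in events
        if fnmatchcase(event.get("eventType", ""), pattern)
    )


# Inline stand-in for the contents of an export file.
exported = [
    {"eventType": "breakglass.session.created", "accountId": "break-glass-admin"},
    {"eventType": "breakglass.session.extended", "accountId": "break-glass-admin"},
    {"eventType": "breakglass.session.created", "accountId": "break-glass-admin"},
    {"eventType": "authority.mode.changed"},
]

summary = count_events(exported)
```

With a real export, the list would come from `json.load()` on the output file instead of the inline sample.
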
## Fallback Policy Store

### Automatic Failover

When PostgreSQL becomes unavailable:

1. Authority detects health check failures.
2. It waits for `failureThreshold` (default: 3) consecutive failures.
3. Authority switches to the local policy store.
4. The mode changes to `Fallback`.
5. The event `authority.mode.changed` is logged.

### Policy Store Modes

| Mode | Description | Available Features |
|------|-------------|-------------------|
| `Primary` | PostgreSQL available | Full RBAC, user management |
| `Fallback` | Using local policy | Break-glass only |
| `Degraded` | Both degraded | Emergency access only |

### Recovery

When PostgreSQL recovers:

1. Health checks pass.
2. The `minFallbackDurationMs` cooldown (default: 30s) elapses.
3. Authority switches back to `Primary`.
4. Fallback sessions can continue until expiry.

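The failover and recovery rules amount to a small state machine. The following sketch models only the `Primary` and `Fallback` modes and makes two assumptions not stated above: a successful health check resets the failure count, and the cooldown is measured from the moment Authority enters `Fallback`. The class `PolicyStoreMonitor` is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PolicyStoreMonitor:
    failure_threshold: int = 3
    min_fallback_duration_ms: int = 30_000  # 30s default expressed in milliseconds
    mode: str = "Primary"
    consecutive_failures: int = 0
    fallback_since_ms: int = 0

    def record_health_check(self, healthy: bool, now_ms: int) -> str:
        """Feed one PostgreSQL health check result and return the resulting mode."""
        if self.mode == "Primary":
            # Assumption: any success resets the consecutive failure count.
            self.consecutive_failures = 0 if healthy else self.consecutive_failures + 1
            if self.consecutive_failures >= self.failure_threshold:
                self.mode = "Fallback"
                self.fallback_since_ms = now_ms
        elif self.mode == "Fallback":
            cooled_down = now_ms - self.fallback_since_ms >= self.min_fallback_duration_ms
            if healthy and cooled_down:
                self.mode = "Primary"
                self.consecutive_failures = 0
        return self.mode
```

The cooldown keeps Authority from flapping between stores when PostgreSQL recovers only briefly: a healthy check shortly after failover leaves the mode at `Fallback`.
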
## Security Considerations

### Password Policy

Break-glass account passwords should:

- Be at least 20 characters
- Include uppercase letters, lowercase letters, digits, and symbols
- Be stored securely (HSM, Vault, split custody)
- Be rotated on a schedule (quarterly recommended)

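A password that meets the length and character-class rules can be generated with Python's `secrets` module. This is one possible approach, not a mandated tool; `generate_break_glass_password` and the default length of 24 are illustrative choices.

```python
import secrets
import string


def generate_break_glass_password(length: int = 24) -> str:
    """Generate a random password containing all four character classes."""
    if length < 20:
        raise ValueError("break-glass passwords should be at least 20 characters")
    classes = [
        string.ascii_uppercase,
        string.ascii_lowercase,
        string.digits,
        string.punctuation,
    ]
    alphabet = "".join(classes)
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class appears at least once.
        if all(any(ch in cls for ch in candidate) for cls in classes):
            return candidate
```

The result may contain shell metacharacters such as `$` or quotes, so pass it to the hashing commands through a prompt or standard input rather than typing it on a command line.
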
### Access Control

- Limit break-glass accounts to essential personnel
- Use separate accounts per operator when possible
- Review access list quarterly
- Disable unused accounts immediately

### Monitoring

Set up alerts for break-glass activity:

```yaml
# Alert rule example
# Note: if this metric is a Prometheus counter, "> 0" stays true after the first
# session until the counter resets. To alert only on new sessions, consider
# increase(stellaops_breakglass_sessions_created_total[5m]) > 0 (window is an example).
- alert: BreakGlassSessionCreated
  expr: stellaops_breakglass_sessions_created_total > 0
  for: 0m
  labels:
    severity: warning
  annotations:
    summary: Break-glass session created
    description: A break-glass session was created. Verify this is expected.
```

## Troubleshooting

### Login Failures

| Error | Cause | Resolution |
|-------|-------|------------|
| `invalid_credentials` | Wrong password | Verify password |
| `invalid_reason_code` | Reason not in allowed list | Use approved reason code |
| `account_disabled` | Account explicitly disabled | Contact administrator |
| `break_glass_disabled` | Feature disabled in config | Enable in local-policy.yaml |

### Session Issues

| Issue | Cause | Resolution |
|-------|-------|------------|
| Session expired immediately | Clock skew | Sync server time |
| Cannot extend | Max extensions reached | Log out and re-authenticate |
| Actions failing | Insufficient roles | Verify account has required roles |

### Policy Store Issues

```bash
# Check policy store status
stella doctor check --component authority

# Verify local policy file
stella auth policy validate --file /etc/stellaops/authority/local-policy.yaml

# Force reload policy
stella auth policy reload
```

## Compliance Notes

Break-glass usage must be:

- Documented in incident reports
- Reviewed during security audits
- Reported in compliance dashboards
- Justified for each session

Retain audit logs for:

- SOC 2: 1 year minimum
- HIPAA: 6 years
- PCI-DSS: 1 year
- Internal policy: As defined

## Related Documentation

- [Local RBAC Policy Schema](../modules/authority/local-policy-schema.md)
- [Authority Architecture](../modules/authority/architecture.md)
- [Offline Operations](../operations/airgap-operations-runbook.md)
- [Audit System](../modules/audit/architecture.md)