Doctor Check Evidence Schemas

This document defines the standardized evidence schemas for all Doctor health checks. The schemas tell AdvisoryAI what each evidence field means, which values to expect, and how to distinguish likely root causes.

Sprint: SPRINT_20260118_015_Doctor_check_quality_improvements
Task: DQUAL-006 - Standardize evidence schema documentation


Evidence Schema Conventions

Field Naming

  • Use snake_case for all field names
  • Boolean fields: is_*, has_*, *_enabled, *_available
  • Timestamp fields: *_utc suffix, ISO8601 format
  • Duration fields: *_ms or *_seconds suffix
  • Status fields: lowercase string enums
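
The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the Doctor codebase: the function name `naming_problems` and the regular expression are assumptions made for illustration, and only the snake_case, boolean, and timestamp rules are covered.

```python
import re

# Lowercase words separated by single underscores, e.g. "jwks_fetch_ms".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def naming_problems(name: str, value_type: str) -> list[str]:
    """Return the naming-convention violations for one evidence field.

    Hypothetical helper: value_type uses the type names from this document
    (a trailing "?" marks a nullable field).
    """
    problems = []
    if not SNAKE_CASE.match(name):
        problems.append("not snake_case")
    base_type = value_type.rstrip("?")
    if base_type == "bool" and not (
        name.startswith(("is_", "has_")) or name.endswith(("_enabled", "_available"))
    ):
        problems.append("boolean name lacks is_/has_/_enabled/_available")
    if base_type == "ISO8601" and not name.endswith("_utc"):
        problems.append("timestamp name lacks _utc suffix")
    return problems

print(naming_problems("fips_mode_enabled", "bool"))  # no violations
print(naming_problems("LocalTime", "ISO8601"))       # two violations
```

A check like this could run as a unit test over a new check's field list before its schema is documented.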

Value Types

  • string: UTF-8 text
  • int: 64-bit signed integer
  • float: 64-bit floating point
  • bool: true or false (lowercase in JSON)
  • list<T>: JSON array of type T
  • ISO8601: timestamp string in ISO8601 format
  • T?: nullable; a value of type T or null
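
As an illustration of how these type names map onto decoded JSON, here is a small sketch of a value checker. The function `matches_type` is hypothetical, and it reads the `?` suffix used in the tables below as "nullable", which is an interpretation rather than a documented rule.

```python
from datetime import datetime

def matches_type(value, type_name: str) -> bool:
    """Check a decoded JSON value against one of the schema type names."""
    if type_name.endswith("?"):  # assumed meaning: nullable
        return value is None or matches_type(value, type_name[:-1])
    if type_name.startswith("list<") and type_name.endswith(">"):
        inner = type_name[5:-1]
        return isinstance(value, list) and all(matches_type(v, inner) for v in value)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "bool":
        return isinstance(value, bool)
    if type_name == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(value, int) and not isinstance(value, bool)
                and -2**63 <= value < 2**63)
    if type_name == "float":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "ISO8601":
        if not isinstance(value, str):
            return False
        try:
            # fromisoformat accepts a subset of ISO 8601; map a trailing "Z"
            # to an explicit offset for older Python versions.
            datetime.fromisoformat(value.replace("Z", "+00:00"))
            return True
        except ValueError:
            return False
    return False

print(matches_type(None, "string?"), matches_type(True, "int"))
```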

Policy Engine Checks

check.policy.engine

Description: Verify policy engine compilation, evaluation, and storage health

Evidence Fields:

| Field | Type | Description | Expected Range |
|-------|------|-------------|----------------|
| engine_type | string | Policy engine type | opa, rego, custom, unknown |
| engine_version | string | Engine version string | Semantic version or unknown |
| engine_url | string | Policy engine base URL | Valid HTTP(S) URL |
| compilation_status | string | Compilation health | OK, FAILED |
| evaluation_status | string | Evaluation health | OK, FAILED |
| storage_status | string | Storage health | OK, FAILED |
| policy_count | int | Number of loaded policies | ≥ 0 |
| compilation_time_ms | int | Compilation latency | 0-10000 (typical < 100) |
| evaluation_latency_p50_ms | int | Median evaluation time | 0-5000 (typical < 50) |
| cache_hit_ratio | float | Policy cache efficiency | 0.0-1.0 |
| last_compilation_error | string? | Most recent compilation error | null or error message |
| evaluation_error | string? | Most recent evaluation error | null or error message |
| storage_error | string? | Most recent storage error | null or error message |

Likely Cause Differentiation:

| Evidence Pattern | Likely Cause |
|------------------|--------------|
| compilation_status=FAILED | OPA/Rego syntax error or engine unavailable |
| evaluation_status=FAILED | Policy evaluation timeout or runtime error |
| storage_status=FAILED | PostgreSQL connection issue or disk full |
| evaluation_latency_p50_ms > 100 | Complex policies or cold cache |
| cache_hit_ratio < 0.5 | Cache not warmed or policies changing frequently |
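
The differentiation table can be read as a set of independent rules over the evidence fields. The sketch below shows one way a consumer might apply them; `policy_engine_causes` is a hypothetical name, and treating the rules as independent (so several causes can be reported at once) is an assumption.

```python
# Status fields and the likely cause reported when each one is FAILED.
STATUS_CAUSES = {
    "compilation_status": "OPA/Rego syntax error or engine unavailable",
    "evaluation_status": "Policy evaluation timeout or runtime error",
    "storage_status": "PostgreSQL connection issue or disk full",
}

def policy_engine_causes(evidence: dict) -> list[str]:
    """Return every likely cause whose evidence pattern matches."""
    causes = [
        cause for field, cause in STATUS_CAUSES.items()
        if evidence.get(field) == "FAILED"
    ]
    latency = evidence.get("evaluation_latency_p50_ms")
    if latency is not None and latency > 100:
        causes.append("Complex policies or cold cache")
    ratio = evidence.get("cache_hit_ratio")
    if ratio is not None and ratio < 0.5:
        causes.append("Cache not warmed or policies changing frequently")
    return causes

sample = {
    "compilation_status": "OK",
    "evaluation_status": "FAILED",
    "storage_status": "OK",
    "evaluation_latency_p50_ms": 40,
    "cache_hit_ratio": 0.3,
}
print(policy_engine_causes(sample))
```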

Authentication Checks

check.auth.oidc

Description: Verify connectivity to configured OIDC provider and discovery endpoint

Evidence Fields:

| Field | Type | Description | Expected Range |
|-------|------|-------------|----------------|
| issuer_url | string | OIDC issuer URL | Valid HTTPS URL |
| discovery_reachable | bool | Can reach discovery endpoint | true or false |
| discovery_response_ms | int | Discovery fetch latency | 0-10000 (typical < 500) |
| authorization_endpoint_present | bool | Has authorization endpoint | true |
| token_endpoint_present | bool | Has token endpoint | true |
| jwks_uri_present | bool | Has JWKS URI | true |
| jwks_key_count | int | Number of signing keys | ≥ 1 |
| jwks_fetch_ms | int | JWKS fetch latency | 0-10000 (typical < 500) |
| http_status_code | int? | HTTP response code | null or 100-599 |
| error_message | string? | Error details | null or error string |
| connection_error_type | string? | Error classification | ssl_error, dns_failure, refused, timeout, connection_failed |

Likely Cause Differentiation:

| Evidence Pattern | Likely Cause |
|------------------|--------------|
| discovery_reachable=false, connection_error_type=dns_failure | DNS resolution failure |
| discovery_reachable=false, connection_error_type=ssl_error | TLS certificate issue |
| discovery_reachable=false, connection_error_type=refused | OIDC provider down or firewall |
| discovery_reachable=true, authorization_endpoint_present=false | Malformed discovery document |
| jwks_key_count=0 | JWKS endpoint error or key rotation in progress |
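
Unlike the policy engine rules, these patterns are naturally ordered: an unreachable discovery endpoint explains every later symptom. A minimal sketch of that first-match reading follows; the function name `oidc_likely_cause` and the evaluation order are assumptions, and error types the table does not list (such as timeout) yield no cause.

```python
from typing import Optional

# Likely cause per connection_error_type when discovery is unreachable.
UNREACHABLE_CAUSES = {
    "dns_failure": "DNS resolution failure",
    "ssl_error": "TLS certificate issue",
    "refused": "OIDC provider down or firewall",
}

def oidc_likely_cause(evidence: dict) -> Optional[str]:
    """Return the first matching likely cause, or None if no pattern applies."""
    reachable = evidence.get("discovery_reachable")
    if reachable is False:
        return UNREACHABLE_CAUSES.get(evidence.get("connection_error_type"))
    if reachable is True and evidence.get("authorization_endpoint_present") is False:
        return "Malformed discovery document"
    if evidence.get("jwks_key_count") == 0:
        return "JWKS endpoint error or key rotation in progress"
    return None

print(oidc_likely_cause({"discovery_reachable": False,
                         "connection_error_type": "ssl_error"}))
```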

Cryptography Checks

check.crypto.fips

Description: Verify FIPS 140-2 mode is enabled when required by crypto profile

Evidence Fields:

| Field | Type | Description | Expected Range |
|-------|------|-------------|----------------|
| fips_mode_enabled | bool | System FIPS mode active | true or false |
| platform | string | Operating system platform | windows, linux, macos, unknown |
| crypto_provider | string | Cryptographic provider | bcrypt, openssl, managed, unknown |
| openssl_fips_module_loaded | bool | OpenSSL FIPS module status | true or false |
| crypto_profile | string | Configured crypto profile | Profile name from config |
| algorithms_tested | string | Comma-separated algorithm list | Algorithm names |
| algorithms_available | string | Algorithms that passed testing | Algorithm names |
| algorithms_missing | string | Algorithms that failed testing | Algorithm names or none |
| status | string | Overall compliance status | compliant, non-compliant |
| test_aes_256 | string | AES-256 test result | pass or fail: <error> |
| test_sha_256 | string | SHA-256 test result | pass or fail: <error> |
| test_sha_384 | string | SHA-384 test result | pass or fail: <error> |
| test_sha_512 | string | SHA-512 test result | pass or fail: <error> |
| test_rsa_2048 | string | RSA-2048 test result | pass or fail: <error> |
| test_ecdsa_p256 | string | ECDSA-P256 test result | pass or fail: <error> |

Likely Cause Differentiation:

| Evidence Pattern | Likely Cause |
|------------------|--------------|
| fips_mode_enabled=false, platform=linux | FIPS mode not enabled via fips-mode-setup |
| fips_mode_enabled=false, platform=windows | FIPS Group Policy not configured |
| openssl_fips_module_loaded=false | OpenSSL FIPS provider not installed |
| algorithms_missing contains values | Crypto provider missing FIPS-validated algorithms |
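
Because the algorithm lists are comma-separated strings and the `test_*` fields embed the error after `fail:`, a consumer has to parse them before applying the patterns above. The sketch below shows one possible parse. Both helper names are hypothetical, the algorithm spellings in the sample are made up, and reading an empty string or the literal `none` as "nothing missing" is an assumption about the encoding.

```python
def split_algorithms(value: str) -> list[str]:
    """Parse a comma-separated algorithm list into trimmed names."""
    names = [part.strip() for part in value.split(",")]
    # Assumed encoding: "" or "none" means the list is empty.
    return [name for name in names if name and name.lower() != "none"]

def failed_tests(evidence: dict) -> dict[str, str]:
    """Map each failing test_* field to the error text after 'fail:'."""
    failures = {}
    for field, result in evidence.items():
        if field.startswith("test_") and result != "pass":
            failures[field] = result.removeprefix("fail:").strip()
    return failures

sample = {
    "status": "non-compliant",
    "algorithms_missing": "RSA-2048, ECDSA-P256",  # illustrative spellings
    "test_aes_256": "pass",
    "test_rsa_2048": "fail: provider not found",   # illustrative error text
}
print(split_algorithms(sample["algorithms_missing"]))
print(failed_tests(sample))
```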

Attestation Checks

check.attestation.clock.skew

Description: Verify system clock is synchronized for attestation validity

Evidence Fields:

| Field | Type | Description | Expected Range |
|-------|------|-------------|----------------|
| local_time_utc | ISO8601 | System time | Valid timestamp |
| server_time_utc | ISO8601 | Reference server time | Valid timestamp |
| skew_seconds | float | Clock difference (positive = ahead) | -300 to 300 (typical < 5) |
| max_allowed_skew | int | Threshold in seconds | Default: 5 |
| ntp_daemon_running | bool | NTP service active | true or false |
| ntp_daemon_type | string | NTP daemon type | chronyd, ntpd, systemd-timesyncd, w32time, unknown |
| ntp_servers_configured | string | Comma-separated NTP servers | Server hostnames |
| last_sync_time_utc | ISO8601? | Last successful sync | Timestamp or null |
| sync_age_seconds | int? | Seconds since last sync | ≥ 0 or null |
| is_virtual_machine | bool | Running in VM | true or false |
| vm_type | string | VM hypervisor type | vmware, hyper-v, kvm, xen, container, none |
| vm_clock_sync_enabled | bool | VM time sync tools enabled | true or false |
| connection_error_type | string? | Network error type | ssl_error, dns_failure, refused, timeout, connection_failed |

Likely Cause Differentiation:

| Evidence Pattern | Likely Cause |
|------------------|--------------|
| ntp_daemon_running=false | NTP service not started |
| ntp_daemon_running=true, sync_age_seconds > 3600 | NTP server unreachable |
| is_virtual_machine=true, vm_clock_sync_enabled=false | VM clock drift without sync |
| skew_seconds > 0 (large positive) | System clock set to future |
| skew_seconds < 0 (large negative) | System clock set to past |
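
To make the sign convention concrete, the sketch below derives `skew_seconds` from the two timestamps as local minus server, so a positive result means the local clock is ahead. The function names are hypothetical, and comparing the absolute skew against `max_allowed_skew` is an assumption about how the threshold is applied.

```python
from datetime import datetime

def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO8601 timestamp; a trailing 'Z' is mapped to +00:00."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def skew_seconds(local_time_utc: str, server_time_utc: str) -> float:
    """Local minus server time in seconds (positive = local clock ahead)."""
    return (parse_utc(local_time_utc) - parse_utc(server_time_utc)).total_seconds()

def skew_exceeded(skew: float, max_allowed_skew: int = 5) -> bool:
    """Assumed rule: the threshold applies to the magnitude of the skew."""
    return abs(skew) > max_allowed_skew

skew = skew_seconds("2026-01-18T12:00:07Z", "2026-01-18T12:00:00Z")
print(skew, skew_exceeded(skew))
```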

check.attestation.transparency.consistency

Description: Verify stored log checkpoints match remote transparency log

Evidence Fields:

| Field | Type | Description | Expected Range |
|-------|------|-------------|----------------|
| checkpoint_path | string | Local checkpoint file path | Filesystem path |
| stored_tree_size | int | Local tree size | ≥ 0 |
| remote_tree_size | int | Remote tree size | ≥ stored_tree_size |
| stored_root_hash | string | Local root hash | Hex string |
| remote_root_hash | string | Remote root hash | Hex string |
| entries_behind | int | Entries to catch up | ≥ 0 |
| checkpoint_age | ISO8601 | Checkpoint last update | Valid timestamp |
| consistency_verified | bool | Log is consistent | true or false |

Likely Cause Differentiation:

| Evidence Pattern | Likely Cause |
|------------------|--------------|
| remote_tree_size < stored_tree_size | CRITICAL: Possible log rollback/tampering |
| stored_root_hash != remote_root_hash at same size | CRITICAL: Possible log modification |
| entries_behind > 10000 | Checkpoint very stale, needs sync |
| Checkpoint file parse error | Corrupted checkpoint file |
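
The first three patterns can be evaluated from the evidence fields alone, with the critical findings checked first. The following sketch is illustrative only: `checkpoint_finding` is a hypothetical name, the case-insensitive hash comparison and the fallback for a missing `entries_behind` are assumptions, and it does not perform a Merkle consistency proof, which is what `consistency_verified` would report.

```python
from typing import Optional

def checkpoint_finding(evidence: dict) -> Optional[str]:
    """Return the most severe matching finding, or None if nothing matches."""
    stored = evidence["stored_tree_size"]
    remote = evidence["remote_tree_size"]
    if remote < stored:
        return "CRITICAL: Possible log rollback/tampering"
    # Hex strings may differ in letter case; compare case-insensitively.
    same_hash = (evidence["stored_root_hash"].lower()
                 == evidence["remote_root_hash"].lower())
    if remote == stored and not same_hash:
        return "CRITICAL: Possible log modification"
    # Fall back to the size difference when entries_behind is absent.
    if evidence.get("entries_behind", remote - stored) > 10000:
        return "Checkpoint very stale, needs sync"
    return None

base = {
    "stored_tree_size": 100,
    "remote_tree_size": 100,
    "stored_root_hash": "ab12",  # shortened, illustrative hashes
    "remote_root_hash": "AB12",
}
print(checkpoint_finding({**base, "remote_tree_size": 90}))
```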

Remediation Step Properties

All remediation steps now include safety annotations:

| Property | Type | Description |
|----------|------|-------------|
| Order | int | Step sequence (1-based) |
| Description | string | Human-readable description |
| Command | string | Command to execute |
| CommandType | enum | Shell, Sql, Api, Manual, Comment |
| IsDestructive | bool | Step modifies/deletes data |
| DryRunVariant | string? | Safe preview command |
| Placeholders | dict? | User-supplied values needed |

AdvisoryAI Integration:

  • Commands with IsDestructive=true must NOT be auto-executed
  • Always prefer DryRunVariant before suggesting destructive commands
  • CommandType.Manual requires human confirmation
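
The three rules above can be encoded as simple guards on a step record. This is a sketch under stated assumptions, not the AdvisoryAI implementation: the Python dataclass mirrors the property table with snake_case names, the two functions are hypothetical, and the sample commands are placeholders. A `True` result from `auto_execution_allowed` means only that neither exclusion applies; whether any step is actually auto-executed is a separate policy decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class CommandType(Enum):
    SHELL = "Shell"
    SQL = "Sql"
    API = "Api"
    MANUAL = "Manual"
    COMMENT = "Comment"

@dataclass
class RemediationStep:
    order: int
    description: str
    command: str
    command_type: CommandType
    is_destructive: bool = False
    dry_run_variant: Optional[str] = None
    placeholders: Optional[dict] = None

def auto_execution_allowed(step: RemediationStep) -> bool:
    """False for destructive steps and for steps needing human confirmation."""
    return not step.is_destructive and step.command_type is not CommandType.MANUAL

def command_to_suggest(step: RemediationStep) -> str:
    """Prefer the dry-run variant when a destructive step offers one."""
    if step.is_destructive and step.dry_run_variant:
        return step.dry_run_variant
    return step.command

# Placeholder commands for illustration only.
cleanup = RemediationStep(
    order=1,
    description="Remove stale rows",
    command="DELETE FROM example_table WHERE stale",
    command_type=CommandType.SQL,
    is_destructive=True,
    dry_run_variant="SELECT count(*) FROM example_table WHERE stale",
)
print(auto_execution_allowed(cleanup), command_to_suggest(cleanup))
```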

Adding New Check Schemas

When adding a new Doctor check:

  1. Define evidence fields in the check implementation
  2. Add schema documentation to this file
  3. Include "Likely Cause Differentiation" table
  4. Test evidence output matches schema
  5. Update AdvisoryAI prompt if needed
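
For step 4, one lightweight test is to compare the keys a check actually emits against the fields documented for it. The sketch below assumes the documented field names are available to the test as plain sets; `field_mismatches` is a hypothetical helper, and the sample evidence is invented for illustration.

```python
def field_mismatches(evidence: dict, documented: set,
                     nullable: frozenset = frozenset()) -> dict:
    """Report fields that are missing, undocumented, or unexpectedly null."""
    present = set(evidence)
    return {
        "missing": sorted(documented - present),
        "undocumented": sorted(present - documented),
        "unexpected_null": sorted(
            name for name, value in evidence.items()
            if value is None and name in documented and name not in nullable
        ),
    }

documented = {"stored_tree_size", "remote_tree_size", "consistency_verified"}
emitted = {"stored_tree_size": 10, "remote_tree_size": None, "extra_field": 1}
print(field_mismatches(emitted, documented))
```

A test would then assert that all three lists are empty for each representative evidence sample.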

Last updated: 2026-01-18 (SPRINT_20260118_015)