up
Some checks failed
Docs CI / lint-and-preview (push) Has been cancelled
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled

This commit is contained in:
StellaOps Bot
2025-11-28 20:55:22 +02:00
parent d040c001ac
commit 2548abc56f
231 changed files with 47468 additions and 68 deletions

491
docs/db/CONVERSION_PLAN.md Normal file
View File

@@ -0,0 +1,491 @@
# MongoDB to PostgreSQL Conversion Plan
**Version:** 2.0.0
**Status:** APPROVED
**Created:** 2025-11-28
**Last Updated:** 2025-11-28
---
## Executive Summary
This document outlines the strategic plan to **convert** (not migrate) StellaOps from MongoDB to PostgreSQL for control-plane domains. The conversion follows a "strangler fig" pattern, introducing PostgreSQL repositories alongside existing MongoDB implementations and gradually switching each bounded context.
**Key Finding:** StellaOps already has production-ready PostgreSQL patterns in the Orchestrator and Findings modules that serve as templates for all other modules.
### Related Documents
| Document | Purpose |
|----------|---------|
| [SPECIFICATION.md](./SPECIFICATION.md) | Schema designs, naming conventions, data types |
| [RULES.md](./RULES.md) | Database coding rules and patterns |
| [VERIFICATION.md](./VERIFICATION.md) | Testing and verification requirements |
| [tasks/](./tasks/) | Detailed task definitions per phase |
---
## 1. Principles & Scope
### 1.1 Goals
Convert **control-plane** domains from MongoDB to PostgreSQL:
| Domain | Current DB | Target | Priority |
|--------|-----------|--------|----------|
| Authority | `stellaops_authority` | PostgreSQL | P0 |
| Scheduler | `stellaops_scheduler` | PostgreSQL | P0 |
| Notify | `stellaops_notify` | PostgreSQL | P1 |
| Policy | `stellaops_policy` | PostgreSQL | P1 |
| Vulnerabilities (Concelier) | `concelier` | PostgreSQL | P2 |
| VEX & Graph (Excititor) | `excititor` | PostgreSQL | P2 |
| PacksRegistry | `stellaops_packs` | PostgreSQL | P3 |
| IssuerDirectory | `stellaops_issuer` | PostgreSQL | P3 |
### 1.2 Non-Goals
- Scanner result storage (remains object storage + Mongo for now)
- Real-time event streams (separate infrastructure)
- Legacy data archive (can remain in MongoDB read-only)
### 1.3 Constraints
**MUST Preserve:**
- Deterministic, replayable scans
- "Preserve/prune source" rule for Concelier/Excititor
- Lattice logic in `Scanner.WebService` (not in DB)
- Air-gap friendliness and offline-kit packaging
- Multi-tenant isolation patterns
- Zero downtime during conversion
### 1.4 Conversion vs Migration
This is a **conversion**, not a 1:1 document→row mapping:
| Approach | When to Use |
|----------|-------------|
| **Normalize** | Identities, jobs, schedules, relationships |
| **Keep JSONB** | Advisory payloads, provenance trails, evidence manifests |
| **Drop/Archive** | Ephemeral data (caches, locks), historical logs |
---
## 2. Architecture
### 2.1 Strangler Fig Pattern
```
┌─────────────────────────────────────────────────────────────┐
│ Service Layer │
├─────────────────────────────────────────────────────────────┤
│ Repository Interface │
│ (e.g., IScheduleRepository) │
├──────────────────────┬──────────────────────────────────────┤
│ MongoRepository │ PostgresRepository │
│ (existing) │ (new) │
├──────────────────────┴──────────────────────────────────────┤
│ DI Container (configured switch) │
└─────────────────────────────────────────────────────────────┘
```
### 2.2 Configuration-Driven Backend Selection
```json
{
"Persistence": {
"Authority": "Postgres",
"Scheduler": "Postgres",
"Concelier": "Mongo",
"Excititor": "Mongo",
"Notify": "Postgres",
"Policy": "Mongo"
}
}
```
### 2.3 Existing PostgreSQL Patterns
The codebase already contains production-ready patterns:
| Module | Location | Reusable Components |
|--------|----------|---------------------|
| Orchestrator | `src/Orchestrator/.../Infrastructure/Postgres/` | DataSource, tenant context, repository pattern |
| Findings | `src/Findings/StellaOps.Findings.Ledger/Infrastructure/Postgres/` | Ledger events, Merkle anchors, projections |
**Reference Implementation:** `OrchestratorDataSource.cs`
---
## 3. Data Tiering
### 3.1 Tier Definitions
| Tier | Description | Strategy |
|------|-------------|----------|
| **A** | Critical business data | Full conversion with verification |
| **B** | Important but recoverable | Convert active records only |
| **C** | Ephemeral/cache data | Fresh start, no migration |
### 3.2 Module Tiering
#### Authority
| Collection | Tier | Strategy |
|------------|------|----------|
| `authority_users` | A | Full conversion |
| `authority_clients` | A | Full conversion |
| `authority_scopes` | A | Full conversion |
| `authority_tokens` | B | Active tokens only |
| `authority_service_accounts` | A | Full conversion |
| `authority_login_attempts` | B | Recent 90 days |
| `authority_revocations` | A | Full conversion |
#### Scheduler
| Collection | Tier | Strategy |
|------------|------|----------|
| `schedules` | A | Full conversion |
| `runs` | B | Recent 180 days |
| `graph_jobs` | B | Active/recent only |
| `policy_jobs` | B | Active/recent only |
| `impact_snapshots` | B | Recent 90 days |
| `locks` | C | Fresh start |
#### Concelier (Vulnerabilities)
| Collection | Tier | Strategy |
|------------|------|----------|
| `advisory` | A | Full conversion |
| `advisory_raw` | B | GridFS refs only |
| `alias` | A | Full conversion |
| `affected` | A | Full conversion |
| `source` | A | Full conversion |
| `source_state` | A | Full conversion |
| `jobs`, `locks` | C | Fresh start |
#### Excititor (VEX)
| Collection | Tier | Strategy |
|------------|------|----------|
| `vex.statements` | A | Full conversion |
| `vex.observations` | A | Full conversion |
| `vex.linksets` | A | Full conversion |
| `vex.consensus` | A | Full conversion |
| `vex.raw` | B | Active/recent only |
| `vex.cache` | C | Fresh start |
---
## 4. Execution Phases
### Phase Overview
```
Phase 0: Foundations [1 sprint]
├─→ Phase 1: Authority [1 sprint]
├─→ Phase 2: Scheduler [1 sprint]
├─→ Phase 3: Notify [1 sprint]
├─→ Phase 4: Policy [1 sprint]
└─→ Phase 5: Concelier [2 sprints]
└─→ Phase 6: Excititor [2-3 sprints]
└─→ Phase 7: Cleanup [1 sprint]
```
### Phase Summary
| Phase | Scope | Duration | Dependencies | Deliverable |
|-------|-------|----------|--------------|-------------|
| 0 | Foundations | 1 sprint | None | PostgreSQL infrastructure, shared library |
| 1 | Authority | 1 sprint | Phase 0 | Identity management on PostgreSQL |
| 2 | Scheduler | 1 sprint | Phase 0 | Job scheduling on PostgreSQL |
| 3 | Notify | 1 sprint | Phase 0 | Notifications on PostgreSQL |
| 4 | Policy | 1 sprint | Phase 0 | Policy engine on PostgreSQL |
| 5 | Concelier | 2 sprints | Phase 0 | Vulnerability index on PostgreSQL |
| 6 | Excititor | 2-3 sprints | Phase 5 | VEX & graphs on PostgreSQL |
| 7 | Cleanup | 1 sprint | All | MongoDB retired, docs updated |
**Total: 10-12 sprints**
### Detailed Task Definitions
See:
- [tasks/PHASE_0_FOUNDATIONS.md](./tasks/PHASE_0_FOUNDATIONS.md)
- [tasks/PHASE_1_AUTHORITY.md](./tasks/PHASE_1_AUTHORITY.md)
- [tasks/PHASE_2_SCHEDULER.md](./tasks/PHASE_2_SCHEDULER.md)
- [tasks/PHASE_3_NOTIFY.md](./tasks/PHASE_3_NOTIFY.md)
- [tasks/PHASE_4_POLICY.md](./tasks/PHASE_4_POLICY.md)
- [tasks/PHASE_5_VULNERABILITIES.md](./tasks/PHASE_5_VULNERABILITIES.md)
- [tasks/PHASE_6_VEX_GRAPH.md](./tasks/PHASE_6_VEX_GRAPH.md)
- [tasks/PHASE_7_CLEANUP.md](./tasks/PHASE_7_CLEANUP.md)
---
## 5. Conversion Strategy
### 5.1 Per-Module Approach
```
1. Create PostgreSQL storage project
2. Implement schema migrations
3. Implement repository interfaces
4. Add configuration switch
5. Enable dual-write (if Tier A)
6. Run verification tests
7. Switch to PostgreSQL-only
8. Archive MongoDB data
```
### 5.2 Dual-Write Pattern
For Tier A data requiring historical continuity:
```
┌──────────────────────────────────────────────────────────────┐
│ DualWriteRepository │
├──────────────────────────────────────────────────────────────┤
│ Write: PostgreSQL (primary) + MongoDB (secondary) │
│ Read: PostgreSQL (primary) → MongoDB (fallback) │
│ Config: WriteToBoth, FallbackToMongo, ConvertOnRead │
└──────────────────────────────────────────────────────────────┘
```
### 5.3 Fresh Start Pattern
For Tier C ephemeral data:
```
┌──────────────────────────────────────────────────────────────┐
│ 1. Deploy PostgreSQL schema │
│ 2. Switch configuration to PostgreSQL │
│ 3. New data goes to PostgreSQL only │
│ 4. Old MongoDB data ages out naturally │
└──────────────────────────────────────────────────────────────┘
```
---
## 6. Risk Assessment
### 6.1 Technical Risks
| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Data loss during conversion | High | Low | Dual-write mode, extensive verification |
| Performance regression | Medium | Medium | Load testing before switch, index optimization |
| Determinism violation | High | Medium | Automated verification tests, parallel pipeline |
| Schema evolution conflicts | Medium | Low | Migration framework, schema versioning |
| Transaction semantics differences | Medium | Low | Code review, integration tests |
### 6.2 Operational Risks
| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Extended conversion timeline | Medium | Medium | Phase-based approach, clear milestones |
| Team learning curve | Low | Medium | Reference implementations, documentation |
| Rollback complexity | Medium | Low | Keep Mongo data until verified, feature flags |
### 6.3 Rollback Strategy
Each phase has independent rollback capability:
| Level | Action | Recovery Time |
|-------|--------|---------------|
| Configuration | Change `Persistence:<Module>` to `Mongo` | Minutes |
| Data | MongoDB data retained during dual-write | None needed |
| Code | Git revert (PostgreSQL code isolated) | Hours |
---
## 7. Success Criteria
### 7.1 Per-Module Criteria
- [ ] All existing integration tests pass with PostgreSQL backend
- [ ] No performance regression >10% on critical paths
- [ ] Deterministic outputs verified against MongoDB baseline
- [ ] Zero data loss during conversion
- [ ] Tenant isolation verified
### 7.2 Overall Criteria
- [ ] All control-plane modules running on PostgreSQL
- [ ] MongoDB retired from production for converted modules
- [ ] Air-gap kit updated with PostgreSQL support
- [ ] Documentation updated for PostgreSQL operations
- [ ] Runbooks updated for PostgreSQL troubleshooting
---
## 8. Project Structure
### 8.1 New Projects
```
src/
├── Shared/
│ └── StellaOps.Infrastructure.Postgres/
│ ├── DataSourceBase.cs
│ ├── Migrations/
│ │ ├── IPostgresMigration.cs
│ │ └── PostgresMigrationRunner.cs
│ ├── Extensions/
│ │ └── NpgsqlExtensions.cs
│ └── ServiceCollectionExtensions.cs
├── Authority/
│ └── __Libraries/
│ └── StellaOps.Authority.Storage.Postgres/
│ ├── AuthorityDataSource.cs
│ ├── Repositories/
│ ├── Migrations/
│ └── ServiceCollectionExtensions.cs
├── Scheduler/
│ └── __Libraries/
│ └── StellaOps.Scheduler.Storage.Postgres/
├── Notify/
│ └── __Libraries/
│ └── StellaOps.Notify.Storage.Postgres/
├── Policy/
│ └── __Libraries/
│ └── StellaOps.Policy.Storage.Postgres/
├── Concelier/
│ └── __Libraries/
│ └── StellaOps.Concelier.Storage.Postgres/
└── Excititor/
└── __Libraries/
└── StellaOps.Excititor.Storage.Postgres/
```
### 8.2 Schema Files
```
docs/db/
├── schemas/
│ ├── authority.sql
│ ├── vuln.sql
│ ├── vex.sql
│ ├── scheduler.sql
│ ├── notify.sql
│ └── policy.sql
```
---
## 9. Timeline
### 9.1 Sprint Schedule
| Sprint | Phase | Focus |
|--------|-------|-------|
| 1 | 0 | PostgreSQL infrastructure, shared library |
| 2 | 1 | Authority module conversion |
| 3 | 2 | Scheduler module conversion |
| 4 | 3 | Notify module conversion |
| 5 | 4 | Policy module conversion |
| 6-7 | 5 | Concelier/Vulnerability conversion |
| 8-10 | 6 | Excititor/VEX conversion |
| 11 | 7 | Cleanup, optimization, documentation |
### 9.2 Milestones
| Milestone | Sprint | Criteria |
|-----------|--------|----------|
| M1: Infrastructure Ready | 1 | PostgreSQL cluster operational, CI tests passing |
| M2: Identity Converted | 2 | Authority on PostgreSQL, auth flows working |
| M3: Scheduling Converted | 3 | Scheduler on PostgreSQL, jobs executing |
| M4: Core Services Converted | 5 | Notify + Policy on PostgreSQL |
| M5: Vulnerability Index Converted | 7 | Concelier on PostgreSQL, scans deterministic |
| M6: VEX Converted | 10 | Excititor on PostgreSQL, graphs stable |
| M7: MongoDB Retired | 11 | All modules converted, Mongo archived |
---
## 10. Governance
### 10.1 Decision Log
| Date | Decision | Rationale | Approver |
|------|----------|-----------|----------|
| 2025-11-28 | Strangler fig pattern | Allows gradual rollout with rollback | Architecture Team |
| 2025-11-28 | JSONB for semi-structured data | Preserves flexibility, simplifies conversion | Architecture Team |
| 2025-11-28 | Phase 0 first | Infrastructure must be stable before modules | Architecture Team |
### 10.2 Change Control
Changes to this plan require:
1. Impact assessment documented
2. Risk analysis updated
3. Approval from Architecture Team
4. Updated task definitions in `docs/db/tasks/`
### 10.3 Status Reporting
Weekly status updates in sprint files tracking:
- Tasks completed
- Blockers encountered
- Verification results
- Next sprint objectives
---
## Appendix A: Reference Implementation
### DataSource Pattern
```csharp
public sealed class ModuleDataSource : IAsyncDisposable
{
private readonly NpgsqlDataSource _dataSource;
public async Task<NpgsqlConnection> OpenConnectionAsync(
string tenantId,
CancellationToken cancellationToken = default)
{
var connection = await _dataSource.OpenConnectionAsync(cancellationToken);
await ConfigureSessionAsync(connection, tenantId, cancellationToken);
return connection;
}
private static async Task ConfigureSessionAsync(
NpgsqlConnection connection,
string tenantId,
CancellationToken cancellationToken)
{
await using var cmd = connection.CreateCommand();
cmd.CommandText = $"""
SET app.tenant_id = '{tenantId}';
SET timezone = 'UTC';
SET statement_timeout = '30s';
""";
await cmd.ExecuteNonQueryAsync(cancellationToken);
}
}
```
### Repository Pattern
See [RULES.md](./RULES.md) Section 1 for complete repository implementation guidelines.
---
## Appendix B: Glossary
| Term | Definition |
|------|------------|
| **Strangler Fig** | Pattern where new system grows alongside old, gradually replacing it |
| **Dual-Write** | Writing to both MongoDB and PostgreSQL during transition |
| **Tier A/B/C** | Data classification by criticality for migration strategy |
| **DataSource** | Npgsql connection factory with tenant context configuration |
| **Determinism** | Property that same inputs always produce same outputs |
---
*Document Version: 2.0.0*
*Last Updated: 2025-11-28*

60
docs/db/README.md Normal file
View File

@@ -0,0 +1,60 @@
# StellaOps Database Documentation
This directory contains all documentation related to the StellaOps database architecture, including the MongoDB to PostgreSQL conversion project.
## Document Index
| Document | Purpose |
|----------|---------|
| [SPECIFICATION.md](./SPECIFICATION.md) | PostgreSQL schema design specification, data types, naming conventions |
| [RULES.md](./RULES.md) | Database coding rules, patterns, and constraints for all developers |
| [CONVERSION_PLAN.md](./CONVERSION_PLAN.md) | Strategic plan for MongoDB to PostgreSQL conversion |
| [VERIFICATION.md](./VERIFICATION.md) | Testing and verification requirements for database changes |
## Task Definitions
Sprint-level task definitions for the conversion project:
| Phase | Document | Status |
|-------|----------|--------|
| Phase 0 | [tasks/PHASE_0_FOUNDATIONS.md](./tasks/PHASE_0_FOUNDATIONS.md) | TODO |
| Phase 1 | [tasks/PHASE_1_AUTHORITY.md](./tasks/PHASE_1_AUTHORITY.md) | TODO |
| Phase 2 | [tasks/PHASE_2_SCHEDULER.md](./tasks/PHASE_2_SCHEDULER.md) | TODO |
| Phase 3 | [tasks/PHASE_3_NOTIFY.md](./tasks/PHASE_3_NOTIFY.md) | TODO |
| Phase 4 | [tasks/PHASE_4_POLICY.md](./tasks/PHASE_4_POLICY.md) | TODO |
| Phase 5 | [tasks/PHASE_5_VULNERABILITIES.md](./tasks/PHASE_5_VULNERABILITIES.md) | TODO |
| Phase 6 | [tasks/PHASE_6_VEX_GRAPH.md](./tasks/PHASE_6_VEX_GRAPH.md) | TODO |
| Phase 7 | [tasks/PHASE_7_CLEANUP.md](./tasks/PHASE_7_CLEANUP.md) | TODO |
## Schema Reference
Schema DDL files (generated from specifications):
| Schema | File | Tables |
|--------|------|--------|
| authority | [schemas/authority.sql](./schemas/authority.sql) | 12 |
| vuln | [schemas/vuln.sql](./schemas/vuln.sql) | 12 |
| vex | [schemas/vex.sql](./schemas/vex.sql) | 13 |
| scheduler | [schemas/scheduler.sql](./schemas/scheduler.sql) | 10 |
| notify | [schemas/notify.sql](./schemas/notify.sql) | 14 |
| policy | [schemas/policy.sql](./schemas/policy.sql) | 8 |
## Quick Links
- **For developers**: Start with [RULES.md](./RULES.md) for coding conventions
- **For architects**: Review [SPECIFICATION.md](./SPECIFICATION.md) for design rationale
- **For project managers**: See [CONVERSION_PLAN.md](./CONVERSION_PLAN.md) for timeline and phases
- **For QA**: Check [VERIFICATION.md](./VERIFICATION.md) for testing requirements
## Key Principles
1. **Determinism First**: All database operations must produce reproducible, stable outputs
2. **Tenant Isolation**: Multi-tenancy via `tenant_id` column with row-level security
3. **Strangler Fig Pattern**: Gradual conversion with rollback capability per module
4. **JSONB for Flexibility**: Semi-structured data stays as JSONB, relational data normalizes
## Related Documentation
- [Architecture Overview](../07_HIGH_LEVEL_ARCHITECTURE.md)
- [Module Dossiers](../modules/)
- [Air-Gap Operations](../24_OFFLINE_KIT.md)

839
docs/db/RULES.md Normal file
View File

@@ -0,0 +1,839 @@
# Database Coding Rules
**Version:** 1.0.0
**Status:** APPROVED
**Last Updated:** 2025-11-28
---
## Purpose
This document defines mandatory rules and guidelines for all database-related code in StellaOps. These rules ensure consistency, maintainability, determinism, and security across all modules.
**Compliance is mandatory.** Deviations require explicit approval documented in the relevant sprint file.
---
## 1. Repository Pattern Rules
### 1.1 Interface Location
**RULE:** Repository interfaces MUST be defined in the Core/Domain layer, NOT in the storage layer.
```
✓ CORRECT:
src/Scheduler/__Libraries/StellaOps.Scheduler.Core/Repositories/IScheduleRepository.cs
✗ INCORRECT:
src/Scheduler/__Libraries/StellaOps.Scheduler.Storage.Postgres/IScheduleRepository.cs
```
### 1.2 Implementation Naming
**RULE:** Repository implementations MUST be prefixed with the storage technology.
```csharp
// ✓ CORRECT
public sealed class PostgresScheduleRepository : IScheduleRepository
public sealed class MongoScheduleRepository : IScheduleRepository
// ✗ INCORRECT
public sealed class ScheduleRepository : IScheduleRepository
```
### 1.3 Dependency Injection
**RULE:** PostgreSQL repositories MUST be registered as `Scoped`. MongoDB repositories MAY be `Singleton`.
```csharp
// PostgreSQL - always scoped (connection per request)
services.AddScoped<IScheduleRepository, PostgresScheduleRepository>();
// MongoDB - singleton is acceptable (stateless)
services.AddSingleton<IScheduleRepository, MongoScheduleRepository>();
```
### 1.4 No Direct SQL in Services
**RULE:** Business logic services MUST NOT contain raw SQL. All database access MUST go through repository interfaces.
```csharp
// ✓ CORRECT
public class ScheduleService
{
private readonly IScheduleRepository _repository;
public Task<Schedule?> GetAsync(string id)
=> _repository.GetAsync(id);
}
// ✗ INCORRECT
public class ScheduleService
{
private readonly NpgsqlDataSource _dataSource;
public async Task<Schedule?> GetAsync(string id)
{
await using var conn = await _dataSource.OpenConnectionAsync();
// Direct SQL here - FORBIDDEN
}
}
```
---
## 2. Connection Management Rules
### 2.1 DataSource Pattern
**RULE:** Every module MUST have its own DataSource class that configures tenant context.
```csharp
public sealed class SchedulerDataSource : IAsyncDisposable
{
private readonly NpgsqlDataSource _dataSource;
public async Task<NpgsqlConnection> OpenConnectionAsync(
string tenantId,
CancellationToken cancellationToken = default)
{
var connection = await _dataSource.OpenConnectionAsync(cancellationToken);
await ConfigureSessionAsync(connection, tenantId, cancellationToken);
return connection;
}
private static async Task ConfigureSessionAsync(
NpgsqlConnection connection,
string tenantId,
CancellationToken cancellationToken)
{
// MANDATORY: Set tenant context and UTC timezone
await using var cmd = connection.CreateCommand();
cmd.CommandText = $"""
SET app.tenant_id = '{tenantId}';
SET timezone = 'UTC';
SET statement_timeout = '30s';
""";
await cmd.ExecuteNonQueryAsync(cancellationToken);
}
}
```
### 2.2 Connection Disposal
**RULE:** All NpgsqlConnection instances MUST be disposed via `await using`.
```csharp
// ✓ CORRECT
await using var connection = await _dataSource.OpenConnectionAsync(tenantId, ct);
// ✗ INCORRECT
var connection = await _dataSource.OpenConnectionAsync(tenantId, ct);
// Missing disposal
```
### 2.3 Command Disposal
**RULE:** All NpgsqlCommand instances MUST be disposed via `await using`.
```csharp
// ✓ CORRECT
await using var cmd = connection.CreateCommand();
// ✗ INCORRECT
var cmd = connection.CreateCommand();
```
### 2.4 Reader Disposal
**RULE:** All NpgsqlDataReader instances MUST be disposed via `await using`.
```csharp
// ✓ CORRECT
await using var reader = await cmd.ExecuteReaderAsync(ct);
// ✗ INCORRECT
var reader = await cmd.ExecuteReaderAsync(ct);
```
---
## 3. Tenant Isolation Rules
### 3.1 Tenant ID Required
**RULE:** Every tenant-scoped repository method MUST require `tenantId` as the first parameter.
```csharp
// ✓ CORRECT
Task<Schedule?> GetAsync(string tenantId, string scheduleId, CancellationToken ct);
Task<IReadOnlyList<Schedule>> ListAsync(string tenantId, QueryOptions? options, CancellationToken ct);
// ✗ INCORRECT
Task<Schedule?> GetAsync(string scheduleId, CancellationToken ct);
```
### 3.2 Tenant Filtering
**RULE:** All queries MUST include `tenant_id` in the WHERE clause for tenant-scoped tables.
```csharp
// ✓ CORRECT
cmd.CommandText = """
SELECT * FROM scheduler.schedules
WHERE tenant_id = @tenant_id AND id = @id
""";
// ✗ INCORRECT - Missing tenant filter
cmd.CommandText = """
SELECT * FROM scheduler.schedules
WHERE id = @id
""";
```
### 3.3 Session Context Verification
**RULE:** DataSource MUST set `app.tenant_id` on every connection before executing any queries.
```csharp
// ✓ CORRECT - Connection opened via DataSource sets tenant context
await using var connection = await _dataSource.OpenConnectionAsync(tenantId, ct);
// ✗ INCORRECT - Direct connection without tenant context
await using var connection = await _rawDataSource.OpenConnectionAsync(ct);
```
---
## 4. SQL Writing Rules
### 4.1 Parameterized Queries Only
**RULE:** All user-provided values MUST be passed as parameters. String interpolation is FORBIDDEN for values.
```csharp
// ✓ CORRECT
cmd.CommandText = "SELECT * FROM users WHERE id = @id";
cmd.Parameters.AddWithValue("id", userId);
// ✗ INCORRECT - SQL INJECTION VULNERABILITY
cmd.CommandText = $"SELECT * FROM users WHERE id = '{userId}'";
```
### 4.2 SQL String Constants
**RULE:** SQL strings MUST be defined as `const` or `static readonly` fields, or as raw string literals in methods.
```csharp
// ✓ CORRECT - Raw string literal
cmd.CommandText = """
SELECT id, name, created_at
FROM scheduler.schedules
WHERE tenant_id = @tenant_id
ORDER BY created_at DESC
""";
// ✓ CORRECT - Constant
private const string SelectScheduleSql = """
SELECT id, name, created_at
FROM scheduler.schedules
WHERE tenant_id = @tenant_id
""";
// ✗ INCORRECT - Dynamic string building without reason
cmd.CommandText = "SELECT " + columns + " FROM " + table;
```
### 4.3 Schema Qualification
**RULE:** All table references MUST include the schema name.
```csharp
// ✓ CORRECT
cmd.CommandText = "SELECT * FROM scheduler.schedules";
// ✗ INCORRECT - Missing schema
cmd.CommandText = "SELECT * FROM schedules";
```
### 4.4 Column Listing
**RULE:** SELECT statements MUST list columns explicitly. `SELECT *` is FORBIDDEN in production code.
```csharp
// ✓ CORRECT
cmd.CommandText = """
SELECT id, tenant_id, name, enabled, created_at
FROM scheduler.schedules
""";
// ✗ INCORRECT
cmd.CommandText = "SELECT * FROM scheduler.schedules";
```
### 4.5 Consistent Casing
**RULE:** SQL keywords MUST be lowercase for consistency with PostgreSQL conventions.
```csharp
// ✓ CORRECT
cmd.CommandText = """
select id, name
from scheduler.schedules
where tenant_id = @tenant_id
order by created_at desc
""";
// ✗ INCORRECT - Mixed casing
cmd.CommandText = """
SELECT id, name
FROM scheduler.schedules
WHERE tenant_id = @tenant_id
""";
```
---
## 5. Data Type Rules
### 5.1 UUID Handling
**RULE:** UUIDs MUST be passed as `Guid` type to Npgsql, NOT as strings.
```csharp
// ✓ CORRECT
cmd.Parameters.AddWithValue("id", Guid.Parse(scheduleId));
// ✗ INCORRECT
cmd.Parameters.AddWithValue("id", scheduleId); // String
```
### 5.2 Timestamp Handling
**RULE:** All timestamps MUST be `DateTimeOffset` or `DateTime` with `Kind = Utc`.
```csharp
// ✓ CORRECT
cmd.Parameters.AddWithValue("created_at", DateTimeOffset.UtcNow);
cmd.Parameters.AddWithValue("created_at", DateTime.UtcNow);
// ✗ INCORRECT - Local time
cmd.Parameters.AddWithValue("created_at", DateTime.Now);
```
### 5.3 JSONB Serialization
**RULE:** JSONB columns MUST be serialized using `System.Text.Json.JsonSerializer` with consistent options.
```csharp
// ✓ CORRECT
var json = JsonSerializer.Serialize(obj, JsonSerializerOptions.Default);
cmd.Parameters.AddWithValue("config", json);
// ✗ INCORRECT - Newtonsoft or inconsistent serialization
var json = Newtonsoft.Json.JsonConvert.SerializeObject(obj);
```
### 5.4 Null Handling
**RULE:** Nullable values MUST use `DBNull.Value` when null.
```csharp
// ✓ CORRECT
cmd.Parameters.AddWithValue("description", (object?)schedule.Description ?? DBNull.Value);
// ✗ INCORRECT - Will fail or behave unexpectedly
cmd.Parameters.AddWithValue("description", schedule.Description); // If null
```
### 5.5 Array Handling
**RULE:** PostgreSQL arrays MUST be passed as .NET arrays with explicit type.
```csharp
// ✓ CORRECT
cmd.Parameters.AddWithValue("tags", schedule.Tags.ToArray());
// ✗ INCORRECT - List won't map correctly
cmd.Parameters.AddWithValue("tags", schedule.Tags);
```
---
## 6. Transaction Rules
### 6.1 Explicit Transactions
**RULE:** Operations affecting multiple tables MUST use explicit transactions.
```csharp
// ✓ CORRECT
await using var transaction = await connection.BeginTransactionAsync(ct);
try
{
// Multiple operations
await cmd1.ExecuteNonQueryAsync(ct);
await cmd2.ExecuteNonQueryAsync(ct);
await transaction.CommitAsync(ct);
}
catch
{
await transaction.RollbackAsync(ct);
throw;
}
```
### 6.2 Transaction Isolation
**RULE:** Default isolation level is `ReadCommitted`. Stricter levels MUST be documented.
```csharp
// ✓ CORRECT - Default
await using var transaction = await connection.BeginTransactionAsync(ct);
// ✓ CORRECT - Explicit stricter level with documentation
// Using Serializable for financial consistency requirement
await using var transaction = await connection.BeginTransactionAsync(
IsolationLevel.Serializable, ct);
```
### 6.3 No Nested Transactions
**RULE:** Nested transactions are NOT supported. Use savepoints if needed.
```csharp
// ✗ INCORRECT - Nested transaction
await using var tx1 = await connection.BeginTransactionAsync(ct);
await using var tx2 = await connection.BeginTransactionAsync(ct); // FAILS
// ✓ CORRECT - Savepoint for partial rollback
await using var transaction = await connection.BeginTransactionAsync(ct);
await transaction.SaveAsync("savepoint1", ct);
// ... operations ...
await transaction.RollbackAsync("savepoint1", ct); // Partial rollback
await transaction.CommitAsync(ct);
```
---
## 7. Error Handling Rules
### 7.1 PostgreSQL Exception Handling
**RULE:** Catch `PostgresException` for database-specific errors, not generic exceptions.
```csharp
// ✓ CORRECT
try
{
await cmd.ExecuteNonQueryAsync(ct);
}
catch (PostgresException ex) when (ex.SqlState == "23505") // Unique violation
{
throw new DuplicateEntityException($"Entity already exists: {ex.ConstraintName}");
}
// ✗ INCORRECT - Too broad
catch (Exception ex)
{
// Can't distinguish database errors from other errors
}
```
### 7.2 Constraint Violation Handling
**RULE:** Unique constraint violations MUST be translated to domain exceptions.
| SQL State | Meaning | Domain Exception |
|-----------|---------|------------------|
| `23505` | Unique violation | `DuplicateEntityException` |
| `23503` | Foreign key violation | `ReferenceNotFoundException` |
| `23502` | Not null violation | `ValidationException` |
| `23514` | Check constraint | `ValidationException` |
### 7.3 Timeout Handling
**RULE:** Query timeouts MUST be caught and logged with context.
```csharp
try
{
await cmd.ExecuteNonQueryAsync(ct);
}
catch (NpgsqlException ex) when (ex.InnerException is TimeoutException)
{
_logger.LogWarning(ex, "Query timeout for schedule {ScheduleId}", scheduleId);
throw new QueryTimeoutException("Database query timed out", ex);
}
```
---
## 8. Pagination Rules
### 8.1 Keyset Pagination
**RULE:** Use keyset pagination, NOT offset pagination for large result sets.
```csharp
// ✓ CORRECT - Keyset pagination
cmd.CommandText = """
select id, name, created_at
from scheduler.schedules
where tenant_id = @tenant_id
and (created_at, id) < (@cursor_created_at, @cursor_id)
order by created_at desc, id desc
limit @page_size
""";
// ✗ INCORRECT - Offset pagination (slow for large offsets)
cmd.CommandText = """
select id, name, created_at
from scheduler.schedules
where tenant_id = @tenant_id
order by created_at desc
limit @page_size offset @offset
""";
```
### 8.2 Default Page Size
**RULE:** Default page size MUST be 50. Maximum page size MUST be 1000.
```csharp
public class QueryOptions
{
public int PageSize { get; init; } = 50;
public int GetValidatedPageSize()
=> Math.Clamp(PageSize, 1, 1000);
}
```
### 8.3 Continuation Tokens
**RULE:** Pagination cursors MUST be opaque, encoded tokens containing sort key values.
```csharp
public record PaginationCursor(DateTimeOffset CreatedAt, Guid Id)
{
public string Encode()
=> Convert.ToBase64String(
JsonSerializer.SerializeToUtf8Bytes(this));
public static PaginationCursor? Decode(string? token)
=> string.IsNullOrEmpty(token)
? null
: JsonSerializer.Deserialize<PaginationCursor>(
Convert.FromBase64String(token));
}
```
---
## 9. Ordering Rules
### 9.1 Deterministic Ordering
**RULE:** All queries returning multiple rows MUST have an ORDER BY clause that produces deterministic results.
```csharp
// ✓ CORRECT - Deterministic (includes unique column)
cmd.CommandText = """
select * from scheduler.runs
order by created_at desc, id asc
""";
// ✗ INCORRECT - Non-deterministic (created_at may have ties)
cmd.CommandText = """
select * from scheduler.runs
order by created_at desc
""";
```
### 9.2 Stable Ordering for JSONB Arrays
**RULE:** When serializing arrays to JSONB, ensure consistent ordering.
```csharp
// ✓ CORRECT - Sorted before serialization
var sortedTags = schedule.Tags.OrderBy(t => t).ToList();
cmd.Parameters.AddWithValue("tags", sortedTags.ToArray());
// ✗ INCORRECT - Order may vary
cmd.Parameters.AddWithValue("tags", schedule.Tags.ToArray());
```
---
## 10. Audit Rules
### 10.1 Timestamp Columns
**RULE:** All mutable tables MUST have `created_at` and `updated_at` columns.
```sql
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
```
### 10.2 Update Timestamp
**RULE:** `updated_at` MUST be set on every UPDATE operation.
```csharp
// ✓ CORRECT
cmd.CommandText = """
update scheduler.schedules
set name = @name, updated_at = @updated_at
where id = @id
""";
cmd.Parameters.AddWithValue("updated_at", DateTimeOffset.UtcNow);
// ✗ INCORRECT - Missing updated_at
cmd.CommandText = """
update scheduler.schedules
set name = @name
where id = @id
""";
```
### 10.3 Soft Delete Pattern
**RULE:** For audit-required entities, use soft delete with `deleted_at` and `deleted_by`.
```csharp
cmd.CommandText = """
update scheduler.schedules
set deleted_at = @deleted_at, deleted_by = @deleted_by
where tenant_id = @tenant_id and id = @id and deleted_at is null
""";
```
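Because most reads filter on `deleted_at is null`, a partial index keeps those lookups fast without indexing dead rows. A sketch, assuming the `scheduler.schedules` table from the example above (index name is illustrative):

```sql
-- Indexes only live (non-deleted) rows, keeping the index small
CREATE INDEX IF NOT EXISTS idx_schedules_tenant_live
    ON scheduler.schedules (tenant_id, id)
    WHERE deleted_at IS NULL;
```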
---
## 11. Testing Rules
### 11.1 Integration Test Database
**RULE:** Integration tests MUST use Testcontainers with PostgreSQL.
```csharp
public class PostgresFixture : IAsyncLifetime
{
    private readonly PostgreSqlContainer _container = new PostgreSqlBuilder()
        .WithImage("postgres:16")
        .Build();

    public string ConnectionString => _container.GetConnectionString();

    public async Task<NpgsqlConnection> OpenConnectionAsync()
    {
        var connection = new NpgsqlConnection(ConnectionString);
        await connection.OpenAsync();
        return connection;
    }

    public Task InitializeAsync() => _container.StartAsync();
    public Task DisposeAsync() => _container.DisposeAsync().AsTask();
}
```
### 11.2 Test Isolation
**RULE:** Each test MUST run in a transaction that is rolled back after the test.
```csharp
public class ScheduleRepositoryTests : IClassFixture<PostgresFixture>
{
    private readonly PostgresFixture _fixture;

    public ScheduleRepositoryTests(PostgresFixture fixture) => _fixture = fixture;

    [Fact]
    public async Task GetAsync_ReturnsSchedule_WhenExists()
    {
        await using var connection = await _fixture.OpenConnectionAsync();
        await using var transaction = await connection.BeginTransactionAsync();
        try
        {
            // Arrange, Act, Assert
        }
        finally
        {
            await transaction.RollbackAsync();
        }
    }
}
```
### 11.3 Determinism Tests
**RULE:** Every repository MUST have tests verifying deterministic output ordering.
```csharp
[Fact]
public async Task ListAsync_ReturnsDeterministicOrder()
{
// Insert records with same created_at
// Verify order is consistent across multiple calls
var result1 = await _repository.ListAsync(tenantId);
var result2 = await _repository.ListAsync(tenantId);
result1.Should().BeEquivalentTo(result2, options =>
options.WithStrictOrdering());
}
```
---
## 12. Migration Rules
### 12.1 Idempotent Migrations
**RULE:** All migrations MUST be idempotent using `IF NOT EXISTS` / `IF EXISTS`.
```sql
-- ✓ CORRECT
CREATE TABLE IF NOT EXISTS scheduler.schedules (...);
CREATE INDEX IF NOT EXISTS idx_schedules_tenant ON scheduler.schedules(tenant_id);
-- ✗ INCORRECT
CREATE TABLE scheduler.schedules (...); -- Fails if exists
```
### 12.2 No Breaking Changes
**RULE:** Migrations MUST NOT break existing code. Use expand-contract pattern.
```
Expand Phase:
1. Add new column as nullable
2. Deploy code that writes to both old and new columns
3. Backfill new column
Contract Phase:
4. Deploy code that reads from new column only
5. Add NOT NULL constraint
6. Drop old column
```
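Applied to a hypothetical column rename (`name` to `display_name`; names are illustrative), the two phases might look like:

```sql
-- Expand: add the new column as nullable (idempotent)
ALTER TABLE scheduler.schedules
    ADD COLUMN IF NOT EXISTS display_name TEXT;

-- Backfill after dual-writing code is deployed
UPDATE scheduler.schedules
SET display_name = name
WHERE display_name IS NULL;

-- Contract: once all readers use the new column only
ALTER TABLE scheduler.schedules
    ALTER COLUMN display_name SET NOT NULL;
ALTER TABLE scheduler.schedules
    DROP COLUMN IF EXISTS name;
```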
### 12.3 Index Creation
**RULE:** Large table indexes MUST be created with `CONCURRENTLY`. Note that `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block, so the migration runner must execute these statements outside its normal migration transaction.
```sql
-- ✓ CORRECT - Won't lock table
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_large_table_col
ON schema.large_table(column);
-- ✗ INCORRECT - Locks table during creation
CREATE INDEX idx_large_table_col ON schema.large_table(column);
```
---
## 13. Configuration Rules
### 13.1 Backend Selection
**RULE:** Storage backend MUST be configurable per module.
```json
{
"Persistence": {
"Authority": "Postgres",
"Scheduler": "Postgres",
"Concelier": "Mongo"
}
}
```
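At startup, module registration can switch on the configured value. A sketch; the `AddPostgresSchedulerStorage` / `AddMongoSchedulerStorage` extension-method names are illustrative:

```csharp
// Selects the Scheduler storage backend from configuration; defaults to Mongo.
var backend = configuration.GetValue<string>("Persistence:Scheduler") ?? "Mongo";

_ = backend switch
{
    "Postgres" => services.AddPostgresSchedulerStorage(configuration),
    "Mongo" => services.AddMongoSchedulerStorage(configuration),
    _ => throw new InvalidOperationException(
        $"Unknown storage backend for Scheduler: {backend}")
};
```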
### 13.2 Connection String Security
**RULE:** Connection strings MUST NOT be logged or included in exception messages.
```csharp
// ✓ CORRECT
catch (NpgsqlException ex)
{
_logger.LogError(ex, "Database connection failed for module {Module}", moduleName);
throw;
}
// ✗ INCORRECT
catch (NpgsqlException ex)
{
_logger.LogError("Failed to connect: {ConnectionString}", connectionString);
}
```
### 13.3 Timeout Configuration
**RULE:** Command timeout MUST be configurable with sensible defaults.
```csharp
public class PostgresOptions
{
public int CommandTimeoutSeconds { get; set; } = 30;
public int ConnectionTimeoutSeconds { get; set; } = 15;
}
```
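These options map onto Npgsql's connection-string settings. A sketch, assuming `PostgresOptions` also exposes the raw connection string (an illustrative property):

```csharp
var builder = new NpgsqlConnectionStringBuilder(options.ConnectionString)
{
    CommandTimeout = options.CommandTimeoutSeconds, // default per-command timeout
    Timeout = options.ConnectionTimeoutSeconds      // connection-open timeout
};

var dataSource = NpgsqlDataSource.Create(builder.ConnectionString);
```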
---
## 14. Documentation Rules
### 14.1 Repository Method Documentation
**RULE:** All public repository methods MUST have XML documentation.
```csharp
/// <summary>
/// Retrieves a schedule by its unique identifier.
/// </summary>
/// <param name="tenantId">The tenant identifier for isolation.</param>
/// <param name="scheduleId">The schedule's unique identifier.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>The schedule if found; otherwise, null.</returns>
Task<Schedule?> GetAsync(string tenantId, string scheduleId, CancellationToken cancellationToken);
```
### 14.2 SQL Comment Headers
**RULE:** Complex SQL queries SHOULD have a comment explaining the purpose.
```csharp
cmd.CommandText = """
-- Find schedules due to fire within the next minute
-- Uses compound index (tenant_id, next_fire_time) for efficiency
select s.id, s.name, t.next_fire_time
from scheduler.schedules s
join scheduler.triggers t on t.schedule_id = s.id
where s.tenant_id = @tenant_id
and s.enabled = true
and t.next_fire_time <= @window_end
order by t.next_fire_time asc
""";
```
---
## Enforcement
### Code Review Checklist
- [ ] Repository interfaces in Core layer
- [ ] PostgreSQL repositories prefixed with `Postgres`
- [ ] All connections disposed with `await using`
- [ ] Tenant ID required and used in all queries
- [ ] Parameterized queries (no string interpolation for values)
- [ ] Schema-qualified table names
- [ ] Explicit column lists (no `SELECT *`)
- [ ] Deterministic ORDER BY clauses
- [ ] Timestamps are UTC
- [ ] JSONB serialized with System.Text.Json
- [ ] PostgresException caught for constraint violations
- [ ] Integration tests use Testcontainers
### Automated Checks
These rules are enforced by:
- Roslyn analyzers in `StellaOps.Analyzers`
- SQL linting in CI pipeline
- Integration test requirements
---
*Document Version: 1.0.0*
*Last Updated: 2025-11-28*

---

docs/db/SPECIFICATION.md: new file, 1326 lines (diff suppressed because it is too large).

docs/db/VERIFICATION.md: new file, 961 lines.

---
# Database Verification Requirements
**Version:** 1.0.0
**Status:** DRAFT
**Last Updated:** 2025-11-28
---
## Purpose
This document defines the verification and testing requirements for the MongoDB to PostgreSQL conversion. It ensures that the conversion maintains data integrity, determinism, and functional correctness.
---
## 1. Verification Principles
### 1.1 Core Guarantees
The conversion MUST maintain these guarantees:
| Guarantee | Description | Verification Method |
|-----------|-------------|---------------------|
| **Data Integrity** | No data loss during conversion | Record count comparison, checksum validation |
| **Determinism** | Same inputs produce identical outputs | Parallel pipeline comparison |
| **Functional Equivalence** | APIs behave identically | Integration test suite |
| **Performance Parity** | No significant degradation | Benchmark comparison |
| **Tenant Isolation** | Data remains properly isolated | Cross-tenant query tests |
### 1.2 Verification Levels
```
Level 1: Unit Tests
└── Individual repository method correctness
Level 2: Integration Tests
└── End-to-end repository operations with real PostgreSQL
Level 3: Comparison Tests
└── MongoDB vs PostgreSQL output comparison
Level 4: Load Tests
└── Performance and scalability verification
Level 5: Production Verification
└── Dual-write monitoring and validation
```
---
## 2. Test Infrastructure
### 2.1 Testcontainers Setup
All PostgreSQL integration tests MUST use Testcontainers:
```csharp
public sealed class PostgresTestFixture : IAsyncLifetime
{
private readonly PostgreSqlContainer _container;
private NpgsqlDataSource? _dataSource;
public PostgresTestFixture()
{
_container = new PostgreSqlBuilder()
.WithImage("postgres:16-alpine")
.WithDatabase("stellaops_test")
.WithUsername("test")
.WithPassword("test")
.WithWaitStrategy(Wait.ForUnixContainer()
.UntilPortIsAvailable(5432))
.Build();
}
public string ConnectionString => _container.GetConnectionString();
public NpgsqlDataSource DataSource => _dataSource
?? throw new InvalidOperationException("Not initialized");
public async Task InitializeAsync()
{
await _container.StartAsync();
_dataSource = NpgsqlDataSource.Create(ConnectionString);
await RunMigrationsAsync();
}
public async Task DisposeAsync()
{
if (_dataSource is not null)
await _dataSource.DisposeAsync();
await _container.DisposeAsync();
}
private async Task RunMigrationsAsync()
{
await using var connection = await _dataSource!.OpenConnectionAsync();
var migrationRunner = new PostgresMigrationRunner(_dataSource, GetMigrations());
await migrationRunner.RunAsync();
}
}
```
### 2.2 Test Database State Management
```csharp
public abstract class PostgresRepositoryTestBase : IAsyncLifetime
{
protected readonly PostgresTestFixture Fixture;
protected NpgsqlConnection Connection = null!;
protected NpgsqlTransaction Transaction = null!;
protected PostgresRepositoryTestBase(PostgresTestFixture fixture)
{
Fixture = fixture;
}
public async Task InitializeAsync()
{
Connection = await Fixture.DataSource.OpenConnectionAsync();
Transaction = await Connection.BeginTransactionAsync();
// Set test tenant context
await using var cmd = Connection.CreateCommand();
cmd.CommandText = "SET app.tenant_id = 'test-tenant-id'";
await cmd.ExecuteNonQueryAsync();
}
public async Task DisposeAsync()
{
await Transaction.RollbackAsync();
await Transaction.DisposeAsync();
await Connection.DisposeAsync();
}
}
```
### 2.3 Test Data Builders
```csharp
public sealed class ScheduleBuilder
{
private Guid _id = Guid.NewGuid();
private string _tenantId = "test-tenant";
private string _name = "test-schedule";
private bool _enabled = true;
private string? _cronExpression = "0 * * * *";
public ScheduleBuilder WithId(Guid id) { _id = id; return this; }
public ScheduleBuilder WithTenant(string tenantId) { _tenantId = tenantId; return this; }
public ScheduleBuilder WithName(string name) { _name = name; return this; }
public ScheduleBuilder Enabled(bool enabled = true) { _enabled = enabled; return this; }
public ScheduleBuilder WithCron(string? cron) { _cronExpression = cron; return this; }
public Schedule Build() => new()
{
Id = _id,
TenantId = _tenantId,
Name = _name,
Enabled = _enabled,
CronExpression = _cronExpression,
Timezone = "UTC",
Mode = ScheduleMode.Scheduled,
CreatedAt = DateTimeOffset.UtcNow,
UpdatedAt = DateTimeOffset.UtcNow
};
}
```
---
## 3. Unit Test Requirements
### 3.1 Repository CRUD Tests
Every repository implementation MUST have tests for:
```csharp
public class PostgresScheduleRepositoryTests : PostgresRepositoryTestBase
{
private readonly PostgresScheduleRepository _repository;
public PostgresScheduleRepositoryTests(PostgresTestFixture fixture)
: base(fixture)
{
_repository = new PostgresScheduleRepository(/* ... */);
}
// CREATE
[Fact]
public async Task UpsertAsync_CreatesNewSchedule_WhenNotExists()
{
var schedule = new ScheduleBuilder().Build();
await _repository.UpsertAsync(schedule, CancellationToken.None);
var retrieved = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
retrieved.Should().BeEquivalentTo(schedule);
}
// READ
[Fact]
public async Task GetAsync_ReturnsNull_WhenNotExists()
{
var result = await _repository.GetAsync(
"tenant", Guid.NewGuid().ToString(), CancellationToken.None);
result.Should().BeNull();
}
[Fact]
public async Task GetAsync_ReturnsSchedule_WhenExists()
{
var schedule = new ScheduleBuilder().Build();
await _repository.UpsertAsync(schedule, CancellationToken.None);
var result = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
result.Should().NotBeNull();
result!.Id.Should().Be(schedule.Id);
}
// UPDATE
[Fact]
public async Task UpsertAsync_UpdatesExisting_WhenExists()
{
var schedule = new ScheduleBuilder().Build();
await _repository.UpsertAsync(schedule, CancellationToken.None);
schedule = schedule with { Name = "updated-name" };
await _repository.UpsertAsync(schedule, CancellationToken.None);
var retrieved = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
retrieved!.Name.Should().Be("updated-name");
}
// DELETE
[Fact]
public async Task SoftDeleteAsync_SetsDeletedAt_WhenExists()
{
var schedule = new ScheduleBuilder().Build();
await _repository.UpsertAsync(schedule, CancellationToken.None);
var result = await _repository.SoftDeleteAsync(
schedule.TenantId, schedule.Id.ToString(),
"test-user", DateTimeOffset.UtcNow, CancellationToken.None);
result.Should().BeTrue();
var retrieved = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
retrieved.Should().BeNull(); // Soft-deleted not returned
}
// LIST
[Fact]
public async Task ListAsync_ReturnsAllForTenant()
{
var schedule1 = new ScheduleBuilder().WithName("schedule-1").Build();
var schedule2 = new ScheduleBuilder().WithName("schedule-2").Build();
await _repository.UpsertAsync(schedule1, CancellationToken.None);
await _repository.UpsertAsync(schedule2, CancellationToken.None);
var results = await _repository.ListAsync(
schedule1.TenantId, null, CancellationToken.None);
results.Should().HaveCount(2);
}
}
```
### 3.2 Tenant Isolation Tests
```csharp
public class TenantIsolationTests : PostgresRepositoryTestBase
{
[Fact]
public async Task GetAsync_DoesNotReturnOtherTenantData()
{
var tenant1Schedule = new ScheduleBuilder()
.WithTenant("tenant-1")
.WithName("tenant1-schedule")
.Build();
var tenant2Schedule = new ScheduleBuilder()
.WithTenant("tenant-2")
.WithName("tenant2-schedule")
.Build();
await _repository.UpsertAsync(tenant1Schedule, CancellationToken.None);
await _repository.UpsertAsync(tenant2Schedule, CancellationToken.None);
// Tenant 1 should not see Tenant 2's data
var result = await _repository.GetAsync(
"tenant-1", tenant2Schedule.Id.ToString(), CancellationToken.None);
result.Should().BeNull();
}
[Fact]
public async Task ListAsync_OnlyReturnsTenantData()
{
// Create schedules for two tenants
for (int i = 0; i < 5; i++)
{
await _repository.UpsertAsync(
new ScheduleBuilder().WithTenant("tenant-1").Build(),
CancellationToken.None);
await _repository.UpsertAsync(
new ScheduleBuilder().WithTenant("tenant-2").Build(),
CancellationToken.None);
}
var tenant1Results = await _repository.ListAsync(
"tenant-1", null, CancellationToken.None);
var tenant2Results = await _repository.ListAsync(
"tenant-2", null, CancellationToken.None);
tenant1Results.Should().HaveCount(5);
tenant2Results.Should().HaveCount(5);
tenant1Results.Should().OnlyContain(s => s.TenantId == "tenant-1");
tenant2Results.Should().OnlyContain(s => s.TenantId == "tenant-2");
}
}
```
### 3.3 Determinism Tests
```csharp
public class DeterminismTests : PostgresRepositoryTestBase
{
[Fact]
public async Task ListAsync_ReturnsDeterministicOrder()
{
// Insert multiple schedules with same created_at
var baseTime = DateTimeOffset.UtcNow;
var schedules = Enumerable.Range(0, 10)
.Select(i => new ScheduleBuilder()
.WithName($"schedule-{i}")
.Build() with { CreatedAt = baseTime })
.ToList();
foreach (var schedule in schedules)
await _repository.UpsertAsync(schedule, CancellationToken.None);
// Multiple calls should return same order
var results1 = await _repository.ListAsync("test-tenant", null, CancellationToken.None);
var results2 = await _repository.ListAsync("test-tenant", null, CancellationToken.None);
var results3 = await _repository.ListAsync("test-tenant", null, CancellationToken.None);
results1.Select(s => s.Id).Should().Equal(results2.Select(s => s.Id));
results2.Select(s => s.Id).Should().Equal(results3.Select(s => s.Id));
}
[Fact]
public async Task JsonbSerialization_IsDeterministic()
{
var schedule = new ScheduleBuilder()
.Build() with
{
Selection = new ScheduleSelector
{
Tags = new[] { "z", "a", "m" },
Repositories = new[] { "repo-2", "repo-1" }
}
};
await _repository.UpsertAsync(schedule, CancellationToken.None);
// Retrieve and re-save multiple times
for (int i = 0; i < 3; i++)
{
var retrieved = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
await _repository.UpsertAsync(retrieved!, CancellationToken.None);
}
// Final retrieval should have identical JSONB
var final = await _repository.GetAsync(
schedule.TenantId, schedule.Id.ToString(), CancellationToken.None);
// Arrays should be consistently ordered
final!.Selection.Tags.Should().BeInAscendingOrder();
}
}
```
---
## 4. Comparison Test Requirements
### 4.1 MongoDB vs PostgreSQL Comparison Framework
```csharp
public abstract class ComparisonTestBase<TEntity, TRepository>
where TRepository : class
{
protected readonly TRepository MongoRepository;
protected readonly TRepository PostgresRepository;
protected abstract Task<TEntity?> GetFromMongo(string tenantId, string id);
protected abstract Task<TEntity?> GetFromPostgres(string tenantId, string id);
protected abstract Task<IReadOnlyList<TEntity>> ListFromMongo(string tenantId);
protected abstract Task<IReadOnlyList<TEntity>> ListFromPostgres(string tenantId);
[Fact]
public async Task Get_ReturnsSameEntity_FromBothBackends()
{
var entityId = GetTestEntityId();
var tenantId = GetTestTenantId();
var mongoResult = await GetFromMongo(tenantId, entityId);
var postgresResult = await GetFromPostgres(tenantId, entityId);
postgresResult.Should().BeEquivalentTo(mongoResult, options =>
options.Excluding(e => e.Path.Contains("Id"))); // IDs may differ
}
[Fact]
public async Task List_ReturnsSameEntities_FromBothBackends()
{
var tenantId = GetTestTenantId();
var mongoResults = await ListFromMongo(tenantId);
var postgresResults = await ListFromPostgres(tenantId);
postgresResults.Should().BeEquivalentTo(mongoResults, options =>
options
.Excluding(e => e.Path.Contains("Id"))
.WithStrictOrdering()); // Order must match
}
}
```
### 4.2 Advisory Matching Comparison
```csharp
public class AdvisoryMatchingComparisonTests
{
[Theory]
[MemberData(nameof(GetSampleSboms))]
public async Task VulnerabilityMatching_ProducesSameResults(string sbomPath)
{
var sbom = await LoadSbomAsync(sbomPath);
// Configure Mongo backend
var mongoConfig = CreateConfig("Mongo");
var mongoScanner = CreateScanner(mongoConfig);
var mongoFindings = await mongoScanner.ScanAsync(sbom);
// Configure Postgres backend
var postgresConfig = CreateConfig("Postgres");
var postgresScanner = CreateScanner(postgresConfig);
var postgresFindings = await postgresScanner.ScanAsync(sbom);
// Compare findings
postgresFindings.Should().BeEquivalentTo(mongoFindings, options =>
options
.WithStrictOrdering()
.Using<DateTimeOffset>(ctx =>
ctx.Subject.Should().BeCloseTo(ctx.Expectation, TimeSpan.FromSeconds(1)))
.WhenTypeIs<DateTimeOffset>());
}
public static IEnumerable<object[]> GetSampleSboms()
{
yield return new object[] { "testdata/sbom-alpine-3.18.json" };
yield return new object[] { "testdata/sbom-debian-12.json" };
yield return new object[] { "testdata/sbom-nodejs-app.json" };
yield return new object[] { "testdata/sbom-python-app.json" };
}
}
```
### 4.3 VEX Graph Comparison
```csharp
public class GraphRevisionComparisonTests
{
[Theory]
[MemberData(nameof(GetTestProjects))]
public async Task GraphComputation_ProducesIdenticalRevisionId(string projectId)
{
// Compute graph with Mongo backend
var mongoGraph = await ComputeGraphAsync(projectId, "Mongo");
// Compute graph with Postgres backend
var postgresGraph = await ComputeGraphAsync(projectId, "Postgres");
// Revision ID MUST be identical (hash-stable)
postgresGraph.RevisionId.Should().Be(mongoGraph.RevisionId);
// Node and edge counts should match
postgresGraph.NodeCount.Should().Be(mongoGraph.NodeCount);
postgresGraph.EdgeCount.Should().Be(mongoGraph.EdgeCount);
// VEX statements should match
var mongoStatements = await GetStatementsAsync(projectId, "Mongo");
var postgresStatements = await GetStatementsAsync(projectId, "Postgres");
postgresStatements.Should().BeEquivalentTo(mongoStatements, options =>
options
.Excluding(s => s.Id)
.WithStrictOrdering());
}
}
```
---
## 5. Performance Test Requirements
### 5.1 Benchmark Framework
```csharp
[MemoryDiagnoser]
[SimpleJob(RuntimeMoniker.Net80)]
public class RepositoryBenchmarks
{
private IScheduleRepository _mongoRepository = null!;
private IScheduleRepository _postgresRepository = null!;
private string _tenantId = null!;
[GlobalSetup]
public async Task Setup()
{
// Initialize both repositories
_mongoRepository = await CreateMongoRepositoryAsync();
_postgresRepository = await CreatePostgresRepositoryAsync();
_tenantId = await SeedTestDataAsync();
}
[Benchmark(Baseline = true)]
public async Task<Schedule?> Mongo_GetById()
{
return await _mongoRepository.GetAsync(_tenantId, _testScheduleId, CancellationToken.None);
}
[Benchmark]
public async Task<Schedule?> Postgres_GetById()
{
return await _postgresRepository.GetAsync(_tenantId, _testScheduleId, CancellationToken.None);
}
[Benchmark(Baseline = true)]
public async Task<IReadOnlyList<Schedule>> Mongo_List100()
{
return await _mongoRepository.ListAsync(_tenantId,
new QueryOptions { PageSize = 100 }, CancellationToken.None);
}
[Benchmark]
public async Task<IReadOnlyList<Schedule>> Postgres_List100()
{
return await _postgresRepository.ListAsync(_tenantId,
new QueryOptions { PageSize = 100 }, CancellationToken.None);
}
}
```
### 5.2 Performance Acceptance Criteria
| Operation | Mongo Baseline | Postgres Target | Maximum Acceptable |
|-----------|----------------|-----------------|-------------------|
| Get by ID | X ms | ≤ X ms | ≤ 1.5X ms |
| List (100 items) | Y ms | ≤ Y ms | ≤ 1.5Y ms |
| Insert | Z ms | ≤ Z ms | ≤ 2Z ms |
| Update | W ms | ≤ W ms | ≤ 2W ms |
| Complex query | V ms | ≤ V ms | ≤ 2V ms |
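The tolerance gate can be enforced in CI once both means are extracted from the BenchmarkDotNet results. A minimal sketch (the helper name is illustrative):

```csharp
// Throws when the Postgres mean exceeds the allowed multiple of the Mongo baseline.
public static void AssertWithinTolerance(
    string operation, double mongoMs, double postgresMs, double maxRatio)
{
    var ratio = postgresMs / mongoMs;
    if (ratio > maxRatio)
    {
        throw new InvalidOperationException(
            $"{operation}: Postgres {postgresMs:F1} ms is {ratio:F2}x the " +
            $"Mongo baseline of {mongoMs:F1} ms (max {maxRatio:F1}x)");
    }
}
```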
### 5.3 Load Test Scenarios
```yaml
# k6 load test configuration
scenarios:
constant_load:
executor: constant-arrival-rate
rate: 100
timeUnit: 1s
duration: 5m
preAllocatedVUs: 50
maxVUs: 100
spike_test:
executor: ramping-arrival-rate
startRate: 10
timeUnit: 1s
stages:
- duration: 1m
target: 10
- duration: 1m
target: 100
- duration: 2m
target: 100
- duration: 1m
target: 10
thresholds:
http_req_duration:
- p(95) < 200 # 95th percentile under 200ms
- p(99) < 500 # 99th percentile under 500ms
http_req_failed:
- rate < 0.01 # Error rate under 1%
```
---
## 6. Data Integrity Verification
### 6.1 Record Count Verification
```csharp
public class DataIntegrityVerifier
{
public async Task<VerificationResult> VerifyCountsAsync(string module)
{
var results = new Dictionary<string, (long mongo, long postgres)>();
foreach (var collection in GetCollections(module))
{
var mongoCount = await _mongoDb.GetCollection<BsonDocument>(collection)
.CountDocumentsAsync(FilterDefinition<BsonDocument>.Empty);
var postgresCount = await GetPostgresCountAsync(collection);
results[collection] = (mongoCount, postgresCount);
}
return new VerificationResult
{
Module = module,
Counts = results,
AllMatch = results.All(r => r.Value.mongo == r.Value.postgres)
};
}
}
```
### 6.2 Checksum Verification
```csharp
public class ChecksumVerifier
{
public async Task<bool> VerifyAdvisoryChecksumAsync(string advisoryKey)
{
var mongoAdvisory = await _mongoAdvisoryRepo.GetAsync(advisoryKey);
var postgresAdvisory = await _postgresAdvisoryRepo.GetAsync(advisoryKey);
if (mongoAdvisory is null || postgresAdvisory is null)
return mongoAdvisory is null && postgresAdvisory is null;
var mongoChecksum = ComputeChecksum(mongoAdvisory);
var postgresChecksum = ComputeChecksum(postgresAdvisory);
return mongoChecksum == postgresChecksum;
}
private string ComputeChecksum(Advisory advisory)
{
// Serialize to canonical JSON and hash
var json = JsonSerializer.Serialize(advisory, new JsonSerializerOptions
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
WriteIndented = false,
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
});
using var sha256 = SHA256.Create();
var hash = sha256.ComputeHash(Encoding.UTF8.GetBytes(json));
return Convert.ToHexString(hash);
}
}
```
### 6.3 Referential Integrity Verification
```csharp
public class ReferentialIntegrityTests
{
[Fact]
public async Task AllForeignKeys_ReferenceExistingRecords()
{
await using var connection = await _dataSource.OpenConnectionAsync();
await using var cmd = connection.CreateCommand();
// Check for orphaned references
cmd.CommandText = """
SELECT 'advisory_aliases' as table_name, COUNT(*) as orphan_count
FROM vuln.advisory_aliases a
LEFT JOIN vuln.advisories adv ON a.advisory_id = adv.id
WHERE adv.id IS NULL
UNION ALL
SELECT 'advisory_cvss', COUNT(*)
FROM vuln.advisory_cvss c
LEFT JOIN vuln.advisories adv ON c.advisory_id = adv.id
WHERE adv.id IS NULL
-- Add more tables...
""";
await using var reader = await cmd.ExecuteReaderAsync();
while (await reader.ReadAsync())
{
var tableName = reader.GetString(0);
var orphanCount = reader.GetInt64(1);
orphanCount.Should().Be(0, $"Table {tableName} has orphaned references");
}
}
}
```
---
## 7. Production Verification
### 7.1 Dual-Write Monitoring
```csharp
public class DualWriteMonitor
{
private readonly IMetrics _metrics;
public async Task RecordWriteAsync(
string module,
string operation,
bool mongoSuccess,
bool postgresSuccess,
TimeSpan mongoDuration,
TimeSpan postgresDuration)
{
_metrics.Counter("dual_write_total", new[]
{
("module", module),
("operation", operation),
("mongo_success", mongoSuccess.ToString()),
("postgres_success", postgresSuccess.ToString())
}).Inc();
_metrics.Histogram("dual_write_duration_ms", new[]
{
("module", module),
("operation", operation),
("backend", "mongo")
}).Observe(mongoDuration.TotalMilliseconds);
_metrics.Histogram("dual_write_duration_ms", new[]
{
("module", module),
("operation", operation),
("backend", "postgres")
}).Observe(postgresDuration.TotalMilliseconds);
if (mongoSuccess != postgresSuccess)
{
_metrics.Counter("dual_write_inconsistency", new[]
{
("module", module),
("operation", operation)
}).Inc();
_logger.LogWarning(
"Dual-write inconsistency: {Module}/{Operation} - Mongo: {Mongo}, Postgres: {Postgres}",
module, operation, mongoSuccess, postgresSuccess);
}
}
}
```
### 7.2 Read Comparison Sampling
```csharp
public class ReadComparisonSampler : BackgroundService
{
private readonly IOptions<SamplingOptions> _options;
private readonly Random _random = new();
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
if (_random.NextDouble() < _options.Value.SampleRate) // e.g., 1%
{
await CompareRandomRecordAsync(stoppingToken);
}
await Task.Delay(_options.Value.Interval, stoppingToken);
}
}
private async Task CompareRandomRecordAsync(CancellationToken ct)
{
var entityId = await GetRandomEntityIdAsync(ct);
var mongoEntity = await _mongoRepo.GetAsync(entityId, ct);
var postgresEntity = await _postgresRepo.GetAsync(entityId, ct);
if (!AreEquivalent(mongoEntity, postgresEntity))
{
_logger.LogError(
"Read comparison mismatch for entity {EntityId}",
entityId);
_metrics.Counter("read_comparison_mismatch").Inc();
}
}
}
```
### 7.3 Rollback Verification
```csharp
public class RollbackVerificationTests
{
[Fact]
public async Task Rollback_RestoresMongoAsSource_WhenPostgresFails()
{
// Simulate Postgres failure
await _postgresDataSource.DisposeAsync();
// Verify system falls back to Mongo
var config = _configuration.GetSection("Persistence");
config["Scheduler"] = "Mongo"; // Simulate config change
// Operations should continue working
var schedule = await _scheduleRepository.GetAsync(
"tenant", "schedule-id", CancellationToken.None);
schedule.Should().NotBeNull();
}
}
```
---
## 8. Module-Specific Verification
### 8.1 Authority Verification
| Test | Description | Pass Criteria |
|------|-------------|---------------|
| User CRUD | Create, read, update, delete users | All operations succeed |
| Role assignment | Assign/revoke roles | Roles correctly applied |
| Token issuance | Issue OAuth tokens | Tokens valid and verifiable |
| Token verification | Verify issued tokens | Verification succeeds |
| Login tracking | Record login attempts | Attempts logged correctly |
| License validation | Check license validity | Same result both backends |
### 8.2 Scheduler Verification
| Test | Description | Pass Criteria |
|------|-------------|---------------|
| Schedule CRUD | All CRUD operations | Data integrity preserved |
| Trigger calculation | Next fire time calculation | Identical results |
| Run history | Run creation and completion | Correct state transitions |
| Impact snapshots | Finding aggregation | Same counts and severity |
| Worker registration | Worker heartbeats | Consistent status |
### 8.3 Vulnerability Verification
| Test | Description | Pass Criteria |
|------|-------------|---------------|
| Advisory ingest | Import from feed | All advisories imported |
| Alias resolution | CVE → Advisory lookup | Same advisory returned |
| CVSS lookup | Get CVSS scores | Identical scores |
| Affected package match | PURL matching | Same vulnerabilities found |
| KEV flag lookup | Check KEV status | Correct flag status |
### 8.4 VEX Verification
| Test | Description | Pass Criteria |
|------|-------------|---------------|
| Graph revision | Compute revision ID | Identical revision IDs |
| Node/edge counts | Graph structure | Same counts |
| VEX statements | Status determination | Same statuses |
| Consensus computation | Aggregate signals | Same consensus |
| Evidence manifest | Merkle root | Identical roots |
---
## 9. Verification Checklist
### Per-Module Checklist
- [ ] All unit tests pass with PostgreSQL
- [ ] Tenant isolation tests pass
- [ ] Determinism tests pass
- [ ] Performance benchmarks within tolerance
- [ ] Record counts match between MongoDB and PostgreSQL
- [ ] Checksum verification passes for sample data
- [ ] Referential integrity verified
- [ ] Comparison tests pass for all scenarios
- [ ] Load tests pass with acceptable metrics
### Pre-Production Checklist
- [ ] Dual-write monitoring in place
- [ ] Read comparison sampling enabled
- [ ] Rollback procedure tested
- [ ] Performance baselines established
- [ ] Alert thresholds configured
- [ ] Runbook documented
### Post-Switch Checklist
- [ ] No dual-write inconsistencies for 7 days
- [ ] Read comparison sampling shows 100% match
- [ ] Performance within acceptable range
- [ ] No data integrity alerts
- [ ] MongoDB reads disabled
- [ ] MongoDB backups archived
---
## 10. Reporting
### 10.1 Verification Report Template
```markdown
# Database Conversion Verification Report
## Module: [Module Name]
## Date: [YYYY-MM-DD]
## Status: [PASS/FAIL]
### Summary
- Total Tests: X
- Passed: Y
- Failed: Z
### Unit Tests
| Category | Passed | Failed | Notes |
|----------|--------|--------|-------|
| CRUD | | | |
| Isolation| | | |
| Determinism | | | |
### Comparison Tests
| Test | Status | Notes |
|------|--------|-------|
| | | |
### Performance
| Operation | Mongo | Postgres | Diff |
|-----------|-------|----------|------|
| | | | |
### Data Integrity
- Record count match: [YES/NO]
- Checksum verification: [PASS/FAIL]
- Referential integrity: [PASS/FAIL]
### Sign-off
- [ ] QA Engineer
- [ ] Tech Lead
- [ ] Product Owner
```
---
*Document Version: 1.0.0*
*Last Updated: 2025-11-28*

---

New file, 404 lines:
# Phase 0: Foundations
**Sprint:** 1
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** None
---
## Objectives
1. Provision PostgreSQL cluster for staging and production
2. Create shared infrastructure library (`StellaOps.Infrastructure.Postgres`)
3. Set up CI/CD pipeline for PostgreSQL migrations
4. Establish Testcontainers-based integration testing
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| PostgreSQL cluster | Running in staging with proper configuration |
| Shared library | DataSource, migrations, extensions implemented |
| CI pipeline | PostgreSQL tests running on every PR |
| Documentation | SPECIFICATION.md, RULES.md reviewed and approved |
---
## Task Breakdown
### T0.1: PostgreSQL Cluster Provisioning
**Status:** TODO
**Assignee:** TBD
**Estimate:** 2 days
**Description:**
Provision PostgreSQL 16+ cluster with appropriate configuration for StellaOps workload.
**Subtasks:**
- [ ] T0.1.1: Select PostgreSQL hosting (managed vs self-hosted)
- [ ] T0.1.2: Create staging cluster with single primary
- [ ] T0.1.3: Configure connection pooling (PgBouncer or built-in)
- [ ] T0.1.4: Set up backup and restore procedures
- [ ] T0.1.5: Configure monitoring (pg_stat_statements, Prometheus exporter)
- [ ] T0.1.6: Document connection strings and access credentials
- [ ] T0.1.7: Configure SSL/TLS for connections
**Configuration Requirements:**
```
PostgreSQL Version: 16+
Max Connections: 100 (via pooler: 500)
Shared Buffers: 25% of RAM
Work Mem: 64MB
Maintenance Work Mem: 512MB
WAL Level: replica
Max WAL Size: 2GB
```
**Verification:**
- [ ] Can connect from development machines
- [ ] Can connect from CI/CD runners
- [ ] Monitoring dashboard shows metrics
- [ ] Backup tested and verified
---
### T0.2: Create StellaOps.Infrastructure.Postgres Library
**Status:** TODO
**Assignee:** TBD
**Estimate:** 3 days
**Description:**
Create shared library with reusable PostgreSQL infrastructure components.
**Subtasks:**
- [ ] T0.2.1: Create project `src/Shared/StellaOps.Infrastructure.Postgres/`
- [ ] T0.2.2: Add Npgsql NuGet package reference
- [ ] T0.2.3: Implement `DataSourceBase` abstract class
- [ ] T0.2.4: Implement `IPostgresMigration` interface
- [ ] T0.2.5: Implement `PostgresMigrationRunner` class
- [ ] T0.2.6: Implement `NpgsqlExtensions` helper methods
- [ ] T0.2.7: Implement `ServiceCollectionExtensions` for DI
- [ ] T0.2.8: Add XML documentation to all public APIs
- [ ] T0.2.9: Add unit tests for migration runner
**Files to Create:**
```
src/Shared/StellaOps.Infrastructure.Postgres/
├── StellaOps.Infrastructure.Postgres.csproj
├── DataSourceBase.cs
├── PostgresOptions.cs
├── Migrations/
│ ├── IPostgresMigration.cs
│ └── PostgresMigrationRunner.cs
├── Extensions/
│ ├── NpgsqlExtensions.cs
│ └── NpgsqlCommandExtensions.cs
└── ServiceCollectionExtensions.cs
```
**DataSourceBase Implementation:**
```csharp
public abstract class DataSourceBase : IAsyncDisposable
{
protected readonly NpgsqlDataSource DataSource;
protected readonly PostgresOptions Options;
protected DataSourceBase(IOptions<PostgresOptions> options)
{
Options = options.Value;
var builder = new NpgsqlDataSourceBuilder(Options.ConnectionString);
ConfigureDataSource(builder);
DataSource = builder.Build();
}
protected virtual void ConfigureDataSource(NpgsqlDataSourceBuilder builder)
{
// Override in derived classes for module-specific config
}
public async Task<NpgsqlConnection> OpenConnectionAsync(
string tenantId,
CancellationToken cancellationToken = default)
{
var connection = await DataSource.OpenConnectionAsync(cancellationToken);
await ConfigureSessionAsync(connection, tenantId, cancellationToken);
return connection;
}
    protected virtual async Task ConfigureSessionAsync(
        NpgsqlConnection connection,
        string tenantId,
        CancellationToken cancellationToken)
    {
        // Use set_config with bound parameters: SET does not accept parameters,
        // so interpolating tenantId into SQL would be an injection vector.
        await using var cmd = connection.CreateCommand();
        cmd.CommandText = """
            SELECT set_config('app.tenant_id', @tenant_id, false),
                   set_config('timezone', 'UTC', false),
                   set_config('statement_timeout', @timeout, false);
            """;
        cmd.Parameters.AddWithValue("tenant_id", tenantId);
        cmd.Parameters.AddWithValue("timeout", $"{Options.CommandTimeoutSeconds}s");
        await cmd.ExecuteNonQueryAsync(cancellationToken);
    }
public async ValueTask DisposeAsync()
{
await DataSource.DisposeAsync();
GC.SuppressFinalize(this);
}
}
```
**Verification:**
- [ ] Project builds without errors
- [ ] Unit tests pass
- [ ] Can be referenced from module projects
---
### T0.3: Migration Framework Implementation
**Status:** TODO
**Assignee:** TBD
**Estimate:** 2 days
**Description:**
Implement idempotent migration framework for schema management.
**Subtasks:**
- [ ] T0.3.1: Define `IPostgresMigration` interface
- [ ] T0.3.2: Implement `PostgresMigrationRunner` with transaction support
- [ ] T0.3.3: Implement migration tracking table (`_migrations`)
- [ ] T0.3.4: Add `IHostedService` for automatic migration on startup
- [ ] T0.3.5: Add CLI command for manual migration execution
- [ ] T0.3.6: Add migration rollback support (optional)
**Migration Interface:**
```csharp
public interface IPostgresMigration
{
/// <summary>
/// Unique migration identifier (e.g., "V001_CreateAuthoritySchema")
/// </summary>
string Id { get; }
/// <summary>
/// Human-readable description
/// </summary>
string Description { get; }
/// <summary>
/// Apply the migration
/// </summary>
Task UpAsync(NpgsqlConnection connection, CancellationToken cancellationToken);
/// <summary>
/// Rollback the migration (optional)
/// </summary>
Task DownAsync(NpgsqlConnection connection, CancellationToken cancellationToken);
}
```
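The runner behind this interface can be sketched as follows. This is a minimal outline: the `_migrations` tracking-table shape is an assumption until T0.3.3 fixes it, and Npgsql commands automatically participate in the connection's open transaction, so no explicit `Transaction` assignment is needed.

```csharp
public sealed class PostgresMigrationRunner
{
    private readonly NpgsqlDataSource _dataSource;
    private readonly IReadOnlyList<IPostgresMigration> _migrations;

    public PostgresMigrationRunner(
        NpgsqlDataSource dataSource,
        IEnumerable<IPostgresMigration> migrations)
    {
        _dataSource = dataSource;
        // Ordinal ordering makes V001 < V002 < ... deterministic.
        _migrations = migrations.OrderBy(m => m.Id, StringComparer.Ordinal).ToList();
    }

    public async Task RunAsync(CancellationToken ct)
    {
        await using var connection = await _dataSource.OpenConnectionAsync(ct);

        // Tracking table: one row per applied migration, created on first run.
        await using (var create = connection.CreateCommand())
        {
            create.CommandText = """
                CREATE TABLE IF NOT EXISTS _migrations (
                    id         TEXT PRIMARY KEY,
                    applied_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
                );
                """;
            await create.ExecuteNonQueryAsync(ct);
        }

        foreach (var migration in _migrations)
        {
            // Each migration runs in its own transaction so a failure rolls
            // back cleanly while earlier migrations stay applied.
            await using var tx = await connection.BeginTransactionAsync(ct);

            await using var check = connection.CreateCommand();
            check.CommandText = "SELECT 1 FROM _migrations WHERE id = @id";
            check.Parameters.AddWithValue("id", migration.Id);
            if (await check.ExecuteScalarAsync(ct) is not null)
            {
                await tx.RollbackAsync(ct);
                continue; // already applied — re-running is a no-op (idempotent)
            }

            await migration.UpAsync(connection, ct);

            await using var record = connection.CreateCommand();
            record.CommandText = "INSERT INTO _migrations (id) VALUES (@id)";
            record.Parameters.AddWithValue("id", migration.Id);
            await record.ExecuteNonQueryAsync(ct);

            await tx.CommitAsync(ct);
        }
    }
}
```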
**Verification:**
- [ ] Migrations run idempotently (can run multiple times)
- [ ] Migration state tracked correctly
- [ ] Failed migrations roll back cleanly
---
### T0.4: CI/CD Pipeline Configuration
**Status:** TODO
**Assignee:** TBD
**Estimate:** 2 days
**Description:**
Add PostgreSQL integration testing to CI/CD pipeline.
**Subtasks:**
- [ ] T0.4.1: Add Testcontainers.PostgreSql NuGet package to test projects
- [ ] T0.4.2: Create `PostgresTestFixture` base class
- [ ] T0.4.3: Update CI workflow to support PostgreSQL containers
- [ ] T0.4.4: Add parallel test execution configuration
- [ ] T0.4.5: Add test coverage reporting for PostgreSQL code
**PostgresTestFixture:**
```csharp
public sealed class PostgresTestFixture : IAsyncLifetime
{
private readonly PostgreSqlContainer _container;
private NpgsqlDataSource? _dataSource;
public PostgresTestFixture()
{
_container = new PostgreSqlBuilder()
.WithImage("postgres:16-alpine")
.WithDatabase("stellaops_test")
.WithUsername("test")
.WithPassword("test")
.WithWaitStrategy(Wait.ForUnixContainer()
.UntilPortIsAvailable(5432))
.Build();
}
public string ConnectionString => _container.GetConnectionString();
public NpgsqlDataSource DataSource => _dataSource
?? throw new InvalidOperationException("Fixture not initialized");
public async Task InitializeAsync()
{
await _container.StartAsync();
_dataSource = NpgsqlDataSource.Create(ConnectionString);
}
public async Task DisposeAsync()
{
if (_dataSource is not null)
await _dataSource.DisposeAsync();
await _container.DisposeAsync();
}
}
```
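A test class consumes the fixture through xUnit's `IClassFixture`; the `Trait` lines up with the `Category=PostgresIntegration` filter used by the CI step below. The repository-less smoke test here is illustrative:

```csharp
[Trait("Category", "PostgresIntegration")]
public sealed class ExampleRepositoryTests : IClassFixture<PostgresTestFixture>
{
    private readonly PostgresTestFixture _fixture;

    public ExampleRepositoryTests(PostgresTestFixture fixture) => _fixture = fixture;

    [Fact]
    public async Task Connection_Opens_Against_Container()
    {
        // The fixture starts one container per test class; tests in the
        // class share it, so keep per-test state in distinct schemas/tables.
        await using var connection = await _fixture.DataSource.OpenConnectionAsync();
        await using var cmd = connection.CreateCommand();
        cmd.CommandText = "SELECT 1";
        Assert.Equal(1, await cmd.ExecuteScalarAsync());
    }
}
```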
**CI Workflow Update:**
```yaml
# .gitea/workflows/build-test-deploy.yml
- name: Run PostgreSQL Integration Tests
run: |
dotnet test src/StellaOps.sln \
--filter "Category=PostgresIntegration" \
--logger "trx;LogFileName=postgres-test-results.trx"
env:
TESTCONTAINERS_RYUK_DISABLED: true
```
**Verification:**
- [ ] CI pipeline runs PostgreSQL tests
- [ ] Tests can run in parallel without conflicts
- [ ] Test results reported correctly
---
### T0.5: Persistence Configuration
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Add persistence backend configuration to all services.
**Subtasks:**
- [ ] T0.5.1: Define `PersistenceOptions` class
- [ ] T0.5.2: Add configuration section to `appsettings.json`
- [ ] T0.5.3: Update service registration to read persistence config
- [ ] T0.5.4: Add configuration validation on startup
**PersistenceOptions:**
```csharp
public sealed class PersistenceOptions
{
public const string SectionName = "Persistence";
public string Authority { get; set; } = "Mongo";
public string Scheduler { get; set; } = "Mongo";
public string Concelier { get; set; } = "Mongo";
public string Excititor { get; set; } = "Mongo";
public string Notify { get; set; } = "Mongo";
public string Policy { get; set; } = "Mongo";
}
```
**Configuration Template:**
```json
{
"Persistence": {
"Authority": "Mongo",
"Scheduler": "Mongo",
"Concelier": "Mongo",
"Excititor": "Mongo",
"Notify": "Mongo",
"Policy": "Mongo"
},
"Postgres": {
"ConnectionString": "Host=localhost;Database=stellaops;Username=stellaops;Password=secret",
"CommandTimeoutSeconds": 30,
"ConnectionTimeoutSeconds": 15
}
}
```
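Startup validation (T0.5.4) can be sketched like this, rejecting unknown backend names before any repository is resolved. The allowed set and the validator shape are assumptions, not a settled API:

```csharp
public static class PersistenceOptionsValidator
{
    private static readonly HashSet<string> AllowedBackends =
        new(StringComparer.OrdinalIgnoreCase) { "Mongo", "Postgres" };

    public static void Validate(PersistenceOptions options)
    {
        // Fail fast: a typo like "Postgre" should stop startup, not fall
        // back silently to a default backend.
        foreach (var (module, backend) in new[]
        {
            (nameof(options.Authority), options.Authority),
            (nameof(options.Scheduler), options.Scheduler),
            (nameof(options.Concelier), options.Concelier),
            (nameof(options.Excititor), options.Excititor),
            (nameof(options.Notify), options.Notify),
            (nameof(options.Policy), options.Policy),
        })
        {
            if (!AllowedBackends.Contains(backend))
            {
                throw new OptionsValidationException(
                    PersistenceOptions.SectionName,
                    typeof(PersistenceOptions),
                    new[] { $"Persistence:{module} must be one of [{string.Join(", ", AllowedBackends)}], got '{backend}'." });
            }
        }
    }
}
```

Wire it up via `services.AddOptions<PersistenceOptions>().Validate(...)` with `ValidateOnStart()` so misconfiguration surfaces at boot rather than on first request.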
**Verification:**
- [ ] Configuration loads correctly
- [ ] Invalid configuration throws on startup
- [ ] Environment variables can override settings
---
### T0.6: Documentation Review
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Review and finalize database documentation.
**Subtasks:**
- [ ] T0.6.1: Review SPECIFICATION.md for completeness
- [ ] T0.6.2: Review RULES.md for clarity
- [ ] T0.6.3: Review VERIFICATION.md for test coverage
- [ ] T0.6.4: Get Architecture Team sign-off
- [ ] T0.6.5: Publish to team wiki/docs site
**Verification:**
- [ ] All documents reviewed by 2+ team members
- [ ] No outstanding questions or TODOs
- [ ] Architecture Team approval received
---
## Exit Criteria
- [ ] PostgreSQL cluster running and accessible
- [ ] `StellaOps.Infrastructure.Postgres` library implemented and tested
- [ ] CI pipeline running PostgreSQL integration tests
- [ ] Persistence configuration framework in place
- [ ] Documentation reviewed and approved
---
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| PostgreSQL provisioning delays | Medium | High | Start early, have backup plan |
| Testcontainers compatibility issues | Low | Medium | Test on CI runners early |
| Configuration complexity | Low | Low | Use existing patterns from Orchestrator |
---
## Dependencies on Later Phases
Phase 0 must complete before any module conversion (Phases 1-6) can begin. The following are required:
1. PostgreSQL cluster operational
2. Shared library published
3. CI pipeline validated
4. Configuration framework deployed
---
## Notes
- Use Orchestrator module as reference for all patterns
- Prioritize getting CI pipeline working early
- Document all configuration decisions
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 1: Authority Module Conversion
**Sprint:** 2
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)
---
## Objectives
1. Create `StellaOps.Authority.Storage.Postgres` project
2. Implement full Authority schema in PostgreSQL
3. Implement all repository interfaces
4. Enable dual-write mode for validation
5. Switch Authority to PostgreSQL-only after verification
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Authority schema | All tables created with indexes |
| Repository implementations | All 9 interfaces implemented |
| Dual-write wrapper | Optional, for safe rollout |
| Integration tests | 100% coverage of CRUD operations |
| Verification report | MongoDB vs PostgreSQL comparison passed |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.1 for complete Authority schema.
**Tables:**
- `authority.tenants`
- `authority.users`
- `authority.roles`
- `authority.user_roles`
- `authority.service_accounts`
- `authority.clients`
- `authority.scopes`
- `authority.tokens`
- `authority.revocations`
- `authority.login_attempts`
- `authority.licenses`
- `authority.license_usage`
---
## Task Breakdown
### T1.1: Create Authority.Storage.Postgres Project
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Create the PostgreSQL storage project for Authority module.
**Subtasks:**
- [ ] T1.1.1: Create project `src/Authority/__Libraries/StellaOps.Authority.Storage.Postgres/`
- [ ] T1.1.2: Add reference to `StellaOps.Infrastructure.Postgres`
- [ ] T1.1.3: Add reference to `StellaOps.Authority.Core`
- [ ] T1.1.4: Create `AuthorityDataSource` class
- [ ] T1.1.5: Create `AuthorityPostgresOptions` class
- [ ] T1.1.6: Create `ServiceCollectionExtensions.cs`
**Project Structure:**
```
src/Authority/__Libraries/StellaOps.Authority.Storage.Postgres/
├── StellaOps.Authority.Storage.Postgres.csproj
├── AuthorityDataSource.cs
├── AuthorityPostgresOptions.cs
├── Repositories/
│ ├── PostgresUserRepository.cs
│ ├── PostgresRoleRepository.cs
│ ├── PostgresServiceAccountRepository.cs
│ ├── PostgresClientRepository.cs
│ ├── PostgresScopeRepository.cs
│ ├── PostgresTokenRepository.cs
│ ├── PostgresRevocationRepository.cs
│ ├── PostgresLoginAttemptRepository.cs
│ └── PostgresLicenseRepository.cs
├── Migrations/
│ └── V001_CreateAuthoritySchema.cs
└── ServiceCollectionExtensions.cs
```
**Verification:**
- [ ] Project builds without errors
- [ ] Can be referenced from Authority.WebService
---
### T1.2: Implement Schema Migrations
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Create PostgreSQL schema migration for Authority tables.
**Subtasks:**
- [ ] T1.2.1: Create `V001_CreateAuthoritySchema` migration
- [ ] T1.2.2: Include all tables from SPECIFICATION.md
- [ ] T1.2.3: Include all indexes
- [ ] T1.2.4: Add seed data for system roles/permissions
- [ ] T1.2.5: Test migration idempotency
**Migration Implementation:**
```csharp
public sealed class V001_CreateAuthoritySchema : IPostgresMigration
{
public string Id => "V001_CreateAuthoritySchema";
public string Description => "Create Authority schema with all tables and indexes";
public async Task UpAsync(NpgsqlConnection connection, CancellationToken ct)
{
await using var cmd = connection.CreateCommand();
cmd.CommandText = AuthoritySchemaSql;
await cmd.ExecuteNonQueryAsync(ct);
}
public Task DownAsync(NpgsqlConnection connection, CancellationToken ct)
=> throw new NotSupportedException("Rollback not supported for schema creation");
private const string AuthoritySchemaSql = """
CREATE SCHEMA IF NOT EXISTS authority;
CREATE TABLE IF NOT EXISTS authority.tenants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
code TEXT NOT NULL UNIQUE,
display_name TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'active'
CHECK (status IN ('active', 'suspended', 'trial', 'terminated')),
settings JSONB DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- ... rest of schema from SPECIFICATION.md
""";
}
```
**Verification:**
- [ ] Migration creates all tables
- [ ] Migration is idempotent
- [ ] Indexes created correctly
---
### T1.3: Implement User Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Implement `IUserRepository` for PostgreSQL.
**Subtasks:**
- [ ] T1.3.1: Implement `GetByIdAsync`
- [ ] T1.3.2: Implement `GetByUsernameAsync`
- [ ] T1.3.3: Implement `GetBySubjectIdAsync`
- [ ] T1.3.4: Implement `ListAsync` with pagination
- [ ] T1.3.5: Implement `CreateAsync`
- [ ] T1.3.6: Implement `UpdateAsync`
- [ ] T1.3.7: Implement `DeleteAsync`
- [ ] T1.3.8: Implement `GetRolesAsync`
- [ ] T1.3.9: Implement `AssignRoleAsync`
- [ ] T1.3.10: Implement `RevokeRoleAsync`
- [ ] T1.3.11: Write integration tests
**Interface Reference:**
```csharp
public interface IUserRepository
{
Task<User?> GetByIdAsync(string tenantId, Guid userId, CancellationToken ct);
Task<User?> GetByUsernameAsync(string tenantId, string username, CancellationToken ct);
Task<User?> GetBySubjectIdAsync(Guid subjectId, CancellationToken ct);
Task<PagedResult<User>> ListAsync(string tenantId, UserQuery query, CancellationToken ct);
Task<User> CreateAsync(User user, CancellationToken ct);
Task<User> UpdateAsync(User user, CancellationToken ct);
Task<bool> DeleteAsync(string tenantId, Guid userId, CancellationToken ct);
Task<IReadOnlyList<Role>> GetRolesAsync(string tenantId, Guid userId, CancellationToken ct);
Task AssignRoleAsync(string tenantId, Guid userId, Guid roleId, CancellationToken ct);
Task RevokeRoleAsync(string tenantId, Guid userId, Guid roleId, CancellationToken ct);
}
```
**Verification:**
- [ ] All methods implemented
- [ ] Integration tests pass
- [ ] Tenant isolation verified
---
### T1.4: Implement Service Account Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Implement `IServiceAccountRepository` for PostgreSQL.
**Subtasks:**
- [ ] T1.4.1: Implement `GetByIdAsync`
- [ ] T1.4.2: Implement `GetByAccountIdAsync`
- [ ] T1.4.3: Implement `ListAsync`
- [ ] T1.4.4: Implement `CreateAsync`
- [ ] T1.4.5: Implement `UpdateAsync`
- [ ] T1.4.6: Implement `DeleteAsync`
- [ ] T1.4.7: Write integration tests
**Verification:**
- [ ] All methods implemented
- [ ] Integration tests pass
---
### T1.5: Implement Client Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Implement `IClientRepository` for PostgreSQL (OpenIddict compatible).
**Subtasks:**
- [ ] T1.5.1: Implement `GetByIdAsync`
- [ ] T1.5.2: Implement `GetByClientIdAsync`
- [ ] T1.5.3: Implement `ListAsync`
- [ ] T1.5.4: Implement `CreateAsync`
- [ ] T1.5.5: Implement `UpdateAsync`
- [ ] T1.5.6: Implement `DeleteAsync`
- [ ] T1.5.7: Write integration tests
**Verification:**
- [ ] All methods implemented
- [ ] Integration tests pass
---
### T1.6: Implement Token Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Implement `ITokenRepository` for PostgreSQL.
**Subtasks:**
- [ ] T1.6.1: Implement `GetByIdAsync`
- [ ] T1.6.2: Implement `GetByHashAsync`
- [ ] T1.6.3: Implement `CreateAsync`
- [ ] T1.6.4: Implement `RevokeAsync`
- [ ] T1.6.5: Implement `PruneExpiredAsync`
- [ ] T1.6.6: Implement `GetActiveTokensAsync`
- [ ] T1.6.7: Write integration tests
**Verification:**
- [ ] All methods implemented
- [ ] Token lookup by hash is fast
- [ ] Expired token pruning works
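The "fast lookup by hash" criterion above hinges on the query filtering on a stored digest rather than the raw token. A sketch of T1.6.2, assuming a unique index on `(tenant_id, token_hash)` and a hypothetical `MapToken` row mapper; the column list follows SPECIFICATION.md only loosely:

```csharp
public async Task<Token?> GetByHashAsync(string tenantId, string tokenHash, CancellationToken ct)
{
    await using var connection = await _dataSource.OpenConnectionAsync(tenantId, ct);
    await using var cmd = connection.CreateCommand();
    // Point lookup via the unique (tenant_id, token_hash) index;
    // the raw token never reaches the database.
    cmd.CommandText = """
        SELECT id, client_id, subject_id, scopes, status, expires_at, created_at
        FROM authority.tokens
        WHERE tenant_id = @tenant_id AND token_hash = @token_hash;
        """;
    cmd.Parameters.AddWithValue("tenant_id", tenantId);
    cmd.Parameters.AddWithValue("token_hash", tokenHash);

    await using var reader = await cmd.ExecuteReaderAsync(ct);
    return await reader.ReadAsync(ct) ? MapToken(reader) : null;
}
```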
---
### T1.7: Implement Remaining Repositories
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1.5 days
**Description:**
Implement remaining repository interfaces.
**Subtasks:**
- [ ] T1.7.1: Implement `IRoleRepository`
- [ ] T1.7.2: Implement `IScopeRepository`
- [ ] T1.7.3: Implement `IRevocationRepository`
- [ ] T1.7.4: Implement `ILoginAttemptRepository`
- [ ] T1.7.5: Implement `ILicenseRepository`
- [ ] T1.7.6: Write integration tests for all
**Verification:**
- [ ] All repositories implemented
- [ ] All integration tests pass
---
### T1.8: Add Configuration Switch
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Add configuration-based backend selection for Authority.
**Subtasks:**
- [ ] T1.8.1: Update `ServiceCollectionExtensions` in Authority.WebService
- [ ] T1.8.2: Add conditional registration based on `Persistence:Authority`
- [ ] T1.8.3: Test switching between Mongo and Postgres
- [ ] T1.8.4: Document configuration options
**Implementation:**
```csharp
public static IServiceCollection AddAuthorityStorage(
this IServiceCollection services,
IConfiguration configuration)
{
var backend = configuration.GetValue<string>("Persistence:Authority") ?? "Mongo";
return backend.ToLowerInvariant() switch
{
"postgres" => services.AddAuthorityPostgresStorage(configuration),
"mongo" => services.AddAuthorityMongoStorage(configuration),
_ => throw new ArgumentException($"Unknown Authority backend: {backend}")
};
}
```
**Verification:**
- [ ] Can switch between backends via configuration
- [ ] Invalid configuration throws clear error
---
### T1.9: Implement Dual-Write Wrapper (Optional)
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Implement dual-write repository wrapper for safe migration.
**Subtasks:**
- [ ] T1.9.1: Create `DualWriteUserRepository`
- [ ] T1.9.2: Implement write-to-both logic
- [ ] T1.9.3: Implement read-from-primary-with-fallback logic
- [ ] T1.9.4: Add metrics for dual-write operations
- [ ] T1.9.5: Add logging for inconsistencies
- [ ] T1.9.6: Create similar wrappers for other critical repositories
**Configuration Options:**
```csharp
public sealed class DualWriteOptions
{
public string PrimaryBackend { get; set; } = "Postgres";
public bool WriteToBoth { get; set; } = true;
public bool FallbackToSecondary { get; set; } = true;
public bool ConvertOnRead { get; set; } = true;
}
```
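The write-to-both logic (T1.9.2) can be sketched for one method like this: the primary write is authoritative and must succeed, while a secondary-write failure is logged (and should increment a metric) but never propagated, so MongoDB issues cannot block the PostgreSQL path. The remaining `IUserRepository` members follow the same pattern and are elided here:

```csharp
public sealed class DualWriteUserRepository : IUserRepository
{
    private readonly IUserRepository _primary;   // Postgres
    private readonly IUserRepository _secondary; // Mongo
    private readonly DualWriteOptions _options;
    private readonly ILogger<DualWriteUserRepository> _logger;

    public DualWriteUserRepository(
        IUserRepository primary,
        IUserRepository secondary,
        DualWriteOptions options,
        ILogger<DualWriteUserRepository> logger)
    {
        _primary = primary;
        _secondary = secondary;
        _options = options;
        _logger = logger;
    }

    public async Task<User> CreateAsync(User user, CancellationToken ct)
    {
        // Primary write must succeed; its result is what the caller sees.
        var created = await _primary.CreateAsync(user, ct);

        if (_options.WriteToBoth)
        {
            try
            {
                await _secondary.CreateAsync(user, ct);
            }
            catch (Exception ex)
            {
                // Inconsistency is recorded but never fails the caller.
                _logger.LogWarning(ex,
                    "Dual-write: secondary CreateAsync failed for user {UserId}", user.Id);
            }
        }

        return created;
    }

    // ... other IUserRepository members elided; same pattern.
}
```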
**Verification:**
- [ ] Writes go to both backends
- [ ] Reads work with fallback
- [ ] Inconsistencies are logged
---
### T1.10: Run Verification Tests
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Verify PostgreSQL implementation matches MongoDB behavior.
**Subtasks:**
- [ ] T1.10.1: Run comparison tests for User repository
- [ ] T1.10.2: Run comparison tests for Token repository
- [ ] T1.10.3: Verify token issuance/verification flow
- [ ] T1.10.4: Verify login flow
- [ ] T1.10.5: Document any differences found
- [ ] T1.10.6: Generate verification report
**Verification Tests:**
```csharp
[Fact]
public async Task Users_Should_Match_Between_Mongo_And_Postgres()
{
var tenantIds = await GetSampleTenantIds(10);
foreach (var tenantId in tenantIds)
{
        var mongoUsers = await _mongoRepo.ListAsync(tenantId, new UserQuery(), CancellationToken.None);
        var postgresUsers = await _postgresRepo.ListAsync(tenantId, new UserQuery(), CancellationToken.None);
postgresUsers.Items.Should().BeEquivalentTo(mongoUsers.Items,
options => options.Excluding(u => u.Id));
}
}
```
**Verification:**
- [ ] All comparison tests pass
- [ ] No data discrepancies found
- [ ] Verification report approved
---
### T1.11: Backfill Data (If Required)
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Backfill existing MongoDB data to PostgreSQL.
**Subtasks:**
- [ ] T1.11.1: Create backfill script for tenants
- [ ] T1.11.2: Create backfill script for users
- [ ] T1.11.3: Create backfill script for service accounts
- [ ] T1.11.4: Create backfill script for clients/scopes
- [ ] T1.11.5: Create backfill script for active tokens
- [ ] T1.11.6: Verify record counts match
- [ ] T1.11.7: Verify sample records match
**Verification:**
- [ ] All Tier A data backfilled
- [ ] Record counts match
- [ ] Sample verification passed
---
### T1.12: Switch to PostgreSQL-Only
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Description:**
Switch Authority to PostgreSQL-only mode.
**Subtasks:**
- [ ] T1.12.1: Update configuration to `"Authority": "Postgres"`
- [ ] T1.12.2: Deploy to staging
- [ ] T1.12.3: Run full integration test suite
- [ ] T1.12.4: Monitor for errors/issues
- [ ] T1.12.5: Deploy to production
- [ ] T1.12.6: Monitor production metrics
**Verification:**
- [ ] All tests pass in staging
- [ ] No errors in production
- [ ] Performance metrics acceptable
---
## Exit Criteria
- [ ] All repository interfaces implemented for PostgreSQL
- [ ] All integration tests pass
- [ ] Verification tests pass (MongoDB vs PostgreSQL comparison)
- [ ] Configuration switch working
- [ ] Authority running on PostgreSQL in production
- [ ] MongoDB Authority collections archived
---
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Token verification regression | Low | High | Extensive testing, dual-write |
| OAuth flow breakage | Low | High | Test all OAuth flows |
| Performance regression | Medium | Medium | Load testing before switch |
---
## Rollback Plan
1. Change configuration: `"Authority": "Mongo"`
2. Deploy configuration change
3. MongoDB still has all data (dual-write period)
4. Investigate and fix PostgreSQL issues
5. Re-attempt conversion
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 2: Scheduler Module Conversion
**Sprint:** 3
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)
---
## Objectives
1. Create `StellaOps.Scheduler.Storage.Postgres` project
2. Implement Scheduler schema in PostgreSQL
3. Implement 7+ repository interfaces
4. Replace MongoDB job tracking with PostgreSQL
5. Implement PostgreSQL advisory locks for distributed locking
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Scheduler schema | All tables created with indexes |
| Repository implementations | All 7+ interfaces implemented |
| Advisory locks | Distributed locking working |
| Integration tests | 100% coverage of CRUD operations |
| Verification report | Schedule execution verified |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.4 for complete Scheduler schema.
**Tables:**
- `scheduler.schedules`
- `scheduler.triggers`
- `scheduler.runs`
- `scheduler.graph_jobs`
- `scheduler.policy_jobs`
- `scheduler.impact_snapshots`
- `scheduler.workers`
- `scheduler.execution_logs`
- `scheduler.locks`
- `scheduler.run_summaries`
- `scheduler.audit`
---
## Task Breakdown
### T2.1: Create Scheduler.Storage.Postgres Project
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.1.1: Create project structure
- [ ] T2.1.2: Add NuGet references
- [ ] T2.1.3: Create `SchedulerDataSource` class
- [ ] T2.1.4: Create `ServiceCollectionExtensions.cs`
---
### T2.2: Implement Schema Migrations
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Subtasks:**
- [ ] T2.2.1: Create `V001_CreateSchedulerSchema` migration
- [ ] T2.2.2: Include all tables and indexes
- [ ] T2.2.3: Add partial index for active schedules
- [ ] T2.2.4: Test migration idempotency
---
### T2.3: Implement Schedule Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Interface:**
```csharp
public interface IScheduleRepository
{
Task<Schedule?> GetAsync(string tenantId, string scheduleId, CancellationToken ct);
Task<IReadOnlyList<Schedule>> ListAsync(string tenantId, ScheduleQueryOptions? options, CancellationToken ct);
Task UpsertAsync(Schedule schedule, CancellationToken ct);
Task<bool> SoftDeleteAsync(string tenantId, string scheduleId, string deletedBy, DateTimeOffset deletedAt, CancellationToken ct);
Task<IReadOnlyList<Schedule>> GetDueSchedulesAsync(DateTimeOffset now, CancellationToken ct);
}
```
**Subtasks:**
- [ ] T2.3.1: Implement all interface methods
- [ ] T2.3.2: Handle soft delete correctly
- [ ] T2.3.3: Implement GetDueSchedules for trigger calculation
- [ ] T2.3.4: Write integration tests
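`GetDueSchedulesAsync` (T2.3.3) is the hot path behind trigger calculation. A sketch that lines up with the partial index on active schedules from T2.2.3; the `enabled`, `deleted_at`, and `next_run_at` column names and the `MapSchedule` helper are assumptions:

```csharp
public async Task<IReadOnlyList<Schedule>> GetDueSchedulesAsync(
    DateTimeOffset now, CancellationToken ct)
{
    await using var connection = await _dataSource.OpenConnectionAsync("system", ct);
    await using var cmd = connection.CreateCommand();
    // Predicate matches the partial index: enabled AND deleted_at IS NULL.
    cmd.CommandText = """
        SELECT id, tenant_id, name, cron_expression, next_run_at
        FROM scheduler.schedules
        WHERE enabled
          AND deleted_at IS NULL
          AND next_run_at <= @now
        ORDER BY next_run_at;
        """;
    cmd.Parameters.AddWithValue("now", now);

    var results = new List<Schedule>();
    await using var reader = await cmd.ExecuteReaderAsync(ct);
    while (await reader.ReadAsync(ct))
        results.Add(MapSchedule(reader));
    return results;
}
```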
---
### T2.4: Implement Run Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Interface:**
```csharp
public interface IRunRepository
{
Task<Run?> GetAsync(string tenantId, Guid runId, CancellationToken ct);
Task<IReadOnlyList<Run>> ListAsync(string tenantId, RunQueryOptions? options, CancellationToken ct);
Task<Run> CreateAsync(Run run, CancellationToken ct);
Task<Run> UpdateAsync(Run run, CancellationToken ct);
Task<IReadOnlyList<Run>> GetPendingRunsAsync(string tenantId, CancellationToken ct);
Task<IReadOnlyList<Run>> GetRunsByScheduleAsync(string tenantId, Guid scheduleId, int limit, CancellationToken ct);
}
```
**Subtasks:**
- [ ] T2.4.1: Implement all interface methods
- [ ] T2.4.2: Handle state transitions
- [ ] T2.4.3: Implement efficient pagination
- [ ] T2.4.4: Write integration tests
---
### T2.5: Implement Graph Job Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.5.1: Implement CRUD operations
- [ ] T2.5.2: Implement status queries
- [ ] T2.5.3: Write integration tests
---
### T2.6: Implement Policy Job Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.6.1: Implement CRUD operations
- [ ] T2.6.2: Implement status queries
- [ ] T2.6.3: Write integration tests
---
### T2.7: Implement Impact Snapshot Repository
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.7.1: Implement CRUD operations
- [ ] T2.7.2: Implement queries by run
- [ ] T2.7.3: Write integration tests
---
### T2.8: Implement Distributed Locking
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Description:**
Implement distributed locking using PostgreSQL advisory locks.
**Options:**
1. PostgreSQL advisory locks (`pg_advisory_lock`)
2. Table-based locks with SELECT FOR UPDATE SKIP LOCKED
3. Combination approach
**Subtasks:**
- [ ] T2.8.1: Choose locking strategy
- [ ] T2.8.2: Implement `IDistributedLock` interface
- [ ] T2.8.3: Implement lock acquisition with timeout
- [ ] T2.8.4: Implement lock renewal
- [ ] T2.8.5: Implement lock release
- [ ] T2.8.6: Write concurrency tests
**Implementation Example:**
```csharp
public sealed class PostgresDistributedLock : IDistributedLock
{
private readonly SchedulerDataSource _dataSource;
public async Task<IAsyncDisposable?> TryAcquireAsync(
string lockKey,
TimeSpan timeout,
CancellationToken ct)
{
var lockId = ComputeLockId(lockKey);
await using var connection = await _dataSource.OpenConnectionAsync("system", ct);
await using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT pg_try_advisory_lock(@lock_id)";
cmd.Parameters.AddWithValue("lock_id", lockId);
var acquired = await cmd.ExecuteScalarAsync(ct) is true;
if (!acquired) return null;
return new LockHandle(connection, lockId);
}
private static long ComputeLockId(string key)
=> unchecked((long)key.GetHashCode());
}
```
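The `LockHandle` returned by `TryAcquireAsync` must keep its connection alive for the lifetime of the lock, since advisory locks are tied to the session. A sketch:

```csharp
private sealed class LockHandle : IAsyncDisposable
{
    private readonly NpgsqlConnection _connection;
    private readonly long _lockId;

    public LockHandle(NpgsqlConnection connection, long lockId)
    {
        _connection = connection;
        _lockId = lockId;
    }

    public async ValueTask DisposeAsync()
    {
        // Explicit unlock is optional — closing the session releases the
        // advisory lock anyway — but it makes intent visible in server logs.
        await using (var cmd = _connection.CreateCommand())
        {
            cmd.CommandText = "SELECT pg_advisory_unlock(@lock_id)";
            cmd.Parameters.AddWithValue("lock_id", _lockId);
            await cmd.ExecuteNonQueryAsync();
        }
        await _connection.DisposeAsync();
    }
}
```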
---
### T2.9: Implement Worker Registration
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.9.1: Implement worker registration
- [ ] T2.9.2: Implement heartbeat updates
- [ ] T2.9.3: Implement dead worker detection
- [ ] T2.9.4: Write integration tests
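Dead-worker detection (T2.9.3) can be a periodic query against the heartbeat column. A fragment sketch; the 90-second cutoff and the `status`/`last_heartbeat_at` column names are assumptions:

```csharp
// Mark workers dead whose heartbeat is older than the cutoff, returning
// their ids so any runs they owned can be requeued.
cmd.CommandText = """
    UPDATE scheduler.workers
    SET status = 'dead'
    WHERE status = 'active'
      AND last_heartbeat_at < NOW() - INTERVAL '90 seconds'
    RETURNING id;
    """;
```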
---
### T2.10: Add Configuration Switch
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.10.1: Update service registration
- [ ] T2.10.2: Test backend switching
- [ ] T2.10.3: Document configuration
---
### T2.11: Run Verification Tests
**Status:** TODO
**Assignee:** TBD
**Estimate:** 1 day
**Subtasks:**
- [ ] T2.11.1: Test schedule CRUD
- [ ] T2.11.2: Test run creation and state transitions
- [ ] T2.11.3: Test trigger calculation
- [ ] T2.11.4: Test distributed locking under concurrency
- [ ] T2.11.5: Test job execution end-to-end
- [ ] T2.11.6: Generate verification report
---
### T2.12: Switch to PostgreSQL-Only
**Status:** TODO
**Assignee:** TBD
**Estimate:** 0.5 days
**Subtasks:**
- [ ] T2.12.1: Update configuration
- [ ] T2.12.2: Deploy to staging
- [ ] T2.12.3: Run integration tests
- [ ] T2.12.4: Deploy to production
- [ ] T2.12.5: Monitor metrics
---
## Exit Criteria
- [ ] All repository interfaces implemented
- [ ] Distributed locking working correctly
- [ ] All integration tests pass
- [ ] Schedule execution working end-to-end
- [ ] Scheduler running on PostgreSQL in production
---
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Lock contention | Medium | Medium | Test under load, tune timeouts |
| Trigger calculation errors | Low | High | Extensive testing with edge cases |
| State transition bugs | Medium | Medium | State machine tests |
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 3: Notify Module Conversion
**Sprint:** 4
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)
---
## Objectives
1. Create `StellaOps.Notify.Storage.Postgres` project
2. Implement Notify schema in PostgreSQL
3. Implement 15 repository interfaces
4. Handle delivery tracking and escalation state
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Notify schema | All tables created with indexes |
| Repository implementations | All 15 interfaces implemented |
| Integration tests | 100% coverage of CRUD operations |
| Verification report | Notification delivery verified |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.5 for complete Notify schema.
**Tables:**
- `notify.channels`
- `notify.rules`
- `notify.templates`
- `notify.deliveries`
- `notify.digests`
- `notify.quiet_hours`
- `notify.maintenance_windows`
- `notify.escalation_policies`
- `notify.escalation_states`
- `notify.on_call_schedules`
- `notify.inbox`
- `notify.incidents`
- `notify.audit`
---
## Task Breakdown
### T3.1: Create Notify.Storage.Postgres Project
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Create project structure
- [ ] Add NuGet references
- [ ] Create `NotifyDataSource` class
- [ ] Create `ServiceCollectionExtensions.cs`
---
### T3.2: Implement Schema Migrations
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Create schema migration
- [ ] Include all tables and indexes
- [ ] Test migration idempotency
---
### T3.3: Implement Channel Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle channel types (email, slack, teams, etc.)
- [ ] Write integration tests
---
### T3.4: Implement Rule Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle filter JSONB
- [ ] Write integration tests
---
### T3.5: Implement Template Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle localization
- [ ] Write integration tests
---
### T3.6: Implement Delivery Repository
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle status transitions
- [ ] Implement retry logic
- [ ] Write integration tests
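The status-transition subtask above can be sketched as a small guard the repository consults before persisting an update. The enum values and allowed-transition set below are illustrative assumptions, not the final Notify model:

```csharp
public enum DeliveryStatus { Pending, Sending, Delivered, Failed, Retrying, Dropped }

public static class DeliveryTransitions
{
    // Allowed next states per current state (terminal states map to empty).
    private static readonly IReadOnlyDictionary<DeliveryStatus, DeliveryStatus[]> Allowed =
        new Dictionary<DeliveryStatus, DeliveryStatus[]>
        {
            [DeliveryStatus.Pending]   = new[] { DeliveryStatus.Sending, DeliveryStatus.Dropped },
            [DeliveryStatus.Sending]   = new[] { DeliveryStatus.Delivered, DeliveryStatus.Failed },
            [DeliveryStatus.Failed]    = new[] { DeliveryStatus.Retrying, DeliveryStatus.Dropped },
            [DeliveryStatus.Retrying]  = new[] { DeliveryStatus.Sending },
            [DeliveryStatus.Delivered] = Array.Empty<DeliveryStatus>(),
            [DeliveryStatus.Dropped]   = Array.Empty<DeliveryStatus>(),
        };

    public static bool CanTransition(DeliveryStatus from, DeliveryStatus to)
        => Allowed.TryGetValue(from, out var next) && Array.IndexOf(next, to) >= 0;
}
```
In SQL, the same guard becomes an optimistic `UPDATE notify.deliveries SET status = @next WHERE id = @id AND status = @expected`, so concurrent workers cannot double-advance a delivery.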
---
### T3.7: Implement Remaining Repositories
**Status:** TODO
**Estimate:** 2 days
**Subtasks:**
- [ ] Implement Digest repository
- [ ] Implement QuietHours repository
- [ ] Implement MaintenanceWindow repository
- [ ] Implement EscalationPolicy repository
- [ ] Implement EscalationState repository
- [ ] Implement OnCallSchedule repository
- [ ] Implement Inbox repository
- [ ] Implement Incident repository
- [ ] Implement Audit repository
- [ ] Write integration tests for all
---
### T3.8: Add Configuration Switch
**Status:** TODO
**Estimate:** 0.5 days
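A minimal sketch of the switch, following the `Persistence:<Module>` configuration-key pattern assumed throughout this plan (the extension method names are placeholders):

```csharp
var backend = configuration["Persistence:Notify"] ?? "Mongo";
if (string.Equals(backend, "Postgres", StringComparison.OrdinalIgnoreCase))
{
    services.AddNotifyPostgresStorage(configuration.GetSection("Notify:Postgres"));
}
else
{
    services.AddNotifyMongoStorage(configuration.GetSection("Notify:Mongo"));
}
```
Defaulting to `Mongo` keeps existing deployments unchanged until T3.10 flips the default.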
---
### T3.9: Run Verification Tests
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Test notification delivery flow
- [ ] Test escalation handling
- [ ] Test digest aggregation
- [ ] Generate verification report
---
### T3.10: Switch to PostgreSQL-Only
**Status:** TODO
**Estimate:** 0.5 days
---
## Exit Criteria
- [ ] All 13 repository interfaces implemented
- [ ] All integration tests pass
- [ ] Notification delivery working end-to-end
- [ ] Notify running on PostgreSQL in production
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 4: Policy Module Conversion
**Sprint:** 5
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)
---
## Objectives
1. Create `StellaOps.Policy.Storage.Postgres` project
2. Implement Policy schema in PostgreSQL
3. Handle policy pack versioning correctly
4. Implement risk profiles with version history
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Policy schema | All tables created with indexes |
| Repository implementations | All six interfaces implemented (pack, risk profile, evaluation run, explanation, exception, audit) |
| Version management | Pack versioning working correctly |
| Integration tests | 100% coverage of CRUD operations |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.6 for complete Policy schema.
**Tables:**
- `policy.packs`
- `policy.pack_versions`
- `policy.rules`
- `policy.risk_profiles`
- `policy.evaluation_runs`
- `policy.explanations`
- `policy.exceptions`
- `policy.audit`
---
## Task Breakdown
### T4.1: Create Policy.Storage.Postgres Project
**Status:** TODO
**Estimate:** 0.5 days
---
### T4.2: Implement Schema Migrations
**Status:** TODO
**Estimate:** 1 day
---
### T4.3: Implement Pack Repository
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Implement CRUD for packs
- [ ] Implement version management
- [ ] Handle active version promotion
- [ ] Write integration tests
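Active-version promotion must be atomic so `policy.packs` never references a missing version. A hedged sketch, assuming the column names from SPECIFICATION.md Section 5.6 and a plain Npgsql data-source helper:

```csharp
public async Task PromoteVersionAsync(Guid packId, int version, CancellationToken ct)
{
    await using var connection = await _dataSource.OpenConnectionAsync(ct);
    await using var tx = await connection.BeginTransactionAsync(ct);

    // Guard: the target version must exist before the pack points at it.
    await using var verify = new NpgsqlCommand(
        "SELECT 1 FROM policy.pack_versions WHERE pack_id = @pack_id AND version = @version",
        connection, tx);
    verify.Parameters.AddWithValue("pack_id", packId);
    verify.Parameters.AddWithValue("version", version);
    if (await verify.ExecuteScalarAsync(ct) is null)
        throw new InvalidOperationException($"Pack {packId} has no version {version}.");

    await using var promote = new NpgsqlCommand(
        "UPDATE policy.packs SET active_version = @version, updated_at = now() WHERE id = @pack_id",
        connection, tx);
    promote.Parameters.AddWithValue("pack_id", packId);
    promote.Parameters.AddWithValue("version", version);
    await promote.ExecuteNonQueryAsync(ct);

    await tx.CommitAsync(ct);
}
```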
---
### T4.4: Implement Risk Profile Repository
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle version history
- [ ] Implement GetVersionAsync
- [ ] Implement ListVersionsAsync
- [ ] Write integration tests
---
### T4.5: Implement Remaining Repositories
**Status:** TODO
**Estimate:** 1.5 days
**Subtasks:**
- [ ] Implement Evaluation Run repository
- [ ] Implement Explanation repository
- [ ] Implement Exception repository
- [ ] Implement Audit repository
- [ ] Write integration tests
---
### T4.6: Add Configuration Switch
**Status:** TODO
**Estimate:** 0.5 days
---
### T4.7: Run Verification Tests
**Status:** TODO
**Estimate:** 1 day
---
### T4.8: Migrate Active Policy Packs
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Export active packs from MongoDB
- [ ] Import to PostgreSQL
- [ ] Verify version numbers
- [ ] Verify active version settings
---
### T4.9: Switch to PostgreSQL-Only
**Status:** TODO
**Estimate:** 0.5 days
---
## Exit Criteria
- [ ] All repository interfaces implemented
- [ ] Pack versioning working correctly
- [ ] All integration tests pass
- [ ] Policy running on PostgreSQL in production
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 5: Vulnerability Index Conversion (Concelier)
**Sprint:** 6-7
**Duration:** 2 sprints
**Status:** TODO
**Dependencies:** Phase 0 (Foundations)
---
## Objectives
1. Create `StellaOps.Concelier.Storage.Postgres` project
2. Implement full vulnerability schema in PostgreSQL
3. Build advisory conversion pipeline
4. Maintain deterministic vulnerability matching
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Vuln schema | All tables created with indexes |
| Conversion pipeline | MongoDB advisories converted to PostgreSQL |
| Matching verification | Same CVEs found for identical SBOMs |
| Integration tests | 100% coverage of query operations |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.2 for complete vulnerability schema.
**Tables:**
- `vuln.sources`
- `vuln.feed_snapshots`
- `vuln.advisory_snapshots`
- `vuln.advisories`
- `vuln.advisory_aliases`
- `vuln.advisory_cvss`
- `vuln.advisory_affected`
- `vuln.advisory_references`
- `vuln.advisory_credits`
- `vuln.advisory_weaknesses`
- `vuln.kev_flags`
- `vuln.source_states`
- `vuln.merge_events`
---
## Sprint 5a: Schema & Repositories
### T5a.1: Create Concelier.Storage.Postgres Project
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Create project structure
- [ ] Add NuGet references
- [ ] Create `ConcelierDataSource` class
- [ ] Create `ServiceCollectionExtensions.cs`
---
### T5a.2: Implement Schema Migrations
**Status:** TODO
**Estimate:** 1.5 days
**Subtasks:**
- [ ] Create schema migration
- [ ] Include all tables
- [ ] Add full-text search index
- [ ] Add PURL lookup index
- [ ] Test migration idempotency
---
### T5a.3: Implement Source Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement GetByKeyAsync
- [ ] Write integration tests
---
### T5a.4: Implement Advisory Repository
**Status:** TODO
**Estimate:** 2 days
**Interface:**
```csharp
public interface IAdvisoryRepository
{
Task<Advisory?> GetByKeyAsync(string advisoryKey, CancellationToken ct);
Task<Advisory?> GetByAliasAsync(string aliasType, string aliasValue, CancellationToken ct);
Task<IReadOnlyList<Advisory>> SearchAsync(AdvisorySearchQuery query, CancellationToken ct);
Task<Advisory> UpsertAsync(Advisory advisory, CancellationToken ct);
Task<IReadOnlyList<Advisory>> GetAffectingPackageAsync(string purl, CancellationToken ct);
Task<IReadOnlyList<Advisory>> GetAffectingPackageNameAsync(string ecosystem, string name, CancellationToken ct);
}
```
**Subtasks:**
- [ ] Implement GetByKeyAsync
- [ ] Implement GetByAliasAsync (CVE lookup)
- [ ] Implement SearchAsync with full-text search
- [ ] Implement UpsertAsync with all child tables
- [ ] Implement GetAffectingPackageAsync (PURL match)
- [ ] Implement GetAffectingPackageNameAsync
- [ ] Write integration tests
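The PURL match path reduces to a single indexed join. The sketch below returns advisory keys only and assumes `vuln.advisory_affected.advisory_id` references `vuln.advisories.id`; hydrating full `Advisory` rows is elided:

```csharp
public async Task<IReadOnlyList<string>> GetAffectingAdvisoryKeysAsync(
    string purl, CancellationToken ct)
{
    await using var connection = await _dataSource.OpenConnectionAsync(ct);
    await using var command = new NpgsqlCommand("""
        SELECT DISTINCT a.advisory_key
        FROM vuln.advisories a
        JOIN vuln.advisory_affected af ON af.advisory_id = a.id
        WHERE af.purl = @purl
        ORDER BY a.advisory_key     -- deterministic result order
        """, connection);
    command.Parameters.AddWithValue("purl", purl);

    var keys = new List<string>();
    await using var reader = await command.ExecuteReaderAsync(ct);
    while (await reader.ReadAsync(ct))
    {
        keys.Add(reader.GetString(0));
    }
    return keys;
}
```
The `ORDER BY` is deliberate: matching results must be deterministic for the Phase 5b comparison tests.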
---
### T5a.5: Implement Child Table Repositories
**Status:** TODO
**Estimate:** 2 days
**Subtasks:**
- [ ] Implement Alias repository
- [ ] Implement CVSS repository
- [ ] Implement Affected repository
- [ ] Implement Reference repository
- [ ] Implement Credit repository
- [ ] Implement Weakness repository
- [ ] Implement KEV repository
- [ ] Write integration tests
---
### T5a.6: Implement Source State Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement cursor management
- [ ] Write integration tests
---
## Sprint 5b: Conversion & Verification
### T5b.1: Build Advisory Conversion Service
**Status:** TODO
**Estimate:** 2 days
**Description:**
Create service to convert MongoDB advisory documents to PostgreSQL relational structure.
**Subtasks:**
- [ ] Parse MongoDB `AdvisoryDocument` structure
- [ ] Map to `vuln.advisories` table
- [ ] Extract and normalize aliases
- [ ] Extract and normalize CVSS metrics
- [ ] Extract and normalize affected packages
- [ ] Preserve provenance JSONB
- [ ] Handle version ranges (keep as JSONB)
- [ ] Handle normalized versions (keep as JSONB)
**Conversion Logic:**
```csharp
public sealed class AdvisoryConverter
{
    public async Task ConvertAsync(
        IMongoCollection<AdvisoryDocument> source,
        IAdvisoryRepository target,
        CancellationToken ct)
    {
        // Stream the collection through a cursor instead of materializing it.
        using var cursor = await source
            .Find(FilterDefinition<AdvisoryDocument>.Empty)
            .ToCursorAsync(ct);
        while (await cursor.MoveNextAsync(ct))
        {
            foreach (var doc in cursor.Current)
            {
                var advisory = MapToAdvisory(doc);
                await target.UpsertAsync(advisory, ct);
            }
        }
    }

    private static Advisory MapToAdvisory(AdvisoryDocument doc)
    {
        // Extract from the BsonDocument payload; optional fields may be absent.
        var payload = doc.Payload;
        return new Advisory
        {
            AdvisoryKey = doc.Id,
            PrimaryVulnId = payload["primaryVulnId"].AsString,
            Title = payload.TryGetValue("title", out var title) && title.IsString
                ? title.AsString : null,
            Summary = payload.TryGetValue("summary", out var summary) && summary.IsString
                ? summary.AsString : null,
            // ... remaining scalar fields
            // Round-trip provenance through relaxed extended JSON, which
            // System.Text.Json can parse directly.
            Provenance = JsonSerializer.Deserialize<JsonElement>(
                payload["provenance"].ToJson(
                    new JsonWriterSettings { OutputMode = JsonOutputMode.RelaxedExtendedJson })),
        };
    }
}
```
---
### T5b.2: Build Feed Import Pipeline
**Status:** TODO
**Estimate:** 1 day
**Description:**
Modify feed import to write directly to PostgreSQL.
**Subtasks:**
- [ ] Update NVD importer to use PostgreSQL
- [ ] Update OSV importer to use PostgreSQL
- [ ] Update GHSA importer to use PostgreSQL
- [ ] Update vendor feed importers
- [ ] Test incremental imports
---
### T5b.3: Run Parallel Import
**Status:** TODO
**Estimate:** 1 day
**Description:**
Run imports to both MongoDB and PostgreSQL simultaneously.
**Subtasks:**
- [ ] Configure dual-import mode
- [ ] Run import cycle
- [ ] Compare record counts
- [ ] Sample comparison checks
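The record-count comparison can be automated at the end of each dual-import cycle. A sketch, where `_mongoAdvisories`, `_dataSource`, and `_logger` are assumed wiring:

```csharp
// Compare advisory counts across both backends after a dual-import cycle.
var mongoCount = await _mongoAdvisories.CountDocumentsAsync(
    FilterDefinition<AdvisoryDocument>.Empty, cancellationToken: ct);

await using var connection = await _dataSource.OpenConnectionAsync(ct);
await using var command = new NpgsqlCommand(
    "SELECT count(*) FROM vuln.advisories", connection);
var postgresCount = (long)(await command.ExecuteScalarAsync(ct))!;

if (mongoCount != postgresCount)
{
    _logger.LogWarning(
        "Dual-import drift: mongo={MongoCount} postgres={PostgresCount}",
        mongoCount, postgresCount);
}
```
Sample-level checks (comparing individual advisories by key) then narrow down any drift the counts reveal.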
---
### T5b.4: Verify Vulnerability Matching
**Status:** TODO
**Estimate:** 2 days
**Description:**
Verify that vulnerability matching produces identical results.
**Subtasks:**
- [ ] Select sample SBOMs (various ecosystems)
- [ ] Run matching with MongoDB backend
- [ ] Run matching with PostgreSQL backend
- [ ] Compare findings (must be identical)
- [ ] Document any differences
- [ ] Fix any issues found
**Verification Tests:**
```csharp
[Theory]
[MemberData(nameof(GetSampleSboms))]
public async Task Scanner_Should_Find_Same_Vulns(string sbomPath)
{
var sbom = await LoadSbom(sbomPath);
    // Each run must rebuild the scanner host so the repository
    // registrations pick up the configured backend; mutating live
    // configuration alone does not swap singletons.
    _config["Persistence:Concelier"] = "Mongo";
    var mongoFindings = await _scanner.ScanAsync(sbom);
    _config["Persistence:Concelier"] = "Postgres";
    var postgresFindings = await _scanner.ScanAsync(sbom);
// Strict ordering for determinism
postgresFindings.Should().BeEquivalentTo(mongoFindings,
options => options.WithStrictOrdering());
}
```
---
### T5b.5: Performance Optimization
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Analyze slow queries with EXPLAIN ANALYZE
- [ ] Optimize indexes for common queries
- [ ] Consider partial indexes for active advisories
- [ ] Benchmark PostgreSQL vs MongoDB performance
---
### T5b.6: Switch Scanner to PostgreSQL
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Update configuration
- [ ] Deploy to staging
- [ ] Run full scan suite
- [ ] Deploy to production
---
## Exit Criteria
- [ ] All repository interfaces implemented
- [ ] Advisory conversion pipeline working
- [ ] Vulnerability matching produces identical results
- [ ] Feed imports working on PostgreSQL
- [ ] Concelier running on PostgreSQL in production
---
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Matching discrepancies | Medium | High | Extensive comparison testing |
| Performance regression on queries | Medium | Medium | Index optimization, query tuning |
| Data loss during conversion | Low | High | Verify counts, sample checks |
---
## Data Volume Estimates
| Table | Estimated Rows | Growth Rate |
|-------|----------------|-------------|
| advisories | 300,000+ | ~100/day |
| advisory_aliases | 600,000+ | ~200/day |
| advisory_affected | 2,000,000+ | ~1000/day |
| advisory_cvss | 400,000+ | ~150/day |
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 6: VEX & Graph Conversion (Excititor)
**Sprint:** 8-10
**Duration:** 2-3 sprints
**Status:** TODO
**Dependencies:** Phase 5 (Vulnerabilities)
---
## Objectives
1. Create `StellaOps.Excititor.Storage.Postgres` project
2. Implement VEX schema in PostgreSQL
3. Handle graph nodes/edges efficiently
4. Preserve graph_revision_id stability (determinism critical)
5. Maintain VEX statement lattice logic
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| VEX schema | All tables created with indexes |
| Graph storage | Nodes/edges efficiently stored |
| Statement storage | VEX statements with full provenance |
| Revision stability | Same inputs produce same revision_id |
| Integration tests | 100% coverage |
---
## Schema Reference
See [SPECIFICATION.md](../SPECIFICATION.md) Section 5.3 for complete VEX schema.
**Tables:**
- `vex.projects`
- `vex.graph_revisions`
- `vex.graph_nodes`
- `vex.graph_edges`
- `vex.statements`
- `vex.observations`
- `vex.linksets`
- `vex.linkset_events`
- `vex.consensus`
- `vex.consensus_holds`
- `vex.unknowns_snapshots`
- `vex.unknown_items`
- `vex.evidence_manifests`
- `vex.cvss_receipts`
- `vex.attestations`
- `vex.timeline_events`
---
## Sprint 6a: Core Schema & Repositories
### T6a.1: Create Excititor.Storage.Postgres Project
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Create project structure
- [ ] Add NuGet references
- [ ] Create `ExcititorDataSource` class
- [ ] Create `ServiceCollectionExtensions.cs`
---
### T6a.2: Implement Schema Migrations
**Status:** TODO
**Estimate:** 1.5 days
**Subtasks:**
- [ ] Create schema migration
- [ ] Include all tables
- [ ] Add indexes for graph traversal
- [ ] Add indexes for VEX lookups
- [ ] Test migration idempotency
---
### T6a.3: Implement Project Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle tenant scoping
- [ ] Write integration tests
---
### T6a.4: Implement VEX Statement Repository
**Status:** TODO
**Estimate:** 1.5 days
**Interface:**
```csharp
public interface IVexStatementRepository
{
Task<VexStatement?> GetAsync(string tenantId, Guid statementId, CancellationToken ct);
Task<IReadOnlyList<VexStatement>> GetByVulnerabilityAsync(
string tenantId, string vulnerabilityId, CancellationToken ct);
Task<IReadOnlyList<VexStatement>> GetByProjectAsync(
string tenantId, Guid projectId, CancellationToken ct);
Task<VexStatement> UpsertAsync(VexStatement statement, CancellationToken ct);
Task<IReadOnlyList<VexStatement>> GetByGraphRevisionAsync(
Guid graphRevisionId, CancellationToken ct);
}
```
**Subtasks:**
- [ ] Implement all interface methods
- [ ] Handle status and justification enums
- [ ] Preserve evidence JSONB
- [ ] Preserve provenance JSONB
- [ ] Write integration tests
---
### T6a.5: Implement VEX Observation Repository
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Handle unique constraint on composite key
- [ ] Implement FindByVulnerabilityAndProductAsync
- [ ] Write integration tests
---
### T6a.6: Implement Linkset Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement event logging
- [ ] Write integration tests
---
### T6a.7: Implement Consensus Repository
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Implement CRUD operations
- [ ] Implement hold management
- [ ] Write integration tests
---
## Sprint 6b: Graph Storage
### T6b.1: Implement Graph Revision Repository
**Status:** TODO
**Estimate:** 1 day
**Interface:**
```csharp
public interface IGraphRevisionRepository
{
Task<GraphRevision?> GetByIdAsync(Guid id, CancellationToken ct);
Task<GraphRevision?> GetByRevisionIdAsync(string revisionId, CancellationToken ct);
Task<GraphRevision?> GetLatestByProjectAsync(Guid projectId, CancellationToken ct);
Task<GraphRevision> CreateAsync(GraphRevision revision, CancellationToken ct);
Task<IReadOnlyList<GraphRevision>> GetHistoryAsync(
Guid projectId, int limit, CancellationToken ct);
}
```
**Subtasks:**
- [ ] Implement all interface methods
- [ ] Handle revision_id uniqueness
- [ ] Handle parent_revision_id linking
- [ ] Write integration tests
---
### T6b.2: Implement Graph Node Repository
**Status:** TODO
**Estimate:** 1.5 days
**Interface:**
```csharp
public interface IGraphNodeRepository
{
Task<GraphNode?> GetByIdAsync(long nodeId, CancellationToken ct);
Task<GraphNode?> GetByKeyAsync(Guid graphRevisionId, string nodeKey, CancellationToken ct);
Task<IReadOnlyList<GraphNode>> GetByRevisionAsync(
Guid graphRevisionId, CancellationToken ct);
Task BulkInsertAsync(
Guid graphRevisionId, IEnumerable<GraphNode> nodes, CancellationToken ct);
Task<int> GetCountAsync(Guid graphRevisionId, CancellationToken ct);
}
```
**Subtasks:**
- [ ] Implement all interface methods
- [ ] Implement bulk insert for efficiency
- [ ] Handle node_key uniqueness per revision
- [ ] Write integration tests
**Bulk Insert Optimization:**
```csharp
public async Task BulkInsertAsync(
Guid graphRevisionId,
IEnumerable<GraphNode> nodes,
CancellationToken ct)
{
await using var connection = await _dataSource.OpenConnectionAsync("system", ct);
await using var writer = await connection.BeginBinaryImportAsync(
"COPY vex.graph_nodes (graph_revision_id, node_key, node_type, purl, name, version, attributes) " +
"FROM STDIN (FORMAT BINARY)", ct);
foreach (var node in nodes)
{
await writer.StartRowAsync(ct);
await writer.WriteAsync(graphRevisionId, ct);
await writer.WriteAsync(node.NodeKey, ct);
await writer.WriteAsync(node.NodeType, ct);
await writer.WriteAsync(node.Purl, NpgsqlDbType.Text, ct);
await writer.WriteAsync(node.Name, NpgsqlDbType.Text, ct);
await writer.WriteAsync(node.Version, NpgsqlDbType.Text, ct);
await writer.WriteAsync(JsonSerializer.Serialize(node.Attributes), NpgsqlDbType.Jsonb, ct);
}
await writer.CompleteAsync(ct);
}
```
---
### T6b.3: Implement Graph Edge Repository
**Status:** TODO
**Estimate:** 1.5 days
**Interface:**
```csharp
public interface IGraphEdgeRepository
{
Task<IReadOnlyList<GraphEdge>> GetByRevisionAsync(
Guid graphRevisionId, CancellationToken ct);
Task<IReadOnlyList<GraphEdge>> GetOutgoingAsync(
long fromNodeId, CancellationToken ct);
Task<IReadOnlyList<GraphEdge>> GetIncomingAsync(
long toNodeId, CancellationToken ct);
Task BulkInsertAsync(
Guid graphRevisionId, IEnumerable<GraphEdge> edges, CancellationToken ct);
Task<int> GetCountAsync(Guid graphRevisionId, CancellationToken ct);
}
```
**Subtasks:**
- [ ] Implement all interface methods
- [ ] Implement bulk insert for efficiency
- [ ] Optimize for traversal queries
- [ ] Write integration tests
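Traversal queries map naturally onto a recursive CTE, the standard PostgreSQL idiom for walking edges; the `@max_depth` bound guards against runaway walks on cyclic graphs. A sketch of the query a traversal method might issue:

```csharp
// Transitive closure from one node, bounded by @max_depth.
const string TraversalSql = """
    WITH RECURSIVE reachable AS (
        SELECT e.to_node_id, 1 AS depth
        FROM vex.graph_edges e
        WHERE e.from_node_id = @start
        UNION ALL
        SELECT e.to_node_id, r.depth + 1
        FROM vex.graph_edges e
        JOIN reachable r ON e.from_node_id = r.to_node_id
        WHERE r.depth < @max_depth
    )
    SELECT DISTINCT to_node_id FROM reachable
    """;
```
This is the access pattern the `(from_node_id)` index exists to serve.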
---
### T6b.4: Verify Graph Revision ID Stability
**Status:** TODO
**Estimate:** 1 day
**Description:**
Critical: Same SBOM + feeds + policy must produce identical revision_id.
**Subtasks:**
- [ ] Document revision_id computation algorithm
- [ ] Verify nodes are inserted in deterministic order
- [ ] Verify edges are inserted in deterministic order
- [ ] Write stability tests
**Stability Test:**
```csharp
[Fact]
public async Task Same_Inputs_Should_Produce_Same_RevisionId()
{
var sbom = await LoadSbom("testdata/stable-sbom.json");
var feedSnapshot = "feed-v1.2.3";
var policyVersion = "policy-v1.0";
// Compute multiple times
var revisions = new List<string>();
for (int i = 0; i < 5; i++)
{
var graph = await _graphService.ComputeGraphAsync(
sbom, feedSnapshot, policyVersion);
revisions.Add(graph.RevisionId);
}
// All must be identical
revisions.Distinct().Should().HaveCount(1);
}
```
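One way to realize the documented algorithm: hash a canonical byte stream built from the input identifiers plus ordinally sorted node and edge keys. This is a sketch only; the actual computation must match whatever Excititor produces today, and the key-based `GraphEdge` fields here are assumptions (serial row ids are not stable across backends, so only keys may feed the hash):

```csharp
public static string ComputeRevisionId(
    string sbomDigest, string feedSnapshot, string policyVersion,
    IEnumerable<GraphNode> nodes, IEnumerable<GraphEdge> edges)
{
    var builder = new StringBuilder();
    builder.Append(sbomDigest).Append('\n')
           .Append(feedSnapshot).Append('\n')
           .Append(policyVersion).Append('\n');

    // Ordinal sort gives a culture-independent, stable ordering.
    foreach (var node in nodes.OrderBy(n => n.NodeKey, StringComparer.Ordinal))
        builder.Append(node.NodeKey).Append('\n');
    foreach (var edge in edges.OrderBy(e => e.FromKey, StringComparer.Ordinal)
                              .ThenBy(e => e.ToKey, StringComparer.Ordinal))
        builder.Append(edge.FromKey).Append("->").Append(edge.ToKey).Append('\n');

    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
    return Convert.ToHexString(hash).ToLowerInvariant();
}
```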
---
## Sprint 6c: Migration & Verification
### T6c.1: Build Graph Conversion Service
**Status:** TODO
**Estimate:** 1.5 days
**Description:**
Convert existing MongoDB graphs to PostgreSQL.
**Subtasks:**
- [ ] Parse MongoDB graph documents
- [ ] Map to graph_revisions table
- [ ] Extract and insert nodes
- [ ] Extract and insert edges
- [ ] Verify node/edge counts match
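The per-project conversion loop might look like this; the `"nodes"`/`"edges"` field names are assumptions about the MongoDB document shape, and the repositories are those defined in Sprint 6b:

```csharp
public async Task ConvertGraphAsync(
    BsonDocument graphDoc, Guid projectId, CancellationToken ct)
{
    var revision = await _revisions.CreateAsync(new GraphRevision
    {
        RevisionId = graphDoc["revisionId"].AsString,
        ProjectId = projectId,
    }, ct);

    var nodes = graphDoc["nodes"].AsBsonArray
        .Select(n => MapNode(n.AsBsonDocument))
        .ToList();
    var edges = graphDoc["edges"].AsBsonArray
        .Select(e => MapEdge(e.AsBsonDocument))
        .ToList();

    await _nodes.BulkInsertAsync(revision.Id, nodes, ct);
    await _edges.BulkInsertAsync(revision.Id, edges, ct);

    // Verification step from the task list: counts must match the source.
    if (await _nodes.GetCountAsync(revision.Id, ct) != nodes.Count)
        throw new InvalidOperationException(
            $"Node count mismatch for {revision.RevisionId}.");
}
```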
---
### T6c.2: Build VEX Conversion Service
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Parse MongoDB VEX statements
- [ ] Map to vex.statements table
- [ ] Preserve provenance
- [ ] Preserve evidence
---
### T6c.3: Run Dual Pipeline Comparison
**Status:** TODO
**Estimate:** 2 days
**Description:**
Run graph computation on both backends and compare.
**Subtasks:**
- [ ] Select sample projects
- [ ] Compute graphs with MongoDB
- [ ] Compute graphs with PostgreSQL
- [ ] Compare revision_ids (must match)
- [ ] Compare node counts
- [ ] Compare edge counts
- [ ] Compare VEX statements
- [ ] Document any differences
---
### T6c.4: Migrate Projects
**Status:** TODO
**Estimate:** 1 day
**Subtasks:**
- [ ] Identify projects to migrate (active VEX)
- [ ] Run conversion for each project
- [ ] Verify latest graph revision
- [ ] Verify VEX statements
---
### T6c.5: Switch to PostgreSQL-Only
**Status:** TODO
**Estimate:** 0.5 days
**Subtasks:**
- [ ] Update configuration
- [ ] Deploy to staging
- [ ] Run full test suite
- [ ] Deploy to production
- [ ] Monitor metrics
---
## Exit Criteria
- [ ] All repository interfaces implemented
- [ ] Graph storage working efficiently
- [ ] Graph revision IDs stable (deterministic)
- [ ] VEX statements preserved correctly
- [ ] All comparison tests pass
- [ ] Excititor running on PostgreSQL in production
---
## Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Revision ID instability | Medium | Critical | Deterministic ordering tests |
| Graph storage performance | Medium | High | Bulk insert, index optimization |
| VEX lattice logic errors | Low | High | Extensive comparison testing |
---
## Performance Considerations
### Graph Storage
- Use `BIGSERIAL` for node/edge IDs (high volume)
- Use `COPY` for bulk inserts (10-100x faster)
- Index `(graph_revision_id, node_key)` for lookups
- Index `(from_node_id)` and `(to_node_id)` for traversal
### Estimated Volumes
| Table | Estimated Rows per Project | Total Estimated |
|-------|---------------------------|-----------------|
| graph_nodes | 1,000 - 50,000 | 10M+ |
| graph_edges | 2,000 - 100,000 | 20M+ |
| vex_statements | 100 - 5,000 | 1M+ |
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*

# Phase 7: Cleanup & Optimization
**Sprint:** 11
**Duration:** 1 sprint
**Status:** TODO
**Dependencies:** All previous phases completed
---
## Objectives
1. Remove MongoDB dependencies from converted modules
2. Archive MongoDB data
3. Optimize PostgreSQL performance
4. Update documentation
5. Update air-gap kit
---
## Deliverables
| Deliverable | Acceptance Criteria |
|-------------|---------------------|
| Code cleanup | MongoDB code removed from converted modules |
| Data archive | MongoDB data archived and documented |
| Performance tuning | Query times within acceptable range |
| Documentation | All docs updated for PostgreSQL |
| Air-gap kit | PostgreSQL support added |
---
## Task Breakdown
### T7.1: Remove MongoDB Dependencies
**Status:** TODO
**Estimate:** 2 days
**Description:**
Remove MongoDB storage projects and references from converted modules.
**Subtasks:**
- [ ] T7.1.1: Remove `StellaOps.Authority.Storage.Mongo` project
- [ ] T7.1.2: Remove `StellaOps.Scheduler.Storage.Mongo` project
- [ ] T7.1.3: Remove `StellaOps.Notify.Storage.Mongo` project
- [ ] T7.1.4: Remove `StellaOps.Policy.Storage.Mongo` project
- [ ] T7.1.5: Remove `StellaOps.Concelier.Storage.Mongo` project
- [ ] T7.1.6: Remove `StellaOps.Excititor.Storage.Mongo` project
- [ ] T7.1.7: Update solution files
- [ ] T7.1.8: Remove dual-write wrappers
- [ ] T7.1.9: Remove MongoDB configuration options
- [ ] T7.1.10: Run full build to verify no broken references
**Verification:**
- [ ] Solution builds without MongoDB packages
- [ ] No MongoDB references in converted modules
- [ ] All tests pass
---
### T7.2: Archive MongoDB Data
**Status:** TODO
**Estimate:** 1 day
**Description:**
Archive MongoDB databases for historical reference.
**Subtasks:**
- [ ] T7.2.1: Take final MongoDB backup
- [ ] T7.2.2: Export to BSON/JSON archives
- [ ] T7.2.3: Store archives in secure location
- [ ] T7.2.4: Document archive contents and structure
- [ ] T7.2.5: Set retention policy for archives
- [ ] T7.2.6: Schedule MongoDB cluster decommission
**Archive Structure:**
```
archives/
├── mongodb-authority-2025-XX-XX.bson.gz
├── mongodb-scheduler-2025-XX-XX.bson.gz
├── mongodb-notify-2025-XX-XX.bson.gz
├── mongodb-policy-2025-XX-XX.bson.gz
├── mongodb-concelier-2025-XX-XX.bson.gz
├── mongodb-excititor-2025-XX-XX.bson.gz
└── ARCHIVE_MANIFEST.md
```
---
### T7.3: PostgreSQL Performance Optimization
**Status:** TODO
**Estimate:** 2 days
**Description:**
Analyze and optimize PostgreSQL performance.
**Subtasks:**
- [ ] T7.3.1: Enable `pg_stat_statements` extension
- [ ] T7.3.2: Identify slow queries
- [ ] T7.3.3: Analyze query plans with EXPLAIN ANALYZE
- [ ] T7.3.4: Add missing indexes
- [ ] T7.3.5: Remove unused indexes
- [ ] T7.3.6: Tune PostgreSQL configuration
- [ ] T7.3.7: Set up query monitoring dashboard
- [ ] T7.3.8: Document performance baselines
**Configuration Tuning:**
```ini
# postgresql.conf optimizations (example values for a 32 GB host;
# scale shared_buffers to ~25% and effective_cache_size to ~75% of RAM)
shared_buffers = 8GB
effective_cache_size = 24GB
work_mem = 64MB
maintenance_work_mem = 512MB
random_page_cost = 1.1              # for SSD
effective_io_concurrency = 200      # for SSD
max_parallel_workers_per_gather = 4
```
**Monitoring Queries:**
```sql
-- Top slow queries (PostgreSQL 13+ column names)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
-- Unused indexes
SELECT schemaname, tablename, indexname
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
-- Largest tables (starting point for a bloat review)
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;
```
---
### T7.4: Update Documentation
**Status:** TODO
**Estimate:** 1.5 days
**Description:**
Update all documentation to reflect PostgreSQL as the primary database.
**Subtasks:**
- [ ] T7.4.1: Update `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
- [ ] T7.4.2: Update module architecture docs
- [ ] T7.4.3: Update deployment guides
- [ ] T7.4.4: Update operations runbooks
- [ ] T7.4.5: Update troubleshooting guides
- [ ] T7.4.6: Update `CLAUDE.md` technology stack
- [ ] T7.4.7: Create PostgreSQL operations guide
- [ ] T7.4.8: Document backup/restore procedures
- [ ] T7.4.9: Document scaling recommendations
**New Documents:**
- `docs/operations/postgresql-guide.md`
- `docs/operations/postgresql-backup-restore.md`
- `docs/operations/postgresql-troubleshooting.md`
---
### T7.5: Update Air-Gap Kit
**Status:** TODO
**Estimate:** 1 day
**Description:**
Update offline/air-gap kit to include PostgreSQL.
**Subtasks:**
- [ ] T7.5.1: Add PostgreSQL container image to kit
- [ ] T7.5.2: Update kit scripts for PostgreSQL setup
- [ ] T7.5.3: Include schema migrations in kit
- [ ] T7.5.4: Update kit documentation
- [ ] T7.5.5: Test kit installation in air-gapped environment
- [ ] T7.5.6: Update `docs/24_OFFLINE_KIT.md`
**Air-Gap Kit Structure:**
```
offline-kit/
├── images/
│ ├── postgres-16-alpine.tar
│ └── stellaops-*.tar
├── schemas/
│ ├── authority.sql
│ ├── vuln.sql
│ ├── vex.sql
│ ├── scheduler.sql
│ ├── notify.sql
│ └── policy.sql
├── scripts/
│ ├── setup-postgres.sh
│ ├── run-migrations.sh
│ └── import-data.sh
└── docs/
└── OFFLINE_SETUP.md
```
---
### T7.6: Final Verification
**Status:** TODO
**Estimate:** 1 day
**Description:**
Run final verification of all systems.
**Subtasks:**
- [ ] T7.6.1: Run full integration test suite
- [ ] T7.6.2: Run performance benchmark suite
- [ ] T7.6.3: Verify all modules on PostgreSQL
- [ ] T7.6.4: Verify determinism tests pass
- [ ] T7.6.5: Verify air-gap kit works
- [ ] T7.6.6: Generate final verification report
- [ ] T7.6.7: Get sign-off from stakeholders
---
### T7.7: Decommission MongoDB
**Status:** TODO
**Estimate:** 0.5 days
**Description:**
Final decommission of MongoDB infrastructure.
**Subtasks:**
- [ ] T7.7.1: Verify no services using MongoDB
- [ ] T7.7.2: Stop MongoDB instances
- [ ] T7.7.3: Archive final state
- [ ] T7.7.4: Remove MongoDB from infrastructure
- [ ] T7.7.5: Update monitoring/alerting
- [ ] T7.7.6: Update cost projections
---
## Exit Criteria
- [ ] All MongoDB code removed from converted modules
- [ ] MongoDB data archived
- [ ] PostgreSQL performance optimized
- [ ] All documentation updated
- [ ] Air-gap kit updated and tested
- [ ] Final verification report approved
- [ ] MongoDB infrastructure decommissioned
---
## Post-Conversion Monitoring
### First Week
- Monitor error rates closely
- Track query performance
- Watch for any data inconsistencies
- Have rollback plan ready (restore MongoDB)
### First Month
- Review query statistics weekly
- Optimize any slow queries found
- Monitor storage growth
- Adjust vacuum settings if needed
### Ongoing
- Regular performance reviews
- Index maintenance
- Backup verification
- Capacity planning
---
## Rollback Considerations
**Note:** After Phase 7 completion, rollback to MongoDB becomes significantly more complex. Ensure all stakeholders understand:
1. MongoDB archives are read-only backup
2. Any new data created after cutover is PostgreSQL-only
3. Full rollback would require data export/import
---
## Success Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Query latency (p95) | < 100ms | pg_stat_statements |
| Error rate | < 0.01% | Application logs |
| Storage efficiency | < 120% of MongoDB | Disk usage |
| Test coverage | 100% | CI reports |
| Documentation coverage | 100% | Manual review |
---
*Phase Version: 1.0.0*
*Last Updated: 2025-11-28*