save checkpoint. additional features and their state. check some of them

This commit is contained in:
master
2026-02-10 07:54:44 +02:00
parent 4bdc298ec1
commit 5593212b41
211 changed files with 10248 additions and 1208 deletions


@@ -0,0 +1,154 @@
# Claude Code Team Strategy for Feature Verification
An alternative to the OpenCode pipeline that uses Claude Code's built-in team orchestration
(TeamCreate, the Task tools, SendMessage) to verify features in parallel.
---
## Architecture
```
Team Lead (Opus 4.6)
├── tier0-scanner (Haiku 4.5) # Fast: source file existence checks
├── checker-1 (Opus 4.6) # Full pipeline per feature
├── checker-2 (Opus 4.6) # Parallel checker
├── checker-3 (Opus 4.6) # Parallel checker
├── triage-agent (Sonnet 4.5) # Issue diagnosis
└── fixer-agent (Opus 4.6) # Code fixes
```
## How It Differs from OpenCode
| Aspect | OpenCode | Claude Code Teams |
|--------|----------|-------------------|
| State management | JSON state files on disk | In-session task list (TaskCreate/TaskList/TaskUpdate), plus the shared JSON state file |
| Parallelism | Sequential (one feature at a time) | Parallel (multiple agents simultaneously) |
| Orchestration | Prompt-driven subagent calls | Native team messaging (SendMessage) |
| File access | All agents share filesystem | All agents share filesystem |
| Cost model | Per-token across models | Per-token, Opus 4.6 primarily |
| Session limits | Unlimited (persistent state) | Context window limits (use /compact) |
| Best for | Long-running unattended pipelines | Interactive sessions with parallelism |
## Execution Plan
### Phase 1: Setup (Team Lead)
```
1. TeamCreate: "feature-verify-<module>"
2. TaskCreate: One task per feature in the module
3. Spawn tier0-scanner agent
4. Wait for Tier 0 results
5. Filter: remove not_implemented features
```
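Steps 3 to 5 above can be sketched in Python. This is a minimal illustration, not the scanner's actual logic: it assumes that feature files cite source paths in backticks beginning with `src/`, and that a feature counts as `not_implemented` only when none of its cited files exist. The `candidate` status and both function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed convention: source paths appear in backticks and start with "src/".
SOURCE_PATH = re.compile(r"`(src/[^`]+)`")


def tier0_status(feature_text: str, repo_root: Path) -> str:
    """Return "not_implemented" if no cited source file exists, else "candidate"."""
    paths = SOURCE_PATH.findall(feature_text)
    found = [p for p in paths if (repo_root / p).is_file()]
    return "candidate" if found else "not_implemented"


def filter_features(features: dict[str, str], repo_root: Path) -> list[str]:
    """Keep only the features that pass the Tier 0 existence check."""
    return sorted(
        name for name, text in features.items()
        if tier0_status(text, repo_root) != "not_implemented"
    )
```

`filter_features` takes a mapping of feature name to the text of its `.md` file, so the caller decides how the files are read.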
### Phase 2: Parallel Checking (3 Checker Agents)
```
1. Spawn 3 checker agents
2. Each checker claims tasks from TaskList (lowest ID first)
3. For each feature:
   a. Read feature .md file
   b. Run Tier 1: dotnet build + dotnet test
   c. Run Tier 2: Playwright/CLI/API (if applicable)
   d. TaskUpdate: set status=completed and record any failures in the task
   e. Claim next task
4. Checkers go idle when no tasks remain
```
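The claim loop in this phase (lowest ID first, repeat until the list is empty) can be modelled with a lock-protected board and three worker threads. The sketch below is an in-process stand-in for illustration only; `TaskBoard`, `run_checker`, and `run_parallel` are hypothetical names and do not represent the Claude Code TaskList or TaskUpdate tools. The `failed` status is also an assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TaskBoard:
    """In-process stand-in for a shared task list (hypothetical, for illustration)."""

    def __init__(self, feature_names):
        # Task IDs are assigned in order, starting at 1.
        self._pending = {i: name for i, name in enumerate(feature_names, start=1)}
        self._lock = threading.Lock()
        self.results = {}

    def claim(self):
        """Atomically take the pending task with the lowest ID, or None if empty."""
        with self._lock:
            if not self._pending:
                return None
            task_id = min(self._pending)
            return task_id, self._pending.pop(task_id)

    def update(self, task_id, status):
        with self._lock:
            self.results[task_id] = status


def run_checker(board, check):
    """Claim tasks until none remain; `check` returns True when a feature passes."""
    while (claimed := board.claim()) is not None:
        task_id, feature = claimed
        board.update(task_id, "completed" if check(feature) else "failed")


def run_parallel(board, check, workers=3):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_checker, board, check) for _ in range(workers)]
        for future in futures:
            future.result()  # re-raise any exception from a checker
    return board.results
```

Popping the task inside the lock is what prevents two checkers from claiming the same feature.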
### Phase 3: Triage + Fix (if failures found)
```
1. Team lead reviews completed tasks for failures
2. For each failure:
   a. TaskCreate: triage task with failure details
   b. Send to triage-agent
   c. triage-agent reads source, returns diagnosis
   d. Team lead reviews diagnosis
   e. TaskCreate: fix task with confirmed diagnosis
   f. Send to fixer-agent
   g. fixer-agent implements fix
   h. Team lead re-checks via checker agent
```
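The per-failure sequence above has one gate: a fix task is created only after the team lead confirms the diagnosis. A rough Python sketch of that ordering follows. The four callables are hypothetical stand-ins for agent round-trips, and the log labels are invented for illustration; the authoritative state machine is the one in `docs/qa/feature-checks/FLOW.md`.

```python
from dataclasses import dataclass, field


@dataclass
class FailureRecord:
    feature: str
    details: str
    log: list = field(default_factory=list)  # ordered trace of steps taken


def handle_failure(record, triage, approve, fix, recheck):
    """Run one failure through triage, lead review, fix, and re-check.

    Returns True only if the fix was applied and the re-check passed.
    """
    diagnosis = triage(record.feature, record.details)
    record.log.append("triaged")
    if not approve(diagnosis):
        # The team lead rejected the diagnosis, so no fix task is created.
        record.log.append("diagnosis_rejected")
        return False
    fix(record.feature, diagnosis)
    record.log.append("fixed")
    passed = recheck(record.feature)
    record.log.append("recheck_passed" if passed else "recheck_failed")
    return passed
```

Keeping the review step as an explicit callable makes it easy to swap a human confirmation for an automated one.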
### Phase 4: Cleanup
```
1. Move passed features: unchecked/ -> checked/
2. Move not_implemented: unchecked/ -> unimplemented/
3. Update state summary
4. Shutdown all agents
5. TeamDelete
```
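Steps 1 and 2 of the cleanup amount to moving files by final status. The sketch below assumes one `<feature>.md` file per feature, a `passed` status string, and an `unimplemented/<module>/` layout that mirrors `checked/<module>/`; none of these are confirmed here, and `move_features` is a hypothetical helper.

```python
import shutil
from pathlib import Path

# Destination directory per final status; other statuses stay in unchecked/.
DESTINATION = {"passed": "checked", "not_implemented": "unimplemented"}


def move_features(features_root: Path, module: str, statuses: dict[str, str]) -> list[Path]:
    """Move each feature file out of unchecked/ according to its final status."""
    moved = []
    for feature, status in sorted(statuses.items()):
        target_dir = DESTINATION.get(status)
        source = features_root / "unchecked" / module / f"{feature}.md"
        if target_dir is None or not source.is_file():
            continue  # failed or unknown features are left where they are
        destination = features_root / target_dir / module / source.name
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(destination))
        moved.append(destination)
    return moved
```

In a Git working tree, `git mv` may be preferable to `shutil.move` so that the rename is staged in the same step.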
## Claude Code Prompt (Copy-Paste Ready)
Use this prompt to start the team in Claude Code:
```
You are verifying features for the <MODULE> module of Stella Ops.
Read docs/qa/feature-checks/FLOW.md for the full state machine and tier system.
Your workflow:
1. Create team "feature-verify-<module>"
2. Create one task per feature file in docs/features/unchecked/<module>/
3. Spawn a tier0-scanner (haiku) to verify source files exist for all features
4. After Tier 0, spawn 3 checker agents (opus) in parallel to run Tier 1 (build+test)
5. For features that fail, spawn a triage agent (sonnet) to diagnose
6. For confirmed issues, spawn a fixer agent (opus) to implement fixes
7. Re-check fixed features
8. Move passed features to docs/features/checked/<module>/
9. Report final status and shutdown team
For each feature, write run artifacts to docs/qa/feature-checks/runs/<module>/<feature>/
State file: docs/qa/feature-checks/state/<module>.json
IMPORTANT:
- Feature files are at docs/features/unchecked/<module>/*.md
- Each has Implementation Details with source file paths
- Each has an E2E Test Plan with verification steps
- Source code is under src/
- .NET 10 backend, Angular 21 frontend
- Use dotnet build/test for backend, ng build/test for frontend
```
## Recommended Test Modules (Side-by-Side Comparison)
For testing both pipelines simultaneously on different modules:
### OpenCode Pipeline: `gateway` (8 features)
- Pure backend, no Playwright needed
- Small enough to complete in one session
- Has rate limiting + circuit breaker logic (meaningful checks)
- Features: router-back-pressure-middleware, stellarouter-performance-testing-pipeline, + 6 existing
### Claude Code Teams: `graph` (7 features)
- Pure backend, no Playwright needed
- Similar size to gateway
- Has graph data structures (meaningful checks)
- Features: graph-edge-metadata-with-reason-evidence-provenance, + 6 existing
Both modules are:
- Small (7-8 features each) - completable in one session
- Backend-only - no Playwright/environment complexity
- Meaningful - real logic to verify, not just config
- Independent - no cross-module dependencies
### Alternative Pair (if you want frontend testing):
- **OpenCode**: `exportcenter` (7 features) - has CLI+UI surface
- **Claude Code**: `vexlens` (7 features) - has truth-table tests + UI
---
## State File Compatibility
Both strategies use the SAME state file format (`docs/qa/feature-checks/state/<module>.json`)
and artifact format (`docs/qa/feature-checks/runs/<module>/<feature>/<runId>/`).
This means you can:
- Start with OpenCode on module A
- Start with Claude Code on module B
- Compare results using the same state format
- Switch strategies mid-stream if one works better
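
Because the state format is shared, a side-by-side comparison can be a short script. The state file schema is not shown in this document, so the sketch below assumes a top-level `features` object that maps each feature name to an entry with a `status` field; adjust the two lookups to the real schema. Both function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(state_path: Path) -> Counter:
    """Count features per status in one state file (assumed schema, see above)."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    return Counter(entry["status"] for entry in state["features"].values())


def compare(state_dir: Path, module_a: str, module_b: str) -> dict[str, tuple[int, int]]:
    """Return {status: (count in module_a, count in module_b)} for every status seen."""
    a = summarize(state_dir / f"{module_a}.json")
    b = summarize(state_dir / f"{module_b}.json")
    return {status: (a[status], b[status]) for status in sorted(set(a) | set(b))}
```

For the recommended pair, this would be called as `compare(Path("docs/qa/feature-checks/state"), "gateway", "graph")`.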