more features checks. setup improvements
87
AGENTS.md
@@ -143,7 +143,8 @@ Role switching rule:
 
 Role inference (fallback):
 - "implement / fix / add endpoint / refactor code" -> Developer / Implementer
-- "add tests / stabilize flaky tests / verify determinism" -> QA / Test Automation
+- "add tests / stabilize flaky tests / verify determinism" -> Test Automation (4.4)
+- "enter QA / test features / verify features / feature verification / e2e tests" -> QA (4.6)
 - "update docs / write guide / edit architecture dossier" -> Documentation author
 - "plan / sprint / tasks / dependencies / milestones" -> Project Manager
 - "review advisory / product direction / capability assessment" -> Product Manager
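The fallback list above can be read as a phrase-to-role lookup. The following is a minimal sketch of that reading, not part of the commit: the phrases and role names come from the list, while `ROLE_PHRASES`, `infer_role`, the whole-word matching, and the first-match-wins ordering are illustrative assumptions.

```python
import re
from typing import Optional

# Phrases and role names are copied from the fallback list; the table name,
# the ordering, and the matching strategy are assumptions for illustration.
ROLE_PHRASES = [
    (("implement", "fix", "add endpoint", "refactor code"), "Developer / Implementer"),
    (("add tests", "stabilize flaky tests", "verify determinism"), "Test Automation (4.4)"),
    (("enter qa", "test features", "verify features", "feature verification", "e2e tests"), "QA (4.6)"),
    (("update docs", "write guide", "edit architecture dossier"), "Documentation author"),
    (("plan", "sprint", "tasks", "dependencies", "milestones"), "Project Manager"),
    (("review advisory", "product direction", "capability assessment"), "Product Manager"),
]


def infer_role(request: str) -> Optional[str]:
    """Return the first role whose phrase appears as whole words in the request."""
    text = request.lower()
    for phrases, role in ROLE_PHRASES:
        for phrase in phrases:
            # \b keeps "fix" from matching inside words such as "prefix".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                return role
    return None  # no fallback phrase matched
```

With this table, a request containing "add tests" resolves to Test Automation (4.4) and one containing "verify features" resolves to QA (4.6), which is the split this hunk introduces.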
@@ -177,7 +178,7 @@ Behavior:
 Constraints:
 - Add tests for changes; maintain determinism and offline posture.
 
-### 4.4 QA / Test Automation role
+### 4.4 Test Automation role
 Binding standard:
 - `docs/code-of-conduct/TESTING_PRACTICES.md`
 
@@ -195,6 +196,88 @@ Responsibilities:
 - Update module dossiers when contracts change
 - Ensure docs remain consistent with implemented behavior
 
+### 4.6 QA role (end-to-end behavioral verification)
+
+Binding standards:
+- `docs/qa/feature-checks/FLOW.md` (CRITICAL -- read in full before any QA work)
+- `docs/code-of-conduct/TESTING_PRACTICES.md`
+
+Role inference:
+- "enter QA role", "test features", "verify features", "feature verification" -> this role
+
+**Primary goal: END-TO-END BEHAVIORAL VERIFICATION.**
+QA exists to prove features **actually work** by exercising them as a real user would.
+File existence checks and build passes are prerequisites, not the goal.
+**Tier 2 (behavioral verification) is the goal. Skipping Tier 2 is a verification failure.**
+
+#### 4.6.1 Feature verification pipeline (mandatory)
+
+Follow the 3-tier pipeline from `docs/qa/feature-checks/FLOW.md`:
+
+1. **Tier 0 -- Source Verification**: Confirm source files referenced in feature `.md` exist on disk.
+2. **Tier 1 -- Build + Code Review**: Build the module, run tests, AND read source code to verify the logic matches claims. Tests must assert meaningful outcomes (not just "doesn't throw").
+3. **Tier 2 -- Behavioral Verification** (THE MAIN PURPOSE):
+   - **Tier 2a (API)**: Send real HTTP requests, verify responses. For services with HTTP endpoints.
+   - **Tier 2b (CLI)**: Run CLI commands, verify output and exit codes.
+   - **Tier 2c (UI)**: Use Playwright to navigate the UI, verify rendering and interactions.
+   - **Tier 2d (Library/Internal)**: Run **targeted integration tests** against the **specific test `.csproj`** (see below).
+
+#### 4.6.2 Tier 2d deep verification rules (CRITICAL -- prevents shallow testing)
+
+For library/internal modules (Policy, Concelier, Scanner, Signals, Attestor, etc.) with no external HTTP/CLI/UI surface:
+
+1. **Run tests against INDIVIDUAL `.csproj` files, NOT solution filters (`.slnf`).**
+   Solution filters ignore `--filter` flags, causing all tests to run and producing misleading suite-wide pass counts that hide whether the feature's specific tests actually ran.
+   ```
+   # CORRECT -- targets specific test project, filter works:
+   dotnet test "src/Policy/__Tests/StellaOps.Policy.Engine.Tests/StellaOps.Policy.Engine.Tests.csproj" \
+     --filter "FullyQualifiedName~EwsCalculatorTests" -v normal
+
+   # WRONG -- slnf ignores filter, runs everything, useless evidence:
+   dotnet test src/Policy/StellaOps.Policy.tests.slnf \
+     --filter "FullyQualifiedName~EwsCalculatorTests" -v normal
+   ```
+
+2. **Verify the `--filter` actually filtered.** The `testsRun` count in evidence must reflect the targeted subset, not the entire suite. If you see the full suite count, the filter did not work -- switch to individual `.csproj`.
+
+3. **Read test source code.** Open the test `.cs` files and verify:
+   - Tests assert actual computed values (scores, verdicts, hashes, states)
+   - Tests exercise the feature's core logic paths (happy path + error cases)
+   - Tests are NOT just checking `!= null` or `doesn't throw`
+   - If assertions are shallow, the feature has a **test gap** -- mark it and write deeper tests
+
+4. **Write new tests when behavioral coverage is missing.**
+   - If no tests exist for the feature's core behavior: **create a focused test class**
+   - Test actual inputs -> expected outputs for the feature's logic
+   - Run the new test and verify it passes
+   - Record new tests in evidence (`newTestsWritten` field)
+
+5. **Fix bugs when tests fail.**
+   - Diagnose root cause, apply minimal fix, re-run, capture before/after evidence
+   - Record fixes in evidence (`bugsFixed` field)
+   - Follow the FLOW.md state machine: `failed -> triaged -> confirmed -> fixing -> retesting`
+
+6. **Capture actual command output** in tier2 evidence. Include raw `dotnet test` output snippets, not just summary counts.
+
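Rules 1, 2, and 6 above lend themselves to a mechanical check on an evidence record. The sketch below is one hedged way to express them and is not part of the commit. The `testsRun` field name comes from the text, while the `target` and `rawOutput` keys, the `suite_total` parameter, and the function name are hypothetical. The actual evidence schema is defined by `FLOW.md`, which is not shown here.

```python
from typing import List


def check_tier2d_evidence(evidence: dict, suite_total: int) -> List[str]:
    """Return a list of problems found in a Tier 2d evidence record.

    `target` and `rawOutput` are hypothetical keys used for illustration;
    `testsRun` is the field named in the rules above.
    """
    problems = []

    # Rule 1: tests must target an individual .csproj, not a .slnf.
    target = str(evidence.get("target", ""))
    if not target.endswith(".csproj"):
        problems.append("target is not an individual .csproj")

    # Rule 2: the run must reflect the targeted subset, not the full suite.
    tests_run = evidence.get("testsRun", 0)
    if tests_run <= 0:
        problems.append("no tests ran")
    elif tests_run >= suite_total:
        problems.append("testsRun matches the full suite count; filter did not narrow the run")

    # Rule 6: raw command output must accompany the summary counts.
    if not str(evidence.get("rawOutput", "")).strip():
        problems.append("raw dotnet test output is missing")

    return problems
```

An empty list means the record passes these three checks only. Rules 3 to 5 (reading assertions, writing missing tests, fixing bugs) still need a person or agent to read the test source.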
+#### 4.6.3 Forbidden shortcuts (will invalidate verification)
+
+- Declaring Tier 2 pass from suite totals alone (e.g., "708/708 pass") without targeted test evidence
+- Copying previous run artifacts and editing timestamps
+- Running the entire solution filter and claiming filter "is advisory"
+- Marking a feature as verified without reading and confirming test assertions
+- Skipping Tier 2 for any reason other than: `hardware_required`, `multi_datacenter`, `air_gap_network`
+
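The last bullet above is effectively an allow-list. A minimal sketch, assuming the skip reason is recorded as a plain string (the constant and function names are hypothetical):

```python
from typing import Optional

# The three reasons are taken verbatim from the list above.
ALLOWED_TIER2_SKIP_REASONS = frozenset(
    {"hardware_required", "multi_datacenter", "air_gap_network"}
)


def tier2_skip_is_valid(reason: Optional[str]) -> bool:
    """True only when the recorded skip reason is one of the three allowed values."""
    return reason in ALLOWED_TIER2_SKIP_REASONS
```

Any other value, including a missing reason, counts as an invalid skip under this reading.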
+#### 4.6.4 Orchestrator vs. subagent responsibilities
+
+- **Orchestrator** (team lead): Writes state files (`docs/qa/feature-checks/state/*.json`), moves feature files between `unchecked/` -> `checked/` or `unimplemented/`, dispatches subagents, max 4 concurrent agents on unrelated modules
+- **Subagents** (feature checkers): Execute tiers, write evidence to `docs/qa/feature-checks/runs/`, move feature files, report results back to orchestrator. Never modify state JSON directly.
+
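The dispatch limit above ("max 4 concurrent agents on unrelated modules") can be sketched as a batch selector. This is an illustration under two assumptions that the text does not confirm: "unrelated" is read as "different module name", and each running subagent occupies exactly one module. `next_batch` and its parameters are hypothetical names.

```python
from typing import FrozenSet, Iterable, List, Tuple

MAX_CONCURRENT_AGENTS = 4  # limit stated in the orchestrator bullet


def next_batch(
    queue: Iterable[Tuple[str, str]],
    busy_modules: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Pick features to dispatch, at most one per module, within the agent limit.

    `queue` yields (feature, module) pairs in priority order; `busy_modules`
    holds the modules that already have a subagent running.
    """
    free_slots = MAX_CONCURRENT_AGENTS - len(busy_modules)
    taken = set(busy_modules)
    batch = []
    for feature, module in queue:
        if len(batch) >= free_slots:
            break
        if module in taken:
            continue  # another agent already works on this module
        taken.add(module)
        batch.append(feature)
    return batch
```

For a queue of six features across Policy (twice), Scanner, Signals, Attestor, and Concelier, the first call returns four features from four different modules and leaves the second Policy feature for a later round.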
+#### 4.6.5 Problems-first enforcement
+
+- Resolve `failed`/`fixing`/`retesting` features before starting new `queued` features
+- A feature in a non-terminal state blocks all new work on that module
+- Follow the FLOW.md state machine strictly: `queued -> checking -> passed/failed -> done/blocked`
+
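One way to read the problems-first rules is as a selection function over the current feature states. The sketch below is an interpretation, not the logic from `FLOW.md`. The state names come from the text. Treating `done` and `blocked` as the terminal states, and treating `queued` as "not yet started" rather than blocking, are assumptions, and `pick_next` is a hypothetical name.

```python
from typing import Dict, Optional, Tuple

PROBLEM_STATES = {"failed", "fixing", "retesting"}  # named in the first bullet
TERMINAL_STATES = {"done", "blocked"}               # assumption: end of the state line


def pick_next(features: Dict[str, Tuple[str, str]]) -> Optional[str]:
    """Choose the next feature to work on.

    `features` maps a feature name to (module, state), in priority order.
    """
    # Problems first: any failed/fixing/retesting feature wins outright.
    for name, (_module, state) in features.items():
        if state in PROBLEM_STATES:
            return name

    # A module with an in-flight (non-terminal, already started) feature
    # accepts no new work.
    blocked_modules = {
        module
        for module, state in features.values()
        if state != "queued" and state not in TERMINAL_STATES
    }
    for name, (module, state) in features.items():
        if state == "queued" and module not in blocked_modules:
            return name
    return None  # nothing eligible right now
```

Under this reading, a `retesting` feature in Scanner is picked ahead of any `queued` feature, and a `checking` feature in Policy keeps other Policy features waiting while Signals work can proceed.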
 ---
 
 ## 5) Module-local AGENTS.md discipline