# Windows Analyzer Design Brief (Draft)

> Owners: Scanner Guild, Policy Guild, Offline Kit Guild, Security Guild  
> Related backlog (proposed): SCANNER-ENG-0024..0027, DOCS-SCANNER-BENCH-62-002  
> Status: Draft, contingent on Windows demand threshold (see `docs/benchmarks/scanner/windows-macos-demand.md`)

## 1. Objectives & boundaries
- Provide deterministic inventory for Windows Server/container images covering MSI/WinSxS assemblies, Chocolatey packages, and registry-derived installers.
- Preserve replayability (layer fragments, provenance metadata) and align outputs with existing SBOM/policy pipelines.
- Respect sovereignty constraints: offline-friendly, signed rule bundles, no reliance on Windows APIs unavailable in containerized scans.

Out of scope (Phase 1):
- Live registry queries on running Windows hosts (requires runtime agent; defer to Zastava/Runtime roadmap).
- Windows Update patch baseline comparison (tracked separately under Runtime/Posture).
- UWP/MSIX packages (flagged for follow-up once MSI parity is complete).

## 2. Architecture overview
```
Scanner.Worker (Windows profile)
  ├─ Surface.Validation (enforce layer size, path allowlists)
  ├─ Surface.FS (materialized NTFS image via 7z/guestmount)
  ├─ MsiCollector           -> LayerComponentFragment (windows-msi)
  ├─ WinSxSCollector        -> LayerComponentFragment (windows-winsxs)
  ├─ ChocolateyCollector    -> LayerComponentFragment (windows-choco)
  ├─ RegistryCollector      -> Evidence overlays (uninstall/services)
  ├─ DriverCapabilityMapper -> Capability overlays (kernel/user drivers)
  └─ WindowsComponentMapper -> ComponentGraph + capability metadata
```

- Collectors operate on extracted filesystem snapshots. Registry access is performed on exported hive files produced during image extraction (the export step is to be documented in ops runbooks).
- `WindowsComponentMapper` normalizes component identities (ProductCode, AssemblyIdentity, Chocolatey package ID) and merges overlapping evidence into deterministic fragments.

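The normalize-and-merge step described above can be sketched as follows. This is a minimal illustration, not the mapper's actual code: the function names, the `(kind, id, path)` record shape, and the normalization rules (brace-stripped upper-case GUIDs, lower-case everything else) are assumptions made for the example.

```python
from collections import defaultdict


def normalize_identity(kind: str, raw_id: str) -> str:
    """Normalize an identity so the same component always yields the same key."""
    value = raw_id.strip()
    if kind == "msi":
        # Assumed rule: ProductCode GUIDs may appear with or without braces
        # and in mixed case, so strip braces and upper-case them.
        value = value.strip("{}").upper()
    else:
        # Assumed rule: other identities are compared case-insensitively.
        value = value.lower()
    return f"{kind}:{value}"


def merge_evidence(records):
    """Group evidence paths by normalized identity and return a stable list."""
    grouped = defaultdict(set)
    for kind, raw_id, evidence_path in records:
        grouped[normalize_identity(kind, raw_id)].add(evidence_path)
    # Sorting both keys and evidence removes any dependence on collector order.
    return [
        {"identity": key, "evidence": sorted(paths)}
        for key, paths in sorted(grouped.items())
    ]
```

Because the output is sorted at both levels, feeding the same records in a different order yields an identical result, which is the property the determinism checks in §7 rely on.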
## 3. Collectors
### 3.1 MSI collector
- Input: `Windows/Installer/*.msi` database files (OLE compound documents), registry hive exports for product mapping.
- Implementation approach:
  - Use an open-source MSI parser (custom or MIT-compatible) to avoid COM dependencies.
  - Extract the Product, Component, File, Feature, and Media tables.
  - Compute SHA256 for installed files via the Component table, linking to WinSxS manifests.
- Output metadata: `productCode`, `upgradeCode`, `productVersion`, `manufacturer`, `language`, `installContext`, `packageCode`, `sourceList`.
- Evidence: file paths with digests, component IDs, CAB/patch references.

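Two of the building blocks above need no MSI-specific parser and can be sketched directly: recognizing the OLE compound-file signature and streaming a file through SHA-256. The helper names are hypothetical. The signature check is necessary but not sufficient, since other compound files (for example legacy Office documents) start with the same eight bytes.

```python
import hashlib
from pathlib import Path

# Magic bytes that open every OLE compound (structured storage) file.
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_msi(path: Path) -> bool:
    """Return True when the file starts with the compound-file signature."""
    with path.open("rb") as handle:
        return handle.read(len(OLE_SIGNATURE)) == OLE_SIGNATURE


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large payloads are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

A collector would typically call `looks_like_msi` before handing a file to the table parser, and `sha256_file` for each installed file it records as evidence.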
### 3.2 WinSxS collector
- Input: `Windows/WinSxS/Manifests/*.manifest`, `Windows/WinSxS/` payload directories, catalog (.cat) files.
- Parse XML assembly identities (name, version, processor architecture, public key token, language).
- Map to MSI components when file hashes match.
- Capture catalog signature thumbprint and optional patch KB references for policy gating.

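As an illustration of the identity-parsing step, the sketch below reads the first `assemblyIdentity` element from a manifest while ignoring whichever XML namespace it is declared in. It assumes the assembly's own identity precedes any dependency identities in document order; the function names and the returned dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def parse_assembly_identity(manifest_xml: str) -> dict:
    """Return the first assemblyIdentity's attributes, whatever its namespace."""
    root = ET.fromstring(manifest_xml)
    for element in root.iter():
        if local_name(element.tag) == "assemblyIdentity":
            return {
                "name": element.get("name"),
                "version": element.get("version"),
                "architecture": element.get("processorArchitecture"),
                "publicKeyToken": element.get("publicKeyToken"),
                "language": element.get("language"),
            }
    raise ValueError("manifest has no assemblyIdentity element")
```

Matching on the local tag name rather than a fixed namespace URI is one simple way to tolerate manifests written against different schema versions.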
### 3.3 Chocolatey collector
- Input: `ProgramData/Chocolatey/lib/**`, `ProgramData/Chocolatey/package.backup`, `chocolateyinstall.ps1`, `.nuspec`.
- Extract package ID, version, checksum, source feed, installed files and scripts.
- Note whether install used cache or remote feed; record script hash for determinism.

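A minimal sketch of the two core operations, assuming the nuspec content and the install script have already been read from the snapshot. The function names are hypothetical. The script is hashed as opaque bytes and is never parsed or executed, in line with the security requirement in §7.

```python
import hashlib
import xml.etree.ElementTree as ET


def parse_nuspec(nuspec_xml: str) -> dict:
    """Read the first <id> and <version> elements, ignoring the schema namespace."""
    root = ET.fromstring(nuspec_xml)
    fields = {}
    for element in root.iter():
        name = element.tag.rsplit("}", 1)[-1]
        if name in ("id", "version") and name not in fields:
            fields[name] = (element.text or "").strip()
    return fields


def script_hash(script_bytes: bytes) -> str:
    """Hash the install script as opaque bytes; it is never parsed or executed."""
    return hashlib.sha256(script_bytes).hexdigest()
```

Hashing the raw bytes (rather than decoded text) keeps the digest independent of encoding detection, so repeated scans of the same layer produce the same value.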
### 3.4 Registry collector
- Input: Exported `SOFTWARE` hive covering:
  - `Microsoft\Windows\CurrentVersion\Uninstall`
  - `Microsoft\Windows\CurrentVersion\Installer\UserData`
  - `Microsoft\Windows\CurrentVersion\Run` (startup apps)
- Service/driver configuration from `SYSTEM` hive under `Services`.
- Emit fallback evidence for installers not captured by MSI/Chocolatey (legacy EXE installers).
- Record uninstall strings, install dates, publisher, estimated size, install location.

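The fallback rule can be illustrated with a small filter over already-parsed `Uninstall` subkeys. This sketch assumes that MSI-installed products appear under a subkey named after their ProductCode, and that hive parsing has produced a plain `{subkey: {value_name: value}}` mapping; the function name and output field names are hypothetical.

```python
def fallback_installers(uninstall_entries: dict, known_product_codes) -> list:
    """Return uninstall entries that no MSI record already accounts for."""
    known = {code.strip("{}").upper() for code in known_product_codes}
    fallback = []
    # Iterate in sorted key order so the output is stable across runs.
    for key_name, values in sorted(uninstall_entries.items()):
        if key_name.strip("{}").upper() in known:
            continue  # already inventoried by the MSI collector
        fallback.append({
            "key": key_name,
            "displayName": values.get("DisplayName"),
            "publisher": values.get("Publisher"),
            "uninstallString": values.get("UninstallString"),
            "installLocation": values.get("InstallLocation"),
        })
    return fallback
```

A fuller version would apply the same exclusion to packages already captured by the Chocolatey collector and carry the install date and estimated size along as well.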
### 3.5 Driver & service mapper
- Parse `SYSTEM` hive `Services` entries to detect drivers (`Type` = 1 for kernel drivers or 2 for file system drivers) and critical services (`Start` = 0 for boot or 2 for auto).
- Output capability overlays (e.g., `windows.driver.kernelMode(true)`, `windows.service.autoStart("Spooler")`) for Policy Engine.

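The mapping from `Services` values to overlays might look like the sketch below. The `Type` and `Start` constants are the standard Windows service values; the input shape (`{name: {"Type": int, "Start": int}}`), the function name, and the choice to emit a single `kernelMode(true)` overlay for all drivers are assumptions made for illustration.

```python
KERNEL_DRIVER = 0x1       # SERVICE_KERNEL_DRIVER
FILE_SYSTEM_DRIVER = 0x2  # SERVICE_FILE_SYSTEM_DRIVER
BOOT_START, SYSTEM_START, AUTO_START, DEMAND_START = 0, 1, 2, 3


def capability_overlays(services: dict) -> list:
    """Map {name: {"Type": int, "Start": int}} to sorted overlay strings."""
    overlays = set()
    for name, values in services.items():
        service_type = values.get("Type", 0)
        start = values.get("Start", DEMAND_START)
        if service_type in (KERNEL_DRIVER, FILE_SYSTEM_DRIVER):
            overlays.add("windows.driver.kernelMode(true)")
        elif start in (BOOT_START, AUTO_START):
            overlays.add(f'windows.service.autoStart("{name}")')
    # A sorted list keeps the overlay order independent of hive iteration order.
    return sorted(overlays)
```

For example, a `Spooler` entry with `Start` = 2 and a `disk` entry with `Type` = 1 would yield `windows.driver.kernelMode(true)` followed by `windows.service.autoStart("Spooler")`.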
## 4. Component mapping & output
- `WindowsComponentMapper`:
  - Generate `LayerComponentFragment`s with synthetic layer digests (e.g., `sha256:stellaops-windows-msi`).
  - Build `ComponentIdentity` with PURL-like scheme: `pkg:msi/<productCode>` or `pkg:winsxs/<assemblyIdentity>`.
  - Include metadata: signature thumbprint, catalog hash, KB references, install context, manufacturer.
- Capability overlays stored under `ScanAnalysisKeys.capability.windows` for policy consumption.
- Export Center bundling:
  - Include MSI manifest extracts, WinSxS assembly manifests, Chocolatey nuspec snapshots, and service/driver capability CSV.

## 5. Policy integration
- Predicates to introduce:
  - `windows.package.signed(expectedThumbprint?)`
  - `windows.package.unsupportedInstallerType`
  - `windows.driver.kernelMode`, `windows.driver.unsigned`
  - `windows.service.autoStart(name)`
  - `windows.choco.sourceAllowed(feed)`
- Lattice approach:
  - Unsigned kernel drivers → default `fail`.
  - Unknown installer sources → `warn` with escalation on critical services.
  - Chocolatey packages from non-allow-listed feeds → configurable severity.
- Waiver semantics bind to product code + signature thumbprint; waivers expire when package version changes.

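The lattice rules and waiver semantics above can be sketched as a small evaluator. Everything here is illustrative: the finding field names, the `pass < warn < fail` ordering, and the reading of "escalation" as promotion from `warn` to `fail` are assumptions, not the Policy Engine's actual model.

```python
SEVERITY_ORDER = ["pass", "warn", "fail"]


def worst(*verdicts: str) -> str:
    """Join verdicts on the assumed lattice pass < warn < fail."""
    return max(verdicts, key=SEVERITY_ORDER.index)


def evaluate(finding: dict, allowed_feeds: set, feed_severity: str = "warn") -> str:
    """Apply the three default rules to one finding and return the verdict."""
    verdict = "pass"
    if finding.get("kernel_driver") and not finding.get("signed"):
        verdict = worst(verdict, "fail")
    if finding.get("installer_source") == "unknown":
        # Assumed escalation: an unknown source on a critical service fails.
        escalated = "fail" if finding.get("critical_service") else "warn"
        verdict = worst(verdict, escalated)
    feed = finding.get("choco_feed")
    if feed is not None and feed not in allowed_feeds:
        verdict = worst(verdict, feed_severity)
    return verdict


def waiver_applies(waiver: dict, finding: dict) -> bool:
    """A waiver binds to product code + thumbprint and lapses on version change."""
    keys = ("product_code", "thumbprint", "version")
    return all(waiver[key] == finding.get(key) for key in keys)
```

Recording the version inside the waiver is one simple way to make it expire automatically: once the installed package version changes, the comparison fails and the waiver no longer matches.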
## 6. Offline kit & distribution
- Package:
  - MSI schema definitions and parser binaries (signed).
  - Chocolatey feed snapshot (nupkg archives + index) for allow-listed feeds.
  - Windows catalog certificate chains + optional CRL/OCSP caches.
- Documentation:
  - Provide instructions for exporting registry hives during image extraction (PowerShell script included).
  - Note disk space expectations (Chocolatey snapshot size, WinSxS manifest volume).

## 7. Testing strategy
- Fixtures:
  - Sample MSI packages (with/without transforms), WinSxS manifests, Chocolatey packages.
  - Registry hive exports representing mixed installer types.
- Tests:
  - Unit tests for each collector parsing edge cases (language-specific manifests, transforms, script hashing).
  - Integration tests using synthetic Windows container image layers (generated via CI on Windows worker).
  - Determinism checks ensuring repeated runs produce identical fragments.
- Security review:
  - Validate script execution paths (collectors must never execute Chocolatey scripts; inspect only).

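A determinism check of the kind listed above could be sketched as follows: serialize each fragment to canonical JSON, hash it, and require every run to produce the same digest. The helper names and the assumption that a fragment is JSON-serializable are illustrative only.

```python
import hashlib
import json


def fragment_digest(fragment: dict) -> str:
    """Hash a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(
        fragment, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_deterministic(run_collector, runs: int = 3) -> str:
    """Run the collector several times and fail if any digest differs."""
    digests = {fragment_digest(run_collector()) for _ in range(runs)}
    if len(digests) != 1:
        raise AssertionError(f"non-deterministic output: {sorted(digests)}")
    return digests.pop()
```

Canonical encoding only neutralizes dictionary key order. List order inside a fragment still has to be made stable by the collector itself, which is why sorted output matters upstream.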
## 8. Dependencies & open questions
| Item | Description | Owner | Status |
| --- | --- | --- | --- |
| MSI parser choice | Select MIT/Apache-compatible parser or build internal reader | Scanner Guild | TBD |
| Registry export tooling | Determine standard script/utility for hive exports in container context | Ops Guild | TBD |
| Authenticode verification locus | Decide scanner vs policy responsibility for signature verification | Security Guild | TBD |
| Feed mirroring policy | Which Chocolatey feeds to mirror by default | Product + Security Guilds | TBD |

## 9. Implementation status

| ID | Title | Status | Notes |
| --- | --- | --- | --- |
| SCANNER-ENG-0024 | Windows MSI collector | **DONE** | `StellaOps.Scanner.Analyzers.OS.Windows.Msi` - OLE compound document parser, extracts Product/File tables, 22 tests passing |
| SCANNER-ENG-0025 | WinSxS manifest collector | **DONE** | `StellaOps.Scanner.Analyzers.OS.Windows.WinSxS` - XML manifest parser, assembly identity extraction, 18 tests passing |
| SCANNER-ENG-0026 | Chocolatey collector | **DONE** | `StellaOps.Scanner.Analyzers.OS.Windows.Chocolatey` - nuspec parser with directory fallback, 44 tests passing |
| SCANNER-ENG-0026 | Registry collector | DEFERRED | Requires exported hive parsing; tracked separately |
| SCANNER-ENG-0027 | Policy predicates | PENDING | Requires Policy module integration (see §5) |
| SCANNER-ENG-0027 | Offline kit packaging | DONE | All analyzers work offline (local file parsing only) |

### Implementation details

**MSI collector** (`windows-msi` analyzer ID):
- Parses MSI database files using OLE compound document signature detection
- Extracts ProductCode, UpgradeCode, ProductName, Manufacturer, ProductVersion
- PURL format: `pkg:generic/windows-msi/{normalized-name}@{version}?upgrade_code={code}`
- Vendor metadata: `msi:product_code`, `msi:upgrade_code`, `msi:manufacturer`, etc.

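To make the PURL format concrete, here is a sketch that assembles such an identifier. The name normalization rule (lower-case, with runs of non-alphanumeric characters collapsed to `-`) and the percent-encoding of the qualifier are assumptions; the brief does not specify either, and the shipped analyzer may differ.

```python
import re
from urllib.parse import quote


def normalize_name(product_name: str) -> str:
    """Lowercase and collapse non-alphanumeric runs to '-' (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", product_name.lower()).strip("-")


def msi_purl(product_name: str, version: str, upgrade_code: str = "") -> str:
    """Build pkg:generic/windows-msi/{name}@{version}[?upgrade_code={code}]."""
    name = normalize_name(product_name)
    purl = f"pkg:generic/windows-msi/{name}@{quote(version, safe='')}"
    if upgrade_code:
        # Braces in GUIDs are percent-encoded here (assumed behaviour).
        purl += f"?upgrade_code={quote(upgrade_code, safe='')}"
    return purl
```

Whatever normalization is chosen, it needs to be a pure function of the ProductName so that the same package always maps to the same identifier across scans.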
**WinSxS collector** (`windows-winsxs` analyzer ID):
- Scans `Windows/WinSxS/Manifests/*.manifest` files
- Parses XML assembly identity with multiple namespace support (2006/2009/2016)
- Extracts name, version, architecture, public key token, language, type
- PURL format: `pkg:generic/windows-winsxs/{assembly-name}@{version}?arch={arch}`
- Vendor metadata: `winsxs:name`, `winsxs:version`, `winsxs:public_key_token`, etc.

**Chocolatey collector** (`windows-chocolatey` analyzer ID):
- Scans `ProgramData/Chocolatey/lib/` and `ProgramData/chocolatey/lib/` (both casings of the directory name)
- Parses `.nuspec` files with multiple schema namespace support (2010/2011/2015)
- Falls back to directory name parsing when the nuspec is missing
- Computes SHA256 hash of `chocolateyinstall.ps1` for determinism
- PURL format: `pkg:chocolatey/{package-id}@{version}`
- Vendor metadata: `choco:id`, `choco:authors`, `choco:install_script_hash`, etc.

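The directory-name fallback could work along the lines below, assuming library directories are named `<id>.<version>` when a version is present. The regular expression is a heuristic written for this example (package IDs that themselves end in digits can be split incorrectly), and lower-casing the package ID in the PURL is likewise an assumption.

```python
import re

# Heuristic: the version is the trailing dotted-number run, with an optional
# '-prerelease' suffix; everything before it is treated as the package ID.
_DIR_PATTERN = re.compile(
    r"^(?P<id>.+?)\.(?P<version>\d+(?:\.\d+)+(?:-[0-9A-Za-z.-]+)?)$"
)


def parse_lib_directory(name: str):
    """Split '<id>.<version>' directory names; version is None when absent."""
    match = _DIR_PATTERN.match(name)
    if match is None:
        return name, None
    return match.group("id"), match.group("version")


def chocolatey_purl(package_id: str, version) -> str:
    """Build pkg:chocolatey/{package-id}[@{version}] from the parsed parts."""
    base = f"pkg:chocolatey/{package_id.lower()}"
    return f"{base}@{version}" if version else base
```

When the directory name carries no version, the sketch returns the bare ID and leaves the version unset rather than guessing one.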
## 10. References
- `docs/benchmarks/scanner/deep-dives/windows.md`
- `docs/benchmarks/scanner/windows-macos-demand.md`
- `docs/modules/scanner/design/macos-analyzer.md` (structure/composition parallels)
- Surface design docs (`surface-fs.md`, `surface-validation.md`, `surface-secrets.md`) for interfacing expectations.

Further reading: `../../api/scanner/windows-coverage.md` (summary) and `../../api/scanner/windows-macos-summary.md` (metrics dashboard).

Policy readiness alignment: see `../policy/windows-package-readiness.md` (POLICY-READINESS-0002).

Upcoming milestone: FinSecure Corp PCI review requires Authenticode/feed decision by 2025-11-07 before Windows analyzer spike kickoff.