save progress
237
docs/modules/feedser/architecture.md
Normal file
@@ -0,0 +1,237 @@
# component_architecture_feedser.md - **Stella Ops Feedser** (2025Q4)

> Evidence collection library for backport detection and binary fingerprinting.

> **Scope.** Library architecture for **Feedser**: patch signature extraction, binary fingerprinting, and evidence collection supporting the four-tier backport proof system. Consumed primarily by Concelier's ProofService layer.

---

## 0) Mission & boundaries

**Mission.** Provide deterministic, cryptographic evidence collection for backport detection. Extract patch signatures from unified diffs and binary fingerprints from compiled code to enable high-confidence vulnerability status determination for packages where upstream fixes have been backported by distro maintainers.

**Boundaries.**

* Feedser is a **library**, not a standalone service. It does not expose REST APIs directly.
* Feedser **does not** make vulnerability decisions. It provides evidence that feeds into VEX statements and Policy Engine evaluation.
* Feedser **does not** store data. Storage is handled by consuming services (Concelier ProofService, Attestor).
* All outputs are **deterministic** with canonical JSON serialization and stable hashing.

---

## 1) Solution & project layout

```
src/Feedser/
├─ StellaOps.Feedser.Core/  # Patch signature extraction (HunkSig)
│  ├─ HunkSigExtractor.cs  # Unified diff parser and normalizer
│  ├─ Models/
│  │  ├─ PatchSignature.cs  # Deterministic patch identifier
│  │  ├─ HunkSignature.cs  # Individual hunk with normalized content
│  │  └─ DiffParseResult.cs  # Parse output with file paths and hunks
│  └─ Normalization/
│     └─ WhitespaceNormalizer.cs  # Whitespace/comment stripping
│
├─ StellaOps.Feedser.BinaryAnalysis/  # Binary fingerprinting engine
│  ├─ BinaryFingerprintFactory.cs  # Factory for fingerprinting strategies
│  ├─ IBinaryFingerprinter.cs  # Fingerprinter interface
│  ├─ Models/
│  │  ├─ BinaryFingerprint.cs  # Fingerprint record with method/value
│  │  └─ FingerprintMatchResult.cs  # Match score and confidence
│  └─ Fingerprinters/
│     ├─ SimplifiedTlshFingerprinter.cs  # TLSH fuzzy hashing
│     └─ InstructionHashFingerprinter.cs  # Instruction sequence hashing
│
├─ plugins/
│  └─ concelier/  # Concelier integration plugin
│
└─ __Tests/
   └─ StellaOps.Feedser.Core.Tests/  # Unit tests
```

---

## 2) External dependencies

* **Concelier ProofService** - Primary consumer; orchestrates four-tier evidence collection
* **Attestor ProofChain** - Consumes evidence for proof blob generation
* **.NET 10** - Runtime target
* No database dependencies (stateless library)
* No external network dependencies

---

## 3) Contracts & data model

### 3.1 Patch Signature (Tier 3 Evidence)

```csharp
public sealed record PatchSignature
{
    public required string Id { get; init; }                 // Deterministic SHA256
    public required string FilePath { get; init; }           // Source file path
    public required IReadOnlyList<HunkSignature> Hunks { get; init; }
    public required string ContentHash { get; init; }        // BLAKE3-256 of normalized content
    public string? CommitId { get; init; }                   // Git commit SHA if available
    public string? UpstreamCve { get; init; }                // Associated CVE
}

public sealed record HunkSignature
{
    public required int OldStart { get; init; }
    public required int NewStart { get; init; }
    public required string NormalizedContent { get; init; }  // Whitespace-stripped
    public required string ContentHash { get; init; }
}
```

### 3.2 Binary Fingerprint (Tier 4 Evidence)

```csharp
public sealed record BinaryFingerprint
{
    public required string Method { get; init; }        // tlsh, instruction_hash
    public required string Value { get; init; }         // Fingerprint value
    public required string TargetPath { get; init; }    // Binary file path
    public string? FunctionName { get; init; }          // Function if scoped
    public required string Architecture { get; init; }  // x86_64, aarch64, etc.
}

public sealed record FingerprintMatchResult
{
    public required decimal Similarity { get; init; }   // 0.0-1.0
    public required decimal Confidence { get; init; }   // 0.0-1.0
    public required string Method { get; init; }
    public required BinaryFingerprint Query { get; init; }
    public required BinaryFingerprint Match { get; init; }
}
```

### 3.3 Evidence Tier Confidence Levels

| Tier | Evidence Type | Confidence Range | Description |
|------|--------------|------------------|-------------|
| 1 | Distro Advisory | 0.95-0.98 | Official vendor/distro statement |
| 2 | Changelog Mention | 0.75-0.85 | CVE mentioned in changelog |
| 3 | Patch Signature (HunkSig) | 0.85-0.95 | Normalized patch hash match |
| 4 | Binary Fingerprint | 0.55-0.85 | Compiled code similarity |
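
The ranges in the table can be encoded as a small lookup so that a consumer can keep a raw score inside its tier's documented band. The sketch below is illustrative only: `EvidenceTierRanges` and `Clamp` are hypothetical names that do not appear in Feedser, and clamping is an assumed policy, not documented behaviour.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Feedser): encodes the confidence ranges from the table above.
public static class EvidenceTierRanges
{
    private static readonly IReadOnlyDictionary<int, (decimal Min, decimal Max)> Ranges =
        new Dictionary<int, (decimal Min, decimal Max)>
        {
            [1] = (0.95m, 0.98m), // Distro advisory
            [2] = (0.75m, 0.85m), // Changelog mention
            [3] = (0.85m, 0.95m), // Patch signature (HunkSig)
            [4] = (0.55m, 0.85m), // Binary fingerprint
        };

    // ASSUMPTION: out-of-range scores are clamped to the tier's band rather than rejected.
    public static decimal Clamp(int tier, decimal rawScore)
    {
        if (!Ranges.TryGetValue(tier, out var range))
            throw new ArgumentOutOfRangeException(nameof(tier), tier, "Unknown evidence tier.");
        return Math.Clamp(rawScore, range.Min, range.Max);
    }
}
```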

---

## 4) Core Components

### 4.1 HunkSigExtractor

Parses unified diff format and extracts normalized patch signatures:

```csharp
public interface IHunkSigExtractor
{
    PatchSignature Extract(string unifiedDiff, string? commitId = null);
    IReadOnlyList<PatchSignature> ExtractMultiple(string multiFileDiff);
}
```

**Normalization rules:**
- Strip leading/trailing whitespace
- Normalize line endings to LF
- Remove C-style comments (optional)
- Collapse multiple whitespace to single space
- Sort hunks by (file_path, old_start) for determinism
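
The normalization rules above can be sketched roughly as follows. This is not the `WhitespaceNormalizer` implementation: `HunkNormalizationSketch` and its members are hypothetical, dropping empty lines is an added assumption, the comment regexes are naive (they do not understand string literals), and SHA-256 stands in for BLAKE3-256 because BLAKE3 is not available in the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical sketch of the normalization rules; not Feedser's actual code.
public static class HunkNormalizationSketch
{
    private static readonly Regex BlockComment = new(@"/\*.*?\*/", RegexOptions.Singleline);
    private static readonly Regex LineComment = new(@"//[^\n]*"); // naive: also hits "//" inside strings
    private static readonly Regex WhitespaceRun = new(@"[ \t]+");

    public static string Normalize(string hunkText, bool stripComments = false)
    {
        // Normalize line endings to LF first so later steps see one convention.
        var text = hunkText.Replace("\r\n", "\n").Replace('\r', '\n');

        if (stripComments)
        {
            text = BlockComment.Replace(text, string.Empty);
            text = LineComment.Replace(text, string.Empty);
        }

        // Collapse whitespace runs, trim each line, and (ASSUMPTION) drop empty lines.
        var lines = text.Split('\n')
            .Select(line => WhitespaceRun.Replace(line, " ").Trim())
            .Where(line => line.Length > 0);

        return string.Join("\n", lines);
    }

    // Stable ordering by (file_path, old_start) using ordinal string comparison.
    public static IEnumerable<(string FilePath, int OldStart, string Content)> Order(
        IEnumerable<(string FilePath, int OldStart, string Content)> hunks) =>
        hunks.OrderBy(h => h.FilePath, StringComparer.Ordinal).ThenBy(h => h.OldStart);

    // SHA-256 used here as a stand-in for BLAKE3-256.
    public static string Hash(string normalized) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(normalized))).ToLowerInvariant();
}
```

Under these assumptions, two hunks that differ only in indentation or line endings normalize to the same string and therefore hash identically, which is the property the patch signature match relies on.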

### 4.2 BinaryFingerprintFactory

Factory for creating fingerprinters based on binary type and analysis requirements:

```csharp
public interface IBinaryFingerprintFactory
{
    IBinaryFingerprinter Create(FingerprintMethod method);
    IReadOnlyList<IBinaryFingerprinter> GetAll();
}

public interface IBinaryFingerprinter
{
    string Method { get; }
    BinaryFingerprint Extract(ReadOnlySpan<byte> binary, string path);
    FingerprintMatchResult Match(BinaryFingerprint query, BinaryFingerprint candidate);
}
```

**Fingerprinting methods:**

| Method | Description | Confidence | Use Case |
|--------|-------------|------------|----------|
| `tlsh` | TLSH fuzzy hash | 0.75-0.85 | General binary similarity |
| `instruction_hash` | Normalized instruction sequences | 0.55-0.75 | Function-level matching |
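
One plausible shape for the factory is a fixed registry keyed by method. The sketch below is an assumption throughout: the document references `FingerprintMethod` without defining it, so the enum members are guesses, and `IFingerprinterSketch`, `NamedFingerprinter`, and `FingerprintFactorySketch` are trimmed stand-ins that model only the `Method` property of `IBinaryFingerprinter`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ASSUMPTION: the document names FingerprintMethod but does not list its members.
public enum FingerprintMethod { Tlsh, InstructionHash }

// Trimmed stand-in for IBinaryFingerprinter, reduced to the Method property.
public interface IFingerprinterSketch
{
    string Method { get; }
}

public sealed class NamedFingerprinter : IFingerprinterSketch
{
    public NamedFingerprinter(string method) => Method = method;
    public string Method { get; }
}

// Hypothetical registry-style factory; not Feedser's BinaryFingerprintFactory.
public sealed class FingerprintFactorySketch
{
    private readonly IReadOnlyDictionary<FingerprintMethod, IFingerprinterSketch> _byMethod =
        new Dictionary<FingerprintMethod, IFingerprinterSketch>
        {
            [FingerprintMethod.Tlsh] = new NamedFingerprinter("tlsh"),
            [FingerprintMethod.InstructionHash] = new NamedFingerprinter("instruction_hash"),
        };

    public IFingerprinterSketch Create(FingerprintMethod method) =>
        _byMethod.TryGetValue(method, out var fingerprinter)
            ? fingerprinter
            : throw new NotSupportedException($"No fingerprinter registered for {method}.");

    // Ordered by method name so callers always iterate in the same order.
    public IReadOnlyList<IFingerprinterSketch> GetAll() =>
        _byMethod.Values.OrderBy(f => f.Method, StringComparer.Ordinal).ToList();
}
```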

---

## 5) Integration with Concelier

Feedser is consumed via `StellaOps.Concelier.ProofService.BackportProofService`:

```
BackportProofService (Concelier)
├─ Tier 1: Query advisory_observations (distro advisories)
├─ Tier 2: Query changelogs via ISourceRepository
├─ Tier 3: Query patches via IPatchRepository + HunkSigExtractor
├─ Tier 4: Query binaries + BinaryFingerprintFactory
└─ Aggregate → ProofBlob with combined confidence score
```

The ProofService orchestrates evidence collection across all tiers and produces cryptographic proof blobs for downstream consumption.
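
The document does not say how the per-tier scores are merged into the combined confidence score. Purely as an illustration, the sketch below uses a noisy-OR rule, which treats each piece of evidence as independent. That rule is an assumption, not the ProofService's documented behaviour, and `TierEvidence` and `ConfidenceAggregationSketch` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical carrier for one tier's result; not a Feedser or Concelier type.
public sealed record TierEvidence(int Tier, decimal Confidence);

public static class ConfidenceAggregationSketch
{
    // Noisy-OR: combined = 1 - product(1 - c_i).
    // ASSUMPTION: evidence items are independent; the real aggregation rule is not documented here.
    public static decimal Combine(IEnumerable<TierEvidence> evidence)
    {
        decimal missProbability = 1m;
        foreach (var item in evidence)
        {
            if (item.Confidence < 0m || item.Confidence > 1m)
                throw new ArgumentOutOfRangeException(nameof(evidence), "Confidence must be within 0.0-1.0.");
            missProbability *= 1m - item.Confidence;
        }
        // With no evidence at all, the combined confidence is 0.
        return 1m - missProbability;
    }
}
```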

---

## 6) Security & compliance

* **Determinism**: All outputs use canonical JSON with sorted keys, UTC timestamps
* **Tamper evidence**: BLAKE3-256 content hashes for all signatures
* **No secrets**: Library handles only public patch/binary data
* **Offline capable**: No network dependencies in core library
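
A minimal sketch of what "canonical JSON with sorted keys" can look like with `System.Text.Json`: object properties are rewritten in ordinal key order and the output is compact. `CanonicalJsonSketch` is a hypothetical name, and this is not a full RFC 8785 (JCS) implementation, since number and string formatting are left to the writer's defaults.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Json;

// Hypothetical helper; Feedser's actual canonicalizer is not shown in this document.
public static class CanonicalJsonSketch
{
    public static string Canonicalize(string json)
    {
        using var document = JsonDocument.Parse(json);
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream)) // default options: compact, no indentation
        {
            Write(document.RootElement, writer);
        } // disposing the writer flushes it into the stream
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    private static void Write(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                // Ordinal ordering keeps the result independent of culture settings.
                foreach (var property in element.EnumerateObject().OrderBy(p => p.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(property.Name);
                    Write(property.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                // Array order is significant and is preserved.
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray())
                    Write(item, writer);
                writer.WriteEndArray();
                break;
            default:
                element.WriteTo(writer); // strings, numbers, booleans, null
                break;
        }
    }
}
```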

---

## 7) Performance targets

* **Patch extraction**: < 10 ms for typical unified diff (< 1000 lines)
* **Binary fingerprinting**: < 100 ms for 10 MB ELF binary
* **Memory**: Streaming processing for large binaries; no full file buffering
* **Parallelism**: Thread-safe extractors; concurrent fingerprinting supported
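
The "no full file buffering" target can be illustrated with an incremental hash over a stream, reading fixed-size chunks from a pooled buffer. This is a sketch only: `StreamingHashSketch` is a hypothetical name, the 80 KiB default buffer is an arbitrary choice, and SHA-256 stands in for BLAKE3-256, which is not part of the .NET base class library.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; shows chunked hashing, not Feedser's fingerprinting code.
public static class StreamingHashSketch
{
    public static string HashStream(Stream input, int bufferSize = 81920)
    {
        // SHA-256 used here as a stand-in for BLAKE3-256.
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        try
        {
            int read;
            // Only one chunk is held in memory at a time, regardless of input size.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                hasher.AppendData(buffer, 0, read);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
        return Convert.ToHexString(hasher.GetHashAndReset()).ToLowerInvariant();
    }
}
```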

---

## 8) Observability

Library consumers (ProofService) emit metrics:

* `feedser.hunk_extraction_duration_seconds`
* `feedser.binary_fingerprint_duration_seconds`
* `feedser.fingerprint_match_score{method}`
* `feedser.evidence_tier_confidence{tier}`
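
A consumer could emit two of these metrics with `System.Diagnostics.Metrics` along the following lines. The metric names come from the list above; everything else is assumed: the meter name `StellaOps.Feedser.Sketch`, the choice of histogram instruments, and the `FeedserMetricsSketch` wrapper itself.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical wrapper; instrument kinds and meter name are assumptions.
public sealed class FeedserMetricsSketch : IDisposable
{
    private readonly Meter _meter = new("StellaOps.Feedser.Sketch");
    private readonly Histogram<double> _hunkExtraction;
    private readonly Histogram<double> _matchScore;

    public FeedserMetricsSketch()
    {
        _hunkExtraction = _meter.CreateHistogram<double>(
            "feedser.hunk_extraction_duration_seconds", unit: "s");
        _matchScore = _meter.CreateHistogram<double>("feedser.fingerprint_match_score");
    }

    // Runs the extraction delegate and records its wall-clock duration in seconds.
    public T TimeHunkExtraction<T>(Func<T> extract)
    {
        long start = Stopwatch.GetTimestamp();
        try
        {
            return extract();
        }
        finally
        {
            _hunkExtraction.Record(Stopwatch.GetElapsedTime(start).TotalSeconds);
        }
    }

    // The {method} label from the metric list becomes a tag on the measurement.
    public void RecordMatchScore(string method, double score) =>
        _matchScore.Record(score, new KeyValuePair<string, object?>("method", method));

    public void Dispose() => _meter.Dispose();
}
```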

---

## 9) Testing matrix

* **Unit tests**: HunkSigExtractor parsing, normalization edge cases
* **Fingerprint tests**: Known binary pairs with expected similarity scores
* **Determinism tests**: Same input produces identical output across runs
* **Performance tests**: Large diff/binary processing within targets
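
A determinism test can be reduced to running the same producer several times and comparing the serialized results byte for byte. The helper below is a hypothetical sketch (`DeterminismCheckSketch` is not part of the test project); a real test would pass a delegate that extracts a signature and serializes it to canonical JSON.

```csharp
using System;

// Hypothetical test helper; framework-agnostic so it can sit behind any assertion library.
public static class DeterminismCheckSketch
{
    // Returns true when every run of the producer yields exactly the same string.
    public static bool IsDeterministic(Func<string> produce, int runs = 5)
    {
        if (runs < 2)
            throw new ArgumentOutOfRangeException(nameof(runs), "Need at least two runs to compare.");

        string first = produce();
        for (int i = 1; i < runs; i++)
        {
            // Ordinal comparison: any differing character counts as non-deterministic.
            if (!string.Equals(first, produce(), StringComparison.Ordinal))
                return false;
        }
        return true;
    }
}
```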

---

## 10) Historical note

Concelier was formerly named "Feedser" (see `docs/airgap/airgap-mode.md`). The module was refactored:
- **Feedser** retained as evidence collection library
- **Concelier** became the advisory aggregation service consuming Feedser

---

## Related Documentation

* Concelier architecture: `../concelier/architecture.md`
* Attestor ProofChain: `../attestor/architecture.md`
* Backport proof system: `../../reachability/backport-proofs.md`