# component_architecture_feedser.md - **Stella Ops Feedser** (2025Q4)

> Evidence collection library for backport detection and binary fingerprinting.

> **Scope.** Library architecture for **Feedser**: patch signature extraction, binary fingerprinting, and evidence collection supporting the four-tier backport proof system. Consumed primarily by Concelier's ProofService layer.

---

## 0) Mission & boundaries

**Mission.** Provide deterministic, cryptographic evidence collection for backport detection. Extract patch signatures from unified diffs and binary fingerprints from compiled code to enable high-confidence vulnerability status determination for packages where upstream fixes have been backported by distro maintainers.

**Boundaries.**

* Feedser is a **library**, not a standalone service. It does not expose REST APIs directly.
* Feedser **does not** make vulnerability decisions. It provides evidence that feeds into VEX statements and Policy Engine evaluation.
* Feedser **does not** store data. Storage is handled by consuming services (Concelier ProofService, Attestor).
* All outputs are **deterministic** with canonical JSON serialization and stable hashing.

---

## 1) Solution & project layout

```
src/Feedser/
├─ StellaOps.Feedser.Core/                  # Patch signature extraction (HunkSig)
│  ├─ HunkSigExtractor.cs                   # Unified diff parser and normalizer
│  ├─ Models/
│  │  ├─ PatchSignature.cs                  # Deterministic patch identifier
│  │  ├─ HunkSignature.cs                   # Individual hunk with normalized content
│  │  └─ DiffParseResult.cs                 # Parse output with file paths and hunks
│  └─ Normalization/
│     └─ WhitespaceNormalizer.cs            # Whitespace/comment stripping
│
├─ StellaOps.Feedser.BinaryAnalysis/        # Binary fingerprinting engine
│  ├─ BinaryFingerprintFactory.cs           # Factory for fingerprinting strategies
│  ├─ IBinaryFingerprinter.cs               # Fingerprinter interface
│  ├─ Models/
│  │  ├─ BinaryFingerprint.cs               # Fingerprint record with method/value
│  │  └─ FingerprintMatchResult.cs          # Match score and confidence
│  └─ Fingerprinters/
│     ├─ SimplifiedTlshFingerprinter.cs     # TLSH fuzzy hashing
│     └─ InstructionHashFingerprinter.cs    # Instruction sequence hashing
│
├─ plugins/
│  └─ concelier/                            # Concelier integration plugin
│
└─ __Tests/
   └─ StellaOps.Feedser.Core.Tests/         # Unit tests
```

---

## 2) External dependencies

* **Concelier ProofService** - Primary consumer; orchestrates four-tier evidence collection
* **Attestor ProofChain** - Consumes evidence for proof blob generation
* **.NET 10** - Runtime target
* No database dependencies (stateless library)
* No external network dependencies

---

## 3) Contracts & data model

### 3.1 Patch Signature (Tier 3 Evidence)

```csharp
public sealed record PatchSignature
{
    public required string Id { get; init; }                 // Deterministic SHA256
    public required string FilePath { get; init; }           // Source file path
    public required IReadOnlyList<HunkSignature> Hunks { get; init; }
    public required string ContentHash { get; init; }        // BLAKE3-256 of normalized content
    public string? CommitId { get; init; }                   // Git commit SHA if available
    public string? UpstreamCve { get; init; }                // Associated CVE
}

public sealed record HunkSignature
{
    public required int OldStart { get; init; }
    public required int NewStart { get; init; }
    public required string NormalizedContent { get; init; }  // Whitespace-stripped
    public required string ContentHash { get; init; }
}
```
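
The records above say that `Id` is a deterministic SHA256 but do not say what is hashed. The following is a minimal sketch under an assumed recipe: the file path and the hunk content hashes, in the order given, joined by line feeds. `PatchIdSketch`, `ComputeId`, and the `sha256:` prefix are hypothetical names chosen for illustration, not Feedser's actual API. Because the order of the hunk hashes changes the digest, callers would need to sort hunks before calling it.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: one possible way to derive a stable patch identifier.
public static class PatchIdSketch
{
    // Assumed recipe (not confirmed by the source): SHA-256 over the file path
    // followed by each hunk content hash, one item per LF-terminated line.
    public static string ComputeId(string filePath, IEnumerable<string> hunkContentHashes)
    {
        var builder = new StringBuilder();
        builder.Append(filePath).Append('\n');
        foreach (string hash in hunkContentHashes)
        {
            builder.Append(hash).Append('\n');
        }

        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```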

### 3.2 Binary Fingerprint (Tier 4 Evidence)

```csharp
public sealed record BinaryFingerprint
{
    public required string Method { get; init; }        // tlsh, instruction_hash
    public required string Value { get; init; }         // Fingerprint value
    public required string TargetPath { get; init; }    // Binary file path
    public string? FunctionName { get; init; }          // Function if scoped
    public required string Architecture { get; init; }  // x86_64, aarch64, etc.
}

public sealed record FingerprintMatchResult
{
    public required decimal Similarity { get; init; }   // 0.0-1.0
    public required decimal Confidence { get; init; }   // 0.0-1.0
    public required string Method { get; init; }
    public required BinaryFingerprint Query { get; init; }
    public required BinaryFingerprint Match { get; init; }
}
```

### 3.3 Evidence Tier Confidence Levels

| Tier | Evidence Type | Confidence Range | Description |
|------|--------------|------------------|-------------|
| 1 | Distro Advisory | 0.95-0.98 | Official vendor/distro statement |
| 2 | Changelog Mention | 0.75-0.85 | CVE mentioned in changelog |
| 3 | Patch Signature (HunkSig) | 0.85-0.95 | Normalized patch hash match |
| 4 | Binary Fingerprint | 0.55-0.85 | Compiled code similarity |
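
The ranges in the table can be encoded as data so that a consumer can sanity-check a reported confidence against its tier. This is a sketch only: `EvidenceTierSketch` and `IsWithinTierRange` are hypothetical names, and treating both bounds as inclusive is an assumption. `decimal` is used to match the `Similarity` and `Confidence` properties in 3.2.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup of the tier ranges listed in the table above.
public static class EvidenceTierSketch
{
    private static readonly IReadOnlyDictionary<int, (decimal Min, decimal Max)> Ranges =
        new Dictionary<int, (decimal Min, decimal Max)>
        {
            [1] = (0.95m, 0.98m), // Distro advisory
            [2] = (0.75m, 0.85m), // Changelog mention
            [3] = (0.85m, 0.95m), // Patch signature (HunkSig)
            [4] = (0.55m, 0.85m), // Binary fingerprint
        };

    // Returns false for unknown tiers; bounds are treated as inclusive (assumption).
    public static bool IsWithinTierRange(int tier, decimal confidence) =>
        Ranges.TryGetValue(tier, out var range)
            && confidence >= range.Min
            && confidence <= range.Max;
}
```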

---

## 4) Core Components

### 4.1 HunkSigExtractor

Parses unified diff format and extracts normalized patch signatures:

```csharp
public interface IHunkSigExtractor
{
    PatchSignature Extract(string unifiedDiff, string? commitId = null);
    IReadOnlyList<PatchSignature> ExtractMultiple(string multiFileDiff);
}
```

**Normalization rules:**
- Strip leading/trailing whitespace
- Normalize line endings to LF
- Remove C-style comments (optional)
- Collapse runs of whitespace to a single space
- Sort hunks by (file_path, old_start) for determinism
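
The normalization rules above can be sketched as follows. This is an illustration under stated assumptions, not the actual `WhitespaceNormalizer`: `HunkNormalizationSketch` and its members are hypothetical, whitespace is trimmed per line, the comment stripping is a naive regex that does not understand string literals, and SHA-256 stands in for BLAKE3-256 because BLAKE3 is not part of the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the listed normalization rules.
public static class HunkNormalizationSketch
{
    private static readonly Regex BlockComment = new(@"/\*.*?\*/", RegexOptions.Singleline);
    private static readonly Regex LineComment = new(@"//[^\n]*"); // naive: also hits "//" inside strings
    private static readonly Regex WhitespaceRun = new(@"[ \t]+");

    public static string Normalize(string hunkBody, bool stripComments = false)
    {
        // Normalize line endings to LF first so later steps see one convention.
        string text = hunkBody.Replace("\r\n", "\n").Replace('\r', '\n');

        if (stripComments)
        {
            text = BlockComment.Replace(text, string.Empty);
            text = LineComment.Replace(text, string.Empty);
        }

        // Collapse whitespace runs, then trim each line.
        IEnumerable<string> lines = text.Split('\n')
            .Select(line => WhitespaceRun.Replace(line, " ").Trim());

        // Drop leading/trailing empty lines left over from the input.
        return string.Join("\n", lines).Trim('\n');
    }

    // Stand-in digest: SHA-256 instead of BLAKE3-256 (assumption, see lead-in).
    public static string Hash(string normalized) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(normalized))).ToLowerInvariant();

    // Ordinal comparison keeps the order independent of the current culture.
    public static IReadOnlyList<(string FilePath, int OldStart)> OrderHunks(
        IEnumerable<(string FilePath, int OldStart)> hunks) =>
        hunks.OrderBy(h => h.FilePath, StringComparer.Ordinal)
             .ThenBy(h => h.OldStart)
             .ToList();
}
```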

### 4.2 BinaryFingerprintFactory

Factory for creating fingerprinters based on binary type and analysis requirements:

```csharp
public interface IBinaryFingerprintFactory
{
    IBinaryFingerprinter Create(FingerprintMethod method);
    IReadOnlyList<IBinaryFingerprinter> GetAll();
}

public interface IBinaryFingerprinter
{
    string Method { get; }
    BinaryFingerprint Extract(ReadOnlySpan<byte> binary, string path);
    FingerprintMatchResult Match(BinaryFingerprint query, BinaryFingerprint candidate);
}
```

**Fingerprinting methods:**

| Method | Description | Confidence | Use Case |
|--------|-------------|------------|----------|
| `tlsh` | TLSH fuzzy hash | 0.75-0.85 | General binary similarity |
| `instruction_hash` | Normalized instruction sequences | 0.55-0.75 | Function-level matching |
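
One way a factory with the `Create`/`GetAll` shape could resolve fingerprinters is a keyed lookup, sketched below. Everything here is assumed for illustration: the source does not define `FingerprintMethod`, so the enum members are guesses based on the method names in the table, and `IFingerprinterSketch`, `NamedFingerprinter`, and `FingerprintFactorySketch` are trimmed-down stand-ins rather than the real types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed enum members; the source names only the strings `tlsh` and `instruction_hash`.
public enum FingerprintMethod { Tlsh, InstructionHash }

// Stand-in for IBinaryFingerprinter, reduced to what the lookup needs.
public interface IFingerprinterSketch
{
    FingerprintMethod Kind { get; }
    string Method { get; }
}

public sealed record NamedFingerprinter(FingerprintMethod Kind, string Method) : IFingerprinterSketch;

public sealed class FingerprintFactorySketch
{
    private readonly SortedDictionary<FingerprintMethod, IFingerprinterSketch> _byKind = new();

    public FingerprintFactorySketch(IEnumerable<IFingerprinterSketch> fingerprinters)
    {
        foreach (var fingerprinter in fingerprinters)
        {
            _byKind[fingerprinter.Kind] = fingerprinter; // last registration wins
        }
    }

    public IFingerprinterSketch Create(FingerprintMethod method) =>
        _byKind.TryGetValue(method, out var fingerprinter)
            ? fingerprinter
            : throw new ArgumentOutOfRangeException(nameof(method), method, "No fingerprinter registered.");

    // SortedDictionary enumerates in key order, so the result order is stable.
    public IReadOnlyList<IFingerprinterSketch> GetAll() => _byKind.Values.ToList();
}
```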

---

## 5) Integration with Concelier

Feedser is consumed via `StellaOps.Concelier.ProofService.BackportProofService`:

```
BackportProofService (Concelier)
  ├─ Tier 1: Query advisory_observations (distro advisories)
  ├─ Tier 2: Query changelogs via ISourceRepository
  ├─ Tier 3: Query patches via IPatchRepository + HunkSigExtractor
  ├─ Tier 4: Query binaries + BinaryFingerprintFactory
  └─ Aggregate → ProofBlob with combined confidence score
```

The ProofService orchestrates evidence collection across all tiers and produces cryptographic proof blobs for downstream consumption.

---

## 6) Security & compliance

* **Determinism**: All outputs use canonical JSON with sorted keys, UTC timestamps
* **Tamper evidence**: BLAKE3-256 content hashes for all signatures
* **No secrets**: Library handles only public patch/binary data
* **Offline capable**: No network dependencies in core library
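
The determinism bullet can be illustrated with `System.Text.Json`. This is a minimal sketch, assuming a flat set of fields: `CanonicalJsonSketch` is a hypothetical helper, nested objects are not re-sorted, and the timestamp format shown is one reasonable choice rather than the format Feedser necessarily uses.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper: sorted keys and explicit UTC timestamps for stable output.
public static class CanonicalJsonSketch
{
    public static string Serialize(IDictionary<string, object> fields)
    {
        // Ordinal key order is independent of culture and insertion order.
        // Only top-level keys are sorted; nested objects would need the same treatment.
        var sorted = new SortedDictionary<string, object>(fields, StringComparer.Ordinal);
        return JsonSerializer.Serialize(sorted);
    }

    // Convert to UTC and format explicitly so the offset of the input does not leak into the output.
    public static string FormatUtc(DateTimeOffset timestamp) =>
        timestamp.UtcDateTime.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
}
```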

---

## 7) Performance targets

* **Patch extraction**: < 10ms for typical unified diff (< 1000 lines)
* **Binary fingerprinting**: < 100ms for 10MB ELF binary
* **Memory**: Streaming processing for large binaries; no full file buffering
* **Parallelism**: Thread-safe extractors; concurrent fingerprinting supported
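
The memory target (streaming, no full file buffering) could be met for hashing with `IncrementalHash` and a pooled buffer, as in the sketch below. `StreamingHashSketch` is a hypothetical helper, SHA-256 again stands in for BLAKE3-256, and the default buffer size is an arbitrary choice. The method keeps no shared state, so concurrent calls on different streams do not interfere with each other.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: hashes a stream chunk by chunk instead of loading it whole.
public static class StreamingHashSketch
{
    public static string HashStream(Stream input, int bufferSize = 81920)
    {
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize); // may be larger than requested
        try
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                hasher.AppendData(buffer, 0, read); // feed only the bytes actually read
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }

        return Convert.ToHexString(hasher.GetHashAndReset()).ToLowerInvariant();
    }
}
```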

---

## 8) Observability

Library consumers (ProofService) emit metrics:

* `feedser.hunk_extraction_duration_seconds`
* `feedser.binary_fingerprint_duration_seconds`
* `feedser.fingerprint_match_score{method}`
* `feedser.evidence_tier_confidence{tier}`
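
A consumer might record two of these metrics with `System.Diagnostics.Metrics`, as sketched here. The instrument names come from the list above; the meter name, the choice of histograms, and the `FeedserMetricsSketch` wrapper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical consumer-side wrapper around two of the listed metrics.
public static class FeedserMetricsSketch
{
    // Meter name is an assumption, not taken from the source.
    private static readonly Meter FeedserMeter = new("StellaOps.Feedser.Sketch");

    private static readonly Histogram<double> HunkExtractionDuration =
        FeedserMeter.CreateHistogram<double>("feedser.hunk_extraction_duration_seconds", unit: "s");

    private static readonly Histogram<double> MatchScore =
        FeedserMeter.CreateHistogram<double>("feedser.fingerprint_match_score");

    // Times the supplied extraction call and records the elapsed seconds, even if it throws.
    public static T TimeHunkExtraction<T>(Func<T> extract)
    {
        long start = Stopwatch.GetTimestamp();
        try
        {
            return extract();
        }
        finally
        {
            HunkExtractionDuration.Record(Stopwatch.GetElapsedTime(start).TotalSeconds);
        }
    }

    // The {method} label from the metric list is passed as a tag.
    public static void RecordMatchScore(string method, double similarity) =>
        MatchScore.Record(similarity, new KeyValuePair<string, object?>("method", method));
}
```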

---

## 9) Testing matrix

* **Unit tests**: HunkSigExtractor parsing, normalization edge cases
* **Fingerprint tests**: Known binary pairs with expected similarity scores
* **Determinism tests**: Same input produces identical output across runs
* **Performance tests**: Large diff/binary processing within targets

---

## 10) Historical note

Concelier was formerly named "Feedser" (see `docs/airgap/airgap-mode.md`). The module was refactored:
- **Feedser** retained as evidence collection library
- **Concelier** became the advisory aggregation service consuming Feedser

---

## Related Documentation

* Concelier architecture: `../concelier/architecture.md`
* Attestor ProofChain: `../attestor/architecture.md`
* Backport proof system: `../../reachability/backport-proofs.md`