# Corpus Contribution Guide

Sprint: SPRINT_3500_0003_0001
Task: CORPUS-014 - Document corpus contribution guide

## Overview
The Ground-Truth Corpus is a collection of validated test samples used to measure scanner accuracy. Each sample has known reachability status and expected findings, enabling deterministic quality metrics.
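To make "deterministic quality metrics" concrete, here is a minimal sketch of how expected findings could be matched against scanner output. This is illustrative only; the real scoring lives in the benchmark tooling, and the `score_sample` helper is hypothetical:

```python
# Hypothetical helper: match scanner output against expected.json by
# vuln_key and report simple precision/recall for one sample.
import json


def score_sample(expected_path: str, actual_findings: list[dict]) -> dict:
    with open(expected_path) as f:
        expected = {e["vuln_key"] for e in json.load(f)["findings"]}
    actual = {a["vuln_key"] for a in actual_findings}
    matched = len(expected & actual)
    return {
        "precision": matched / len(actual) if actual else 1.0,
        "recall": matched / len(expected) if expected else 1.0,
    }
```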
## Corpus Structure

```text
datasets/reachability/
├── corpus.json                  # Index of all samples
├── schemas/
│   └── corpus-sample.v1.json    # JSON schema for samples
├── samples/
│   ├── gt-0001/                 # Sample directory
│   │   ├── sample.json          # Sample metadata
│   │   ├── expected.json        # Expected findings
│   │   ├── sbom.json            # Input SBOM
│   │   └── source/              # Optional source files
│   └── ...
└── baselines/
    └── v1.0.0.json              # Baseline metrics
```
## Sample Format

### sample.json

```json
{
  "id": "gt-0001",
  "name": "Python SQL Injection - Reachable",
  "description": "Flask app with reachable SQL injection via user input",
  "language": "python",
  "ecosystem": "pypi",
  "scenario": "webapi",
  "entrypoints": ["app.py:main"],
  "reachability_tier": "tainted_sink",
  "created_at": "2025-01-15T00:00:00Z",
  "author": "security-team",
  "tags": ["sql-injection", "flask", "reachable"]
}
```
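Before submitting, you can check a sample.json against the schema shipped in schemas/corpus-sample.v1.json. A minimal sketch, assuming the third-party `jsonschema` package is available:

```python
# Validate a sample's metadata against the corpus schema.
# Assumes: pip install jsonschema
import json

from jsonschema import ValidationError, validate

with open("datasets/reachability/schemas/corpus-sample.v1.json") as f:
    schema = json.load(f)
with open("datasets/reachability/samples/gt-0001/sample.json") as f:
    sample = json.load(f)

try:
    validate(instance=sample, schema=schema)
    print("sample.json conforms to corpus-sample.v1.json")
except ValidationError as err:
    print(f"schema violation: {err.message}")
```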
### expected.json

```json
{
  "findings": [
    {
      "vuln_key": "CVE-2024-1234:pkg:pypi/sqlalchemy@1.4.0",
      "tier": "tainted_sink",
      "rule_key": "py.sql.injection.param_concat",
      "sink_class": "sql",
      "location_hint": "app.py:42"
    }
  ]
}
```
## Contributing a Sample

### Step 1: Choose a Scenario

Select a scenario that is not well covered in the corpus:

| Scenario | Description | Example |
|---|---|---|
| `webapi` | Web application endpoint | Flask, FastAPI, Express |
| `cli` | Command-line tool | argparse, click, commander |
| `job` | Background/scheduled job | Celery, cron script |
| `lib` | Library code | Reusable package |
### Step 2: Create Sample Directory

```bash
cd datasets/reachability/samples
mkdir gt-NNNN
cd gt-NNNN
```
Use the next available sample ID (check corpus.json for the highest).
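A small helper can derive that ID programmatically. This sketch assumes corpus.json keeps its entries in a top-level `samples` array; adjust the key to whatever the index actually uses:

```python
# Sketch: compute the next free gt-NNNN sample ID from the index.
# Assumption: entries live in a top-level "samples" array in corpus.json.
import json

with open("datasets/reachability/corpus.json") as f:
    corpus = json.load(f)

highest = max(int(entry["id"].removeprefix("gt-")) for entry in corpus["samples"])
print(f"next available id: gt-{highest + 1:04d}")
```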
### Step 3: Create Minimal Reproducible Case

Requirements:

- Smallest possible code to demonstrate the vulnerability
- Real or realistic vulnerability (use a CVE when possible)
- Clear entrypoint definition
- Deterministic behavior (no network, no randomness)

Example Python sample:
```python
# app.py - gt-0001
from flask import Flask, request
import sqlite3

app = Flask(__name__)


@app.route("/user")
def get_user():
    user_id = request.args.get("id")  # Taint source
    conn = sqlite3.connect(":memory:")
    # SQL injection: user_id flows to query without sanitization
    result = conn.execute(f"SELECT * FROM users WHERE id = {user_id}")  # Taint sink
    return str(result.fetchall())


if __name__ == "__main__":
    app.run()
```
### Step 4: Define Expected Findings

Create expected.json with all expected findings:

```json
{
  "findings": [
    {
      "vuln_key": "CWE-89:pkg:pypi/flask@2.0.0",
      "tier": "tainted_sink",
      "rule_key": "py.sql.injection",
      "sink_class": "sql",
      "location_hint": "app.py:13",
      "notes": "User input from request.args flows to sqlite3.execute"
    }
  ]
}
```
### Step 5: Create SBOM

Generate or create an SBOM for the sample:

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "flask",
      "version": "2.0.0",
      "purl": "pkg:pypi/flask@2.0.0"
    },
    {
      "type": "library",
      "name": "sqlite3",
      "version": "3.39.0",
      "purl": "pkg:pypi/sqlite3@3.39.0"
    }
  ]
}
```
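If no SBOM generator is handy, a document like the one above is small enough to emit directly. A minimal sketch that writes the Flask component shown above:

```python
# Sketch: write a minimal CycloneDX 1.6 SBOM for the sample's dependencies.
# Extend the components list to cover everything the sample imports.
import json

sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "version": 1,
    "components": [
        {
            "type": "library",
            "name": "flask",
            "version": "2.0.0",
            "purl": "pkg:pypi/flask@2.0.0",
        },
    ],
}

with open("sbom.json", "w") as f:
    json.dump(sbom, f, indent=2)
```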
### Step 6: Update Corpus Index

Add an entry to corpus.json:

```json
{
  "id": "gt-0001",
  "path": "samples/gt-0001",
  "language": "python",
  "tier": "tainted_sink",
  "scenario": "webapi",
  "expected_count": 1
}
```
### Step 7: Validate Locally

```bash
# Run corpus validation
dotnet test tests/reachability/StellaOps.Reachability.FixtureTests \
  --filter "FullyQualifiedName~CorpusFixtureTests"

# Run benchmark
stellaops bench corpus run --sample gt-0001 --verbose
```
## Tier Guidelines

### Imported Tier Samples

For imported tier samples:

- Vulnerability is in a dependency
- No execution path reaches the vulnerable code
- Package is in the lockfile but never called

Example: Unused dependency with a known CVE.
### Executed Tier Samples

For executed tier samples:

- Vulnerable code is called from the entrypoint
- No user-controlled data reaches the vulnerability
- Static or coverage analysis proves execution

Example: Hardcoded SQL query (no injection).
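For contrast with the tainted-sink app in Step 3, an executed-tier sample might look like the following sketch: the SQL sink is reached from the entrypoint, but the query is constant, so no taint flows into it. The table setup is only there to keep the sketch runnable and deterministic:

```python
# Executed-tier sketch: execute() is reached from the entrypoint,
# but the query is hardcoded, so there is no source -> sink taint flow.
import sqlite3


def report() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    # Executed, but constant: no user-controlled data in the query.
    return conn.execute("SELECT * FROM users WHERE id = 1").fetchall()


if __name__ == "__main__":
    print(report())
```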
### Tainted→Sink Tier Samples

For tainted_sink tier samples:

- User-controlled input reaches the vulnerable code
- Clear source → sink data flow
- Include the sink class (see the taxonomy below)

Example: User input to SQL query, command execution, etc.
## Sink Classes

When contributing tainted_sink samples, specify the sink class:

| Sink Class | Description | Examples |
|---|---|---|
| `sql` | SQL injection | `sqlite3.execute`, `cursor.execute` |
| `command` | Command injection | `os.system`, `subprocess.run` |
| `ssrf` | Server-side request forgery | `requests.get`, `urllib.urlopen` |
| `path` | Path traversal | `open()`, `os.path.join` |
| `deser` | Deserialization | `pickle.loads`, `yaml.load` |
| `eval` | Code evaluation | `eval()`, `exec()` |
| `xxe` | XML external entity | `lxml.parse`, `ET.parse` |
| `xss` | Cross-site scripting | `innerHTML`, `document.write` |
## Quality Criteria

Samples must meet these criteria:

- **Deterministic**: Same input → same output (see the spot-check sketch after this list)
- **Minimal**: Smallest code to demonstrate
- **Documented**: Clear description and notes
- **Validated**: Passes local tests
- **Realistic**: Based on real vulnerability patterns
- **Self-contained**: No external network calls
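As a spot-check for the Deterministic criterion, you can run the documented benchmark command twice and compare the outputs. A hedged sketch, assuming `stellaops` is on PATH and that its stdout is the output of interest:

```python
# Sketch: run the benchmark twice and compare output hashes to catch
# obvious nondeterminism (timestamps, ordering, randomness).
import hashlib
import subprocess


def run_once(sample_id: str) -> str:
    result = subprocess.run(
        ["stellaops", "bench", "corpus", "run", "--sample", sample_id],
        capture_output=True, text=True, check=True,
    )
    return hashlib.sha256(result.stdout.encode()).hexdigest()


if run_once("gt-0001") == run_once("gt-0001"):
    print("outputs match: sample looks deterministic")
else:
    print("outputs differ: investigate nondeterminism")
```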
## Negative Samples

Include "negative" samples where the scanner should NOT report any findings:

```json
{
  "id": "gt-0050",
  "name": "Python SQL - Properly Sanitized",
  "tier": "imported",
  "expected_count": 0,
  "notes": "Uses parameterized queries, no injection possible"
}
```
## Review Process

1. Create a PR with the new sample(s)
2. CI runs validation tests
3. Security team reviews the expected findings
4. QA team verifies determinism
5. Merge and update the baseline
## Updating Baselines

After adding samples, update the baseline metrics:

```bash
# Generate new baseline
stellaops bench corpus run --all --output baselines/v1.1.0.json

# Compare to previous
stellaops bench corpus compare baselines/v1.0.0.json baselines/v1.1.0.json
```
## FAQ

### How many samples should I contribute?

Start with 2-3 high-quality samples covering different aspects of the same vulnerability class.

### Can I use synthetic vulnerabilities?

Yes, but prefer real CVE patterns when possible. Synthetic samples should document the vulnerability pattern clearly.

### What if my sample has multiple findings?

Include all expected findings in expected.json. Multi-finding samples are valuable for testing.

### How do I test tier classification?

Run with verbose output:

```bash
stellaops bench corpus run --sample gt-NNNN --verbose --show-evidence
```