up

This commit is contained in:
StellaOps Bot
2025-11-27 23:44:42 +02:00
parent ef6e4b2067
commit 3b96b2e3ea
298 changed files with 47516 additions and 1168 deletions

@@ -0,0 +1,154 @@
# VEX Raw Migration Rollback Guide
This document describes how to roll back migrations applied to the `vex_raw` collection.
## Migration: 20251127-vex-raw-idempotency-indexes
### Description
Adds unique idempotency indexes to enforce content-addressed storage:
- `idx_provider_sourceUri_digest_unique`: Prevents duplicate documents from same provider/source
- `idx_digest_providerId`: Optimizes evidence queries by digest
- `idx_retrievedAt`: Supports time-based queries and future TTL operations
### Rollback Steps
#### Option 1: MongoDB Shell
```javascript
// Run these commands inside a mongosh session connected to the excititor database,
// e.g. started from a terminal with: mongosh "mongodb://localhost:27017/excititor"
// Drop the idempotency indexes
db.vex_raw.dropIndex("idx_provider_sourceUri_digest_unique")
db.vex_raw.dropIndex("idx_digest_providerId")
db.vex_raw.dropIndex("idx_retrievedAt")
// Verify indexes are dropped
db.vex_raw.getIndexes()
```
#### Option 2: Programmatic Rollback (C#)
```csharp
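using MongoDB.Driver;
using StellaOps.Excititor.Storage.Mongo.Migrations;
// Connect and get the database instance (the connection string is an example)
var client = new MongoClient("mongodb://localhost:27017");
var database = client.GetDatabase("excititor");
// Execute rollback (cancellationToken is supplied by the caller, e.g. CancellationToken.None)
await database.RollbackIdempotencyIndexesAsync(cancellationToken);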
// Verify rollback
var verified = await database.VerifyIdempotencyIndexesExistAsync(cancellationToken);
Console.WriteLine($"Indexes exist after rollback: {verified}"); // Should be false
```
#### Option 3: MongoDB Compass
1. Connect to your MongoDB instance
2. Navigate to the `excititor` database
3. Select the `vex_raw` collection
4. Go to the "Indexes" tab
5. Click "Drop Index" for each of:
   - `idx_provider_sourceUri_digest_unique`
   - `idx_digest_providerId`
   - `idx_retrievedAt`
### Impact of Rollback
**Before rollback (indexes present):**
- Documents are prevented from being duplicated
- Evidence queries are optimized
- Unique constraint enforced
**After rollback (indexes dropped):**
- Duplicate documents may be inserted
- Evidence queries may be slower
- No unique constraint enforcement
### Re-applying the Migration
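To re-apply the migration after rollback, recreate the indexes. If duplicate documents were inserted while the indexes were absent, the unique index build fails with a duplicate-key error until those duplicates are removed: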
```javascript
// MongoDB shell
db.vex_raw.createIndex(
  { "providerId": 1, "sourceUri": 1, "digest": 1 },
  { unique: true, name: "idx_provider_sourceUri_digest_unique", background: true }
)
db.vex_raw.createIndex(
  { "digest": 1, "providerId": 1 },
  { name: "idx_digest_providerId", background: true }
)
db.vex_raw.createIndex(
  { "retrievedAt": 1 },
  { name: "idx_retrievedAt", background: true }
)
```
Or run the migration runner:
```bash
stellaops excititor migrate --run 20251127-vex-raw-idempotency-indexes
```
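Before re-applying, it can help to check whether any duplicates crept in while the unique index was absent. The following is a minimal sketch, not part of the migration tooling: `duplicateKeyPipeline` and `findDuplicateKeys` are hypothetical helper names, and the key fields are taken from the index definition above.

```javascript
// Fields covered by idx_provider_sourceUri_digest_unique, in index order.
const KEY_FIELDS = ["providerId", "sourceUri", "digest"];

// Aggregation pipeline that lists key combinations occurring more than once.
// Intended use in mongosh: db.vex_raw.aggregate(duplicateKeyPipeline())
function duplicateKeyPipeline() {
  const groupId = {};
  for (const field of KEY_FIELDS) groupId[field] = "$" + field;
  return [
    { $group: { _id: groupId, count: { $sum: 1 }, ids: { $push: "$_id" } } },
    { $match: { count: { $gt: 1 } } },
  ];
}

// In-memory equivalent for an array of exported documents.
// Returns one array of _id values per duplicated key.
function findDuplicateKeys(docs) {
  const seen = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(KEY_FIELDS.map((f) => doc[f]));
    const ids = seen.get(key) || [];
    ids.push(doc._id);
    seen.set(key, ids);
  }
  return [...seen.values()].filter((ids) => ids.length > 1);
}
```

An empty result from either helper suggests the unique index can be rebuilt without a duplicate-key failure.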
## Migration: 20251125-vex-raw-json-schema
### Description
Adds a JSON Schema validator to the `vex_raw` collection with `validationAction: warn`.
### Rollback Steps
```javascript
// MongoDB shell - remove the validator
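// validationAction only accepts "error" or "warn", so it is left unchanged;
// validationLevel "off" is what disables validation
db.runCommand({
  collMod: "vex_raw",
  validator: {},
  validationLevel: "off"
})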
// Verify validator is removed
db.getCollectionInfos({ name: "vex_raw" })[0].options
```
### Impact of Rollback
- Documents will no longer be validated against the schema
- Invalid documents may be inserted
- Existing documents are not affected
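### Re-applying the Migration
The validator can be re-attached with another `collMod` command. The sketch below only builds that command document; `buildReapplyValidatorCommand` is a hypothetical helper, `validationAction: "warn"` comes from the description above, and `validationLevel: "strict"` is an assumption (it is MongoDB's default, not something this guide states).

```javascript
// Builds the collMod command document that re-attaches a validator to vex_raw.
// `validator` is expected to look like { $jsonSchema: { ... } }.
function buildReapplyValidatorCommand(validator, level = "strict") {
  const schema = validator ? validator.$jsonSchema : undefined;
  if (!schema || typeof schema !== "object") {
    throw new Error("expected a validator of the form { $jsonSchema: {...} }");
  }
  return {
    collMod: "vex_raw",
    validator,
    validationAction: "warn", // from the migration description
    validationLevel: level,   // assumption: MongoDB's default level
  };
}
```

In mongosh the result would be passed to `db.runCommand(...)`, using a schema previously exported from the collection.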
## General Rollback Guidelines
1. **Always backup first**: Create a backup before any rollback operation
2. **Test in staging**: Verify rollback procedure in a non-production environment
3. **Monitor performance**: Watch for query performance changes after rollback
4. **Document changes**: Log all rollback operations for audit purposes
## Troubleshooting
### Index Drop Fails
If you see "IndexNotFound" errors, the index may have already been dropped or was never created:
```javascript
// Check existing indexes
db.vex_raw.getIndexes()
```
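To avoid "IndexNotFound" errors in scripted rollbacks, one option is to drop only the indexes that are actually present. This is a minimal sketch; `planIndexDrops` is a hypothetical helper that takes the array returned by `db.vex_raw.getIndexes()`.

```javascript
// Index names created by the 20251127-vex-raw-idempotency-indexes migration.
const TARGET_INDEXES = [
  "idx_provider_sourceUri_digest_unique",
  "idx_digest_providerId",
  "idx_retrievedAt",
];

// Splits the migration's indexes into those still present and those already gone.
function planIndexDrops(existingIndexes) {
  const present = new Set(existingIndexes.map((ix) => ix.name));
  return {
    toDrop: TARGET_INDEXES.filter((name) => present.has(name)),
    alreadyAbsent: TARGET_INDEXES.filter((name) => !present.has(name)),
  };
}
```

In mongosh, `planIndexDrops(db.vex_raw.getIndexes()).toDrop.forEach(name => db.vex_raw.dropIndex(name))` would then skip anything already removed.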
### Validator Removal Fails
If the validator command fails, verify that the connected user has the `collMod` privilege on the database (granted, for example, by the built-in `dbAdmin` role):
```javascript
// Show the roles and privileges of the currently authenticated user
db.runCommand({ connectionStatus: 1, showPrivileges: true })
```
## Related Documentation
- [VEX Raw Schema Validation](vex-raw-schema-validation.md)
- [MongoDB Index Management](https://www.mongodb.com/docs/manual/indexes/)
- [Excititor Architecture](../modules/excititor/architecture.md)

@@ -0,0 +1,197 @@
# VEX Raw Schema Validation - Offline Kit
This document describes how operators can validate the integrity of VEX raw evidence stored in MongoDB, ensuring that Excititor stores only immutable, content-addressed documents.
## Overview
The `vex_raw` collection uses content-addressed storage: each raw VEX document is keyed by its cryptographic hash. Any change to a document's content therefore no longer matches its key, which makes modification after insertion detectable.
## Schema Definition
The MongoDB JSON Schema enforces the following structure:
```json
{
  "$jsonSchema": {
    "bsonType": "object",
    "title": "VEX Raw Document Schema",
    "description": "Schema for immutable VEX evidence storage",
    "required": ["_id", "providerId", "format", "sourceUri", "retrievedAt", "digest"],
    "properties": {
      "_id": {
        "bsonType": "string",
        "description": "Content digest serving as immutable key"
      },
      "providerId": {
        "bsonType": "string",
        "minLength": 1,
        "description": "VEX provider identifier"
      },
      "format": {
        "bsonType": "string",
        "enum": ["csaf", "cyclonedx", "openvex"],
        "description": "VEX document format"
      },
      "sourceUri": {
        "bsonType": "string",
        "minLength": 1,
        "description": "Original source URI"
      },
      "retrievedAt": {
        "bsonType": "date",
        "description": "Timestamp when document was fetched"
      },
      "digest": {
        "bsonType": "string",
        "minLength": 32,
        "description": "Content hash (SHA-256 hex)"
      },
      "content": {
        "bsonType": ["binData", "string"],
        "description": "Raw document content"
      },
      "gridFsObjectId": {
        "bsonType": ["objectId", "null", "string"],
        "description": "GridFS reference for large documents"
      },
      "metadata": {
        "bsonType": "object",
        "description": "Provider-specific metadata"
      }
    }
  }
}
```
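For offline checks without a MongoDB instance, the required-field rules above can be approximated in plain JavaScript. This is a minimal sketch, not the Excititor validator: `checkVexRawDocument` is a hypothetical name, it covers only the six required fields, and it assumes `retrievedAt` has already been parsed into a `Date`.

```javascript
const FORMATS = ["csaf", "cyclonedx", "openvex"];

// True when v is a string of at least `min` characters.
const isString = (v, min = 0) => typeof v === "string" && v.length >= min;

// One predicate per required field, mirroring the schema's constraints.
const RULES = {
  _id: (v) => isString(v),
  providerId: (v) => isString(v, 1),
  format: (v) => FORMATS.includes(v),
  sourceUri: (v) => isString(v, 1),
  retrievedAt: (v) => v instanceof Date && !Number.isNaN(v.getTime()),
  digest: (v) => isString(v, 32),
};

// Returns { isValid, violations } for a single document.
function checkVexRawDocument(doc) {
  const violations = [];
  for (const [field, ok] of Object.entries(RULES)) {
    if (!(field in doc)) {
      violations.push({ field, message: "missing required field" });
    } else if (!ok(doc[field])) {
      violations.push({ field, message: "value does not match schema" });
    }
  }
  return { isValid: violations.length === 0, violations };
}
```

The optional fields (`content`, `gridFsObjectId`, `metadata`) are deliberately left out to keep the sketch short.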
## Offline Validation Steps
### 1. Export the Schema
The schema can be exported from the application using the validator tooling:
```bash
# Using the Excititor CLI
stellaops excititor schema export --collection vex_raw --output vex-raw-schema.json
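# Or via MongoDB shell (connect to the excititor database; --quiet keeps banner text out of the file)
mongosh "mongodb://localhost:27017/excititor" --quiet --eval "console.log(EJSON.stringify(db.getCollectionInfos({name: 'vex_raw'})[0].options.validator, null, 2))" > vex-raw-schema.json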
```
### 2. Validate Documents in MongoDB Shell
```javascript
// Run inside a mongosh session connected to the excititor database,
// e.g. started from a terminal with: mongosh "mongodb://localhost:27017/excititor"
// Check the collection's storage and index integrity
db.runCommand({
  validate: "vex_raw",
  full: true
})
// List documents that violate the collection's validator
// (requires a validator to be configured on vex_raw)
var validator = db.getCollectionInfos({ name: "vex_raw" })[0].options.validator;
db.vex_raw.find({ $nor: [validator] }).forEach(function(doc) {
  print("Invalid: " + doc._id);
});
```
### 3. Programmatic Validation (C#)
```csharp
using StellaOps.Excititor.Storage.Mongo.Validation;
// Validate a single document
var result = VexRawSchemaValidator.Validate(document);
if (!result.IsValid)
{
    foreach (var violation in result.Violations)
    {
        Console.WriteLine($"{violation.Field}: {violation.Message}");
    }
}
// Batch validation
var batchResult = VexRawSchemaValidator.ValidateBatch(documents);
Console.WriteLine($"Valid: {batchResult.ValidCount}, Invalid: {batchResult.InvalidCount}");
```
### 4. Export Schema for External Tools
```csharp
using System.IO;
using StellaOps.Excititor.Storage.Mongo.Validation;
// Get schema as JSON for external validation tools
var schemaJson = VexRawSchemaValidator.GetJsonSchemaAsJson();
File.WriteAllText("vex-raw-schema.json", schemaJson);
```
## Verification Checklist
Use this checklist to verify schema compliance:
- [ ] All documents have required fields (_id, providerId, format, sourceUri, retrievedAt, digest)
- [ ] The `_id` matches the `digest` value (content-addressed)
- [ ] Format is one of: csaf, cyclonedx, openvex
- [ ] Digest is at least 32 characters (the schema minimum; a full SHA-256 hex digest is 64 characters)
- [ ] No documents have been modified after insertion (verify via digest recomputation)
## Immutability Verification
To verify documents haven't been tampered with:
```javascript
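// MongoDB shell (mongosh) - verify content matches digest
const crypto = require("crypto");
db.vex_raw.find().forEach(function(doc) {
  var content = doc.content;
  if (content) {
    // content is either a string or BSON binData; normalize to bytes
    var bytes = typeof content === "string" ? Buffer.from(content, "utf8") : Buffer.from(content.buffer);
    // Compute SHA-256 of content as lowercase hex
    var computedDigest = crypto.createHash("sha256").update(bytes).digest("hex");
    // Assumes digest is stored as bare lowercase hex; adjust if it carries a prefix
    if (computedDigest !== doc.digest) {
      print("TAMPERED: " + doc._id);
    }
  }
});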
```
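The same check can be run offline against exported documents with Node.js. This is a minimal sketch under stated assumptions: `sha256Hex` and `verifyExportedDocument` are hypothetical helpers, the digest is assumed to be bare lowercase hex over the raw content bytes, and string content is assumed to be UTF-8.

```javascript
const crypto = require("node:crypto");

// SHA-256 of a string (UTF-8 encoded) or of raw bytes, as lowercase hex.
function sha256Hex(content) {
  const bytes = typeof content === "string" ? Buffer.from(content, "utf8") : Buffer.from(content);
  return crypto.createHash("sha256").update(bytes).digest("hex");
}

// Returns a list of problems found in one exported document (empty when clean).
function verifyExportedDocument(doc) {
  const problems = [];
  if (doc._id !== doc.digest) {
    problems.push("_id does not match digest");
  }
  if (doc.content == null) {
    // Large documents may live in GridFS and carry no inline content.
    problems.push("no inline content; digest not checked");
  } else if (sha256Hex(doc.content) !== doc.digest) {
    problems.push("content does not hash to digest");
  }
  return problems;
}
```

Documents without inline content are flagged rather than silently passed, so they can be checked separately against their GridFS payload.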
## Auditing
For compliance auditing, export a validation report:
```bash
# Generate validation report
stellaops excititor validate --collection vex_raw --report validation-report.json
# The report includes:
# - Total document count
# - Valid/invalid counts
# - List of violations by document
# - Schema version used for validation
```
## Troubleshooting
### Common Violations
1. **Missing required field**: Ensure all required fields are present
2. **Invalid format**: Format must be exactly "csaf", "cyclonedx", or "openvex"
3. **Digest too short**: Digest must be at least 32 hex characters
4. **Wrong type**: Check field types match schema requirements
### Recovery
If invalid documents are found:
1. Do NOT modify documents in place (violates immutability)
2. Export the invalid documents for analysis
3. Re-ingest from original sources with correct data
4. Document the incident in audit logs
## Related Documentation
- [Excititor Architecture](../modules/excititor/architecture.md)
- [VEX Storage Design](../modules/excititor/storage.md)
- [Offline Operation Guide](../24_OFFLINE_KIT.md)