# Runtime Data Assets

Runtime data assets are files that Stella Ops services need at runtime but that are not produced by `dotnet publish` or the Angular build. They must be provisioned separately: either baked into Docker images, mounted as volumes, or supplied via an init container.

This directory contains the canonical inventory, acquisition scripts, and packaging tools for all such assets.

If you are setting up Stella Ops for the first time, read this document before running `docker compose up`. Services will start without these assets but will operate in degraded mode (no semantic search, no binary analysis, dev-only certificates).
## Quick reference

| Category | Required? | Size | Provisioned by |
|---|---|---|---|
| ML model weights | Yes (for semantic search) | ~80 MB | `acquire.sh` |
| JDK + Ghidra | Optional (binary analysis) | ~1.6 GB | `acquire.sh` |
| Search seed snapshots | Yes (first boot) | ~7 KB | Included in source |
| Translations (i18n) | Yes | ~500 KB | Baked into Angular dist |
| Certificates and trust stores | Yes | ~50 KB | `etc/` + volume mounts |
| Regional crypto configuration | Per region | ~20 KB | Compose overlays |
| Evidence storage | Yes | Grows | Persistent named volume |
| Vulnerability feeds | Yes (offline) | ~300 MB | Offline Kit (`docs/OFFLINE_KIT.md`) |
## 1. ML model weights

**What:** The `all-MiniLM-L6-v2` sentence-transformer model in ONNX format, used by `OnnxVectorEncoder` for semantic vector search in AdvisoryAI.

**License:** Apache-2.0 (compatible with BUSL-1.1; see `third-party-licenses/all-MiniLM-L6-v2-Apache-2.0.txt`).

**Where it goes:**

```
<app-root>/models/all-MiniLM-L6-v2.onnx
```

Configurable via the `KnowledgeSearch__OnnxModelPath` environment variable.
**How to acquire:**

```bash
# Option A: use the acquisition script (recommended)
./devops/runtime-assets/acquire.sh --models

# Option B: manual download
mkdir -p src/AdvisoryAI/StellaOps.AdvisoryAI/models
curl -L https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx \
  -o src/AdvisoryAI/StellaOps.AdvisoryAI/models/all-MiniLM-L6-v2.onnx
```

**Verification:**

```bash
sha256sum src/AdvisoryAI/StellaOps.AdvisoryAI/models/all-MiniLM-L6-v2.onnx
# Expected: see manifest.yaml for pinned digest
```
**Degraded mode:** If the model file is missing or is a placeholder, the encoder falls back to a deterministic character-ngram projection. Search still works, but semantic quality is significantly reduced.
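Because the fallback is silent apart from log output, it is worth checking the file before deployment. A minimal sketch; the 1 MB threshold and the `check_model` helper name are illustrative assumptions, and the authoritative check remains the pinned digest in `manifest.yaml`:

```bash
# Sketch: flag a missing or placeholder model file by size.
# Assumption: real all-MiniLM-L6-v2 weights are ~80 MB, so anything
# under 1 MB is treated as a placeholder. Use acquire.sh --verify
# for the authoritative SHA-256 check.
check_model() {
  local path="$1"
  local min_bytes=1048576
  if [ ! -f "$path" ]; then
    echo "MISSING: $path (encoder will fall back to ngram projection)"
    return 1
  fi
  local size
  size=$(wc -c < "$path")
  if [ "$size" -lt "$min_bytes" ]; then
    echo "PLACEHOLDER: $path is only ${size} bytes"
    return 1
  fi
  echo "OK: $path (${size} bytes)"
}
```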
**Docker / Compose mount:**

```yaml
services:
  advisory-ai-web:
    volumes:
      - ml-models:/app/models:ro

volumes:
  ml-models:
    driver: local
```
**Air-gap:** Include the `.onnx` file in the Offline Kit under `models/all-MiniLM-L6-v2.onnx`. The `acquire.sh --package` command produces a verified tarball for sneakernet transfer.
## 2. JDK + Ghidra

**What:** OpenJDK 17+ runtime and Ghidra 11.x installation for headless binary analysis (decompilation, BSim similarity, call-graph extraction).

**License:** OpenJDK — GPLv2+CE (Classpath Exception, allows linking); Ghidra — Apache-2.0 (NSA release).

**Required only when:** `GhidraOptions__Enabled=true` (the default). Set it to `false` to skip this section entirely if binary analysis is not needed.

**Where it goes:**

```
/opt/java/openjdk/        # JDK installation (JAVA_HOME)
/opt/ghidra/              # Ghidra installation (GhidraOptions__GhidraHome)
/tmp/stellaops-ghidra/    # Workspace (GhidraOptions__WorkDir) — writable
```
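As a sketch, the environment a binary-analysis container would export for these paths (variable names come from the options above; the values are the documented defaults):

```bash
# Environment for the binary-analysis service, using the default paths above.
export JAVA_HOME=/opt/java/openjdk
export PATH="$JAVA_HOME/bin:$PATH"
export GhidraOptions__GhidraHome=/opt/ghidra
export GhidraOptions__WorkDir=/tmp/stellaops-ghidra

# The workspace must exist and be writable before the service starts.
mkdir -p "$GhidraOptions__WorkDir"
```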
**How to acquire:**

```bash
# Option A: use the acquisition script
./devops/runtime-assets/acquire.sh --ghidra

# Option B: manual
# JDK (Eclipse Temurin 17)
curl -L https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.13%2B11/OpenJDK17U-jre_x64_linux_hotspot_17.0.13_11.tar.gz \
  | tar -xz -C /opt/java/

# Ghidra 11.2
curl -L https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2_build/ghidra_11.2_PUBLIC_20241105.zip \
  -o ghidra.zip && unzip ghidra.zip -d /opt/ghidra/
```
**Docker:** For services that need Ghidra, use a dedicated Dockerfile stage or a sidecar data image. See `docs/modules/binary-index/ghidra-deployment.md`.

**Air-gap:** Pre-download both archives on a connected machine and include them in the Offline Kit under `tools/jdk/` and `tools/ghidra/`.
## 3. Search seed snapshots

**What:** Small JSON files that bootstrap the unified search index on first start. Without them, search returns empty results until live data adapters populate the index.

**Where they are:**

```
src/AdvisoryAI/StellaOps.AdvisoryAI/UnifiedSearch/Snapshots/
  findings.snapshot.json   (1.3 KB)
  vex.snapshot.json        (1.2 KB)
  policy.snapshot.json     (1.2 KB)
  graph.snapshot.json      (758 B)
  scanner.snapshot.json    (751 B)
  opsmemory.snapshot.json  (1.1 KB)
  timeline.snapshot.json   (824 B)
```
**How they get into the image:** The `.csproj` copies them to the output directory via `<Content>` items, so they are included in `dotnet publish` output automatically.

**Runtime behavior:** `UnifiedSearchIndexer` loads them at startup and refreshes from live data adapters every 300 seconds (`UnifiedSearch__AutoRefreshIntervalSeconds`).

No separate provisioning is needed unless you want to supply custom seed data, in which case mount a volume at the snapshot path and set:

```
KnowledgeSearch__UnifiedFindingsSnapshotPath=/app/snapshots/findings.snapshot.json
```
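A hedged compose-override sketch of that custom-seed setup (the service name mirrors the `advisory-ai-web` snippets used elsewhere in this document; the host directory `./custom-snapshots` is hypothetical):

```yaml
services:
  advisory-ai-web:
    volumes:
      - ./custom-snapshots:/app/snapshots:ro   # hypothetical host directory
    environment:
      KnowledgeSearch__UnifiedFindingsSnapshotPath: /app/snapshots/findings.snapshot.json
```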
## 4. Translations (i18n)

**What:** JSON translation bundles for the Angular frontend, supporting 9 locales: en-US, de-DE, bg-BG, ru-RU, es-ES, fr-FR, uk-UA, zh-CN, zh-TW.

**Where they are:**

```
src/Web/StellaOps.Web/src/i18n/*.common.json
```

**How they get into the image:** Compiled into the Angular `dist/` bundle during `npm run build`. The console Docker image (`devops/docker/Dockerfile.console`) includes them automatically.

**Runtime overrides:** The backend `TranslationRegistry` supports database-backed translation overrides (priority 100) over file-based bundles (priority 10). For custom translations in offline environments, seed the database or mount override JSON files.

No separate provisioning is needed for standard deployments.
## 5. Certificates and trust stores

**What:** TLS certificates, signing keys, and CA trust bundles for inter-service communication and attestation verification.

**Development defaults (not for production):**

```
etc/authority/keys/
  kestrel-dev.pfx          # Kestrel TLS (password: devpass)
  kestrel-dev.crt / .key
  ack-token-dev.pem        # Token signing key
  signing-dev.pem          # Service signing key
etc/trust-profiles/assets/
  ca.crt                   # Root CA bundle
  rekor-public.pem         # Rekor transparency log public key
```
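If you need to regenerate the dev bundle (for example after expiry), a sketch with `openssl`. This reproduces the dev-style layout only; it is explicitly not a production issuance path, and the subject name is an assumption:

```bash
# Regenerate a dev-style self-signed Kestrel certificate and PFX.
# NOT for production: production certificates must be properly issued
# (see docs/SECURITY_HARDENING_GUIDE.md). The devpass password matches
# the documented dev default above.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -keyout kestrel-dev.key -out kestrel-dev.crt \
  -subj "/CN=localhost"
openssl pkcs12 -export -out kestrel-dev.pfx \
  -inkey kestrel-dev.key -in kestrel-dev.crt \
  -passout pass:devpass
```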
**Compose mounts (already configured):**

```yaml
volumes:
  - ../../etc/authority/keys:/app/etc/certs:ro
  - ./combined-ca-bundle.crt:/etc/ssl/certs/ca-certificates.crt:ro
```
**Production:** Replace dev certificates with properly issued certificates and mount them as read-only volumes. See `docs/SECURITY_HARDENING_GUIDE.md`.

**Air-gap:** Include the full trust chain in the Offline Kit. For Russian deployments, include `certificates/russian_trusted_bundle.pem` (see `docs/OFFLINE_KIT.md`).
## 6. Regional crypto configuration

**What:** YAML configuration files that select the cryptographic profile (algorithms, key types, HSM settings) per deployment region.

**Files:**

```
etc/appsettings.crypto.international.yaml   # Default (ECDSA/RSA/EdDSA)
etc/appsettings.crypto.eu.yaml              # eIDAS qualified signatures
etc/appsettings.crypto.russia.yaml          # GOST R 34.10/34.11
etc/appsettings.crypto.china.yaml           # SM2/SM3/SM4
etc/crypto-plugins-manifest.json            # Plugin registry
```

**Selection:** Via Docker Compose overlays:

```bash
# EU deployment
docker compose -f docker-compose.stella-ops.yml \
  -f docker-compose.compliance-eu.yml up -d
```

No separate provisioning is needed: the files ship in the source tree and are selected by compose overlay. See `devops/compose/README.md` for details.
## 7. Evidence storage

**What:** Persistent storage for evidence bundles (SBOMs, attestations, signatures, scan proofs). Grows with usage.

**Default path:** `/data/evidence` (named volume `evidence-data`).

**Configured via:** `EvidenceLocker__ObjectStore__FileSystem__RootPath`

**Compose (already configured):**

```yaml
volumes:
  evidence-data:
    driver: local
```
**Sizing:** Plan for roughly 1 GB per 1,000 scans as a baseline. Monitor with the Prometheus metric `evidence_locker_storage_bytes_total`.
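Under that ~1 GB per 1,000 scans baseline, a back-of-the-envelope helper (the ratio is the rough baseline above, not a guarantee, and `estimate_evidence_gb` is an illustrative name):

```bash
# Rough evidence-storage estimate: ~1 GB per 1000 scans (baseline above).
# Rounds up to the next whole GB.
estimate_evidence_gb() {
  local scans="$1"
  echo $(( (scans + 999) / 1000 ))
}
```

For example, `estimate_evidence_gb 25000` suggests planning for about 25 GB.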
**Backup:** Include in your PostgreSQL backup strategy. Evidence files are content-addressed and immutable — append-only, safe to rsync.
## 8. Vulnerability feeds

**What:** Merged advisory feeds (OSV, GHSA, NVD 2.0, and regional feeds). Required for offline vulnerability matching.

**Provisioned by:** The Offline Update Kit (`docs/OFFLINE_KIT.md`). This is a separate, well-documented workflow; see that document for full details.

Not covered by `acquire.sh` — feed management is handled by the Concelier module and the Offline Kit import pipeline.
## Acquisition script

The `acquire.sh` script automates downloading, verifying, and staging runtime data assets. It is idempotent and safe to run multiple times.

```bash
# Acquire everything (models + Ghidra + JDK)
./devops/runtime-assets/acquire.sh --all

# Models only (for environments without binary analysis)
./devops/runtime-assets/acquire.sh --models

# Ghidra + JDK only
./devops/runtime-assets/acquire.sh --ghidra

# Package all acquired assets into a portable tarball for air-gap transfer
./devops/runtime-assets/acquire.sh --package

# Verify already-acquired assets against pinned checksums
./devops/runtime-assets/acquire.sh --verify
```
Asset checksums are pinned in `manifest.yaml` in this directory. The script verifies SHA-256 digests after every download and refuses corrupted files.
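The verification pattern is the standard pin-and-compare. A minimal sketch of what happens per file (manifest parsing is elided, and `verify_sha256` is an illustrative helper, not the script's actual function name):

```bash
# Compare a file's SHA-256 digest against a pinned value (as recorded
# in manifest.yaml) and refuse the file on mismatch.
verify_sha256() {
  local file="$1" expected="$2"
  local actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" != "$expected" ]; then
    echo "CHECKSUM MISMATCH: $file" >&2
    echo "  expected: $expected" >&2
    echo "  actual:   $actual" >&2
    return 1
  fi
}
```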
## Docker integration

### Option A: Bake into the image (simplest)

Run `acquire.sh --models` before `docker build`. The `.csproj` copies `models/all-MiniLM-L6-v2.onnx` into the publish output automatically.
### Option B: Shared data volume (recommended for production)

Build a lightweight data image or use an init container:

```dockerfile
# Dockerfile.runtime-assets
FROM busybox:1.37
COPY models/ /data/models/
VOLUME /data/models
```

Mount in compose:

```yaml
services:
  advisory-ai-web:
    volumes:
      - runtime-assets:/app/models:ro
    depends_on:
      runtime-assets-init:
        condition: service_completed_successfully

  runtime-assets-init:
    build:
      context: .
      dockerfile: devops/runtime-assets/Dockerfile.runtime-assets
    volumes:
      - runtime-assets:/data/models

volumes:
  runtime-assets:
```
### Option C: Air-gap tarball

```bash
./devops/runtime-assets/acquire.sh --package
# Produces: out/runtime-assets/stella-ops-runtime-assets-<date>.tar.gz

# Transfer to air-gapped host, then:
tar -xzf stella-ops-runtime-assets-*.tar.gz -C /opt/stellaops/
```
## Checklist: before you ship a release

- [ ] `models/all-MiniLM-L6-v2.onnx` contains real weights (not the 120-byte placeholder)
- [ ] `acquire.sh --verify` passes all checksums
- [ ] Certificates are production-issued (not `*-dev.*`)
- [ ] Evidence storage volume is provisioned with adequate capacity
- [ ] Regional crypto profile is selected if applicable
- [ ] Offline Kit includes the runtime assets tarball if deploying to an air gap
- [ ] `NOTICE.md` and `third-party-licenses/` are included in the image
## Related documentation

- Installation guide: `docs/INSTALL_GUIDE.md`
- Offline Update Kit: `docs/OFFLINE_KIT.md`
- Security hardening: `docs/SECURITY_HARDENING_GUIDE.md`
- Ghidra deployment: `docs/modules/binary-index/ghidra-deployment.md`
- LLM model bundles (separate from ONNX): `docs/modules/advisory-ai/guides/offline-model-bundles.md`
- Third-party dependencies: `docs/legal/THIRD-PARTY-DEPENDENCIES.md`
- Compose profiles: `devops/compose/README.md`