# Ghidra Deployment Guide
> **Module:** BinaryIndex
> **Component:** Ghidra Integration
> **Status:** PRODUCTION-READY
> **Version:** 1.0.0
> **Related:** [BinaryIndex Architecture](./architecture.md), [SPRINT_20260105_001_003](../../implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md)
---
## 1. Overview
This guide covers the deployment of Ghidra as a secondary analysis backend for the BinaryIndex module. Ghidra provides mature binary analysis capabilities including Version Tracking, BSim behavioral similarity, and FunctionID matching via headless analysis.
### 1.1 Architecture Overview
```
┌────────────────────────────────────────────────────────────────────────────┐
│                     Unified Disassembly/Analysis Layer                     │
│                                                                            │
│  Primary: B2R2 (fast, deterministic)                                       │
│  Fallback: Ghidra (complex cases, low B2R2 confidence)                     │
│                                                                            │
│  ┌──────────────────────────┐    ┌──────────────────────────────────────┐  │
│  │  B2R2 Backend            │    │  Ghidra Backend                      │  │
│  │                          │    │                                      │  │
│  │  - Native .NET           │    │  ┌────────────────────────────────┐  │  │
│  │  - LowUIR lifting        │    │  │  Ghidra Headless Server        │  │  │
│  │  - CFG recovery          │    │  │                                │  │  │
│  │  - Fast fingerprinting   │    │  │  - P-Code decompilation        │  │  │
│  │                          │    │  │  - Version Tracking            │  │  │
│  └──────────────────────────┘    │  │  - BSim queries                │  │  │
│                                  │  │  - FunctionID matching         │  │  │
│                                  │  └────────────────────────────────┘  │  │
│                                  │                  │                   │  │
│                                  │                  v                   │  │
│                                  │  ┌────────────────────────────────┐  │  │
│                                  │  │  ghidriff Bridge               │  │  │
│                                  │  │                                │  │  │
│                                  │  │  - Automated patch diffing     │  │  │
│                                  │  │  - JSON/Markdown output        │  │  │
│                                  │  │  - CI/CD integration           │  │  │
│                                  │  └────────────────────────────────┘  │  │
│                                  └──────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────────┘
```
### 1.2 When Ghidra is Used
Ghidra serves as a fallback/enhancement layer for:
1. **Architectures B2R2 handles poorly** - Exotic architectures, embedded systems
2. **Complex obfuscation scenarios** - Heavily obfuscated or packed binaries
3. **Version Tracking** - Patch diffing with multiple correlators
4. **BSim database queries** - Behavioral similarity matching against known libraries
5. **Low B2R2 confidence** - When B2R2 analysis confidence falls below threshold
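
The fallback rules above could be expressed as a small routing function. This is a minimal sketch, not the BinaryIndex implementation: the names `AnalysisRequest` and `choose_backend`, the 0.6 confidence threshold, and the set of architectures treated as well supported by B2R2 are all illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; this guide does not specify
# the real confidence threshold or the architecture list.
B2R2_CONFIDENCE_THRESHOLD = 0.6
B2R2_STRONG_ARCHITECTURES = {"x86", "x86_64", "arm", "aarch64"}


@dataclass
class AnalysisRequest:
    architecture: str
    b2r2_confidence: float  # 0.0-1.0, as reported by the primary backend
    obfuscated: bool = False
    needs_version_tracking: bool = False
    needs_bsim: bool = False


def choose_backend(req: AnalysisRequest) -> str:
    """Return "ghidra" when any fallback condition from section 1.2 applies."""
    if req.architecture.lower() not in B2R2_STRONG_ARCHITECTURES:
        return "ghidra"
    if req.obfuscated or req.needs_version_tracking or req.needs_bsim:
        return "ghidra"
    if req.b2r2_confidence < B2R2_CONFIDENCE_THRESHOLD:
        return "ghidra"
    return "b2r2"
```

Keeping the decision in one pure function makes the routing policy easy to unit-test without launching either backend.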
---
## 2. Prerequisites
### 2.1 System Requirements
| Component | Requirement | Notes |
|-----------|-------------|-------|
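| **Java** | OpenJDK 17+ | Eclipse Temurin recommended; Ghidra 11.2 and later list JDK 21 as their minimum, so confirm the required JDK in the release notes for your Ghidra version |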
| **Ghidra** | 11.x (11.2+) | NSA Ghidra from official releases |
| **Python** | 3.10+ | Required for ghidriff |
| **Memory** | 8GB+ RAM | 4GB for Ghidra JVM, 4GB for OS/services |
| **CPU** | 4+ cores | More cores improve analysis speed |
| **Storage** | 10GB+ free | Ghidra installation + project files |
### 2.2 Operating System Support
- **Linux:** Ubuntu 22.04+, Debian Bookworm+, RHEL 9+, Alpine 3.19+
- **Windows:** Windows Server 2022, Windows 10/11 (development only)
- **macOS:** macOS 12+ (development only, limited support)
### 2.3 Network Requirements
For air-gapped deployments:
- Pre-download Ghidra release archives
- Pre-install ghidriff Python package wheels
- No external network access required at runtime
---
## 3. Java Installation
### 3.1 Linux (Ubuntu/Debian)
```bash
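# Install Eclipse Temurin 17 (apt-key is deprecated; store the key as a keyring file instead; requires gnupg)
wget -qO - https://packages.adoptium.net/artifactory/api/gpg/key/public | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/adoptium.gpg > /dev/null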
echo "deb https://packages.adoptium.net/artifactory/deb $(awk -F= '/^VERSION_CODENAME/{print$2}' /etc/os-release) main" | sudo tee /etc/apt/sources.list.d/adoptium.list
sudo apt-get update
sudo apt-get install -y temurin-17-jdk
# Verify installation
java -version
# Expected: openjdk version "17.0.x"
```
### 3.2 Linux (RHEL/Fedora)
```bash
# Install OpenJDK 17
sudo dnf install -y java-17-openjdk-devel
# Set JAVA_HOME
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' | sudo tee -a /etc/profile.d/java.sh
source /etc/profile.d/java.sh
# Verify
java -version
```
### 3.3 Linux (Alpine)
```bash
# Install OpenJDK 17
apk add --no-cache openjdk17-jdk
# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' >> /etc/profile
# Verify
java -version
```
### 3.4 Docker (Recommended)
Use the Eclipse Temurin base image (already used by the Dockerfile in section 6):
```dockerfile
FROM eclipse-temurin:17-jdk-jammy
```
---
## 4. Ghidra Installation
### 4.1 Download Ghidra
```bash
# Set version
GHIDRA_VERSION=11.2
GHIDRA_BUILD_DATE=20241105 # Must match the build date in the release asset name for GHIDRA_VERSION (see the releases page)
# Download from GitHub releases
cd /tmp
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip
# Verify checksum (obtain SHA256 from release page)
GHIDRA_SHA256="<insert-sha256-here>"
echo "${GHIDRA_SHA256}  ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" | sha256sum -c -
```
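
Where `sha256sum` is unavailable (for example on a Windows development host), the same check can be done with Python's standard library. This is a minimal sketch; the function names `sha256_of` and `verify_archive` are illustrative, and the expected digest must still come from the official release page.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so a large archive is never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected_sha256: str) -> bool:
    """Compare case-insensitively, since published digests may be upper-case."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

A deployment script would abort when `verify_archive` returns `False` rather than continue to the extraction step.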
### 4.2 Extract and Install
```bash
# Extract to /opt
sudo unzip ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip -d /opt
# Create symlink for version-agnostic path
sudo ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra
# Set permissions
sudo chmod +x /opt/ghidra/support/analyzeHeadless
sudo chmod +x /opt/ghidra/ghidraRun
# Set environment variables
echo 'export GHIDRA_HOME=/opt/ghidra' | sudo tee -a /etc/profile.d/ghidra.sh
echo 'export PATH="${GHIDRA_HOME}/support:${PATH}"' | sudo tee -a /etc/profile.d/ghidra.sh
source /etc/profile.d/ghidra.sh
```
### 4.3 Verify Installation
```bash
# Test headless mode
analyzeHeadless /tmp TempProject -help
# Expected output: Ghidra Headless Analyzer usage information
```
---
## 5. Python and ghidriff Installation
### 5.1 Install Python Dependencies
```bash
# Ubuntu/Debian
sudo apt-get install -y python3 python3-pip python3-venv
# RHEL/Fedora
sudo dnf install -y python3 python3-pip
# Alpine
apk add --no-cache python3 py3-pip
```
### 5.2 Install ghidriff
```bash
# Install globally (not recommended for production)
sudo pip3 install ghidriff
# Install in virtual environment (recommended)
python3 -m venv /opt/stellaops/venv
source /opt/stellaops/venv/bin/activate
pip install ghidriff
# Verify installation
python3 -m ghidriff --version
# Expected: ghidriff version 0.x.x
```
### 5.3 Air-Gapped Installation
```bash
# On internet-connected machine, download wheels
mkdir -p /tmp/ghidriff-wheels
pip download --dest /tmp/ghidriff-wheels ghidriff
# Transfer /tmp/ghidriff-wheels to air-gapped machine
# On air-gapped machine, install from local wheels
pip install --no-index --find-links /tmp/ghidriff-wheels ghidriff
```
---
## 6. Docker Deployment
### 6.1 Dockerfile
Create `devops/docker/ghidra/Dockerfile.headless`:
```dockerfile
# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.
FROM eclipse-temurin:17-jdk-jammy
ARG GHIDRA_VERSION=11.2
ARG GHIDRA_BUILD_DATE=20241105
ARG GHIDRA_SHA256=<insert-sha256-here>
LABEL org.opencontainers.image.title="StellaOps Ghidra Headless"
LABEL org.opencontainers.image.description="Ghidra headless analysis server with ghidriff for BinaryIndex"
LABEL org.opencontainers.image.version="${GHIDRA_VERSION}"
LABEL org.opencontainers.image.licenses="AGPL-3.0-or-later"
# Install dependencies
RUN apt-get update && apt-get install -y \
        python3 \
        python3-pip \
        python3-venv \
        curl \
        unzip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
# Download and verify Ghidra
RUN curl -fsSL "https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" \
        -o /tmp/ghidra.zip \
    && echo "${GHIDRA_SHA256}  /tmp/ghidra.zip" | sha256sum -c - \
    && unzip /tmp/ghidra.zip -d /opt \
    && rm /tmp/ghidra.zip \
    && ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra \
    && chmod +x /opt/ghidra/support/analyzeHeadless
# Install ghidriff
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir ghidriff
# Set environment variables
ENV GHIDRA_HOME=/opt/ghidra
ENV JAVA_HOME=/opt/java/openjdk
ENV PATH="${GHIDRA_HOME}/support:/opt/venv/bin:${PATH}"
ENV MAXMEM=4G
# Create working directories
RUN mkdir -p /projects /scripts /output \
&& chmod 755 /projects /scripts /output
WORKDIR /projects
# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD analyzeHeadless /tmp HealthCheck -help > /dev/null 2>&1 || exit 1
# Default entrypoint
ENTRYPOINT ["analyzeHeadless"]
CMD ["--help"]
```
### 6.2 Build Docker Image
```bash
# Navigate to docker directory
cd devops/docker/ghidra
# Build image
docker build \
-f Dockerfile.headless \
-t stellaops/ghidra-headless:11.2 \
-t stellaops/ghidra-headless:latest \
--build-arg GHIDRA_SHA256=<insert-sha256> \
.
# Verify build
docker run --rm stellaops/ghidra-headless:latest --help
```
### 6.3 Docker Compose Configuration
Create `devops/compose/docker-compose.ghidra.yml`:
```yaml
# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.
version: "3.9"
services:
  ghidra-headless:
    image: stellaops/ghidra-headless:11.2
    container_name: stellaops-ghidra-headless
    hostname: ghidra-headless
    restart: unless-stopped
    volumes:
      - ghidra-projects:/projects
      - ghidra-scripts:/scripts
      - ghidra-output:/output
      - /etc/localtime:/etc/localtime:ro
    environment:
      JAVA_HOME: /opt/java/openjdk
      MAXMEM: ${GHIDRA_MAXMEM:-4G}
      GHIDRA_INSTALL_DIR: /opt/ghidra
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
        reservations:
          cpus: '2'
          memory: 4G
    networks:
      - stellaops-backend
    # Override entrypoint for long-running service
    # In production, use a wrapper script or queue-based invocation
    entrypoint: ["/bin/bash"]
    command: ["-c", "tail -f /dev/null"]
  bsim-postgres:
    image: postgres:16-alpine
    container_name: stellaops-bsim-postgres
    hostname: bsim-postgres
    restart: unless-stopped
    volumes:
      - bsim-data:/var/lib/postgresql/data
      - ./init-bsim-db.sql:/docker-entrypoint-initdb.d/01-init.sql:ro
    environment:
      POSTGRES_DB: bsim
      POSTGRES_USER: bsim
      POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD:-changeme}
      PGDATA: /var/lib/postgresql/data/pgdata
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
    networks:
      - stellaops-backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U bsim"]
      interval: 10s
      timeout: 5s
      retries: 5
volumes:
  ghidra-projects:
    name: stellaops-ghidra-projects
  ghidra-scripts:
    name: stellaops-ghidra-scripts
  ghidra-output:
    name: stellaops-ghidra-output
  bsim-data:
    name: stellaops-bsim-data
networks:
  stellaops-backend:
    name: stellaops-backend
    external: true
```
### 6.4 BSim Database Initialization
Create `devops/compose/init-bsim-db.sql`:
```sql
-- Copyright (c) StellaOps. All rights reserved.
-- Licensed under AGPL-3.0-or-later.
-- BSim database initialization for Ghidra
-- This schema is managed by Ghidra's BSim tooling
-- Create extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Create application user (if different from postgres user)
-- Adjust as needed for your deployment
DO $$
BEGIN
IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'bsim_app') THEN
CREATE ROLE bsim_app WITH LOGIN PASSWORD 'changeme';
END IF;
END
$$;
-- Grant permissions
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim_app;
-- Note: Ghidra's BSim will create its own schema tables on first use
-- See Ghidra BSim documentation for schema details
```
### 6.5 Start Services
```bash
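# Create backend network if it doesn't exist
docker network inspect stellaops-backend >/dev/null 2>&1 || docker network create stellaops-backend
# Set environment variables
export BSIM_DB_PASSWORD=your-secure-password
export GHIDRA_MAXMEM=4G # JVM heap; keep it below the container's 8G memory limit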
# Start services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d
# Verify services are running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps
# Check logs
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f ghidra-headless
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f bsim-postgres
```
---
## 7. BSim PostgreSQL Database Setup
### 7.1 Database Creation
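BSim uses PostgreSQL as its backend database. Ghidra's BSim tooling creates the schema automatically on first use, but you must provision the database instance yourself. BSim's PostgreSQL backend also depends on the `lshvector` extension that ships with Ghidra; stock PostgreSQL packages and images do not include it, so confirm that the server you provision has it installed.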
### 7.2 Manual Database Setup (Non-Docker)
```bash
# As postgres user, create database and user
sudo -u postgres psql <<EOF
CREATE DATABASE bsim;
CREATE USER bsim WITH ENCRYPTED PASSWORD 'your-secure-password';
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim;
\c bsim
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
GRANT ALL PRIVILEGES ON SCHEMA public TO bsim;
EOF
```
### 7.3 BSim Server Configuration
If you run BSim in server mode (optional), create a BSim server configuration file:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- /etc/stellaops/bsim-server.xml -->
<bsim>
  <database>
    <host>localhost</host>
    <port>5432</port>
    <name>bsim</name>
    <user>bsim</user>
    <password>your-secure-password</password>
  </database>
  <server>
    <port>6543</port>
    <maxConnections>10</maxConnections>
  </server>
</bsim>
```
### 7.4 Test BSim Connection
```bash
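# Using Ghidra's bsim command-line tool
# Confirm the subcommand name and arguments against the bsim usage output for your Ghidra release before scripting this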
$GHIDRA_HOME/support/bsim createdb postgresql://bsim:your-secure-password@localhost:5432/bsim stellaops_corpus
# Expected: Database created successfully
```
---
## 8. Configuration
### 8.1 StellaOps Configuration
Add Ghidra configuration to your StellaOps service configuration file (e.g., `etc/binaryindex.yaml`):
```yaml
# Ghidra Integration Configuration
Ghidra:
  # Path to Ghidra installation directory (GHIDRA_HOME)
  GhidraHome: /opt/ghidra
  # Path to Java installation directory (JAVA_HOME)
  # If not set, system JAVA_HOME will be used
  JavaHome: /usr/lib/jvm/java-17-openjdk
  # Working directory for Ghidra projects and temporary files
  WorkDir: /var/lib/stellaops/ghidra
  # Path to custom Ghidra scripts directory
  ScriptsDir: /opt/stellaops/ghidra-scripts
  # Maximum memory for Ghidra JVM (e.g., "4G", "8192M")
  MaxMemory: 4G
  # Maximum CPU cores for Ghidra analysis
  MaxCpu: 4
  # Default timeout for analysis operations in seconds
  DefaultTimeoutSeconds: 300
  # Whether to clean up temporary projects after analysis
  CleanupTempProjects: true
  # Maximum concurrent Ghidra instances
  MaxConcurrentInstances: 1
  # Whether Ghidra integration is enabled
  Enabled: true
# BSim Database Configuration
BSim:
  # BSim database connection string
  # Format: postgresql://user:pass@host:port/database
  ConnectionString: postgresql://bsim:your-secure-password@bsim-postgres:5432/bsim
  # Alternative: Specify components separately
  # Host: bsim-postgres
  # Port: 5432
  # Database: bsim
  # Username: bsim
  # Password: your-secure-password
  # Default minimum similarity for queries
  DefaultMinSimilarity: 0.7
  # Default maximum results per query
  DefaultMaxResults: 10
  # Whether BSim integration is enabled
  Enabled: true
# ghidriff Python Bridge Configuration
Ghidriff:
  # Path to Python executable
  # If not set, "python3" or "python" will be used from PATH
  PythonPath: /opt/venv/bin/python3
  # Path to ghidriff module (if not installed via pip)
  # GhidriffModulePath: /opt/stellaops/ghidriff
  # Whether to include decompilation in diff output by default
  DefaultIncludeDecompilation: true
  # Whether to include disassembly in diff output by default
  DefaultIncludeDisassembly: true
  # Default timeout for ghidriff operations in seconds
  DefaultTimeoutSeconds: 600
  # Working directory for ghidriff output
  WorkDir: /var/lib/stellaops/ghidriff
  # Whether ghidriff integration is enabled
  Enabled: true
```
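
`MaxMemory` accepts sizes such as `"4G"` or `"8192M"`. A loader might validate the value before handing it to the JVM; the sketch below assumes the suffixes are binary multiples, as with the JVM's `-Xmx` option, and the function name `parse_jvm_memory` is hypothetical rather than part of StellaOps.

```python
import re

# Binary multipliers, assumed to mirror how the JVM reads -Xmx suffixes.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_jvm_memory(value: str) -> int:
    """Convert a size such as "4G" or "8192M" to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG])", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported memory size: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]
```

Rejecting malformed values at configuration load time surfaces a typo immediately instead of as a JVM startup failure during the first analysis.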
### 8.2 Environment Variables
You can also configure Ghidra via environment variables:
```bash
# Ghidra
export STELLAOPS_GHIDRA_GHIDRAHOME=/opt/ghidra
export STELLAOPS_GHIDRA_JAVAHOME=/usr/lib/jvm/java-17-openjdk
export STELLAOPS_GHIDRA_MAXMEMORY=4G
export STELLAOPS_GHIDRA_MAXCPU=4
export STELLAOPS_GHIDRA_ENABLED=true
# BSim
export STELLAOPS_BSIM_CONNECTIONSTRING=postgresql://bsim:password@localhost:5432/bsim
export STELLAOPS_BSIM_ENABLED=true
# ghidriff
export STELLAOPS_GHIDRIFF_PYTHONPATH=/opt/venv/bin/python3
export STELLAOPS_GHIDRIFF_ENABLED=true
```
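
The variable names above follow a `STELLAOPS_<SECTION>_<KEY>` pattern in upper case. The sketch below shows one way such overrides could be layered on top of file-based settings. It assumes environment values take precedence over the file, which this guide does not state, and both function names are illustrative.

```python
import os


def env_var_name(section: str, key: str, prefix: str = "STELLAOPS") -> str:
    """Build the variable name, e.g. ("Ghidra", "MaxMemory") -> STELLAOPS_GHIDRA_MAXMEMORY."""
    return f"{prefix}_{section}_{key}".upper()


def apply_env_overrides(config: dict, environ=os.environ) -> dict:
    """Return a copy of config with values replaced by matching variables.

    Assumption: environment variables win over file values. Overridden
    values arrive as strings and still need type conversion by the caller.
    """
    merged = {}
    for section, values in config.items():
        merged[section] = {
            key: environ.get(env_var_name(section, key), value)
            for key, value in values.items()
        }
    return merged
```

For example, with `STELLAOPS_GHIDRA_MAXMEMORY=8G` exported, a file value of `MaxMemory: 4G` would be replaced by `"8G"` while untouched keys keep their file values.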
### 8.3 appsettings.json (ASP.NET Core)
For services using ASP.NET Core configuration:
```json
{
  "Ghidra": {
    "GhidraHome": "/opt/ghidra",
    "JavaHome": "/usr/lib/jvm/java-17-openjdk",
    "WorkDir": "/var/lib/stellaops/ghidra",
    "MaxMemory": "4G",
    "MaxCpu": 4,
    "DefaultTimeoutSeconds": 300,
    "CleanupTempProjects": true,
    "MaxConcurrentInstances": 1,
    "Enabled": true
  },
  "BSim": {
    "ConnectionString": "postgresql://bsim:password@bsim-postgres:5432/bsim",
    "DefaultMinSimilarity": 0.7,
    "DefaultMaxResults": 10,
    "Enabled": true
  },
  "Ghidriff": {
    "PythonPath": "/opt/venv/bin/python3",
    "DefaultIncludeDecompilation": true,
    "DefaultIncludeDisassembly": true,
    "DefaultTimeoutSeconds": 600,
    "WorkDir": "/var/lib/stellaops/ghidriff",
    "Enabled": true
  }
}
```
---
## 9. Testing and Validation
### 9.1 Ghidra Headless Test
Create a simple test binary and analyze it:
```bash
# Create test C program
cat > /tmp/test.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF
# Compile
gcc -o /tmp/test /tmp/test.c
# Run Ghidra analysis (auto-analysis stays enabled so that functions are discovered)
analyzeHeadless /tmp TestProject \
  -import /tmp/test \
  -postScript ListFunctionsScript.java
# Expected: Analysis completes without errors, lists functions (main, add)
```
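
A service that drives Ghidra would typically wrap this invocation and enforce a timeout. The sketch below is an assumption-laden illustration, not the BinaryIndex code: the function names are hypothetical, only flags already shown in this guide (`-import`, `-postScript`) are used, and the 300-second default mirrors `DefaultTimeoutSeconds`.

```python
import subprocess


def build_headless_command(project_dir, project_name, binary,
                           post_script=None, executable="analyzeHeadless"):
    """Assemble the argument list; a list avoids shell quoting problems."""
    cmd = [executable, str(project_dir), project_name, "-import", str(binary)]
    if post_script:
        cmd += ["-postScript", post_script]
    return cmd


def run_headless(cmd, timeout_seconds=300):
    """Run the command and capture its output.

    subprocess.run raises subprocess.TimeoutExpired when the limit is
    exceeded; callers decide how to report or retry.
    """
    return subprocess.run(cmd, capture_output=True, text=True,
                          timeout=timeout_seconds, check=False)
```

Usage would look like `run_headless(build_headless_command("/tmp", "TestProject", "/tmp/test", "ListFunctionsScript.java"))`, followed by a check of `returncode` and `stdout`.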
### 9.2 BSim Database Test
```bash
# Create test BSim database
# (confirm each bsim subcommand and its flags against the usage output for your Ghidra release)
$GHIDRA_HOME/support/bsim createdb \
postgresql://bsim:password@localhost:5432/bsim \
test_corpus
# Ingest test binary into BSim
$GHIDRA_HOME/support/bsim ingest \
postgresql://bsim:password@localhost:5432/bsim/test_corpus \
/tmp/test
# Query BSim
$GHIDRA_HOME/support/bsim querysimilar \
postgresql://bsim:password@localhost:5432/bsim/test_corpus \
/tmp/test \
--threshold 0.7
# Expected: Shows functions from test binary with similarity scores
```
### 9.3 ghidriff Test
```bash
# Create two versions of a binary (modify test.c slightly)
cat > /tmp/test_v2.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    // Added comment
    return a + b + 1; // Modified
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF
gcc -o /tmp/test_v2 /tmp/test_v2.c
# Run ghidriff
python3 -m ghidriff /tmp/test /tmp/test_v2 \
--output-dir /tmp/ghidriff-test \
--output-format json
# Expected: Creates diff.json in /tmp/ghidriff-test showing changes
cat /tmp/ghidriff-test/diff.json
```
### 9.4 Integration Test
Test the BinaryIndex Ghidra integration:
```bash
# Run BinaryIndex integration tests
dotnet test src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Ghidra.Tests/ \
--filter "Category=Integration" \
--logger "trx;LogFileName=ghidra-tests.trx"
# Expected: All tests pass
```
---
## 10. Troubleshooting
### 10.1 Common Issues
#### Issue: "analyzeHeadless: command not found"
**Solution:**
```bash
# Ensure GHIDRA_HOME is set
export GHIDRA_HOME=/opt/ghidra
export PATH="${GHIDRA_HOME}/support:${PATH}"
# Verify
which analyzeHeadless
```
#### Issue: "Java version mismatch" or "UnsupportedClassVersionError"
**Solution:**
```bash
# Check Java version
java -version
# Must be Java 17+
# Set correct JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
```
#### Issue: "OutOfMemoryError: Java heap space"
**Solution:**
```bash
# Increase MAXMEM
export MAXMEM=8G
# Or set it in the StellaOps configuration (e.g., etc/binaryindex.yaml):
#   Ghidra:
#     MaxMemory: 8G
```
#### Issue: "ghidriff: No module named 'ghidriff'"
**Solution:**
```bash
# Install ghidriff
pip3 install ghidriff
# Or activate venv
source /opt/venv/bin/activate
pip install ghidriff
# Verify
python3 -m ghidriff --version
```
#### Issue: "BSim connection refused"
**Solution:**
```bash
# Check PostgreSQL is running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps bsim-postgres
# Test connection
psql -h localhost -p 5432 -U bsim -d bsim -c "SELECT version();"
# Check connection string in configuration
# Ensure format: postgresql://user:pass@host:port/database
```
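
To check a connection string against the expected `postgresql://user:pass@host:port/database` format, it can be split into the separate components that section 8.1 lists as an alternative. This is a minimal sketch using Python's standard library; the function name is hypothetical, and defaulting the port to 5432 is an assumption.

```python
from urllib.parse import unquote, urlsplit


def parse_bsim_connection_string(value: str) -> dict:
    """Split postgresql://user:pass@host:port/database into its parts."""
    parts = urlsplit(value)
    if parts.scheme != "postgresql":
        raise ValueError("expected a postgresql:// connection string")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("connection string must include a host and a database")
    return {
        "Host": parts.hostname,
        "Port": parts.port or 5432,  # assumed default when no port is given
        "Database": database,
        # Percent-decoding lets passwords contain characters such as "@".
        "Username": unquote(parts.username or ""),
        "Password": unquote(parts.password or ""),
    }
```

A password containing reserved characters must be percent-encoded in the URL form (for example `@` as `%40`), which is a common cause of connection failures that look like a wrong host.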
#### Issue: "Ghidra analysis hangs or times out"
**Solution:**
```bash
# Increase the timeout in the StellaOps configuration (e.g., etc/binaryindex.yaml):
#   Ghidra:
#     DefaultTimeoutSeconds: 600 # 10 minutes
# Isolate the problem: import without auto-analysis and pin the processor
analyzeHeadless /tmp TestProject -import /tmp/test \
  -noanalysis \
  -processor x86:LE:64:default
# Check system resources (CPU, memory)
docker stats stellaops-ghidra-headless
```
### 10.2 Logging and Diagnostics
#### Enable Ghidra Debug Logging
```bash
# Run with verbose output
analyzeHeadless /tmp TestProject -import /tmp/test \
-log /tmp/ghidra-analysis.log \
-logLevel DEBUG
# Check log file
tail -f /tmp/ghidra-analysis.log
```
#### Enable StellaOps Ghidra Logging
Add to `appsettings.json`:
```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "StellaOps.BinaryIndex.Ghidra": "Debug"
    }
  }
}
```
#### Docker Container Logs
```bash
# View Ghidra headless logs
docker logs stellaops-ghidra-headless -f
# View BSim PostgreSQL logs
docker logs stellaops-bsim-postgres -f
# View logs with timestamps
docker logs stellaops-ghidra-headless --timestamps
```
### 10.3 Performance Tuning
#### Optimize Ghidra Memory Settings
```yaml
Ghidra:
  # For large binaries (>100MB)
  MaxMemory: 16G
  # For many concurrent analyses
  MaxConcurrentInstances: 4
```
#### Optimize BSim Queries
```yaml
BSim:
  # Reduce result set for faster queries
  DefaultMaxResults: 5
  # Increase similarity threshold to reduce matches
  DefaultMinSimilarity: 0.8
```
#### Docker Resource Limits
```yaml
services:
  ghidra-headless:
    deploy:
      resources:
        limits:
          cpus: '8'    # Increase for faster analysis
          memory: 16G  # Match MaxMemory + overhead
```
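
When sizing the container limit, the JVM heap is only part of the footprint. The sketch below is a rough budgeting aid under stated assumptions: each concurrent instance is assumed to run its own JVM with the full `MaxMemory` heap, and the 4 GiB overhead mirrors the "4GB for OS/services" split in section 2.1. The function name is hypothetical.

```python
def container_memory_limit_gib(max_memory_gib: int,
                               concurrent_instances: int = 1,
                               overhead_gib: int = 4) -> int:
    """Estimate a container memory limit in GiB.

    Assumes one JVM per concurrent instance, each with the full heap,
    plus a fixed allowance for non-heap memory and other processes.
    """
    if max_memory_gib <= 0 or concurrent_instances <= 0 or overhead_gib < 0:
        raise ValueError("sizes must be positive")
    return max_memory_gib * concurrent_instances + overhead_gib
```

With the defaults from this guide (`MaxMemory: 4G`, one instance) the estimate is 8 GiB, which matches the Compose limit in section 6.3; a 16G heap would call for a limit above 16G.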
---
## 11. Production Deployment Checklist
### 11.1 Pre-Deployment
- [ ] Java 17+ installed and verified
- [ ] Ghidra 11.2+ downloaded and SHA256 verified
- [ ] Python 3.10+ installed
- [ ] ghidriff installed and tested
- [ ] PostgreSQL 16+ available for BSim
- [ ] Docker images built and tested
- [ ] Configuration files reviewed and validated
- [ ] Network connectivity verified (or air-gap packages prepared)
### 11.2 Security Hardening
- [ ] BSim database password set to strong value (not "changeme")
- [ ] PostgreSQL configured with TLS/SSL
- [ ] Ghidra working directories have restricted permissions (700)
- [ ] Docker containers run as non-root user
- [ ] Network segmentation configured (backend network only)
- [ ] Firewall rules restrict BSim PostgreSQL access
- [ ] Audit logging enabled for Ghidra operations
### 11.3 Post-Deployment
- [ ] Ghidra headless test completed successfully
- [ ] BSim database initialized and accessible
- [ ] ghidriff integration tested
- [ ] BinaryIndex integration tests pass
- [ ] Monitoring and alerting configured
- [ ] Log aggregation configured
- [ ] Backup strategy for BSim database configured
- [ ] Runbook/procedures documented
---
## 12. Monitoring and Observability
### 12.1 Metrics
StellaOps exposes Prometheus metrics for Ghidra integration:
| Metric | Type | Description |
|--------|------|-------------|
| `ghidra_analysis_total` | Counter | Total Ghidra analyses performed |
| `ghidra_analysis_duration_seconds` | Histogram | Duration of Ghidra analyses |
| `ghidra_analysis_errors_total` | Counter | Total Ghidra analysis errors |
| `ghidra_instances_active` | Gauge | Active Ghidra headless instances |
| `bsim_query_total` | Counter | Total BSim queries |
| `bsim_query_duration_seconds` | Histogram | Duration of BSim queries |
| `bsim_matches_total` | Counter | Total BSim matches found |
| `ghidriff_diff_total` | Counter | Total ghidriff diffs performed |
| `ghidriff_diff_duration_seconds` | Histogram | Duration of ghidriff diffs |
### 12.2 Health Checks
Ghidra service health check endpoint (if using wrapper service):
```bash
# HTTP health check
curl http://localhost:8080/health/ghidra
# Expected response:
{
  "status": "Healthy",
  "ghidra": {
    "available": true,
    "version": "11.2",
    "javaVersion": "17.0.x"
  },
  "bsim": {
    "available": true,
    "connection": "OK"
  }
}
```
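
A probe script could interpret that response as follows. This is a sketch that assumes the response shape shown above; the function name `is_healthy` and the option to treat BSim as optional are illustrative additions.

```python
import json


def _available(section) -> bool:
    """True only when the section is an object with "available": true."""
    return isinstance(section, dict) and section.get("available") is True


def is_healthy(payload: str, require_bsim: bool = True) -> bool:
    """Interpret the health response body; malformed JSON counts as unhealthy."""
    try:
        body = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(body, dict) or body.get("status") != "Healthy":
        return False
    if not _available(body.get("ghidra")):
        return False
    return _available(body.get("bsim")) or not require_bsim
```

Treating unparseable output as unhealthy keeps the probe conservative when the wrapper service returns an error page instead of JSON.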
### 12.3 Alerts
Recommended Prometheus alerts:
```yaml
groups:
  - name: ghidra
    rules:
      - alert: GhidraAnalysisHighErrorRate
        expr: rate(ghidra_analysis_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High Ghidra analysis error rate"
          description: "Ghidra error rate is {{ $value }} errors/sec"
      - alert: GhidraAnalysisSlow
        # histogram_quantile needs the per-bucket rate, aggregated by the "le" label
        expr: histogram_quantile(0.95, sum by (le) (rate(ghidra_analysis_duration_seconds_bucket[5m]))) > 600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Ghidra analyses are slow"
          description: "P95 analysis duration is {{ $value }}s (>10m)"
      - alert: BSimDatabaseDown
        expr: up{job="bsim-postgres"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "BSim database is down"
          description: "BSim PostgreSQL database is unreachable"
```
---
## 13. Backup and Recovery
### 13.1 BSim Database Backup
```bash
#!/bin/bash
# Automated BSim backup script (the shebang must be the first line of the file)
set -euo pipefail
BACKUP_DIR=/var/backups/stellaops/bsim
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "${BACKUP_DIR}"
# Create backup (custom format, which pg_dump already compresses)
docker exec stellaops-bsim-postgres \
  pg_dump -U bsim -Fc bsim > "${BACKUP_DIR}/bsim_${DATE}.dump"
# Compress (optional; gains little on top of -Fc)
gzip "${BACKUP_DIR}/bsim_${DATE}.dump"
# Retention: keep last 7 days
find "${BACKUP_DIR}" -name "bsim_*.dump.gz" -mtime +7 -delete
```
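
The retention step can also be driven by the timestamp embedded in each file name instead of the file's modification time, which `find -mtime` relies on and which changes when backups are copied or restored. The sketch below is an alternative under that assumption, not a description of the script above; the function name is hypothetical, and it only selects files, leaving deletion to the caller.

```python
from datetime import datetime, timedelta
from pathlib import Path


def expired_backups(backup_dir: Path, now: datetime, keep_days: int = 7) -> list:
    """List bsim_YYYYmmdd_HHMMSS.dump[.gz] files older than keep_days."""
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for path in sorted(backup_dir.glob("bsim_*.dump*")):
        # Extract the timestamp between the "bsim_" prefix and the first dot.
        stamp = path.name[len("bsim_"):].split(".", 1)[0]
        try:
            taken = datetime.strptime(stamp, "%Y%m%d_%H%M%S")
        except ValueError:
            continue  # Skip files that do not follow the naming scheme.
        if taken < cutoff:
            expired.append(path)
    return expired
```

A cron wrapper would call `expired_backups(Path("/var/backups/stellaops/bsim"), datetime.now())` and unlink each returned path after logging it.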
### 13.2 BSim Database Restore
```bash
# Stop dependent services
docker-compose -f devops/compose/docker-compose.ghidra.yml stop ghidra-headless
# Restore from backup
gunzip -c /var/backups/stellaops/bsim/bsim_20260105_120000.dump.gz | \
docker exec -i stellaops-bsim-postgres \
pg_restore -U bsim -d bsim --clean --if-exists
# Restart services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d
```
### 13.3 Ghidra Project Backup
```bash
# Backup Ghidra projects (if using persistent projects)
tar -czf /var/backups/stellaops/ghidra/projects_$(date +%Y%m%d).tar.gz \
/var/lib/stellaops/ghidra/projects
# Scripts backup
tar -czf /var/backups/stellaops/ghidra/scripts_$(date +%Y%m%d).tar.gz \
/opt/stellaops/ghidra-scripts
```
---
## 14. Air-Gapped Deployment
### 14.1 Package Preparation
On internet-connected machine:
```bash
# Create the staging directory
mkdir -p airgap-packages
# Download Ghidra into the staging directory so that it is included in the tarball
wget -P airgap-packages https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2_build/ghidra_11.2_PUBLIC_20241105.zip
# Download Python wheels
pip download --dest airgap-packages ghidriff
# Export Docker images
docker save stellaops/ghidra-headless:11.2 | gzip > airgap-packages/ghidra-headless-11.2.tar.gz
docker save postgres:16-alpine | gzip > airgap-packages/postgres-16-alpine.tar.gz
# Create tarball
tar -czf stellaops-ghidra-airgap.tar.gz airgap-packages/
```
### 14.2 Air-Gapped Installation
On air-gapped machine:
```bash
# Extract package
tar -xzf stellaops-ghidra-airgap.tar.gz
# Install Ghidra
cd airgap-packages
unzip ghidra_11.2_PUBLIC_20241105.zip -d /opt
ln -s /opt/ghidra_11.2_PUBLIC /opt/ghidra
# Install Python packages
pip install --no-index --find-links . ghidriff
# Load Docker images
docker load < ghidra-headless-11.2.tar.gz
docker load < postgres-16-alpine.tar.gz
# Proceed with normal deployment
```
---
## 15. References
### 15.1 Documentation
- **Ghidra Official Documentation:** https://ghidra.re/ghidra_docs/
- **Ghidra Version Tracking Guide:** https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing
- **ghidriff Repository:** https://github.com/clearbluejar/ghidriff
- **BSim Documentation:** https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/
- **BinaryIndex Architecture:** [architecture.md](./architecture.md)
- **Sprint Documentation:** [SPRINT_20260105_001_003](../../implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md)
### 15.2 Related StellaOps Documentation
- **PostgreSQL Guide:** `docs/operations/postgresql-guide.md`
- **Docker Deployment Guide:** `docs/operations/docker-deployment.md`
- **Air-Gap Operation Guide:** `docs/OFFLINE_KIT.md`
- **Security Hardening Guide:** `docs/operations/security-hardening.md`
### 15.3 External Resources
- **Eclipse Temurin Downloads:** https://adoptium.net/
- **Ghidra Releases:** https://github.com/NationalSecurityAgency/ghidra/releases
- **ghidriff PyPI:** https://pypi.org/project/ghidriff/
- **PostgreSQL Documentation:** https://www.postgresql.org/docs/16/
---
## 16. Changelog
| Date | Version | Changes |
|------|---------|---------|
| 2026-01-05 | 1.0.0 | Initial deployment guide created for GHID-019 |
---
*Document Version: 1.0.0*
*Last Updated: 2026-01-05*
*Maintainer: BinaryIndex Guild*