Ghidra Deployment Guide

Module: BinaryIndex
Component: Ghidra Integration
Status: PRODUCTION-READY
Version: 1.0.0
Related: BinaryIndex Architecture, SPRINT_20260105_001_003


1. Overview

This guide covers the deployment of Ghidra as a secondary analysis backend for the BinaryIndex module. Ghidra provides mature binary analysis capabilities including Version Tracking, BSim behavioral similarity, and FunctionID matching via headless analysis.

1.1 Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                    Unified Disassembly/Analysis Layer                        │
│                                                                              │
│  Primary: B2R2 (fast, deterministic)                                        │
│  Fallback: Ghidra (complex cases, low B2R2 confidence)                      │
│                                                                              │
│  ┌──────────────────────────┐    ┌──────────────────────────────────────┐  │
│  │       B2R2 Backend        │    │         Ghidra Backend               │  │
│  │                          │    │                                      │  │
│  │  - Native .NET           │    │  ┌────────────────────────────────┐  │  │
│  │  - LowUIR lifting        │    │  │     Ghidra Headless Server     │  │  │
│  │  - CFG recovery          │    │  │                                │  │  │
│  │  - Fast fingerprinting   │    │  │  - P-Code decompilation        │  │  │
│  │                          │    │  │  - Version Tracking            │  │  │
│  └──────────────────────────┘    │  │  - BSim queries                │  │  │
│                                  │  │  - FunctionID matching         │  │  │
│                                  │  └────────────────────────────────┘  │  │
│                                  │                  │                    │  │
│                                  │                  v                    │  │
│                                  │  ┌────────────────────────────────┐  │  │
│                                  │  │        ghidriff Bridge         │  │  │
│                                  │  │                                │  │  │
│                                  │  │  - Automated patch diffing     │  │  │
│                                  │  │  - JSON/Markdown output        │  │  │
│                                  │  │  - CI/CD integration           │  │  │
│                                  │  └────────────────────────────────┘  │  │
│                                  └──────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘

1.2 When Ghidra is Used

Ghidra serves as a fallback/enhancement layer for:

  1. Architectures B2R2 handles poorly - Exotic architectures, embedded systems
  2. Complex obfuscation scenarios - Heavily obfuscated or packed binaries
  3. Version Tracking - Patch diffing with multiple correlators
  4. BSim database queries - Behavioral similarity matching against known libraries
  5. Low B2R2 confidence - When B2R2 analysis confidence falls below threshold
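The confidence-based fallback in item 5 can be sketched as a small shell helper. This is only an illustration: the function name `choose_backend`, the 0 to 1 confidence scale, and the default threshold of 0.70 are assumptions, since this guide does not state the actual threshold or how BinaryIndex exposes B2R2 confidence.

```bash
# Hypothetical sketch: pick an analysis backend from a B2R2 confidence score.
# The 0..1 scale and the 0.70 default threshold are assumptions.
choose_backend() {
    confidence="$1"            # B2R2 confidence, e.g. 0.65
    threshold="${2:-0.70}"     # assumed default threshold
    # awk performs the floating-point comparison that POSIX sh lacks
    if awk -v c="$confidence" -v t="$threshold" 'BEGIN { exit !(c < t) }'; then
        echo "ghidra"
    else
        echo "b2r2"
    fi
}

choose_backend 0.65   # prints: ghidra
choose_backend 0.92   # prints: b2r2
```

A score equal to the threshold stays on B2R2 in this sketch, because the text says Ghidra is used only when confidence falls below it.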

2. Prerequisites

2.1 System Requirements

| Component | Requirement | Notes |
|-----------|-------------|-------|
| Java | OpenJDK 17+ | Eclipse Temurin recommended |
| Ghidra | 11.x (11.2+) | NSA Ghidra from official releases |
| Python | 3.10+ | Required for ghidriff |
| Memory | 8GB+ RAM | 4GB for Ghidra JVM, 4GB for OS/services |
| CPU | 4+ cores | More cores improve analysis speed |
| Storage | 10GB+ free | Ghidra installation + project files |
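A preflight check for the Java requirement could look like the sketch below. The helper names `java_major` and `meets_minimum` are illustrative and not part of StellaOps. The version string is hard-coded here so the example runs anywhere; on a real host it could be captured with `java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'`.

```bash
# Hypothetical preflight helpers; the function names are illustrative.

# java_major: extract the major version from a Java version string.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                        # legacy scheme: 1.8.0_392 -> 8
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*//' ;;   # 17.0.9 -> 17
    esac
}

# meets_minimum: succeed when the detected major version is >= the required one.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

detected=$(java_major "17.0.9")
if meets_minimum "$detected" 17; then
    echo "Java $detected OK"
else
    echo "Java $detected is too old; 17+ required"
fi
```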

2.2 Operating System Support

  • Linux: Ubuntu 22.04+, Debian Bookworm+, RHEL 9+, Alpine 3.19+
  • Windows: Windows Server 2022, Windows 10/11 (development only)
  • macOS: macOS 12+ (development only, limited support)

2.3 Network Requirements

For air-gapped deployments:

  • Pre-download Ghidra release archives
  • Pre-install ghidriff Python package wheels
  • No external network access required at runtime

3. Java Installation

3.1 Linux (Ubuntu/Debian)

# Install Eclipse Temurin 17
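# apt-key is deprecated; store the repository key as a keyring file instead
wget -qO - https://packages.adoptium.net/artifactory/api/gpg/key/public | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/adoptium.gpg > /dev/null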
echo "deb https://packages.adoptium.net/artifactory/deb $(awk -F= '/^VERSION_CODENAME/{print$2}' /etc/os-release) main" | sudo tee /etc/apt/sources.list.d/adoptium.list
sudo apt-get update
sudo apt-get install -y temurin-17-jdk

# Verify installation
java -version
# Expected: openjdk version "17.0.x"

3.2 Linux (RHEL/Fedora)

# Install OpenJDK 17
sudo dnf install -y java-17-openjdk-devel

# Set JAVA_HOME
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' | sudo tee -a /etc/profile.d/java.sh
source /etc/profile.d/java.sh

# Verify
java -version

3.3 Linux (Alpine)

# Install OpenJDK 17
apk add --no-cache openjdk17-jdk

# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' >> /etc/profile

# Verify
java -version

For Docker deployments, use the Eclipse Temurin base image, which the Dockerfile in section 6 already does:

FROM eclipse-temurin:17-jdk-jammy

4. Ghidra Installation

4.1 Download Ghidra

# Set version
GHIDRA_VERSION=11.2
GHIDRA_BUILD_DATE=20241105  # Must match the date in the release asset name; confirm it on the release page for ${GHIDRA_VERSION}

# Download from GitHub releases
cd /tmp
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip

# Verify checksum (obtain SHA256 from release page)
GHIDRA_SHA256="<insert-sha256-here>"
echo "${GHIDRA_SHA256}  ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" | sha256sum -c -
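Because the checksum is shipped as a placeholder, a guard that refuses to continue until a real digest is supplied may be useful. The sketch below is one way to do that; `verify_sha256` is a hypothetical helper name, and it relies only on `sha256sum` and `awk`.

```bash
# Hypothetical helper: verify a download against an expected SHA-256 digest.
# Returns 0 on match, 1 on mismatch, 2 when the expected value is still a placeholder.
verify_sha256() {
    file="$1"
    expected="$2"
    case "$expected" in
        ""|"<"*)
            echo "checksum placeholder was not replaced" >&2
            return 2
            ;;
    esac
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" != "$expected" ]; then
        echo "SHA-256 mismatch for $file" >&2
        return 1
    fi
}
```

With the variables from the download step, a call such as `verify_sha256 "ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" "${GHIDRA_SHA256}" || exit 1` would stop an install script before extraction.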

4.2 Extract and Install

# Extract to /opt
sudo unzip ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip -d /opt

# Create symlink for version-agnostic path
sudo ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra

# Set permissions
sudo chmod +x /opt/ghidra/support/analyzeHeadless
sudo chmod +x /opt/ghidra/ghidraRun

# Set environment variables
echo 'export GHIDRA_HOME=/opt/ghidra' | sudo tee -a /etc/profile.d/ghidra.sh
echo 'export PATH="${GHIDRA_HOME}/support:${PATH}"' | sudo tee -a /etc/profile.d/ghidra.sh
source /etc/profile.d/ghidra.sh

4.3 Verify Installation

# Test headless mode
analyzeHeadless /tmp TempProject -help

# Expected output: Ghidra Headless Analyzer usage information

5. Python and ghidriff Installation

5.1 Install Python Dependencies

# Ubuntu/Debian
sudo apt-get install -y python3 python3-pip python3-venv

# RHEL/Fedora
sudo dnf install -y python3 python3-pip

# Alpine
apk add --no-cache python3 py3-pip

5.2 Install ghidriff

# Install globally (not recommended for production; Debian Bookworm and other PEP 668 systems reject this outside a virtual environment)
sudo pip3 install ghidriff

# Install in virtual environment (recommended)
python3 -m venv /opt/stellaops/venv
source /opt/stellaops/venv/bin/activate
pip install ghidriff

# Verify installation
python3 -m ghidriff --version
# Expected: ghidriff version 0.x.x

5.3 Air-Gapped Installation

# On an internet-connected machine with the same OS, CPU architecture, and Python version as the target, download wheels
mkdir -p /tmp/ghidriff-wheels
pip download --dest /tmp/ghidriff-wheels ghidriff

# Transfer /tmp/ghidriff-wheels to air-gapped machine

# On air-gapped machine, install from local wheels
pip install --no-index --find-links /tmp/ghidriff-wheels ghidriff

6. Docker Deployment

6.1 Dockerfile

Create devops/docker/ghidra/Dockerfile.headless:

# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.

FROM eclipse-temurin:17-jdk-jammy

ARG GHIDRA_VERSION=11.2
ARG GHIDRA_BUILD_DATE=20241105
ARG GHIDRA_SHA256=<insert-sha256-here>

LABEL org.opencontainers.image.title="StellaOps Ghidra Headless"
LABEL org.opencontainers.image.description="Ghidra headless analysis server with ghidriff for BinaryIndex"
LABEL org.opencontainers.image.version="${GHIDRA_VERSION}"
LABEL org.opencontainers.image.licenses="AGPL-3.0-or-later"

# Install dependencies
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    python3-venv \
    curl \
    unzip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Download and verify Ghidra
RUN curl -fsSL "https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" \
    -o /tmp/ghidra.zip \
    && echo "${GHIDRA_SHA256}  /tmp/ghidra.zip" | sha256sum -c - \
    && unzip /tmp/ghidra.zip -d /opt \
    && rm /tmp/ghidra.zip \
    && ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra \
    && chmod +x /opt/ghidra/support/analyzeHeadless

# Install ghidriff
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir ghidriff

# Set environment variables
ENV GHIDRA_HOME=/opt/ghidra
ENV JAVA_HOME=/opt/java/openjdk
ENV PATH="${GHIDRA_HOME}/support:/opt/venv/bin:${PATH}"
ENV MAXMEM=4G

# Create working directories
RUN mkdir -p /projects /scripts /output \
    && chmod 755 /projects /scripts /output

WORKDIR /projects

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD analyzeHeadless /tmp HealthCheck -help > /dev/null 2>&1 || exit 1

# Default entrypoint
ENTRYPOINT ["analyzeHeadless"]
CMD ["--help"]

6.2 Build Docker Image

# Navigate to docker directory
cd devops/docker/ghidra

# Build image
docker build \
    -f Dockerfile.headless \
    -t stellaops/ghidra-headless:11.2 \
    -t stellaops/ghidra-headless:latest \
    --build-arg GHIDRA_SHA256="<insert-sha256>" \
    .

# Verify build
docker run --rm stellaops/ghidra-headless:latest --help

6.3 Docker Compose Configuration

Create devops/compose/docker-compose.ghidra.yml:

# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.

version: "3.9"

services:
  ghidra-headless:
    image: stellaops/ghidra-headless:11.2
    container_name: stellaops-ghidra-headless
    hostname: ghidra-headless
    restart: unless-stopped

    volumes:
      - ghidra-projects:/projects
      - ghidra-scripts:/scripts
      - ghidra-output:/output
      - /etc/localtime:/etc/localtime:ro

    environment:
      JAVA_HOME: /opt/java/openjdk
      MAXMEM: ${GHIDRA_MAXMEM:-4G}
      GHIDRA_INSTALL_DIR: /opt/ghidra

    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
        reservations:
          cpus: '2'
          memory: 4G

    networks:
      - stellaops-backend

    # Override entrypoint for long-running service
    # In production, use a wrapper script or queue-based invocation
    entrypoint: ["/bin/bash"]
    command: ["-c", "tail -f /dev/null"]

  bsim-postgres:
    image: postgres:16-alpine
    container_name: stellaops-bsim-postgres
    hostname: bsim-postgres
    restart: unless-stopped

    volumes:
      - bsim-data:/var/lib/postgresql/data
      - ./init-bsim-db.sql:/docker-entrypoint-initdb.d/01-init.sql:ro

    environment:
      POSTGRES_DB: bsim
      POSTGRES_USER: bsim
      POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD:-changeme}
      PGDATA: /var/lib/postgresql/data/pgdata

    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G

    networks:
      - stellaops-backend

    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U bsim"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  ghidra-projects:
    name: stellaops-ghidra-projects
  ghidra-scripts:
    name: stellaops-ghidra-scripts
  ghidra-output:
    name: stellaops-ghidra-output
  bsim-data:
    name: stellaops-bsim-data

networks:
  stellaops-backend:
    name: stellaops-backend
    external: true
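The compose file keeps the container idle and suggests a wrapper script for real invocations. A minimal sketch of such a wrapper follows. The function name `run_headless` and the variables `ANALYZE_HEADLESS` and `GHIDRA_TIMEOUT_SECONDS` are assumptions made for illustration; the `-import` and `-deleteProject` options are standard `analyzeHeadless` options, and the 300-second default mirrors `DefaultTimeoutSeconds` in section 8.1.

```bash
# Hypothetical wrapper: run one headless import with a hard wall-clock limit.
# ANALYZE_HEADLESS and GHIDRA_TIMEOUT_SECONDS are illustrative variable names.
run_headless() {
    binary="$1"
    limit="${GHIDRA_TIMEOUT_SECONDS:-300}"
    project="job_$$"    # per-invocation project name keyed on the shell PID
    status=0
    # timeout(1) sends SIGTERM to the command if it overruns and exits with 124;
    # -deleteProject removes the temporary project once analysis finishes
    timeout "$limit" "${ANALYZE_HEADLESS:-analyzeHeadless}" \
        /projects "$project" -import "$binary" -deleteProject || status=$?
    if [ "$status" -eq 124 ]; then
        echo "analysis of $binary exceeded ${limit}s" >&2
    fi
    return "$status"
}
```

The wrapper returns the exit status of the analysis unchanged, so a caller or queue worker can distinguish a timeout (124) from an analysis failure.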

6.4 BSim Database Initialization

Create devops/compose/init-bsim-db.sql:

-- Copyright (c) StellaOps. All rights reserved.
-- Licensed under AGPL-3.0-or-later.

-- BSim database initialization for Ghidra
-- This schema is managed by Ghidra's BSim tooling

-- Create extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Create application user (if different from postgres user)
-- Adjust as needed for your deployment
DO $$
BEGIN
    IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'bsim_app') THEN
        CREATE ROLE bsim_app WITH LOGIN PASSWORD 'changeme';
    END IF;
END
$$;

-- Grant permissions
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim_app;

-- Note: Ghidra's BSim will create its own schema tables on first use
-- See Ghidra BSim documentation for schema details

6.5 Start Services

# Create backend network if it doesn't exist
docker network inspect stellaops-backend > /dev/null 2>&1 || docker network create stellaops-backend

# Set environment variables
export BSIM_DB_PASSWORD=your-secure-password
export GHIDRA_MAXMEM=4G  # JVM heap; keep it below the container's 8G memory limit

# Start services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d

# Verify services are running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps

# Check logs
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f ghidra-headless
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f bsim-postgres

7. BSim PostgreSQL Database Setup

7.1 Database Creation

BSim uses PostgreSQL as its backend database. Ghidra's BSim tooling will create the schema automatically on first use, but you need to provision the database instance.

7.2 Manual Database Setup (Non-Docker)

# As postgres user, create database and user
sudo -u postgres psql <<EOF
CREATE DATABASE bsim;
CREATE USER bsim WITH ENCRYPTED PASSWORD 'your-secure-password';
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim;
\c bsim
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
GRANT ALL PRIVILEGES ON SCHEMA public TO bsim;
EOF

7.3 BSim Server Configuration

Create BSim server configuration (if using BSim server mode, optional):

<?xml version="1.0" encoding="UTF-8"?>
<!-- /etc/stellaops/bsim-server.xml -->
<bsim>
    <database>
        <host>localhost</host>
        <port>5432</port>
        <name>bsim</name>
        <user>bsim</user>
        <password>your-secure-password</password>
    </database>
    <server>
        <port>6543</port>
        <maxConnections>10</maxConnections>
    </server>
</bsim>

7.4 Test BSim Connection

# Using Ghidra's bsim command-line tool (run it without arguments to list the subcommands your Ghidra version supports)
$GHIDRA_HOME/support/bsim createdb postgresql://bsim:your-secure-password@localhost:5432/bsim stellaops_corpus

# Expected: Database created successfully

8. Configuration

8.1 StellaOps Configuration

Add Ghidra configuration to your StellaOps service configuration file (e.g., etc/binaryindex.yaml):

# Ghidra Integration Configuration
Ghidra:
  # Path to Ghidra installation directory (GHIDRA_HOME)
  GhidraHome: /opt/ghidra

  # Path to Java installation directory (JAVA_HOME)
  # If not set, system JAVA_HOME will be used
  JavaHome: /usr/lib/jvm/java-17-openjdk

  # Working directory for Ghidra projects and temporary files
  WorkDir: /var/lib/stellaops/ghidra

  # Path to custom Ghidra scripts directory
  ScriptsDir: /opt/stellaops/ghidra-scripts

  # Maximum memory for Ghidra JVM (e.g., "4G", "8192M")
  MaxMemory: 4G

  # Maximum CPU cores for Ghidra analysis
  MaxCpu: 4

  # Default timeout for analysis operations in seconds
  DefaultTimeoutSeconds: 300

  # Whether to clean up temporary projects after analysis
  CleanupTempProjects: true

  # Maximum concurrent Ghidra instances
  MaxConcurrentInstances: 1

  # Whether Ghidra integration is enabled
  Enabled: true

# BSim Database Configuration
BSim:
  # BSim database connection string
  # Format: postgresql://user:pass@host:port/database
  ConnectionString: postgresql://bsim:your-secure-password@bsim-postgres:5432/bsim

  # Alternative: Specify components separately
  # Host: bsim-postgres
  # Port: 5432
  # Database: bsim
  # Username: bsim
  # Password: your-secure-password

  # Default minimum similarity for queries
  DefaultMinSimilarity: 0.7

  # Default maximum results per query
  DefaultMaxResults: 10

  # Whether BSim integration is enabled
  Enabled: true

# ghidriff Python Bridge Configuration
Ghidriff:
  # Path to Python executable
  # If not set, "python3" or "python" will be used from PATH
  PythonPath: /opt/venv/bin/python3

  # Path to ghidriff module (if not installed via pip)
  # GhidriffModulePath: /opt/stellaops/ghidriff

  # Whether to include decompilation in diff output by default
  DefaultIncludeDecompilation: true

  # Whether to include disassembly in diff output by default
  DefaultIncludeDisassembly: true

  # Default timeout for ghidriff operations in seconds
  DefaultTimeoutSeconds: 600

  # Working directory for ghidriff output
  WorkDir: /var/lib/stellaops/ghidriff

  # Whether ghidriff integration is enabled
  Enabled: true

8.2 Environment Variables

You can also configure Ghidra via environment variables:

# Ghidra
export STELLAOPS_GHIDRA_GHIDRAHOME=/opt/ghidra
export STELLAOPS_GHIDRA_JAVAHOME=/usr/lib/jvm/java-17-openjdk
export STELLAOPS_GHIDRA_MAXMEMORY=4G
export STELLAOPS_GHIDRA_MAXCPU=4
export STELLAOPS_GHIDRA_ENABLED=true

# BSim
export STELLAOPS_BSIM_CONNECTIONSTRING=postgresql://bsim:password@localhost:5432/bsim
export STELLAOPS_BSIM_ENABLED=true

# ghidriff
export STELLAOPS_GHIDRIFF_PYTHONPATH=/opt/venv/bin/python3
export STELLAOPS_GHIDRIFF_ENABLED=true
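The variable names above appear to follow a `STELLAOPS_<SECTION>_<KEY>` pattern, upper-cased. Assuming that pattern also holds for keys not listed here (which this guide does not confirm), a name can be derived as in the sketch below; `env_name` is a hypothetical helper.

```bash
# Derive the environment variable name for a configuration key, following the
# STELLAOPS_<SECTION>_<KEY> pattern visible in the examples above (assumed to
# apply to other keys as well).
env_name() {
    printf 'STELLAOPS_%s_%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]'
}

env_name Ghidra MaxMemory       # prints: STELLAOPS_GHIDRA_MAXMEMORY
env_name Ghidriff PythonPath    # prints: STELLAOPS_GHIDRIFF_PYTHONPATH
```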

8.3 appsettings.json (ASP.NET Core)

For services using ASP.NET Core configuration:

{
  "Ghidra": {
    "GhidraHome": "/opt/ghidra",
    "JavaHome": "/usr/lib/jvm/java-17-openjdk",
    "WorkDir": "/var/lib/stellaops/ghidra",
    "MaxMemory": "4G",
    "MaxCpu": 4,
    "DefaultTimeoutSeconds": 300,
    "CleanupTempProjects": true,
    "MaxConcurrentInstances": 1,
    "Enabled": true
  },
  "BSim": {
    "ConnectionString": "postgresql://bsim:password@bsim-postgres:5432/bsim",
    "DefaultMinSimilarity": 0.7,
    "DefaultMaxResults": 10,
    "Enabled": true
  },
  "Ghidriff": {
    "PythonPath": "/opt/venv/bin/python3",
    "DefaultIncludeDecompilation": true,
    "DefaultIncludeDisassembly": true,
    "DefaultTimeoutSeconds": 600,
    "WorkDir": "/var/lib/stellaops/ghidriff",
    "Enabled": true
  }
}

9. Testing and Validation

9.1 Ghidra Headless Test

Create a simple test binary and analyze it:

# Create test C program
cat > /tmp/test.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF

# Compile
gcc -o /tmp/test /tmp/test.c

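# Run Ghidra analysis (auto-analysis must run so that functions are discovered;
# -deleteProject removes the project afterwards so the test can be repeated)
analyzeHeadless /tmp TestProject \
    -import /tmp/test \
    -postScript ListFunctionsScript.java \
    -deleteProject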

# Expected: Analysis completes without errors, lists functions (main, add)

9.2 BSim Database Test

# Create test BSim database
$GHIDRA_HOME/support/bsim createdb \
    postgresql://bsim:password@localhost:5432/bsim \
    test_corpus

# Ingest test binary into BSim
$GHIDRA_HOME/support/bsim ingest \
    postgresql://bsim:password@localhost:5432/bsim/test_corpus \
    /tmp/test

# Query BSim
$GHIDRA_HOME/support/bsim querysimilar \
    postgresql://bsim:password@localhost:5432/bsim/test_corpus \
    /tmp/test \
    --threshold 0.7

# Expected: Shows functions from test binary with similarity scores

9.3 ghidriff Test

# Create two versions of a binary (modify test.c slightly)
cat > /tmp/test_v2.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    // Added comment
    return a + b + 1;  // Modified
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF

gcc -o /tmp/test_v2 /tmp/test_v2.c

# Run ghidriff
python3 -m ghidriff /tmp/test /tmp/test_v2 \
    --output-dir /tmp/ghidriff-test \
    --output-format json

# Expected: Creates diff.json in /tmp/ghidriff-test showing changes
cat /tmp/ghidriff-test/diff.json

9.4 Integration Test

Test the BinaryIndex Ghidra integration:

# Run BinaryIndex integration tests
dotnet test src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Ghidra.Tests/ \
    --filter "Category=Integration" \
    --logger "trx;LogFileName=ghidra-tests.trx"

# Expected: All tests pass

10. Troubleshooting

10.1 Common Issues

Issue: "analyzeHeadless: command not found"

Solution:

# Ensure GHIDRA_HOME is set
export GHIDRA_HOME=/opt/ghidra
export PATH="${GHIDRA_HOME}/support:${PATH}"

# Verify
which analyzeHeadless

Issue: "Java version mismatch" or "UnsupportedClassVersionError"

Solution:

# Check Java version
java -version
# Must be Java 17+

# Set correct JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk

Issue: "OutOfMemoryError: Java heap space"

Solution:

# Increase MAXMEM
export MAXMEM=8G

# Or in configuration
Ghidra:
  MaxMemory: 8G

Issue: "ghidriff: No module named 'ghidriff'"

Solution:

# Install ghidriff
pip3 install ghidriff

# Or activate venv
source /opt/venv/bin/activate
pip install ghidriff

# Verify
python3 -m ghidriff --version

Issue: "BSim connection refused"

Solution:

# Check PostgreSQL is running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps bsim-postgres

# Test connection
psql -h localhost -p 5432 -U bsim -d bsim -c "SELECT version();"

# Check connection string in configuration
# Ensure format: postgresql://user:pass@host:port/database
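When a connection string is suspected to be malformed, splitting it into its parts makes each field easy to inspect. The sketch below does this with POSIX parameter expansion. `parse_bsim_url` and the `BSIM_*` variable names are illustrative, and percent-encoded characters are not decoded.

```bash
# Hypothetical helper: split a BSim connection string into its parts.
# Sets BSIM_USER, BSIM_PASS, BSIM_HOST, BSIM_PORT, BSIM_DB (illustrative names).
parse_bsim_url() {
    case "$1" in
        postgresql://*@*/*) rest=${1#postgresql://} ;;
        *) echo "expected postgresql://user:pass@host:port/database" >&2; return 1 ;;
    esac
    creds=${rest%@*}          # user:pass (everything before the last '@')
    hostpart=${rest##*@}      # host:port/database
    BSIM_USER=${creds%%:*}
    BSIM_PASS=${creds#*:}
    hostport=${hostpart%%/*}
    BSIM_DB=${hostpart#*/}
    BSIM_HOST=${hostport%%:*}
    case "$hostport" in
        *:*) BSIM_PORT=${hostport##*:} ;;
        *)   BSIM_PORT=5432 ;;    # PostgreSQL default port
    esac
}

parse_bsim_url 'postgresql://bsim:your-secure-password@bsim-postgres:5432/bsim'
echo "$BSIM_USER@$BSIM_HOST:$BSIM_PORT/$BSIM_DB"   # prints: bsim@bsim-postgres:5432/bsim
```

The parsed fields can then be passed to the same `psql` test shown above, for example `PGPASSWORD="$BSIM_PASS" psql -h "$BSIM_HOST" -p "$BSIM_PORT" -U "$BSIM_USER" -d "$BSIM_DB" -c "SELECT version();"`.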

Issue: "Ghidra analysis hangs or times out"

Solution:

# Increase timeout
Ghidra:
  DefaultTimeoutSeconds: 600  # 10 minutes

# Reduce analysis scope (-noanalysis skips auto-analysis entirely; -processor pins the language)
analyzeHeadless /tmp TestProject -import /tmp/test \
    -noanalysis \
    -processor x86:LE:64:default

# Check system resources (CPU, memory)
docker stats stellaops-ghidra-headless

10.2 Logging and Diagnostics

Enable Ghidra Debug Logging

# Run with verbose output
analyzeHeadless /tmp TestProject -import /tmp/test \
    -log /tmp/ghidra-analysis.log \
    -logLevel DEBUG

# Check log file
tail -f /tmp/ghidra-analysis.log

Enable StellaOps Ghidra Logging

Add to appsettings.json:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "StellaOps.BinaryIndex.Ghidra": "Debug"
    }
  }
}

Docker Container Logs

# View Ghidra headless logs
docker logs stellaops-ghidra-headless -f

# View BSim PostgreSQL logs
docker logs stellaops-bsim-postgres -f

# View logs with timestamps
docker logs stellaops-ghidra-headless --timestamps

10.3 Performance Tuning

Optimize Ghidra Memory Settings

Ghidra:
  # For large binaries (>100MB)
  MaxMemory: 16G

  # For many concurrent analyses
  MaxConcurrentInstances: 4

Optimize BSim Queries

BSim:
  # Reduce result set for faster queries
  DefaultMaxResults: 5

  # Increase similarity threshold to reduce matches
  DefaultMinSimilarity: 0.8

Docker Resource Limits

services:
  ghidra-headless:
    deploy:
      resources:
        limits:
          cpus: '8'      # Increase for faster analysis
          memory: 16G    # Match MaxMemory + overhead
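The "MaxMemory + overhead" rule can be turned into a quick calculation. The sketch below converts a `MaxMemory` value to MiB and multiplies by `MaxConcurrentInstances`. The helper names and the 1024 MiB per-instance overhead are assumptions chosen for illustration, not measured values.

```bash
# to_mib: convert a MaxMemory value such as "4G" or "8192M" to MiB.
to_mib() {
    number=${1%[GgMm]}
    case "$1" in
        *[Gg]) echo $((number * 1024)) ;;
        *[Mm]) echo "$number" ;;
        *) echo "unsupported memory value: $1" >&2; return 1 ;;
    esac
}

# container_limit_mib: MaxMemory, MaxConcurrentInstances, per-instance overhead.
container_limit_mib() {
    heap=$(to_mib "$1") || return 1
    instances="${2:-1}"
    overhead="${3:-1024}"    # assumed non-heap overhead per JVM, in MiB
    echo $(( (heap + overhead) * instances ))
}

container_limit_mib 16G 1    # prints: 17408
container_limit_mib 4G 4     # prints: 20480
```

Under these assumptions, a single 16G instance would need a container limit slightly above the `16G` shown in the example, which leaves no headroom for the JVM's non-heap memory.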

11. Production Deployment Checklist

11.1 Pre-Deployment

  • Java 17+ installed and verified
  • Ghidra 11.2+ downloaded and SHA256 verified
  • Python 3.10+ installed
  • ghidriff installed and tested
  • PostgreSQL 16+ available for BSim
  • Docker images built and tested
  • Configuration files reviewed and validated
  • Network connectivity verified (or air-gap packages prepared)

11.2 Security Hardening

  • BSim database password set to strong value (not "changeme")
  • PostgreSQL configured with TLS/SSL
  • Ghidra working directories have restricted permissions (700)
  • Docker containers run as non-root user
  • Network segmentation configured (backend network only)
  • Firewall rules restrict BSim PostgreSQL access
  • Audit logging enabled for Ghidra operations

11.3 Post-Deployment

  • Ghidra headless test completed successfully
  • BSim database initialized and accessible
  • ghidriff integration tested
  • BinaryIndex integration tests pass
  • Monitoring and alerting configured
  • Log aggregation configured
  • Backup strategy for BSim database configured
  • Runbook/procedures documented

12. Monitoring and Observability

12.1 Metrics

StellaOps exposes Prometheus metrics for Ghidra integration:

| Metric | Type | Description |
|--------|------|-------------|
| ghidra_analysis_total | Counter | Total Ghidra analyses performed |
| ghidra_analysis_duration_seconds | Histogram | Duration of Ghidra analyses |
| ghidra_analysis_errors_total | Counter | Total Ghidra analysis errors |
| ghidra_instances_active | Gauge | Active Ghidra headless instances |
| bsim_query_total | Counter | Total BSim queries |
| bsim_query_duration_seconds | Histogram | Duration of BSim queries |
| bsim_matches_total | Counter | Total BSim matches found |
| ghidriff_diff_total | Counter | Total ghidriff diffs performed |
| ghidriff_diff_duration_seconds | Histogram | Duration of ghidriff diffs |

12.2 Health Checks

Ghidra service health check endpoint (if using wrapper service):

# HTTP health check
curl http://localhost:8080/health/ghidra

# Expected response:
{
  "status": "Healthy",
  "ghidra": {
    "available": true,
    "version": "11.2",
    "javaVersion": "17.0.x"
  },
  "bsim": {
    "available": true,
    "connection": "OK"
  }
}

12.3 Alerts

Recommended Prometheus alerts:

groups:
  - name: ghidra
    rules:
      - alert: GhidraAnalysisHighErrorRate
        expr: rate(ghidra_analysis_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High Ghidra analysis error rate"
          description: "Ghidra error rate is {{ $value }} errors/sec"

      - alert: GhidraAnalysisSlow
        expr: histogram_quantile(0.95, sum by (le) (rate(ghidra_analysis_duration_seconds_bucket[10m]))) > 600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Ghidra analyses are slow"
          description: "P95 analysis duration is {{ $value }}s (>10m)"

      - alert: BSimDatabaseDown
        expr: up{job="bsim-postgres"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "BSim database is down"
          description: "BSim PostgreSQL database is unreachable"

13. Backup and Recovery

13.1 BSim Database Backup

#!/bin/bash
# Automated backup script
set -euo pipefail

BACKUP_DIR=/var/backups/stellaops/bsim
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "${BACKUP_DIR}"

# Create backup (custom format, already compressed by pg_dump)
docker exec stellaops-bsim-postgres \
    pg_dump -U bsim -Fc bsim > "${BACKUP_DIR}/bsim_${DATE}.dump"

# Compress (optional)
gzip "${BACKUP_DIR}/bsim_${DATE}.dump"

# Retention: keep last 7 days
find "${BACKUP_DIR}" -name "bsim_*.dump.gz" -mtime +7 -delete

13.2 BSim Database Restore

# Stop dependent services
docker-compose -f devops/compose/docker-compose.ghidra.yml stop ghidra-headless

# Restore from backup
gunzip -c /var/backups/stellaops/bsim/bsim_20260105_120000.dump.gz | \
docker exec -i stellaops-bsim-postgres \
    pg_restore -U bsim -d bsim --clean --if-exists

# Restart services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d

13.3 Ghidra Project Backup

# Backup Ghidra projects (if using persistent projects)
tar -czf /var/backups/stellaops/ghidra/projects_$(date +%Y%m%d).tar.gz \
    /var/lib/stellaops/ghidra/projects

# Scripts backup
tar -czf /var/backups/stellaops/ghidra/scripts_$(date +%Y%m%d).tar.gz \
    /opt/stellaops/ghidra-scripts

14. Air-Gapped Deployment

14.1 Package Preparation

On internet-connected machine:

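# Create the staging directory
mkdir -p airgap-packages

# Download Ghidra into the staging directory so it is included in the tarball
wget -P airgap-packages https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2_build/ghidra_11.2_PUBLIC_20241105.zip

# Download Python wheels
pip download --dest airgap-packages ghidriff

# Export Docker images (build or pull them first)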
docker save stellaops/ghidra-headless:11.2 | gzip > airgap-packages/ghidra-headless-11.2.tar.gz
docker save postgres:16-alpine | gzip > airgap-packages/postgres-16-alpine.tar.gz

# Create tarball
tar -czf stellaops-ghidra-airgap.tar.gz airgap-packages/

14.2 Air-Gapped Installation

On air-gapped machine:

# Extract package
tar -xzf stellaops-ghidra-airgap.tar.gz

# Install Ghidra
cd airgap-packages
unzip ghidra_11.2_PUBLIC_20241105.zip -d /opt
ln -s /opt/ghidra_11.2_PUBLIC /opt/ghidra

# Install Python packages
pip install --no-index --find-links . ghidriff

# Load Docker images
docker load < ghidra-headless-11.2.tar.gz
docker load < postgres-16-alpine.tar.gz

# Proceed with normal deployment
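Since the bundle crosses a trust boundary, it may be worth recording checksums before transfer and verifying them after extraction. The sketch below shows one way to do that with `sha256sum`. The helper names and the `SHA256SUMS` manifest file name are assumptions, and file names containing whitespace are not handled.

```bash
# write_manifest: record a SHA-256 digest for every file under a directory.
# Run on the internet-connected machine before creating the tarball.
write_manifest() {
    ( cd "$1" && find . -type f ! -name SHA256SUMS | sort | xargs sha256sum > SHA256SUMS )
}

# verify_manifest: succeed only if every recorded file is present and unchanged.
# Run on the air-gapped machine after extracting the tarball.
verify_manifest() {
    ( cd "$1" && sha256sum -c SHA256SUMS > /dev/null )
}
```

For example, `write_manifest airgap-packages` would run just before the `tar -czf` step, and `verify_manifest airgap-packages || exit 1` just after `tar -xzf` on the target.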

15. References

15.1 Documentation

  • PostgreSQL Guide: docs/operations/postgresql-guide.md
  • Docker Deployment Guide: docs/operations/docker-deployment.md
  • Air-Gap Operation Guide: docs/OFFLINE_KIT.md
  • Security Hardening Guide: docs/operations/security-hardening.md

15.3 External Resources


16. Changelog

| Date | Version | Changes |
|------|---------|---------|
| 2026-01-05 | 1.0.0 | Initial deployment guide created for GHID-019 |

Document Version: 1.0.0
Last Updated: 2026-01-05
Maintainer: BinaryIndex Guild