# Ghidra Deployment Guide

> **Module:** BinaryIndex
> **Component:** Ghidra Integration
> **Status:** PRODUCTION-READY
> **Version:** 1.0.0
> **Related:** [BinaryIndex Architecture](./architecture.md), [SPRINT_20260105_001_003](../../implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md)

---

## 1. Overview

This guide covers the deployment of Ghidra as a secondary analysis backend for the BinaryIndex module. Ghidra provides mature binary analysis capabilities including Version Tracking, BSim behavioral similarity, and FunctionID matching via headless analysis.

### 1.1 Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                     Unified Disassembly/Analysis Layer                      │
│                                                                             │
│  Primary: B2R2 (fast, deterministic)                                        │
│  Fallback: Ghidra (complex cases, low B2R2 confidence)                      │
│                                                                             │
│  ┌──────────────────────────┐  ┌──────────────────────────────────────┐     │
│  │ B2R2 Backend             │  │ Ghidra Backend                       │     │
│  │                          │  │                                      │     │
│  │ - Native .NET            │  │  ┌────────────────────────────────┐  │     │
│  │ - LowUIR lifting         │  │  │ Ghidra Headless Server         │  │     │
│  │ - CFG recovery           │  │  │                                │  │     │
│  │ - Fast fingerprinting    │  │  │ - P-Code decompilation         │  │     │
│  │                          │  │  │ - Version Tracking             │  │     │
│  └──────────────────────────┘  │  │ - BSim queries                 │  │     │
│                                │  │ - FunctionID matching          │  │     │
│                                │  └────────────────────────────────┘  │     │
│                                │                  │                   │     │
│                                │                  v                   │     │
│                                │  ┌────────────────────────────────┐  │     │
│                                │  │ ghidriff Bridge                │  │     │
│                                │  │                                │  │     │
│                                │  │ - Automated patch diffing      │  │     │
│                                │  │ - JSON/Markdown output         │  │     │
│                                │  │ - CI/CD integration            │  │     │
│                                │  └────────────────────────────────┘  │     │
│                                └──────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────────────────┘
```

### 1.2 When Ghidra is Used

Ghidra serves as a fallback/enhancement layer for:

1. **Architectures B2R2 handles poorly** - Exotic architectures, embedded systems
2. **Complex obfuscation scenarios** - Heavily obfuscated or packed binaries
3. **Version Tracking** - Patch diffing with multiple correlators
4. **BSim database queries** - Behavioral similarity matching against known libraries
5. **Low B2R2 confidence** - When B2R2 analysis confidence falls below threshold

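
The fallback rule described in section 1.2 could be expressed as a small routing function. This is only an illustrative sketch: the function name `choose_backend`, the list of architectures assumed to be well supported by B2R2, and the default threshold of 0.75 are all hypothetical and are not taken from BinaryIndex. For example, `choose_backend x86_64 0.4` would route to Ghidra because the confidence is below the threshold.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the fallback rule in section 1.2.
# Nothing here is part of BinaryIndex; names and values are illustrative.

# Architectures assumed (for illustration only) to be handled well by B2R2.
B2R2_WELL_SUPPORTED="x86 x86_64 arm aarch64"

# choose_backend ARCH CONFIDENCE [THRESHOLD]
# Prints "b2r2" or "ghidra".
choose_backend() {
  local arch="$1" confidence="$2" threshold="${3:-0.75}"
  local known=0 a
  for a in $B2R2_WELL_SUPPORTED; do
    if [ "$a" = "$arch" ]; then
      known=1
    fi
  done
  # Unknown or exotic architecture: go straight to Ghidra.
  if [ "$known" -eq 0 ]; then
    echo "ghidra"
    return 0
  fi
  # Bash has no floating-point comparison, so delegate it to awk.
  if awk -v c="$confidence" -v t="$threshold" 'BEGIN { exit !(c < t) }'; then
    echo "ghidra"
  else
    echo "b2r2"
  fi
}
```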
---

## 2. Prerequisites

### 2.1 System Requirements

| Component | Requirement | Notes |
|-----------|-------------|-------|
| **Java** | OpenJDK 17+ | Eclipse Temurin recommended; confirm the minimum JDK in the release notes for your Ghidra version, since newer 11.x releases require JDK 21 |
| **Ghidra** | 11.x (11.2+) | NSA Ghidra from official releases |
| **Python** | 3.10+ | Required for ghidriff |
| **Memory** | 8GB+ RAM | 4GB for Ghidra JVM, 4GB for OS/services |
| **CPU** | 4+ cores | More cores improve analysis speed |
| **Storage** | 10GB+ free | Ghidra installation + project files |

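
The version requirements in the table can be checked before installation. The helpers below are a minimal sketch, not part of StellaOps: the function names `java_major` and `version_ge` are hypothetical, and `version_ge` assumes a `sort` that supports `-V` (GNU coreutils). The commented lines show one way the helpers might be wired to the real `java` and `python3` binaries.

```bash
#!/usr/bin/env bash
# Sketch of a preflight check for the requirements table. Names are illustrative.

# java_major VERSION_STRING -> prints the major version (handles legacy "1.8" style).
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # "1.8.0_292" -> "8.0_292"
  esac
  echo "${v%%[!0-9]*}"    # keep the leading digits only
}

# version_ge A B -> succeeds when dotted version A >= B (relies on `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example wiring (not executed here):
#   raw=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
#   [ "$(java_major "$raw")" -ge 17 ] || echo "Java 17+ required" >&2
#   version_ge "$(python3 -c 'import platform; print(platform.python_version())')" 3.10 \
#     || echo "Python 3.10+ required" >&2
```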
### 2.2 Operating System Support

- **Linux:** Ubuntu 22.04+, Debian Bookworm+, RHEL 9+, Alpine 3.19+
- **Windows:** Windows Server 2022, Windows 10/11 (development only)
- **macOS:** macOS 12+ (development only, limited support)

### 2.3 Network Requirements

For air-gapped deployments:

- Pre-download Ghidra release archives
- Pre-install ghidriff Python package wheels
- No external network access required at runtime

---

## 3. Java Installation

### 3.1 Linux (Ubuntu/Debian)

```bash
# Install Eclipse Temurin 17
# Store the Adoptium signing key as a keyring file (apt-key is deprecated)
wget -qO - https://packages.adoptium.net/artifactory/api/gpg/key/public | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/adoptium.gpg > /dev/null
echo "deb https://packages.adoptium.net/artifactory/deb $(awk -F= '/^VERSION_CODENAME/{print$2}' /etc/os-release) main" | sudo tee /etc/apt/sources.list.d/adoptium.list
sudo apt-get update
sudo apt-get install -y temurin-17-jdk

# Verify installation
java -version
# Expected: openjdk version "17.0.x"
```

### 3.2 Linux (RHEL/Fedora)

```bash
# Install OpenJDK 17
sudo dnf install -y java-17-openjdk-devel

# Set JAVA_HOME
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' | sudo tee -a /etc/profile.d/java.sh
source /etc/profile.d/java.sh

# Verify
java -version
```

### 3.3 Linux (Alpine)

```bash
# Install OpenJDK 17
apk add --no-cache openjdk17-jdk

# Set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
echo 'export JAVA_HOME=/usr/lib/jvm/java-17-openjdk' >> /etc/profile

# Verify
java -version
```

### 3.4 Docker (Recommended)

Use the Eclipse Temurin base image (already included in the Dockerfile in section 6):

```dockerfile
FROM eclipse-temurin:17-jdk-jammy
```

---

## 4. Ghidra Installation

### 4.1 Download Ghidra

```bash
# Set version
GHIDRA_VERSION=11.2
GHIDRA_BUILD_DATE=20241105  # Adjust to actual build date

# Download from GitHub releases
cd /tmp
wget https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip

# Verify checksum (obtain SHA256 from release page)
GHIDRA_SHA256="<insert-sha256-here>"
echo "${GHIDRA_SHA256}  ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" | sha256sum -c -
```
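
When the download step is scripted for several versions, the archive name and URL can be derived from the two variables rather than typed by hand. The sketch below assumes the URL pattern shown in section 4.1; the function names are hypothetical, and `verify_sha256` is just a thin wrapper around `sha256sum` that compares against a digest you obtained from the release page.

```bash
#!/usr/bin/env bash
# Sketch: derive the archive name and URL used in section 4.1 and verify a download.
# Function names are illustrative; the URL pattern is the one shown in this guide.

ghidra_archive_name() {   # ghidra_archive_name VERSION BUILD_DATE
  printf 'ghidra_%s_PUBLIC_%s.zip\n' "$1" "$2"
}

ghidra_release_url() {    # ghidra_release_url VERSION BUILD_DATE
  printf 'https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_%s_build/%s\n' \
    "$1" "$(ghidra_archive_name "$1" "$2")"
}

# verify_sha256 FILE EXPECTED_HEX -> succeeds only when the digest matches.
verify_sha256() {
  local actual
  actual=$(sha256sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}

# Example (not executed here):
#   wget "$(ghidra_release_url "$GHIDRA_VERSION" "$GHIDRA_BUILD_DATE")"
#   verify_sha256 "$(ghidra_archive_name "$GHIDRA_VERSION" "$GHIDRA_BUILD_DATE")" "$GHIDRA_SHA256"
```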
### 4.2 Extract and Install

```bash
# Extract to /opt
sudo unzip ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip -d /opt

# Create symlink for version-agnostic path
sudo ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra

# Set permissions
sudo chmod +x /opt/ghidra/support/analyzeHeadless
sudo chmod +x /opt/ghidra/ghidraRun

# Set environment variables
echo 'export GHIDRA_HOME=/opt/ghidra' | sudo tee -a /etc/profile.d/ghidra.sh
echo 'export PATH="${GHIDRA_HOME}/support:${PATH}"' | sudo tee -a /etc/profile.d/ghidra.sh
source /etc/profile.d/ghidra.sh
```

### 4.3 Verify Installation

```bash
# Test headless mode
analyzeHeadless /tmp TempProject -help

# Expected output: Ghidra Headless Analyzer usage information
```

---

## 5. Python and ghidriff Installation

### 5.1 Install Python Dependencies

```bash
# Ubuntu/Debian
sudo apt-get install -y python3 python3-pip python3-venv

# RHEL/Fedora
sudo dnf install -y python3 python3-pip

# Alpine
apk add --no-cache python3 py3-pip
```

### 5.2 Install ghidriff

```bash
# Install globally (not recommended for production)
sudo pip3 install ghidriff

# Install in virtual environment (recommended)
python3 -m venv /opt/stellaops/venv
source /opt/stellaops/venv/bin/activate
pip install ghidriff

# Verify installation
python3 -m ghidriff --version
# Expected: ghidriff version 0.x.x
```

### 5.3 Air-Gapped Installation

```bash
# On internet-connected machine, download wheels
mkdir -p /tmp/ghidriff-wheels
pip download --dest /tmp/ghidriff-wheels ghidriff

# Transfer /tmp/ghidriff-wheels to air-gapped machine

# On air-gapped machine, install from local wheels
pip install --no-index --find-links /tmp/ghidriff-wheels ghidriff
```

---

## 6. Docker Deployment

### 6.1 Dockerfile

Create `devops/docker/ghidra/Dockerfile.headless`:

```dockerfile
# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.

FROM eclipse-temurin:17-jdk-jammy

ARG GHIDRA_VERSION=11.2
ARG GHIDRA_BUILD_DATE=20241105
ARG GHIDRA_SHA256=<insert-sha256-here>

LABEL org.opencontainers.image.title="StellaOps Ghidra Headless"
LABEL org.opencontainers.image.description="Ghidra headless analysis server with ghidriff for BinaryIndex"
LABEL org.opencontainers.image.version="${GHIDRA_VERSION}"
LABEL org.opencontainers.image.licenses="AGPL-3.0-or-later"

# Install dependencies
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    python3-venv \
    curl \
    unzip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Download and verify Ghidra
RUN curl -fsSL "https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" \
    -o /tmp/ghidra.zip \
    && echo "${GHIDRA_SHA256}  /tmp/ghidra.zip" | sha256sum -c - \
    && unzip /tmp/ghidra.zip -d /opt \
    && rm /tmp/ghidra.zip \
    && ln -s /opt/ghidra_${GHIDRA_VERSION}_PUBLIC /opt/ghidra \
    && chmod +x /opt/ghidra/support/analyzeHeadless

# Install ghidriff
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir ghidriff

# Set environment variables
ENV GHIDRA_HOME=/opt/ghidra
ENV JAVA_HOME=/opt/java/openjdk
ENV PATH="${GHIDRA_HOME}/support:/opt/venv/bin:${PATH}"
ENV MAXMEM=4G

# Create working directories
RUN mkdir -p /projects /scripts /output \
    && chmod 755 /projects /scripts /output

WORKDIR /projects

# Healthcheck
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD analyzeHeadless /tmp HealthCheck -help > /dev/null 2>&1 || exit 1

# Default entrypoint
ENTRYPOINT ["analyzeHeadless"]
CMD ["--help"]
```

### 6.2 Build Docker Image

```bash
# Navigate to docker directory
cd devops/docker/ghidra

# Build image
docker build \
  -f Dockerfile.headless \
  -t stellaops/ghidra-headless:11.2 \
  -t stellaops/ghidra-headless:latest \
  --build-arg GHIDRA_SHA256=<insert-sha256> \
  .

# Verify build
docker run --rm stellaops/ghidra-headless:latest --help
```

### 6.3 Docker Compose Configuration

Create `devops/compose/docker-compose.ghidra.yml`:

```yaml
# Copyright (c) StellaOps. All rights reserved.
# Licensed under AGPL-3.0-or-later.

version: "3.9"

services:
  ghidra-headless:
    image: stellaops/ghidra-headless:11.2
    container_name: stellaops-ghidra-headless
    hostname: ghidra-headless
    restart: unless-stopped

    volumes:
      - ghidra-projects:/projects
      - ghidra-scripts:/scripts
      - ghidra-output:/output
      - /etc/localtime:/etc/localtime:ro

    environment:
      JAVA_HOME: /opt/java/openjdk
      MAXMEM: ${GHIDRA_MAXMEM:-4G}
      GHIDRA_INSTALL_DIR: /opt/ghidra

    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
        reservations:
          cpus: '2'
          memory: 4G

    networks:
      - stellaops-backend

    # Override entrypoint for long-running service
    # In production, use a wrapper script or queue-based invocation
    entrypoint: ["/bin/bash"]
    command: ["-c", "tail -f /dev/null"]

  bsim-postgres:
    image: postgres:16-alpine
    container_name: stellaops-bsim-postgres
    hostname: bsim-postgres
    restart: unless-stopped

    volumes:
      - bsim-data:/var/lib/postgresql/data
      - ./init-bsim-db.sql:/docker-entrypoint-initdb.d/01-init.sql:ro

    environment:
      POSTGRES_DB: bsim
      POSTGRES_USER: bsim
      POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD:-changeme}
      PGDATA: /var/lib/postgresql/data/pgdata

    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G

    networks:
      - stellaops-backend

    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U bsim"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  ghidra-projects:
    name: stellaops-ghidra-projects
  ghidra-scripts:
    name: stellaops-ghidra-scripts
  ghidra-output:
    name: stellaops-ghidra-output
  bsim-data:
    name: stellaops-bsim-data

networks:
  stellaops-backend:
    name: stellaops-backend
    external: true
```

### 6.4 BSim Database Initialization

Create `devops/compose/init-bsim-db.sql`:

```sql
-- Copyright (c) StellaOps. All rights reserved.
-- Licensed under AGPL-3.0-or-later.

-- BSim database initialization for Ghidra
-- This schema is managed by Ghidra's BSim tooling

-- Create extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Create application user (if different from postgres user)
-- Adjust as needed for your deployment
DO $$
BEGIN
    IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'bsim_app') THEN
        CREATE ROLE bsim_app WITH LOGIN PASSWORD 'changeme';
    END IF;
END
$$;

-- Grant permissions
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim_app;

-- Note: Ghidra's BSim will create its own schema tables on first use
-- See Ghidra BSim documentation for schema details
```

### 6.5 Start Services

```bash
# Create the backend network if it doesn't already exist
docker network inspect stellaops-backend >/dev/null 2>&1 || docker network create stellaops-backend

# Set environment variables
export BSIM_DB_PASSWORD=your-secure-password
# Keep the JVM heap below the container memory limit (8G in the compose file)
export GHIDRA_MAXMEM=6G

# Start services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d

# Verify services are running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps

# Check logs
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f ghidra-headless
docker-compose -f devops/compose/docker-compose.ghidra.yml logs -f bsim-postgres
```
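
Because `MAXMEM` sets the JVM heap while the compose file caps the whole container, it can help to check that the two values are compatible before starting the stack. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the default headroom of 1024 MiB for non-heap JVM memory is an illustrative figure, not a Ghidra recommendation.

```bash
#!/usr/bin/env bash
# Sketch: sanity-check that the JVM heap (MAXMEM) fits inside the container limit.
# The 1024 MiB default headroom is an illustrative assumption.

# to_mib SIZE -> converts values such as "4G" or "8192M" to MiB.
to_mib() {
  local size="$1"
  local n="${size%[GgMm]}" unit="${size: -1}"
  case "$unit" in
    G|g) echo $(( n * 1024 )) ;;
    M|m) echo "$n" ;;
    *)   echo "unsupported size: $size" >&2; return 1 ;;
  esac
}

# heap_fits HEAP LIMIT [HEADROOM_MIB] -> succeeds when heap + headroom <= limit.
heap_fits() {
  local heap limit headroom="${3:-1024}"
  heap=$(to_mib "$1") || return 1
  limit=$(to_mib "$2") || return 1
  [ $(( heap + headroom )) -le "$limit" ]
}

# Example (not executed here):
#   heap_fits "${GHIDRA_MAXMEM:-4G}" 8G || echo "MAXMEM too large for container limit" >&2
```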
---

## 7. BSim PostgreSQL Database Setup

### 7.1 Database Creation

BSim uses PostgreSQL as its backend database. Ghidra's BSim tooling creates the schema automatically on first use, but you must provision the database instance yourself.

### 7.2 Manual Database Setup (Non-Docker)

```bash
# As postgres user, create database and user
sudo -u postgres psql <<EOF
CREATE DATABASE bsim;
CREATE USER bsim WITH ENCRYPTED PASSWORD 'your-secure-password';
GRANT ALL PRIVILEGES ON DATABASE bsim TO bsim;
\c bsim
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
GRANT ALL PRIVILEGES ON SCHEMA public TO bsim;
EOF
```

### 7.3 BSim Server Configuration

Create BSim server configuration (if using BSim server mode, optional):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- /etc/stellaops/bsim-server.xml -->
<bsim>
  <database>
    <host>localhost</host>
    <port>5432</port>
    <name>bsim</name>
    <user>bsim</user>
    <password>your-secure-password</password>
  </database>
  <server>
    <port>6543</port>
    <maxConnections>10</maxConnections>
  </server>
</bsim>
```

### 7.4 Test BSim Connection

```bash
# Using Ghidra's bsim command-line tool
# NOTE: subcommand names and arguments vary between Ghidra releases (recent releases
# use `createdatabase` with a configuration template); check the tool's usage output
# for your version before running this.
$GHIDRA_HOME/support/bsim createdb postgresql://bsim:your-secure-password@localhost:5432/bsim stellaops_corpus

# Expected: Database created successfully
```

---

## 8. Configuration

### 8.1 StellaOps Configuration

Add Ghidra configuration to your StellaOps service configuration file (e.g., `etc/binaryindex.yaml`):

```yaml
# Ghidra Integration Configuration
Ghidra:
  # Path to Ghidra installation directory (GHIDRA_HOME)
  GhidraHome: /opt/ghidra

  # Path to Java installation directory (JAVA_HOME)
  # If not set, system JAVA_HOME will be used
  JavaHome: /usr/lib/jvm/java-17-openjdk

  # Working directory for Ghidra projects and temporary files
  WorkDir: /var/lib/stellaops/ghidra

  # Path to custom Ghidra scripts directory
  ScriptsDir: /opt/stellaops/ghidra-scripts

  # Maximum memory for Ghidra JVM (e.g., "4G", "8192M")
  MaxMemory: 4G

  # Maximum CPU cores for Ghidra analysis
  MaxCpu: 4

  # Default timeout for analysis operations in seconds
  DefaultTimeoutSeconds: 300

  # Whether to clean up temporary projects after analysis
  CleanupTempProjects: true

  # Maximum concurrent Ghidra instances
  MaxConcurrentInstances: 1

  # Whether Ghidra integration is enabled
  Enabled: true

# BSim Database Configuration
BSim:
  # BSim database connection string
  # Format: postgresql://user:pass@host:port/database
  ConnectionString: postgresql://bsim:your-secure-password@bsim-postgres:5432/bsim

  # Alternative: Specify components separately
  # Host: bsim-postgres
  # Port: 5432
  # Database: bsim
  # Username: bsim
  # Password: your-secure-password

  # Default minimum similarity for queries
  DefaultMinSimilarity: 0.7

  # Default maximum results per query
  DefaultMaxResults: 10

  # Whether BSim integration is enabled
  Enabled: true

# ghidriff Python Bridge Configuration
Ghidriff:
  # Path to Python executable
  # If not set, "python3" or "python" will be used from PATH
  PythonPath: /opt/venv/bin/python3

  # Path to ghidriff module (if not installed via pip)
  # GhidriffModulePath: /opt/stellaops/ghidriff

  # Whether to include decompilation in diff output by default
  DefaultIncludeDecompilation: true

  # Whether to include disassembly in diff output by default
  DefaultIncludeDisassembly: true

  # Default timeout for ghidriff operations in seconds
  DefaultTimeoutSeconds: 600

  # Working directory for ghidriff output
  WorkDir: /var/lib/stellaops/ghidriff

  # Whether ghidriff integration is enabled
  Enabled: true
```
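
The `ConnectionString` format and the separate `Host`/`Port`/`Database`/`Username` keys describe the same information. As a rough illustration of how one maps to the other, the sketch below splits a string of the documented form into its components. It requires bash (it uses `[[ =~ ]]`), the function name `parse_bsim_url` is hypothetical, and it makes no attempt to handle passwords that contain `@` or `/`.

```bash
#!/usr/bin/env bash
# Sketch: split a connection string of the documented form
#   postgresql://user:pass@host:port/database
# into the separate fields listed in the YAML. The function name is illustrative.

parse_bsim_url() {
  local re='^postgresql://([^:@/]+):([^@/]*)@([^:/]+):([0-9]+)/([^/?]+)$'
  [[ "$1" =~ $re ]] || { echo "unrecognised connection string" >&2; return 1; }
  # The password (BASH_REMATCH[2]) is deliberately not printed.
  printf 'Host=%s\nPort=%s\nDatabase=%s\nUsername=%s\n' \
    "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}" "${BASH_REMATCH[5]}" "${BASH_REMATCH[1]}"
}

# Example (not executed here):
#   parse_bsim_url 'postgresql://bsim:your-secure-password@bsim-postgres:5432/bsim'
```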
### 8.2 Environment Variables

You can also configure Ghidra via environment variables:

```bash
# Ghidra
export STELLAOPS_GHIDRA_GHIDRAHOME=/opt/ghidra
export STELLAOPS_GHIDRA_JAVAHOME=/usr/lib/jvm/java-17-openjdk
export STELLAOPS_GHIDRA_MAXMEMORY=4G
export STELLAOPS_GHIDRA_MAXCPU=4
export STELLAOPS_GHIDRA_ENABLED=true

# BSim
export STELLAOPS_BSIM_CONNECTIONSTRING=postgresql://bsim:password@localhost:5432/bsim
export STELLAOPS_BSIM_ENABLED=true

# ghidriff
export STELLAOPS_GHIDRIFF_PYTHONPATH=/opt/venv/bin/python3
export STELLAOPS_GHIDRIFF_ENABLED=true
```

### 8.3 appsettings.json (ASP.NET Core)

For services using ASP.NET Core configuration:

```json
{
  "Ghidra": {
    "GhidraHome": "/opt/ghidra",
    "JavaHome": "/usr/lib/jvm/java-17-openjdk",
    "WorkDir": "/var/lib/stellaops/ghidra",
    "MaxMemory": "4G",
    "MaxCpu": 4,
    "DefaultTimeoutSeconds": 300,
    "CleanupTempProjects": true,
    "MaxConcurrentInstances": 1,
    "Enabled": true
  },
  "BSim": {
    "ConnectionString": "postgresql://bsim:password@bsim-postgres:5432/bsim",
    "DefaultMinSimilarity": 0.7,
    "DefaultMaxResults": 10,
    "Enabled": true
  },
  "Ghidriff": {
    "PythonPath": "/opt/venv/bin/python3",
    "DefaultIncludeDecompilation": true,
    "DefaultIncludeDisassembly": true,
    "DefaultTimeoutSeconds": 600,
    "WorkDir": "/var/lib/stellaops/ghidriff",
    "Enabled": true
  }
}
```

---

## 9. Testing and Validation

### 9.1 Ghidra Headless Test

Create a simple test binary and analyze it:

```bash
# Create test C program
cat > /tmp/test.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF

# Compile
gcc -o /tmp/test /tmp/test.c

# Run Ghidra analysis (auto-analysis stays enabled so functions are discovered;
# -deleteProject removes the temporary project afterwards so the test can be re-run)
analyzeHeadless /tmp TestProject \
  -import /tmp/test \
  -postScript ListFunctionsScript.java \
  -deleteProject

# Expected: Analysis completes without errors, lists functions (main, add)
```

### 9.2 BSim Database Test

```bash
# NOTE: bsim subcommand names and arguments vary between Ghidra releases;
# check the tool's usage output for your version before running these.

# Create test BSim database
$GHIDRA_HOME/support/bsim createdb \
  postgresql://bsim:password@localhost:5432/bsim \
  test_corpus

# Ingest test binary into BSim
$GHIDRA_HOME/support/bsim ingest \
  postgresql://bsim:password@localhost:5432/bsim/test_corpus \
  /tmp/test

# Query BSim
$GHIDRA_HOME/support/bsim querysimilar \
  postgresql://bsim:password@localhost:5432/bsim/test_corpus \
  /tmp/test \
  --threshold 0.7

# Expected: Shows functions from test binary with similarity scores
```

### 9.3 ghidriff Test

```bash
# Create two versions of a binary (modify test.c slightly)
cat > /tmp/test_v2.c <<'EOF'
#include <stdio.h>

int add(int a, int b) {
    // Added comment
    return a + b + 1;  // Modified
}

int main() {
    int result = add(5, 3);
    printf("Result: %d\n", result);
    return 0;
}
EOF

gcc -o /tmp/test_v2 /tmp/test_v2.c

# Run ghidriff (it locates Ghidra through GHIDRA_INSTALL_DIR;
# option names differ between ghidriff releases, so confirm them with --help)
export GHIDRA_INSTALL_DIR=/opt/ghidra
python3 -m ghidriff /tmp/test /tmp/test_v2 \
  --output-dir /tmp/ghidriff-test \
  --output-format json

# Expected: Creates diff.json in /tmp/ghidriff-test showing changes
cat /tmp/ghidriff-test/diff.json
```

### 9.4 Integration Test

Test the BinaryIndex Ghidra integration:

```bash
# Run BinaryIndex integration tests
dotnet test src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Ghidra.Tests/ \
  --filter "Category=Integration" \
  --logger "trx;LogFileName=ghidra-tests.trx"

# Expected: All tests pass
```

---

## 10. Troubleshooting

### 10.1 Common Issues

#### Issue: "analyzeHeadless: command not found"

**Solution:**
```bash
# Ensure GHIDRA_HOME is set
export GHIDRA_HOME=/opt/ghidra
export PATH="${GHIDRA_HOME}/support:${PATH}"

# Verify
which analyzeHeadless
```

#### Issue: "Java version mismatch" or "UnsupportedClassVersionError"

**Solution:**
```bash
# Check Java version
java -version
# Must be Java 17+

# Set correct JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk
```

#### Issue: "OutOfMemoryError: Java heap space"

**Solution:**
```bash
# Increase MAXMEM
export MAXMEM=8G

# Or set it in the StellaOps configuration file:
#   Ghidra:
#     MaxMemory: 8G
```

#### Issue: "ghidriff: No module named 'ghidriff'"

**Solution:**
```bash
# Install ghidriff
pip3 install ghidriff

# Or activate the venv (use the path where your venv was created)
source /opt/venv/bin/activate
pip install ghidriff

# Verify
python3 -m ghidriff --version
```

#### Issue: "BSim connection refused"

**Solution:**
```bash
# Check PostgreSQL is running
docker-compose -f devops/compose/docker-compose.ghidra.yml ps bsim-postgres

# Test connection
psql -h localhost -p 5432 -U bsim -d bsim -c "SELECT version();"

# Check connection string in configuration
# Ensure format: postgresql://user:pass@host:port/database
```

#### Issue: "Ghidra analysis hangs or times out"

**Solution:**
```bash
# Increase the timeout in the StellaOps configuration file:
#   Ghidra:
#     DefaultTimeoutSeconds: 600  # 10 minutes

# Skip auto-analysis to confirm that the import step alone succeeds
analyzeHeadless /tmp TestProject -import /tmp/test \
  -noanalysis \
  -processor x86:LE:64:default

# Check system resources (CPU, memory)
docker stats stellaops-ghidra-headless
```
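
When reproducing a hang by hand, it can be useful to bound the command from the shell so that it cannot run indefinitely. The wrapper below is a minimal sketch that relies on the `timeout` utility from GNU coreutils; the function name `run_with_timeout` and the 10-second kill grace period are illustrative choices, not StellaOps behaviour. Exit status 124 indicates that the limit was reached.

```bash
#!/usr/bin/env bash
# Sketch: bound a long-running command with coreutils `timeout`.
# The wrapper name and messages are illustrative.

# run_with_timeout SECONDS CMD [ARGS...]
# Returns the command's exit status, or 124 when the time limit was hit.
run_with_timeout() {
  local limit="$1"; shift
  local status=0
  # Send TERM at the limit, then KILL 10 seconds later if the process is still alive.
  timeout --kill-after=10 "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}s: $*" >&2
  fi
  return "$status"
}

# Example (not executed here), mirroring DefaultTimeoutSeconds: 600:
#   run_with_timeout 600 analyzeHeadless /tmp TestProject -import /tmp/test
```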
### 10.2 Logging and Diagnostics

#### Enable Ghidra Debug Logging

```bash
# Write the headless analysis log to a file
analyzeHeadless /tmp TestProject -import /tmp/test \
  -log /tmp/ghidra-analysis.log

# Check log file
tail -f /tmp/ghidra-analysis.log
```

#### Enable StellaOps Ghidra Logging

Add to `appsettings.json`:

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "StellaOps.BinaryIndex.Ghidra": "Debug"
    }
  }
}
```

#### Docker Container Logs

```bash
# View Ghidra headless logs
docker logs stellaops-ghidra-headless -f

# View BSim PostgreSQL logs
docker logs stellaops-bsim-postgres -f

# View logs with timestamps
docker logs stellaops-ghidra-headless --timestamps
```

### 10.3 Performance Tuning

#### Optimize Ghidra Memory Settings

```yaml
Ghidra:
  # For large binaries (>100MB)
  MaxMemory: 16G

  # For many concurrent analyses (each instance can use up to MaxMemory)
  MaxConcurrentInstances: 4
```

#### Optimize BSim Queries

```yaml
BSim:
  # Reduce result set for faster queries
  DefaultMaxResults: 5

  # Increase similarity threshold to reduce matches
  DefaultMinSimilarity: 0.8
```

#### Docker Resource Limits

```yaml
services:
  ghidra-headless:
    deploy:
      resources:
        limits:
          cpus: '8'      # Increase for faster analysis
          memory: 20G    # MaxMemory (16G) plus headroom for JVM and OS overhead
```

---

## 11. Production Deployment Checklist

### 11.1 Pre-Deployment

- [ ] Java 17+ installed and verified
- [ ] Ghidra 11.2+ downloaded and SHA256 verified
- [ ] Python 3.10+ installed
- [ ] ghidriff installed and tested
- [ ] PostgreSQL 16+ available for BSim
- [ ] Docker images built and tested
- [ ] Configuration files reviewed and validated
- [ ] Network connectivity verified (or air-gap packages prepared)

### 11.2 Security Hardening

- [ ] BSim database password set to strong value (not "changeme")
- [ ] PostgreSQL configured with TLS/SSL
- [ ] Ghidra working directories have restricted permissions (700)
- [ ] Docker containers run as non-root user
- [ ] Network segmentation configured (backend network only)
- [ ] Firewall rules restrict BSim PostgreSQL access
- [ ] Audit logging enabled for Ghidra operations

### 11.3 Post-Deployment

- [ ] Ghidra headless test completed successfully
- [ ] BSim database initialized and accessible
- [ ] ghidriff integration tested
- [ ] BinaryIndex integration tests pass
- [ ] Monitoring and alerting configured
- [ ] Log aggregation configured
- [ ] Backup strategy for BSim database configured
- [ ] Runbook/procedures documented

---

## 12. Monitoring and Observability

### 12.1 Metrics

StellaOps exposes Prometheus metrics for Ghidra integration:

| Metric | Type | Description |
|--------|------|-------------|
| `ghidra_analysis_total` | Counter | Total Ghidra analyses performed |
| `ghidra_analysis_duration_seconds` | Histogram | Duration of Ghidra analyses |
| `ghidra_analysis_errors_total` | Counter | Total Ghidra analysis errors |
| `ghidra_instances_active` | Gauge | Active Ghidra headless instances |
| `bsim_query_total` | Counter | Total BSim queries |
| `bsim_query_duration_seconds` | Histogram | Duration of BSim queries |
| `bsim_matches_total` | Counter | Total BSim matches found |
| `ghidriff_diff_total` | Counter | Total ghidriff diffs performed |
| `ghidriff_diff_duration_seconds` | Histogram | Duration of ghidriff diffs |

### 12.2 Health Checks

Ghidra service health check endpoint (if using wrapper service):

```bash
# HTTP health check
curl http://localhost:8080/health/ghidra

# Expected response:
# {
#   "status": "Healthy",
#   "ghidra": {
#     "available": true,
#     "version": "11.2",
#     "javaVersion": "17.0.x"
#   },
#   "bsim": {
#     "available": true,
#     "connection": "OK"
#   }
# }
```

### 12.3 Alerts

Recommended Prometheus alerts:

```yaml
groups:
  - name: ghidra
    rules:
      - alert: GhidraAnalysisHighErrorRate
        expr: rate(ghidra_analysis_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High Ghidra analysis error rate"
          description: "Ghidra error rate is {{ $value }} errors/sec"

      - alert: GhidraAnalysisSlow
        # histogram_quantile needs the per-bucket rates, aggregated by the "le" label
        expr: histogram_quantile(0.95, sum by (le) (rate(ghidra_analysis_duration_seconds_bucket[5m]))) > 600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Ghidra analyses are slow"
          description: "P95 analysis duration is {{ $value }}s (>10m)"

      - alert: BSimDatabaseDown
        expr: up{job="bsim-postgres"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "BSim database is down"
          description: "BSim PostgreSQL database is unreachable"
```

---

## 13. Backup and Recovery

### 13.1 BSim Database Backup

```bash
#!/bin/bash
# Automated backup script
set -euo pipefail

BACKUP_DIR=/var/backups/stellaops/bsim
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "${BACKUP_DIR}"

# Create backup (custom format, which pg_dump already compresses)
docker exec stellaops-bsim-postgres \
  pg_dump -U bsim -Fc bsim > "${BACKUP_DIR}/bsim_${DATE}.dump"

# Compress (optional)
gzip "${BACKUP_DIR}/bsim_${DATE}.dump"

# Retention: delete backups older than 7 days
find "${BACKUP_DIR}" -name "bsim_*.dump.gz" -mtime +7 -delete
```

### 13.2 BSim Database Restore

```bash
# Stop dependent services
docker-compose -f devops/compose/docker-compose.ghidra.yml stop ghidra-headless

# Restore from backup
gunzip -c /var/backups/stellaops/bsim/bsim_20260105_120000.dump.gz | \
  docker exec -i stellaops-bsim-postgres \
  pg_restore -U bsim -d bsim --clean --if-exists

# Restart services
docker-compose -f devops/compose/docker-compose.ghidra.yml up -d
```

### 13.3 Ghidra Project Backup

```bash
# Ensure the backup directory exists
mkdir -p /var/backups/stellaops/ghidra

# Backup Ghidra projects (if using persistent projects)
tar -czf /var/backups/stellaops/ghidra/projects_$(date +%Y%m%d).tar.gz \
  /var/lib/stellaops/ghidra/projects

# Scripts backup
tar -czf /var/backups/stellaops/ghidra/scripts_$(date +%Y%m%d).tar.gz \
  /opt/stellaops/ghidra-scripts
```

---

## 14. Air-Gapped Deployment

### 14.1 Package Preparation

On an internet-connected machine:

```bash
mkdir -p airgap-packages

# Download Ghidra into the bundle directory
# (adjust the version and build date to match the release page)
wget -P airgap-packages https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.2_build/ghidra_11.2_PUBLIC_20241105.zip

# Download Python wheels
pip download --dest airgap-packages ghidriff

# Save Docker images
docker save stellaops/ghidra-headless:11.2 | gzip > airgap-packages/ghidra-headless-11.2.tar.gz
docker save postgres:16-alpine | gzip > airgap-packages/postgres-16-alpine.tar.gz

# Create tarball
tar -czf stellaops-ghidra-airgap.tar.gz airgap-packages/
```

### 14.2 Air-Gapped Installation

On the air-gapped machine:

```bash
# Extract package
tar -xzf stellaops-ghidra-airgap.tar.gz

# Install Ghidra
cd airgap-packages
unzip ghidra_11.2_PUBLIC_20241105.zip -d /opt
ln -s /opt/ghidra_11.2_PUBLIC /opt/ghidra

# Install Python packages
pip install --no-index --find-links . ghidriff

# Load Docker images
docker load < ghidra-headless-11.2.tar.gz
docker load < postgres-16-alpine.tar.gz

# Proceed with normal deployment
```
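
Since the bundle crosses a trust boundary, it may be worth recording digests before transfer and checking them on arrival. The following is a minimal sketch, assuming GNU `sha256sum`, `find`, `sort -z`, and `xargs -0` are available on both machines; the manifest name `SHA256SUMS` and the function names are hypothetical and not part of the StellaOps offline kit. One would call `write_manifest airgap-packages` before creating the tarball and `verify_manifest airgap-packages` after extracting it.

```bash
#!/usr/bin/env bash
# Sketch: record and verify SHA-256 digests for the transfer bundle.
# The manifest name SHA256SUMS and the function names are illustrative.

# write_manifest DIR -> creates DIR/SHA256SUMS covering every other file in DIR.
write_manifest() {
  ( cd "$1" && find . -maxdepth 1 -type f ! -name SHA256SUMS -print0 \
      | sort -z | xargs -0 -r sha256sum > SHA256SUMS )
}

# verify_manifest DIR -> succeeds only if every listed file is present and unchanged.
verify_manifest() {
  ( cd "$1" && sha256sum --quiet -c SHA256SUMS )
}
```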
---

## 15. References

### 15.1 Documentation

- **Ghidra Official Documentation:** https://ghidra.re/ghidra_docs/
- **Ghidra Version Tracking Guide:** https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing
- **ghidriff Repository:** https://github.com/clearbluejar/ghidriff
- **BSim Documentation:** https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/
- **BinaryIndex Architecture:** [architecture.md](./architecture.md)
- **Sprint Documentation:** [SPRINT_20260105_001_003](../../implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md)

### 15.2 Related StellaOps Documentation

- **PostgreSQL Guide:** `docs/operations/postgresql-guide.md`
- **Docker Deployment Guide:** `docs/operations/docker-deployment.md`
- **Air-Gap Operation Guide:** `docs/OFFLINE_KIT.md`
- **Security Hardening Guide:** `docs/operations/security-hardening.md`

### 15.3 External Resources

- **Eclipse Temurin Downloads:** https://adoptium.net/
- **Ghidra Releases:** https://github.com/NationalSecurityAgency/ghidra/releases
- **ghidriff PyPI:** https://pypi.org/project/ghidriff/
- **PostgreSQL Documentation:** https://www.postgresql.org/docs/16/

---

## 16. Changelog

| Date | Version | Changes |
|------|---------|---------|
| 2026-01-05 | 1.0.0 | Initial deployment guide created for GHID-019 |

---

*Document Version: 1.0.0*
*Last Updated: 2026-01-05*
*Maintainer: BinaryIndex Guild*