# StellaOps Developer Onboarding Guide
> **Target Audience:** DevOps operators with developer knowledge who need to understand, deploy, and debug the StellaOps platform.
## Table of Contents
1. [Architecture Overview](#architecture-overview)
2. [Prerequisites](#prerequisites)
3. [Quick Start - Full Platform in Docker](#quick-start)
4. [Hybrid Debugging Workflow](#hybrid-debugging-workflow)
5. [Service-by-Service Debugging Guide](#service-by-service-debugging-guide)
6. [Configuration Deep Dive](#configuration-deep-dive)
7. [Common Development Workflows](#common-development-workflows)
8. [Troubleshooting](#troubleshooting)
---
## Architecture Overview
StellaOps is a deterministic, offline-first SBOM + VEX platform built as a microservice architecture. The system is designed so every verdict can be replayed from concrete evidence (SBOM slices, advisory/VEX observations, policy decision traces, and optional attestations).
### Canonical references
- Architecture overview (10-minute tour): `docs/40_ARCHITECTURE_OVERVIEW.md`
- High-level reference map: `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
- Detailed architecture index: `docs/technical/architecture/README.md`
- Topology: `docs/technical/architecture/platform-topology.md`
- Infrastructure: `docs/technical/architecture/infrastructure-dependencies.md`
- Flows: `docs/technical/architecture/request-flows.md`
- Data isolation: `docs/technical/architecture/data-isolation.md`
- Security boundaries: `docs/technical/architecture/security-boundaries.md`
### Key architectural principles
1. **Deterministic evidence**: the same inputs produce the same outputs (stable ordering, stable IDs, replayable artifacts).
2. **VEX-first decisioning**: policy decisions are driven by VEX inputs and issuer trust, not enumeration alone.
3. **Offline-first**: fully air-gapped workflows are supported (mirrors, bundles, importer/controller).
4. **Extensibility without drift**: connectors, plugins, and policy packs must preserve determinism.
5. **Sovereign posture**: bring-your-own trust roots and configurable crypto profiles where enabled.
6. **Isolation boundaries**: clear module ownership, schema boundaries, and tenant scoping.
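To make the "stable ordering, stable IDs" idea in principle 1 concrete, here is a minimal sketch of an order-independent identifier. It illustrates the general technique only and is not StellaOps's actual ID scheme; `stable_id` is a hypothetical name, and the sketch assumes GNU `sha256sum` is available.

```bash
# Illustration only, not the platform's real algorithm: derive an identifier
# that does not depend on the order in which input lines arrive.
stable_id() {
  # Sort bytewise (LC_ALL=C) so the ordering is locale-independent,
  # then hash the canonical form and keep only the hex digest.
  LC_ALL=C sort | sha256sum | cut -d ' ' -f 1
}

# Example: both commands print the same digest.
# printf 'pkg:b\npkg:a\n' | stable_id
# printf 'pkg:a\npkg:b\n' | stable_id
```

Canonicalizing before hashing is what makes the result replayable: the same set of inputs always yields the same digest regardless of arrival order.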
### Service categories (orientation)
| Category | Examples | Purpose |
| --- | --- | --- |
| Infrastructure | PostgreSQL, Valkey, RustFS/S3, optional message broker | Durable state, coordination, artifact storage, transport abstraction. |
| Auth & signing | Authority, Signer, Attestor, issuer trust services | Identity, scopes/tenancy, evidence signing and attestation workflows. |
| Ingestion | Concelier, Excititor | Advisory and VEX ingestion/normalization with deterministic merges. |
| Scanning | Scanner (API + workers) | Container analysis, SBOM generation, artifact production. |
| Policy & risk | Policy engine + explain traces | Deterministic verdicts, waivers/exceptions, explainability for audits. |
| Orchestration | Scheduler, Orchestrator | Re-scan orchestration, workflows, pack runs. |
| Notifications | Notification engine(s) | Event delivery and idempotent notifications. |
| User experience | Gateway, Web UI, CLI | Authenticated access, routing, operator workflows. |
### Canonical flows
- Scan execution, ingestion updates, policy evaluation, and notification delivery are described in `docs/technical/architecture/request-flows.md`.
---
## Prerequisites
### Required Software
1. **Docker Desktop** (Windows/Mac) or **Docker Engine + Docker Compose** (Linux)
   - Version: 20.10+ recommended
   - Enable WSL2 backend (Windows)
2. **.NET 10 SDK**
   - Download: https://dotnet.microsoft.com/download/dotnet/10.0
   - Verify: `dotnet --version` (should show 10.0.x)
3. **Visual Studio 2022** (v17.12+) or **Visual Studio Code**
   - Workload: ASP.NET and web development
   - Workload: .NET desktop development
   - Extension (VS Code): C# Dev Kit
4. **Git**
   - Version: 2.30+ recommended
### Optional Tools
- **PostgreSQL Client** (psql, pgAdmin, DBeaver) - for database inspection
- **Redis Insight** or **Another Redis Desktop Manager** - for Valkey inspection (Valkey is Redis-compatible)
- **Postman/Insomnia** - for API testing
- **AWS CLI or s3cmd** - for RustFS (S3-compatible) inspection
### System Requirements
- **RAM:** 16 GB minimum, 32 GB recommended
- **Disk:** 50 GB free space (for Docker images, volumes, build artifacts)
- **CPU:** 4 cores minimum, 8 cores recommended
---
## Quick Start
### Step 1: Clone the Repository
```bash
cd C:\dev\
git clone https://git.stella-ops.org/stella-ops.org/git.stella-ops.org.git
cd git.stella-ops.org
```
### Step 2: Prepare Environment Configuration
```bash
# Copy the development environment template (Windows)
cd deploy\compose
copy env\dev.env.example .env
# Linux/macOS equivalent: cd deploy/compose && cp env/dev.env.example .env
# Edit .env with your preferred text editor
notepad .env
```
**Key settings to configure:**
- Copy and edit the profile env file (`deploy/compose/env/dev.env.example` -> `.env`).
- Update at minimum `POSTGRES_PASSWORD` and any host port overrides needed for your machine.
- Treat `deploy/compose/env/*.env.example` as the authoritative list of variables for each profile (queue/transport knobs are profile-dependent).
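The `.env` edits described above can also be scripted. The following is a minimal sketch, assuming a POSIX-like shell (Git Bash, WSL, Linux, or macOS); `set_env_var` is a hypothetical helper and not part of the repository.

```bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# assignment of KEY or appending a new one. The key moves to the end of the file.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  # Keep every line except an existing assignment of KEY (if the file exists).
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  # Append the new assignment with no spaces around '='.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (run from deploy/compose after copying the template):
# set_env_var .env POSTGRES_PASSWORD 'change-me'
```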
### Step 3: Start the Full Platform
```bash
# From deploy/compose directory
docker compose -f docker-compose.dev.yaml up -d
```
**This will start all infrastructure and services:**
- PostgreSQL v16+ (port 5432) - Primary database for all services
- Valkey 8.0 (port 6379) - Cache, DPoP nonces, event streams, rate limiting
- RustFS (port 8080) - S3-compatible object storage for artifacts/SBOMs
- NATS JetStream (port 4222) - Optional transport (only if configured)
- Authority (port 8440) - OAuth2/OIDC authentication
- Signer (port 8441) - Cryptographic signing
- Attestor (port 8442) - in-toto attestation generation
- Scanner.Web (port 8444) - Scan API
- Concelier (port 8445) - Advisory ingestion
- And 30+ more services...
### Step 4: Verify Services Are Running
```bash
# Check all services are up
docker compose -f docker-compose.dev.yaml ps
# Check logs for a specific service
docker compose -f docker-compose.dev.yaml logs -f scanner-web
# Check infrastructure health
docker compose -f docker-compose.dev.yaml logs postgres
docker compose -f docker-compose.dev.yaml logs valkey
docker compose -f docker-compose.dev.yaml logs rustfs
```
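Container state alone does not prove a dependency is accepting connections. As a complement to the checks above, here is a minimal sketch of a port poller; `wait_for_port` is a hypothetical helper, and it relies on bash's `/dev/tcp` redirection, so it assumes bash rather than a plain POSIX `sh`.

```bash
# Hypothetical helper (not part of the repository): poll a TCP port until it
# accepts connections or the attempt budget runs out. Requires bash, because
# /dev/tcp is a bash redirection feature rather than a real device.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # The subshell opens a connection on fd 3 and closes it on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example: wait for PostgreSQL and Valkey on the default ports from Step 3.
# wait_for_port localhost 5432 && wait_for_port localhost 6379
```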
### Step 5: Access the Platform
Open your browser and navigate to:
- **RustFS:** http://localhost:8080 (S3-compatible object storage)
- **Scanner API:** http://localhost:8444/swagger (if Swagger enabled)
- **Concelier API:** http://localhost:8445/swagger
- **Authority:** http://localhost:8440/.well-known/openid-configuration (OIDC discovery)
---
## Hybrid Debugging Workflow
Hybrid debugging runs the full platform in Docker, then stops one service container and runs that service locally under a debugger while it continues to use Docker-hosted dependencies.
Canonical guide:
- `docs/QUICKSTART_HYBRID_DEBUG.md`
Related references:
- Compose profiles: `deploy/compose/README.md`
- Install guide: `docs/21_INSTALL_GUIDE.md`
- Service-specific runbooks: `docs/modules/<module>/operations/`
## Service-by-Service Debugging Guide
Service-specific debugging guidance lives with each module to avoid stale, copy-pasted configuration examples.
Generic workflow:
1. Stop the service container in `deploy/compose` (for example: `docker compose -f docker-compose.dev.yaml stop <service>`).
2. Run the service locally under a debugger.
3. Update dependent services to call `host.docker.internal:<port>` (or your host IP) and restart them.
4. Use the module operations docs for required env vars, auth scopes, and health checks.
Start here:
- Hybrid debugging walkthrough: `docs/QUICKSTART_HYBRID_DEBUG.md`
- Architecture index: `docs/technical/architecture/README.md`
- Module dossiers and operations: `docs/modules/`
Common module runbooks:
- Authority: `docs/modules/authority/operations/`
- Scanner: `docs/modules/scanner/operations/`
- Concelier: `docs/modules/concelier/operations/`
- Scheduler: `docs/modules/scheduler/operations/`
- UI / Console: `docs/modules/ui/`
## Configuration Deep Dive
### Configuration Hierarchy
All services follow this configuration priority (highest to lowest):
1. **Environment Variables** - `STELLAOPS_<MODULE>_<SETTING>` or `<MODULE>__<SETTING>`
2. **appsettings.{Environment}.json** - `appsettings.Development.json`, `appsettings.Production.json`
3. **appsettings.json** - Base configuration
4. **YAML files** - `../etc/<service>.yaml`, `../etc/<service>.local.yaml`
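The priority order above amounts to "the first source that defines a key wins". The sketch below models that rule for a single key; it is an illustration of the ordering only, not how the .NET configuration system is implemented, and `resolve_setting` is a hypothetical name.

```bash
# Hypothetical illustration: print the first non-empty candidate, where the
# candidates are passed from highest to lowest priority.
resolve_setting() {
  local candidate
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # No source defined the key.
  return 1
}

# Example, in the documented order (environment variable,
# appsettings.{Environment}.json, appsettings.json, YAML):
# resolve_setting "" "" "valkey" "nats"
```

In the example, the environment variable and the environment-specific file leave the key unset, so the value from `appsettings.json` is used and the YAML value is ignored.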
### Common Configuration Patterns
#### PostgreSQL Connection Strings
```json
{
  "ConnectionStrings": {
    "DefaultConnection": "Host=localhost;Port=5432;Database=<db_name>;Username=stellaops;Password=<password>;Pooling=true;Minimum Pool Size=1;Maximum Pool Size=100;Command Timeout=60"
  }
}
```
**Database names by service:**
- Scanner: `stellaops_platform` or `scanner_*`
- Orchestrator: `stellaops_orchestrator`
- Authority: `stellaops_platform` (shared, schema-isolated)
- Concelier: `stellaops_platform` (vuln schema)
- Notify: `stellaops_platform` (notify schema)
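When switching between the databases listed above, it can help to generate the connection string rather than hand-edit it. This is a minimal sketch under stated assumptions: `pg_connection_string` is a hypothetical helper, the defaults mirror the values used in this guide, and whether a given service reads `ConnectionStrings__DefaultConnection` from the environment should be confirmed in its module docs.

```bash
# Hypothetical helper: assemble the core of the Npgsql-style connection string
# from its variable parts. Host, port, and user default to this guide's values.
pg_connection_string() {
  local db="$1" password="$2"
  local host="${3:-localhost}" port="${4:-5432}" user="${5:-stellaops}"
  printf 'Host=%s;Port=%s;Database=%s;Username=%s;Password=%s\n' \
    "$host" "$port" "$db" "$user" "$password"
}

# Example: expose it through the double-underscore form of
# ConnectionStrings:DefaultConnection (assumes the service binds that key).
# export ConnectionStrings__DefaultConnection="$(pg_connection_string stellaops_platform 'secret')"
```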
#### Valkey Configuration (Default Transport)
```json
{
  "Scanner": {
    "Events": {
      "Driver": "valkey",
      "Dsn": "localhost:6379"
    },
    "Cache": {
      "Redis": {
        "ConnectionString": "localhost:6379"
      }
    }
  },
  "Scheduler": {
    "Queue": {
      "Kind": "Valkey",
      "Valkey": {
        "Url": "localhost:6379"
      }
    }
  }
}
```
#### NATS Queue Configuration (Optional Alternative Transport)
```json
{
  "Scanner": {
    "Events": {
      "Driver": "nats",
      "Dsn": "nats://localhost:4222"
    }
  },
  "Scheduler": {
    "Queue": {
      "Kind": "Nats",
      "Nats": {
        "Url": "nats://localhost:4222"
      }
    }
  }
}
```
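For a locally debugged service it is often easier to switch transports through environment variables than by editing JSON. The sketch below prints the assignments that correspond to the Scanner `Events` sections shown above, using the double-underscore convention; `scanner_events_env` is a hypothetical helper, and it assumes the service binds these keys from the environment.

```bash
# Hypothetical helper: print the environment assignments equivalent to the
# Scanner "Events" JSON for the chosen transport.
scanner_events_env() {
  case "$1" in
    valkey)
      printf 'SCANNER__EVENTS__DRIVER=valkey\n'
      printf 'SCANNER__EVENTS__DSN=localhost:6379\n'
      ;;
    nats)
      printf 'SCANNER__EVENTS__DRIVER=nats\n'
      printf 'SCANNER__EVENTS__DSN=nats://localhost:4222\n'
      ;;
    *)
      printf 'unknown transport: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Example: append the NATS settings to a local env file.
# scanner_events_env nats >> .env.local
```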
#### RustFS Configuration (S3-Compatible Object Storage)
```json
{
  "Scanner": {
    "Storage": {
      "RustFS": {
        "Endpoint": "http://localhost:8080",
        "AccessKeyId": "stellaops",
        "SecretAccessKey": "your_password",
        "BucketName": "scanner-artifacts",
        "Region": "us-east-1",
        "ForcePathStyle": true
      }
    }
  }
}
```
#### Scanner Artifact Store Configuration (RustFS Driver)
```json
{
  "Scanner": {
    "ArtifactStore": {
      "Driver": "rustfs",
      "Endpoint": "http://localhost:8080/api/v1",
      "Bucket": "scanner-artifacts",
      "TimeoutSeconds": 30
    }
  }
}
```
### Environment Variable Mapping
ASP.NET Core uses `__` (double underscore) as the hierarchy separator for nested configuration keys in environment variable names:
```bash
# This JSON configuration:
{
  "Scanner": {
    "Queue": {
      "Broker": "nats://localhost:4222"
    }
  }
}
# Can be set via environment variable:
SCANNER__QUEUE__BROKER=nats://localhost:4222
# Or with STELLAOPS_ prefix:
STELLAOPS_SCANNER__QUEUE__BROKER=nats://localhost:4222
```
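The mapping above is mechanical, so it can be derived from the configuration path. A minimal sketch follows; `config_key_to_env` is a hypothetical helper, and the upper-casing follows this guide's convention (.NET configuration keys are case-insensitive).

```bash
# Hypothetical helper: convert a colon-separated configuration path
# (for example "Scanner:Queue:Broker") to its environment variable name.
# An optional second argument is prepended as a prefix.
config_key_to_env() {
  local key="$1" prefix="${2:-}"
  # Replace each ':' with '__' and upper-case the result.
  printf '%s%s\n' "$prefix" \
    "$(printf '%s' "$key" | sed 's/:/__/g' | tr '[:lower:]' '[:upper:]')"
}

# Examples:
# config_key_to_env 'Scanner:Queue:Broker'
# config_key_to_env 'Scanner:Queue:Broker' STELLAOPS_
```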
---
## Common Development Workflows
### Workflow 1: Debug a Single Service with Full Stack
**Scenario:** You need to debug Scanner.WebService while all other services run normally.
```bash
# 1. Start full platform
cd deploy\compose
docker compose -f docker-compose.dev.yaml up -d
# 2. Stop the service you want to debug
docker compose -f docker-compose.dev.yaml stop scanner-web
# 3. Open Visual Studio
cd C:\dev\git.stella-ops.org
start src\StellaOps.sln
# 4. Set Scanner.WebService as startup project and F5
# 5. Test the service
curl -X POST http://localhost:5210/api/scans -H "Content-Type: application/json" -d '{"imageRef":"alpine:latest"}'
# 6. When done, stop VS debugger and restart Docker container
docker compose -f docker-compose.dev.yaml start scanner-web
```
### Workflow 2: Debug Multiple Services Together
**Scenario:** Debug Scanner.WebService and Scanner.Worker together.
```bash
# 1. Stop both containers
docker compose -f docker-compose.dev.yaml stop scanner-web scanner-worker
# 2. In Visual Studio, configure multiple startup projects:
# - Right-click solution > Properties
# - Set "Multiple startup projects"
# - Select Scanner.WebService: Start
# - Select Scanner.Worker: Start
# 3. Press F5 to debug both simultaneously
```
### Workflow 3: Test Integration with Modified Code
**Scenario:** You modified Concelier and want to test how Scanner integrates with it.
```bash
# 1. Build Concelier locally
cd src\Concelier\StellaOps.Concelier.WebService
dotnet build
# 2. Stop Docker Concelier
cd ..\..\..\deploy\compose
docker compose -f docker-compose.dev.yaml stop concelier
# 3. Run Concelier in Visual Studio (F5)
# 4. Keep Scanner in Docker, but point it to localhost Concelier
# Update .env:
CONCELIER_BASEURL=http://host.docker.internal:5000
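# 5. Recreate Scanner to pick up the new config (`restart` does not re-read .env);
#    --no-deps keeps Compose from starting the stopped Concelier container
docker compose -f docker-compose.dev.yaml up -d --no-deps scanner-web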
```
### Workflow 4: Reset Database State
**Scenario:** You need a clean database to test migrations or start fresh.
```bash
# 1. Stop all services
docker compose -f docker-compose.dev.yaml down
# 2. Remove database volumes
docker volume rm compose_postgres-data
# Only if this volume exists in your environment (check with `docker volume ls`)
docker volume rm compose_mongo-data
# 3. Restart platform (will recreate volumes and databases)
docker compose -f docker-compose.dev.yaml up -d
# 4. Wait for migrations to run
docker compose -f docker-compose.dev.yaml logs -f postgres
# Look for migration completion messages
```
### Workflow 5: Test Offline/Air-Gap Mode
**Scenario:** Test the platform in offline mode.
```bash
# 1. Use the air-gap compose profile
cd deploy\compose
docker compose -f docker-compose.airgap.yaml up -d
# 2. Verify no external network calls
docker compose -f docker-compose.airgap.yaml logs | grep -i "external\|outbound\|internet"
```
---
## Troubleshooting
### Common Issues
#### 1. Port Already in Use
**Error:**
```
Error starting userland proxy: listen tcp 0.0.0.0:5432: bind: address already in use
```
**Solutions:**
**Option A: Change the port in .env**
```bash
# Edit .env
POSTGRES_PORT=5433 # Use a different port
```
**Option B: Stop the conflicting process**
```bash
# Windows
netstat -ano | findstr :5432
taskkill /PID <PID> /F
# Linux/Mac
lsof -i :5432
kill -9 <PID>
```
#### 2. Cannot Connect to PostgreSQL from Visual Studio
**Error:**
```
Npgsql.NpgsqlException: Connection refused
```
**Solutions:**
1. **Verify PostgreSQL is accessible from host:**
```bash
psql -h localhost -U stellaops -d stellaops_platform
```
2. **Check Docker network:**
```bash
docker network inspect compose_stellaops
# Ensure your service has "host.docker.internal" DNS resolution
```
3. **Update connection string:**
```json
{
  "ConnectionStrings": {
    "DefaultConnection": "Host=localhost;Port=5432;Database=stellaops_platform;Username=stellaops;Password=your_password;Include Error Detail=true"
  }
}
```
#### 3. NATS Connection Refused
**Error:**
```
NATS connection error: connection refused
```
**Solution:**
By default, services use **Valkey** for messaging, not NATS. Ensure Valkey is running:
```bash
docker compose -f docker-compose.dev.yaml ps valkey
# Should show: State = "Up"
# Test connectivity
telnet localhost 6379
```
Update configuration to use Valkey (default):
```json
{
  "Scanner": {
    "Events": {
      "Driver": "valkey",
      "Dsn": "localhost:6379"
    }
  },
  "Scheduler": {
    "Queue": {
      "Kind": "Valkey",
      "Valkey": {
        "Url": "localhost:6379"
      }
    }
  }
}
```
**If you explicitly want to use NATS** (optional):
```bash
docker compose -f docker-compose.dev.yaml ps nats
# Ensure NATS is running
# Update appsettings.Development.json:
{
  "Scanner": {
    "Events": {
      "Driver": "nats",
      "Dsn": "nats://localhost:4222"
    }
  }
}
```
#### 4. Valkey Connection Refused
**Error:**
```
StackExchange.Redis.RedisConnectionException: It was not possible to connect to the redis server(s)
```
**Solutions:**
1. **Check Valkey is running:**
```bash
docker compose -f docker-compose.dev.yaml ps valkey
# Should show: State = "Up"
# Check logs
docker compose -f docker-compose.dev.yaml logs valkey
```
2. **Reset Valkey:**
```bash
docker compose -f docker-compose.dev.yaml stop valkey
docker volume rm compose_valkey-data
docker compose -f docker-compose.dev.yaml up -d valkey
```
#### 5. Service Cannot Reach host.docker.internal
**Error:**
```
Could not resolve host: host.docker.internal
```
**Solution (Windows/Mac):**
`host.docker.internal` should resolve automatically with Docker Desktop; no extra configuration is needed.
**Solution (Linux):**
Add to docker-compose.dev.yaml:
```yaml
services:
  scanner-web:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```
Or use the host's IP address:
```bash
# Find host IP
ip addr show docker0
# Use that IP instead of host.docker.internal
```
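If you take the host-IP route, the address can be extracted from the `ip` output instead of being copied by hand. This is a minimal sketch for Linux; `first_ipv4` is a hypothetical helper that reads `ip -4 addr show <interface>` output on standard input.

```bash
# Hypothetical helper: print the first IPv4 address found in the output of
# `ip -4 addr show <interface>`, without the prefix length.
first_ipv4() {
  awk '$1 == "inet" { split($2, parts, "/"); print parts[1]; exit }'
}

# Example usage on Linux:
# ip -4 addr show docker0 | first_ipv4
```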
#### 6. Certificate Validation Errors (Authority/HTTPS)
**Error:**
```
The SSL connection could not be established
```
**Solution:**
For local development only, disable certificate validation:
```json
{
  "Authority": {
    "ValidateCertificate": false
  }
}
```
Or trust the development certificate:
```bash
dotnet dev-certs https --trust
```
#### 7. Build Errors - Missing SDK
**Error:**
```
error MSB4236: The SDK 'Microsoft.NET.Sdk.Web' specified could not be found
```
**Solution:**
Install the .NET 10 SDK (see [Prerequisites](#prerequisites)), then verify the installation:
```bash
# Verify installation
dotnet --list-sdks
# Should show:
# 10.0.xxx [C:\Program Files\dotnet\sdk]
```
#### 8. Hot Reload Not Working
**Symptom:** Changes in code don't reflect when running in Visual Studio.
**Solutions:**
1. Ensure Hot Reload is enabled: Tools > Options > Debugging > .NET Hot Reload > Enable Hot Reload
2. Rebuild the project: Ctrl+Shift+B
3. Restart debugging session: Shift+F5, then F5
#### 9. Docker Compose Fails to Parse .env
**Error:**
```
invalid interpolation format
```
**Solution:**
Ensure no spaces around `=` in .env:
```bash
# Wrong
POSTGRES_USER = stellaops
# Correct
POSTGRES_USER=stellaops
```
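The rule above can be checked automatically before starting Compose. The following is a minimal sketch, assuming a POSIX-like shell with `grep -E`; `check_env_file` is a hypothetical lint and not part of the repository.

```bash
# Hypothetical lint: print the lines of an env file that have whitespace
# around '=', with line numbers. Comment lines are ignored because they do
# not start with a variable name. Exit status 0 means the file is clean.
check_env_file() {
  local file="$1" bad
  bad="$(grep -nE '^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*([[:space:]]+=|=[[:space:]]+)' "$file" || true)"
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad"
    return 1
  fi
  return 0
}

# Example (run from deploy/compose):
# check_env_file .env && docker compose -f docker-compose.dev.yaml up -d
```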
#### 10. Volume Permission Issues (Linux)
**Error:**
```
Permission denied writing to /data/db
```
**Solution:**
```bash
# Fix permissions on volume directories
sudo chown -R $USER:$USER ./volumes
# Or run Docker as root (not recommended for production)
sudo docker compose -f docker-compose.dev.yaml up -d
```
---
## Next Steps
### Learning Path
1. **Week 1: Infrastructure**
   - Understand PostgreSQL schema isolation (all services use PostgreSQL)
   - Learn Valkey streams for event queuing and caching
   - Study RustFS S3-compatible object storage
   - Optional: NATS JetStream as alternative transport
2. **Week 2: Core Services**
   - Deep dive into Scanner architecture (analyzers, workers, caching)
   - Understand Concelier advisory ingestion and merging
   - Study VEX workflow in Excititor
3. **Week 3: Authentication & Security**
   - Master OAuth2/OIDC flow in Authority
   - Understand signing flow (Signer -> Attestor -> Rekor)
   - Study policy evaluation engine
4. **Week 4: Integration**
   - Build end-to-end scan workflow
   - Implement custom Concelier connector
   - Create custom notification rules
### Key Documentation
- **Architecture:** `docs/07_HIGH_LEVEL_ARCHITECTURE.md`
- **Build Commands:** `CLAUDE.md`
- **Database Spec:** `docs/db/SPECIFICATION.md`
- **API Reference:** `docs/09_API_CLI_REFERENCE.md`
- **Module Architecture:** `docs/modules/<module>/architecture.md`
### Support
- **Issues:** https://git.stella-ops.org/stella-ops.org/git.stella-ops.org/issues
- **Discussions:** Internal team channels
- **Documentation:** `docs/` directory in the repository
---
## Quick Reference Card
### Essential Commands
```bash
# Start full platform
cd deploy\compose
docker compose -f docker-compose.dev.yaml up -d
# Stop a specific service for debugging
docker compose -f docker-compose.dev.yaml stop <service-name>
# View logs
docker compose -f docker-compose.dev.yaml logs -f <service-name>
# Restart a service
docker compose -f docker-compose.dev.yaml restart <service-name>
# Stop all services
docker compose -f docker-compose.dev.yaml down
# Stop all services and remove volumes (DESTRUCTIVE)
docker compose -f docker-compose.dev.yaml down -v
# Build the solution
cd C:\dev\git.stella-ops.org
dotnet build src\StellaOps.sln
# Run tests
dotnet test src\StellaOps.sln
# Run a specific project
cd src\Scanner\StellaOps.Scanner.WebService
dotnet run
```
### Service Default Ports
| Service | Port | URL | Notes |
|---------|------|-----|-------|
| **Infrastructure** | | | |
| PostgreSQL | 5432 | `localhost:5432` | Primary database (REQUIRED) |
| Valkey | 6379 | `localhost:6379` | Cache/events/queues (REQUIRED) |
| RustFS | 8080 | http://localhost:8080 | S3-compatible storage (REQUIRED) |
| NATS | 4222 | `nats://localhost:4222` | Optional alternative transport |
| **Services** | | | |
| Authority | 8440 | https://localhost:8440 | OAuth2/OIDC auth |
| Signer | 8441 | https://localhost:8441 | Cryptographic signing |
| Attestor | 8442 | https://localhost:8442 | in-toto attestations |
| Scanner.Web | 8444 | http://localhost:8444 | Scan API |
| Concelier | 8445 | http://localhost:8445 | Advisory ingestion |
| Notify | 8446 | http://localhost:8446 | Notifications |
| IssuerDirectory | 8447 | http://localhost:8447 | CSAF publisher discovery |
### Visual Studio Shortcuts
| Action | Shortcut |
|--------|----------|
| Start Debugging | F5 |
| Start Without Debugging | Ctrl+F5 |
| Stop Debugging | Shift+F5 |
| Step Over | F10 |
| Step Into | F11 |
| Step Out | Shift+F11 |
| Toggle Breakpoint | F9 |
| Build Solution | Ctrl+Shift+B |
| Restart Debugging | Ctrl+Shift+F5 |
---
**Document Version:** 1.0
**Last Updated:** 2025-12-22
**Maintained By:** StellaOps Development Team