# Plugin Framework Architecture
> Technical architecture for the universal plugin lifecycle, sandboxing, and registry framework.
## Overview
The Plugin Framework provides the core extensibility infrastructure for the Stella Ops platform. It defines how plugins are discovered, loaded, initialized, monitored, and shut down. A three-tier trust model ensures that untrusted plugins cannot compromise the host process, while built-in plugins benefit from zero-overhead in-process execution. The framework is consumed as a library by other modules; it does not expose HTTP endpoints.
## Design Principles
1. **Security by default** - Untrusted plugins are process-isolated; capabilities are explicitly declared and enforced
2. **Lifecycle consistency** - All plugins follow the same state machine regardless of trust level
3. **Zero-overhead for built-ins** - BuiltIn plugins run in-process with direct method calls; no serialization or IPC cost
4. **Testability** - Every component has an in-memory or mock alternative for deterministic testing
## Components
```
Plugin/
├── StellaOps.Plugin.Abstractions/ # Core interfaces (IPlugin, PluginInfo, PluginCapabilities)
├── StellaOps.Plugin.Host/ # Plugin host, lifecycle manager, trust enforcement
├── StellaOps.Plugin.Registry/ # Plugin catalog (InMemory + PostgreSQL backends)
├── StellaOps.Plugin.Sandbox/ # Process isolation and gRPC IPC for untrusted plugins
├── StellaOps.Plugin.Sdk/ # SDK for plugin authors (base classes, helpers)
├── StellaOps.Plugin.Testing/ # Test utilities (mock host, fake registry)
├── Samples/
│ └── HelloWorld/ # Sample plugin demonstrating the SDK
└── __Tests/
    └── StellaOps.Plugin.Tests/ # Unit and integration tests
```
## Core Interfaces
### IPlugin
```csharp
public interface IPlugin
{
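    // Static metadata: identity, version, and trust level.
    PluginInfo Info { get; }
    // Capabilities the host checks before routing work to this plugin.
    PluginCapabilities Capabilities { get; }
    // Called with a context that provides configuration and service access.
    Task InitializeAsync(IPluginContext context, CancellationToken ct);
    // The plugin is considered Active once this completes.
    Task StartAsync(CancellationToken ct);
    // Called during graceful host shutdown or plugin unload.
    Task StopAsync(CancellationToken ct);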
}
```
### PluginInfo
```csharp
public sealed record PluginInfo
{
    public required string Id { get; init; }
    public required string Name { get; init; }
    public required Version Version { get; init; }
    public required PluginTrustLevel TrustLevel { get; init; }
    public string? Description { get; init; }
    public string? Author { get; init; }
}
```
### PluginCapabilities
Declares what the plugin can do (e.g., `CanScan`, `CanEvaluatePolicy`, `CanConnect`). The host checks capabilities before routing work to a plugin.
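
The capability check before routing can be sketched as follows. This is a minimal illustration, not the framework's actual type: the property-based shape of `PluginCapabilities` is assumed from the example names above, and `CapabilityRouter` with its `Eligible` method is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the real PluginCapabilities type may be modeled differently.
public sealed record PluginCapabilities
{
    public bool CanScan { get; init; }
    public bool CanEvaluatePolicy { get; init; }
    public bool CanConnect { get; init; }
}

// Hypothetical helper showing a capability check before work is routed.
public static class CapabilityRouter
{
    // Returns the ids of plugins whose declared capabilities satisfy the requirement.
    public static IReadOnlyList<string> Eligible(
        IEnumerable<(string Id, PluginCapabilities Caps)> plugins,
        Func<PluginCapabilities, bool> required) =>
        plugins.Where(p => required(p.Caps)).Select(p => p.Id).ToList();
}
```

For example, `CapabilityRouter.Eligible(plugins, c => c.CanScan)` would select only the plugins that declared `CanScan`; anything else is never offered scan work.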
## Plugin Lifecycle
```
[Discovery] --> [Loading] --> [Initialization] --> [Active] --> [Shutdown]
     │              │                │                              │
     │              │                └── failure ──> [Failed]       │
     │              └── failure ──> [Failed]                        │
     └── not found ──> (skip)
```
| State | Description |
|-------|-------------|
| **Discovery** | Host scans configured paths for assemblies or packages containing `IPlugin` implementations |
| **Loading** | Assembly or process is loaded; plugin metadata is read and validated |
| **Initialization** | `InitializeAsync` is called with an `IPluginContext` providing configuration and service access |
| **Active** | Plugin is ready to receive work; `StartAsync` has completed |
| **Shutdown** | `StopAsync` is called during graceful host shutdown or plugin unload |
| **Failed** | Plugin encountered an unrecoverable error during loading or initialization, or its sandbox process crashed; the failure is logged and the plugin is excluded |
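
The state machine above can be expressed as a transition table. The sketch below is illustrative only: the `PluginLifecycleState` enum and the `PluginLifecycle` class are hypothetical names, and the allowed transitions simply mirror the arrows in the diagram.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum; state names follow the lifecycle table.
public enum PluginLifecycleState { Discovery, Loading, Initialization, Active, Shutdown, Failed }

public static class PluginLifecycle
{
    // Allowed transitions, copied from the arrows in the lifecycle diagram.
    private static readonly Dictionary<PluginLifecycleState, PluginLifecycleState[]> Allowed = new()
    {
        [PluginLifecycleState.Discovery] = new[] { PluginLifecycleState.Loading },
        [PluginLifecycleState.Loading] = new[] { PluginLifecycleState.Initialization, PluginLifecycleState.Failed },
        [PluginLifecycleState.Initialization] = new[] { PluginLifecycleState.Active, PluginLifecycleState.Failed },
        [PluginLifecycleState.Active] = new[] { PluginLifecycleState.Shutdown },
        [PluginLifecycleState.Shutdown] = Array.Empty<PluginLifecycleState>(),
        [PluginLifecycleState.Failed] = Array.Empty<PluginLifecycleState>(),
    };

    // True when the diagram contains an arrow from 'from' to 'to'.
    public static bool CanTransition(PluginLifecycleState from, PluginLifecycleState to) =>
        Array.IndexOf(Allowed[from], to) >= 0;
}
```

The diagram shows no arrow from Active to Failed, so this table omits one. The crash handling described under ProcessSandbox would need that extra edge if it is modeled in the same table.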
## Trust Levels
| Level | Execution Model | IPC | Use Case |
|-------|----------------|-----|----------|
| **BuiltIn** | In-process, direct method calls | None | First-party plugins shipped with the platform |
| **Trusted** | In-process with monitoring | None | Vetted third-party plugins with signed manifests |
| **Untrusted** | Separate process via `ProcessSandbox` | gRPC | Community or unverified plugins |
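
As a rough illustration of the table above, the mapping from trust level to execution model could look like this. The `PluginTrustLevel` members are taken from the table; `ExecutionModel` and `TrustPolicy` are hypothetical names introduced only for this sketch.

```csharp
using System;

// Members follow the Trust Levels table; the real enum may carry more detail.
public enum PluginTrustLevel { BuiltIn, Trusted, Untrusted }

// Hypothetical enum naming the three execution models from the table.
public enum ExecutionModel { InProcessDirect, InProcessMonitored, SandboxedProcess }

public static class TrustPolicy
{
    // Maps a trust level to the execution model listed in the table.
    public static ExecutionModel ExecutionModelFor(PluginTrustLevel level) => level switch
    {
        PluginTrustLevel.BuiltIn => ExecutionModel.InProcessDirect,
        PluginTrustLevel.Trusted => ExecutionModel.InProcessMonitored,
        PluginTrustLevel.Untrusted => ExecutionModel.SandboxedProcess,
        // Unknown values are rejected rather than silently run in-process.
        _ => throw new ArgumentOutOfRangeException(nameof(level)),
    };

    // Only sandboxed plugins pay the gRPC IPC cost.
    public static bool UsesIpc(PluginTrustLevel level) =>
        ExecutionModelFor(level) == ExecutionModel.SandboxedProcess;
}
```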
### ProcessSandbox (Untrusted Plugins)
Untrusted plugins run in a child process managed by `ProcessSandbox`:
1. **Process creation:** The sandbox spawns a new process with restricted permissions
2. **gRPC channel:** A bidirectional gRPC channel is established for host-plugin communication
3. **Capability enforcement:** The host proxy only forwards calls matching declared capabilities
4. **Resource limits:** CPU and memory limits are enforced at the process level
5. **Crash isolation:** If the plugin process crashes, the host logs the failure and marks the plugin as Failed; the host process is unaffected
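
Steps 3 and 5 can be sketched with a small host-side proxy. This is an assumption-heavy illustration: `SandboxProxy` is a hypothetical class, capabilities are represented as plain strings, the transport delegate stands in for the gRPC channel, and a broken channel is modeled as an `IOException`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical host-side proxy; the transport delegate stands in for the gRPC channel.
public sealed class SandboxProxy
{
    private readonly HashSet<string> _declared;
    private readonly Func<string, string, CancellationToken, Task<string>> _transport;

    public SandboxProxy(
        IEnumerable<string> declaredCapabilities,
        Func<string, string, CancellationToken, Task<string>> transport)
    {
        _declared = new HashSet<string>(declaredCapabilities, StringComparer.Ordinal);
        _transport = transport;
    }

    // True once the child process is considered crashed.
    public bool Failed { get; private set; }

    public async Task<string> InvokeAsync(string capability, string payload, CancellationToken ct = default)
    {
        if (Failed)
            throw new InvalidOperationException("Plugin is in the Failed state.");

        // Capability enforcement: reject before anything crosses the process boundary.
        if (!_declared.Contains(capability))
            throw new UnauthorizedAccessException($"Capability '{capability}' was not declared.");

        try
        {
            return await _transport(capability, payload, ct);
        }
        catch (IOException)
        {
            // Crash isolation: assume a broken channel means the child process died.
            Failed = true;
            throw;
        }
    }
}
```

The caller still observes the exception, but the host itself keeps running and later calls to the same plugin are refused while it is marked Failed.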
## Database Schema
Database: PostgreSQL (via `PostgresPluginRegistry`)
| Table | Purpose |
|-------|---------|
| `plugins` | Registered plugins (id, name, trust_level, status, config_json, registered_at) |
| `plugin_versions` | Version history per plugin (plugin_id, version, assembly_hash, published_at) |
| `plugin_capabilities` | Declared capabilities per plugin version (plugin_version_id, capability, parameters) |

The `InMemoryPluginRegistry` provides an equivalent in-memory implementation for testing and offline scenarios.
## Data Flow
```
[Module Host] ── discover ──> [Plugin.Host]
                              load plugins
                    ┌───────────────┼───────────────┐
                    │               │               │
                [BuiltIn]       [Trusted]      [Untrusted]
              (in-process)    (in-process)  (ProcessSandbox)
                    │               │               │
                    └───────────────┼───────────────┘
                            [Plugin.Registry] ── persist ──> [PostgreSQL]
```
## Security Considerations
- **Trust level enforcement:** The host never executes untrusted plugin code in-process; all untrusted execution is delegated to the sandbox
- **Capability restrictions:** Plugins can only perform actions matching their declared capabilities; the host rejects unauthorized calls
- **Assembly hash verification:** Plugin assemblies are hashed at registration; the host verifies the hash at load time to detect tampering
- **No network access for untrusted plugins:** The sandbox process has restricted network permissions; plugins that need network access must be at least Trusted
- **Audit trail:** Plugin lifecycle events (registration, activation, failure, shutdown) are logged with timestamps and actor identity
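
The assembly hash check could be implemented along these lines. The document does not name the hash algorithm or the stored encoding, so SHA-256 and a hex string are assumptions here, and `AssemblyHashVerifier` is a hypothetical helper.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; SHA-256 and hex encoding are assumptions, not confirmed by the source.
public static class AssemblyHashVerifier
{
    // Computes the hash that would be stored at registration time.
    public static string ComputeSha256Hex(Stream content)
    {
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(content));
    }

    // Recomputes the hash at load time and compares it with the registered value.
    public static bool Matches(Stream content, string registeredHexHash)
    {
        byte[] expected;
        try
        {
            expected = Convert.FromHexString(registeredHexHash);
        }
        catch (FormatException)
        {
            // A malformed stored hash is treated as a mismatch.
            return false;
        }

        using var sha = SHA256.Create();
        byte[] actual = sha.ComputeHash(content);

        // Fixed-time comparison; returns false when lengths differ.
        return CryptographicOperations.FixedTimeEquals(actual, expected);
    }
}
```

A host following this sketch would refuse to load the assembly, and mark the plugin as Failed, whenever `Matches` returns false.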
## Observability
- **Metrics:** `plugin_active_count{trust_level}`, `plugin_load_duration_ms`, `plugin_failures_total{plugin_id}`, `sandbox_process_restarts_total`
- **Logs:** Structured logs with `pluginId`, `trustLevel`, `lifecycleState`, `capability`
- **Health:** The registry exposes plugin health status; modules can query whether a required plugin is active
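
One way to emit some of these metrics is through `System.Diagnostics.Metrics`. The sketch below is not the framework's instrumentation: the `PluginMetrics` wrapper and the meter name are hypothetical, only the metric and label names come from the list above, and `plugin_active_count` is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical wrapper; only the metric and label names come from the list above.
public sealed class PluginMetrics : IDisposable
{
    public const string MeterName = "StellaOps.Plugin"; // assumed meter name

    private readonly Meter _meter = new(MeterName);
    private readonly Counter<long> _failures;
    private readonly Counter<long> _sandboxRestarts;
    private readonly Histogram<double> _loadDuration;

    public PluginMetrics()
    {
        _failures = _meter.CreateCounter<long>("plugin_failures_total");
        _sandboxRestarts = _meter.CreateCounter<long>("sandbox_process_restarts_total");
        _loadDuration = _meter.CreateHistogram<double>("plugin_load_duration_ms", unit: "ms");
    }

    // Increments the failure counter, tagged with the failing plugin's id.
    public void RecordFailure(string pluginId) =>
        _failures.Add(1, new KeyValuePair<string, object?>("plugin_id", pluginId));

    public void RecordSandboxRestart() => _sandboxRestarts.Add(1);

    // Records how long a plugin took to load, in milliseconds.
    public void RecordLoad(TimeSpan elapsed) => _loadDuration.Record(elapsed.TotalMilliseconds);

    public void Dispose() => _meter.Dispose();
}
```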
## Performance Characteristics
- BuiltIn plugins: zero overhead (direct method dispatch)
- Trusted plugins: negligible overhead (monitoring wrapper)
- Untrusted plugins: gRPC serialization cost per call (~1-5ms depending on payload size)
- Plugin discovery: runs at host startup; cached until restart or explicit re-scan
## References
- [Module README](./README.md)
- [Integrations Architecture](../integrations/architecture.md) - Primary consumer
- [Scanner Architecture](../scanner/architecture.md) - Plugin-based analysis