# LLM Provider Plugin Architecture (Multi-Provider Inference)

## Module
AdvisoryAI

## Status
IMPLEMENTED

## Description
Pluggable LLM provider architecture with ILlmProvider interface supporting OpenAI, Claude, Gemini, llama.cpp (LlamaServer), and Ollama backends. Includes LlmProviderFactory for runtime selection and configuration validation. Enables sovereign/offline inference by switching to local providers.

## Implementation Details
- **Modules**: `src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/`, `src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/`, `src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/`
- **Key Classes**:
  - `LlmProviderFactory` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlmProviderFactory.cs`) - factory for runtime LLM provider selection
  - `OpenAiLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/OpenAiLlmProvider.cs`) - OpenAI API provider
  - `ClaudeLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/ClaudeLlmProvider.cs`) - Anthropic Claude API provider
  - `GeminiLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/GeminiLlmProvider.cs`) - Google Gemini API provider
  - `LlamaServerLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlamaServerLlmProvider.cs`) - local llama.cpp server provider
  - `OllamaLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/OllamaLlmProvider.cs`) - Ollama local inference provider
  - `LlmProviderOptions` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlmProviderOptions.cs`) - provider configuration and validation
  - `ClaudeInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/ClaudeInferenceClient.cs`) - Claude-specific chat inference client
  - `OpenAIInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/OpenAIInferenceClient.cs`) - OpenAI-specific chat inference client
  - `OllamaInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/OllamaInferenceClient.cs`) - Ollama-specific chat inference client
  - `LocalInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/LocalInferenceClient.cs`) - local model inference client
  - `LlmPluginAdapter` (`src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/LlmPluginAdapter.cs`) - unified plugin adapter for LLM providers
  - `LlmPluginAdapterFactory` (`src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/LlmPluginAdapterFactory.cs`) - factory for creating LLM plugin adapters
  - `SystemPromptLoader` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/SystemPromptLoader.cs`) - loads system prompts for inference clients
- **Interfaces**: `ILlmProvider`, `ILlmProviderPlugin`, `IAdvisoryChatInferenceClient`
- **Source**: SPRINT_20251226_019_AI_offline_inference.md

## E2E Test Plan
- [ ] Configure `LlmProviderFactory` with multiple providers and verify runtime selection based on configuration
- [ ] Verify `OpenAiLlmProvider` sends requests to OpenAI API with correct authentication and model parameters
- [ ] Verify `ClaudeLlmProvider` sends requests to Claude API with correct authentication
- [ ] Verify `OllamaLlmProvider` connects to local Ollama instance and performs inference
- [ ] Verify `LlamaServerLlmProvider` connects to local llama.cpp server endpoint
- [ ] Verify `LlmProviderOptions` validation rejects invalid configurations (missing API keys, invalid endpoints)
- [ ] Verify `LlmPluginAdapter` provides health checks for configured LLM providers
- [ ] Verify provider failover: when primary provider is unavailable, factory falls back to secondary