# LLM Provider Plugin Architecture (Multi-Provider Inference)

## Module
AdvisoryAI

## Status
IMPLEMENTED

## Description
Pluggable LLM provider architecture with ILlmProvider interface supporting OpenAI, Claude, Gemini, llama.cpp (LlamaServer), and Ollama backends. Includes LlmProviderFactory for runtime selection and configuration validation. Enables sovereign/offline inference by switching to local providers.

## Implementation Details
- **Modules**: `src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/`, `src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/`, `src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/`
- **Key Classes**:
  - `LlmProviderFactory` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlmProviderFactory.cs`) - factory for runtime LLM provider selection
  - `OpenAiLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/OpenAiLlmProvider.cs`) - OpenAI API provider
  - `ClaudeLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/ClaudeLlmProvider.cs`) - Anthropic Claude API provider
  - `GeminiLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/GeminiLlmProvider.cs`) - Google Gemini API provider
  - `LlamaServerLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlamaServerLlmProvider.cs`) - local llama.cpp server provider
  - `OllamaLlmProvider` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/OllamaLlmProvider.cs`) - Ollama local inference provider
  - `LlmProviderOptions` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Inference/LlmProviders/LlmProviderOptions.cs`) - provider configuration and validation
  - `ClaudeInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/ClaudeInferenceClient.cs`) - Claude-specific chat inference client
  - `OpenAIInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/OpenAIInferenceClient.cs`) - OpenAI-specific chat inference client
  - `OllamaInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/OllamaInferenceClient.cs`) - Ollama-specific chat inference client
  - `LocalInferenceClient` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/LocalInferenceClient.cs`) - local model inference client
  - `LlmPluginAdapter` (`src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/LlmPluginAdapter.cs`) - unified plugin adapter for LLM providers
  - `LlmPluginAdapterFactory` (`src/AdvisoryAi/StellaOps.AdvisoryAI.Plugin.Unified/LlmPluginAdapterFactory.cs`) - factory for creating LLM plugin adapters
  - `SystemPromptLoader` (`src/AdvisoryAi/StellaOps.AdvisoryAI/Chat/Inference/SystemPromptLoader.cs`) - loads system prompts for inference clients
- **Interfaces**: `ILlmProvider`, `ILlmProviderPlugin`, `IAdvisoryChatInferenceClient`
- **Source**: SPRINT_20251226_019_AI_offline_inference.md

## E2E Test Plan
- [ ] Configure `LlmProviderFactory` with multiple providers and verify runtime selection based on configuration
- [ ] Verify `OpenAiLlmProvider` sends requests to OpenAI API with correct authentication and model parameters
- [ ] Verify `ClaudeLlmProvider` sends requests to Claude API with correct authentication
- [ ] Verify `OllamaLlmProvider` connects to local Ollama instance and performs inference
- [ ] Verify `LlamaServerLlmProvider` connects to local llama.cpp server endpoint
- [ ] Verify `LlmProviderOptions` validation rejects invalid configurations (missing API keys, invalid endpoints)
- [ ] Verify `LlmPluginAdapter` provides health checks for configured LLM providers
- [ ] Verify provider failover: when primary provider is unavailable, factory falls back to secondary
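
The selection, validation, and failover behaviour exercised by the test plan can be illustrated with a small language-neutral sketch. It is written in Python rather than C# and is not the StellaOps implementation: every name in it (`LlmProvider`, `ProviderOptions`, `ProviderFactory`, `StubProvider`, `is_available`, `complete`, `create_with_fallback`) is a hypothetical stand-in, because this document does not list the members of `ILlmProvider`, `LlmProviderOptions`, or `LlmProviderFactory`. The endpoints are placeholder values, and the stub provider performs no network I/O.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


class LlmProvider(Protocol):
    """Hypothetical stand-in for ILlmProvider (real members are not documented here)."""
    name: str

    def is_available(self) -> bool: ...
    def complete(self, prompt: str) -> str: ...


@dataclass
class ProviderOptions:
    """Hypothetical stand-in for LlmProviderOptions."""
    name: str
    endpoint: str
    api_key: str = ""
    requires_api_key: bool = True  # local backends typically need no key

    def validate(self) -> List[str]:
        """Return a list of human-readable errors; empty means valid."""
        errors = []
        if not self.endpoint.startswith(("http://", "https://")):
            errors.append(f"{self.name}: invalid endpoint {self.endpoint!r}")
        if self.requires_api_key and not self.api_key:
            errors.append(f"{self.name}: missing API key")
        return errors


class StubProvider:
    """Test double that echoes the prompt instead of calling a backend."""

    def __init__(self, options: ProviderOptions, available: bool = True):
        self.name = options.name
        self._available = available

    def is_available(self) -> bool:
        return self._available

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ProviderFactory:
    """Hypothetical factory: validates options at registration, selects at runtime."""

    def __init__(self) -> None:
        self._builders: Dict[str, Callable[[], LlmProvider]] = {}

    def register(self, options: ProviderOptions,
                 builder: Callable[[ProviderOptions], LlmProvider]) -> None:
        errors = options.validate()
        if errors:
            raise ValueError("; ".join(errors))  # reject invalid configuration early
        self._builders[options.name] = lambda: builder(options)

    def create(self, name: str) -> LlmProvider:
        if name not in self._builders:
            raise KeyError(f"provider {name!r} is not configured")
        return self._builders[name]()

    def create_with_fallback(self, preferred: List[str]) -> LlmProvider:
        """Return the first configured provider, in order, that reports itself available."""
        for name in preferred:
            if name not in self._builders:
                continue
            provider = self._builders[name]()
            if provider.is_available():
                return provider
        raise RuntimeError("no configured provider is available")


if __name__ == "__main__":
    factory = ProviderFactory()
    # Simulate a cloud provider that is unreachable (for example, an air-gapped site).
    factory.register(
        ProviderOptions("primary", "https://llm.example.test", api_key="placeholder"),
        lambda opts: StubProvider(opts, available=False),
    )
    factory.register(
        ProviderOptions("local", "http://localhost:11434", requires_api_key=False),
        StubProvider,
    )
    provider = factory.create_with_fallback(["primary", "local"])
    print(provider.complete("hello"))  # prints "[local] hello"
```

In this sketch, validation runs at registration so that a misconfigured provider fails fast instead of at first inference, and the fallback order is an explicit list supplied by the caller. Whether the real `LlmProviderFactory` makes the same choices is not stated in this document.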