AI Models Integration
Overview
AI/Run CodeMie provides flexible integration options for connecting to Large Language Models (LLMs) and embedding models from various cloud providers. This section guides you through configuring model access, managing model settings, and choosing the right integration approach for your deployment.
Prerequisites
Before configuring AI models, complete the following:
- Enable Models in Cloud Provider
  - Azure OpenAI: Create an Azure OpenAI service and deploy models
  - AWS Bedrock: Request access to foundation models in AWS
  - Google Vertex AI: Enable the Vertex AI API and configure partner models
- Obtain Cloud Credentials
  - API keys, endpoints, or authentication credentials for your chosen provider (these can be stored as a Kubernetes Secret, as sketched below)
  - Required IAM permissions for model access
- Access Helm Chart Configuration
  - Clone the codemie-helm-charts repository
  - Locate the codemie-api/values.yaml file
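For reference, provider credentials are typically kept in a Kubernetes Secret that the Helm chart can reference. The sketch below is illustrative only; the Secret name, namespace, and keys are placeholders rather than values the chart expects.

```yaml
# Illustrative sketch: store provider credentials in a Kubernetes Secret.
# The Secret name, namespace, and keys are placeholders; check the chart
# values for the names it actually expects.
apiVersion: v1
kind: Secret
metadata:
  name: llm-provider-credentials
  namespace: codemie
type: Opaque
stringData:
  AZURE_OPENAI_API_KEY: "<your-azure-openai-key>"
  AZURE_OPENAI_ENDPOINT: "https://your-resource.openai.azure.com/"
```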
Integration Options
AI/Run CodeMie supports three model integration approaches. We recommend LiteLLM Proxy for production deployments due to its advanced features, flexibility, and usage tracking capabilities.
Option 1: LiteLLM Proxy (Recommended)
LiteLLM is the preferred integration method for production environments. It provides comprehensive usage statistics, detailed analytics, flexible routing, and enterprise-grade features that are essential for managing LLM and embedding models at scale.
Best for: Production deployments, use of multiple LLM providers, budget control, usage tracking
Deploy LiteLLM as a centralized proxy gateway with enterprise features including:
- Usage Statistics & Analytics: Detailed tracking of model usage, costs, token consumption, and performance metrics
- Budget Management: Set spending limits per user, team, or project with real-time enforcement
- Multi-provider Support: Route requests across AWS, Azure, and GCP through a unified API
- Load Balancing: Distribute requests across multiple model deployments for high availability
- Fallback Routing: Automatic failover when the primary model is unavailable
- Rate Limiting: Control request rates per user or team
- Caching: Built-in response caching to reduce costs and latency
- Observability: OpenTelemetry integration for monitoring and tracing
Configuration: Requires a LiteLLM Proxy deployment
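For orientation, a minimal LiteLLM Proxy configuration routing to Azure OpenAI and AWS Bedrock might look like the sketch below. Model names, deployment names, endpoints, and keys are placeholders; see the LiteLLM documentation for the full configuration reference.

```yaml
# Minimal LiteLLM Proxy config sketch (config.yaml) -- placeholder values.
model_list:
  - model_name: gpt-4o                                  # name clients will request
    litellm_params:
      model: azure/gpt-4o-deployment                    # Azure deployment name
      api_base: https://your-resource.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY                 # read from the proxy's environment
      api_version: "2024-02-15-preview"
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-east-1

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY             # key clients use to authenticate
```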
Required Environment Variables:
- LLM_PROXY_MODE=lite_llm - Enable LiteLLM Proxy mode
- LLM_PROXY_ENDPOINT=https://your-litellm-proxy-url - LiteLLM Proxy URL
- LLM_PROXY_API_KEY=your-api-key - LiteLLM Proxy authentication key
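As an illustration, these variables could be set through the codemie-api Helm values. The extraEnv key and Secret reference below are assumptions; your chart version may expose environment variables under a different key.

```yaml
# Hypothetical excerpt from codemie-api/values.yaml -- the key used for
# environment variables (extraEnv here) may differ in your chart version.
extraEnv:
  - name: LLM_PROXY_MODE
    value: "lite_llm"
  - name: LLM_PROXY_ENDPOINT
    value: "https://your-litellm-proxy-url"
  - name: LLM_PROXY_API_KEY
    valueFrom:
      secretKeyRef:
        name: litellm-proxy-credentials   # placeholder Secret name
        key: api-key
```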
Learn more about LiteLLM Proxy →
Option 2: Native Provider Integration (Easiest & Fastest Setup)
Best for: Quick setup, testing, single-cloud deployments, minimal configuration
Connect directly to cloud provider APIs without additional proxy layers. This is the easiest and fastest option to get started, requiring minimal configuration and no additional infrastructure.
Supported Providers:
- Azure OpenAI
- AWS Bedrock
- Google Vertex AI
Configuration: Uses llm-<MODELS_ENV>-config.yaml files mounted to CodeMie API pods.
Proxy Mode: LLM_PROXY_MODE=internal
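The exact schema of llm-<MODELS_ENV>-config.yaml is covered in the linked page. Purely as a generic illustration of the mounting step, the file can be delivered to the pod via a ConfigMap; the value keys and mount path below are assumptions about the chart, not documented settings.

```yaml
# Illustrative only: mount an llm-<MODELS_ENV>-config.yaml into the CodeMie API
# pod via a ConfigMap. Key names (extraEnv, extraVolumes, extraVolumeMounts)
# and the mount path are assumptions; check your chart's values for the real ones.
extraEnv:
  - name: LLM_PROXY_MODE
    value: "internal"
extraVolumes:
  - name: llm-config
    configMap:
      name: codemie-llm-config       # ConfigMap containing llm-<MODELS_ENV>-config.yaml
extraVolumeMounts:
  - name: llm-config
    mountPath: /app/config/llm       # placeholder mount path
    readOnly: true
```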
Learn more about CodeMie Native LLM Config →
Option 3: Third-Party LLM Proxy
Best for: Existing proxy infrastructure, OpenAI-compatible proxies, custom routing requirements
Integrate with your existing LLM proxy service that implements OpenAI-compatible APIs. Configure CodeMie to route requests through your proxy endpoint.
Requirements:
- OpenAI API-compatible endpoints
- Authentication mechanism (API keys, tokens)
Configuration: Use native integration with custom endpoint URLs
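As a hypothetical illustration (the real schema is defined by the native LLM config documentation), the practical difference is simply that the model entry's endpoint targets your OpenAI-compatible proxy instead of the cloud provider's API:

```yaml
# Hypothetical model entry -- field names are illustrative, not the documented
# schema. The endpoint points at your OpenAI-compatible proxy.
models:
  - name: gpt-4o                                        # model exposed by your proxy
    base_url: https://llm-proxy.internal.example.com/v1 # your proxy endpoint
    api_key: ${THIRD_PARTY_PROXY_API_KEY}               # injected from a Secret
```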
Choosing the Right Integration Option
- Choose LiteLLM if you need production-grade features, usage analytics, multi-cloud support, or plan to scale
- Choose Native if you need to get started quickly, are testing features, or have simple single-provider requirements
- Choose Third-Party if you already have an existing LLM proxy infrastructure