Skip to main content

Scripted Infrastructure Deployment

This guide walks you through deploying Azure infrastructure for AI/Run CodeMie using the automated azure-terraform.sh deployment script. The script handles all three deployment phases automatically: Terraform state backend, core platform infrastructure, and optional AI model deployments.

Recommended Approach

Scripted deployment is the recommended method as it handles prerequisite checks, configuration validation, and proper sequencing of Terraform operations automatically.

Prerequisites

Before starting the deployment, ensure you have completed all requirements from the Prerequisites page:

Verification Checklist

  • Azure Access: Contributor role with Entra ID App Registration access
  • Tools Installed: Terraform 1.5.7, Azure CLI, kubectl, Helm, gcloud CLI, Docker
  • Azure Authentication: Logged in via az login and subscription set
  • Repository Access: Have access to codemie-terraform-azure repository
  • Network Planning: Prepared list of allowed networks
  • Domain & Certificate: DNS zone and TLS certificate ready (for public access) or will use private DNS
Authentication Required

You must be authenticated to Azure CLI before running the deployment script. Run az login and verify with az account show.

Deployment Phases

The script automatically deploys infrastructure in three sequential phases:

PhaseDescriptionRequired
Phase 1: State BackendCreates Azure Storage Account for Terraform state filesYes
Phase 2: Platform InfrastructureDeploys AKS, networking, storage, databases, security componentsYes
Phase 3: AI ModelsProvisions Azure OpenAI services (if enabled)Optional
Skipping AI Models

Set DEPLOY_AI_MODELS="false" in configuration to skip Phase 3 if using external AI providers.

Phase 1, 2 & 3: Deploy Infrastructure

This phase deploys all infrastructure components using the automated deployment script: Terraform state backend (Phase 1), core platform infrastructure (Phase 2), and optionally AI model deployments (Phase 3).

Step 1: Clone Repository

Clone the Terraform deployment repository:

git clone git@gitbud.epam.com:epm-cdme/codemie-terraform-azure.git
cd codemie-terraform-azure

Step 2: Configure Deployment

Edit the deployment.conf file to provide your Azure-specific configuration:

# Required: Azure Account Information
AZURE_TENANT_ID="00000000-0000-0000-0000-000000000000"
AZURE_SUBSCRIPTION_ID="11111111-1111-1111-1111-111111111111"

# Required: Basic Configuration
TF_VAR_customer="airun" # Customer identifier (lowercase letters only)
TF_VAR_location="West Europe" # Azure region for deployment
TF_VAR_resource_group_name="" # Leave empty to auto-generate

# Required: AKS Admin Access
TF_VAR_admin_group_object_ids='["3a459347-0000-1111-2222-e73413cfa80a"]'

# Optional: Resource Tagging
TF_VAR_tags='{"createdWith":"Terraform","environment":"production"}'

# Optional: AI Models Deployment
DEPLOY_AI_MODELS="true" # Set to "false" to skip Azure OpenAI deployment
Required vs Optional Variables

The configuration file contains many variables. Most have sensible defaults. Focus on the Required variables first. See the Configuration Reference below for advanced options.

Complete Variable List

For all available configuration options, refer to the variables.tf files:

Step 3: Run Deployment Script

Execute the automated deployment script:

bash ./azure-terraform.sh

The script will automatically execute the following operations:

  1. Validate Environment: Check for required tools and Azure authentication
  2. Verify Configuration: Validate deployment.conf parameters
  3. Deploy State Backend: Create Azure Storage Account for Terraform state files
  4. Deploy Platform Infrastructure: Provision core platform infrastructure (AKS, networking, storage, databases)
  5. Deploy AI Models: Provision Azure OpenAI services (if DEPLOY_AI_MODELS="true")
  6. Generate Outputs: Create deployment_outputs.env with infrastructure details that will be required during next phases
Deployment in Progress

Do not interrupt the script during execution. Monitor the output for any errors.

Configuration Reference

AI Models Deployment Control

Control whether Azure OpenAI services are deployed using the DEPLOY_AI_MODELS parameter:

SettingBehaviorUse Case
"true" (default)Deploys Azure OpenAI services, private endpoints, and AI applicationUsing Azure-hosted AI models
"false"Skips AI models deployment entirelyUsing external AI providers (OpenAI API, Anthropic, AWS Bedrock, GCP Vertex AI)
When to Skip AI Models

Skip Azure OpenAI deployment (DEPLOY_AI_MODELS="false") if you:

  • Already have Azure OpenAI services deployed
  • Plan to use other non GPT family models (Claude, Gemini, etc)
  • Want to deploy AI models separately later
  • Are deploying infrastructure in stages

AI Models Network Access

Configure network access controls for Azure OpenAI services. All deployments include private endpoint connectivity; public access is optional and can be restricted.

Most Secure Configuration - Access only through Azure Private Endpoints

TF_VAR_ai_models_public_network_access_enabled="false"

Result:

  • ✅ Access via Azure Private Links from your VNet
  • ❌ Public internet access completely disabled
  • ✅ Recommended for production environments

Private Endpoint Configuration

Private endpoints are automatically deployed for secure VNet connectivity. Customize the network location if needed:

# Default values (can be customized)
TF_VAR_ai_network_name="AksVNet" # VNet for private endpoint
TF_VAR_ai_endpoint_subnet_name="UserSubnet" # Subnet for private endpoint
Private Endpoints

Private endpoints are created regardless of public access settings, ensuring secure connectivity from your Azure infrastructure.

Azure OpenAI Model Configuration

When DEPLOY_AI_MODELS="true", configure which AI models to deploy and their regional distribution using TF_VAR_cognitive_regions.

Configuration Parameters

Region-Level Settings:

  • region_name: Azure region (e.g., "eastus", "westeurope", "japaneast")
  • count: Number of Azure OpenAI instances to create in this region
  • custom_domain_name: Enable custom domain names (true/false)

Model-Level Settings:

  • format: Always "OpenAI"
  • name: Deployment name used in API calls
  • model_name: Azure OpenAI model identifier (e.g., "gpt-4o", "gpt-4", "text-embedding-ada-002")
  • version: Model version (e.g., "2024-11-20")
  • capacity: Total capacity units (automatically distributed across instances)
  • type: "Standard" (regional) or "GlobalStandard" (global with higher availability)
Capacity Distribution

The capacity value is the total capacity for that model. It's automatically divided by count:

  • capacity: 348 with count: 3 → Each instance gets 116 capacity units
  • capacity: 200 with count: 2 → Each instance gets 100 capacity units

Configuration Examples

Single Region Configuration Example
TF_VAR_cognitive_regions='{
"eastus": {
"count": 2,
"custom_domain_name": true,
"available_models": [
{
"format": "OpenAI",
"name": "gpt-4o-2024-11-20",
"model_name": "gpt-4o",
"version": "2024-11-20",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "gpt-4.1-2025-04-14",
"model_name": "gpt-4.1",
"version": "2025-04-14",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "text-embedding-ada-002",
"model_name": "text-embedding-ada-002",
"version": "2",
"capacity": 200,
"type": "GlobalStandard"
}
]
}
}'
Multiple Regions Configuration Example
TF_VAR_cognitive_regions='{
"eastus": {
"count": 2,
"custom_domain_name": true,
"available_models": [
{
"format": "OpenAI",
"name": "gpt-4o-2024-11-20",
"model_name": "gpt-4o",
"version": "2024-11-20",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "gpt-4.1-2025-04-14",
"model_name": "gpt-4.1",
"version": "2025-04-14",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "text-embedding-ada-002",
"model_name": "text-embedding-ada-002",
"version": "2",
"capacity": 200,
"type": "GlobalStandard"
}
]
},
"eastus2": {
"count": 2,
"custom_domain_name": true,
"available_models": [
{
"format": "OpenAI",
"name": "gpt-4o-2024-11-20",
"model_name": "gpt-4o",
"version": "2024-11-20",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "gpt-4.1-2025-04-14",
"model_name": "gpt-4.1",
"version": "2025-04-14",
"capacity": 200,
"type": "GlobalStandard"
},
{
"format": "OpenAI",
"name": "text-embedding-ada-002",
"model_name": "text-embedding-ada-002",
"version": "2",
"capacity": 200,
"type": "GlobalStandard"
}
]
}
}'

Deployment Outputs

Upon successful deployment, the script generates a deployment_outputs.env file containing essential infrastructure details needed for the next deployment phase:

# Platform Infrastructure Outputs
AZURE_CLIENT_ID="00000000-0000-0000-0000-000000000000"
AZURE_KEY_VAULT_URL="https://codemie-kv-abc123.vault.azure.net"
AZURE_KEY_NAME="codemie-key"
AZURE_STORAGE_ACCOUNT_NAME="codemiestorage123"
AZURE_RESOURCE_GROUP="airun-codemie"
BASTION_ADMIN_USERNAME="azadmin"
CODEMIE_DOMAIN_NAME="example.com"

# AI Model Outputs (if DEPLOY_AI_MODELS="true")
AZURE_AI_TENANT_ID="00000000-0000-0000-0000-000000000000"
AZURE_AI_CLIENT_ID="00000000-0000-0000-0000-000000000000"
AZURE_AI_CLIENT_SECRET="some-secret"

# Database Outputs
CODEMIE_POSTGRES_DATABASE_HOST="codemie-psql-abc123.postgres.database.azure.com"
CODEMIE_POSTGRES_DATABASE_PORT="5432"
CODEMIE_POSTGRES_DATABASE_NAME="codemie"
CODEMIE_POSTGRES_DATABASE_USER="pgadmin"
CODEMIE_POSTGRES_DATABASE_PASSWORD="password"
Save These Outputs

The deployment_outputs.env file contains sensitive information. Store it securely and reference it during the Components Deployment phase.

Post-Deployment Validation

After deployment completes, verify that all infrastructure was created successfully:

Step 1: Verify Azure Resources

Check that all expected resources were created in the Azure Portal:

# List all resources in the resource group
az resource list --resource-group <resource-group-name> --output table

# Verify AKS cluster status
az aks show --resource-group <resource-group-name> --name CodeMieAks --query "provisioningState"

# Verify PostgreSQL server status
az postgres flexible-server show --resource-group <resource-group-name> --name <postgres-server-name>

Step 2: Check Deployment Logs

Review the deployment logs in the logs/ directory for any warnings or errors:

ls -la logs/
# Review logs
cat logs/codemie_azure_deployment_YYYY-MM-DD-HHMMSS.log

Step 3: Verify Key Resources

Ensure critical resources are accessible:

ResourceVerification
AKS ClusterStatus should be "Succeeded", private endpoint created
Key VaultAccessible, contains SSH keys and secrets
Storage AccountCreated with private endpoint
PostgreSQLRunning, accessible via private endpoint
Azure BastionDeployed and associated with Hub VNet
NAT GatewayPublic IP assigned and associated with AKS subnets

Access Jumpbox VM via Bastion

The Jumpbox VM provides secure management access to your AKS cluster. Access it through Azure Bastion:

Step 1: Connect via SSH (Initial Setup)

  1. Navigate to your resource group in the Azure Portal (default: CodeMieRG)
  2. Select the Jumpbox VM (CodeMieVM)
  3. Click ConnectConnect via Bastion
  4. Configure connection settings:
    • Authentication type: SSH Private Key from Azure Key Vault
    • Username: azadmin
    • Subscription: Your Azure subscription
    • Azure Key Vault: CodeMieAskVault (or your Key Vault name)
    • Azure Key Vault Secret: codemie-vm-private-key
  5. Click Connect
Browser Shortcuts

Use Ctrl+Shift+C and Ctrl+Shift+V to copy/paste in the browser-based Bastion session.

Step 2: Set User Password

After initial SSH connection, set a password for the azadmin user (required for RDP access):

sudo passwd azadmin

Step 3: Connect via RDP (Management Access)

  1. Disconnect from SSH session
  2. Return to VM → ConnectConnect via Bastion
  3. Configure connection settings:
    • Protocol: RDP
    • Username: azadmin
    • Authentication type: Password
    • Password: Password set in Step 2
  4. Click Connect

Configure Jumpbox for AKS Access

Once connected to the Jumpbox via RDP, configure access to the AKS cluster:

Step 1: Authenticate to Azure

az login

Step 2: Set Active Subscription

az account set --subscription <subscription-id>

Step 3: Configure kubectl Access

Retrieve AKS credentials and configure kubectl:

# Replace <resource-group-name> with your resource group
# Default: CodeMieRG (unless overridden in deployment.conf)
az aks get-credentials \
--resource-group <resource-group-name> \
--name CodeMieAks \
--overwrite-existing

# Convert kubeconfig for Azure CLI authentication
kubelogin convert-kubeconfig -l azurecli

Step 4: Set Default Resource Group

# Set default resource group for Azure CLI commands
az configure --defaults group=<resource-group-name>

Step 5: Verify Cluster Access

# Test cluster connectivity
kubectl get nodes

# View cluster information
kubectl cluster-info

Next Steps

After successful infrastructure deployment and validation, proceed to:

Components Deployment - Deploy AI/Run CodeMie application components to your AKS cluster