Data Layer

This guide covers the installation of data storage components that provide persistent storage for AI/Run CodeMie application data, logs, and user content.

Overview

The data layer consists of two components:

  • Elasticsearch - Document storage and search engine for application logs, embeddings, and search functionality
  • PostgreSQL - Cloud-managed relational database for application metadata

Installation Order

These components must be installed in the order presented.

Elasticsearch Installation

Elasticsearch provides document storage and full-text search capabilities for AI/Run CodeMie. It stores conversation history, embeddings, and application logs, and powers the platform's search functionality.

Step 1: Create Elasticsearch Namespace

Create a dedicated namespace for Elasticsearch:

kubectl create namespace elastic

Namespace Verification

Check if the namespace already exists: kubectl get namespace elastic
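
If the namespace may already exist (for example on a re-run of this guide), an idempotent variant avoids the AlreadyExists error, using the same dry-run-and-apply pattern Step 2 uses for the secret:

```shell
# Idempotent namespace creation: safe to re-run
kubectl create namespace elastic --dry-run=client -o yaml | kubectl apply -f -
```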

Step 2: Create Elasticsearch Credentials Secret

Generate and store Elasticsearch authentication credentials:

kubectl -n elastic create secret generic elasticsearch-master-credentials \
--from-literal=username=elastic \
--from-literal=password="$(openssl rand -base64 12)" \
--type=Opaque \
--dry-run=client -o yaml | kubectl apply -f -

Secret Structure:

apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-master-credentials
  namespace: elastic
type: Opaque
data:
  username: <base64-encoded-username>
  password: <base64-encoded-password>

Retrieve Password

Save the generated password for troubleshooting: kubectl get secret elasticsearch-master-credentials -n elastic -o jsonpath='{.data.password}' | base64 -d

Step 3: Install Elasticsearch Helm Chart

Deploy Elasticsearch using Helm:

helm upgrade --install elastic elasticsearch/. \
-n elastic \
--values elasticsearch/values-azure.yaml \
--wait \
--timeout 900s \
--dependency-update
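
Once the command returns, you can inspect the state of the release with Helm's built-in status commands:

```shell
# Show the release status and confirm it is listed as deployed
helm status elastic -n elastic
helm list -n elastic
```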

Step 4: Verify Elasticsearch Deployment

Check that Elasticsearch is running:

# Check pod status
kubectl get pods -n elastic

# Check StatefulSet
kubectl get statefulset -n elastic

# Verify persistent volumes
kubectl get pvc -n elastic

Expected output:

  • Pods should be in Running state (typically 3 pods for a cluster)
  • StatefulSet should show desired replicas match ready replicas
  • PVCs should be in Bound state
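
Instead of polling manually, you can block until the pods report Ready. This sketch assumes the chart's default app=elasticsearch-master pod label; adjust the selector if your values file overrides the cluster name:

```shell
# Wait for all Elasticsearch pods to become Ready (up to 10 minutes)
kubectl wait --for=condition=Ready pod \
  -l app=elasticsearch-master -n elastic --timeout=600s
```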

Step 5: Test Elasticsearch Health

Verify Elasticsearch cluster health:

# Port-forward to Elasticsearch (run in a separate terminal, or background with &)
kubectl port-forward -n elastic svc/elasticsearch-master 9200:9200

# Check cluster health (use saved password from Step 2)
curl -u elastic:<password> "http://localhost:9200/_cluster/health?pretty"

# Stop the port-forward (Ctrl+C) when done

The response should show "status" : "green" or "status" : "yellow" (yellow is acceptable for single-node clusters).
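
The manual check above can also be scripted. This is a sketch that assumes jq is installed locally; it reads the password from the secret created in Step 2 and backgrounds the port-forward so the whole check runs in one terminal:

```shell
# Capture the Elasticsearch password from the secret created in Step 2
ES_PASSWORD=$(kubectl get secret elasticsearch-master-credentials -n elastic \
  -o jsonpath='{.data.password}' | base64 -d)

# Background the port-forward, give it a moment to establish, query health
kubectl port-forward -n elastic svc/elasticsearch-master 9200:9200 &
PF_PID=$!
sleep 3

# Print just the cluster status field (green / yellow / red)
curl -s -u "elastic:$ES_PASSWORD" "http://localhost:9200/_cluster/health" | jq -r .status

# Clean up the background port-forward
kill $PF_PID
```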

PostgreSQL Configuration

CodeMie uses Azure Database for PostgreSQL (created during infrastructure deployment) rather than running PostgreSQL in the cluster. This section configures the connection credentials.

Step 1: Retrieve Database Credentials

Get your Azure Database for PostgreSQL connection details from the infrastructure deployment outputs:

# From your deployment_outputs.env file, note these values:
# - CODEMIE_POSTGRES_DATABASE_HOST
# - CODEMIE_POSTGRES_DATABASE_NAME
# - CODEMIE_POSTGRES_DATABASE_USER
# - CODEMIE_POSTGRES_DATABASE_PASSWORD

Finding Credentials

Your deployment_outputs.env file was created during Infrastructure Deployment. It should be located in your Terraform working directory.

Step 2: Create PostgreSQL Connection Secret

Create a secret with the cloud-managed PostgreSQL credentials:

kubectl create secret generic codemie-postgresql \
--from-literal=PG_PASS=<CODEMIE_POSTGRES_DATABASE_PASSWORD> \
--from-literal=PG_USER=<CODEMIE_POSTGRES_DATABASE_USER> \
--from-literal=PG_HOST=<CODEMIE_POSTGRES_DATABASE_HOST> \
--from-literal=PG_NAME=<CODEMIE_POSTGRES_DATABASE_NAME> \
--namespace codemie

Replace Placeholders

Replace all <CODEMIE_POSTGRES_DATABASE_*> placeholders with actual values from your deployment_outputs.env file. Do not use angle brackets in the actual command.

Example with Real Values:

kubectl create secret generic codemie-postgresql \
--from-literal=PG_PASS='MySecureP@ssw0rd!' \
--from-literal=PG_USER='codemie_admin' \
--from-literal=PG_HOST='codemie-postgres.postgres.database.azure.com' \
--from-literal=PG_NAME='codemie' \
--namespace codemie
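
Alternatively, if your deployment_outputs.env file uses plain KEY=value shell syntax (an assumption worth verifying before relying on it), you can source the file and let the shell substitute the values instead of editing the command by hand:

```shell
# Assumes deployment_outputs.env contains shell-compatible KEY=value lines
set -a                          # export every variable defined while sourcing
source deployment_outputs.env
set +a

kubectl create secret generic codemie-postgresql \
  --from-literal=PG_PASS="$CODEMIE_POSTGRES_DATABASE_PASSWORD" \
  --from-literal=PG_USER="$CODEMIE_POSTGRES_DATABASE_USER" \
  --from-literal=PG_HOST="$CODEMIE_POSTGRES_DATABASE_HOST" \
  --from-literal=PG_NAME="$CODEMIE_POSTGRES_DATABASE_NAME" \
  --namespace codemie
```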

Secret Structure:

apiVersion: v1
kind: Secret
metadata:
  name: codemie-postgresql
  namespace: codemie
type: Opaque
data:
  PG_HOST: <base64-encoded-host>
  PG_NAME: <base64-encoded-db-name>
  PG_PASS: <base64-encoded-password>
  PG_USER: <base64-encoded-user>

Step 3: Verify PostgreSQL Secret

Confirm the secret was created correctly:

# Check secret exists
kubectl get secret codemie-postgresql -n codemie

# Verify secret contents (decode to check values)
kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_HOST}' | base64 -d
kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_USER}' | base64 -d
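
Beyond checking that the secret exists, a quick connectivity test confirms the credentials actually work. This sketch runs psql from a throwaway pod; it assumes the cluster can pull the public postgres image and reach the Azure endpoint, and that SSL is enforced (Azure Database for PostgreSQL requires it by default):

```shell
# Decode the connection details from the secret
PG_HOST=$(kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_HOST}' | base64 -d)
PG_USER=$(kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_USER}' | base64 -d)
PG_NAME=$(kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_NAME}' | base64 -d)
PG_PASS=$(kubectl get secret codemie-postgresql -n codemie -o jsonpath='{.data.PG_PASS}' | base64 -d)

# Run SELECT 1 from a temporary pod that is deleted afterwards
kubectl run pg-check -n codemie --rm -i --restart=Never --image=postgres:16 \
  --env="PGPASSWORD=$PG_PASS" -- \
  psql "host=$PG_HOST user=$PG_USER dbname=$PG_NAME sslmode=require" -c 'SELECT 1;'
```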

Post-Installation Validation

After completing all data layer installations, verify the following:

# Elasticsearch is running
kubectl get pods -n elastic | grep Running
kubectl get statefulset -n elastic

# PostgreSQL secret exists
kubectl get secret codemie-postgresql -n codemie

# Check all PVCs are bound
kubectl get pvc -n elastic

All checks should return successful results before proceeding.
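
The checks above can be collapsed into a single fail-fast script. This is a sketch: the pod label and secret names assume the defaults used earlier in this guide, and kubectl wait with a jsonpath condition requires kubectl 1.23 or later:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Elasticsearch pods Ready (assumes the chart's default app label)
kubectl wait --for=condition=Ready pod -l app=elasticsearch-master \
  -n elastic --timeout=120s

# All Elasticsearch PVCs Bound
kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc --all \
  -n elastic --timeout=60s

# PostgreSQL connection secret present
kubectl get secret codemie-postgresql -n codemie >/dev/null

echo "Data layer validation passed"
```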

Next Steps

Once the data layer is configured, proceed to Security and Identity installation to deploy Keycloak and OAuth2 Proxy components.