Prerequisites for Deployment
Before deploying Langfuse, complete the following steps:
Step 1: Clone the Repository
Clone the codemie-helm-charts repository.
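A minimal sketch of this step is shown below; the repository URL is an assumption — substitute your organization's actual location of codemie-helm-charts:

```shell
# NOTE: the URL is a placeholder -- use your organization's actual repository.
git clone https://github.com/your-org/codemie-helm-charts.git
cd codemie-helm-charts
```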
Step 2: Update langfuse/values.yaml
Before deployment, you MUST update the langfuse/values.yaml file with your environment-specific values:
# Update the domain in these sections:
langfuse:
  langfuse:
    # Specify Langfuse version
    image:
      tag: "3.129.0" # Pin to a specific version for production stability
    nextauth:
      url: "https://langfuse.%%DOMAIN%%" # Replace with your domain
    ingress:
      hosts:
        - host: "langfuse.%%DOMAIN%%" # Replace with your domain

# Adjust storage class if needed:
global:
  defaultStorageClass: "your-storage-class" # e.g., "gp3", "standard", etc.

# Adjust resource limits based on your requirements:
langfuse:
  web:
    resources:
      limits:
        cpu: "2" # Adjust as needed
        memory: "4Gi" # Adjust as needed
  worker:
    resources:
      limits:
        cpu: "2" # Adjust as needed
        memory: "4Gi" # Adjust as needed

postgresql:
  deploy: false
  host: some-postgresql.database.example.com # Replace with your database host
  auth:
    username: langfuse_admin
    existingSecret: langfuse-postgresql
    secretKeys:
      userPasswordKey: password

# Adjust component resources:
clickhouse:
  resources:
    limits:
      cpu: "2" # Adjust as needed
      memory: "8Gi" # Adjust as needed
  persistence:
    size: "100Gi" # Adjust as needed

redis:
  persistence:
    size: "2Gi" # Adjust as needed

s3:
  persistence:
    size: "100Gi" # Adjust as needed

# Configure data retention policies for Langfuse (optional):
retention:
  langfuse:
    enabled: false # Set to 'true' to automatically purge historical data
    observationsDays: 90 # Retain observations for 90 days
    tracesDays: 90 # Retain traces for 90 days
    blobstoragefilelogDays: 90 # Retain blob storage logs for 90 days
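To pick a valid value for global.defaultStorageClass, you can list the classes available in your cluster (this requires kubectl access; names such as gp3 or standard vary by provider):

```shell
# List available StorageClasses; the NAME column is what goes into
# global.defaultStorageClass. The "(default)" marker shows the cluster default.
kubectl get storageclass
```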
The retention configuration automatically applies TTL (Time-To-Live) policies to Langfuse tables in ClickHouse. This helps manage storage costs by automatically removing old data. Adjust the retention periods based on your compliance and storage requirements.
The clickhouse.extraOverrides section in values.yaml contains optional configuration for ClickHouse system tables TTL. By default, it's commented out to avoid modifying ClickHouse's internal logging behavior.
When to enable:
- You need to manage storage used by ClickHouse system logs (query_log, trace_log, metric_log, etc.)
- You want to limit retention of internal ClickHouse operational logs
How to enable:
- Uncomment the extraOverrides section in langfuse/values.yaml
- Adjust the TTL intervals as needed (default is 90 days)
- Redeploy Langfuse with helm upgrade
This configuration is separate from Langfuse data retention and only affects ClickHouse's internal system tables.
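Redeploying after a values change can be sketched as a standard upgrade. The release name, namespace, and chart path below are assumptions based on this repository's layout (mirroring the pgadmin command later in this guide):

```shell
# Re-apply the chart with the updated values.
# Release name, namespace, and chart path are placeholders for illustration.
helm upgrade --install langfuse langfuse/. \
  -n langfuse \
  --values langfuse/values.yaml \
  --wait --timeout 900s --dependency-update
```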
Step 2.1: Managing ClickHouse Data Retention
When you enable retention.langfuse.enabled: true in the configuration above, a Kubernetes Job automatically applies TTL (Time-To-Live) policies to Langfuse tables in ClickHouse. The TTL policies automatically delete data older than the specified retention period.
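The exact statements the Job executes are defined by the chart, but a ClickHouse TTL policy of this kind generally looks like the following — an illustration only, run from inside the ClickHouse pod:

```shell
# Illustrative only: attach a 90-day TTL to the traces table.
# The chart's Job applies equivalent policies to all three Langfuse tables.
clickhouse-client --password <password> --query \
  "ALTER TABLE default.traces MODIFY TTL toDateTime(timestamp) + INTERVAL 90 DAY"
```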
Scenario A: New Langfuse Installation
If you're deploying Langfuse for the first time:
- Enable retention in values.yaml (as shown above)
- Deploy Langfuse following the deployment instructions
- TTL policies will be applied automatically during deployment
- No manual data cleanup is needed
Scenario B: Existing Installation or Changing TTL Settings
If you have an existing Langfuse installation with data and want to:
- Enable retention for the first time, OR
- Reduce the retention period (e.g., from 90 days to 30 days)
You must manually delete old data before enabling or updating retention settings.
Step 1: Connect to ClickHouse
Find the ClickHouse pod name:
kubectl get pods -n langfuse | grep clickhouse
Connect to a ClickHouse pod (replace X with the pod's ordinal from the previous command, e.g. 0):
kubectl exec -it langfuse-clickhouse-shard0-X -n langfuse -- /bin/bash
Step 2: Get ClickHouse Password
Retrieve the admin password from the Kubernetes secret:
kubectl get secret langfuse-clickhouse -n langfuse -o jsonpath='{.data.admin-password}' | base64 --decode; echo
Step 3: Connect to ClickHouse Client
Inside the pod, connect to ClickHouse using the password from Step 2:
clickhouse-client --password <password_from_step_2>
Step 4: Delete Old Data
Execute the following SQL commands to delete data older than your desired retention date.
Example: Delete data older than July 13, 2025:
-- Delete old observations
ALTER TABLE default.observations DELETE WHERE toDate(start_time) < toDate('2025-07-13');
-- Delete old traces
ALTER TABLE default.traces DELETE WHERE toDate(timestamp) < toDate('2025-07-13');
-- Delete old blob storage logs
ALTER TABLE default.blob_storage_file_log DELETE WHERE toDate(created_at) < toDate('2025-07-13');
These DELETE operations are irreversible. Make sure you have backups if needed and verify the date before executing.
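Note that ALTER TABLE ... DELETE runs asynchronously as a ClickHouse mutation, so deletion may still be in progress after the statement returns. You can watch progress before moving on (run inside the pod, using the same password as above):

```shell
# Show any unfinished delete mutations; an empty result means all deletes completed.
clickhouse-client --password <password> --query \
  "SELECT table, command, is_done FROM system.mutations WHERE is_done = 0"
```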
After deleting old data manually, you can enable or update the retention configuration in values.yaml and redeploy Langfuse. The TTL policies will then automatically manage future data cleanup.
For useful SQL queries to monitor disk usage, verify retention policies, and analyze data patterns, see Operational Queries.
Step 3: Configure PostgreSQL
Configure PostgreSQL running as a managed cloud service.
3.1. Connect to PostgreSQL Database
Connect to the codemie PostgreSQL database using one of the following options, depending on your cloud provider:
- Some cloud providers have built-in query tools
- Deploy pgadmin inside the cluster to access your private Postgres instance:
# Create namespace and secret
kubectl create ns pgadmin
kubectl create secret generic pgadmin4-credentials \
--namespace pgadmin \
--from-literal=password="$(openssl rand -hex 16)" \
--type=Opaque
helm upgrade --install pgadmin pgadmin/. -n pgadmin --values pgadmin/values.yaml --wait --timeout 900s --dependency-update
# port-forward to svc
kubectl -n pgadmin port-forward svc/pgadmin-pgadmin4 8080:80
# access via localhost:8080 with secret from pgadmin namespace
# Default user: "pgadmin4@example.com"
# Retrieve the pgAdmin password from the Kubernetes secret.
kubectl -n pgadmin get secret pgadmin4-credentials -o jsonpath='{.data.password}' | base64 -d; echo
3.2. Create Database and User
Execute the following SQL commands to create the database and user:
CREATE DATABASE postgres_langfuse;
CREATE USER langfuse_admin WITH PASSWORD 'your_strong_password_here';
GRANT ALL PRIVILEGES ON DATABASE postgres_langfuse TO langfuse_admin;
3.3. Grant Schema Privileges
Switch to the postgres_langfuse database and grant schema privileges:
GRANT ALL ON SCHEMA public TO langfuse_admin;
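If you are working from a terminal rather than a GUI query tool, steps 3.2 and 3.3 can be sketched with psql as follows; the host and superuser name are placeholders, not values from this repository:

```shell
# Placeholders: the host and the "postgres" superuser are assumptions for illustration.
psql -h some-postgresql.database.example.com -U postgres -d postgres -c \
  "CREATE DATABASE postgres_langfuse;"
psql -h some-postgresql.database.example.com -U postgres -d postgres -c \
  "CREATE USER langfuse_admin WITH PASSWORD 'your_strong_password_here';
   GRANT ALL PRIVILEGES ON DATABASE postgres_langfuse TO langfuse_admin;"
# Connect to the new database to grant schema privileges there.
psql -h some-postgresql.database.example.com -U postgres -d postgres_langfuse -c \
  "GRANT ALL ON SCHEMA public TO langfuse_admin;"
```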