Secret Management Guide
This guide explains how to create and manage Kubernetes secrets for the Obliq SRE Agent deployment. The chart can either create and manage the secrets itself or use Kubernetes secrets that you have created in advance.
Secret Types
| Secret Type | Purpose | Chart Creates | Pre-existing Support |
|---|---|---|---|
| Global Secret | All environment variables | Yes | Yes |
| Image Pull Secret | Container registry access | Yes | Yes |
| Kubeconfig Secret | Kubernetes cluster access | β | β |
Required Environment Variables
Core Variables (Required)
- OPENAI_API_KEY - OpenAI API key for AI services (required)
- KUBECONFIG - Kubernetes configuration for cluster access (required)
AWS Integration (Optional)
- AWS_ACCESS_KEY_ID - AWS access key for cloud services
- AWS_SECRET_ACCESS_KEY - AWS secret access key
- AWS_REGION - AWS region (default: us-east-1)
- AWS_ROLE_ARN_AWS_MCP - IAM role for AWS MCP service
- AWS_ROLE_ARN_EC2_CLOUDWATCH_ALARMS - IAM role for CloudWatch alarms
External Integrations (Optional)
- SLACK_BOT_TOKEN - Slack bot token for notifications (xoxb-…)
- SLACK_WEBHOOK_URL - Slack webhook URL for alerts
- DD_API_KEY - DataDog API key for service graph engine
- DD_APP_KEY - DataDog application key
- JIRA_EMAIL - Jira user email for incident management
- JIRA_API_TOKEN - Jira API token
- JIRA_BASE_URL - Jira instance URL (default: https://avesha.atlassian.net)
- PROMETHEUS_URL - Prometheus server URL (default: http://prometheus:9090)
- PROMETHEUS_USER - Prometheus username (if auth enabled)
- PROMETHEUS_PASSWORD - Prometheus password (if auth enabled)
- LOKI_URL - Loki server URL (default: http://loki:3100)
- LOKI_USERNAME - Loki username (if auth enabled)
- LOKI_PASSWORD - Loki password (if auth enabled)
- LOKI_TOKEN - Loki authentication token
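Before creating a secret, it can help to confirm that every variable you plan to rely on actually has a value. The following is a minimal sketch of such a check, assuming the values are kept in a local KEY=VALUE file. The file name obliq.env and the function name missing_keys are hypothetical and not part of the chart.

```shell
#!/usr/bin/env bash
# Print every requested key that is absent or empty in a KEY=VALUE file.
# Assumption: values live in a local env file (here called obliq.env);
# the guide itself does not prescribe this layout.
missing_keys() {
  local env_file="$1"
  shift
  local key
  for key in "$@"; do
    # A key counts as present only when it is followed by a non-empty value.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      printf '%s\n' "$key"
    fi
  done
}

# Example: missing_keys obliq.env OPENAI_API_KEY AWS_REGION
```

An empty result means all listed keys are set. Pass only the keys for the integrations you intend to enable.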
Method 1: Using Pre-existing Global Secret (Recommended)
Create your own Kubernetes secret with all required environment variables:
Step 1: Create the Secret
Minimal Deployment (Core Services Only)
kubectl create secret generic obliq-secrets \
--namespace=avesha \
--from-literal=OPENAI_API_KEY="sk-your-openai-key"
AWS Integration
kubectl create secret generic obliq-secrets \
--namespace=avesha \
--from-literal=OPENAI_API_KEY="sk-your-openai-key" \
--from-literal=AWS_ACCESS_KEY_ID="your-aws-access-key" \
--from-literal=AWS_SECRET_ACCESS_KEY="your-aws-secret-key" \
--from-literal=AWS_REGION="us-east-1" \
--from-literal=AWS_ROLE_ARN_AWS_MCP="arn:aws:iam::account:role/aws-mcp-role"
Full Integration (All Services)
kubectl create secret generic obliq-secrets \
--namespace=avesha \
--from-literal=OPENAI_API_KEY="sk-your-openai-key" \
--from-literal=AWS_ACCESS_KEY_ID="your-aws-access-key" \
--from-literal=AWS_SECRET_ACCESS_KEY="your-aws-secret-key" \
--from-literal=AWS_REGION="us-east-1" \
--from-literal=SLACK_BOT_TOKEN="xoxb-your-slack-token" \
--from-literal=SLACK_WEBHOOK_URL="https://hooks.slack.com/services/your-webhook" \
--from-literal=DD_API_KEY="your-datadog-api-key" \
--from-literal=DD_APP_KEY="your-datadog-app-key" \
--from-literal=JIRA_EMAIL="user@company.com" \
--from-literal=JIRA_API_TOKEN="your-jira-api-token" \
--from-literal=JIRA_BASE_URL="https://company.atlassian.net" \
--from-literal=PROMETHEUS_URL="http://prometheus:9090" \
--from-literal=PROMETHEUS_USER="admin" \
--from-literal=PROMETHEUS_PASSWORD="your-prometheus-password" \
--from-literal=LOKI_URL="http://loki:3100" \
--from-literal=LOKI_USERNAME="admin" \
--from-literal=LOKI_PASSWORD="your-loki-password"
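Long chains of --from-literal flags leave credentials in shell history, and kubectl create fails if the secret already exists. One possible alternative, sketched below, reads the values from a KEY=VALUE file and pipes a locally rendered manifest into kubectl apply so the command can be re-run. The function name apply_secret_from_env_file and the file name obliq.env are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Create or update a generic secret from a KEY=VALUE file.
# apply_secret_from_env_file and obliq.env are illustrative names.
apply_secret_from_env_file() {
  local name="$1" namespace="$2" env_file="$3"
  # --dry-run=client renders the manifest locally; piping it to
  # "kubectl apply" makes the operation repeatable, whereas a plain
  # "kubectl create" fails once the secret already exists.
  kubectl create secret generic "$name" \
    --namespace="$namespace" \
    --from-env-file="$env_file" \
    --dry-run=client -o yaml |
    kubectl apply -f -
}

# Example: apply_secret_from_env_file obliq-secrets avesha obliq.env
```

Keep the env file out of version control, since it holds the same credentials as the secret.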
Step 2: Configure Chart to Use Existing Secret
Create a values file or use command line flags:
# custom-values.yaml
global:
  globalSecret:
    create:
      enabled: false         # Don't create a new secret
    existing:
      enabled: true          # Use existing secret
      name: "obliq-secrets"  # Your secret name
Step 3: Install with Pre-existing Secret
First, add the Helm repository:
helm repo add obliq-charts https://repo.obliq.avesha.io/
helm repo update
Then install with the pre-existing secret:
helm install obliq-sre-agent obliq-charts/obliq-sre-agent \
--namespace avesha \
--create-namespace \
--set-file global.kubeconfig.content=./kubeconfig \
--set global.globalSecret.existing.enabled=true \
--set global.globalSecret.existing.name=obliq-secrets \
--set global.globalSecret.create.enabled=false
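If you created custom-values.yaml in Step 2, the three --set flags for the global secret can be replaced by passing that file with -f. The wrapper below is a sketch under that assumption. The function name install_with_values is hypothetical, and the remaining flags are copied from the command above.

```shell
#!/usr/bin/env bash
# Install the release using the values file from Step 2 instead of --set flags.
# install_with_values is an illustrative wrapper name.
install_with_values() {
  local values_file="${1:-custom-values.yaml}"
  helm install obliq-sre-agent obliq-charts/obliq-sre-agent \
    --namespace avesha \
    --create-namespace \
    --set-file global.kubeconfig.content=./kubeconfig \
    -f "$values_file"
}

# Example: install_with_values custom-values.yaml
```

Keeping the settings in a file makes the install reproducible and easier to review than a long command line.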
Method 2: Using Pre-existing Image Pull Secret
If you already have a Docker registry secret:
Step 1: Create Registry Secret (if needed)
kubectl create secret docker-registry registry-secret \
--docker-server=avesha.azurecr.io \
--docker-username=your-username \
--docker-password=your-password \
--docker-email=your-email \
--namespace=avesha
Step 2: Configure Chart
# custom-values.yaml
global:
  imagePullSecrets:
    - name: registry-secret
  imagePullSecretConfig:
    create:
      enabled: false   # Don't create new registry secret
    existing:
      enabled: true    # Use existing registry secret
      name: "registry-secret"
Service-Specific Environment Variables
Each service requires specific environment variables from the global secret:
Core Services (Always Enabled)
- backend: OPENAI_API_KEY, PORT, INFRA_AGENT_HOST, INFRA_AGENT_PORT
- orchestrator: OPENAI_API_KEY, MCP_SERVERS, PORT
- rca-agent: OPENAI_API_KEY, MCP_SERVERS, PORT
- anomaly-detection: OPENAI_API_KEY, MCP_SERVERS, PORT
- auto-remediation: OPENAI_API_KEY, MCP_SERVERS, PORT
- incident-manager: OPENAI_API_KEY, MCP_SERVERS, PORT
Optional Services (Enable as needed)
- aws-mcp: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_ROLE_ARN_AWS_MCP
- cloudwatch-mcp: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
- prometheus-mcp: PROMETHEUS_URL, PROMETHEUS_USER, PROMETHEUS_PASSWORD
- loki-mcp: LOKI_URL, LOKI_USERNAME, LOKI_PASSWORD, LOKI_TOKEN
- slack-ingester: SLACK_BOT_TOKEN
- service-graph-engine: DD_API_KEY, DD_APP_KEY, DD_SITE
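The per-service lists above can be turned into a quick check before enabling an optional service. The sketch below covers only a few of the optional services and compares their listed variables against the keys present in a secret. The function names required_keys and missing_for_service are hypothetical helpers, not part of the chart.

```shell
#!/usr/bin/env bash
# Map an optional service to the variables this guide lists for it.
# Only a subset of services is shown; extend the case statement as needed.
required_keys() {
  case "$1" in
    aws-mcp)        echo "AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION AWS_ROLE_ARN_AWS_MCP" ;;
    cloudwatch-mcp) echo "AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION" ;;
    prometheus-mcp) echo "PROMETHEUS_URL PROMETHEUS_USER PROMETHEUS_PASSWORD" ;;
    slack-ingester) echo "SLACK_BOT_TOKEN" ;;
    *) return 1 ;;
  esac
}

# Read the keys present in the secret from stdin (one per line) and print
# the keys the given service still needs.
missing_for_service() {
  local service="$1" keys present key
  keys="$(required_keys "$service")" || {
    echo "unknown service: $service" >&2
    return 1
  }
  present="$(cat)"
  for key in $keys; do
    printf '%s\n' "$present" | grep -Fqx "$key" || printf '%s\n' "$key"
  done
}

# Example (lists the secret's keys with a go-template, then checks aws-mcp):
#   kubectl get secret obliq-secrets -n avesha \
#     -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}' |
#     missing_for_service aws-mcp
```

An empty result means the secret already holds every key listed for that service.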
Configuration Examples
Enable AWS Services
# Enable AWS-related services
aws-mcp:
  enabled: true
cloudwatch-mcp:
  enabled: true
aws-ec2-cloudwatch-alarms:
  enabled: true
Enable Observability Services
# Enable observability integrations
prometheus-mcp:
  enabled: true
loki-mcp:
  enabled: true
Enable External Integrations
# Enable external service integrations
slack-ingester:
  enabled: true
service-graph-engine:
  enabled: true
Verification and Troubleshooting
Check Secret Creation
# Verify secret exists
kubectl get secrets -n avesha
# Check secret content (keys only)
kubectl describe secret obliq-secrets -n avesha
# View secret data (base64 encoded)
kubectl get secret obliq-secrets -n avesha -o yaml
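The YAML output shows values in base64, which is hard to compare against what you intended to store. A small helper like the one below can extract and decode a single key. It is a sketch only, and the name secret_value is hypothetical.

```shell
#!/usr/bin/env bash
# Print the decoded value of one key in a secret.
# secret_value is an illustrative helper name.
secret_value() {
  local name="$1" namespace="$2" key="$3"
  # Values under .data are base64-encoded, so decode after extraction.
  kubectl get secret "$name" -n "$namespace" \
    -o jsonpath="{.data.${key}}" | base64 -d
}

# Example: secret_value obliq-secrets avesha AWS_REGION
```

This prints the plaintext value to the terminal, so prefer it for non-sensitive keys such as AWS_REGION or URLs when other people can see your screen or logs.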
Verify Environment Variables in Pods
# Check backend pod environment
kubectl exec -n avesha deployment/backend -- env | grep -E "(OPENAI|AWS|SLACK)" | sort
# Check specific service
kubectl exec -n avesha deployment/aws-mcp -- env | grep AWS
# Check all pods
kubectl get pods -n avesha -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
Common Issues and Solutions
1. Missing Required Variables
Problem: Pod fails to start due to missing environment variables
# Check pod events
kubectl describe pod -n avesha -l app.kubernetes.io/name=backend
# Check logs
kubectl logs -n avesha -l app.kubernetes.io/name=backend
Solution: Add missing variables to your secret
kubectl patch secret obliq-secrets -n avesha \
--type='json' \
-p='[{"op": "add", "path": "/data/MISSING_VAR", "value": "'"$(printf '%s' "your-value" | base64 | tr -d '\n')"'"}]'
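The quoting in the patch command above is easy to get wrong. As an alternative sketch, the JSON Patch can be built by a small function and passed to kubectl as a single argument. The name json_patch_for is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Build a JSON Patch that adds (or replaces) one key under /data.
# json_patch_for is an illustrative helper name.
json_patch_for() {
  local key="$1" value="$2" encoded
  # printf avoids a trailing newline; tr removes the line wrapping that
  # some base64 implementations add to long values.
  encoded="$(printf '%s' "$value" | base64 | tr -d '\n')"
  printf '[{"op":"add","path":"/data/%s","value":"%s"}]\n' "$key" "$encoded"
}

# Example:
#   kubectl patch secret obliq-secrets -n avesha --type=json \
#     -p "$(json_patch_for AWS_REGION us-west-2)"
```

In JSON Patch, an add operation on a key that already exists replaces its value, so the same helper works for updates.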
2. Wrong Secret Name/Namespace
Problem: Chart can't find the specified secret
# Check if secret exists in correct namespace
kubectl get secret obliq-secrets -n avesha
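Solution: Ensure the secret name matches global.globalSecret.existing.name in your chart configuration and that the secret is in the same namespace as the release (avesha in this guide)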
3. Invalid Secret Values
Problem: Services fail to authenticate with external APIs
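# Test OpenAI API (sh -c makes $OPENAI_API_KEY expand inside the pod, not in your local shell; requires sh and curl in the image)
kubectl exec -n avesha deployment/backend -- sh -c 'curl -s -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models'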
# Test AWS credentials
kubectl exec -n avesha deployment/aws-mcp -- aws sts get-caller-identity
Solution: Verify credentials are correct and have proper permissions
4. Pods Not Picking Up Secret Changes
Problem: Updated secret values are not reflected in running pods, because environment variables are read from the secret only when a container starts
# Restart deployments to pick up new values
kubectl rollout restart deployment -n avesha
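Restarting every deployment in the namespace is the simplest option. If only some services use the changed keys, a narrower restart that also waits for completion may be preferable. The sketch below assumes you pass the deployment names yourself. The helper name restart_and_wait and the 120-second timeout are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Restart the named deployments and wait for each rollout to finish.
# restart_and_wait is an illustrative helper; the 120s timeout is arbitrary.
restart_and_wait() {
  local namespace="$1" deploy
  shift
  for deploy in "$@"; do
    kubectl rollout restart "deployment/${deploy}" -n "$namespace" || return 1
    kubectl rollout status "deployment/${deploy}" -n "$namespace" --timeout=120s || return 1
  done
}

# Example: restart_and_wait avesha backend aws-mcp
```

A non-zero exit status indicates that a restart or rollout did not complete, which is a cue to inspect pod events and logs as described under Missing Required Variables.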
Secret Management Best Practices
- Use Descriptive Names: Use clear, descriptive names for your secrets
- Organize by Environment: Create separate secrets for dev/staging/prod
- Regular Rotation: Implement a process for regular secret rotation
- Least Privilege: Only include the secrets each service actually needs
- Backup Secrets: Ensure you have secure backups of critical secrets
- Monitor Access: Use RBAC to control who can access secrets
Related Documentation
- Main README - Quick start and overview
- Parameters Guide - Complete deployment examples
- Prerequisites - System requirements and setup
- Parameters Reference - All configuration options