
Teaching vacancies - Alert Runbook

These alerts are related to the service and raised in the #twd_tv_dev Slack channel.

Action

Metrics and Logging

Azure Kubernetes Service (AKS)

Further information on setting up and logging in to AKS is in the hosting document.

  • Request editor role access to the s189-teacher-services-cloud-production subscription through the Azure Portal
  • Log in with az login --tenant tenantid
  • List apps with kubectl get deployments -n tv-production (or other desired namespace)
  • List the pod statuses for the namespace with kubectl get pods -n tv-production
  • Get the logs for the app with
    make production logs CONFIRM_PRODUCTION=YES
  • Restart apps with
    kubectl rollout restart deployment teaching-vacancies-production -n tv-production
    kubectl rollout restart deployment teaching-vacancies-production-worker -n tv-production
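
A restart is only useful once the new pods are ready, so it can help to wait for the rollout to finish before closing the alert. The following is a minimal sketch, not part of the existing tooling: restart_and_wait is a hypothetical helper name, and the 300s timeout is an assumed value that may need adjusting.

```shell
#!/usr/bin/env bash
# Sketch only: restart_and_wait is a hypothetical helper, not an existing script.

restart_and_wait() {
  local deployment="$1"
  local namespace="$2"
  # Trigger a rolling restart of the deployment's pods.
  kubectl rollout restart deployment "$deployment" -n "$namespace" || return 1
  # Block until the rollout completes, or fail after the (assumed) timeout.
  kubectl rollout status deployment "$deployment" -n "$namespace" --timeout=300s
}
```

For example, restart_and_wait teaching-vacancies-production-worker tv-production restarts the worker and returns a non-zero status if the rollout does not complete in time.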
    

Terraform

Most AKS settings are in production.tfvars.json

postgres_flexible_server_sku      = "GP_Standard_D2ds_v4"
postgres_enable_high_availability = true
redis_queue_sku_name              = "Premium"
aks_web_app_instances             = 8
worker_app_instances              = 4
aks_worker_app_memory             = "1.5Gi"

Apps

Scale out the number of instances by increasing:

  • aks_web_app_instances
  • worker_app_instances

Scale up the worker app memory by increasing:

  • aks_worker_app_memory (the default for app memory is set to 1Gi in variables.tf and is overridden for the worker app in production only)
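
After a scaling change has been applied, the result can be checked against the cluster. The sketch below assumes that the instance variables map directly to the replica counts of the deployments listed in the AKS section. That mapping is an assumption, and replica_summary is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: replica_summary is a hypothetical helper, not an existing script.

replica_summary() {
  local deployment="$1"
  local namespace="$2"
  local desired ready
  # Desired replica count from the deployment spec.
  desired="$(kubectl get deployment "$deployment" -n "$namespace" -o jsonpath='{.spec.replicas}')" || return 1
  # Ready replica count from the deployment status.
  ready="$(kubectl get deployment "$deployment" -n "$namespace" -o jsonpath='{.status.readyReplicas}')" || return 1
  # readyReplicas is omitted from the status when no pods are ready, so default to 0.
  echo "$deployment: desired=${desired:-0} ready=${ready:-0}"
}
```

For example, replica_summary teaching-vacancies-production tv-production prints the desired and ready counts for the web app on one line.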

Postgres

You can list the computing and storage options for the Postgres flexible server instances from this list

Change the value for postgres_flexible_server_sku.
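
The available SKUs can also be listed from the Azure CLI before choosing a new value. This is a minimal sketch that assumes you are already logged in with az login. list_postgres_skus is a hypothetical helper name, and the region argument is a placeholder for whichever region the server runs in.

```shell
#!/usr/bin/env bash
# Sketch only: list_postgres_skus is a hypothetical helper, not an existing script.

list_postgres_skus() {
  # Azure region to query; a placeholder supplied by the caller.
  local location="$1"
  # List the flexible server SKUs offered in that region as a table.
  az postgres flexible-server list-skus --location "$location" --output table
}
```

Call it with the region name as the only argument, then copy the chosen SKU name into postgres_flexible_server_sku.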

Redis

The different Redis Cache tiers are listed in the Azure Cache for Redis documentation.

Change the values for redis_queue_sku_name, redis_queue_family and redis_queue_capacity. The same applies to the corresponding redis_cache_ values.