Upgrade Postgresql-ha Helm Chart for GoodData 2.1.0
If you’re using an external PostgreSQL database (that is, you deployed the gooddata-cn Helm chart with the option deployPostgresHA: false), you can skip this step because it is not relevant for your deployment.
If you’re using the embedded PostgreSQL database (with deployPostgresHA: true set in values.yaml), you will need to perform the following procedure to upgrade the postgresql-ha Helm chart from version 8.6.13 to 9.1.5. The upgrade includes a migration of the PostgreSQL database from version 11 to version 14. Due to the nature of the upgrade, this action will cause a service disruption, so schedule a maintenance window for this operation.
Important
All commands must end successfully. Any error must be addressed properly to avoid data loss. Get familiar with the procedure on a test environment before applying it to production.
Always start the upgrade in an empty folder. The procedure creates helper files for rollback purposes. If two upgrade procedures are executed from the same folder, the helper files from the two environments mix up and rollback will not be possible without manual intervention.
Note
The procedure expects kubectl version 1.23 or newer because of the --retries parameter of the cp subcommand. It is possible to run the procedure with an older version: remove --retries and double-check that the complete data were transferred.
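If you are unsure which kubectl version you have, check the client version before you start:

```bash
# Print the kubectl client version (1.23+ supports --retries for kubectl cp)
kubectl version --client
```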
Steps:
If you have not done so already, create a /tmp/organization-layout.json JSON dump of your GoodData.CN 2.0.x organization layout. See Back Up the Organization.
Set up your shell environment. Make sure you change the following values:
```bash
# namespace GoodData.CN is deployed to
export NAMESPACE=gooddata-cn
# name of helm release used to deploy GoodData.CN
export HELM_RELEASE=release-name
# path to helm values file used to deploy GoodData.CN
export HELM_VALUES_FILE="values-gooddata-cn.yaml"
# postgres name as specified in values.yaml postgresql-ha.nameOverride
export PG_NAME=db
# PG-HA admin user as defined in values.yaml postgresql-ha.postgresql.username
export PGUSER=postgres
# PG-HA admin user password as defined in values.yaml postgresql-ha.postgresql.password
export PGPASSWORD=$(cat pg_password.txt)
export PGHOST=${HELM_RELEASE}-${PG_NAME}-postgresql-0
# helm release name of temporary PG in destination version started to execute pg_dump
export TMP_PGDUMP_RELEASE=tmp-pg-dump
export TMP_PGDUMP_POD=${TMP_PGDUMP_RELEASE}-postgresql-0
# location of dumps in temporary container
export DUMP_LOCATION=/bitnami/postgresql/dumps
```
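Optionally, before continuing, you can sanity-check the exported variables and confirm that the primary PostgreSQL pod referenced by PGHOST exists (a minimal sketch using the variables above):

```bash
# Print the key variables and confirm the primary PostgreSQL pod is reachable by name
echo "NAMESPACE=$NAMESPACE HELM_RELEASE=$HELM_RELEASE PGHOST=$PGHOST"
kubectl get pod -n "$NAMESPACE" "$PGHOST"
```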
Disable access to your GoodData.CN application:
```bash
helm upgrade --namespace $NAMESPACE --version 2.0.1 \
  --wait --timeout 7m -f $HELM_VALUES_FILE \
  --set metadataApi.replicaCount=0 \
  --set sqlExecutor.replicaCount=0 \
  --set dex.replicaCount=0 \
  $HELM_RELEASE gooddata/gooddata-cn
```
Note that once the command finishes, users will see an Internal Server Error message when trying to access any organization in the deployment.
Deploy a temporary container to dump your data into:
```bash
cat << EOT > /tmp/values-${TMP_PGDUMP_RELEASE}.yaml
auth:
  postgresPassword: dumpdata
primary:
  persistence:
    enabled: false
readReplicas:
  replicaCount: 0
EOT
helm upgrade --install --namespace $NAMESPACE --version 11.6.6 \
  --wait --timeout 2m --values /tmp/values-${TMP_PGDUMP_RELEASE}.yaml \
  ${TMP_PGDUMP_RELEASE} bitnami/postgresql
```
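Optionally, wait for the temporary pod to become ready before creating the network policy and the dumps (a minimal check, assuming the pod name stored in TMP_PGDUMP_POD):

```bash
# Block until the temporary PostgreSQL pod reports Ready, or fail after 120 seconds
kubectl wait --namespace "$NAMESPACE" --for=condition=Ready pod/"$TMP_PGDUMP_POD" --timeout=120s
```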
Enable network access between gooddata-cn-db-pgpool and tmp-pg-dump-postgresql:

```bash
cat << EOT > /tmp/network-access-tmp-pg-2-prod-pg.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: $NAMESPACE
  name: $TMP_PGDUMP_RELEASE-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: $NAMESPACE
      podSelector:
        matchLabels:
          app.kubernetes.io/instance: $TMP_PGDUMP_RELEASE
          app.kubernetes.io/name: postgresql
    ports:
    - port: 5432
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: postgresql
      app.kubernetes.io/instance: $NAMESPACE
      app.kubernetes.io/name: $PG_NAME
  policyTypes:
  - Ingress
EOT
kubectl apply -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
```
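Optionally, confirm the network policy was created (the name matches the one defined in the manifest above):

```bash
# The policy should be listed in the GoodData.CN namespace
kubectl get networkpolicy -n "$NAMESPACE" "${TMP_PGDUMP_RELEASE}-ingress"
```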
List available databases in the postgres-ha deployment:
```bash
kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l"
```
Pick the databases to be preserved. Use their names as input for the variable USER_DBS_TO_TRANSFER in the next step. Note that:
- the databases md, execution and dex are always preserved
- the databases template0 and template1 must always be skipped
Dump the databases that you want to preserve. Edit the USER_DBS_TO_TRANSFER and USER_ROLES_BY_PG_HA values:

```bash
# space separated list of user DBs to be dumped, system DBs md, execution and dex are included automatically
export USER_DBS_TO_TRANSFER="tigerdb"
# space separated list of user-defined PG roles delivered by postgresql-ha helm chart;
# the roles will be excluded from roles dump as they are created automatically during PG-HA ecosystem provisioning;
# roles repmgr, postgres and executor are excluded automatically
export USER_ROLES_BY_PG_HA=""

cat << "EOT" > ./dump-pg-dbs.sh
#!/bin/bash
set -x
set -e

PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')

kubectl -n $NAMESPACE exec $TMP_PGDUMP_POD -- mkdir -p $DUMP_LOCATION

# exclude all the roles created automatically by:
# - PG-HA chart
# - postgres installation
# - GD chart
time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
  pg_dumpall -h $PGHOST_IP -U $PGUSER --roles-only > /tmp/0_dump_pg_roles_all.sql"

ROLES_TO_EXCLUDE="repmgr postgres executor"
if [[ "$USER_ROLES_BY_PG_HA" != "" ]]; then
  ROLES_TO_EXCLUDE="${ROLES_TO_EXCLUDE} $USER_ROLES_BY_PG_HA"
fi
ROLES_TO_EXCLUDE=$(echo $ROLES_TO_EXCLUDE | tr ' ' '|')

kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "grep -i -v \
  -E \"^(CREATE|ALTER) ROLE (${ROLES_TO_EXCLUDE})\" \
  /tmp/0_dump_pg_roles_all.sql | gzip > $DUMP_LOCATION/0_dump_pg_roles.sql.gz"

# dump selected databases
ALL_DBS_TO_TRANSFER="md execution dex ${USER_DBS_TO_TRANSFER}"
ITER_ID=1
for db in $ALL_DBS_TO_TRANSFER; do
  DUMP_DEST_FILE="$DUMP_LOCATION/${ITER_ID}_dump_pg_db_${db}.sql.gz"
  echo "Creating dump of DB ${db} to file ${DUMP_DEST_FILE}"
  time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
    pg_dump -h $PGHOST_IP -U $PGUSER --quote-all-identifiers --create ${db} | gzip > $DUMP_DEST_FILE"
  ITER_ID=$(($ITER_ID+1))
done
EOT
chmod 754 dump-pg-dbs.sh
./dump-pg-dbs.sh
```
The script first dumps only the roles; these are PostgreSQL instance-level definitions. After that, the databases are dumped one by one.
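Optionally, you can spot-check the filtered roles dump. It should list only the roles you want to migrate; repmgr, postgres and executor must not appear (a minimal sketch, reading the dump created above):

```bash
# Show the role statements that remain after filtering; || true avoids a non-zero
# exit status when the dump contains no role statements at all
kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c \
  "gzip -cd $DUMP_LOCATION/0_dump_pg_roles.sql.gz | grep -E '^(CREATE|ALTER) ROLE' || true"
```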
Before moving forward, verify that the temporary pod contains all the requested dumps. That means:
- 3 dumps (files) for the md, dex and execution databases
- 1 dump for roles
- 1 dump for each database specified in the USER_DBS_TO_TRANSFER variable

```bash
kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "ls -la $DUMP_LOCATION"
```
(Optional) Download dumps to local machine for backup:
Note
This step requires kubectl version 1.23 or newer. Version 1.23 adds the --retries parameter to the cp subcommand.
kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "cd $DUMP_LOCATION/../; tar cf dumps.tar dumps/*" kubectl cp -n $NAMESPACE --retries -1 $TMP_PGDUMP_POD:/bitnami/postgresql/dumps.tar ./dumps.tar # verify, data were transferred completely - compare MD5 hash kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- md5sum $DUMP_LOCATION/../dumps.tar md5sum ./dumps.tar
Remove postgresql-ha:
```bash
helm upgrade --namespace $NAMESPACE --version 2.0.1 \
  --wait --timeout 2m -f $HELM_VALUES_FILE \
  --set metadataApi.replicaCount=0 \
  --set sqlExecutor.replicaCount=0 \
  --set dex.replicaCount=0 \
  --set deployPostgresHA=false \
  $HELM_RELEASE gooddata/gooddata-cn
```
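Optionally, confirm that the postgresql-ha pods are gone before cleaning up the persistent volume claims in the next step (the selector below assumes the same labels used by the PVC backup script that follows):

```bash
# No pods should be returned once postgresql-ha has been removed
kubectl get pods -n $NAMESPACE \
  --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}"
```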
Remove the persistent volume claims and back up the related persistent volumes:
cat << "EOT" > ./backup-pvs.sh #!/bin/bash set -x set -e PV_NAME_BACKUP_PREFIX=pv_name_backup PVC_BACKUP_PREFIX=pvc_bck PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \ --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \ -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}') echo "$PVC_PV_LINES" | while read -r line; do PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ') PV_INSTANCE=$(echo $line | cut -f 2 -d ' ') # make sure PV is not deleted by k8s after PVC is removed echo "Setting Retain policy for PV=${PV_INSTANCE}" kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' # remember PV name for rollback purposes PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}" echo "Creating PV name backup as $PV_NAME_BACKUP_FILE" touch $PV_NAME_BACKUP_FILE # Backup PVC definition for rollback purposes PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml" echo "Creating PVC backup file $PVC_BCK_FILE" kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE # delete PVC echo "Deleting PVC $PVC_INSTANCE" kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE done EOT chmod 754 backup-pvs.sh ./backup-pvs.sh
Upgrade the chart to 2.1.0; this includes the upgrade of the postgresql-ha chart to 9.x.x:
```bash
helm upgrade --namespace $NAMESPACE --version 2.1.0 \
  --wait --timeout 7m -f $HELM_VALUES_FILE \
  --set metadataApi.replicaCount=0 \
  --set sqlExecutor.replicaCount=0 \
  --set dex.replicaCount=0 \
  --set deployPostgresHA=true \
  $HELM_RELEASE gooddata/gooddata-cn
```
Verify postgres DB version:
```bash
kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
```
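The reported version should be 14.x. If you also want to confirm the server version, a direct query works as well (using the same connection pattern as the other psql commands in this procedure):

```bash
# Ask the running server for its version instead of the psql client version
kubectl -n $NAMESPACE exec $PGHOST -c postgresql -- env PGPASSWORD=$PGPASSWORD \
  psql -U postgres -tAc "SHOW server_version;"
```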
Restore DB data:
cat << "EOT" > ./restore-data.sh #!/bin/bash set -x set -e PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}') DUMP_FILES=$(kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- ls $DUMP_LOCATION) for dump_file in $DUMP_FILES; do echo "Restoring dump ${dump_file}" time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -c postgresql -- bash -c "\ gzip -cd $DUMP_LOCATION/$dump_file | \ env PGPASSWORD=$PGPASSWORD psql -h $PGHOST_IP -U $PGUSER" done EOT chmod 754 restore-data.sh ./restore-data.sh
Verify that all databases exist in the PostgreSQL instance:
```bash
kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l"
```
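Optionally, spot-check that the restored databases actually contain data, for example by counting the user tables in the md database (a minimal sketch; adjust the database name as needed):

```bash
# A non-zero count indicates that the schema and data were restored into md
kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -d md -tAc \
  "SELECT count(*) FROM pg_catalog.pg_tables WHERE schemaname NOT IN ('pg_catalog','information_schema');"
```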
Enable application:
```bash
helm upgrade --namespace $NAMESPACE --version 2.1.0 \
  --wait --timeout 7m -f $HELM_VALUES_FILE \
  $HELM_RELEASE gooddata/gooddata-cn
```
Test the GoodData.CN deployment:
- Is it possible to log in?
- Do dashboard reports compute?
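A minimal reachability check from outside the cluster is sketched below; analytics.example.com is a placeholder, replace it with the hostname of one of your organizations:

```bash
# Fetch only the response headers of the organization's landing page;
# the hostname is a hypothetical example value
curl -sSI https://analytics.example.com/ | head -n 1
```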
Remove network policies:
```bash
kubectl delete -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
```
Remove temporary PG deployed to create DB dumps:
```bash
helm uninstall --namespace $NAMESPACE $TMP_PGDUMP_RELEASE
```
Important
This operation drops all pg_dump backups. If step (8) was skipped, the backups will be lost.
Rollback
If something goes wrong, you can use the persistent volumes backed up in step 7 to revert to version 2.0.1.
Steps:
Disable access to the application and remove postgresql-ha:
```bash
helm upgrade --namespace $NAMESPACE --version 2.1.0 \
  --wait --timeout 2m -f $HELM_VALUES_FILE \
  --set metadataApi.replicaCount=0 \
  --set sqlExecutor.replicaCount=0 \
  --set dex.replicaCount=0 \
  --set deployPostgresHA=false \
  $HELM_RELEASE gooddata/gooddata-cn
```
Remove the PVCs and back up the new PVs:
cat << "EOT" > ./remove-new-pvc.sh #!/bin/bash set -x set -e PV_NAME_BACKUP_PREFIX=pv14_name_backup PVC_BACKUP_PREFIX=pvc14_bck PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \ --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \ -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}') echo "$PVC_PV_LINES" | while read -r line; do PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ') PV_INSTANCE=$(echo $line | cut -f 2 -d ' ') # make sure PV is not deleted by k8s after PVC is removed echo "Setting Retain policy for PV=${PV_INSTANCE}" kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}" # remember PV name for possible investigation purposes echo "Creating PV name backup as $PV_NAME_BACKUP_FILE" touch $PV_NAME_BACKUP_FILE # Backup PVC definition for rollback purposes PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml" echo "Creating PVC backup file $PVC_BCK_FILE" kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE # delete PVC echo "Deleting PVC $PVC_INSTANCE" kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE done EOT chmod 754 remove-new-pvc.sh ./remove-new-pvc.sh
Restore original PVCs and bind them to PV backups:
cat << "EOT" > ./restore-orig-pvc.sh #!/bin/bash set -x set -e PV_NAME_BACKUP_PREFIX=pv_name_backup PVC_BACKUP_PREFIX=pvc_bck # prepare original PVs to be joined by PVCs for bck_pv in $(ls ${PV_NAME_BACKUP_PREFIX}*); do PV_INSTANCE=${bck_pv#"${PV_NAME_BACKUP_PREFIX}_"} echo "Making PV ${PV_INSTANCE} available" kubectl patch pv "$PV_INSTANCE" --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]' echo "Setting Retain policy for PV=${PV_INSTANCE} to delete" kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' done # restore original PVCs for bck_pvc in $(ls ${PVC_BACKUP_PREFIX}*); do PVC_INSTANCE=${bck_pvc#"${PVC_BACKUP_PREFIX}_"} echo "Creating PVC ${PVC_INSTANCE}" kubectl create -f $bck_pvc done EOT chmod 754 restore-orig-pvc.sh ./restore-orig-pvc.sh
Install application in the original version:
```bash
helm upgrade --namespace $NAMESPACE --version 2.0.1 \
  --wait --timeout 2m -f $HELM_VALUES_FILE \
  --set metadataApi.replicaCount=0 \
  --set sqlExecutor.replicaCount=0 \
  --set dex.replicaCount=0 \
  --set deployPostgresHA=true \
  $HELM_RELEASE gooddata/gooddata-cn
```
Verify postgres DB version:
```bash
kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
```
Enable application:
```bash
helm upgrade --namespace $NAMESPACE --version 2.0.1 \
  --wait --timeout 7m -f $HELM_VALUES_FILE \
  $HELM_RELEASE gooddata/gooddata-cn
```
Test the GoodData.CN deployment after rollback:
- Is it possible to log in?
- Do dashboard reports compute?