Disaster Recovery from a failure on the global synchronizer

This flow is adapted from the official Canton guide.

More specific instructions, adapted to the Catalyst validator installation, were added to provide additional aid.

1 - Set the following variables for the rest of the commands

export NAME=<name of the validator>
export PASSWORD=<password for wallet>
export KEYCLOAK_URL=<keycloak url>
export REALM=<keycloak realm for catalyst>

2 - Log in to Keycloak with the wallet user of the validator and save the response to a JSON file

curl --location "${KEYCLOAK_URL}/auth/realms/${REALM}/protocol/openid-connect/token" \
--header "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "grant_type=password" \
--data-urlencode "client_id=${NAME}-wallet-ui" \
--data-urlencode "username=${NAME}_walletuser" \
--data-urlencode "password=${PASSWORD}" > token.json

3 - Extract token to an environment variable

export TOKEN=$(jq -r .access_token token.json)
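
A quick sanity check on the extracted token can avoid a confusing failure in the next step: when the login fails, token.json holds an error object and jq -r .access_token prints the literal string null. The sketch below is a suggestion rather than part of the original procedure, and the helper name token_is_usable is hypothetical.

```shell
# Hypothetical helper (not part of the original guide).
# Succeeds only if the argument is non-empty and not the literal "null",
# which is what `jq -r .access_token` prints when the key is missing.
token_is_usable() {
  case "$1" in
    ""|null) return 1 ;;
    *) return 0 ;;
  esac
}

if token_is_usable "${TOKEN:-}"; then
  echo "Access token extracted."
else
  echo "No access token found; inspect token.json for the Keycloak error." >&2
fi
```

If the check fails, re-run step 2 after verifying the user name, password, and realm.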

4 - Obtain the snapshot from the wallet

  • The wallet URL can be obtained from the UI page of the wallet application

  • During network recovery, the super validators will provide the timestamp in the #validators-ops channel

  • The timestamp uses the same format as the validator logs

export WALLET_URL=<url of the validator wallet application>
export TIMESTAMP=<timestamp in the same format as the logs>
curl -sSLf "${WALLET_URL}/api/validator/v0/admin/domain/data-snapshot?timestamp=${TIMESTAMP}&force=true" -X GET \
           -H "authorization: Bearer ${TOKEN}" \
           -H "Content-Type: application/json" > dump_response.json

5 - Extract the data snapshot

jq '.data_snapshot' dump_response.json > data_snapshot.json
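
If the request in step 4 failed, or the response has no data_snapshot field, data_snapshot.json ends up empty or containing only null. A minimal check before copying the file anywhere, assuming nothing beyond the file produced above (the helper name snapshot_is_usable is hypothetical):

```shell
# Hypothetical helper (not part of the original guide).
# Succeeds only if the file exists, is non-empty, and is not just "null",
# which is what `jq '.data_snapshot'` writes when the field is absent.
snapshot_is_usable() {
  [ -s "$1" ] && [ "$(head -c 5 "$1")" != "null" ]
}

if snapshot_is_usable data_snapshot.json; then
  echo "data_snapshot.json looks usable."
else
  echo "data_snapshot.json is missing, empty, or null; check dump_response.json." >&2
fi
```

This only rules out the obvious failure modes; it does not validate the contents of the snapshot.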

6 - Copy the snapshot to the dump folder in the validator

  • The validator pod is named after the validator, with suffixes added by Kubernetes

export NAMESPACE=<namespace of the validator>
export PODNAME=<pod of validator>
kubectl cp data_snapshot.json ${NAMESPACE}/${PODNAME}:/domain-upgrade-dump/domain_migration_dump.json
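
To find the value for PODNAME, the pod list can be filtered by the validator name. This is a sketch that relies on the note above, namely that the pod name starts with the validator name. The helper filter_validator_pods is hypothetical; kubectl get pods -o name is a standard invocation that prints one pod/<name> entry per line.

```shell
# Hypothetical helper (not part of the original guide).
# Reads "pod/<name>" lines on stdin and prints the names that start with
# the given validator name, without the "pod/" prefix.
filter_validator_pods() {
  awk -v prefix="pod/$1" 'index($0, prefix) == 1 { print substr($0, 5) }'
}

# Usage, assuming NAME and NAMESPACE are set as in the earlier steps:
#   kubectl get pods -n "${NAMESPACE}" -o name | filter_validator_pods "${NAME}"
```

If several pods match, inspect them before choosing which one to export as PODNAME.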

7 - Compare the md5sum of your local file and the file copied to the pod:

export LOCAL_DUMP_MD5SUM=$(md5sum data_snapshot.json)
export PODS_DUMP_MD5SUM=$(kubectl -n "${NAMESPACE}" exec "${PODNAME}" -- md5sum /domain-upgrade-dump/domain_migration_dump.json)
if [ "$(echo "$LOCAL_DUMP_MD5SUM" | awk '{print $1}')" = "$(echo "$PODS_DUMP_MD5SUM" | awk '{print $1}')" ]; then echo 'MD5SUM checksums are equal.'; else echo 'MD5SUM checksums are not equal in local and pod. Check the file copied is correct!'; fi

8 - On the UI validator page, edit the validator, set migrationID to a new migration ID, and toggle migrating to true.

Then confirm and wait for the update to complete.