
Scaling deployment

WARNING
  • Only SCALING UP is supported
  • The ThingPark Enterprise scaling procedure depends on the volume resizing feature of the Kubernetes cluster storage provider. If the feature is supported, the deployment is scaled without service disruption. Otherwise, the deployment must be backed up and restored.

Planning the Scale Up

Before starting procedure:

  • Ensure that your license matches the capacity of the targeted deployment.

  • Ensure that the Kubernetes cluster can schedule the additional compute resources requested by the targeted segment (expressed in the sizing matrix); a quick capacity check is sketched after this list. Scaling up also impacts storage cost by requiring more disk space.

  • Retrieve your reference values-data-stack.yaml and values-thingpark-stack.yaml customization files, which hold your deployment configuration. Update these files with the new profile file:

    • Retrieve the new sizing configuration
    # Value in m,l,xl
    export SEGMENT=l
    export RELEASE=8.0.x
    export CONFIG_REPO_BASEURL=https://raw.githubusercontent.com/actility/thingpark-enterprise-kubernetes/v$RELEASE
    curl -O $CONFIG_REPO_BASEURL/values/sizing/values-$SEGMENT-segment.yaml

    Use yq to merge configuration files:

    yq eval-all '. as $item ireduce ({}; . * $item)' values-storage.yaml \
    values-default-priority-class.yaml values-$SEGMENT-segment.yaml values-data-stack.yaml \
    | tee values-data-stack-all.yaml
    yq eval-all '. as $item ireduce ({}; . * $item)' values-storage.yaml \
    values-lb.yaml values-default-priority-class.yaml values-$SEGMENT-segment.yaml \
    values-thingpark-stack.yaml | tee values-thingpark-stack-all.yaml
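
As a quick capacity check before scaling (an optional sketch, not part of the official procedure; kubectl top requires metrics-server to be installed), you can compare node allocatable resources and current usage against the figures of the sizing matrix:

    # Resources already requested on each node
    kubectl describe nodes | grep -A 8 "Allocated resources"
    # Live CPU/memory usage snapshot (requires metrics-server)
    kubectl top nodes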

Option A: Scaling without interruption

Prerequisites

  • The Kubernetes cluster must use a storage provider supporting volume resizing
  • StorageClass must be set with allowVolumeExpansion: true
  • If the strimzi operator has not been installed using the thingpark-data-controller Helm Chart, it must be configured with createGlobalResources=true
  • Identify the new disk sizing for the following components in your customized configuration files (a quick check is sketched after this list):
    • mariadb-galera: mariadb-galera.persistence.size key
    • lrc: lrc.persistence.size key
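
These prerequisites can be checked from the command line (a sketch; the StorageClass name thingpark-csi-gp2-xfs matches the examples below and may differ in your deployment, and the sizing keys are assumed to land in the merged files produced during planning):

    # Must print "true" to allow online volume expansion
    kubectl get storageclass thingpark-csi-gp2-xfs -o jsonpath='{.allowVolumeExpansion}{"\n"}'
    # Read the new disk sizes from the merged customization files
    yq '.["mariadb-galera"].persistence.size' values-data-stack-all.yaml
    yq '.lrc.persistence.size' values-thingpark-stack-all.yaml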

1. Data stack scale up

  1. Update mariadb-galera Persistent Volume Claims with the mariadb-galera.persistence.size value

    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-0 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-1 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-2 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
    Example: scaling up to the l profile
    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-0 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'
    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-1 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'
    kubectl patch -n $NAMESPACE pvc data-mariadb-galera-2 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'

    Monitor that the storage controller progressively scales up each PVC:

    kubectl -n $NAMESPACE describe pvc data-mariadb-galera-0
    kubectl -n $NAMESPACE describe pvc data-mariadb-galera-1
    kubectl -n $NAMESPACE describe pvc data-mariadb-galera-2
    Example with AWS EBS storage controller
    Name:          data-mariadb-galera-0
    Namespace: devops-test-resizing
    StorageClass: thingpark-csi-gp2-xfs
    Status: Bound
    Volume: pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
    Labels: app.kubernetes.io/instance=tpe-data
    app.kubernetes.io/managed-by=Helm
    app.kubernetes.io/name=mariadb-galera
    Annotations: pv.kubernetes.io/bind-completed: yes
    pv.kubernetes.io/bound-by-controller: yes
    volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
    volume.kubernetes.io/selected-node: ip-10-252-3-105.eu-west-1.compute.internal
    volume.kubernetes.io/storage-provisioner: ebs.csi.aws.com
    Finalizers: [kubernetes.io/pvc-protection]
    Capacity: 15Gi
    Access Modes: RWO
    VolumeMode: Filesystem
    Used By: mariadb-galera-0
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal WaitForFirstConsumer 13m persistentvolume-controller waiting for first consumer to be created before binding
    Normal Provisioning 13m ebs.csi.aws.com_ebs-csi-controller-6f854796-5dx5n_888c33a4-9e9b-4fb7-82ce-006130b6851d External provisioner is provisioning volume for claim "devops-test-resizing/data-mariadb-galera-0"
    Normal ExternalProvisioning 13m persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'ebs.csi.aws.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
    Normal ProvisioningSucceeded 13m ebs.csi.aws.com_ebs-csi-controller-6f854796-5dx5n_888c33a4-9e9b-4fb7-82ce-006130b6851d Successfully provisioned volume pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
    Normal Resizing 2m56s external-resizer ebs.csi.aws.com External resizer is resizing volume pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
    Normal ExternalExpanding 2m56s volume_expand waiting for an external controller to expand this PVC
    Normal FileSystemResizeRequired 2m50s external-resizer ebs.csi.aws.com Require file system resize of volume on node
    Normal FileSystemResizeSuccessful 2m14s kubelet MountVolume.NodeExpandVolume succeeded for volume "pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e" ip-10-252-3-105.eu-west-1.compute.internal
  2. Prepare psmdb resource for expansion

    kubectl patch -n $NAMESPACE psmdb mongo-replicaset --type='json' -p='[{"op": "add", "path": "/spec/enableVolumeExpansion", "value": true}]'
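    Optionally, confirm that the flag is now set on the custom resource; the jsonpath below simply mirrors the path patched above:

    kubectl get -n $NAMESPACE psmdb mongo-replicaset -o jsonpath='{.spec.enableVolumeExpansion}{"\n"}'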
  3. Delete the mariadb-galera statefulset with cascade=orphan

    kubectl -n $NAMESPACE delete sts --cascade=orphan mariadb-galera
  4. Upgrade the tpe-data Helm release with the customization file updated with the new sizing profile

    helm upgrade  tpe-data -n $NAMESPACE  actility/thingpark-data -f values-data-stack-all.yaml
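    Optionally, wait for the recreated statefulset to become ready again before checking the PVCs (a convenience check, not part of the official procedure):

    kubectl -n $NAMESPACE rollout status statefulset/mariadb-galera --timeout=15m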
  5. Validate that all PVCs are correctly updated to the target sizing

    kubectl -n $NAMESPACE get pvc
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
    data-0-kafka-cluster-kafka-0 Bound pvc-742f63b3-514d-49b8-aaf8-9bf3b79f5d79 20Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-0-kafka-cluster-kafka-1 Bound pvc-dcc644e3-1735-40de-8eb0-cf84797530b3 20Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-kafka-cluster-zookeeper-0 Bound pvc-316e3a1a-aee0-4591-a59e-137a9330b40f 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-kafka-cluster-zookeeper-1 Bound pvc-a062ee17-f8b9-4b48-9103-d0d964817b47 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-kafka-cluster-zookeeper-2 Bound pvc-11b7cb24-3085-4ae7-85e7-6a845aacf626 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-mariadb-galera-0 Bound pvc-f0921f31-827c-43f1-937c-2845da9995bd 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
    data-mariadb-galera-1 Bound pvc-0d3b5f4a-94df-4546-aacf-8aed80ee661b 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
    data-mariadb-galera-2 Bound pvc-279907b9-bdf4-4de9-b514-b60570aaf35d 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
    data-zookeeper-0 Bound pvc-b26a6f85-dc31-468d-9eb9-8edd7b03247e 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-zookeeper-1 Bound pvc-eaa40aa9-368d-4919-8873-056a375dcec5 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    data-zookeeper-2 Bound pvc-0023eed1-3af1-4a89-bc9b-67d16f479f28 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
    mongod-data-mongo-replicaset-rs0-0 Bound pvc-02cbe222-4ea2-41e2-bf95-40ec13f2cf0e 25Gi RWO thingpark-csi-gp2-xfs <unset> 16h
    mongod-data-mongo-replicaset-rs0-1 Bound pvc-c014cdbb-fbfc-4d41-9f26-1085a9bbd88e 25Gi RWO thingpark-csi-gp2-xfs <unset> 16h
info

If mongod-data-mongo-replicaset-rs0-* or data-0-kafka-cluster-kafka-* are not correctly updated, check the psmdb-operator and strimzi-cluster-operator deployment pod logs respectively before contacting support.
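
For example, assuming the operator Deployments keep their default names and run in the same namespace as the data stack, the logs can be inspected with:

    kubectl -n $NAMESPACE logs deploy/psmdb-operator --tail=100
    kubectl -n $NAMESPACE logs deploy/strimzi-cluster-operator --tail=100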

2. Thingpark Enterprise stack scale up

This section is not required if the lrc disk size is not changed by the new profile (e.g. when moving from the m to the l profile). If lrc.persistence.size is changed by the new profile, update the appropriate PVCs:

  1. Update lrc Persistent Volume Claims with the lrc.persistence.size value

    kubectl patch -n $NAMESPACE pvc data-lrc-0 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
    kubectl patch -n $NAMESPACE pvc data-lrc-1 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'

    Monitor that the storage controller progressively scales up each PVC:

    kubectl -n $NAMESPACE describe pvc data-lrc-0
    kubectl -n $NAMESPACE describe pvc data-lrc-1
  2. When the Persistent Volume resizing is done, delete the lrc statefulset with cascade=orphan

    kubectl -n $NAMESPACE delete sts --cascade=orphan lrc
  3. Finally, upgrade the tpe Helm release with the updated customization file:

helm upgrade -i tpe --debug --timeout 20m -n $NAMESPACE \
actility/thingpark-enterprise --version $THINGPARK_ENTERPRISE_VERSION \
-f values-thingpark-stack-all.yaml
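
As with the data stack, you can optionally watch the recreated lrc statefulset return to a ready state before considering the scale up complete:

    kubectl -n $NAMESPACE rollout status statefulset/lrc --timeout=20m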

Option B: Scaling with backup and restore

The scale-up procedure consists of:

  1. Back up your data from the initial deployment,
  2. Uninstall both ThingPark Enterprise and ThingPark Data stacks,
  3. Install a new empty ThingPark Enterprise deployment,
  4. Restore the data from the backup taken at step 1.
CAUTION
  • This is a major operation with the following impacts:

    • Service interruption for both API/GUI and base station flows
    • Packets may be lost or queued during re-deployment
  • The same version of ThingPark Enterprise must be used to deploy the scaled-up infrastructure

  • The backup must be taken as close as possible to the uninstall


1. Backup data

  1. Run the backup script using the Kubernetes API exec endpoint

    export NAMESPACE=thingpark-enterprise
    kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o backup
  2. Take note of the backup name for the restoration step

    kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o list

2. Uninstall

  1. Start by uninstalling the charts

    helm -n $NAMESPACE uninstall tpe tpe-controllers tpe-data tpe-data-controllers
  2. Remove the namespace (required to clean up all persistent data)

    kubectl delete ns $NAMESPACE
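
    Namespace deletion can take some time. Before re-deploying, you may want to confirm that the namespace is gone and that no PersistentVolume remains bound to claims from it (an optional check; whether volumes are deleted automatically depends on the StorageClass reclaimPolicy):

    # Returns NotFound once the deletion completes
    kubectl get ns $NAMESPACE
    # Should list nothing for the deleted namespace
    kubectl get pv | grep "$NAMESPACE/" || true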

3. New Helm Release deployment

Using the values-data-stack-all.yaml and values-thingpark-stack-all.yaml customization files prepared earlier, follow the Deployment procedure to re-deploy ThingPark Enterprise on your cluster.

WARNING
  • Set the RELEASE environment variable to the same version as the previous ThingPark Enterprise deployment
  • Set the SEGMENT environment variable to the targeted sizing
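
For reference, the re-deployment ends with the same Helm releases used earlier on this page (a minimal sketch; follow the full Deployment procedure, including the controller charts, for the complete sequence):

    helm upgrade -i tpe-data -n $NAMESPACE actility/thingpark-data -f values-data-stack-all.yaml
    helm upgrade -i tpe --timeout 20m -n $NAMESPACE \
      actility/thingpark-enterprise --version $THINGPARK_ENTERPRISE_VERSION \
      -f values-thingpark-stack-all.yaml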

4. Data restoration

Trigger the data restoration (the command will ask for confirmation) using the initial backup name:

kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o restore -b %backup name%