Scaling deployment
- Only SCALING UP is supported
- The ThingPark Enterprise scaling procedure depends on the volume resizing feature of the Kubernetes cluster storage provider. If the feature is supported, the deployment can be scaled without service disruption. Otherwise, the deployment must be backed up and restored.
Planning the Scale Up
Before starting the procedure:
- Ensure that your license matches the capacities of the targeted deployment.
- Ensure that the Kubernetes cluster can schedule the additional compute resources requested by the targeted segment (expressed in the sizing matrix). Scaling up also impacts storage cost by requiring more disk space.
- Get back your values-data-stack.yaml and values-thingpark-stack.yaml reference customization files holding your deployment configuration. Update these files with the new sizing profile:
- Retrieve the new sizing configuration:
# Value in m,l,xl
export SEGMENT=l
export RELEASE=8.0.x
export CONFIG_REPO_BASEURL=https://raw.githubusercontent.com/actility/thingpark-enterprise-kubernetes/v$RELEASE
curl -O $CONFIG_REPO_BASEURL/values/sizing/values-$SEGMENT-segment.yaml
Use yq to merge configuration files:
yq eval-all '. as $item ireduce ({}; . * $item)' values-storage.yaml \
values-default-priority-class.yaml values-$SEGMENT-segment.yaml values-data-stack.yaml \
| tee values-data-stack-all.yaml
yq eval-all '. as $item ireduce ({}; . * $item)' values-storage.yaml \
values-lb.yaml values-default-priority-class.yaml values-$SEGMENT-segment.yaml \
values-thingpark-stack.yaml | tee values-thingpark-stack-all.yaml
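Before going further, you can sanity-check that the merged files carry the new sizing, for example by querying the persistence size keys referenced later in this procedure (a minimal sketch using yq v4 syntax; the key locations assume the layout used by the sizing values files):
yq '.["mariadb-galera"].persistence.size' values-data-stack-all.yaml
yq '.lrc.persistence.size' values-thingpark-stack-all.yaml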
Option A: Scaling without interruption
Prerequisites
- The Kubernetes cluster must use a storage provider supporting volume resizing
- The StorageClass must be set with allowVolumeExpansion: true
- If the Strimzi operator has not been installed using the thingpark-data-controller Helm Chart, it must be configured with createGlobalResources=true
- Identify the new disk sizing for the following components in your customized configuration files:
  - mariadb-galera: mariadb-galera.persistence.size key
  - lrc: lrc.persistence.size key
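A quick way to verify the volume expansion prerequisite before patching any PVC is to list the StorageClasses and their allowVolumeExpansion setting (a sketch; check the StorageClass actually used by your deployment):
kubectl get storageclass -o custom-columns=NAME:.metadata.name,EXPANSION:.allowVolumeExpansion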
1. Data stack scale up
- Update the mariadb-galera Persistent Volume Claims with the mariadb-galera.persistence.size value:
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-0 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-1 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-2 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
Example to scale up to profile l:
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-0 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-1 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'
kubectl patch -n $NAMESPACE pvc data-mariadb-galera-2 -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'
Monitor that the storage controller progressively scales up each PVC:
kubectl -n $NAMESPACE describe pvc data-mariadb-galera-0
kubectl -n $NAMESPACE describe pvc data-mariadb-galera-1
kubectl -n $NAMESPACE describe pvc data-mariadb-galera-2
Example with AWS EBS storage controller:
Name: data-mariadb-galera-0
Namespace: devops-test-resizing
StorageClass: thingpark-csi-gp2-xfs
Status: Bound
Volume: pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
Labels: app.kubernetes.io/instance=tpe-data
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=mariadb-galera
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
volume.kubernetes.io/selected-node: ip-10-252-3-105.eu-west-1.compute.internal
volume.kubernetes.io/storage-provisioner: ebs.csi.aws.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 15Gi
Access Modes: RWO
VolumeMode: Filesystem
Used By: mariadb-galera-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 13m persistentvolume-controller waiting for first consumer to be created before binding
Normal Provisioning 13m ebs.csi.aws.com_ebs-csi-controller-6f854796-5dx5n_888c33a4-9e9b-4fb7-82ce-006130b6851d External provisioner is provisioning volume for claim "devops-test-resizing/data-mariadb-galera-0"
Normal ExternalProvisioning 13m persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'ebs.csi.aws.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
Normal ProvisioningSucceeded 13m ebs.csi.aws.com_ebs-csi-controller-6f854796-5dx5n_888c33a4-9e9b-4fb7-82ce-006130b6851d Successfully provisioned volume pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
Normal Resizing 2m56s external-resizer ebs.csi.aws.com External resizer is resizing volume pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e
Normal ExternalExpanding 2m56s volume_expand waiting for an external controller to expand this PVC
Normal FileSystemResizeRequired 2m50s external-resizer ebs.csi.aws.com Require file system resize of volume on node
Normal FileSystemResizeSuccessful 2m14s kubelet MountVolume.NodeExpandVolume succeeded for volume "pvc-9a33ed9f-5439-473c-8c5f-03c83ae58b2e" ip-10-252-3-105.eu-west-1.compute.internal
- Prepare the psmdb resource for expansion:
kubectl patch -n $NAMESPACE psmdb mongo-replicaset --type='json' -p='[{"op": "add", "path": "/spec/enableVolumeExpansion", "value": true}]'
- Delete the mariadb-galera statefulset with --cascade=orphan:
kubectl -n $NAMESPACE delete sts --cascade=orphan mariadb-galera
- Upgrade the tpe-data Helm release with the customization file updated with the new sizing profile:
helm upgrade tpe-data -n $NAMESPACE actility/thingpark-data -f values-data-stack-all.yaml
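Optionally, wait for the recreated statefulset to converge before validating the PVCs (a sketch; the statefulset name matches the delete command above):
kubectl -n $NAMESPACE rollout status sts/mariadb-galera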
- Validate that all PVCs are correctly updated to the target sizing:
kubectl -n $NAMESPACE get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
data-0-kafka-cluster-kafka-0 Bound pvc-742f63b3-514d-49b8-aaf8-9bf3b79f5d79 20Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-0-kafka-cluster-kafka-1 Bound pvc-dcc644e3-1735-40de-8eb0-cf84797530b3 20Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-kafka-cluster-zookeeper-0 Bound pvc-316e3a1a-aee0-4591-a59e-137a9330b40f 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-kafka-cluster-zookeeper-1 Bound pvc-a062ee17-f8b9-4b48-9103-d0d964817b47 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-kafka-cluster-zookeeper-2 Bound pvc-11b7cb24-3085-4ae7-85e7-6a845aacf626 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-mariadb-galera-0 Bound pvc-f0921f31-827c-43f1-937c-2845da9995bd 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
data-mariadb-galera-1 Bound pvc-0d3b5f4a-94df-4546-aacf-8aed80ee661b 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
data-mariadb-galera-2 Bound pvc-279907b9-bdf4-4de9-b514-b60570aaf35d 15Gi RWO thingpark-csi-gp2-xfs <unset> 17h
data-zookeeper-0 Bound pvc-b26a6f85-dc31-468d-9eb9-8edd7b03247e 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-zookeeper-1 Bound pvc-eaa40aa9-368d-4919-8873-056a375dcec5 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
data-zookeeper-2 Bound pvc-0023eed1-3af1-4a89-bc9b-67d16f479f28 5Gi RWO thingpark-csi-gp2-xfs <unset> 18h
mongod-data-mongo-replicaset-rs0-0 Bound pvc-02cbe222-4ea2-41e2-bf95-40ec13f2cf0e 25Gi RWO thingpark-csi-gp2-xfs <unset> 16h
mongod-data-mongo-replicaset-rs0-1 Bound pvc-c014cdbb-fbfc-4d41-9f26-1085a9bbd88e 25Gi RWO thingpark-csi-gp2-xfs <unset> 16h
If the mongod-data-mongo-replicaset-rs0-* or data-0-kafka-cluster-kafka-* PVCs are
not correctly updated, check respectively the psmdb-operator and
strimzi-cluster-operator deployment pod logs before contacting support.
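If that happens, the operator logs can be inspected directly (a sketch, assuming the operators run in the same namespace under the deployment names mentioned above):
kubectl -n $NAMESPACE logs deploy/psmdb-operator --tail=100
kubectl -n $NAMESPACE logs deploy/strimzi-cluster-operator --tail=100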
2. ThingPark Enterprise stack scale up
This section is not required if the lrc disk size is not changed (e.g. moving from the m to the l profile). If lrc.persistence.size is changed by the new profile, update the appropriate PVCs.
- Update the lrc Persistent Volume Claims with the lrc.persistence.size value:
kubectl patch -n $NAMESPACE pvc data-lrc-0 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
kubectl patch -n $NAMESPACE pvc data-lrc-1 -p '{"spec":{"resources":{"requests":{"storage":"<new_size>Gi"}}}}'
Monitor that the storage controller progressively scales up each PVC:
kubectl -n $NAMESPACE describe pvc data-lrc-0
kubectl -n $NAMESPACE describe pvc data-lrc-1
- When the Persistent Volume resizing is done, delete the lrc statefulset with --cascade=orphan:
kubectl -n $NAMESPACE delete sts --cascade=orphan lrc
- Finally, update the tpe Helm release with the updated customization file:
helm upgrade -i tpe --debug --timeout 20m -n $NAMESPACE \
actility/thingpark-enterprise --version $THINGPARK_ENTERPRISE_VERSION \
-f values-thingpark-stack-all.yaml
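One way to confirm that the lrc statefulset has been recreated and its volumes resized (a sketch using the statefulset and PVC names from the commands above):
kubectl -n $NAMESPACE rollout status sts/lrc
kubectl -n $NAMESPACE get pvc data-lrc-0 data-lrc-1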
Option B: Scaling with backup and restore
The scale-up procedure consists of:
- Back up your data from the initial deployment,
- Uninstall both ThingPark Enterprise and ThingPark Data stacks,
- Install a new empty ThingPark Enterprise deployment,
- Restore data from the backup done at step 1
- This is a major operation with the following impacts:
  - Service interruption for both API/GUI and base station flows
  - Packets may be lost or queued during re-deployment
- The same version of ThingPark Enterprise must be used to deploy the scaled-up infrastructure
- The backup must be done as close as possible to the uninstall
1. Backup data
- Run the backup script using the Kubernetes API exec endpoint:
export NAMESPACE=thingpark-enterprise
kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o backup
- Take note of the backup name for restoration:
kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o list
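If you prefer not to re-run the list command later, you can keep the backup name at hand for the restore step (purely a convenience; the variable name is an assumption and is not used by the tooling):
export BACKUP_NAME=<backup name from the list output>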
2. Uninstall
- Start by uninstalling the charts:
helm -n $NAMESPACE uninstall tpe tpe-controllers tpe-data tpe-data-controllers
- Remove the namespace (required to clean up all persistent data):
kubectl delete ns $NAMESPACE
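Namespace deletion is asynchronous; before re-deploying, you may want to confirm that it has completed (the command should eventually report NotFound):
kubectl get ns $NAMESPACE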
3. New Helm Release deployment
Using the previously recovered values-data-stack-all.yaml and
values-thingpark-stack-all.yaml customization files, follow the
Deployment procedure to re-deploy
ThingPark Enterprise on your cluster.
- Set the RELEASE environment variable to the same version as the previous ThingPark Enterprise deployment
- Set the SEGMENT environment variable to the targeted sizing
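For example, reusing the environment variables introduced in the sizing step (the values shown are illustrative):
# Same version as the previous ThingPark Enterprise deployment
export RELEASE=8.0.x
# Targeted sizing segment (m, l or xl)
export SEGMENT=l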
4. Data restoration
Trigger the data restoration (the command will ask for confirmation) using the initial backup name:
kubectl exec -it -n $NAMESPACE deploy/tp-backup-controller -- tp-backup -o restore -b %backup name%
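After the restore completes, a generic sanity check (not specific to the tp-backup tooling) is to verify that all workloads settle back to a Ready state:
kubectl -n $NAMESPACE get pods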