#4253 - Upgrade Redis Part 2 (#4309)

The existing Redis files shown below should be removed in a new pull request once the new redis-cluster is deployed successfully.

<img width="320" alt="image" src="https://github.com/user-attachments/assets/db64333f-82c7-41f4-a189-f7a27809584b" /> <img width="312" alt="image" src="https://github.com/user-attachments/assets/c270cfc0-c1f5-4a45-b8a8-90489db9e689" />

- The PVC size is set to 1Gi, in line with the existing PROD size.
- A service account is required when creating the cluster: it gives the Redis pods the RBAC (Role-Based Access Control) permissions they need to interact with the other objects created during installation, such as the secrets, configmaps and PVCs.
- The Makefile commands for the old Redis in the devops folder are removed, as the new Redis is installed with Helm from the devops/helm/redis-cluster folder.
- Redis credentials use a generated 32-character alphanumeric password, like the previously generated ones.

**AOF vs RDB**

Finding the right persistence mechanism for our project was a crucial part of the analysis. The findings below are based on the official Redis persistence documentation: https://redis.io/docs/latest/operate/oss_and_stack/management/persistence/

**Common functionalities of AOF and RDB and how they are used during disaster recovery**

- We have enabled a PVC for our DB, so both the AOF and RDB files are saved to it.
- Even if we uninstall the helm chart, the PVCs stay. When the chart is installed again, with a different version or after disaster recovery, the current helm configuration reconnects the existing PVC automatically and no data is lost.

**AOF**

AOF (append-only file) logs every write operation by appending it to a file on disk. It usually consists of a series of files: a base file, incremental update files and a manifest file. The configuration can be checked by running the command below in a shell on any of the redis-cluster pods; the output is shown after it.

```
$ cat /opt/bitnami/redis/etc/redis.conf | grep appendonly
appendonly yes
# For example, if appendfilename is set to appendonly.aof, the following file
# - appendonly.aof.1.base.rdb as a base file.
# - appendonly.aof.1.incr.aof, appendonly.aof.2.incr.aof as incremental files.
# - appendonly.aof.manifest as a manifest file.
appendfilename "appendonly.aof"
appenddirname "appendonlydir"
```

These files are present in the /bitnami/redis/data folder. appendonly.aof.1.base.rdb is the base file; appendonly.aof.1.incr.aof and appendonly.aof.2.incr.aof are the incremental files; appendonly.aof.manifest is the manifest file, which holds the metadata of the AOF files.

There can be more than one incremental file (for example appendonly.aof.2.incr.aof) because of how an AOF rewrite works: a child process creates a new base AOF file while the parent logs updates to a new incremental AOF file. Once the rewrite completes, Redis atomically updates the manifest and cleans up the old files to ensure a consistent dataset. If the rewrite fails, the old base and incremental files plus the newly opened incremental file still represent the complete dataset. This multi-part AOF mechanism is available in Redis 7.0 and later; we are using 7.4.2-debian-12-r0, so it is available.

PROS and CONS: The main downside of AOF is that the files grow large because of the incremental updates, so recovery takes more time. In exchange, the data loss in case of a disaster is at most one second, which is achieved with the configuration below.

<img width="248" alt="image" src="https://github.com/user-attachments/assets/b22b2ada-f175-46d9-a656-1b68f9619272" />

**RDB**

RDB takes a SNAPSHOT of the current dataset. It works more like a backup strategy that runs at the configured intervals. It is a single file, and its name can be found by running the command below in a shell on any of the redis-cluster pods.

```
$ cat /opt/bitnami/redis/etc/redis.conf | grep dbfilename
# and 'dbfilename') and that aren't usually modified during runtime
dbfilename dump.rdb
# above using the 'dbfilename' configuration directive.
```

The file dump.rdb is present in /bitnami/redis/data and contains the snapshot of the dataset. Snapshot frequency is set by the save configuration below.

<img width="549" alt="image" src="https://github.com/user-attachments/assets/13aa1f8a-1e09-4964-8349-50d059b84b46" />

| **Time Interval (seconds)** | **Minimum Number of Changes** |
|----------------------------|------------------------------|
| 900 seconds (15 minutes) | 1 change |
| 300 seconds (5 minutes) | 10 changes |
| 60 seconds (1 minute) | 10,000 changes |

PROS and CONS: RDB recovers data quickly because it does not have to replay multiple files and its file is relatively smaller than the AOF files. The downside is the interval between saves. With the current configuration, a snapshot is taken after 5 minutes only if there are at least 10 changes, and after 15 minutes if there is at least 1 change. So if there are only 9 changes, they are not saved to disk until the 15-minute mark, and a disaster during that window loses those 9 changes.

**Conclusion**

Enabling RDB and AOF at the same time gives the best of both worlds for the recovery strategy. Also, now that Redis is installed with Helm, upgrades and full disaster recovery can be done via GitHub Actions.

**Installation and upgrade of redis**

Installation and upgrade of redis-cluster are handled by the GHA `Redis Cluster - Install/Upgrade`.

![image](https://github.com/user-attachments/assets/cc525402-70b2-437f-8531-2e3820415b30)

**Issues in the Redis Cluster**

The BC Gov troubleshooting guide is available at **https://github.com/bcgov/common-service-showcase/wiki/Redis-Troubleshooting**

If the cluster fails completely, we can uninstall Redis with the `helm delete redis-cluster -n {NAMESPACE}` command, run from the `/devops/helm/redis-cluster` folder. This removes the cluster but does not delete the PVCs, so when redis-cluster is installed again using the GHA from the previous step, it can be recovered with minimal or no data loss.

**Migration from Old Redis**

- Scale the old redis statefulset down to 0 pods ![image](https://github.com/user-attachments/assets/636c9f28-76a3-4b62-a088-c37664d75357)
- Install redis-cluster using the GHA `Redis Cluster - Install/Upgrade`.
- Deploy the release tag - this ensures all the applications get the updated redis host and password from the new redis. Once the deployment is successful, the API, queue-consumers and workers connections should work seamlessly.
- Backup and recovery of the redis keys from the old to the new redis is currently not requested, but it can be done by port-forwarding the existing redis locally, backing it up and restoring into the new redis-cluster.

**Rollback Procedures**

- During rollback, scale the newly created redis-cluster statefulset down to 0 pods ![image](https://github.com/user-attachments/assets/0d68f9ec-fefb-446d-a03c-c9c2c632237a)
- Scale the old redis from 0 to 6 ![image](https://github.com/user-attachments/assets/404ba468-0197-48a6-9433-a46e3b505b94)
- Continue with the rollback steps in the release notes.

**Note:** Once the deployment is complete and the redis-cluster is in place, the wiki will be updated.
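
The RDB trade-off described in the PROS and CONS above can be made concrete with a small sketch. It takes the three save points from the table and estimates how long a given number of changes may sit unsaved before a snapshot is triggered. The function name `seconds_until_snapshot` and the simplified model (changes counted since the last snapshot, no further writes arriving) are assumptions for illustration, not how Redis is implemented internally.

```python
# Save points from the table in the commit message: (interval_seconds, min_changes).
SAVE_POINTS = [(900, 1), (300, 10), (60, 10_000)]


def seconds_until_snapshot(changes, save_points=SAVE_POINTS):
    """Return the shortest interval whose change threshold is met.

    Simplified model (an assumption): `changes` writes happened right after
    the last snapshot and nothing else arrives afterwards. Returns None when
    no save point applies, i.e. there is nothing to snapshot.
    """
    eligible = [secs for secs, min_changes in save_points if changes >= min_changes]
    return min(eligible) if eligible else None


if __name__ == "__main__":
    for n in (1, 9, 10, 10_000):
        print(f"{n} changes -> snapshot after up to {seconds_until_snapshot(n)} s")
```

Under this model, 9 changes wait up to 900 seconds while 10 changes wait up to 300 seconds, which matches the 15-minute exposure described for RDB. With AOF enabled as well, the exposure is bounded by the AOF sync interval instead.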
bcgov · Feb 5, 2025 · 129f923
1 parent c7a35a5
commit 129f923
Showing 9 changed files with 150 additions and 132 deletions.
diff --git a/.github/workflows/env-setup-sysdig-teams.yml b/.github/workflows/env-setup-sysdig-teams.yml
@@ -22,7 +22,7 @@ jobs:
       - name: Log in to OpenShift
         run: |
           oc login --token=${{ secrets.SA_TOKEN }} --server=${{ vars.OPENSHIFT_CLUSTER_URL }}
-      - name: Delete Redis
+      - name: Updating Sysdig Team
         working-directory: "./devops/"
         run: |
           make update-sysdig-team
diff --git a/devops/helm/redis-cluster/templates/_helpers.tpl b/devops/helm/redis-cluster/templates/_helpers.tpl
@@ -167,7 +167,7 @@ Return Redis® password
 {{- else if not (empty .Values.password) -}}
     {{- .Values.password -}}
 {{- else -}}
-    {{- randAlphaNum 10 -}}
+    {{- randAlphaNum 32 -}}
 {{- end -}}
 {{- end -}}
 

diff --git a/devops/helm/redis-cluster/templates/configmap.yaml b/devops/helm/redis-cluster/templates/configmap.yaml
@@ -1396,7 +1396,7 @@ data:
     #
     # Please check https://redis.io/topics/persistence for more information.
 
-    appendonly no
+    appendonly yes
 
     # The base name of the append only file.
     #

diff --git a/devops/helm/redis-cluster/values-0c27fb-dev.yaml b/devops/helm/redis-cluster/values-0c27fb-dev.yaml
@@ -1,40 +1,43 @@
+persistence:
+  size: 1Gi
+
 volumePermissions:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param volumePermissions.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 250m
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 100m
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 redis:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param redis.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 ## Cluster update job settings
 updateJob:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param updateJob.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
diff --git a/devops/helm/redis-cluster/values-0c27fb-prod.yaml b/devops/helm/redis-cluster/values-0c27fb-prod.yaml
@@ -1,40 +1,43 @@
+persistence:
+  size: 1Gi
+
 volumePermissions:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param volumePermissions.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 250m
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 100m
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 redis:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param redis.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 ## Cluster update job settings
 updateJob:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param updateJob.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
diff --git a/devops/helm/redis-cluster/values-0c27fb-test.yaml b/devops/helm/redis-cluster/values-0c27fb-test.yaml
@@ -1,40 +1,43 @@
+persistence:
+  size: 1Gi
+
 volumePermissions:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param volumePermissions.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 250m
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 100m
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 redis:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param redis.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 ## Cluster update job settings
 updateJob:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param updateJob.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
diff --git a/devops/helm/redis-cluster/values-a6ef19-dev.yaml b/devops/helm/redis-cluster/values-a6ef19-dev.yaml
@@ -1,13 +1,16 @@
+persistence:
+  size: 1Gi
+
 volumePermissions:
   # resourcesPreset: "nano"
   ## @param volumePermissions.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
   resources:
     requests:
-      cpu: 250m
-      memory: 512Mi
+      cpu: 500m
+      memory: 1024Mi
     limits:
-      cpu: 100m
+      cpu: 500m
       memory: 1024Mi
 
   # resources: {}
@@ -17,10 +20,10 @@ redis:
   ## Example:
   resources:
     requests:
-      cpu: 1
-      memory: 512Mi
+      cpu: 500m
+      memory: 1024Mi
     limits:
-      cpu: 2
+      cpu: 500m
       memory: 1024Mi
 
   # resources: {}
@@ -31,10 +34,10 @@ updateJob:
   ## Example:
   resources:
     requests:
-      cpu: 1
-      memory: 512Mi
+      cpu: 500m
+      memory: 1024Mi
     limits:
-      cpu: 2
+      cpu: 500m
       memory: 1024Mi
 
   # resources: {}
diff --git a/devops/helm/redis-cluster/values-a6ef19-prod.yaml b/devops/helm/redis-cluster/values-a6ef19-prod.yaml
@@ -1,40 +1,43 @@
+persistence:
+  size: 1Gi
+
 volumePermissions:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param volumePermissions.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 250m
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 100m
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 redis:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param redis.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}
 ## Cluster update job settings
 updateJob:
-  resourcesPreset: "nano"
+  # resourcesPreset: "nano"
   ## @param updateJob.resources Set container requests and limits for different resources like CPU or memory (essential for production workloads)
   ## Example:
-  # resources:
-  #   requests:
-  #     cpu: 1
-  #     memory: 512Mi
-  #   limits:
-  #     cpu: 2
-  #     memory: 1024Mi
+  resources:
+    requests:
+      cpu: 500m
+      memory: 1024Mi
+    limits:
+      cpu: 500m
+      memory: 1024Mi
 
   # resources: {}