As part of the existing Redis files removal, these files should be removed in a new pull request once the new redis-cluster is deployed successfully.

<img width="320" alt="image" src="https://github.com/user-attachments/assets/db64333f-82c7-41f4-a189-f7a27809584b" />
<img width="312" alt="image" src="https://github.com/user-attachments/assets/c270cfc0-c1f5-4a45-b8a8-90489db9e689" />

- The PVC is updated to 1GB, in sync with the existing PROD size.
- A service account is required while creating the cluster, as it gives the Redis pods the necessary RBAC (Role-Based Access Control) permissions to interact with the other objects created during installation, such as the secrets, configmaps and PVCs.
- The existing Makefile commands for the old Redis are removed from the devops folder, as we use a Helm installation for the new Redis, available in the devops/helm/redis-cluster folder.
- Redis creds will have a 32-character alphanumeric generated password, like the previously generated ones.

**AOF vs RDB**

As part of the analysis, finding the right persistence mechanism for our project was crucial. After checking the official documentation at https://redis.io/docs/latest/operate/oss_and_stack/management/persistence/, here are the findings.

**Common functionality of AOF and RDB, and how it is used during disaster recovery**

- We have enabled a PVC for our DB, so both the AOF and RDB files get saved to it.
- Even if we uninstall the Helm chart, the PVCs stay. When the chart is installed again with a different version, or after disaster recovery, the existing PVC is reconnected automatically by the current Helm configuration, so there is no loss of data.

**AOF**

AOF appends every write operation to files on disk. It usually maintains a series of files: a base file, incremental update files and a manifest file. This can be verified by running the command below in any of the redis-cluster pods:

```
$ cat /opt/bitnami/redis/etc/redis.conf | grep appendonly
appendonly yes
# For example, if appendfilename is set to appendonly.aof, the following file
# - appendonly.aof.1.base.rdb as a base file.
# - appendonly.aof.1.incr.aof, appendonly.aof.2.incr.aof as incremental files.
# - appendonly.aof.manifest as a manifest file.
appendfilename "appendonly.aof"
appenddirname "appendonlydir"
```

These files are present in the /bitnami/redis/data folder: appendonly.aof.1.base.rdb is the base file, appendonly.aof.1.incr.aof and appendonly.aof.2.incr.aof are the incremental files, and appendonly.aof.manifest is the manifest file, which holds the metadata/configuration of the AOF files. The reason there are two incremental files is AOF rewriting: when the base file needs to be replaced, a child process creates a new base AOF file while the parent keeps logging updates to a new incremental AOF; once the rewrite completes, Redis atomically updates the manifest and cleans up the old files to ensure a consistent dataset. This multi-part AOF is a Redis 7+ feature, and as we are using 7.4.2-debian-12-r0, it is available.

PROS and CONS: The main downside of AOF is recovery time: because the files grow large due to the incremental updates, recovery takes longer. On the other hand, the data loss in case of a disaster is at most one second, which is achieved with the configuration below.
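Concretely, the one-second bound comes from the `appendfsync` directive. As a reference, this is what the directive looks like in redis.conf, assuming the chart keeps the Redis default of syncing once per second (the screenshot below shows the value in our actual config):

```
$ cat /opt/bitnami/redis/etc/redis.conf | grep appendfsync
# appendfsync always
appendfsync everysec
# appendfsync no
```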
<img width="248" alt="image" src="https://github.com/user-attachments/assets/b22b2ada-f175-46d9-a656-1b68f9619272" />

**RDB**

RDB is a single file containing a snapshot of the current dataset, more like a backup strategy that runs at configured intervals. It can be found by running the command below in any of the redis-cluster pods:

```
$ cat /opt/bitnami/redis/etc/redis.conf | grep dbfilename
# and 'dbfilename') and that aren't usually modified during runtime
dbfilename dump.rdb
# above using the 'dbfilename' configuration directive.
```

The file is present in /bitnami/redis/data, and dump.rdb contains the snapshot of the dataset. The snapshot schedule is set with the save configuration, as below.

<img width="549" alt="image" src="https://github.com/user-attachments/assets/13aa1f8a-1e09-4964-8349-50d059b84b46" />

| **Time Interval** | **Minimum Number of Changes** |
|-------------------|-------------------------------|
| 900 seconds (15 minutes) | 1 change |
| 300 seconds (5 minutes) | 10 changes |
| 60 seconds (1 minute) | 10,000 changes |

PROS and CONS: RDB can recover the data quickly, as it does not have to run through multiple files and the file size is relatively smaller than the AOF. The downside is the snapshot interval: with the current configuration, 10 changes are persisted only after 5 minutes, and a single change only after 15 minutes. So if there are only 9 data changes, the save to disk will take up to 15 minutes, and if a disaster happens during that window, those 9 changes are lost.

**Conclusion**

To get the best of both worlds, enabling RDB and AOF at the same time solves the recovery strategy. Also, after the implementation of the Helm installation for Redis, upgrades and full disaster recovery can be done via GitHub Actions.

**Installation and upgrade of redis**

Installing/upgrading the redis-cluster is handled by the GHA `Redis Cluster - Install/Upgrade`.

![image](https://github.com/user-attachments/assets/cc525402-70b2-437f-8531-2e3820415b30)

**Issues in the Redis Cluster**

Troubleshooting guides as per BC GOV are given clearly in the link below:

**https://github.com/bcgov/common-service-showcase/wiki/Redis-Troubleshooting**

Also, if the cluster fails completely, we can uninstall Redis with `helm delete redis-cluster -n {NAMESPACE}`, run from the `/devops/helm/redis-cluster` folder. This removes the cluster but does not delete the PVCs, so when the redis-cluster is reinstalled using the GHA from the previous steps, it can be recovered with minimal or no data loss.

**Migration from Old Redis**

- Scale the old Redis StatefulSet down to 0 pods.

![image](https://github.com/user-attachments/assets/636c9f28-76a3-4b62-a088-c37664d75357)

- Install the redis-cluster using the GHA `Redis Cluster - Install/Upgrade`.
- Deploy the release tag. This ensures all the applications get the updated Redis host and password from the new Redis, and once the deployment is successful, the API, queue-consumers and workers connections should work seamlessly.
- Currently, backup and recovery of the Redis keys from the old to the new Redis is not requested, but it can be done by port-forwarding the existing Redis locally, backing it up and restoring it into the new redis-cluster (see the sketch after this list).
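A minimal sketch of that manual key copy, using `redis-cli --cluster import`. The service names, local ports and password variables here are assumptions to adapt, not the final procedure:

```
# Port-forward the old standalone Redis and one node of the new cluster
# (service names are assumptions - check `kubectl get svc -n $NAMESPACE`).
kubectl -n "$NAMESPACE" port-forward svc/redis 6380:6379 &
kubectl -n "$NAMESPACE" port-forward svc/redis-cluster 6379:6379 &

# Copy every key from the old instance into the new cluster, keeping the
# source intact (--cluster-copy) and overwriting duplicates (--cluster-replace).
redis-cli -a "$NEW_REDIS_PASSWORD" --cluster import 127.0.0.1:6379 \
  --cluster-from 127.0.0.1:6380 \
  --cluster-from-pass "$OLD_REDIS_PASSWORD" \
  --cluster-copy --cluster-replace
```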
**Rollback Procedures**

- During rollback, the newly created redis-cluster StatefulSet should be scaled down to 0 pods.

![image](https://github.com/user-attachments/assets/0d68f9ec-fefb-446d-a03c-c9c2c632237a)

- Scale the old Redis StatefulSet back up from 0 to 6 pods (see the sketch below).

![image](https://github.com/user-attachments/assets/404ba468-0197-48a6-9433-a46e3b505b94)

- Continue the rollback steps in the release notes.

**Note:** Once the deployment is complete and the redis-cluster is in place, the wiki will be updated.
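For reference, the two scaling steps above can also be done from the CLI. The StatefulSet names here are assumptions, so list them first to confirm:

```
# StatefulSet names are assumptions - verify them before scaling.
kubectl -n "$NAMESPACE" get statefulset

# Scale the new redis-cluster down, then the old Redis back up.
kubectl -n "$NAMESPACE" scale statefulset/redis-cluster --replicas=0
kubectl -n "$NAMESPACE" scale statefulset/redis --replicas=6
```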