[Suggestion] Kubernetes Security Roadmap #1054
Description:
The CNCF Kubernetes community has made substantial progress in providing guidance for hardening Kubernetes. These documents are extremely valuable, but they can leave the reader facing what feels like an insurmountable mountain to climb. To start breaking that perception down, I have written down my thoughts on how to approach hardening your Kubernetes clusters.
Impact:
To make the existing guidance more actionable, this provides a more distilled body of content (it is not an exhaustive hardening guide), divided into three phases: Quick Wins, Next Steps and Research.
Scope: "not yet determined"
Additional info:
Quick Wins
These are hardening steps you would ideally have in place from day one. Adding them gives you a good start on securing your Kubernetes cluster.
Enable API Server Auditing
The audit log records requests made to the API server, which can be used for post-incident investigations and compliance, so you can answer who did what and when. Also monitor the logs for anomalous or unwanted API calls. For example, a “Forbidden” status (HTTP 403) indicates an authorization failure and could mean that an attacker is trying to abuse stolen credentials.
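As a rough illustration of monitoring for “Forbidden” responses, the sketch below counts HTTP 403 events per user. It assumes the audit backend writes one JSON event per line with the `user.username` and `responseStatus.code` fields of the `audit.k8s.io/v1` Event format. The function name and the sample events are hypothetical.

```python
import json
from collections import Counter


def forbidden_by_user(lines):
    """Count audit events that ended in HTTP 403, grouped by username."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        status = event.get("responseStatus") or {}
        if status.get("code") == 403:
            user = (event.get("user") or {}).get("username", "<unknown>")
            counts[user] += 1
    return counts


# Hypothetical sample events, trimmed to the fields used above.
sample = [
    '{"verb": "list", "user": {"username": "alice"}, "responseStatus": {"code": 200}}',
    '{"verb": "get", "user": {"username": "mallory"}, "responseStatus": {"code": 403}}',
    '{"verb": "delete", "user": {"username": "mallory"}, "responseStatus": {"code": 403}}',
]
print(forbidden_by_user(sample).most_common())
```

In practice this kind of check would more likely live in your log pipeline or SIEM; the point is only that a spike of 403s for one identity is worth an alert.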
IAM & RBAC
Integrate an IAM authentication service with your cluster. The pattern is likely to consist of your existing OIDC or LDAP server, which then lets you grant users access to the cluster based on group membership. Include all cluster clients in this pattern; you may implement service accounts for infrastructure users such as nodes and proxies. See the authentication reference document for more information.
With authentication in place, we can also authorise every API call. Kubernetes has an integrated Role-Based Access Control (RBAC) component that should be used to match an incoming user or group to a set of permissions stored in roles. The built-in roles strike a good balance between flexibility and common use cases, but more limited roles should be considered to prevent accidental escalation and maintain a least-privilege posture. See the authorization reference section for more information.
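To make the "more limited roles" idea concrete, here is a minimal sketch that builds a read-only Role and binds it to a group coming from your identity provider. The manifests are plain dictionaries printed as JSON, which `kubectl apply -f` accepts. The namespace, role name and group name are assumptions chosen for illustration.

```python
import json


def pod_reader_role(namespace, name="pod-reader"):
    """Role that can only read pods and their logs in one namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods", "pods/log"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def bind_role_to_group(namespace, role_name, group, binding_name):
    """RoleBinding that grants the Role to every member of a group."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": binding_name, "namespace": namespace},
        "subjects": [
            {"kind": "Group", "name": group, "apiGroup": "rbac.authorization.k8s.io"}
        ],
        "roleRef": {
            "kind": "Role",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


# Hypothetical namespace and OIDC group.
role = pod_reader_role("team-a")
binding = bind_role_to_group("team-a", "pod-reader", "oidc:team-a-devs", "team-a-pod-readers")
print(json.dumps([role, binding], indent=2))
```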
Container hygiene
Container images should only contain what is necessary to run the application they package. With the release of ephemeral debug containers, troubleshooting utilities can be removed from the application container. Distroless images have minimal packages and do not include a shell. Statically compiled languages like Go can use the scratch image, an empty image containing only the application binary. Also build your images to start as an unprivileged user.
Scan container images to prevent critical vulnerabilities from being deployed to the cluster. This practice should be implemented in your CI/CD pipeline (Shift Left), so that vulnerabilities are patched before they are released to production.
Control pod resources with resource limits. Set a memory limit for the application that is equal to its memory request (a limit cannot be lower than the request). A CPU limit should also be set according to the application's requirements. Without limits, you are exposed to DoS attacks from breached or malicious pods.
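A check like the following could run in CI to flag containers that lack limits. It is a minimal sketch: `resource_findings` is a hypothetical helper, it works on a PodSpec loaded as a dictionary, and it compares quantities as strings, so `512Mi` and `0.5Gi` would be reported as different.

```python
def resource_findings(pod_spec):
    """Return human-readable findings for containers with weak resource settings."""
    findings = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}

        if "memory" not in limits:
            findings.append(f"{name}: no memory limit")
        # When no request is given, Kubernetes defaults it to the limit.
        elif requests.get("memory", limits["memory"]) != limits["memory"]:
            findings.append(f"{name}: memory limit differs from request")

        if "cpu" not in limits:
            findings.append(f"{name}: no cpu limit")
    return findings


# Hypothetical PodSpec fragment.
spec = {
    "containers": [
        {
            "name": "web",
            "resources": {
                "requests": {"cpu": "250m", "memory": "256Mi"},
                "limits": {"cpu": "500m", "memory": "256Mi"},
            },
        },
        {"name": "sidecar"},
    ]
}
print(resource_findings(spec))
```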
Protect Secrets
Secrets used by pods should be stored in the Kubernetes Secrets API and not in ConfigMaps. Secrets stored in etcd should be encrypted at rest. Secret volumes are backed by tmpfs, so they are not written to the node's disk; if a secret is copied into an emptyDir volume, keep it in memory by setting the emptyDir.medium option to Memory.
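For encryption at rest, the API server reads an EncryptionConfiguration file passed with `--encryption-provider-config`. The sketch below generates one with a random 32-byte key for the `aescbc` provider. The key name is an assumption, and in production a KMS provider is generally preferable to a key stored in a local file.

```python
import base64
import json
import os


def encryption_config(key_name="key1"):
    """Build an EncryptionConfiguration that encrypts Secrets with a fresh AES key."""
    key = base64.b64encode(os.urandom(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [
            {
                "resources": ["secrets"],
                "providers": [
                    # The first provider is used for writes.
                    {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                    # identity lets the API server still read data written before
                    # encryption was enabled.
                    {"identity": {}},
                ],
            }
        ],
    }


config = encryption_config()
# Print only the structure, not the key material.
print(json.dumps([list(p) for p in config["resources"][0]["providers"]]))
```

The resulting file contains the key, so it should be readable only by the user running the API server. Existing Secrets are only encrypted once they are rewritten.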
Next Steps
These steps could be added to the Quick Wins, but they do need a bit more consideration and testing. They should be on your cluster hardening implementation roadmap.
Further Container Hygiene
Pull the correct image, using hashes, signatures and authorised registries. Use the complete sha256 digest which is unique to the image manifest. Enforce this practice with an ImagePolicyWebhook. Image signatures can be verified with an admission controller at deployment.
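Before an admission controller is in place, a simple pipeline check can already catch unpinned images. This is a minimal sketch: `image_findings` and the registry name are hypothetical, and the registry is taken naively as everything before the first `/`.

```python
import re

# A full sha256 digest is 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def image_findings(image, allowed_registries):
    """Return findings for an image reference that is unpinned or from an unknown registry."""
    findings = []
    if not DIGEST_RE.search(image):
        findings.append("not pinned by sha256 digest")
    registry = image.split("/", 1)[0]
    if registry not in allowed_registries:
        findings.append(f"registry {registry!r} is not on the allow list")
    return findings


allowed = {"registry.example.com"}  # hypothetical authorised registry
pinned = "registry.example.com/shop/api@sha256:" + "a" * 64
print(image_findings(pinned, allowed))
print(image_findings("nginx:latest", allowed))
```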
Implement Pod Security Standards policies. There are three levels (privileged, baseline and restricted) that limit how the security fields can be set in the PodSpec. Phase in the implementation by using the modes enforce, audit and warn.
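With the built-in Pod Security admission controller, levels and modes are set as namespace labels. The sketch below shows one possible phased rollout: enforce baseline now while auditing and warning against restricted. The helper name and namespace are assumptions.

```python
import json

LEVELS = {"privileged", "baseline", "restricted"}


def namespace_with_pss(name, enforce="baseline", audit="restricted", warn="restricted"):
    """Namespace manifest labelled for Pod Security admission."""
    for level in (enforce, audit, warn):
        if level not in LEVELS:
            raise ValueError(f"unknown Pod Security Standards level: {level}")
    prefix = "pod-security.kubernetes.io"
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                f"{prefix}/enforce": enforce,  # violating pods are rejected
                f"{prefix}/audit": audit,      # violations are recorded in the audit log
                f"{prefix}/warn": warn,        # violations return a warning to the user
            },
        },
    }


ns = namespace_with_pss("team-a")  # hypothetical namespace
print(json.dumps(ns, indent=2))
```

Once the audit log and warnings are clean, the enforce label can be raised to restricted.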
Further Secrets Protection
Inject secrets from third-party storage as a volume, using the Secrets Store CSI Driver. This is preferred to the approach where the pod's service account has RBAC access to secrets. With the latter approach, secrets could be mounted as environment variables or files. The environment variable method is particularly prone to leakage, because crash dumps can end up in logs and environment variables are non-confidential by nature, whereas access to files can be controlled via permissions.
Service account tokens should not be mounted into pods that do not require them. This can be configured by setting automountServiceAccountToken to false either at service account or at the pod level.
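The field sits at the top level of a ServiceAccount and under `spec` for a Pod. A minimal sketch of setting it on either kind of manifest follows; `disable_token_automount` and the manifest names are hypothetical.

```python
import copy


def disable_token_automount(manifest):
    """Return a copy of a ServiceAccount or Pod manifest with token automount disabled."""
    out = copy.deepcopy(manifest)
    kind = out.get("kind")
    if kind == "ServiceAccount":
        out["automountServiceAccountToken"] = False
    elif kind == "Pod":
        out.setdefault("spec", {})["automountServiceAccountToken"] = False
    else:
        raise ValueError(f"unsupported kind: {kind}")
    return out


sa = {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": {"name": "batch-job"}}
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "worker"},
    "spec": {"containers": [{"name": "worker", "image": "registry.example.com/worker"}]},
}
print(disable_token_automount(sa))
print(disable_token_automount(pod)["spec"]["automountServiceAccountToken"])
```

If both are set, the value on the pod takes precedence over the one on the service account.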
Kubernetes 1.26 added encryption-at-rest for extension APIs defined in CustomResourceDefinitions, so now you can encrypt custom resources.
Network security
Implement default network policies that block all egress and ingress for all pods in each namespace. An allow-list approach can then be used to ensure all pod egress and ingress traffic is controlled. This also protects against pods accessing the cloud metadata API at 169.254.169.254, which may leak information.
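A default-deny policy is small: an empty pod selector matches every pod in the namespace, and listing both policy types with no rules blocks all traffic in both directions. The sketch below generates one per namespace; the policy name and namespaces are assumptions, and the cluster's CNI plugin must support NetworkPolicy for it to take effect.

```python
import json


def default_deny(namespace, name="default-deny-all"):
    """NetworkPolicy that blocks all ingress and egress for every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
            # No ingress or egress rules are listed, so nothing is allowed.
        },
    }


policies = [default_deny(ns_name) for ns_name in ("team-a", "team-b")]  # hypothetical namespaces
print(json.dumps(policies, indent=2))
```

Note that this also blocks DNS lookups, so an explicit egress rule allowing DNS is usually among the first allow-list entries to add.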
The control plane components such as the API Server, etcd datastore, and Kubernetes Dashboard should not be publicly exposed on the Internet. Furthermore, TLS/mTLS should be used to communicate with them.
Access to the kubelet API should be restricted and not publicly exposed, because the defaults are overly permissive.
Restrict the use of LoadBalancer and ExternalIPs. See CVE-2020-8554 (man in the middle using LoadBalancer or ExternalIPs) and the DenyServiceExternalIPs admission controller for further information.
Research
These recommendations are usually implemented only for specific use cases. They didn’t make it to the previous two lists, because they may have dependencies on the underlying container host OS.
Container Hygiene (Again 😊)
Use seccomp, AppArmor or SELinux where appropriate to reduce the Linux kernel attack surface that containers can reach, such as system calls. The Kubernetes project provides tutorials and guides for implementing these, e.g. seccomp profiles, enabling AppArmor in Kubernetes and assigning SELinux labels to pods or containers.
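For seccomp, the lowest-effort starting point is the container runtime's default profile, selected through the container's securityContext. The sketch below applies it together with a few related hardening fields. `harden_container` is a hypothetical helper, and the extra fields are my own choice of companions rather than something seccomp requires.

```python
import copy


def harden_container(container):
    """Return a copy of a container spec with a conservative securityContext."""
    out = copy.deepcopy(container)
    ctx = out.setdefault("securityContext", {})
    # Use the runtime's default seccomp profile unless one is already set.
    ctx.setdefault("seccompProfile", {"type": "RuntimeDefault"})
    ctx["allowPrivilegeEscalation"] = False
    ctx["runAsNonRoot"] = True
    ctx["capabilities"] = {"drop": ["ALL"]}
    return out


# Hypothetical container spec.
container = {"name": "api", "image": "registry.example.com/shop/api"}
print(harden_container(container)["securityContext"])
```

A custom profile would instead use type `Localhost` with a `localhostProfile` path that exists on every node, which is where the dependency on the host OS comes in.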
Use cases that involve multi-tenant clusters, with applications that require the highest levels of isolation, will require container sandboxes. Container sandboxes add an isolation layer between the container and the host kernel, which makes container breakout and kernel exploits much harder. Examples of these are Kata Containers, gVisor and Firecracker.
Security teams may need the ability to detect and respond to threats in real time at container runtime. Implementations in this space typically leverage eBPF to monitor low-level system calls for events such as a shell being spawned inside a container, container mounts of sensitive host paths, access to sensitive files and outbound network connections being established.
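In Kubernetes, a sandboxed runtime is selected per pod through a RuntimeClass. The sketch below assumes the nodes' container runtime has already been configured with a gVisor handler called `runsc`; the handler name, class name and pod are assumptions that depend on how your nodes are set up.

```python
import json


def runtime_class(name, handler):
    """RuntimeClass that maps a name to a handler configured in the node's container runtime."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def sandboxed_pod(name, image, runtime_class_name):
    """Pod that asks to run under the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": name, "image": image}],
        },
    }


rc = runtime_class("gvisor", "runsc")  # assumed handler name
sandbox = sandboxed_pod("untrusted-job", "registry.example.com/jobs/untrusted", "gvisor")
print(json.dumps([rc, sandbox], indent=2))
```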
Further Network Security
By design, Kubernetes allows any pod to access any other pod in the cluster via unencrypted connections. Many CNIs can provide transport encryption, and network policies, as discussed above, can control pod-to-pod connections. Zero Trust and API Security requirements, however, are use cases that would typically require some form of Service Mesh implementation. One could separate Service Mesh and API Gateway (API Security), but some mature Service Meshes, such as Istio, have API Gateway capabilities out of the box. A Service Mesh also provides additional important capabilities such as multi-cluster traffic control and cluster observability.
Final Thoughts
Managed Kubernetes providers, whether cloud service providers or others, do offer consumers the promise of having their Kubernetes risk mitigated, but we should still measure our Kubernetes security posture and know where it is lacking. Many organisations need to build their own (BYO) Kubernetes, where much of the cluster's security is implemented at the build stage and requires ongoing maintenance. There are numerous BYO guides, and the Kubernetes Cluster API (CAPI) project is a good enabler.
Look after your T-zone. As usual, security requires defense in depth. The topics in this blog touch each of the layers in the T-zone, which is horizontal (the 4Cs of Cloud Native Security) and vertical (Platform Delivery).
Hopefully this blog allows you to start your Kubernetes hardening journey, wherever your Kubernetes clusters are running.
Further Reading
CNCF Kubernetes Security Whitepaper
CNCF Kubernetes Security Overview
CNCF Kubernetes Securing a Cluster
CNCF Kubernetes Security Checklist
OWASP Kubernetes Security Cheat Sheet