From 35471b804af4c83c267a956004e8dd7d142074dd Mon Sep 17 00:00:00 2001
From: Steve Hipwell
Date: Fri, 25 Oct 2024 15:58:34 +0100
Subject: [PATCH] feat(chart): Improved default security context

Signed-off-by: Steve Hipwell
---
 charts/karpenter-crd/README.md             |  8 ++++-
 charts/karpenter/README.md                 | 25 +++++++++-------
 charts/karpenter/templates/deployment.yaml | 23 +++++++++++----
 charts/karpenter/values.yaml               | 34 ++++++++++++++--------
 4 files changed, 61 insertions(+), 29 deletions(-)

diff --git a/charts/karpenter-crd/README.md b/charts/karpenter-crd/README.md
index 566d9a7efa12..212525cb3292 100644
--- a/charts/karpenter-crd/README.md
+++ b/charts/karpenter-crd/README.md
@@ -1,6 +1,6 @@
 # karpenter-crd

-![Version: 0.36.0](https://img.shields.io/badge/Version-0.36.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.36.0](https://img.shields.io/badge/AppVersion-0.36.0-informational?style=flat-square)
+![Version: 1.1.1](https://img.shields.io/badge/Version-1.1.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.1](https://img.shields.io/badge/AppVersion-1.1.1-informational?style=flat-square)

 A Helm chart for Karpenter Custom Resource Definitions (CRDs).

@@ -10,6 +10,12 @@ A Helm chart for Karpenter Custom Resource Definitions (CRDs).

 *

+## Values
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| additionalAnnotations | object | `{}` | Additional annotations for the custom resource definitions. |
+
 ----------------------------------------------
 Autogenerated from chart metadata using [helm-docs](https://github.com/norwoodj/helm-docs/).
diff --git a/charts/karpenter/README.md b/charts/karpenter/README.md
index ac73e80eda54..8d0287d01d5d 100644
--- a/charts/karpenter/README.md
+++ b/charts/karpenter/README.md
@@ -45,15 +45,18 @@ cosign verify public.ecr.aws/karpenter/karpenter:1.1.1 \
 | additionalLabels | object | `{}` | Additional labels to add into metadata. |
 | affinity | object | `{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"karpenter.sh/nodepool","operator":"DoesNotExist"}]}]}},"podAntiAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":[{"topologyKey":"kubernetes.io/hostname"}]}}` | Affinity rules for scheduling the pod. If an explicit label selector is not provided for pod affinity or pod anti-affinity one will be created from the pod selector labels. |
 | controller.containerName | string | `"controller"` | Distinguishing container name (containerName: karpenter-controller). |
-| controller.env | list | `[]` | Additional environment variables for the controller pod. |
+| controller.env | list | `[]` | Additional environment variables for the controller container. |
 | controller.envFrom | list | `[]` | |
-| controller.extraVolumeMounts | list | `[]` | Additional volumeMounts for the controller pod. |
+| controller.extraVolumeMounts | list | `[]` | Additional volumeMounts for the controller container. |
 | controller.healthProbe.port | int | `8081` | The container port to use for http health probe. |
-| controller.image.digest | string | `"sha256:fe383abf1dbc79f164d1cbcfd8edaaf7ce97a43fbd6cb70176011ff99ce57523"` | SHA256 digest of the controller image. |
+| controller.image.digest | string | `"sha256:51bca600197c7c6e6e0838549664b2c12c3f8dd4b23744ab28202ae97ca5aed1"` | SHA256 digest of the controller image. |
 | controller.image.repository | string | `"public.ecr.aws/karpenter/controller"` | Repository path to the controller image. |
 | controller.image.tag | string | `"1.1.1"` | Tag of the controller image. |
 | controller.metrics.port | int | `8080` | The container port to use for metrics. |
-| controller.resources | object | `{}` | Resources for the controller pod. |
+| controller.resources | object | `{}` | Resources for the controller container. |
+| controller.securityContext.appArmorProfile | object | `nil` | The AppArmor options to use by the controller container. |
+| controller.securityContext.seLinuxOptions | object | `nil` | The SELinux context to be applied to the controller container. |
+| controller.securityContext.seccompProfile | object | `{"type":"RuntimeDefault"}` | The seccomp options to use by the controller container. |
 | controller.sidecarContainer | list | `[]` | Additional sidecarContainer config |
 | controller.sidecarVolumeMounts | list | `[]` | Additional volumeMounts for the sidecar - this will be added to the volume mounts on top of extraVolumeMounts |
 | dnsConfig | object | `{}` | Configure DNS Config for the pod |
@@ -72,7 +75,7 @@ cosign verify public.ecr.aws/karpenter/karpenter:1.1.1 \
 | podDisruptionBudget.maxUnavailable | int | `1` | |
 | podDisruptionBudget.name | string | `"karpenter"` | |
 | podLabels | object | `{}` | Additional labels for the pod. |
-| podSecurityContext | object | `{"fsGroup":65532}` | SecurityContext for the pod. |
+| podSecurityContext | object | `{"fsGroup":65532,"runAsNonRoot":true,"seccompProfile":{"type":"RuntimeDefault"}}` | SecurityContext for the pod. |
 | priorityClassName | string | `"system-cluster-critical"` | PriorityClass name for the pod. |
 | replicas | int | `2` | Number of replicas. |
 | revisionHistoryLimit | int | `10` | The number of old ReplicaSets to retain to allow rollback. |
@@ -88,15 +91,15 @@ cosign verify public.ecr.aws/karpenter/karpenter:1.1.1 \
 | settings.batchIdleDuration | string | `"1s"` | The maximum amount of time with no new ending pods that if exceeded ends the current batching window. If pods arrive faster than this time, the batching window will be extended up to the maxDuration. If they arrive slower, the pods will be batched separately. |
 | settings.batchMaxDuration | string | `"10s"` | The maximum length of a batch window. The longer this is, the more pods we can consider for provisioning at one time which usually results in fewer but larger nodes. |
 | settings.clusterCABundle | string | `""` | Cluster CA bundle for TLS configuration of provisioned nodes. If not set, this is taken from the controller's TLS configuration for the API server. |
-| settings.clusterEndpoint | string | `""` | Cluster endpoint. If not set, will be discovered during startup (EKS only) |
+| settings.clusterEndpoint | string | `""` | Cluster endpoint. If not set, will be discovered during startup (EKS only). |
 | settings.clusterName | string | `""` | Cluster name. |
-| settings.eksControlPlane | bool | `false` | Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API |
-| settings.featureGates | object | `{"nodeRepair":false,"spotToSpotConsolidation":false}` | Feature Gate configuration values. Feature Gates will follow the same graduation process and requirements as feature gates in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features |
+| settings.eksControlPlane | bool | `false` | Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API. |
+| settings.featureGates | object | `{"nodeRepair":false,"spotToSpotConsolidation":false}` | Feature Gate configuration values. Feature Gates will follow the same graduation process and requirements as feature gates in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features. |
 | settings.featureGates.nodeRepair | bool | `false` | nodeRepair is ALPHA and is disabled by default. Setting this to true will enable node repair. |
 | settings.featureGates.spotToSpotConsolidation | bool | `false` | spotToSpotConsolidation is ALPHA and is disabled by default. Setting this to true will enable spot replacement consolidation for both single and multi-node consolidation. |
-| settings.interruptionQueue | string | `""` | Interruption queue is the name of the SQS queue used for processing interruption events from EC2 Interruption handling is disabled if not specified. Enabling interruption handling may require additional permissions on the controller service account. Additional permissions are outlined in the docs. |
-| settings.isolatedVPC | bool | `false` | If true then assume we can't reach AWS services which don't have a VPC endpoint This also has the effect of disabling look-ups to the AWS pricing endpoint |
-| settings.reservedENIs | string | `"0"` | Reserved ENIs are not included in the calculations for max-pods or kube-reserved This is most often used in the VPC CNI custom networking setup https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html |
+| settings.interruptionQueue | string | `""` | Interruption queue is the name of the SQS queue used for processing interruption events from EC2. Interruption handling is disabled if not specified. Enabling interruption handling may require additional permissions on the controller service account. Additional permissions are outlined in the docs. |
+| settings.isolatedVPC | bool | `false` | If true then assume we can't reach AWS services which don't have a VPC endpoint. This also has the effect of disabling look-ups to the AWS pricing endpoint. |
+| settings.reservedENIs | string | `"0"` | Reserved ENIs are not included in the calculations for max-pods or kube-reserved. This is most often used in the VPC CNI custom networking setup https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html. |
 | settings.vmMemoryOverheadPercent | float | `0.075` | The VM memory overhead as a percent that will be subtracted from the total memory for all instance types. The value of `0.075` equals to 7.5%. |
 | strategy | object | `{"rollingUpdate":{"maxUnavailable":1}}` | Strategy for updating the pod. |
 | terminationGracePeriodSeconds | string | `nil` | Override the default termination grace period for the pod. |
diff --git a/charts/karpenter/templates/deployment.yaml b/charts/karpenter/templates/deployment.yaml
index 990ce486292e..53305c08e253 100644
--- a/charts/karpenter/templates/deployment.yaml
+++ b/charts/karpenter/templates/deployment.yaml
@@ -62,16 +62,29 @@ spec:
       containers:
         - name: {{ .Values.controller.containerName | default "controller" }}
           securityContext:
+            privileged: false
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            runAsNonRoot: true
             runAsUser: 65532
             runAsGroup: 65532
-            runAsNonRoot: true
-            seccompProfile:
-              type: RuntimeDefault
-            allowPrivilegeEscalation: false
             capabilities:
               drop:
                 - ALL
-            readOnlyRootFilesystem: true
+          {{- with .Values.controller.securityContext }}
+            {{- with .appArmorProfile }}
+            appArmorProfile:
+              {{- toYaml . | nindent 14}}
+            {{- end }}
+            {{- with .seLinuxOptions }}
+            seLinuxOptions:
+              {{- toYaml . | nindent 14}}
+            {{- end }}
+            {{- with .seccompProfile }}
+            seccompProfile:
+              {{- toYaml . | nindent 14}}
+            {{- end }}
+          {{- end }}
           image: {{ include "karpenter.controller.image" . }}
           imagePullPolicy: {{ .Values.imagePullPolicy }}
           env:
diff --git a/charts/karpenter/values.yaml b/charts/karpenter/values.yaml
index f4434266b5c8..34096a352983 100644
--- a/charts/karpenter/values.yaml
+++ b/charts/karpenter/values.yaml
@@ -51,7 +51,10 @@ podDisruptionBudget:
   maxUnavailable: 1
 # -- SecurityContext for the pod.
 podSecurityContext:
+  runAsNonRoot: true
   fsGroup: 65532
+  seccompProfile:
+    type: RuntimeDefault
 # -- PriorityClass name for the pod.
 priorityClassName: system-cluster-critical
 # -- Override the default termination grace period for the pod.
@@ -111,12 +114,20 @@ controller:
     tag: 1.1.1
     # -- SHA256 digest of the controller image.
     digest: sha256:51bca600197c7c6e6e0838549664b2c12c3f8dd4b23744ab28202ae97ca5aed1
-  # -- Additional environment variables for the controller pod.
+  securityContext:
+    # -- (object) The AppArmor options to use by the controller container.
+    appArmorProfile:
+    # -- (object) The SELinux context to be applied to the controller container.
+    seLinuxOptions:
+    # -- The seccomp options to use by the controller container.
+    seccompProfile:
+      type: RuntimeDefault
+  # -- Additional environment variables for the controller container.
   env: []
   # - name: AWS_REGION
   #   value: eu-west-1
   envFrom: []
-  # -- Resources for the controller pod.
+  # -- Resources for the controller container.
   resources: {}
   # We usually recommend not to specify default resources and to leave this as a conscious
   # choice for the user. This also increases chances charts run on environments with little
@@ -128,8 +139,7 @@ controller:
   # limits:
   #   cpu: 1
   #   memory: 1Gi
-
-  # -- Additional volumeMounts for the controller pod.
+  # -- Additional volumeMounts for the controller container.
   extraVolumeMounts: []
   # - name: aws-iam-token
   #   mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
@@ -165,24 +175,24 @@ settings:
   clusterCABundle: ""
   # -- Cluster name.
   clusterName: ""
-  # -- Cluster endpoint. If not set, will be discovered during startup (EKS only)
+  # -- Cluster endpoint. If not set, will be discovered during startup (EKS only).
   clusterEndpoint: ""
-  # -- If true then assume we can't reach AWS services which don't have a VPC endpoint
-  # This also has the effect of disabling look-ups to the AWS pricing endpoint
+  # -- If true then assume we can't reach AWS services which don't have a VPC endpoint.
+  # This also has the effect of disabling look-ups to the AWS pricing endpoint.
   isolatedVPC: false
-  # Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API
+  # -- Marking this true means that your cluster is running with an EKS control plane and Karpenter should attempt to discover cluster details from the DescribeCluster API.
   eksControlPlane: false
   # -- The VM memory overhead as a percent that will be subtracted from the total memory for all instance types. The value of `0.075` equals to 7.5%.
   vmMemoryOverheadPercent: 0.075
-  # -- Interruption queue is the name of the SQS queue used for processing interruption events from EC2
+  # -- Interruption queue is the name of the SQS queue used for processing interruption events from EC2.
   # Interruption handling is disabled if not specified. Enabling interruption handling may
   # require additional permissions on the controller service account. Additional permissions are outlined in the docs.
   interruptionQueue: ""
-  # -- Reserved ENIs are not included in the calculations for max-pods or kube-reserved
-  # This is most often used in the VPC CNI custom networking setup https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html
+  # -- Reserved ENIs are not included in the calculations for max-pods or kube-reserved.
+  # This is most often used in the VPC CNI custom networking setup https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html.
   reservedENIs: "0"
   # -- Feature Gate configuration values. Feature Gates will follow the same graduation process and requirements as feature gates
-  # in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features
+  # in Kubernetes. More information here https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-alpha-or-beta-features.
   featureGates:
     # -- spotToSpotConsolidation is ALPHA and is disabled by default.
     # Setting this to true will enable spot replacement consolidation for both single and multi-node consolidation.