
constantly see "updating Ingress status" in controller logs without anything changing in the cluster #10972

Closed
lefterisALEX opened this issue Feb 6, 2024 · 11 comments
Labels
  needs-kind: Indicates a PR lacks a `kind/foo` label and requires one.
  needs-priority
  needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  triage/needs-information: Indicates an issue needs more information in order to work on it.

Comments

@lefterisALEX commented Feb 6, 2024

What happened:
After upgrading ingress-nginx from 1.4.0 to 1.8.5, we noticed a lot of these messages in the logs (not at debug level):

ingress-nginx-controller-84dd99f964-g9plj controller I0206 11:14:33.918571       7 status.go:303] "updating Ingress status" namespace="pctv-branch-tcloud" ingress="pctv-ingress" currentValue=[{"ip":"10.28.124.184"},{"ip":"10.28.85.49"},{"ip":"10.28.98.203"}] newValue=[{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]

We have dedicated ingress nodes. The newValue list contains the IPs of the ingress nodes, and the currentValue list the IPs of the worker nodes.
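For reference, the two sets of IPs can be compared directly (a sketch; kpn.org/role=ingress is the label we put on the dedicated ingress nodes, see the Helm values below):

# IPs currently published in the Ingress status
kubectl get ingress pctv-ingress -n pctv-branch-tcloud \
  -o jsonpath='{.status.loadBalancer.ingress[*].ip}'

# InternalIPs of the dedicated ingress nodes
kubectl get nodes -l kpn.org/role=ingress \
  -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'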

NodeName                                              ROLE      EC2-type      Instance-type   AZ           IP              CPU   Memory       PODS_number
fargate-ip-10-28-107-202.eu-west-1.compute.internal   <none>    <none>        <none>          eu-west-1a   <none>          2     7824320Ki    1
fargate-ip-10-28-114-72.eu-west-1.compute.internal    <none>    <none>        <none>          eu-west-1b   <none>          2     7824328Ki    1
ip-10-28-111-198.eu-west-1.compute.internal           ingress   c6i.large     spot            eu-west-1a   10.28.111.198   2     3922884Ki    110
ip-10-28-117-238.eu-west-1.compute.internal           system    c5.4xlarge    <none>          eu-west-1b   10.28.117.238   16    32040496Ki   110
ip-10-28-121-243.eu-west-1.compute.internal           ingress   c6i.large     spot            eu-west-1b   10.28.121.243   2     3922884Ki    110
ip-10-28-124-184.eu-west-1.compute.internal           <none>    c6a.4xlarge   spot            eu-west-1b   10.28.124.184   16    32153136Ki   110
ip-10-28-85-49.eu-west-1.compute.internal             <none>    c5.4xlarge    spot            eu-west-1c   10.28.85.49     16    31811124Ki   110
ip-10-28-86-64.eu-west-1.compute.internal             system    c5.4xlarge    <none>          eu-west-1c   10.28.86.64     16    32040496Ki   110
ip-10-28-88-240.eu-west-1.compute.internal            ingress   c6i.large     spot            eu-west-1c   10.28.88.240    2     3922884Ki    110
ip-10-28-98-203.eu-west-1.compute.internal            <none>    c5.4xlarge    spot            eu-west-1a   10.28.98.203    16    31811124Ki   110
ip-10-28-98-213.eu-west-1.compute.internal            system    c5.4xlarge    <none>          eu-west-1a   10.28.98.213    16    31811124Ki   110

What you expected to happen:

I do not expect to see this in the logs since nothing is changing. This is very likely triggering a constant reload of ingress-nginx, since new configuration is detected.
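If it helps, reload activity can be cross-checked against the controller's metrics endpoint (a sketch, assuming curl is available in the controller image; 10254 is the metrics port from the pod spec below, and both metrics are standard ingress-nginx reload counters):

kubectl exec -ti ingress-nginx-controller-84dd99f964-g9plj -n ingress-nginx -- \
  curl -s http://localhost:10254/metrics | grep -E "nginx_ingress_controller_(success|config_last_reload_successful)"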

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

 kubectl exec -ti ingress-nginx-controller-84dd99f964-89tlr -- bash -c "/nginx-ingress-controller --version"
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.8.5
  Build:         b5595e12928ac4b5de5dfd2ba9e36b0486dbb4bc
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Kubernetes version (use kubectl version):

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:49:27Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.16-eks-5e0fdde", GitCommit:"dbe0c94703b5c31afe4e7a4ad467fb3a044c532b", GitTreeState:"clean", BuildDate:"2024-01-02T20:35:57Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS/EKS
  • OS (e.g. from /etc/os-release):
sh-4.2$ cat /etc/os-release 
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
  • Kernel (e.g. uname -a):
Linux ip-10-28-86-64.eu-west-1.compute.internal 5.10.198-187.748.amzn2.x86_64 #1 SMP Tue Oct 24 19:49:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
    We use terraform and this module to deploy the cluster
  • Basic cluster related info:
kubectl get nodes -owide
NAME                                                  STATUS   ROLES    AGE     VERSION                INTERNAL-IP     EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
fargate-ip-10-28-107-202.eu-west-1.compute.internal   Ready    <none>   19d     v1.25.15-eks-4f4795d   10.28.107.202   <none>        Amazon Linux 2   5.10.205-195.804.amzn2.x86_64   containerd://1.6.6
fargate-ip-10-28-114-72.eu-west-1.compute.internal    Ready    <none>   19d     v1.25.15-eks-4f4795d   10.28.114.72    <none>        Amazon Linux 2   5.10.205-195.804.amzn2.x86_64   containerd://1.6.6
ip-10-28-111-198.eu-west-1.compute.internal           Ready    <none>   8h      v1.25.15-eks-e71965b   10.28.111.198   <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-117-238.eu-west-1.compute.internal           Ready    <none>   28h     v1.25.15-eks-e71965b   10.28.117.238   <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-121-243.eu-west-1.compute.internal           Ready    <none>   5h39m   v1.25.15-eks-e71965b   10.28.121.243   <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-124-184.eu-west-1.compute.internal           Ready    <none>   28h     v1.25.15-eks-e71965b   10.28.124.184   <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-85-49.eu-west-1.compute.internal             Ready    <none>   26h     v1.25.15-eks-e71965b   10.28.85.49     <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-86-64.eu-west-1.compute.internal             Ready    <none>   28h     v1.25.15-eks-e71965b   10.28.86.64     <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-88-240.eu-west-1.compute.internal            Ready    <none>   5h29m   v1.25.15-eks-e71965b   10.28.88.240    <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-98-203.eu-west-1.compute.internal            Ready    <none>   29h     v1.25.15-eks-e71965b   10.28.98.203    <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
ip-10-28-98-213.eu-west-1.compute.internal            Ready    <none>   28h     v1.25.15-eks-e71965b   10.28.98.213    <none>        Amazon Linux 2   5.10.198-187.748.amzn2.x86_64   containerd://1.6.19
  • How was the ingress-nginx-controller installed:
helm ls -n ingress-nginx
NAME                            	NAMESPACE    	REVISION	UPDATED                                	STATUS  	CHART                              	APP VERSION
ingress-nginx                   	ingress-nginx	42      	2024-02-05 15:30:27.099723833 +0000 UTC	deployed	ingress-nginx-4.7.5                	1.8.5      
ingress-nginx-validating-webhook	ingress-nginx	36      	2024-02-05 15:29:10.800106447 +0000 UTC	deployed	ingress-nginx-4.7.5                	1.8.5      
prometheus-blackbox-exporter    	ingress-nginx	12      	2024-02-06 09:38:42.093206477 +0000 UTC	deployed	prometheus-blackbox-exporter-8.10.1	v0.24.0    
helm get values ingress-nginx  -n ingress-nginx 
USER-SUPPLIED VALUES:
controller:
  admissionWebhooks:
    enabled: false
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
            - ingress-nginx
          - key: app.kubernetes.io/instance
            operator: In
            values:
            - ingress-nginx
          - key: app.kubernetes.io/component
            operator: In
            values:
            - controller
        topologyKey: kubernetes.io/hostname
  chroot: false
  config:
    annotation-value-word-blocklist: load_module,lua,root,serviceaccount,alias
    enable-underscores-in-headers: "true"
    force-ssl-redirect: "true"
    hsts-max-age: "31536000"
    http-snippet: |
      # See documentation on confluence for more information:
      # https://confluence.kpn.org/x/roXMD

      # Fix status
      # If status is 000, set to 499

      map $status $status_real {
        000     499;
        default $status;
      }

      # Fix upstream_status
      # If there are multiple values, select the last one
      # If unset, set to null

      map $upstream_status $upstream_status_real {
        ""              null;
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_status;
      }

      # If status is unset or 499, set to null

      map $status_real $upstream_status_json {
        ""      null;
        499     null;
        default $upstream_status_real;
      }

      # Fix upstream_addr
      # If there are multiple values, select the last one

      map $upstream_addr $upstream_addr_json {
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_addr;
      }

      # Fix upstream_response_length
      # If there are multiple values, select the last one
      # If unset, set to 0

      map $upstream_response_length $upstream_response_length_json {
        ""              0;
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_response_length;
      }

      # Fix upstream_response_time
      # If there are multiple values, select the last one
      # If unset, set to null

      map $upstream_response_time $upstream_response_time_json {
        ""              null;
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_response_time;
      }

      # Fix upstream_connect_time
      # If there are multiple values, select the last one
      # If unset, set to null

      map $upstream_connect_time $upstream_connect_time_1 {
        ""              null;
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_connect_time;
      }

      # If upstream_response_length is (0, unset), set to null

      map $upstream_response_length_json $upstream_connect_time_2 {
        0       null;
        ""      null;
        default $upstream_connect_time_1;
      }

      # If status is 499, set to null

      map $status_real $upstream_connect_time_json {
        499     null;
        default $upstream_connect_time_2;
      }

      # Fix upstream_header_time
      # If there are multiple values, select the last one
      # If unset, set to null

      map $upstream_header_time $upstream_header_time_1 {
        ""              null;
        "~,\s*([^,]+)$" "$1";
        "~:\s+(\S+)$"   "$1";
        default         $upstream_header_time;
      }

      # If upstream_response_length is (0, unset), set to null

      map $upstream_response_length_json $upstream_header_time_2 {
        0       null;
        ""      null;
        default $upstream_header_time_1;
      }

      # If status is 499, set to null

      map $status_real $upstream_header_time_json {
        499     null;
        default $upstream_header_time_2;
      }
    keep-alive-requests: "10000"
    log-format-escape-json: "true"
    log-format-upstream: '{ "time": "$time_iso8601", "remote_addr": "$remote_addr",
      "x-forward-for": "$proxy_add_x_forwarded_for", "remote_user": "$remote_user",
      "bytes_sent": $bytes_sent, "request_time": $request_time, "status": $status_real,
      "vhost": "$host", "request_proto": "$server_protocol", "path": "$uri", "request_query":
      "$args", "request_length": $request_length, "duration": $request_time, "method":
      "$request_method", "http_referrer": "$http_referer", "http_user_agent": "$http_user_agent",
      "proxy_upstream_name": "$proxy_upstream_name", "upstream_addr": "$upstream_addr_json",
      "upstream_response_length": $upstream_response_length_json, "upstream_response_time":
      $upstream_response_time_json, "upstream_connect_time": $upstream_connect_time_json,
      "upstream_header_time": $upstream_header_time_json, "upstream_status": $upstream_status_json,
      "namespace": "$namespace", "ingress_name": "$ingress_name", "service_name":
      "$service_name", "service_port": "$service_port" }'
    max-worker-connections: "65536"
    proxy-real-ip-cidr: 10.94.47.97/32,10.94.211.13/32,10.94.47.99/32,10.23.95.8/32,10.23.95.10/32,10.23.95.144/32,10.23.95.145/32,10.28.64.0/18,2a05:d018:41b:4000::/56,100.82.0.0/16,100.80.0.0/16,100.81.0.0/16,13.124.199.0/24,130.176.0.0/18,130.176.128.0/21,130.176.136.0/23,130.176.140.0/22,130.176.144.0/20,130.176.160.0/19,130.176.192.0/19,130.176.64.0/21,130.176.72.0/22,130.176.76.0/24,130.176.78.0/23,130.176.80.0/22,130.176.86.0/23,130.176.88.0/21,130.176.96.0/19,15.158.0.0/16,18.68.0.0/16,204.246.166.0/24,205.251.218.0/24,3.172.0.0/18,3.29.57.0/26,52.46.0.0/22,52.46.16.0/20,52.46.32.0/19,52.46.4.0/23,52.82.128.0/23,52.82.134.0/23,54.182.128.0/20,54.182.144.0/21,54.182.154.0/23,54.182.156.0/22,54.182.160.0/21,54.182.172.0/22,54.182.176.0/21,54.182.184.0/22,54.182.188.0/23,54.182.224.0/21,54.182.240.0/21,54.182.248.0/22,54.239.134.0/23,54.239.170.0/23,54.239.204.0/22,54.239.208.0/21,64.252.128.0/18,64.252.64.0/18,70.132.0.0/18
    server-tokens: "false"
    ssl-protocols: TLSv1.2 TLSv1.3
    upstream-keepalive-connections: "1000"
    use-forwarded-headers: "true"
  ingressClass: nginx
  ingressClassResource:
    enabled: false
  keda:
    behavior:
      scaleDown:
        policies:
        - periodSeconds: 600
          type: Pods
          value: 1
        stabilizationWindowSeconds: 1800
    enabled: true
    maxReplicas: 9
    minReplicas: 3
    triggers:
    - metadata:
        query: |
          sum(
            label_replace(
              max(kube_pod_info{
                  job="kube-state-metrics",
                  namespace="ingress-nginx"
              }) by (pod, node)
              * on(pod) group_left() max(kube_pod_labels{
                  job="kube-state-metrics",
                  namespace="ingress-nginx",
                  label_app_kubernetes_io_name="ingress-nginx",
                  label_app_kubernetes_io_instance="ingress-nginx",
                  label_app_kubernetes_io_component="controller",
              }) by (pod),
              "instance",
              "$1.$2.$3.$4:9100",
              "node",
              "ip-([0-9]+)-([0-9]+)-([0-9]+)-([0-9]+).*"
            )
            * on(instance) group_left() max(instance:node_cpu:ratio_1m) without (pod)
          ) * 100
        serverAddress: http://kube-prometheus-stack-thanos-query-frontend.kube-system:9090/
        threshold: "60"
      type: prometheus
  kind: Deployment
  livenessProbe: null
  metrics:
    enabled: true
    prometheusRule:
      additionalLabels:
        tcloud-system: "true"
      enabled: true
      rules:
      - alert: NGINXConfigFailed
        annotations:
          description: bad ingress config - nginx config test failed
        expr: count(nginx_ingress_controller_config_last_reload_successful == 0) >
          0
        for: 1s
        labels:
          severity: critical
      - expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}[1m]))
          WITHOUT (cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total)
          BY (instance, cpu)) BY (instance)
        record: instance:node_cpu:ratio_1m
    serviceMonitor:
      additionalLabels:
        tcloud-system: "true"
      enabled: true
  minAvailable: 2
  nodeSelector:
    kpn.org/role: ingress
  podAnnotations:
    co.elastic.logs/enabled: "true"
  publishService:
    enabled: false
  replicaCount: 3
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: tcloud-itv-dev1-nlb-access-logs
      service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix: ingress-nginx-controller
      service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: map-migrated=d-server-0193ks42x91oyd
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: ssl
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "2"
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:eu-west-1:040414987200:certificate/cecc9e8d-ece0-44b2-aa3a-645290340bdc
      service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: ELBSecurityPolicy-FS-1-2-Res-2020-10
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-target-node-labels: kpn.org/role=ingress
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
    enableHttp: false
    enabled: true
    externalTrafficPolicy: Local
    loadBalancerSourceRanges:
    - 127.0.0.1/32
  sysctls:
    net.core.somaxconn: "65535"
    net.ipv4.ip_local_port_range: 1024 65535
    net.ipv4.tcp_fin_timeout: "15"
    net.ipv4.tcp_max_syn_backlog: "3240000"
    net.ipv4.tcp_max_tw_buckets: "5880000"
    net.ipv4.tcp_no_metrics_save: "1"
    net.ipv4.tcp_syn_retries: "2"
    net.ipv4.tcp_synack_retries: "2"
    net.ipv4.tcp_tw_reuse: "1"
    net.netfilter.nf_conntrack_tcp_timeout_established: "120"
  terminationGracePeriodSeconds: 120
  tolerations:
  - key: node.kubernetes.io/ingress
    operator: Exists
  updateStrategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 1
    type: RollingUpdate
  watchIngressWithoutClass: true
defaultBackend:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - ingress-nginx
            - key: app.kubernetes.io/instance
              operator: In
              values:
              - ingress-nginx
            - key: app.kubernetes.io/component
              operator: In
              values:
              - default-backend
          topologyKey: topology.kubernetes.io/zone
        weight: 100
  enabled: true
  image:
    allowPrivilegeEscalation: false
    pullPolicy: IfNotPresent
    readOnlyRootFilesystem: false
    repository: artifacts.kpn.org/cloudinfra/docker-images/default-backend
    runAsNonRoot: true
    runAsUser: 1001
    tag: 1.0.7
  nodeSelector:
    kpn.org/role: ingress
  replicaCount: 3
  serviceAccount:
    automountServiceAccountToken: false
  tolerations:
  - key: node.kubernetes.io/ingress
    operator: Exists
  • Current State of the controller:
kubectl describe ingressclasses
Name:         nginx
Labels:       <none>
Annotations:  ingressclass.kubernetes.io/is-default-class: true
Controller:   k8s.io/ingress-nginx
Events:       <none>
k describe pods ingress-nginx-controller-8db75dfcd-vpd2f
Name:             ingress-nginx-controller-8db75dfcd-vpd2f
Namespace:        ingress-nginx
Priority:         0
Service Account:  ingress-nginx
Node:             ip-10-28-111-198.eu-west-1.compute.internal/10.28.111.198
Start Time:       Tue, 06 Feb 2024 12:56:28 +0100
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.8.5
                  helm.sh/chart=ingress-nginx-4.7.5
                  pod-template-hash=8db75dfcd
Annotations:      co.elastic.logs/enabled: true
                  kubectl.kubernetes.io/restartedAt: 2023-11-29T12:23:43+01:00
Status:           Running
IP:               100.81.254.198
IPs:
  IP:           100.81.254.198
Controlled By:  ReplicaSet/ingress-nginx-controller-8db75dfcd
Containers:
  controller:
    Container ID:  containerd://1fef5f8bd075273deba38d42aec61e59af68a685c07eea33210e9ed835469ce4
    Image:         registry.k8s.io/ingress-nginx/controller:v1.8.5@sha256:5831fa630e691c0c8c93ead1b57b37a6a8e5416d3d2364afeb8fe36fe0fef680
    Image ID:      registry.k8s.io/ingress-nginx/controller@sha256:5831fa630e691c0c8c93ead1b57b37a6a8e5416d3d2364afeb8fe36fe0fef680
    Ports:         80/TCP, 443/TCP, 10254/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --default-backend-service=$(POD_NAMESPACE)/ingress-nginx-defaultbackend
      --election-id=ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --v=5
      --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
      --watch-ingress-without-class=true
    State:          Running
      Started:      Tue, 06 Feb 2024 12:56:29 +0100
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   90Mi
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-controller-8db75dfcd-vpd2f (v1:metadata.name)
      POD_NAMESPACE:  ingress-nginx (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-p8snn (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  kube-api-access-p8snn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kpn.org/role=ingress
                             kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/ingress op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
k describe svc ingress-nginx-controller
Name:                        ingress-nginx-controller
Namespace:                   ingress-nginx
Labels:                      app.kubernetes.io/component=controller
                             app.kubernetes.io/instance=ingress-nginx
                             app.kubernetes.io/managed-by=Helm
                             app.kubernetes.io/name=ingress-nginx
                             app.kubernetes.io/part-of=ingress-nginx
                             app.kubernetes.io/version=1.8.5
                             helm.sh/chart=ingress-nginx-4.7.5
Annotations:                 field.cattle.io/publicEndpoints:
                               [{"addresses":["a66c715ed346b4f269a74d256113065c-3d88629be1b7aab6.elb.eu-west-1.amazonaws.com"],"port":443,"protocol":"TCP","serviceName":...
                             meta.helm.sh/release-name: ingress-nginx
                             meta.helm.sh/release-namespace: ingress-nginx
                             service.beta.kubernetes.io/aws-load-balancer-access-log-enabled: true
                             service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-name: tcloud-itv-dev1-nlb-access-logs
                             service.beta.kubernetes.io/aws-load-balancer-access-log-s3-bucket-prefix: ingress-nginx-controller
                             service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: map-migrated=d-server-0193ks42x91oyd
                             service.beta.kubernetes.io/aws-load-balancer-backend-protocol: ssl
                             service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: 3600
                             service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: true
                             service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: 2
                             service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: 10
                             service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: 2
                             service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:eu-west-1:040414987200:certificate/cecc9e8d-ece0-44b2-aa3a-645290340bdc
                             service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: ELBSecurityPolicy-FS-1-2-Res-2020-10
                             service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
                             service.beta.kubernetes.io/aws-load-balancer-target-node-labels: kpn.org/role=ingress
                             service.beta.kubernetes.io/aws-load-balancer-type: nlb
Selector:                    app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                        LoadBalancer
IP Family Policy:            SingleStack
IP Families:                 IPv4
IP:                          172.20.201.116
IPs:                         172.20.201.116
LoadBalancer Ingress:        a66c715ed346b4f269a74d256113065c-3d88629be1b7aab6.elb.eu-west-1.amazonaws.com
Port:                        https  443/TCP
TargetPort:                  https/TCP
NodePort:                    https  31286/TCP
Endpoints:                   100.80.240.25:443,100.81.254.198:443,100.82.131.30:443
Session Affinity:            None
External Traffic Policy:     Local
HealthCheck NodePort:        30750
LoadBalancer Source Ranges:  127.0.0.1/32
Events:                      <none>

How to reproduce this issue:

  1. Create a new EKS cluster and add a taint to the ingress nodes.
  2. Install helm chart 4.7.5 using the provided values. The ingress pods can tolerate the taint and will run on the ingress nodes, since they also have a node selector.
  3. Create some ingresses.
  4. Check the logs for lines containing the string "updating Ingress status" (e.g. with the command sketched below).
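A minimal way to do step 4, assuming the deployment name used by the chart in this setup:

kubectl logs -n ingress-nginx deploy/ingress-nginx-controller | grep "updating Ingress status"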
@lefterisALEX added the kind/bug label on Feb 6, 2024
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage and needs-priority labels on Feb 6, 2024
@longwuyuan (Contributor)

/remove-kind bug

  • The information in the issue description is not based on the questions asked in a new issue template
  • The issue description is not formatted properly so hard to read
  • No controller pod logs are provided

@k8s-ci-robot added the needs-kind label and removed the kind/bug label on Feb 6, 2024
@longwuyuan (Contributor)

/triage needs-information

@k8s-ci-robot added the triage/needs-information label on Feb 6, 2024
@lefterisALEX (Author) commented Feb 6, 2024

Thanks for the reply, @longwuyuan.

The information in the issue description is not based on the questions asked in a new issue template
The issue description is not formatted properly so hard to read

Could you please be a bit more specific about what is missing? Also, which part is hard to read? I would like to update the description, but I am not sure I understand what is missing.

No controller pod logs are provided

I have already attached controller logs in the first section. Please let me know if those are not sufficient or if a different debug level is needed.

@lefterisALEX (Author)

I also added the describe output of the following two:

k describe pods <controller-pod>
k describe svc <controller-service>

The only thing missing now from the template is a list of all ingresses; let me know if it is needed so I can share it in private. We have around 1000 ingresses.

@lefterisALEX changed the title from '"updating Ingress status" logs with currentValue map the IPs of worker nodes and newValue map the IPs of the ingress nodes' to 'constantly see "updating Ingress status" in controller logs without anything changing in the cluster' on Feb 7, 2024
@lefterisALEX (Author)

If I "watch" the output of .status.loadBalancer.ingress, I see it changing every couple of seconds.

while true; set timestamp (date "+%Y-%m-%d %H:%M:%S") ;set ingress_info (kubectl get ingress kube-prometheus-stack-prometheus -n kube-system -o jsonpath='{.status.loadBalancer.ingress}')   ; echo "$timestamp - $ingress_info" ; sleep 1 ;end
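(The loop above is fish shell syntax; an equivalent in plain bash, for anyone following along:)

while true; do
  timestamp=$(date "+%Y-%m-%d %H:%M:%S")
  ingress_info=$(kubectl get ingress kube-prometheus-stack-prometheus -n kube-system -o jsonpath='{.status.loadBalancer.ingress}')
  echo "$timestamp - $ingress_info"
  sleep 1
done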

2024-02-07 14:42:58 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:43:00 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:43:03 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:43:06 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:43:08 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:11 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:14 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:17 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:20 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:22 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:25 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:28 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:30 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:33 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:36 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:38 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:41 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:44 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:46 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:49 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:52 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:55 - [{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]
2024-02-07 14:43:57 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:44:00 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
2024-02-07 14:44:03 - [{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
NodeName                                              ROLE      EC2-type      Instance-type   AZ           IP              CPU   Memory       PODS_number
fargate-ip-10-28-107-202.eu-west-1.compute.internal   <none>    <none>        <none>          eu-west-1a   <none>          2     7824320Ki    1
fargate-ip-10-28-114-72.eu-west-1.compute.internal    <none>    <none>        <none>          eu-west-1b   <none>          2     7824328Ki    1
ip-10-28-105-227.eu-west-1.compute.internal           system    c6i.4xlarge   <none>          eu-west-1a   10.28.105.227   16    32327212Ki   110
ip-10-28-110-155.eu-west-1.compute.internal           <none>    c6a.4xlarge   spot            eu-west-1a   10.28.110.155   16    32153136Ki   110
ip-10-28-111-198.eu-west-1.compute.internal           ingress   c6i.large     spot            eu-west-1a   10.28.111.198   2     3922884Ki    110
ip-10-28-119-94.eu-west-1.compute.internal            system    c6a.4xlarge   <none>          eu-west-1b   10.28.119.94    16    32153136Ki   110
ip-10-28-121-243.eu-west-1.compute.internal           ingress   c6i.large     spot            eu-west-1b   10.28.121.243   2     3922884Ki    110
ip-10-28-124-184.eu-west-1.compute.internal           <none>    c6a.4xlarge   spot            eu-west-1b   10.28.124.184   16    32153136Ki   110
ip-10-28-87-98.eu-west-1.compute.internal             <none>    c6a.4xlarge   spot            eu-west-1c   10.28.87.98     16    32153136Ki   110
ip-10-28-88-240.eu-west-1.compute.internal            ingress   c6i.large     spot            eu-west-1c   10.28.88.240    2     3922884Ki    110
ip-10-28-95-12.eu-west-1.compute.internal             system    c6a.4xlarge   <none>          eu-west-1c   10.28.95.12     16    32153136Ki   110

Any idea what might trigger .status.loadBalancer.ingress to change constantly?

@lefterisALEX (Author) commented Feb 7, 2024

One more observation: from the logs I see that currentValue holds the IPs of the nodes running the validating-webhook-controller pods, and newValue the IPs of the nodes running the controller pods.

ingress-nginx-controller-8db75dfcd-vpd2f controller I0207 14:08:09.852483       6 status.go:303] "updating Ingress status" namespace="p-pl8zv-observability" ingress="tcloud-prometheus-alertmanager" currentValue=[{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}] newValue=[{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]

currentValue=[{"ip":"10.28.110.155"},{"ip":"10.28.124.184"}]
newValue=[{"ip":"10.28.111.198"},{"ip":"10.28.121.243"},{"ip":"10.28.88.240"}]

k get pods -owide -n ingress-nginx
NAME                                                          READY   STATUS    RESTARTS   AGE   IP               NODE                                          NOMINATED NODE   READINESS GATES
ingress-nginx-controller-8db75dfcd-27zhl                      1/1     Running   0          26h   100.82.131.30    ip-10-28-121-243.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-controller-8db75dfcd-rbgdd                      1/1     Running   0          26h   100.80.240.25    ip-10-28-88-240.eu-west-1.compute.internal    <none>           <none>
ingress-nginx-controller-8db75dfcd-vpd2f                      1/1     Running   0          26h   100.81.254.198   ip-10-28-111-198.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-defaultbackend-75b87ff9c5-94zgz                 1/1     Running   0          32h   100.81.254.197   ip-10-28-111-198.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-defaultbackend-75b87ff9c5-q7k6k                 1/1     Running   0          31h   100.82.242.154   ip-10-28-121-243.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-defaultbackend-75b87ff9c5-wp6nd                 1/1     Running   0          31h   100.82.242.147   ip-10-28-121-243.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-validating-webhook-controller-c4566c4d8-cwq2h   1/1     Running   0          46h   100.82.171.235   ip-10-28-124-184.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-validating-webhook-controller-c4566c4d8-wdr8t   1/1     Running   0          16h   100.81.236.132   ip-10-28-110-155.eu-west-1.compute.internal   <none>           <none>
ingress-nginx-validating-webhook-controller-c4566c4d8-wnh6g   1/1     Running   0          16h   100.81.208.244   ip-10-28-110-155.eu-west-1.compute.internal   <none>           <none>
prometheus-blackbox-exporter-66c8cbc9f7-nb28b                 1/1     Running   0          16h   100.82.144.173   ip-10-28-124-184.eu-west-1.compute.internal   <none>           <none>
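The node IPs from currentValue/newValue can be mapped back to the pods running on those nodes with a field selector, e.g. for one of the webhook nodes above:

kubectl get pods -n ingress-nginx -o wide \
  --field-selector spec.nodeName=ip-10-28-110-155.eu-west-1.compute.internal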

@longwuyuan (Contributor)

I am reading and will comment shortly, but you can see that the output of helm get values has lines like kind: Deployment and ingressClass etc. So at least that section is one long set of lines from different pieces of segregated information appearing concatenated for the reader.

@longwuyuan (Contributor)

Ping me on K8s Slack and send me a Zoom session id; there is much to comment on.

  • The logs contain info you may not want to be public, so delete that zip file if you want.

@longwuyuan (Contributor)

OK, as per the Zoom session: you have 2 controller instances installed in the same namespace (same cluster), but only one ingress class. That is not going to work, because both controllers will process all the ingresses. You got the docs link for how to install multiple instances of the controller in the same cluster, so please follow that, and also please update the issue description as per our discussion regarding the questions in the new issue template. Thanks.

@lefterisALEX (Author)

Thanks for the help @longwuyuan; the issue was indeed gone once we were using a single controller.
Also sharing the link to the document you mentioned on how to set up multiple controllers, in case someone else bumps into the same issue:

https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/
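For anyone hitting the same problem: the key point from that document is that each controller instance needs its own ingress class, controller class, and election ID. A minimal sketch of what the second (webhook) instance's values could look like, with illustrative names:

controller:
  electionID: ingress-nginx-webhook-leader
  ingressClass: nginx-webhook
  ingressClassResource:
    enabled: true
    name: nginx-webhook
    controllerValue: k8s.io/ingress-nginx-webhook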
