
Ingress controller processing ingresses with different ingressClassName when multiple ingress controllers are deployed in the same namespace - AWS EKS 1.27 #10907

Open
mdellavedova opened this issue Jan 23, 2024 · 12 comments
Labels
lifecycle/frozen · needs-kind · needs-priority · needs-triage · triage/needs-information

Comments

@mdellavedova commented Jan 23, 2024

What happened:

I have 2 ingress controllers deployed in the same namespace, set up following the instructions in these documents:
https://kubernetes.github.io/ingress-nginx/user-guide/k8s-122-migration/#i-cant-use-multiple-namespaces-what-should-i-do and https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/#multiple-ingress-controllers
The ingresses work as expected, but when I look at the logs for one ingress controller I can see multiple errors:

I0123 10:02:35.684672       7 store.go:436] "Ignoring ingress because of error while validating ingress class" ingress="omega/cs-05c36933-076c-490f-a23b-d6d5019d1cb2-api-gw" error="no object matching key \"ingress-controller-internal-nginx\" in local store"

suggesting that the ingress controller is considering ingresses that belong to the other ingress controller, and vice versa. This creates a high load on (one of) the ingress controller's pods, causing it to restart.
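
For illustration, this is the kind of object that triggers the message: an Ingress bound to the other controller's class. A minimal sketch with hypothetical names (example-internal-api, example-svc); the public controller still receives the object from its informer and logs "Ignoring ingress" for it because that class is not in its local store:

# Hypothetical Ingress using the internal class; the nginx-public-nlb-tls
# controller logs "Ignoring ingress because of error while validating ingress class"
# for objects like this, as described above.
kubectl apply -n omega -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-internal-api
spec:
  ingressClassName: ingress-controller-internal-nginx
  rules:
  - host: api.internal.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-svc
            port:
              number: 80
EOF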
What you expected to happen:
I would expect both ingress controllers to ignore ingresses that don't reference their associated ingressClassName.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):


NGINX Ingress controller
Release: v1.8.1
Build: dc88dce
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6


I have also tried the latest available Helm chart, which didn't help:

NGINX Ingress controller
Release: v1.9.5
Build: f503c4b
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6


Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"fa3d7990104d7c1f16943a67f11b154b71f6a132", GitTreeState:"clean", BuildDate:"2023-07-19T12:20:54Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27+", GitVersion:"v1.27.8-eks-8cb36c9", GitCommit:"fca3a8722c88c4dba573a903712a6feaf3c40a51", GitTreeState:"clean", BuildDate:"2023-11-22T21:52:13Z", GoVersion:"go1.20.11", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release):
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"
  • Kernel (e.g. uname -a): Linux ip-10-229-145-39.eu-west-1.compute.internal 5.10.198-187.748.amzn2.x86_64 #1 SMP Tue Oct 24 19:49:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
    • Please mention how/where was the cluster created like kubeadm/kops/minikube/kind etc.
  • Basic cluster related info:
    • kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"fa3d7990104d7c1f16943a67f11b154b71f6a132", GitTreeState:"clean", BuildDate:"2023-07-19T12:20:54Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27+", GitVersion:"v1.27.8-eks-8cb36c9", GitCommit:"fca3a8722c88c4dba573a903712a6feaf3c40a51", GitTreeState:"clean", BuildDate:"2023-11-22T21:52:13Z", GoVersion:"go1.20.11", Compiler:"gc", Platform:"linux/amd64"}
  • kubectl get nodes -o wide

  • How was the ingress-nginx-controller installed:

    • If helm was used then please show output of helm ls -A | grep -i ingress
    • If helm was used then please show output of helm -n <ingresscontrollernamespace> get values <helmreleasename>
      the ingress controller(s) were installed using ArgoCD (which in turn uses Helm). Helm values below:

    nginx-public-nlb-tls

       controller:
         resources:
           requests:
             cpu: 100m
             memory: 500Mi
           limits:
             cpu: 2
             memory: 2000Mi
         hostNetwork: true
         ingressClassResource:
           name: nginx-public-nlb-tls
           enabled: true
           default: false
           controllerValue: k8s.io/ingress-nginx-public-nlb-tls
         ingressClass: nginx-public-nlb-tls
         ingressClassByName: true
         electionID: ingress-nginx-public-nlb-tls-leader
         {{- if .Values.ingressControllerPublicNlbTls.metrics.enabled }}
         metrics:
           enabled: true
           service:
             annotations:
               prometheus.io/port: "10254"
               prometheus.io/scrape: "true"
         {{- end }}
         service:
         {{- if .Values.ingressControllerPublicNlbTls.tlsIngress.tlsSupport }} 
           targetPorts:
             http: http
             https: http
           annotations:
             nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
             service.beta.kubernetes.io/aws-load-balancer-type: nlb
             service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
             service.beta.kubernetes.io/aws-load-balancer-backend-protocol: TLS
             service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
             service.beta.kubernetes.io/aws-load-balancer-ssl-cert: {{ .Values.ingressControllerPublicNlbTls.acmArn }}
         {{ else }}
           targetPorts:
             http: http
             https: https
           annotations:
             nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
             service.beta.kubernetes.io/aws-load-balancer-type: nlb
             service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
         {{ end }}
             service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "Monitoring=enabled"
         podAnnotations:
           co.elastic.logs/processors.0.decode_json_fields.fields: message
           co.elastic.logs/processors.0.decode_json_fields.target: lb
         config:
           log-format-escape-json: true
           log-format-upstream: '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
             "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
             "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
             "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
             "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
             "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
             "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
             "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
             "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
             "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
             "sslProtocol":"$ssl_protocol"}'
           http-snippet: >-
             log_format bodyinfo escape=json '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
             "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
             "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
             "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
             "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
             "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
             "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
             "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
             "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
             "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
             "sslProtocol":"$ssl_protocol","requestBody":"[$request_body]"}';
         admissionWebhooks:
           timeoutSeconds: 30
         replicaCount: {{ .Values.ingressControllerPublicNlbTls.replicaCount }}
         minAvailable: {{ max 1 ( sub .Values.ingressControllerPublicNlbTls.replicaCount 1 ) }}
         affinity:
           podAntiAffinity:
             requiredDuringSchedulingIgnoredDuringExecution:
             - labelSelector:
                 matchExpressions:
                 - key: app.kubernetes.io/name
                   operator: In
                   values:
                   - ingress-nginx
                 - key: app.kubernetes.io/instance
                   operator: In
                   values:
                   - ingress-nginx-public-nlb-tls
                 - key: app.kubernetes.io/component
                   operator: In
                   values:
                   - controller
               topologyKey: "kubernetes.io/hostname"
         topologySpreadConstraints:
           - maxSkew: 1
             topologyKey: topology.kubernetes.io/zone
             whenUnsatisfiable: ScheduleAnyway
             labelSelector:
               matchLabels:
                 app.kubernetes.io/instance: ingress-nginx-public-nlb-tls

ingress-controller-internal-nginx:

        controller:
          resources:
            requests:
              cpu: 100m
              memory: 500Mi
            limits:
              cpu: 2
              memory: 2000Mi
          hostNetwork: true
          ingressClass: ingress-controller-internal-nginx

          ingressClassResource:
            controllerValue: "k8s.io/ingress-nginx-internal"
            name: ingress-controller-internal-nginx
          electionID: "ingress-controller-internal-leader"
          {{- if .Values.ingressControllerInternal.metrics.enabled }}
          metrics:
            enabled: true
            service:
              annotations:
                prometheus.io/port: "10254"
                prometheus.io/scrape: "true"
          {{- end }}
          service:
            targetPorts:
              http: http
              https: http
            annotations:
              nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
              service.beta.kubernetes.io/aws-load-balancer-type: nlb
              service.beta.kubernetes.io/aws-load-balancer-internal: "true"
              service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
              service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
              service.beta.kubernetes.io/aws-load-balancer-backend-protocol: TLS
              service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
              service.beta.kubernetes.io/aws-load-balancer-ssl-cert: {{ .Values.ingressControllerInternal.acm_arn }}
              service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: "Monitoring=enabled"
              # it doesn't work, aws-load-balancer-type must be changed to "external"
              # service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: "preserve_client_ip.enabled=false"
          podAnnotations:
            co.elastic.logs/processors.0.decode_json_fields.fields: message
            co.elastic.logs/processors.0.decode_json_fields.target: lb
          config:
            log-format-escape-json: true
            log-format-upstream: '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
              "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
              "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
              "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
              "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
              "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
              "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
              "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
              "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
              "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
              "sslProtocol":"$ssl_protocol"}'
            http-snippet: >-
              log_format bodyinfo escape=json '{"@timestamp":"$msec", "date":"$time_iso8601", "upstreamIp":"$realip_remote_addr", "traceId": "$http_x_nexmo_trace_id",
              "clientIpAddress":"$remote_addr", "xForwardedFor":"$http_x_forwarded_for", "hdrContentType":"$http_content_type",
              "hdrSentContentType": "$sent_http_content_type", "remoteUser": "$remote_user", "uri": "$request_uri", 
              "method":"$request_method","serverProto":"$server_protocol", "httpStatus":"$status",
              "reqTime":"$request_time", "reqLength":"$request_length", "size":"$body_bytes_sent",
              "referer":"$http_referer", "userAgent":"$http_user_agent", "upsAddr":"$upstream_addr",
              "upsStatus":"$upstream_status",  "upsConnectTime":"$upstream_connect_time", "upsHeaderTime":"$upstream_header_time",  "upsResponseTime":"$upstream_response_time",
              "upsStatus_all":"$upstream_status",  "upsConnectTime_all":"$upstream_connect_time",
              "upsHeaderTime_all":"$upstream_header_time",  "upsResponseTime_all":"$upstream_response_time",
              "hostname":"$host",  "serverPort":"$server_port",  "scheme":"$scheme", "sslCipher":"$ssl_cipher",
              "sslProtocol":"$ssl_protocol","requestBody":"[$request_body]"}';
          admissionWebhooks:
            timeoutSeconds: 30
          replicaCount: {{ .Values.ingressControllerInternal.replicaCount }}
          minAvailable: {{ max 1 ( sub .Values.ingressControllerInternal.replicaCount 1 ) }}
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                    - ingress-nginx
                  - key: app.kubernetes.io/instance
                    operator: In
                    values:
                    - ingress-nginx-internal
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                    - controller
                topologyKey: "kubernetes.io/hostname"
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
              labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: ingress-nginx-internal
  • If helm was not used, then copy/paste the complete precise command used to install the controller, along with the flags and options used

  • if you have more than one instance of the ingress-nginx-controller installed in the same cluster, please provide details for all the instances

  • Current State of the controller:

    • kubectl describe ingressclasses
Name:         ingress-controller-internal-nginx
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx-internal
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.8.1
              argocd.argoproj.io/instance=ingress-nginx-internal-euw1-1
              helm.sh/chart=ingress-nginx-4.7.1
Annotations:  <none>
Controller:   k8s.io/ingress-nginx-internal
Events:       <none>


Name:         nginx-public-nlb-tls
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ingress-nginx
              app.kubernetes.io/part-of=ingress-nginx
              app.kubernetes.io/version=1.9.5
              argocd.argoproj.io/instance=ingress-nginx-public-nlb-tls-euw1-1
              helm.sh/chart=ingress-nginx-4.9.0
Annotations:  <none>
Controller:   k8s.io/ingress-nginx-public-nlb-tls
Events:       <none>
  • kubectl -n <ingresscontrollernamespace> get all -A -o wide
  • kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
Name:             ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp
Namespace:        cluster
Priority:         0
Service Account:  ingress-nginx-public-nlb-tls
Node:             ip-10-229-145-39.eu-west-1.compute.internal/10.229.145.39
Start Time:       Tue, 23 Jan 2024 10:02:33 +0000
Labels:           app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=ingress-nginx
                  app.kubernetes.io/version=1.9.5
                  helm.sh/chart=ingress-nginx-4.9.0
                  pod-template-hash=6fbb668d64
Annotations:      co.elastic.logs/processors.0.decode_json_fields.fields: message
                  co.elastic.logs/processors.0.decode_json_fields.target: lb
                  kubectl.kubernetes.io/restartedAt: 2024-01-23T10:02:32Z
Status:           Running
IP:               10.229.145.39
IPs:
  IP:           10.229.145.39
Controlled By:  ReplicaSet/ingress-nginx-public-nlb-tls-controller-6fbb668d64
Containers:
  controller:
    Container ID:    containerd://3c0d0d081c8986c9bea84aa03e8c944848f35c415aca7d9d3e7dbc046eb3b346
    Image:           registry.k8s.io/ingress-nginx/controller:v1.9.5@sha256:b3aba22b1da80e7acfc52b115cae1d4c687172cbf2b742d5b502419c25ff340e
    Image ID:        registry.k8s.io/ingress-nginx/controller@sha256:b3aba22b1da80e7acfc52b115cae1d4c687172cbf2b742d5b502419c25ff340e
    Ports:           80/TCP, 443/TCP, 8443/TCP
    Host Ports:      80/TCP, 443/TCP, 8443/TCP
    SeccompProfile:  RuntimeDefault
    Args:
      /nginx-ingress-controller
      --publish-service=$(POD_NAMESPACE)/ingress-nginx-public-nlb-tls-controller
      --election-id=ingress-nginx-public-nlb-tls-leader
      --controller-class=k8s.io/ingress-nginx-public-nlb-tls
      --ingress-class=nginx-public-nlb-tls
      --configmap=$(POD_NAMESPACE)/ingress-nginx-public-nlb-tls-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --ingress-class-by-name=true
    State:          Running
      Started:      Tue, 23 Jan 2024 10:02:34 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  2000Mi
    Requests:
      cpu:      100m
      memory:   500Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp (v1:metadata.name)
      POD_NAMESPACE:  cluster (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-h9dd4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ingress-nginx-public-nlb-tls-admission
    Optional:    false
  kube-api-access-h9dd4:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               kubernetes.io/os=linux
Tolerations:                  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
Events:                       <none>
  • kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
  Name:                     ingress-nginx-public-nlb-tls-controller
Namespace:                cluster
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx-public-nlb-tls
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/part-of=ingress-nginx
                          app.kubernetes.io/version=1.9.5
                          argocd.argoproj.io/instance=ingress-nginx-public-nlb-tls-euw1-1
                          helm.sh/chart=ingress-nginx-4.9.0
Annotations:              nginx.ingress.kubernetes.io/force-ssl-redirect: true
                          service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: Monitoring=enabled
                          service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: 60
                          service.beta.kubernetes.io/aws-load-balancer-type: nlb
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx-public-nlb-tls,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.20.60.210
IPs:                      172.20.60.210
LoadBalancer Ingress:     <redacted>
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30881/TCP
Endpoints:                10.229.145.39:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31447/TCP
Endpoints:                10.229.145.39:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
  • Current state of ingress object, if applicable:
    • kubectl -n <appnamespace> get all,ing -o wide
    • kubectl -n <appnamespace> describe ing <ingressname>
  Name:             neru-59e69cd7-go-neru-queue-scheduler-dev-com
Labels:           <none>
Namespace:        omega
Address:          <redacted>.elb.eu-west-1.amazonaws.com
Ingress Class:    nginx-public-nlb-tls
Default backend:  <default>
TLS:
  default-ingress-ssl terminates <redacted>
Rules:
  Host                                                                       Path  Backends
  ----                                                                       ----  --------
  <redacted>
                                                                             /   envoy:80 (172.16.90.147:5000)
Annotations:                                                                 nginx.ingress.kubernetes.io/backend-protocol: HTTP
                                                                             nginx.ingress.kubernetes.io/upstream-vhost: <redacted>
Events:                                                                      <none>
  • If applicable, then, your complete and exact curl/grpcurl command (redacted if required) and the response to the curl/grpcurl command with the -v flag

  • Others:

    • Any other related information like:
      • copy/paste of the snippet (if applicable)
      • kubectl describe ... of any custom configmap(s) created and in use
      • Any other related information that may help

How to reproduce this issue:

Anything else we need to know:

@mdellavedova added the kind/bug label on Jan 23, 2024
@k8s-ci-robot added the needs-triage label on Jan 23, 2024
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@longwuyuan (Contributor)

/remove-kind bug

@k8s-ci-robot added the needs-kind label and removed the kind/bug label on Jan 23, 2024
@longwuyuan (Contributor)

/triage needs-information

@k8s-ci-robot added the triage/needs-information label on Jan 23, 2024
@mdellavedova (Author)

Sorry, I posted by mistake before completing the form. Please let me know if there's anything else I need to add.

@mdellavedova (Author)

Hi, I can see the triage/needs-information label is still there after I updated the form last week. Could you please let me know if anything is missing?

@longwuyuan (Contributor)

  • "Ignoring ingress" does not indicate that the ingress rules were used for routing
  • The most important aspect here is to confirm that you installed as per the link I pasted here earlier
  • The proof needed is that the appropriate controller instance processes the appropriate ingress rules for routing (one way to check this is sketched below)
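
A possible check, with placeholder ingress name/namespace; the public controller's Service name and the cluster namespace are taken from this issue:

# The Ingress status address is set by the controller that owns its class (via --publish-service),
# so the class and the load balancer hostname should always point at the same controller instance.
kubectl -n <appnamespace> get ingress <ingressname> \
  -o jsonpath='{.spec.ingressClassName}{"\t"}{.status.loadBalancer.ingress[0].hostname}{"\n"}'

# Compare with the hostname published by the controller's Service
# (and do the same for the internal controller's Service).
kubectl -n cluster get svc ingress-nginx-public-nlb-tls-controller \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{"\n"}'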

@mdellavedova (Author) commented Feb 5, 2024

Thanks for your reply

  • "Ignoring ingress" does not indicate that the ingress rules were used for routing

I'm sure the rules aren't used for routing, although I have a large number of ingresses that get pointlessly evaluated, causing an increase in load for 1 of the 3 pods in the deployment, which leads to restarts (every time there is a batch of "Ignoring ingress" errors in the logs, one of the pods restarts).

  • The most important aspect here is to confirm that you installed as per the link I pasted here earlier

I have followed that guide and double-checked the configuration multiple times.

  • The proof needed is that the appropriate controller instance processes the appropriate ingress rules for routing

That's confirmed: the two ingress controllers only process their own ingress rules. The issue is the "Ignoring ingress" errors and the associated pod restarts (a quick check of the correlation is sketched below).
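
A quick way to show the correlation, assuming the controllers run in the cluster namespace:

# Restart counts for the controller pods
kubectl -n cluster get pods -l app.kubernetes.io/name=ingress-nginx \
  -o custom-columns=NAME:.metadata.name,RESTARTS:.status.containerStatuses[0].restartCount

# Number of "Ignoring ingress" lines logged since the current container started
kubectl -n cluster logs deploy/ingress-nginx-public-nlb-tls-controller | grep -c "Ignoring ingress"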

@longwuyuan (Contributor)

  • I just tested 2 controllers on minikube and I could not reproduce the restart of pods (screenshot omitted).

  • So it seems that the error and the restarts are coinciding for you, but one is not related to the other.

  • Can you try to reproduce this on a minikube or kind cluster?

  • I think you can look at the following (a command sketch follows this list):

    • kubectl get events -A
    • dmesg on nodes
    • syslog on nodes
    • monitoring dashboards for cpu, memory, inodes, disk-usage, other resources
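
A sketch of those checks, assuming the controllers run in the cluster namespace (the node hostname is a placeholder):

# Recent cluster events around a restart
kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

# Kernel messages (OOM kills show up here) and syslog on the node running the restarting pod
ssh <node> 'dmesg -T | grep -iE "oom|killed process"; sudo tail -n 100 /var/log/messages'

# Current resource usage of the controller pods (requires metrics-server)
kubectl -n cluster top pods -l app.kubernetes.io/name=ingress-nginx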

@mdellavedova (Author) commented Feb 6, 2024

Thanks for your effort. I believe the restarts are due to the number of ingress resources being evaluated. I have a similar setup in 3 separate regions:

region 1:
total number of ingresses managed by both controllers: 1962
restarts controller 1: 33 over 20 days (1 of 3 pods only)
restarts controller 2: 0 over 19 days

region 2:
total number of ingresses managed by both controllers: 426
restarts controller 1: 0 over 20 days
restarts controller 2: 123 over 19 days (1 of 3 pods only)

region 3:
total number of ingresses managed by both controllers: 192
restarts controller 1: 0 over 20 days
restarts controller 2: 0 over 19 days (1 of 3 pods only)

Could you please re-run your test with a higher number of ingresses? I'm not sure why there is no correlation between the number of ingress resources and the number of restarts; I will try to look at the traffic in region 2 vs region 1.

@longwuyuan (Contributor)

  • I don't have the hardware or automation to run a test with 192 ingresses
  • I think some data is needed to point at a problem in the ingress controller here, because a shortage of resources like CPU/memory/disk space/inodes/TCP buffers does not mean there is a problem with the ingress-nginx controller, even though the controller's pods may be restarting

github-actions bot commented Mar 8, 2024

This is stale, but we won't close it automatically; just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or want to request prioritization, please reach out on #ingress-nginx-dev on Kubernetes Slack.

github-actions bot added the lifecycle/frozen label on Mar 8, 2024
@strongjz (Member)

@mdellavedova were you able to resolve this issue?

Can you provide logs from the controller pod that restarts? Are there resource issues or errors from the kubelet/API server that show why it would be restarting? Do you have limits on the pods that are being hit?
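
For reference, a sketch of how to pull that information, using the pod name from the describe output above (namespace cluster):

# Logs from the previous (restarted) container instance
kubectl -n cluster logs ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp --previous

# Last State / Reason (e.g. OOMKilled) and current usage vs the 2000Mi / 2 CPU limits
kubectl -n cluster describe pod ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp | grep -A 6 "Last State"
kubectl -n cluster top pod ingress-nginx-public-nlb-tls-controller-6fbb668d64-prgkp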
