since controller 1.10.0 (chart 4.10.0): ingress rejects duplicate "Transfer-Encoding: chunked" and returns 502 #11162
/remove-kind bug
It's better if you write a step-by-step procedure that someone can copy/paste from. Please use the image httpbun.org, as it will get people on the same page. Please use minikube to reproduce the problem. Ensure you provide all manifests so that someone can copy/paste and reproduce. /kind support
@longwuyuan
@AlexPokatilov thanks for updating.
Do you happen to have a sample ticket where someone reproduced a bug that depended on the server's response? Part of the reproducing setup would be a backend behind the ingress that responds a certain way, to showcase the setup in minikube. I could create a Spring Boot app that does that, build an image locally and deploy it, but that sounds like a lot of overhead to reproduce for someone who does not happen to have mvn and a JDK locally.
Also, as you will see, there are no logs in the issue description. So essentially the info is already asked for in the issue template, but I will list it again (not YAML, but the current state of resources shown as kubectl describe output, and also the logs of the controller pod).
I will get back to you tomorrow and supply the missing info. Sorry for that. I have used the template but omitted some of the parts that seemed less relevant but obviously were more relevant than I initially thought.
@DRoppelt thanks for the idea. By discussing here or by copy/pasting data, if we can triage the issue description down to an aspect related to the reverse proxy or to the Ingress API, it would help make progress. If we make it a maven/jdk/jre/jsp specific discussion, then later in the discussion we will arrive at a step where we pinpoint which reverse-proxy concept, or which nginx-as-a-reverse-proxy concept, we are looking at, in addition to which K8S-Ingress-API related problem we are looking at. In this case, it seems you are pointing at your Java server setting that header, and I am wondering why the nginx reverse proxy should bother with what value you set for it. We don't have any business messing with that. I don't know if you meant you were setting that header in your code or in the Spring Boot framework. Either way, I know for sure that the controller does not normally get in the way. Hence we need to deep dive. BUT a clear issue description comes first, and the specific test where we alter headers comes next.
OK, so since the upgrade to controller 1.10.0 (chart 4.10.0), we observe 502s (not always, but there is at least one case where we can reproduce it deterministically). Here is an attempt to visualize:
given that
then nginx appears to intercept the original response from the backend. We also observe the following log with matching timestamps.
We found that when altering the backend to set the header once (as the backend probably should anyway), the response goes through.
Reproducing: I tried to set up minikube and get a sample going with a Spring Boot app, but it appears that minikube comes with controller 1.9.4 and I found no way to pin it to 1.10.0, please advise. Do you have a sample ticket with httpbun.org that I could copy from? I have a hard time understanding how to set up a sample based on that.
Personally, I am going to just use an nginx:alpine image to create a backend and see if I can just add the header to the nginx.conf in that pod.
I see this:
So it will be interesting to see your curl and the same info. I am trying to correlate whether nginx v1.25 does anything different. Also, I read here that HTTP/2 does not allow that header, and I see HTTP/2 in play in my test above.
It is a bit too late (as in time of day in Europe) for me to get a feel for minikube; nevertheless thank you for the steps, I will look at them myself tomorrow. Meanwhile, here is a backend that can be used for testing.
The API has 2 parameters; it will stream a response in 100 ms intervals. I hope this helps!
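A minimal sketch of what such a streaming test backend could look like, assuming Spring Boot MVC on an embedded servlet container. The class name, the meaning of the two path parameters (number of chunks, and a flag for setting the Transfer-Encoding header explicitly) and the chunk payload are assumptions for illustration, loosely modelled on the POST /chunky/10/1 request visible in the controller log further down:

```java
import java.nio.charset.StandardCharsets;

import org.springframework.http.HttpHeaders;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBody;

@RestController
public class ChunkyController {

    // POST /chunky/{chunks}/{setHeaderExplicitly} -- parameter meanings are assumptions
    @PostMapping("/chunky/{chunks}/{setHeaderExplicitly}")
    public ResponseEntity<StreamingResponseBody> chunky(@PathVariable int chunks,
                                                        @PathVariable int setHeaderExplicitly) {
        StreamingResponseBody body = out -> {
            for (int i = 0; i < chunks; i++) {
                out.write(("chunk " + i + "\n").getBytes(StandardCharsets.UTF_8));
                out.flush(); // flush each piece so the container streams it as a chunk
                try {
                    Thread.sleep(100); // 100 ms between chunks, as described above
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        };
        ResponseEntity.BodyBuilder response = ResponseEntity.ok();
        if (setHeaderExplicitly == 1) {
            // Setting the header by hand duplicates the one the servlet container
            // already adds for a streamed (chunked) response.
            response = response.header(HttpHeaders.TRANSFER_ENCODING, "chunked");
        }
        return response.body(body);
    }
}
```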
@DRoppelt thank you. It really helped.
/kind bug
@tao12345666333 @rikatz @cpanato @strongjz because of the bump to nginx v1.25.x, chunking-related behaviour of the controller is broken out of the box. Details above. Basically, an internet search suggests a directive to disable chunked transfer encoding. Right now, the controller is rejecting the response from a backend upstream app with the log message below:
2024/03/26 03:39:34 [error] 267#267: *358731 upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream, client: 192.168.122.1, server: chunky-demo.dev.enjoydevops.com, request: "POST /chunky/10/1 HTTP/1.1", upstream: "http://10.244.0.68:8080/chunky/10/1", host: "chunky-demo.dev.enjoydevops.com"
Seems related: #4838
@DRoppelt I think I have understood the problem and the solution
I think we need a docs PR to be clear about this to users of controller v1.10.x+ |
Comment: @rikatz yes, this approach seems enough for @DRoppelt, as that is what he asked for here: #11162 (comment).
Caveat:
Unclear info:
upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream
Test-Case 1
I am going to test setting chunking off in the controller, using server-snippets, and will update in the next post.
Test-Case 2
The unclear info is NOW cleared after the tests shown in the previous 2 posts:
upstream sent duplicate header line: "Transfer-Encoding: chunked", previous value: "Transfer-Encoding: chunked" while reading response header from upstream
Older Controller Works
Hi all, to me it seems the issue is caused by this change introduced in NGINX v1.23 (src: https://forum.nginx.org/read.php?2,297086,297090#msg-297090). This seems to be the expected behavior when the Transfer-Encoding header is duplicated.
@mcharriere thank you very much for that info. Solves the mystery. |
Cool, that is some helpful background! I was not able to get around sooner to the points you brought up, @longwuyuan, but here we go now.
Sorry about my tone that day. I felt frustrated at the time, as I am currently dealing with a k8s setup with about 100 services where about 3 services are knowingly affected and an unknown number may be affected but not yet identified. On the other hand, I am a paid person who can find a way to deal with it in my paid time and should not roll it all off onto an open-source project. I will submit another test case to show a behaviour that is rather hidden and has unluckily spread as a pattern in our code base across various services. It is hard to identify the affected services, and the "all or nothing" approach that nginx has (chunking disabled: OK; chunking enabled: hard reject with 502) does not make it easier.
I was thinking about it a bit more, and in essence the problem lies in this "all or nothing" approach from nginx, without a way in between.
It would be rather helpful to have a warn-mode that allows a soft migration for all identified services that do not stick to the behaviour nginx expects, especially for the hard-reject part. I am not knowledgeable about the process of submitting a proposal or a bug at the nginx project, but this stance here from last year seems very clear and without any wiggle room: https://forum.nginx.org/read.php?2,297086,297090#msg-297090
As for the test cases: you can take the code unmodified and test a), b), c). The service has at least the annotation shown in the sketch after this list added and passes due to overriding.
d) given that the chart has some config to set this globally.
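A minimal sketch of what that per-Ingress override could look like, assuming the annotation in question is a server-snippet that disables chunked transfer encoding; the host, service name, and the directive itself are assumptions based on the discussion in this thread, and snippet annotations must be allowed on the controller for it to take effect:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chunky-demo                     # hypothetical name
  annotations:
    # server-snippet injects raw nginx config into this host's server block;
    # the directive below disables chunked transfer encoding for responses
    # that nginx sends for this Ingress (assumption, see thread discussion)
    nginx.ingress.kubernetes.io/server-snippet: |
      chunked_transfer_encoding off;
spec:
  ingressClassName: nginx
  rules:
    - host: chunky-demo.example.com     # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chunky-demo       # hypothetical backend service
                port:
                  number: 8080
```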
At this point I am a bit undecided whether this alone is enough for this non-standard behavior. A reason against it is that chunking is enabled by default. One can add a global snippet annotation which allows one to opt out of chunking, so only a combination of letting a service opt in/out and a controller-level default would add value; otherwise just disabling it on the controller globally via a snippet is already sufficient. I am currently looking into the docs for the exact syntax. I will post another update on the demo service that showcases another source of duplicate headers that is not as straightforward as setting the header explicitly.
So here is a sample where the code is proxying another REST endpoint and therefore gets a second header through the way it proxies. I added a Feign-based client that proxies a call to another service. For demo purposes, the service hosts that endpoint itself too, but that does not alter the demo.
The backend has an endpoint where it does a DB lookup and then makes a downstream call. But in an attempt to proxy the call back to the upstream caller, it does not re-wrap the response and just returns the downstream call's response 1:1.
-> nginx will 502 this request. But when the endpoint actually re-wraps the downstream's response, the duplicate header is prevented.
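A minimal sketch of the two variants described above, assuming Spring MVC; DownstreamClient, its fetch() method and the paths are hypothetical stand-ins for the Feign client and endpoints in the demo:

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical stand-in for the Feign client used in the demo.
interface DownstreamClient {
    ResponseEntity<String> fetch();
}

@RestController
class ProxyController {

    private final DownstreamClient downstream;

    ProxyController(DownstreamClient downstream) {
        this.downstream = downstream;
    }

    // Returns the downstream ResponseEntity 1:1, which copies its headers,
    // including Transfer-Encoding: chunked; the servlet container then adds
    // its own, so nginx sees the header twice and answers 502.
    @GetMapping("/proxy-raw")
    ResponseEntity<String> proxyRaw() {
        return downstream.fetch();
    }

    // Re-wraps only status and body, so the container sets its own
    // transfer headers and no duplicate reaches nginx.
    @GetMapping("/proxy-wrapped")
    ResponseEntity<String> proxyWrapped() {
        ResponseEntity<String> response = downstream.fetch();
        return ResponseEntity.status(response.getStatusCode()).body(response.getBody());
    }
}
```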
This demo also still has the
-> header once
->
As for the endpoint: I think on our end we will try to upgrade from ingress-nginx 1.9.6 to 1.10.0 with an added server-snippet on the controller that disables chunked transfer. Then, once fully rolled out, we will re-enable chunking on testing environments while leaving it inactive on productive environments (and with that have a drift between prod and test for some time) and slowly fix affected services. From my end, I think we could live without further adjustments except for a) a note/highlight in the changelog for this breaking behaviour and b) a sample of an added server-snippet in the controller's ConfigMap (see the sketch below).
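A minimal sketch of such a controller-level override, assuming the default resource names from the Helm chart; the ConfigMap name/namespace and the directive are assumptions, not an officially recommended setting:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # default chart name (assumption)
  namespace: ingress-nginx
data:
  # the server-snippet value is rendered into every server block the
  # controller generates; this directive turns off chunked transfer
  # encoding for the responses nginx sends to clients
  server-snippet: |
    chunked_transfer_encoding off;
```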
@DRoppelt I think the communication has not been clear enough.
Ah, now I see. OK, I tried to reproduce this on my end, deployed in an AKS cluster.
Seems like the duplicate check happens either way. :(
This is stale, but we won't close it automatically, just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach out.
me too
Update from our org: we have identified all our affected services in the testing stages (by running the updated controller, observing test issues and looking into the ingress logs), fixed them, and then rolled out the ingress update afterwards. So we fixed the downstream systems with odd header-proxying. The stance here was clear enough to acknowledge that we just need to fix the services. We were lucky enough not to have vendor-provided software misbehaving, but deployables that we can patch ourselves: https://forum.nginx.org/read.php?2,297086,297090#msg-297090
@DRoppelt thanks for the update. Seems in sync with the tests. Glad it's not an issue anymore. I request that you close this issue if there are no more questions.
Hello team, are there any updates on this issue?
The related info is ALREADY present above, so there is no update pending from the project.
TLDR: controller 1.10.0 updated to a more recent nginx that started to reject certain behaviour from downstream systems, e.g. transferring the Transfer-Encoding: chunked header twice, as per this: https://forum.nginx.org/read.php?2,297086,297090#msg-297090
There is no setting in ingress-nginx that suppresses this behaviour.
Possible next steps:
1. fix the offending downstream systems so that they do not lead to nginx rejecting the response
2. raise an issue with the nginx project
3. possibly in conjunction with 2), add an option that may be able to suppress the behaviour when there is feedback from the nginx project
The issue has been closed as I, the submitter, went for option 1), as the stance from the nginx project has been rather clear.
I am putting an annotation in the ingress to disable transfer_encoding_chunked but it is not working. Is there any alternative to disable it for the 1.10 version?
Hello, we had this issue as well, and we asked the backend developers to find where they set the double headers.
TLDR: controller 1.10.0 updated to a more recent nginx that started to reject certain behaviour from downstream systems, e.g. transferring 2x Transfer-Encoding: chunked headers, as per this: https://forum.nginx.org/read.php?2,297086,297090#msg-297090
There is no setting in ingress-nginx that suppresses this behaviour.
Possible next steps:
1. fix the offending downstream systems so that they do not lead to nginx rejecting the response
2. raise an issue with the nginx project
3. possibly in conjunction with 2), add an option that may be able to suppress the behaviour when there is feedback from the nginx project
The issue has been closed as I, the submitter, went for option 1), as the stance from the nginx project has been rather clear.
What happened: when upgrading the controller from 1.9.6 (chart 4.9.1) to 1.10.0 (chart 4.10.0), we observe the ingress answering 502 on behalf of the service. When rolling back to 1.9.6, healthy behaviour is restored.
What you expected to happen: the ingress behaving similarly to 1.9.6 and not answering 502 where it did not before.
There appears to be a regression of some sort. We have updated from 1.9.6 to 1.10.0 and observed that some requests return 502 right after we upgraded. We then downgraded and saw the 502s drop to previous numbers.
One instance where we observe it is when setting Transfer-Encoding: chunked twice, once in code and once via Spring Boot. We also observe the following error message in the logs:
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): 1.25.3
Kubernetes version (use kubectl version): v1.28.5
Environment:
uname -a:
Please mention how/where the cluster was created, like kubeadm/kops/minikube/kind etc.
kubectl version
kubectl get nodes -o wide
helm ls -A | grep -i ingress
helm -n <ingresscontrollernamespace> get values <helmreleasename>
If helm was not used, then copy/paste the complete precise command used to install the controller, along with the flags and options used
if you have more than one instance of the ingress-nginx-controller installed in the same cluster, please provide details for all the instances
Current State of the controller:
kubectl describe ingressclasses
kubectl -n <ingresscontrollernamespace> get all -A -o wide
kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
Current state of ingress object, if applicable:
kubectl -n <appnamespace> get all,ing -o wide
kubectl -n <appnamespace> describe ing <ingressname>
Others:
kubectl describe ...
of any custom configmap(s) created and in use
How to reproduce this issue:
We have a service that sometimes answers with 2x Transfer-Encoding: chunked and sometimes with 1x Transfer-Encoding: chunked. When it answers with 2x the header, the request does not reach the client and the ingress answers with 502.
We hook into the ingress via port-forward to reproduce it, but have no better setup for local reproducing.
<send requests via postman>
Observe 502 & the log line when the pod answers with the duplicate Transfer-Encoding: chunked header, and a regular 200 when answering with one header (a command sketch of these steps follows below).
We have also observed this for other clients with other backends, but so far have only one particular endpoint reproduced with "always errors when X and always ok when Y".
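A hedged sketch of those reproduction steps as plain commands, assuming the controller runs in the ingress-nginx namespace behind the default service name and reusing the host/path from the error log above; curl stands in for Postman here:

```sh
# forward the controller's HTTP port to localhost (namespace/service name are assumptions)
kubectl -n ingress-nginx port-forward svc/ingress-nginx-controller 8080:80

# in a second shell: request the endpoint that answers with the duplicate
# Transfer-Encoding header (host and path taken from the log line above)
curl -v -X POST -H "Host: chunky-demo.dev.enjoydevops.com" \
  http://127.0.0.1:8080/chunky/10/1
```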
Anything else we need to know:
We have now deployed a separate ingress @ 1.10.0 (with a separate ingress class) and can observe this behaviour there, while our "hot" ingress that gets the traffic is back on 1.9.6: the 1.10.0 instance still breaks while we have an operational ingress on 1.9.6. It does sound somewhat similar to spring-projects/spring-boot#37646.