Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timing in 23connect.t #7

Open
e477 opened this issue May 1, 2015 · 10 comments
Open

timing in 23connect.t #7

e477 opened this issue May 1, 2015 · 10 comments

Comments

@e477
Copy link

e477 commented May 1, 2015

23connect.t exposes an interesting timing issue. The proxy accepts the CONNECT and the 'is' in line 69 passes, but then reading from the socket in line 76 returns 'undef' and the test fails. However, the test passes when I try to monitor its behaviour with tcpdump.

With logging enabled:

t/23connect.t ............. 
[Fri May  1 03:11:08 2015] (6858) PROCESS: Forked child process 6859
[Fri May  1 03:11:08 2015] (6859) SOCKET: New connection from 127.0.0.1:47405
[Fri May  1 03:11:08 2015] (6859) REQUEST: CONNECT localhost:59128
[Fri May  1 03:11:08 2015] (6859) REQUEST: Via: 1.1 earl:0 (HTTP::Proxy/0.303)
[Fri May  1 03:11:08 2015] (6859) REQUEST: Host: localhost:59128
[Fri May  1 03:11:08 2015] (6859) REQUEST: User-Agent: libwww-perl/6.13
[Fri May  1 03:11:08 2015] (6859) REQUEST: X-Forwarded-For: 127.0.0.1
t/23connect.t ............. 1/4 
#   Failed test 'Read some data from the socket'
#   at t/23connect.t line 80. 
#   Failed test 'CONNECTed to the TCP server and got the banner'
#   at t/23connect.t line 81. 
#          got: undef
#     expected: 'President_of_Earth Barbarella Professor_Ping Stomoxys Dildano
# ' 
[Fri May  1 03:11:08 2015] (6859) RESPONSE: 200 OK
[Fri May  1 03:11:08 2015] (6859) RESPONSE: Date: Fri, 01 May 2015 01:11:08 GMT 
[Fri May  1 03:11:08 2015] (6859) RESPONSE: Transfer-Encoding: chunked
[Fri May  1 03:11:08 2015] (6859) RESPONSE: Server: HTTP::Proxy/0.303
[Fri May  1 03:11:08 2015] (6859) CONNECT: Connection closed by the client
[Fri May  1 03:11:08 2015] (6859) PROCESS: Served 1 requests
[Fri May  1 03:11:08 2015] (6859) CONNECT: Read undef from server (Bad file descriptor)
[Fri May  1 03:11:08 2015] (6859) CONNECT: End of CONNECT proxyfication
[Fri May  1 03:11:09 2015] (6858) PROCESS: Reaped child process 6859
[Fri May  1 03:11:09 2015] (6858) PROCESS: 0 remaining kids: 
[Fri May  1 03:11:09 2015] (6858) STATUS: Processed 1 connection(s)
t/23connect.t ............. Failed 2/4 subtests 

With logging enabled and tcpdump running in the background:

t/23connect.t ............. 
[Fri May  1 03:13:55 2015] (6943) PROCESS: Forked child process 6944
[Fri May  1 03:13:55 2015] (6944) SOCKET: New connection from 127.0.0.1:49532
[Fri May  1 03:13:55 2015] (6944) REQUEST: CONNECT localhost:55409
[Fri May  1 03:13:55 2015] (6944) REQUEST: Via: 1.1 earl:0 (HTTP::Proxy/0.303)
[Fri May  1 03:13:55 2015] (6944) REQUEST: Host: localhost:55409
[Fri May  1 03:13:55 2015] (6944) REQUEST: User-Agent: libwww-perl/6.13
[Fri May  1 03:13:55 2015] (6944) REQUEST: X-Forwarded-For: 127.0.0.1
[Fri May  1 03:13:55 2015] (6944) RESPONSE: 200 OK
[Fri May  1 03:13:55 2015] (6944) RESPONSE: Date: Fri, 01 May 2015 01:13:55 GMT
[Fri May  1 03:13:55 2015] (6944) RESPONSE: Transfer-Encoding: chunked
[Fri May  1 03:13:55 2015] (6944) RESPONSE: Server: HTTP::Proxy/0.303
[Fri May  1 03:13:55 2015] (6944) CONNECT: 62 bytes received from server
t/23connect.t ............. 1/4 
[Fri May  1 03:13:55 2015] (6944) CONNECT: Connection closed by the client
[Fri May  1 03:13:55 2015] (6944) PROCESS: Served 1 requests
[Fri May  1 03:13:55 2015] (6944) CONNECT: End of CONNECT proxyfication
t/23connect.t ............. 4/4 
[Fri May  1 03:13:56 2015] (6943) PROCESS: Reaped child process 6944
[Fri May  1 03:13:56 2015] (6943) PROCESS: 0 remaining kids:
[Fri May  1 03:13:56 2015] (6943) STATUS: Processed 1 connection(s)
t/23connect.t ............. ok

Could this be the result of the client trying to read from the proxy before the proxy has connected to or read from the server?

@book
Copy link
Owner

book commented May 1, 2015

This is the most likely issue indeed. Looking at the history of this test script, I've been fiddling with sleep a lot.

Among the things I'd try:

  • sleep some more in the client
  • use a lock to block the client until the server is ready
  • do a half-close (instead of a close) in the server

Since you have a setup where this test seems to reliably fail, could you try some of those?

@e477
Copy link
Author

e477 commented May 1, 2015

It's easy getting the test to pass by appending a "sleep 1;" after getting the client_socket (line 66). I'm not sure, though, if this is the right approach and whether the bug is in the test (e.g. client has to wait) or in HTTP-Proxy itself (e.g. connect to server first, then send "200 OK" to client).

I'll have a further look at it, but reading the RFCs and getting myself familiar with the HTTP-Proxy code will take a while.

@e477
Copy link
Author

e477 commented May 17, 2015

It's an error in the test.

The LWP-UserAgent works in non-blocking mode, so when we take the 'client_socket' and read via 'getline' while the server isn't ready yet, the read doesn't block and we get an EOF immediately. You can trigger the problem on a system where it normally doesn't occur by inserting a 'sleep 2;' before sending the server banner. Adding something like

unless ($sock->blocking) {
    diag("socket is not blocking, switching to blocking mode");
    $sock->blocking(1);
}   

after line 65 should fix it.

book added a commit that referenced this issue May 26, 2015
Quoting the report from e477 in #7:

    The LWP-UserAgent works in non-blocking mode, so when we take the
    'client_socket' and read via 'getline' while the server isn't ready
    yet, the read doesn't block and we get an EOF immediately. You can
    trigger the problem on a system where it normally doesn't occur by
    inserting a 'sleep 2;' before sending the server banner.
@book
Copy link
Owner

book commented May 26, 2015

Thanks, I applied the fix (and removed all sleep calls) in e6e9f75.

@book book closed this as completed May 26, 2015
@ppisar
Copy link

ppisar commented Jun 9, 2015

This does fix the issue for me.

@book
Copy link
Owner

book commented Jun 9, 2015

Thanks for the feedback. I'll make a release soon.

@ppisar
Copy link

ppisar commented Jun 10, 2015

Sorry. I wanted to write that it does NOT fix the issue for me. I still get random failures.

@book
Copy link
Owner

book commented Jun 10, 2015

Oh. Then I have some questions for you:

  • Do you see the socket is not blocking, switching to blocking mode diagnostic line when running the test?
  • Which OS are you using?
  • Which version of Perl? Do the random failures appear with different Perl versions ? (If you have access to more than one).

@book book reopened this Jun 10, 2015
@spotrh
Copy link

spotrh commented Jun 15, 2015

t/22transparent.t ......... ok
# socket is non blocking, switching to blocking mode
#   Failed test 'Read some data from the socket'
#   at t/23connect.t line 79.
#   Failed test 'CONNECTed to the TCP server and got the banner'
#   at t/23connect.t line 80.
#          got: undef
#     expected: 'President_of_Earth Barbarella Professor_Ping Stomoxys Dildano
# '
t/23connect.t ............. 
Failed 2/4 subtests 

This is Fedora Rawhide (Fedora 23) with perl 5.22.0. I cannot reproduce the failure on Fedora 22 (perl 5.20.2).

Here's what it looks like when it passes on Fedora 22:

[Mon Jun 15 11:58:27 2015] (16546) PROCESS: Forked child process 16547
[Mon Jun 15 11:58:27 2015] (16547) SOCKET: New connection from 127.0.0.1:37580
[Mon Jun 15 11:58:27 2015] (16547) REQUEST: CONNECT localhost:34152
[Mon Jun 15 11:58:27 2015] (16547) REQUEST: Via: 1.1 localhost.localdomain:0 (HTTP::Proxy/0.303)
[Mon Jun 15 11:58:27 2015] (16547) REQUEST: Host: localhost:34152
[Mon Jun 15 11:58:27 2015] (16547) REQUEST: User-Agent: libwww-perl/6.13
[Mon Jun 15 11:58:27 2015] (16547) REQUEST: X-Forwarded-For: 127.0.0.1
[Mon Jun 15 11:58:27 2015] (16547) RESPONSE: 200 OK
[Mon Jun 15 11:58:27 2015] (16547) RESPONSE: Date: Mon, 15 Jun 2015 15:58:27 GMT
[Mon Jun 15 11:58:27 2015] (16547) RESPONSE: Transfer-Encoding: chunked
[Mon Jun 15 11:58:27 2015] (16547) RESPONSE: Server: HTTP::Proxy/0.303
[Mon Jun 15 11:58:27 2015] (16547) CONNECT: 62 bytes received from server
[Mon Jun 15 11:58:27 2015] (16547) CONNECT: Connection closed by the server
[Mon Jun 15 11:58:27 2015] (16547) PROCESS: Served 1 requests
[Mon Jun 15 11:58:27 2015] (16547) CONNECT: End of CONNECT proxyfication
# socket is non blocking, switching to blocking mode
[Mon Jun 15 11:58:28 2015] (16546) PROCESS: Reaped child process 16547
[Mon Jun 15 11:58:28 2015] (16546) PROCESS: 0 remaining kids: 
[Mon Jun 15 11:58:28 2015] (16546) STATUS: Processed 1 connection(s)
t/23connect.t ............. ok

I tried to get it to fail again with logging enabled in Rawhide... and it succeeded. :/ I'm guessing this is a timing issue.

@ntyni
Copy link

ntyni commented Oct 13, 2015

I'm able to reproduce this issue on Debian, and it looks like the problem here is Net::HTTP (as used by LWP::Protocol::http) buffering. I've got "strace -e read,write" output from a failing case which shows the LWP::UserAgent client getting the content in the same read call with the header fields. Later explicit read() calls on the socket then get EOF, resulting in empty data.

With a diagnostic like
is( $res->code, 200, "The proxy accepts CONNECT requests" );
diag "Net::HTTP buffer: " . $sock->_rbuf if $sock->_rbuf_length;
I indeed see on failure:

ok - The proxy accepts CONNECT requests
# Net::HTTP buffer: President_of_Earth Barbarella Professor_Ping Stomoxys Dildano
not ok - Read some data from the socket
#   Failed test 'Read some data from the socket'
#   at t/23connect.t line 77.
not ok - CONNECTed to the TCP server and got the banner
#   Failed test 'CONNECTed to the TCP server and got the banner'
#   at t/23connect.t line 78.
#          got: undef
#     expected: 'President_of_Earth Barbarella Professor_Ping Stomoxys Dildano
# '

I can't see an interface to get the buffered data from LWP::UserAgent, so this looks to me like a bug in LWP::UserAgent HTTP CONNECT support?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants