Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix buffer size issue for AF_NETLINK #8

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Leseratte10
Copy link
Contributor

This should hopefully fix #5 in all cases, by almost completely rewriting getsockname and getpeername It does fix the issue with AF_NETLINK I had with my application in particular.

It also fixes another issue with the buffer size. Previously, if the __addr buffer in getsockname or getpeername was too small for the struct, your code just aborted and returned -1.

According to the documentation though, the returned data is supposed to be truncated to fit into the buffer, and the real needed buffer size is supposed to be returned through __len so the calling application can either decide to retry with a larger buffer, or be satisfied with the partial response.

Also, getsockname is the IP address the local side of the socket is working on/with (your IP). So if you open a socket and connect to a remote server, getsockname is supposed to return your own IP. What happens currently is that in this case the library just "throws" the IPv6 address back to the application, which is probably not going to end well when an IPv4-only application actually tries to use this. I've modded it to return the IPv4 address 0.0.0.0 instead for the local socket.

@Leseratte10
Copy link
Contributor Author

Leseratte10 commented Jan 7, 2023

I've updated the code a bit with the following things:

  • When execution starts, I'm allocating an array named socket_fd_flags. That array has four bits of space for each file descriptor. Then, when the application creates an IPv4 socket and tnat64 "upgrades" that to an IPv6 socket, the flag TNAT_FLAG_SOCK_UPGRADED_TO_IPV6 is set. If the application already requested an IPv6 socket, that flag is not set. That way, all the code will know if the socket was intended to be a native IPv6 socket or if the application thinks it's an IPv4 socket. The functions get_custom_fd_flags and set_custom_fd_flags can be used to access that array.
    (The three other bits are empty and could be used in the future when you need to store more information for each socket).
  • Currently, the code assumes that it can always send IPv4 packets through an IPv6 socket - but some systems may not support dual-stack sockets, or an admin deliberately disabled them. With this change, the code calling "is_local" handling local IPv4 connections will only do that if the IPv6 socket supports sending packets to IPv4 destinations; same for the code for the NAT64-to-IPv4-fallback (at the end of connect).
  • If a connection to 127.0.0.1 fails because the socket only accepts connections to IPv6 destinations, it now tries to connect to ::1 instead.
  • getpeername and getsockname check the socket flags mentioned before, and if the flag TNAT_FLAG_SOCK_UPGRADED_TO_IPV6 is NOT set, it immediately passes execution through to the original implementation of getpeername or getsockname so the rest of the code, including the IP address shenanigans, only occur for IPv4 sockets that the library upgraded to IPv6. That also makes it less likely that bugs in the library break rarely-tested types like AF_NETLINK or AF_UNIX again.

@Leseratte10 Leseratte10 force-pushed the fix-netlink branch 3 times, most recently from 43dd434 to 23cad4a Compare January 7, 2023 18:30
@andrewshadura
Copy link
Owner

Wow, this is huge 🙂

I’m thinking, if we’re going this big, wouldn’t it better to rewrite it in Rust? Doing what you’re trying to do in this PR would be much easier. I tried prototyping something last weekend, and I’ve got some things working, but I’m super early in it, I only wrap the four functions and add debug prints.

How’s your Rust? Would you be able to contribute to a Rust codebase?

@Leseratte10
Copy link
Contributor Author

How's my Rust? Well, I know how to spell the word "Rust" and I know that it's a programming language, that's about it ^^ Not sure how easy it would be to get into Rust to a level to be able to make helpful PRs. Although, if you have something working in Rust, couldn't hurt to have both a C and a Rust variant of the library. I do know that any other programming language other than C is probably way easier to write safe code that handles all corner cases.

I was planning, once this PR is merged, to continue with #6 (different prefix sizes), and once that's done, try adding support to autodiscover the NAT64 prefix (RFC7050). And then, if possible, UDP support, and then the library should be fairly feature-complete.

@Leseratte10
Copy link
Contributor Author

So, any status update regarding this? From my side, this PR should be done unless you have further comments regarding something I should change. Would you be willing to merge this so I can continue working on the other stuff I've planned to implement (prefix autodiscover, UDP, ...); or do you plan to no longer maintain this version in favour of a Rust rewrite? If so, is that already available somewhere for me to take a look at?

@andrewshadura
Copy link
Owner

Hi,
Sorry, I haven’t had time for this yet. The Rust rewrite was just a prototype and I don’t expect it to be ready soon yet.
I’ll try to have a look over the weekend.

@Leseratte10
Copy link
Contributor Author

Thanks for the response. No problem, I just wasn't sure if you were waiting for another response from me, that's why I asked.

@Leseratte10
Copy link
Contributor Author

I've added two more changes:

  • The socket flag array is now allocated upon first usage, rather than at library start. This means that if the library is loaded into a program that doesn't do networking, or a program that's fully IPv6-aware, it won't waste RAM space for that never-used table.
  • Applications running as root are allowed to increase / decrease the file descriptor limit, so just allocating once at program start isn't enough. Now the array is re-allocated if it's too small because the limit has increased.

I hope all that new malloc/realloc code is correct ...

@Leseratte10 Leseratte10 force-pushed the fix-netlink branch 8 times, most recently from 669dad5 to 6aa3140 Compare January 26, 2023 18:11
@Leseratte10
Copy link
Contributor Author

@andrewshadura Any update yet?

@Leseratte10
Copy link
Contributor Author

Is there any update yet @andrewshadura ? Do you plan to review and merge this eventually, or would I be better off maintaining my own fork of tnat64?

@andrewshadura
Copy link
Owner

I’m really sorry for delaying it for this long. Adding this to my calendar to look at tomorrow.

Co-authored-by: Andrej Shadura <[email protected]>
@andrewshadura
Copy link
Owner

So, there’s a lot of code that seemingly looks okay, but I didn’t review in depth yet, but I’m thinking — we need to have a way to test it properly. So I’ve spent some hours yesterday adding a simple test based on #1 and #3, and also tried out Meson as a build system 🙂

@Leseratte10 Leseratte10 force-pushed the fix-netlink branch 4 times, most recently from 7a8ae29 to f4f9e06 Compare May 4, 2023 13:38
@Leseratte10
Copy link
Contributor Author

Looks like one of your tests is already failing in this branch. Question is, is my code broken or is your test broken? :)

@Leseratte10
Copy link
Contributor Author

Looks like the test is broken. With my changes it just no longer hits that code path. What was that test (01-correct-ipv6-address) supposed to test? A test for the invalid IPv6s I saw in #1 ?

@andrewshadura
Copy link
Owner

andrewshadura commented May 4, 2023 via email

@Leseratte10
Copy link
Contributor Author

Ah, so basically just revert the existing test? Fail if it finds that string, success otherwise?

@andrewshadura
Copy link
Owner

andrewshadura commented May 4, 2023 via email

@Leseratte10
Copy link
Contributor Author

Leseratte10 commented May 4, 2023

When executing this test, the current main version prints this:

16:48:39 libtnat64(1610858): Opening configuration file (/etc/tnat64.conf)
16:48:39 libtnat64(1610858): New network entry for 10.0.0.0
16:48:39 libtnat64(1610858): New network entry for 127.0.0.0
16:48:39 libtnat64(1610858): Got getsockname call for socket 3
16:48:39 libtnat64(1610858): Address family is AF_INET6
16:48:39 libtnat64(1610858): Checking if IPv6 address :: is behind the NAT64...
16:48:39 libtnat64(1610858): Checking the default NAT64 prefix 64:ff9b::
16:48:39 libtnat64(1610858): Got getsockname call for socket 6
16:48:39 libtnat64(1610858): Address family is AF_INET
16:48:39 libtnat64(1610858): Got getsockname call for socket 6
16:48:39 libtnat64(1610858): Address family is AF_INET6
16:48:39 libtnat64(1610858): Checking if IPv6 address ::1 is behind the NAT64...
16:48:39 libtnat64(1610858): Checking the default NAT64 prefix 64:ff9b::

(and the original test on main passes).

But the version inside this PR (#8) prints this:

16:49:49 libtnat64(1611368): Checking file descriptor limits - current limit: 1024, max limit: 1048576
16:49:49 libtnat64(1611368): Perform initial flag array allocation (size 524289 bytes, 1048576 entries)
16:49:49 libtnat64(1611368): Allocated the flag buffer to size of 524289 (1048576 entries)!
16:49:49 libtnat64(1611368): Opening configuration file (/etc/tnat64.conf)
16:49:49 libtnat64(1611368): New network entry for 10.0.0.0
16:49:49 libtnat64(1611368): New network entry for 127.0.0.0
16:49:49 libtnat64(1611368): Got getsockname call for socket 3
16:49:49 libtnat64(1611368): None of our modded sockets, call real getsockname
16:49:49 libtnat64(1611368): Got getsockname call for socket 6
16:49:49 libtnat64(1611368): None of our modded sockets, call real getsockname
16:49:49 libtnat64(1611368): Got getsockname call for socket 6
16:49:49 libtnat64(1611368): None of our modded sockets, call real getsockname

(so the test fails)

Because "dig" is opening a UDP socket for DNS queries, and tnat64 is (currently) incapable of nat64'ing UDP connections, the calls skip most of the code and don't even run into the code path that had the issue with the corrupted logging which is tested by that test. So the test with dig is essentially useless. For a real test you'd need to use TCP, but I don't think an IP of 0.0.0.0 / :: will work there either.

EDIT: I do know that I should still add a whole new test for the NETLINK issue in particular, but that's a different topic. This change / question was about the existing test (for bug #1) failing on this branch because the code that's being tested here is no longer even being executed.

@andrewshadura
Copy link
Owner

Ah, okay, I see.

@Leseratte10
Copy link
Contributor Author

Leseratte10 commented May 4, 2023

I just added a third test that verifies that netlink sockets can call getsockname without errors. This test fails on main but succeeds with this PR.

I hope a dependency on python for that test isn't an issue; I made sure to test it with python2 and python3; but that was the easiest way I could figure out to test a netlink socket.

IPV6_V6ONLY is only supposed to have the values 0 or 1.
However, the man page says to assume 0=false and everything else = true.
@Leseratte10
Copy link
Contributor Author

Any update yet? Is there something I still need to do, or do you just need to review this? Or do you want to switch to the new meson build system?

I'm hoping to be able to add a couple more features as well (UDP, auto-discovering the NAT64 prefix, ...) but I'd rather not have too many open PRs with different features because I don't want to run into more merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

getsockname fails for AF_NETLINK connections
2 participants