-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Patch RELOC_REL32 cases where offset is > 2GiB #52
base: master
Are you sure you want to change the base?
Conversation
5a98545
to
218ec6f
Compare
Thanks for working on this! (FTR, #2 has a similar proposal, but the discussion there died quickly. Contrary to the current PR, #2 generates the trampoline at link time (independently of the target distance) rather than at load time (the flexdll runtime does not need to be changed).) Some comments/questions:
=>
producing:
(not sure if the prefix is required), which is what #2 was proposing to use.
|
Hah - I hadn't realised that there was another PR which had worked on this! I'm still working through understanding this all properly - in particular, there's something weird happening with on mingw64 when the DLL is located far away where I'm certainly only seeing the trampolines being generated for C primitives in the runtime, yes. AFAICT so far, the default FWIW, I think patching the runtime is better - definitely on mingw64 and I think now on msvc64, the default load addresses for executables should be < 0xffffffff (I think that the issue seen with this in 2008 no longer exists). It seems unnecessary to be generating these jump tables when most of the time they're not needed. That said, on Cygwin64 they will always be needed! This present implementation has a bug at the moment which becomes revealed if you load two DLLs (like ocamldoc, loading str and unix) - the cached trampolines for dllunix can be made to be too far away for dllstr, so I need to use a per-dll symbol cache which should actually make that code a bit neater anyway. It's also unnecessarily doing all this for 32-bit. I won't have much time on this today, but I'll be back on it with a vengeance tomorrow! |
Note: the indirect jump instruction is also what is used for __declspec(dllimport) void f(void);
void foo() { f(); } ==> rex.W jmp *__imp_f(%rip) Btw, adding back |
218ec6f
to
a524a67
Compare
That change still doesn't deal with the |
f47e73a
to
5d79979
Compare
OK, I think this is now functionally complete. d0e07c2 includes building ocamldoc for the two OCaml tests which is a good idea in general, and should go in a separate PR. ocamldoc is a useful bytecode test for flexdll, because it uses both unix and str. a84f9fb artificially places dllunix and dllcamlstr at base addresses which are both more than 2GiB from ocamlrun and also more than 2GiB from each other. I don't think we should commit with this: there are two better tests:
|
03fe127
to
d1f2385
Compare
Until ocaml/flexdll#52 is resolved, Cygwin64 is essentially a broken platform for OCaml.
Until ocaml/flexdll#52 is resolved, Cygwin64 is essentially a broken platform for OCaml.
Until ocaml/flexdll#52 is resolved, Cygwin64 is essentially a broken platform for OCaml.
Until ocaml/flexdll#52 is resolved, Cygwin64 is essentially a broken platform for OCaml.
Until ocaml/flexdll#52 is resolved, Cygwin64 is essentially a broken platform for OCaml.
I'm going to close, on the basis that I no longer think we should fix it. There are three reasons for this:
|
Previously: - -base applied to MSVC64 - Cygwin64 had -base 0x10000 hard-coded until ocaml#89 - mingw64 didn't specify -base at all Now: - The default is not to change the base address at all - All 3 64-bit ports support -base - For now, still ignored for 32-bit ports
An additional executable section is added to each executable containing 16 bytes for each external symbol. This can then be used by flexdll_relocate_v2 if a RELOC_REL32 refers to a function more than 2GiB away.
fc5a68f
to
21a2211
Compare
21a2211
to
f1fd723
Compare
relocate function now takes a pointer to the jmptbl page allocated in the image being loaded. If a RELOC_REL32 relocation is encountered where the offset is greater than 2GiB then 16 bytes of the page is used to generate a trampoline to the function. The new functionality is exposed as flexdll_relocate_v2 and passed in the FLEXDLL_RELOCATE_V2 environment variable. The old flexdll_relocate pointer is still passed in FLEXDLL_RELOCATE by flexdll_dlopen meaning that an executable compiled with this new version can still load DLLs compiled with older versions (these will simply fail if the relocations are too far away, as before). Similarly, flexdll_init falls back to FLEXDLL_RELOCATE if FLEXDLL_RELOCATE_V2 is not set meaning that a DLL compiled with this new version can also be loaded by an executable compiled with older versions. The symbol table now has to be copied to be augmented with an extra pointer for the trampoline, as the symtbl itself cannot be changed without breaking this backwards compatibility.
Verify that the page containing the target symbol is marked executable before creating a trampoline.
Each unit has enough jmptbl space to be able to have trampolines for all of the symbols it imports. However, if two units load within 2GiB of each, the previous trampolines can be reused. Turns out that this is simpler than either resetting or storing multiple trampolines.
The wrong DLLs can end up being loaded while building OCaml :-/
Need some kind of check on the length of ptr->name (cf. cannot_resolve_msg)
f1fd723
to
3dc05ee
Compare
Previously, if a DLL was loaded more than 2GiB away from the executable image in the address space, flexdll_relocate would fail if there were any RELOC_REL32 relocations as the offset couldn't be represented in a 32-bit number.
The solution at the time was to specify a base address of 0x10000 for the MSVC64 and Cygwin64 ports (for some reason, mingw64 got forgotten about). For MSVC64, this generally seemed to work. For Cygwin64, this broke Unix.fork. Recent changes to Cygwin64 also mean that Cygwin DLLs and executables are not supposed to occupy memory below
0x2:00000000
and0x1:00000000
respectively.This patch creates a spare executable section in the DLL which is used to create trampolines for relocations where the offset is over 2GiB. As a result, it's no longer necessary to specify the image base address, so this default has been removed and both the Cygwin64 fork issue and RELOC_REL32 are simultaneously solved.
This needs quite a lot of testing which I'm doing, but it would also be handy for the code to start being reviewed. The easiest option would have been to break backwards compatibility (format of
symtbl
would have changed and the number of parameters forflexdll_relocate
). The version I give here allows mixing of DLLs and executables compiled with both this new version and older versions.