Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make static linking for nanobind_extension optional #7

Merged
merged 5 commits into from
Mar 10, 2024

Conversation

nicholasjng
Copy link
Owner

Results in a @nanobind//:libnanobind target producing a libnanobind.a file.

Results in a `@nanobind//:libnanobind` target producing a `libnanobind.a`
file.
@nicholasjng
Copy link
Owner Author

Notes:

I added the -ffunction-sections and -fno-data-sections explicitly here for Linux as per the comments in #1. However, based on the compile commands extracted with --compilation_mode=opt, it seems that Bazel will actually insert those flags also when building in "opt" (release) mode.

This means that we could in theory omit these flags here and bank on the assumption that the user compiles in "opt" mode. With the linker flags I'm not sure they can be omitted as easily.

What's missing is the definition of these flags in userland (i.e. in build_defs.bzl) so that unused library parts are actually stripped out when linking in libnanobind.a.

cc @wjakob, maybe you could take a look as well if you're interested and have time.

@wjakob
Copy link

wjakob commented Mar 7, 2024

This is purely an optimization thing. If Bazel adds these flags in optimization mode, then you don't need to do so on top.

@nicholasjng
Copy link
Owner Author

I'll leave them out of the compiler options, then. I still think the linker options are necessary, so I'll keep them for now and share how they influence the outputs.

By the way, for my understanding: The linker response file approach on Mac should not be necessary for libnanobind.a, since static linking means there will be no undefined symbols because the linker includes all Python APIs into the lib?

@wjakob
Copy link

wjakob commented Mar 7, 2024

You need the linker response file when libnanobind.a is linked into the final shared library constituting the compiled extension. Basically every time a shared library is compiled that uses Python C API symbols. If you compile libnanobind.so, then it's needed for that one too.

@nicholasjng
Copy link
Owner Author

Oh, thanks for clarifying. That would mean that it's currently missing in the actual bindings extension (notice there are no NANOBIND_LINKOPTS):

native.cc_binary(
name = name + ".so",
srcs = srcs,
copts = copts + NANOBIND_COPTS,
features = features + NANOBIND_FEATURES,
deps = deps + NANOBIND_DEPS,
linkshared = True,
linkstatic = True,
**kwargs
)

I only see green CI though. Why does this still work? I don't think it will just fall back to -Wl,-undefined dynamic_lookup, because that default was removed from Bazel before 7.0: bazelbuild/bazel#16414

(FWIW, Google Benchmark MacOS extensions are also green, so this seems robust for some reason.)

@wjakob
Copy link

wjakob commented Mar 7, 2024

Can you capture the actual linker command used and post them here?

@wjakob
Copy link

wjakob commented Mar 7, 2024

It's possible that everything might be fine if you only use nanobind code in the example binding without accessing the C API explicitly. Try adding something like PyUnicode_FromString("foo"); somewhere in the NB_MODULE.

@wjakob
Copy link

wjakob commented Mar 7, 2024

uh, something bad may be happening as well: what if Bazel links against the libpython of a specific Python version? Then it obviously won't complain about missing symbols, but you will end up with a built extension that is extremely difficult to redistribute to others.

@nicholasjng
Copy link
Owner Author

nicholasjng commented Mar 7, 2024

Trying on the google/benchmark repo, invocation:

bazel build -s //bindings/python/google_benchmark:_benchmark --compilation_mode=opt --cxxopt=-std=c++17 --@rules_python//python/config_settings:python_version=3.12 --@nanobind_bazel//:py-limited-api=cp312

Linker command looks inconspicuous (also not very helpful, as I can't see the linked libs for some reason):

SUBCOMMAND: # //bindings/python/google_benchmark:_benchmark.so [action 'Linking bindings/python/google_benchmark/_benchmark.so', configuration: 08c51e151f071c009f895c318cdee16e42af97eb2e93f8087e785edeffbcdfce, execution platform: @@local_config_platform//:host, mnemonic: CppLink]
(cd /private/var/tmp/_bazel_nicholasjunge/7931a2ee992544fe3d1024dd6c7623f6/execroot/_main && \
  exec env - \
    PATH=/Users/nicholasjunge/Workspaces/c++/benchmark/venv/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/Users/nicholasjunge/.pyenv/shims:/Users/nicholasjunge/.pyenv/shims:/opt/homebrew/share/google-cloud-sdk/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Users/nicholasjunge/.local/bin:/opt/homebrew/opt/fzf/bin \
    PWD=/proc/self/cwd \
    ZERO_AR_DATE=1 \
  external/bazel_tools~cc_configure_extension~local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/bindings/python/google_benchmark/_benchmark.so-2.params)
# Configuration: 08c51e151f071c009f895c318cdee16e42af97eb2e93f8087e785edeffbcdfce
# Execution platform: @@local_config_platform//:host

I think I'm doing something wrong here, because no libs show up at all.

Also: In the compile commands of the nanobind parts, I see -isystem bazel-out/darwin_arm64-opt/bin/external/rules_python~0.31.0~python~python_3_12_aarch64-apple-darwin/include/python3.12 -isystem external/rules_python~0.31.0~python~python_3_12_aarch64-apple-darwin/include/python3.12m include paths, but those should be fine I think.

BTW: Do you have a one-liner for a module-wide binding using e.g. PyUnicode_FromString for me to try and invoke? I attempted using m.def("foo", []() {return PyUnicode_FromString("foo")};, but that gives me errors like TypeError: Unable to convert function return value to a Python type! The signature was foo() -> _object.

@wjakob
Copy link

wjakob commented Mar 7, 2024

You don't need to bind the function at all, you can just add it to the code that runs when the module is loaded. E.g.

NB_MODULE(my_ext, m) {
(void) PyUnicode_FromString("foo");
}

@nicholasjng
Copy link
Owner Author

Addendum: Output of nm -gC bindings/python/google_benchmark/_benchmark.abi3.so | grep Py shows all Python symbols as undefined except the init hook:

➜ nm -gC bindings/python/google_benchmark/_benchmark.abi3.so | grep Py
                 U _PyBytes_Type
                 U _PyCMethod_New
                 U _PyCallable_Check
                 U _PyCapsule_GetPointer
                 [...]
00000000000021e8 T _PyInit__benchmark
                 [...]

@nicholasjng
Copy link
Owner Author

Addendum 2: The module imports just fine with (void) PyUnicode_FromString("foo"); added as the first line. (That symbol is also part of the stable ABI though, so I don't really know which libpython it comes from.)

@nicholasjng
Copy link
Owner Author

nicholasjng commented Mar 7, 2024

Sorry, I didn't grasp the output. The linker command above is correct, I just failed to understand that the single argument there is in fact a file containing the options.

This is from a stable ABI invocation on the nanobind_example:

➜ bazel build -s //src:nanobind_example_ext --cxxopt=-std=c++17 --macos_minimum_os=11.0 --compilation_mode=opt --@rules_python//python/config_settings:python_version=3.12 --@nanobind_bazel//:py-limited-api=cp312

➜ cat bazel-out/darwin_arm64-opt/bin/src/nanobind_example_ext.so-2.params
-shared
-o
bazel-out/darwin_arm64-opt/bin/src/nanobind_example_ext.so
-Xlinker
-rpath
-Xlinker
@loader_path/../_solib_darwin_arm64/_U@@rules_Upython~0.31.0~python~python_U3_U12_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Xlinker
-rpath
-Xlinker
@loader_path/nanobind_example_ext.so.runfiles/_main/_solib_darwin_arm64/_U@@rules_Upython~0.31.0~python~python_U3_U12_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Lbazel-out/darwin_arm64-opt/bin/_solib_darwin_arm64/_U@@rules_Upython~0.31.0~python~python_U3_U12_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
bazel-out/darwin_arm64-opt/bin/src/_objs/nanobind_example_ext.so/nanobind_example_ext.o
bazel-out/darwin_arm64-opt/bin/external/nanobind_bazel~override~internal_configure_extension~nanobind/libnanobind.a
-lpython3.12
-mmacos-version-min=11.0
-no-canonical-prefixes
-fobjc-link-runtime
-headerpad_max_install_names
-Wl,-dead_strip
-Wl,@external/nanobind_bazel~override~internal_configure_extension~nanobind/cmake/darwin-ld-cpython.sym
-lc++
-lm

Seems like it does in fact link in -lpython3.12, but also libpython from the hermetic Python 3.12 before that (-Lbazel-out/darwin_arm64-opt/bin/_solib_darwin_arm64/_U@@rules_Upython~0.31.0~python~python_U3_U12_Uaarch64-apple-darwin_S_S_Clibpython___Ulib).

@wjakob
Copy link

wjakob commented Mar 7, 2024

Great, looks like it's all there 👍

@nicholasjng
Copy link
Owner Author

Sweet! I think I got it now. One last question: Do I need to link libpython already into the @nanobind//:[lib]nanobind targets, or is it fine if I pass it only when linking the resulting .so extension containing the bindings?

@wjakob
Copy link

wjakob commented Mar 7, 2024

The question really only arises when building libnanobind as a shared library.

And regarding that, a higher level comment: you really don't ever want to link libpython into any target -- at least on macOS or Linux, because this would then tie the extension to a particular Python binary, which would make it tricky to redistribute on PyPI (On Windows, there is an exception -- you do need to link against a .lib file to explain to the linker that some symbols are resolved dynamically). But I think you are referring not to libpython the .so file, but libpython the bazel target. And in that case, I don't understand it well enough to comment.

What I would do is to capture the compiler and linker command line and compare to see if it produces the same set of flags as nanobind's CMake interface. You could systematically capture this (e.g. via CI) on the three platforms, for both static and dynamic libnanobind builds.

@nicholasjng
Copy link
Owner Author

Sure, I can set up a CI job for that. Your answer makes perfect sense, but doesn't that mean that the Python libs in the linker param file I shared above shouldn't be there?

@wjakob
Copy link

wjakob commented Mar 8, 2024

Oh right: it says -lpython3.12. That's not good.

@wjakob
Copy link

wjakob commented Mar 8, 2024

Imagine for example, that you tried to import such an extension library into a process that has Python statically linked in. The Blender 3d modeler with its Python plugin system would be a good example. The dynamic linker will then try to find another Python shared library and link it into the process, which will lead to a mixture of one initialized and one uninitialized interpreter in the same process. It will segfault immediately.

On both macOS and Linux, you need to leave CPython symbols "dangling" because they are provided by the process which dynamically imports the extension.

@nicholasjng
Copy link
Owner Author

Thank you, the dynamic linker mention finally made it click.

Luckily, that linker param file came out of a dirty build where I added the current Python libs as a direct dependency - on current master, they are left out and no Python libraries appear at all in the linker command, so right now it's correct.

Based on the CMake compile commands analysis of `nanobind_example`, the
`-fno-strict-aliasing` and `-fvisibility=hidden` flags are always added
to the builds in static mode.
…by default

This is practically equivalent to the previous configuration, since the
flags were largely the same.

What's still left to be figured out is whether the Python libs need to
go into the `nanobind_extension` target or into the nanobind deps.
@nicholasjng
Copy link
Owner Author

nicholasjng commented Mar 10, 2024

I believe that the latest change solves the outstanding issues: It

  1. removes the linkstatic = True default from the nanobind_extension, giving the option of linking everything shared (i.e. building @nanobind//:nanobind as a shared library), and building a static libnanobind.a if linkstatic = True.
  2. adds -Wl,--gc-sections and -fno-strict-aliasing to the build, removing the dead code from the static libnanobind.a.
  3. links the @rules_python//python/cc:current_py_cc_libs if and only if MSVC is used.

I'd consider the separate libnanobind-static target scrapped for now, until someone complains.

@nicholasjng nicholasjng changed the title Add static libnanobind target Make static linking for nanobind_extension optional Mar 10, 2024
@nicholasjng nicholasjng merged commit 5b82019 into master Mar 10, 2024
16 checks passed
@wjakob
Copy link

wjakob commented Mar 11, 2024

Hi @nicholasjng,

I don't know enough about bazel to understand all of the nuances. I am concerned about the following point though:

I'd consider the separate libnanobind-static target scrapped for now, until someone complains.

in the CMake version of the build system, the static libnanobind build is the actually the default. There are a several points that make this preferable to a shared build:

A PyPI or Conda-distributed extension will need to ship nanobind along in practice. You would not be able to rely on a precompiled libnanobind, say, from a central package repository. Besides a tricky inter-package shared library dependency, there is the obvious problem that you could not install two libnanobind versions a the same time to support several different packages. In the most common situation that the PyPI package contains a single shared library containing the extension, there is really no benefit to keeping the libnanobind parts separate (there is no sharing, which is the main purpose of a shared library).

This first part is mainly cosmetic in nature. But there are also two practical implications:

  1. Performance: a call into a shared library is always done indirectly through the Global Offset Table (GOT). It's a bit like peppering virtual function calls everywhere, which is an unwelcome overhead.

  2. Binary size: libnanobind contains lots of extras, for example the code to deal with nd-arrays or enumerations. If a project doesn't use those features and uses a static library build, then those will be stripped away and you get a smaller extension.

So this is all to say that there is some value in preserving a static libnanobind as the default in your bazel-based build system. Really the only reason where shared makes sense is if you have a really complex extension that contains multiple extension libraries that are compiled and linked separately. They can then share the libnanobind parts instead of having them replicated multiple times.

@nicholasjng
Copy link
Owner Author

Thank you for the comment. The following summarizes my thoughts and findings:

  1. The removal of linkstatic = True on the nanobind_extension is only a useability change to allow a False value, linkstatic = True is still the default, see https://bazel.build/reference/be/c-cpp#cc_binary.linkstatic.
  2. From the doc I just linked, the default linkstatic = True will actually make the @nanobind//:nanobind target produce a libnanobind.a file by default (the second bullet of the three there). I have confirmed this locally on my Apple M1 machine by extracting the compile commands, and I just added a CI job in Add an actions job for capturing compile commands on all platforms #11 to check how the other platforms behave.
  3. I understand the shipping concerns, and support stripping as much dead code as possible (I added the -Wl,--gc-sections / -Wl,-dead_strip parameters for it to @nanobind//:nanobind). But in that case, what is the point of a shared libnanobind? As I understand it, it would only do me good as long as I have lots of extensions, I keep everything local on my machine, and if I have control of the rest of my packaging story.

@wjakob
Copy link

wjakob commented Mar 11, 2024

But in that case, what is the point of a shared libnanobind? As I understand it, it would only do me good as long as I have lots of extensions, I keep everything local on my machine, and if I have control of the rest of my packaging story.

Let's say that you building a huge package (e.g. Tensorflow, for lack of imagination) using nanobind. Quite likely, that package doesn't just have a single Python extension library but a while bunch of them that are conditionally loaded. (E.g. one for CUDA, one for TPUs, etc.). In this case, your package could have the following contents

tensorflow/__init__.py
tensorflow/libnanobind-tensorflow.so
tensorflow/_tf_cuda.cpython-312-x86_64-linux-gnu.so
tensorflow/_tf_tpu.cpython-312-x86_64-linux-gnu.so
.. (potentially many more)

These binary extension files (_tf_*.so) share the libnanobind-tensorflow.so component instead of linking a static library into each one. But note that this libnanobind.so variant is local to the tensorflow package. In other words, it isn't copied into a central lib directory and shared with other Python packages since this would be really fragile.

Note also that this library was compiled with the NB_DOMAIN set to "tensorflow". This isolates the package from random nanobind settings/bindings that other packages may install, and it also gives it a unique name which is important for some platforms. (Windows, e.g., will not load two .dll files with the same name)

@nicholasjng
Copy link
Owner Author

Alright. Sounds like this is either a nanobind_extension(..., linkstatic = False), or a (currently unimplemented) nanobind_shared_library wrapping a cc_shared_library containing @nanobind. I think at this stage of the setuptools build, any shared nanobind lib would need to also be copied over to the wheel dist. Maybe this is a good idea for a more complex nanobind_example?

The NB_DOMAIN argument sounds interesting, maybe I can support that as a keyword argument.

@wjakob
Copy link

wjakob commented Mar 11, 2024

The NB_DOMAIN argument sounds interesting, maybe I can support that as a keyword argument.

That would be great! You are right that the nanobind_example is a bit on the simple side.

@nicholasjng nicholasjng deleted the static-libnanobind branch June 7, 2024 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants