JNI, Python and _Py_Zero that was not there

Recently, I ran into an issue with JNI code that calls a Python script through the shared libraries provided by Python itself.

The scheme is quite easy to follow, and you can easily apply it in any C-based code.

// Initialize Python (can be done once, for the whole life of the JVM)
Py_Initialize();

// Remember to acquire the GIL before calling into Python;
// the interpreter is not thread safe without it
PyGILState_STATE gstate = PyGILState_Ensure();

// c_str contains the content of the java.lang.String passed
// from Java (the conversion itself is omitted here)
const char *c_str;
PyRun_SimpleString( c_str );

// Once finished, remember to release the GIL
PyGILState_Release(gstate);

Py_Finalize();
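
The comment about initializing once can be made concrete. The snippet above starts and shuts down the interpreter on every call. Below is a minimal sketch of one-time, thread-safe setup using pthread_once. The names init_runtime and ensure_runtime_initialized are hypothetical, and init_runtime is only a placeholder for the real Py_Initialize() call, so that the sketch compiles without Python headers:

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t runtime_once = PTHREAD_ONCE_INIT;
static int init_calls = 0;  /* counts how often the initializer really ran */

/* Placeholder for the real work (e.g. Py_Initialize()); it is kept free of
 * Python headers so the sketch compiles on its own. */
static void init_runtime(void) {
  init_calls++;
}

/* Safe to call from every native method and from any thread:
 * pthread_once guarantees that init_runtime runs exactly once.
 * Returns the number of times the initializer has run so far. */
int ensure_runtime_initialized(void) {
  int rc = pthread_once(&runtime_once, init_runtime);
  assert(rc == 0);
  return init_calls;
}
```

In such a setup each native method would call ensure_runtime_initialized() first and then acquire the GIL as shown above. Py_Finalize() would presumably move to a shutdown path instead of running after every script.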

You can find a fully working sample (Linux/macOS) here: python via JNA. In one of our projects we had to go via JNI for some post-processing computations done by Python; otherwise, we suffered from huge performance drawbacks. In fact, we observed roughly 100 times faster execution when using Python through JNI together with some additional caching (reference). Note that we were running the code on 256 CPUs, with both Java and Python running on all of them. As for passing java.lang.String from Java to C, in case you are not familiar with it, take a look here: passing java.lang.String value from Java to C.

Usually, this approach works perfectly fine, unless something is not quite OK. Last week I got a really strange error from a very simple Python script with numpy installed. The error was:


File "../python2.7/site-packages/numpy/core/__init__.py", line 24, in <module>
raise ImportError(msg)
ImportError:
Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control). Otherwise reinstall numpy.

which is not quite promising. It looks like something is really broken, and it looks serious. Now, let’s check what exactly is happening there. Let’s take a look at __init__.py inside core.

try:
    from . import multiarray
except ImportError:
    msg = """
Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control).  Otherwise reinstall numpy.
"""
    raise ImportError(msg)

What we can see here is that multiarray is the one to blame. We need to find out what multiarray is.

> find . -name "multiarray*"
./numpy/core/multiarray_tests.so
./numpy/core/multiarray.so

In fact, it’s a shared library that is loaded from Python. We need to know why it fails inside the JVM. Do you remember the execution scheme?

JVM -> SomeJavaClass -> JNI -> Python library (libpython2.7) -> numpy

Let’s see what happens if we try to load this library inside the JVM. We can use super simple code for that:

public class TestMulti {
 
  static {
    // System.loadLibrary takes name of library as argument
    // System.load takes full path of file
    System.load("../numpy/core/multiarray.so");
  }
 
  public static void main(String[] args) {}
}

Now we are talking! It looks like multiarray.so is not able to find some symbols.

> java -Djava.library.path=... -cp . TestMulti
Exception in thread "main" java.lang.UnsatisfiedLinkError: .../multiarray.so: undefined symbol: _Py_ZeroStruct
  at java.lang.ClassLoader$NativeLibrary.load(Native Method)
  at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
  at java.lang.Runtime.load0(Runtime.java:809)
  at java.lang.System.load(System.java:1086)
  at python.PythonJustLoad.<clinit>(PythonJustLoad.java:14)

The symbol _Py_ZeroStruct comes from libpython2.7. You can check it quite easily:

> nm .../libpython2.7.so | grep _Py_Zero
00000000003c6cc0 D _Py_ZeroStruct

The location of libpython2.7 is listed in LD_LIBRARY_PATH as well, so the library itself can be found:

> env | grep LD_LIBRARY_PATH | tr ':' '\n' | grep python
.../python/2.7

Digging through StackOverflow led me to people who had similar issues: Java: load shared libraries with dependencies. I know that JNA solves the issue; however, I preferred to stick to JNI. So, we need to make sure that symbols from libpython2.7 are loaded properly while still using JNI. We have two options here. The first one is LD_PRELOAD:

> export LD_PRELOAD=.../libpython2.7.so

This approach forces the library to be loaded prior to any other library loaded by the process:

> man ld.so
...
...
LD_PRELOAD
              A list of additional, user-specified, ELF shared libraries to be loaded before all others.
              The items of the list can be separated by  spaces  or  colons. This  can  be  used  to 
              selectively override functions in other shared libraries.  The libraries are searched for 
              using the rules given under DESCRIPTION. For set-user-ID/set-group-ID ELF binaries, preload 
              pathnames containing slashes are ignored, and libraries in the standard search directories  
              are  loaded only if the set-user-ID permission bit is enabled on the library file.
...
...

However, this is not quite what we want. This way, the Python library gets loaded all over the place: each and every process started with this environment will load it. What we can do instead is to force loading of the Python library directly within the JNI code:

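#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

// Handle of libpython2.7, kept for the whole life of the JVM
static void *handle = NULL;

void loadLibrary() {
  if(handle == NULL) {
    // RTLD_GLOBAL makes libpython2.7 symbols visible to libraries loaded later
    handle = dlopen("libpython2.7.so", RTLD_LAZY | RTLD_GLOBAL);
    if (!handle) {
      fprintf(stderr, "%s\n", dlerror());
      exit(EXIT_FAILURE);
    }
  }
}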

...
...

loadLibrary();

// Initialize Python (can be done once, for the whole life of the JVM)
Py_Initialize();

// Remember to acquire the GIL before calling into Python;
// the interpreter is not thread safe without it
PyGILState_STATE gstate = PyGILState_Ensure();

// c_str contains the content of the java.lang.String passed
// from Java (the conversion itself is omitted here)
const char *c_str;
PyRun_SimpleString( c_str );

// Once finished, remember to release the GIL
PyGILState_Release(gstate);

Py_Finalize();

Two things here. First, we enforce RTLD_GLOBAL and make symbols from libpython2.7 visible to multiarray.so (note that this is not the default behavior of JNI: the JVM loads libraries using dlopen(filename, RTLD_LAZY)). Second, we are inside the JVM itself, and the library is loaded only when the code is actually executed. And that’s what solves the issue.
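
A related variant, not taken from the original sample: according to the dlopen man page, the RTLD_NOLOAD flag does not load anything, but it can promote an already loaded library to RTLD_GLOBAL. The JNI library presumably links against libpython2.7 already, so the library may well be resident by the time loadLibrary() runs. A minimal sketch, assuming Linux with glibc; the helper name promote_to_global is hypothetical:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: if `lib` is already loaded into the process, re-open
 * it with RTLD_GLOBAL so that its symbols join the global scope.
 * Returns 1 when the library was resident, 0 when it was not loaded. */
int promote_to_global(const char *lib) {
  assert(lib != NULL);
  /* RTLD_NOLOAD: never load the file, only look at what is resident */
  void *h = dlopen(lib, RTLD_NOLOAD | RTLD_LAZY | RTLD_GLOBAL);
  /* The handle is intentionally kept open for the life of the process */
  return h != NULL;
}
```

With the setup from this post, promote_to_global("libpython2.7.so") would be called before Py_Initialize(). A return value of 0 means the library is not resident yet, and a regular dlopen as in loadLibrary() is still needed.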

Have fun with JNI!

References:

1. OpenJDK – loading libraries
2. Java: load shared libraries with dependencies
3. man ld.so
4. Embedding Python code

If you are looking for more JNI samples, take a look here: http://jnicookbook.owsiak.org