Comments on Marginally Interesting: Matrices, JNI, DirectBuffers, and Number Crunching in Java

But I think that the basic problem remains that th...

2009-05-12T01:39:00.000+02:00

But I think that the basic problem remains that the JVM is oblivious to the memory allocated "off the JVM heap" and it won't be as aggressive reclaiming those buffers if they aren't accessible anymore.

I suppose this reply is a bit after the fact now, but what I meant was to allocate your own block of memory using, say, the C library's malloc(), wrapping that pointer using NewDirectByteBuffer(), and using this block as an NIO buffer for however long you need it. Then, when you're done, use the C library's free() to release the block of native memory and set your Java reference to null. Since you manually reclaim, the timing of Java's GC isn't important anymore.

Hi Steve,thanks for your comments. Regarding your ...

2008-11-10T16:10:00.000+01:00

Hi Steve,

thanks for your comments.

Regarding your question, I'm still calling JNI; I'm only storing the matrices as double arrays now. This works very nicely, although one has to do some things in Java now (especially those which take linear time, like copying or even vector addition). Luckily, Java appears to be as quick as the native code there.
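To illustrate what "storing the matrices as double arrays" and doing the linear-time operations in Java can look like, here is a minimal sketch. It assumes column-major layout (the convention BLAS and LAPACK expect); the class and method names (FlatMatrixOps, index, add, copy) are made up for illustration and are not taken from any library.

```java
import java.util.Arrays;

final class FlatMatrixOps {
    /** Position of element (i, j) in a column-major array with the given row count. */
    static int index(int i, int j, int rows) {
        return i + j * rows;
    }

    /** Element-wise sum of two equally long arrays; one pass, linear time. */
    static double[] add(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double[] result = new double[x.length];
        for (int k = 0; k < x.length; k++) {
            result[k] = x[k] + y[k];
        }
        return result;
    }

    /** Copy done entirely on the Java side, without crossing into native code. */
    static double[] copy(double[] x) {
        return Arrays.copyOf(x, x.length);
    }

    public static void main(String[] args) {
        // A 2x3 matrix stored column by column: columns are (1,2), (3,4), (5,6).
        double[] a = {1, 2, 3, 4, 5, 6};
        System.out.println(a[index(1, 2, 2)]);            // row 1, column 2
        System.out.println(Arrays.toString(add(a, copy(a))));
    }
}
```

Since the same flat array can be handed to a native routine unchanged, only the cheap operations need a Java implementation.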

Thanks also for the link to the JVM languages group. I think it would already help if the JVM would take the memory allocated by the direct buffers into account and trigger garbage collections accordingly.

Concerning your plans with Scala, I'm setting up a m@tlab lookalike in JRuby myself, and it works pretty well, in particular because you can add all kinds of syntactic sugar on the Ruby side without subclassing. No idea how that would work in Scala...

We're just now brushing up the documentation a bit, so look out for our first release of jblas.

Hi Mikio,Thanks for the writeup ... I'm always cur...

2008-11-10T15:32:00.000+01:00

Hi Mikio,

Thanks for the writeup ... I'm always curious about numerical stuff in Java, so I found this very interesting.

Two comments:

(1) I'm not quite sure what you mean by this from your post on Nov 6:

"""... have recently converted all my matrix code to using normals double arrays."""

Are you not calling out to JNI anymore? Do you just mean that your n-dim matrix is now stored as a 1d double array? (I think this is actually how COT stores its n-dim matrices, also).

(2) I think you might get some good feedback if you pop into the JVM Languages group[1] and ask them if they have any input into your situation.

While your problem isn't really related to new languages hosted on the JVM, the folks there are quite friendly and really know the JVM internals inside and out.

Anyway, I'd be interested in any follow-up posts you have about related work (subscribing to your RSS feed now), as I'd like to actually use Scala for similar work down the road.

-steve

[1] JVM Languages Group: http://groups.google.com/group/jvm-languages

Hi Kiran,thanks for pointing me to MTJ. I think I ...

2008-11-07T10:54:00.000+01:00

Hi Kiran,

thanks for pointing me to MTJ. I think I have already taken a look at it, but forgot about it in the meantime. I couldn't really see all their source code, but it seems that they have taken the route of primitive arrays as well.

Their performance numbers are in fact quite interesting and also underline the need for highly optimized native matrix routines. There are just so many things you have to get right, from memory locality down to how to order the floating point operations to fill your pipelines optimally, and that is probably hard to control in Java.

It seems that MTJ similarly relies on BLAS and LAPACK, and it also looks pretty complete feature-wise. In our project JBLAS we have taken a slightly different approach with the interface. For example, there are only 2d matrices. It turns out that the separation between vectors and matrices which only have one row or one column is a bit artificial and often requires you to recast vectors as matrices and vice versa. We have also used a lot of overloading to make using the classes as comfortable as possible.
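As a rough sketch of the "only 2d matrices plus overloading" idea: a vector is just a matrix with a single column, and the same method name accepts either a matrix or a scalar. The class below (TinyMatrix, with columnVector and add) is a hypothetical illustration written for this comment, not the actual JBLAS interface.

```java
final class TinyMatrix {
    final int rows;
    final int columns;
    final double[] data; // column-major, length rows * columns

    TinyMatrix(int rows, int columns, double... data) {
        if (data.length != rows * columns) {
            throw new IllegalArgumentException("wrong number of entries");
        }
        this.rows = rows;
        this.columns = columns;
        this.data = data.clone();
    }

    /** A "vector" is simply an n-by-1 matrix; there is no separate vector type. */
    static TinyMatrix columnVector(double... entries) {
        return new TinyMatrix(entries.length, 1, entries);
    }

    /** Element-wise sum with a matrix of the same shape. */
    TinyMatrix add(TinyMatrix other) {
        if (rows != other.rows || columns != other.columns) {
            throw new IllegalArgumentException("shape mismatch");
        }
        double[] sum = new double[data.length];
        for (int k = 0; k < data.length; k++) {
            sum[k] = data[k] + other.data[k];
        }
        return new TinyMatrix(rows, columns, sum);
    }

    /** Overload: add the same scalar to every entry. */
    TinyMatrix add(double scalar) {
        double[] sum = new double[data.length];
        for (int k = 0; k < data.length; k++) {
            sum[k] = data[k] + scalar;
        }
        return new TinyMatrix(rows, columns, sum);
    }
}
```

With this shape, `TinyMatrix.columnVector(1, 2, 3).add(1.0)` and a matrix-plus-matrix sum go through the same type, so no recasting between vectors and matrices is needed.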

Looks like it's time to release JBLAS soon!

I am not sure if you have seen this, MTJ: http://r...

2008-11-06T19:09:00.000+01:00

I am not sure if you have seen this, MTJ: http://ressim.berlios.de/. They have native and ATLAS-based implementations of linear algebra stuff. Their benchmarks are interesting.

Hi Jesse,thanks for pointing me to the NewDirectBy...

2008-11-06T11:58:00.000+01:00

Hi Jesse,

thanks for pointing me to the NewDirectByteBuffer() function. But I think that the basic problem remains that the JVM is oblivious to the memory allocated "off the JVM heap" and it won't be as aggressive reclaiming those buffers if they aren't accessible anymore.
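To make the "off the JVM heap" point concrete, here is a small sketch that allocates a direct buffer and prints the heap usage next to the size of the direct buffer pool. It assumes a Java 8 or later runtime (newer than this discussion), where the standard BufferPoolMXBean reports a pool named "direct"; the class and method names (DirectBufferDemo, directBytesInUse, allocateDirectDoubles) are made up for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

final class DirectBufferDemo {
    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if no such pool is reported. */
    static long directBytesInUse() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    /** Allocates n doubles outside the Java heap, in native byte order. */
    static DoubleBuffer allocateDirectDoubles(int n) {
        return ByteBuffer.allocateDirect(n * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long heapBefore = rt.totalMemory() - rt.freeMemory();

        // Roughly 80 MB of matrix data that the heap accounting does not see.
        DoubleBuffer matrix = allocateDirectDoubles(10_000_000);

        long heapAfter = rt.totalMemory() - rt.freeMemory();
        System.out.println("heap growth (bytes):   " + (heapAfter - heapBefore));
        System.out.println("direct pool (bytes):   " + directBytesInUse());
        System.out.println("buffer is direct:      " + matrix.isDirect());
    }
}
```

The heap figure barely moves while the direct pool grows by the full buffer size, which is why heap pressure alone gives the collector little reason to run and release unreachable buffers.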

By the way, I just couldn't stand the instability anymore (having your machine freeze a few times a day on larger computations due to memory leaks is not nice) and have recently converted all my matrix code to using normal double arrays. It actually only took a few hours and now even huge computations run very smoothly.

I had to make sure that I do some stuff in Java now, in particular when the complexity is linear in the data. For example, copying data with dcopy just doesn't make any sense. On the other hand, for the really computationally expensive operations like matrix-matrix computations, the copying costs are negligible. And then there is the stuff like computing eigenvalues or solving linear equations, which I don't even want to know too much about.
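A back-of-the-envelope sketch of why the copying is negligible for matrix-matrix products but not for linear-time operations. The counts are assumptions for illustration: a textbook product of two n-by-n matrices costs about 2n^3 floating point operations, and the JNI call moves about 3n^2 doubles (two inputs in, one result out); the class name CopyOverhead is hypothetical.

```java
final class CopyOverhead {
    /** Doubles moved across the JNI boundary for C = A * B with n-by-n matrices (assumed: 2 in, 1 out). */
    static double copiedElements(long n) {
        return 3.0 * n * n;
    }

    /** Approximate floating point operations of a textbook n-by-n matrix product. */
    static double multiplyFlops(long n) {
        return 2.0 * n * n * n;
    }

    /** Copied elements per floating point operation; shrinks like 1.5 / n. */
    static double ratio(long n) {
        return copiedElements(n) / multiplyFlops(n);
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 100, 1000}) {
            System.out.println("n = " + n + ", copies per flop = " + ratio(n));
        }
    }
}
```

For a vector operation such as a copy or an addition, both the data moved and the work done grow linearly, so the ratio stays constant and the copying never amortizes.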

So, stability-wise, it looks very nice now. I'll tweak the performance a bit more until I'm happy with it, and then I think I'll release my matrix library.

Nothing should prevent you from doing your own mem...

2008-11-05T19:29:00.000+01:00

Nothing should prevent you from doing your own memory management on buffers. Instead of having the JVM allocate the direct buffer, with all the uncertainty of its lifespan, allocate it yourself from native code, and then use the JNI function NewDirectByteBuffer() to wrap the memory in a ByteBuffer for use on the Java side. When you want to free the backing buffer, use JNI and the native free (or whatever memory management you're using).