Java for High-Performance Computational Engg. / Sciences?

In the past few days at iMechanica, there have been quite a few messages dealing with different aspects of programming, libraries and so on...

It would perhaps be timely, therefore, to ask:

Do you have any opinion about using Java in numerical analysis (NA) / FEM / CFD etc.---i.e., in computational engineering and sciences (CES)?

Do you have any experience or hard data concerning performance of Java vis-a-vis C++ or FORTRAN, esp. for large systems, or for high-performance applications? Any pointers?


Please leave a note... Thanks in advance...

-----

A few clarifications:

-- This is not about starting platform wars/flames, just an honest query. (However improbable it may sound, it's true).

-- I personally have no opinion about Java in CES. My first choice is C++ and that's what I use all the time anyway. I sometimes run into Web pages that claim that C++ templates are now achieving as much performance as FORTRAN (or better). I, therefore, indirectly conclude that FORTRAN must in general be somewhat faster than C++. Yet, I like C++ (esp. the VC++ 6 IDE and all), and since I am so comfortable with it (15 consecutive years by now), I use it---neither FORTRAN nor Java nor C# (and certainly not VB!!).

-- I don't use Java, but suspect that it (or C#) would be slow for high-performance applications. This could be either because they are interpreted, or, even if someone produces native code out of them, because, if C++ itself tends to be slower than FORTRAN, then Java (and C#) couldn't fare much better.

-- Yet, I also run into things like JAMA, Java Numerics, MTJ, Colt, etc. These seem to be serious toolkits... I mean, they also have MPI on Java and so on...

Therefore, the above question. 

-- Also feel free to add any other toolkit that is relevant to CES. (Commercial ones are OK.)

Comments

Hi Ajit,

I do not know about Java in CES. But for a C++ and Fortran performance comparison, please take a look at

http://ubiety.uwaterloo.ca/~tveldhui/papers/DrDobbs2/drdobbs2.html

From my working experience, I have used one very good C++ FEM code (written by a Dutch company) and the famous FEAP. To be honest, the C++ code is very much faster than FEAP. But this still cannot say much, since most commercial FEM codes are written in Fortran, and who dares to say it's slow :)

I am interested in following this discussion.

From my experience, Java is slower than Fortran and C/C++, but Java is getting better with time. I think the issue is not so much of speed, but the fact that Java numerical libraries are not yet fully matured when compared with its Fortran and C/C++ counterparts.

I can only comment on the sole Java library I have used - the Matrix Toolkits for Java (MTJ). The efficiency of solving a linear system Ax=b also depends very much on how you store the matrix A in computer memory. The first problem I see with MTJ is this: to use the direct solver in MTJ, one has to store the A matrix in a dense or structured sparse form. Memory problems arise when one deals with large, dense A matrices. The second problem is that if you have a large unstructured sparse matrix, you can store it in one of the standard sparse matrix forms, such as compressed row or column storage, but you will only be able to use iterative solvers. Iterative solvers do not work well for certain matrices. The latter problem highlights the infancy of sparse solver technology within Java. As far as I know, this problem also applies to many other Java numerical libraries.
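
For readers who have not met these storage formats, here is a bare-bones sketch of compressed row storage (CRS); it is written in C++ purely for illustration, and MTJ's own classes of course differ in their details. Only the nonzeros are kept, which is why iterative solvers (whose main kernel is the matrix-vector product below) pair so naturally with this format.

    #include <cstddef>
    #include <vector>

    // Minimal compressed row storage (CRS): only the nonzero entries are kept.
    // row_ptr has n+1 entries; row i's nonzeros live in values[row_ptr[i] .. row_ptr[i+1])
    struct CrsMatrix {
        std::size_t n;                     // matrix dimension (n x n)
        std::vector<double> values;        // nonzero values, stored row by row
        std::vector<std::size_t> col_idx;  // column index of each nonzero
        std::vector<std::size_t> row_ptr;  // start of each row within values/col_idx

        // y = A*x: the kernel an iterative solver (CG, GMRES, ...) calls repeatedly.
        void multiply(const std::vector<double>& x, std::vector<double>& y) const {
            for (std::size_t i = 0; i < n; ++i) {
                double sum = 0.0;
                for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                    sum += values[k] * x[col_idx[k]];
                y[i] = sum;
            }
        }
    };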

I get around this problem by calling Fortran and/or C/C++ libraries from Java using the Java Native Interface (JNI). This, however, makes portability a problem because the Fortran and/or C/C++ libraries need to be recompiled on the native machine. It also is a little tedious.
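
For anyone who has not tried the JNI route, the native side of such a bridge looks roughly like the sketch below. The names here (NativeSolver, solve, nativesolver) are made up; the exported symbol name is dictated by whatever Java class and method you actually declare, and this native library is exactly the part that has to be recompiled on every platform.

    // Java side (hypothetical):
    //   public class NativeSolver {
    //       static { System.loadLibrary("nativesolver"); }
    //       public static native void solve(double[] a, double[] b, double[] x, int n);
    //   }
    //
    // Native (C++) side: the symbol Java_NativeSolver_solve follows from the class/method names.
    #include <jni.h>

    extern "C" JNIEXPORT void JNICALL
    Java_NativeSolver_solve(JNIEnv* env, jclass,
                            jdoubleArray a, jdoubleArray b, jdoubleArray x, jint n) {
        // Get (possibly copied) pointers to the Java arrays so a Fortran/C solver can use them.
        jdouble* pa = env->GetDoubleArrayElements(a, NULL);
        jdouble* pb = env->GetDoubleArrayElements(b, NULL);
        jdouble* px = env->GetDoubleArrayElements(x, NULL);

        // ... call the native Fortran/C/C++ solver here (e.g. an LAPACK routine) ...

        // Mode 0: copy any changes back to the Java arrays and release the buffers.
        env->ReleaseDoubleArrayElements(a, pa, 0);
        env->ReleaseDoubleArrayElements(b, pb, 0);
        env->ReleaseDoubleArrayElements(x, px, 0);
    }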

Java has to be interpreted by the virtual machine and that slows it down.  There are Java native compilers that can be used to bypass the virtual machine.  I wouldn't recommend Java for anything more than prototype testing or GUI generation.  And even for those tasks scripting tools like Python are significantly more convenient. 

-- Biswajit 

I think that the performance of the Java virtual machine relative to Fortran or C++ is a secondary issue these days. The "just-in-time" (JIT) compilers that are used in state-of-the-art Java implementations are producing quite efficient code.  JIT has the potential to generate code that is even faster than traditional compiled languages like Fortran and C++.

There was quite a lot of enthusiasm for Java in the numerical methods community about 7-8 years ago. (http://math.nist.gov/javanumerics/). If you look at that site you will see the last post is around 2002. My impression is that there is very little activity now in the Java numerics area. As you can see from this site, Sun was opposed to several changes to Java advocated by the numerical methods community. One of these was adding an efficient multi-dimensional array type.

I've been using C++ for numerics for around 15 years and Fortran before that. Seems like for the past ten years or so, C++ has always been a year or two away from becoming the "standard" language for numerical programming. It certainly isn't there yet. I would really like to see Java evolve to be a contender in this area.

 

Bill

Hi all,

Thanks for your comments.

About C++ and FORTRAN: Actually, yes, secretly I did believe that C++ performance would be at par with FORTRAN's, but just wanted to err on the safer side, that's all. ... (Ditto when it comes to assembly programmers---I not only never argue with them, I don't even begin by claiming that C++ can do as well as, if not better than, assembly. Again, one likes to err on the safer side...)

About Java, C#, etc.: These days, optimizing compilers really do an outstanding job... For example, consider HPJava, Carta Blanca, etc.

It's really time that Java became a serious contender. Or, C#. (I don't care... But the point is, if there are more languages in the fray, the matter will get better attention.)

BTW, Biswajit, as far as Python and Java go, the relationship can be funnier than it appears at first sight---have a look at the Jython project (no misspelling here).

About C++: IMHO the reason why C++ always remains 1 or 2 years away from being the standard language for numerical programming is: Absence of a standard library.

Here, I can't help but recall that famous Bell Labs proverb: Language design is library design. (Words correct?)... C++ really reached a distinctly different level of acceptance once the STL got standardized. Something similar needs to happen for numerical programming. But there is an amazing lack of coordination or sharing of interests when it comes to the CES or NA community. ... It is amazing how everyone keeps on repeatedly writing the same routines... (Consider, just as an example, how standard the Numerical Recipes text is, and yet, how everyone implements exactly the same algorithms again, even if only to escape copyrights...)

Here, if Java and other OO languages come into the picture and propose some new libraries, then, I believe, this more general issue of standardization in C++ will also get the push it badly needs.

To begin with, however, I would like it if the existing "PACKS" and all came with more developer- and student-friendly distributions (e.g. together with VC++ project files---not just GCC on top of Linux) and also carried more user-friendly documentation (e.g. more detailed and easily accessible documentation on concepts, richer or more detailed sample programs, etc.)

The benefits of standardization are obvious---not just FEM and CFD, but all computational sciences in general stand to benefit.... And, also, the teaching of engineering courses...

Here are a few comments on the previous post.

 >I did believe that C++ performance would be at par with FORTRAN's,

Yes, it has been shown that it's possible to write C++ code that is as fast as or faster than the equivalent fortran code, but it is much more difficult. This particularly applies to code involving matrix operations. Getting fast C++ code for, say, a matrix multiply requires relatively sophisticated programming and a good compiler; the most basic implementation of a C++ matrix class with a matrix multiply method will be significantly slower than the simple fortran matrix multiply subroutine any fortran programmer can write.
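
To make the point concrete, here is a hedged sketch of the two extremes in C++: a textbook triple loop, and the same loop reordered so that the inner accesses are unit-stride for row-major storage. Real high-performance code goes much further (blocking, expression templates, vendor BLAS), but even this one change usually recovers a large part of the gap.

    #include <vector>

    // C = A * B for n x n matrices stored row-major in flat std::vector<double>.

    // Naive i-j-k ordering: the inner loop strides through B column-wise,
    // which is cache-unfriendly for row-major storage.
    void matmul_naive(const std::vector<double>& A, const std::vector<double>& B,
                      std::vector<double>& C, int n) {
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j) {
                double sum = 0.0;
                for (int k = 0; k < n; ++k)
                    sum += A[i * n + k] * B[k * n + j];
                C[i * n + j] = sum;
            }
    }

    // i-k-j ordering: both inner accesses are now unit-stride, which alone
    // often closes much of the distance to a plain Fortran loop.
    void matmul_ikj(const std::vector<double>& A, const std::vector<double>& B,
                    std::vector<double>& C, int n) {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) C[i * n + j] = 0.0;
            for (int k = 0; k < n; ++k) {
                const double aik = A[i * n + k];
                for (int j = 0; j < n; ++j)
                    C[i * n + j] += aik * B[k * n + j];
            }
        }
    }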

 >But the point is, if there are more languages in the fray, the matter will get better attention.

I think that having more languages in the fray actually slows down progress toward a replacement for fortran for numerical programming. The diversion of the numerical methods community into Java several years ago, I think, hurt progress in C++. But it sure does encourage diverse thinking in numerical programming.

>numerical programming is: Absence of a standard library.

 I completely agree with this. But as much as I admire the STL, design and implementation of a C++ matrix class library is much harder. The first question is, "What should a C++ matrix class library contain?" Operator overloading for arithmetic operations? This makes implementation of a high performance library much, much harder. What about matrix types-- dense, symmetric, banded, general sparse? Presumably if you have these different types you want to allow operations among them. There have been many C++ matrix class libraries implemented (Blitz++, uBlas, tvmet, MTL2, MTL4, Newmat) that meet some of these requirements, but not all. It's just very, very hard.
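
As an illustration of why overloaded arithmetic makes a high-performance library so much harder, here is a deliberately naive sketch: every operator returns a freshly built temporary, so an expression such as D = A*B + C constructs and copies whole intermediate matrices before the final assignment. Expression-template libraries such as Blitz++ exist largely to make this syntax possible without those temporaries.

    #include <cstddef>
    #include <vector>

    // Deliberately naive dense matrix: every operator builds a full temporary.
    class Matrix {
    public:
        explicit Matrix(int n) : n_(n), data_(static_cast<std::size_t>(n) * n, 0.0) {}
        double&       operator()(int i, int j)       { return data_[i * n_ + j]; }
        const double& operator()(int i, int j) const { return data_[i * n_ + j]; }
        int size() const { return n_; }
    private:
        int n_;
        std::vector<double> data_;
    };

    // A + B: one temporary.
    Matrix operator+(const Matrix& A, const Matrix& B) {
        Matrix R(A.size());
        for (int i = 0; i < A.size(); ++i)
            for (int j = 0; j < A.size(); ++j)
                R(i, j) = A(i, j) + B(i, j);
        return R;
    }

    // A * B: another temporary, so D = A*B + C builds two intermediate matrices.
    // This hidden allocation and copying is the cost the naive approach carries.
    Matrix operator*(const Matrix& A, const Matrix& B) {
        Matrix R(A.size());
        for (int i = 0; i < A.size(); ++i)
            for (int j = 0; j < A.size(); ++j)
                for (int k = 0; k < A.size(); ++k)
                    R(i, j) += A(i, k) * B(k, j);
        return R;
    }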

 >example, how standard the Numerical Recipes text is, and yet, how everyone implements exactly the same algorithms again

 This is not completely true. If you are happy with a basic fortran or C interface to numerical methods, Lapack is freely available and provides a wide array of linear algebra operations. The netlib repository provides many more freely available libraries for other numerical tasks. The numerical methods community has done a wonderful job of making numerical software freely available. The problem, as mentioned above, is that replacing fortran or C implementations with object-oriented numerical methods is a tough problem.

The increasing importance of multi-cpu computers adds yet another dimension of complexity to the problem.

--> Any idea about the reason why a basic implementation of matrix multiplication ought to run significantly slower in C++ as compared to one in FORTRAN? How much is the difference? Even a simple, non-rigorous comparison would be good to have...

The background I come from is the following. (i) I had heard that the STL was very neatly optimized. So, I wrote a difficult class all by myself: I implemented the map<> template (using the same red-black tree scheme that the STL map uses...). Working all alone, without spending months on it, and using a "poor" code-generator (VC++'s), my map still did well enough to come within 3% of the actual execution time of the highly well-tuned and professionally written STL. (My code can be made available on request.) (ii) From whatever compiler and OS background I have, including possible differences in loading and linking, memory management and all, I still believe that C++ and FORTRAN, using today's compilers (VC++ 6 onwards), should not---I repeat, *should* not---differ by more than 5% to 10% or so. The reason is that C and C++ are so close to the real machine. Indeed, I would expect even a straightforward implementation to be within 5%, but go up to 10% only to be on the safer side....
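
For the "simple, non-rigorous comparison" I asked about above, something like the following minimal C++ timing harness (sizes arbitrary) would do on the C++ side; transcribing the same triple loop into a small Fortran program and compiling both at the same optimization level gives the other data point.

    #include <cstdio>
    #include <ctime>
    #include <vector>

    int main() {
        const int n = 500;   // arbitrary size; adjust to taste
        std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n, 0.0);

        const std::clock_t t0 = std::clock();
        for (int i = 0; i < n; ++i)          // plain triple loop, no tricks
            for (int j = 0; j < n; ++j) {
                double sum = 0.0;
                for (int k = 0; k < n; ++k)
                    sum += A[i * n + k] * B[k * n + j];
                C[i * n + j] = sum;
            }
        const std::clock_t t1 = std::clock();

        const double secs = double(t1 - t0) / CLOCKS_PER_SEC;
        // Print an element of C so the compiler cannot optimize the loop away.
        std::printf("n = %d, time = %.3f s, C[0] = %.1f\n", n, secs, C[0]);
        return 0;
    }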

If you (or anyone else) has hard data to show otherwise, I would be very surprised---and also very interested in finding out why.

---> I beg to differ about whether having more OO languages would hurt C++ or not... But I won't argue the case---it might get too lengthy, I am afraid.

---> About the standard library: Nope, operator overloading would be a relatively small part of it... We could even have ordinary member functions without any operator overloading, and then operators could be added without so much as a one-microsecond run-time cost per * overloaded between two matrices. (What's the time taken to set up a stack-frame and undo it? Negligible. Practically zero.)
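
A minimal sketch of what I mean: the named function does all the O(n^3) work, and the overloaded operator is nothing but a thin forwarder, so the overload itself adds only a function call. (This sketch says nothing about the temporaries issue Bill raises above, which is a separate design question.)

    #include <cstddef>
    #include <vector>

    class Matrix {
    public:
        explicit Matrix(int n) : n_(n), a_(static_cast<std::size_t>(n) * n, 0.0) {}
        int size() const { return n_; }
        double&       at(int i, int j)       { return a_[i * n_ + j]; }
        const double& at(int i, int j) const { return a_[i * n_ + j]; }

        // Ordinary named function does all the real work.
        static void multiply(const Matrix& A, const Matrix& B, Matrix& C) {
            const int n = A.size();
            for (int i = 0; i < n; ++i)
                for (int j = 0; j < n; ++j) {
                    double s = 0.0;
                    for (int k = 0; k < n; ++k) s += A.at(i, k) * B.at(k, j);
                    C.at(i, j) = s;
                }
        }
    private:
        int n_;
        std::vector<double> a_;
    };

    // The operator is a thin forwarder: its own cost is one call frame,
    // negligible next to the work inside multiply().
    inline Matrix operator*(const Matrix& A, const Matrix& B) {
        Matrix C(A.size());
        Matrix::multiply(A, B, C);
        return C;
    }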

So, the basic issue is not syntactical sugar, but having a good, well-tested implementation that one knows is standard across all users, platforms, compilers. Something that one could prescribe to students without a moment's hesitation---as a standard part of the language itself.

As to the various matrix types you mention (dense, symmetric, banded, general sparse, etc.): why, we could have different classes for them... Don't we also have a queue and a deque; a list and a vector; a set and a map; a queue and a stack; etc. in the STL?
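
A hedged sketch of what "different classes for them" might look like; the class names are made up, and a real library would still have to settle how mixed-type operations behave.

    #include <cstddef>
    #include <vector>

    // Each storage scheme gets its own class, analogous to vector/list/map in the STL.

    class DenseMatrix {                 // full n x n storage
    public:
        explicit DenseMatrix(int n) : n_(n), a_(static_cast<std::size_t>(n) * n, 0.0) {}
        double& operator()(int i, int j) { return a_[i * n_ + j]; }
    private:
        int n_;
        std::vector<double> a_;
    };

    class BandedMatrix {                // only the diagonals within the bandwidth
    public:
        BandedMatrix(int n, int bandwidth)
            : n_(n), bw_(bandwidth),
              a_(static_cast<std::size_t>(n) * (2 * bandwidth + 1), 0.0) {}
        // Valid only for |i - j| <= bandwidth; entries outside the band are implicitly zero.
        double& operator()(int i, int j) { return a_[i * (2 * bw_ + 1) + (j - i + bw_)]; }
    private:
        int n_, bw_;
        std::vector<double> a_;
    };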

Plus, standardization doesn't mean special-purpose vendors cease to exist... Look at the business that class library suppliers are doing...

I think even if getting people to agree on certain common things is hard, it is possible. That is the lesson of the STL. After all, general purpose (GP) libraries are put to even more diverse use, and so involve even more unruly people... But Stroustrup and co. did it! And did it so admirably well!! XML standards are yet another success story. If GP software succeeds, why not numerical software?

Ok, let me stick my neck out and say that if Jack Dongarra meets up with Stroustrup a couple of times, and they both decide to have something like a standard library for NA (or, at least, matrix ops), then the rest of the people, I bet, will simply fall in line... It indeed is useful to get "celebrity" people to endorse causes :)

So, not just a freely available library (like netlib) but a standard one---that's the need of the hour.

--> And, concerning Java, did you look up those Carta Blanca simulations?
