


Mark Lam's Blog

Performance Archives


Java and More Embedded Considerations

Posted by mlam on April 22, 2007 at 03:46 AM | Permalink | Comments (0)

Previously, I talked about why an embedded systems developer would choose to develop on the Java platform. If you have read that article and are intrigued by the benefits that the Java platform offers, then the next step is probably to ask some more deep probing questions like ...

Do I really need the Java platform?
Sure, the Java platform offers many benefits. But is it needed for my specific device?

Well, if you want the benefits of a runtime interpreted scripting language (e.g. isolation and upgradeability), then, as I have explained previously, your best bet is the Java platform.

You may not need the Java platform if your device has the following characteristics:

  1. static functionality: the software functionality never needs to be upgraded in the field, not even for bug fixes. Or, it is cheaper to replace the device than to replace the software (although this isn't very eco-responsible). Or you can live with the cost of providing the service and infrastructure to completely re-flash the software in deployed devices. Under these conditions, you do not need the Java platform's dynamic class loading/unloading feature.

  2. simple software: the software application is extremely simple. When the number of lines of code is low enough, the complexity of the software may be manageable, and not be overwhelming. Hence, the likelihood of your programmers being able to understand the entire system is higher, and the number of details for them to remember is lower. Under this condition, you can live without the Java platform's isolation property and language features (e.g. protection from stray pointers, automatic garbage collection, structured locking, etc.), and still be able to get a reasonable amount of developer productivity.

    I will also talk more about "simple software" from the perspective of performance and footprint below.

  3. small and restricted developer group: if the group of developers is small, then they are easier to manage. The likelihood that their code will accidentally step on each other's code is lower. If there is only one developer group for the software and you will never need other groups or third parties to develop software for your device, then there is no risk of their code trampling on your code. Under these circumstances, you may not need the Java platform's isolation property and security features.

  4. software can always be trusted: If there will never be any software deployed on your device that is from an untrusted source (i.e. can perform attacks on or crash your device), then you may not need the Java platform's isolation property and security features. Or alternatively, if you don't care if they attack and crash your device, then you may not need the Java platform.

Generally, if your situation doesn't fit into one of the above profiles, then it is likely that you will benefit from developing on the Java platform.
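
To make item 1's "dynamic class loading" a little more concrete, here is a minimal sketch of loading a class by name at run time. It is written against a current Java SE rather than CDC, and the class name DynamicLoadingSketch and the helper instantiate are made up for this illustration. On a real device, the class would typically come from newly downloaded code rather than from the platform library.

```java
// Illustrative sketch only; DynamicLoadingSketch and instantiate() are
// hypothetical names, not part of any device software.
final class DynamicLoadingSketch {

    // Look up a class by its name at run time and create an instance
    // through its no-argument constructor.
    static Object instantiate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Wrap the checked exception to keep the calling code short.
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class to use is chosen by a string, not fixed at compile time.
        Object list = instantiate("java.util.ArrayList");
        System.out.println(list.getClass().getName());
    }
}
```

Because the class is selected by name while the program runs, functionality can be swapped or added without re-flashing the whole software image, which is the property item 1 says a static device can do without.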

Continue Reading...



JIT Performance: Defying Physics?

Posted by mlam on February 21, 2007 at 05:56 PM | Permalink | Comments (6)

A few days ago, I came across a few blog entries that referenced my previous article. They are: When is software faster than hardware? by Matthew Schmidt, and Can JIT'ed Code be Faster than Hardware Accelleration by Kirk Pepperdine. These blog entries had received some comments that I thought deserved a response. So below, I will try to address issues raised in some of those comments, as well as provide an intuitive understanding of why you would expect a JIT to outperform a JPU.

Resources: When is Software faster than Hardware?, Software Territory: Where Hardware can't go!

Let's start with ...

Physics Shmeesics
How can software possibly run faster than hardware? I've made this statement myself numerous times in the past. What was I thinking when I made that statement? Well, basically, the thought goes: if a piece of software is running on some given hardware, the performance of that software is ultimately gated by the hardware it runs on. This can be illustrated with an analogy as follows ...

Continue Reading...



Software Territory: Where Hardware can't go!

Posted by mlam on February 16, 2007 at 02:27 AM | Permalink | Comments (15)

In response to my previous article, some folks have been asking about the JIT optimizations I listed, as well as a lot of other interesting questions. I'm not sure I can address all of the questions here. But on the topic of JIT optimizations, I can provide more insight on what they are as well as why hardware cannot implement them.

Before I get started, just to be clear, I'm not personally against hardware Java processors. I certainly think that they fit nicely in some domains. I am also not against any vendors who make Java processors out there. I applaud them for serving the needs of a market that a JIT may not fit. Also, just because a JIT fits doesn't mean that it is always the best solution to deploy. In a previous article, I've made the case that engineering decisions should always be made on a case by case basis. A "one size fits all" mentality can work, but may not always yield the best solution.

However, I do want to debunk the myth that a hardware processor can be faster than an optimizing JIT. But, of course, the JIT isn't free. There is some cost to it in terms of CPU cycles and memory, though it is often a lot less than most people believe. I will address the JIT cost issue in a future article. For today, let's look at JIT optimizations. Since I work on the phoneME Advanced VM for CDC (aka CVM), along the way, I'll point out if these optimizations are available in CVM as it exists today (for those who are interested in CVM details).

Resources: When is Software faster than Hardware?

JIT Optimizations
In my last entry, I rattled off a random list of JIT compiler optimizations. The list is by no means comprehensive, nor is it necessarily indicative of the most desirable optimizations to have in a JIT. Previously, I have explained how more performance isn't always a good thing. Each optimization comes with a cost of some sort. The VM/JIT engineer must weigh the cost against the benefits in choosing to include or leave out an optimization. That said, let's go over the optimizations I've already mentioned as examples to illustrate why a JIT has the advantage over Java processors when performance is the criterion of comparison.

The list again is:

  1. inlining
  2. constant folding
  3. loop unrolling
  4. loop invariant hoisting
  5. common subexpression elimination
  6. use of intrinsic methods
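
For readers who have not met these terms before, the sketch below applies the first five optimizations by hand to a small method. This is only an illustration: a real JIT performs these transformations on compiled code at run time, not on Java source, and the class and method names here (JitOptimizationSketch, sumNaive, sumOptimized) are made up for the example. Intrinsic methods are left out because they depend on the VM and the target CPU.

```java
// Hand-written illustration only; not taken from CVM or any real JIT.
final class JitOptimizationSketch {

    static final int SCALE = 4;

    // A small helper that an inlining compiler could copy into its callers.
    static int scale(int x) {
        return x * SCALE;
    }

    // What the programmer wrote.
    static int sumNaive(int[] data, int a, int b) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            // (a + b) appears twice and never changes inside the loop;
            // (2 + 3) is a constant expression.
            int offset = (a + b) * (2 + 3);
            sum += scale(data[i]) + offset + (a + b);
        }
        return sum;
    }

    // Roughly what the code looks like after the optimizations are applied.
    static int sumOptimized(int[] data, int a, int b) {
        int ab = a + b;                // common subexpression elimination
        int offset = ab * 5;           // constant folding: 2 + 3 becomes 5
        int perElement = offset + ab;  // loop invariant hoisting: computed once
        int n = data.length;
        int sum = 0;
        int i = 0;
        // Loop unrolling by a factor of two: half as many loop tests.
        for (; i + 1 < n; i += 2) {
            sum += data[i] * SCALE + perElement;      // inlined scale()
            sum += data[i + 1] * SCALE + perElement;  // inlined scale()
        }
        // Handle the leftover element when the length is odd.
        for (; i < n; i++) {
            sum += data[i] * SCALE + perElement;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(sumNaive(data, 2, 3));      // prints 210
        System.out.println(sumOptimized(data, 2, 3));  // prints 210
    }
}
```

Both methods compute the same result. The second does less work per element because the invariant arithmetic is done once and the call overhead is gone.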

Continue Reading...



When is Software faster than Hardware?

Posted by mlam on February 13, 2007 at 02:44 AM | Permalink | Comments (6)

I decided that I'll take a break from the bug fix track that I've been on, and have a little diversion to spice things up. I'll resume the bug fix (and JIT internals) discussion soon. For today, I would like to clarify a common misconception that hardware Java processors are faster than dynamic adaptive compilers / just-in-time compilers (i.e. JITs). I'll take you through some analysis to prove my point. The analysis will be based on examples from the phoneME Advanced VM for CDC (aka CVM), but this reasoning should apply to other VMs as well. Let's dive in ...

Hardware Acceleration
Hardware acceleration is a technique that is commonly employed to get better software performance in terms of speed. This approach has been successful with graphics, sound, and DSP processing. In those cases, the hardware acceleration offloads the graphics, sound, and DSP work onto co-processors and frees up the main CPU to do other stuff. This parallelism is one reason we get improved performance out of hardware accelerators.

Another reason is that hardware accelerators can provide special instructions that do work traditionally done by software routines. Of course, these special instructions are specific to the types of algorithms (i.e. graphics, sound, DSP) that use them. Hence, if your application doesn't do much graphics, sound, and/or DSP, then such hardware accelerators won't be able to make your application run any faster.

Due to the known success of these hardware accelerators in their respective applications, we have come to generalize this success to think that all hardware acceleration will beat software solutions. In the case of Java processors in comparison to JITs, this generalization turns out to be untrue.

Java Processors
The Java VM specification comes with its own instruction set. Some of these instructions look a lot like those one would find in a typical CPU's instruction set. Hence, the idea is that by adding the Java VM's instruction set to a CPU's instruction decoder, one can improve the performance of Java code execution. This observation is valid. However, the misconception is that this hardware acceleration will also out-perform or even match VM JITs. In the case of modern JIT compilers, a hardware Java processor will be hard-pressed to beat the performance of a JIT. Note: In the following discussion, I will abbreviate the hardware Java processor as JPU.
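
To give a feel for what "the Java VM's instruction set" looks like, here is a toy software interpreter for four of its instructions. The opcode values are the ones defined in the Java VM specification; everything else (the class name TinyBytecodeInterpreter, the fixed-size stack, the run method) is an assumption made for this sketch and is not how CVM is implemented.

```java
// Toy interpreter for a handful of JVM opcodes; illustrative only.
final class TinyBytecodeInterpreter {

    static final int BIPUSH  = 0x10; // push the following byte as an int
    static final int IADD    = 0x60; // pop two ints, push their sum
    static final int IMUL    = 0x68; // pop two ints, push their product
    static final int IRETURN = 0xac; // pop an int and return it

    static int run(byte[] code) {
        int[] stack = new int[16]; // operand stack, large enough for the demo
        int sp = 0;                // index of the next free stack slot
        int pc = 0;                // index of the next instruction byte
        while (true) {
            int opcode = code[pc++] & 0xff; // fetch and decode
            switch (opcode) {
                case BIPUSH:
                    stack[sp++] = code[pc++]; // operand is a signed byte
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case IMUL: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left * right;
                    break;
                }
                case IRETURN:
                    return stack[--sp];
                default:
                    throw new IllegalArgumentException(
                            "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // Bytecode for (2 + 3) * 4.
        byte[] code = {
            (byte) BIPUSH, 2,
            (byte) BIPUSH, 3,
            (byte) IADD,
            (byte) BIPUSH, 4,
            (byte) IMUL,
            (byte) IRETURN
        };
        System.out.println(run(code)); // prints 20
    }
}
```

A JPU moves the fetch-and-decode loop above into hardware, while a JIT translates the bytecode into native instructions ahead of execution and can optimize across instruction boundaries while doing so.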

Disclaimer: I am not commenting on the quality of any specific hardware Java processor implementations in the market, but merely looking at this issue from a purely theoretical viewpoint.

OK, now let's look at a specific example ...

Continue Reading...



CVM's JIT: Another BIG Picture

Posted by mlam on January 11, 2007 at 04:33 AM | Permalink | Comments (0)

In my last few entries, I've been talking about a bug I'm currently fixing. One of the reasons I haven't been updating daily is that said bug is taking a lot more of my time than expected. There is always more to the picture than meets the eye. Anyway, in my last entry, I briefly discussed the internals of the CVM (phoneME Advanced VM)'s JIT (officially, the dynamic adaptive compiler). Since the bug that I need to fix involves adding functionality to the JIT, we need to know in greater detail how the JIT works (or at least be able to find our way around the code). So today, we'll leave the bug fix alone for a while, and talk about the JIT's BIG Picture ...

Map of the CVM JIT Architecture

Click on the map to get a popup window with a 1024 x 768 res bitmap of the map (if you want to view it in a separate window). Or click here to view the map in a PDF file. I highly recommend using the PDF if you plan to do a printout of the map.

And here's how to read the map ...

Continue Reading...



Beware of the Natives

Posted by mlam on December 07, 2006 at 02:11 PM | Permalink | Comments (16)

There are a lot of not so nice things about using native methods. Here are some:

  • less safe - think "stray pointers".
  • less portable - you'll have to recompile them for every target device architecture you deploy on, present and future.
  • less cost effective - need extra work to build and test all the architecture variations, extra disk storage for deploying all the different binary versions, etc.
  • less manageable - can be a "binary version" tracking, management, and device provisioning nightmare.

But if these reasons aren't enough to deter you from using native methods, try this on for size:

Native code can hurt performance

This seems to go against most people's expectations, but it is the truth. First of all, there is the reason due to what goes on in the runtime stacks when you invoke native methods. I've talked about that in my previous articles (here, here, and here). There, I showed that using native methods incurs bootstrapping and extra frame pushing/popping overhead which results in degraded performance. But there are also many other reasons besides this.

To be fair, native code can be used to help improve performance when used in the right places. I will explain those cases as well. The key is to use native code "carefully".

Ok, let's go bust the "native" myth ...

Continue Reading...



C further with CVM

Posted by mlam on November 25, 2006 at 01:19 PM | Permalink | Comments (3)

I've been talking a lot about esoteric knowledge about the phoneME Advanced VM (CVM), and thought that it is about time to feed you some really technical data. So, I spent most of yesterday rendering a Map of CVM to show you the lay of the land, but it is taking a lot longer than I thought. As a result, no blog entry yesterday. :-( Hopefully, I will get it done today, and be able to do a write-up for Monday. Look for it. It'll be like CVM in a nutshell.

By the way, I'm using InkScape to do my rendering of the CVM map (a colleague pointed me to it). I don't know if it's the best, but it certainly does the job. So, I thought I'd give it a mention here in case others are looking for a tool like this too. I'm using InkScape because I wanted to render the CVM map in SVG, so that you'll get to scale it to match whatever resolution you need without sacrificing detail. But alas, I'm finding that my browsers aren't quite able to display the SVG format yet (or maybe I'm not exporting to the right format). If anyone has hints on what SVG format is supported by popular web browsers, please let me know. Otherwise, I will go with a bitmap for ease of viewing and a PDF for finer inspection.

Incidentally, I also want to thank the 2 people who have left comments for me so far. It's nice to know that I'm not just talking to a wall.

So, on to today's topic(s) ...

Why is CVM written in C?
The choice was either C or C++. As I've pointed out in a previous article, CVM's architecture matches closely to an object class hierarchy. Using C++ could have been an option. We chose C because the availability of good C compilers for the embedded space far exceeds the availability of good C++ compilers. I remind you that portability is one of the prime objectives of CVM, and is one reason why it is viable in the embedded space. Typically when hardware is introduced, it will at least come with a C compiler. A C++ compiler may or may not come later. We wanted the Java platform to be available on every device. Hence, it was an obvious choice to go with C.

On a second note, we've found that some C++ compilers also generate very inefficient code in terms of footprint (2 to 3 times more footprint). This certainly is not good for any embedded software. Now, before you jump to conclusions, I don't think that this inefficiency necessarily had to do with the C++ language itself. Personally, I'm a fan of C++ as well, and I know how it can let you write really elegant and efficient code (assuming the compiler cooperates), as well as really bad bloated code. My guess at the time was that people in general didn't care enough about C++ to invest in its toolchain (in comparison with C) ... not to say that there aren't very good C++ toolchains out there. As a result, C++ is given a bad name ... which I think is unfortunate.

Mind you, the CVM decision was made some 7 years ago. The inefficient C++ code generation was observed about 3 to 4 years ago. Perhaps, these issues of availability and efficiency have been fixed since.

Some more of my thoughts on portability and performance below ...

Continue Reading...



Performance: Too much of a good thing?

Posted by mlam on November 22, 2006 at 05:52 PM | Permalink | Comments (7)

This article continues with esoteric knowledge about the phoneME Advanced VM and the JavaME space that developers will need.

If you've looked at the phoneME Advanced VM source code, you'll see that a lot of the names of functions and data structures are prefixed with CVM. CVM is the informal name of Sun's CDC VM, and prefixing labels (especially for global functions and data structures) with CVM is a standard coding convention in this VM code base. This is probably common knowledge to most people who already work with Sun's CDC technology, but I thought I'd mention it anyway in case. Plus, now I can simply refer to CVM directly instead of having to say phoneME Advanced VM.

So, on to this entry's topic ...

Performance
Usually, no user will ever complain if you offer them more performance in their software. However, performance comes with a price. Usually, it means more complex code that makes better use of the hardware. That can mean that a higher memory footprint is needed to run the software. For a JavaME VM which is targeted at resource-constrained embedded devices, this is definitely a great concern. Hence, any performance work needs to be justified against its cost. What this means is that platform developers can't just go wild with every optimization trick in the book that they know.

Having said that, I want you to know that I am not saying this because CVM's performance is anything to be embarrassed about. As far as we know, CVM is one of the fastest VMs in this space, if not the fastest. To give you an idea of CVM's performance, a few years back, we benchmarked it against the JavaSE 1.3 client VM on a subset of SPEC JVM98. We had to use a subset because SPEC JVM98 uses deprecated APIs which have been removed from CDC. Hence, we had to do an internal "port" of the benchmark for this comparison. The comparison was done on a PowerPC PowerMac and a Solaris SPARC machine. CVM came out to be around 80-90% of the performance with only 10% of the static footprint in comparison with JavaSE. You should know that this is old data. JavaSE has improved significantly since, and so has CVM. Note: I'm only sharing this comparison to give you an idea of the level of performance that can be achieved in JavaME. I'm not saying anything about which VM is better. That would be like comparing apples and oranges. More on that later.

So, when we talk about performance, one of the VM's components that people think of first is the dynamic adaptive compiler, also commonly known as the JIT. Below, I will talk about some performance issues around compilation. I will also touch on other areas / topics that are not JIT related but are important as well.

Continue Reading...




