


Mark Lam's Blog

November 2006 Archives


CVM Stacks and Code Execution

Posted by mlam on November 30, 2006 at 05:13 PM | Permalink | Comments (0)

Map of CVM Data Structures

Welcome to a continuation of the discussion on the internals of the phoneME Advanced VM (CVM). If you missed the beginning of this discussion, look here where I did a high level introduction of some of the major VM data structures using the CVM map. Today, I'll get into the execution of Java methods and how this appears in the runtime stacks. By stacks, I mean stacks as in the thread stacks that hold activation records for methods ... not stacks as in container APIs, or stacks as in API layers. This discussion will give you insight into the control flow of Java code execution in CVM (i.e. who has the CPU at any time). If you want to bring up a copy of the map for reference while you read on, click here (or here for a PDF to print).

All the source files that will be referenced below can be found in the src/share/javavm/include (see here) or src/share/javavm/runtime (see here) folders of the phoneME Advanced project. You will find the .h files in the include folder, and .c files in the runtime folder.

The Execution Engines
In CVM, there's the interpreter and then, there's the dynamic adaptive compiler (commonly known as the JIT). Conceptually, the interpreter is just a big switch statement, and each case is for a bytecode that is to be executed (see CVMgcUnsafeExecuteJavaMethod() in executejava_standard.c). The interpreter loops around this switch statement until there are no more bytecodes to execute. For methods that are executed frequently (commonly referred to as being hot), the JIT will compile these methods into native machine code. The compiled methods will then be executed in place of doing the bytecode interpretation.
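To make the "big switch statement" idea concrete, here is a minimal sketch of an interpreter loop in C. It is not CVM's code: the opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_RETURN) and the function toy_interpret() are made up for illustration and are far simpler than real Java bytecodes or the loop in executejava_standard.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are NOT real Java bytecodes or CVM names. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_RETURN };

/* Interpret a toy bytecode array and return the value on top of the stack. */
static int32_t toy_interpret(const uint8_t *code, size_t len)
{
    int32_t stack[16];
    size_t sp = 0;  /* number of values currently on the operand stack */
    size_t pc = 0;  /* index of the next bytecode to execute */

    while (pc < len) {          /* loop until there are no more bytecodes */
        switch (code[pc++]) {   /* one case per bytecode */
        case OP_PUSH:           /* operand: one signed byte following the opcode */
            assert(pc < len && sp < 16);
            stack[sp++] = (int8_t)code[pc++];
            break;
        case OP_ADD:            /* pop two values, push their sum */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:            /* pop two values, push their product */
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case OP_RETURN:         /* stop and hand back the top of the stack */
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
    return sp > 0 ? stack[sp - 1] : 0;
}
```

Feeding it the sequence OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_RETURN computes (2 + 3) * 4. A real interpreter has many more cases and also has to deal with frames, exceptions, and garbage collection, but the shape of the loop is the same.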

There are many ways to measure the hotness of a method. The CLDC VM (phoneME Feature) uses a timer based sampling mechanism. As of this writing, CVM uses invocation counts that are sampled during interpretation. Upon reaching some threshold of hotness, the method gets compiled. The issue now is how to go from interpreting the bytecodes to executing the compiled method. To understand this (and all the other nuances of Java code execution), we need to take a look at what happens in the runtime stacks when Java code is executed ...
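As a rough illustration of invocation counting, the sketch below bumps a per-method counter on each interpreted call and marks the method as compiled once a threshold is reached. The names (ToyMethod, toy_invoke, TOY_COMPILE_THRESHOLD) and the threshold value are assumptions for this example only, and setting a flag stands in for actually running the JIT. It also assumes that the call which crosses the threshold is still interpreted, which may not match what CVM does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hotness threshold; the real value used by CVM is not given here. */
#define TOY_COMPILE_THRESHOLD 3

/* Hypothetical per-method bookkeeping. */
typedef struct {
    uint32_t invocation_count;  /* sampled while the method is interpreted */
    bool     is_compiled;       /* true once compiled code is available */
} ToyMethod;

/* Invoke a method once. Returns true if this call ran as compiled code,
 * false if it was interpreted. */
static bool toy_invoke(ToyMethod *m)
{
    if (m->is_compiled) {
        return true;                /* run the compiled method instead */
    }
    if (++m->invocation_count >= TOY_COMPILE_THRESHOLD) {
        m->is_compiled = true;      /* stand-in for compiling the hot method */
    }
    return false;                   /* this call was still interpreted */
}
```

With a threshold of 3, the first three calls are interpreted and the fourth runs as compiled code.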

Continue Reading...



Complexity equals Entropy?

Posted by mlam on November 29, 2006 at 01:10 PM | Permalink | Comments (0)

Gotta get some work done. So, short entry for today. CVM Internals will have to be postponed again ... sorry. On to today's topic ...

2nd Law
The second law of thermodynamics states that [at least how I would like to remember it] ... universal entropy always increases. An interesting corollary of that is ... local entropy can be reduced by infusing energy, but this also comes at the cost of increasing universal entropy. I probably butchered that in so many ways since I'm not a physicist. But let's get to the point ...

Software Complexity
Software complexity is a bit like entropy. Over the lifetime of a software product, there will be enhancements, customizations, ports, etc. All these add to the complexity of the code. We all know that complexity makes it harder for us to maintain the code. After all, the human mind can only remember so many things at a time. Too much complexity, and we'll have trouble maintaining the code. Bugs start to creep in. It takes longer and longer to add new enhancements without breaking the system. When your buddy who wrote half the system decides to leave for another job at a startup, you are left in charge of the code. Congratulations, you are now the new governor of a territory (the code) infested with wild beasts (the perpetual bug parade) without a map, and to top it off, if you don't get the territory under control soon, the king (your boss) will have your head.

What to do? What to do? ...

Continue Reading...



Multi-tasking the Java platform: What's the Big Deal?

Posted by mlam on November 28, 2006 at 11:35 PM | Permalink | Comments (7)

Today, I started reading this thread on java.net forums. It made me wonder if people all mean the same thing when they talk about a multi-tasking Java platform. So, I decided to postpone my discussion of CVM internals for a day, and go over the topic of multi-tasking (which is also relevant to phoneME and CVM).

Disclaimer: Before getting into it, I should clarify that my opinions are my own and not necessarily that of Sun, my employer, nor my colleagues at Sun.

So, here goes ...

What is Multi-Tasking anyway?
Strictly speaking, multi-tasking means to be able to do multiple things at the same time. For the Java platform, this would mean to be able to have concurrent execution of code. The Java platform already supports multi-threading. So, what's the issue? Well, people want to be able to run multiple applications concurrently as well. How is that different than just calling the main() method of the apps from different threads? The difference is that apps like to think that they own the world and all its resources. Simply running them in different threads may have side-effects where one application can interfere with the operation of another. So, a multi-application Java platform needs to be able to isolate these applications from one another. Now, where have I heard of this kind of feature / behavior before? Why, it is commonly implemented today as processes in operating systems.

Therefore, when people want multi-tasking, I would think that what they are actually asking for is a Java process, and the Java platform takes on the role of an OS relative to the Java applications. Let's take a look at multi-tasking features in OSes and see how those should be manifested in the Java platform. We should also take a look at why people would want these features so that we don't end up over-engineering a solution. So first, OSes ...

Continue Reading...



The BIG Picture: a Map of CVM

Posted by mlam on November 27, 2006 at 01:17 PM | Permalink | Comments (8)

Personally, when I dive into a new system, one of the first things that I try to figure out is how everything fits together. If you are a visual thinker like me, one of the best ways to do that is to draw a diagram of all the things that you think are important and see how they relate to one another. In the case of embedded systems, in my experience, it is also important to know what goes where in memory, and to get a feel for how system resources are being used. Hence, I prefer to map out the data structures.

Here is my map of CVM ...

the WORLD according to CVM

Map of CVM Data Structures
Click on the map to get a popup window with a 1024 x 768 res bitmap of the map (if you want to view it in a separate window). Or click here to view the map in a PDF file. I highly recommend using the PDF if you plan to do a printout of the map.

And here's how to read the map ...

Continue Reading...



C further with CVM

Posted by mlam on November 25, 2006 at 01:19 PM | Permalink | Comments (3)

I've been talking a lot about esoteric knowledge about the phoneME Advanced VM (CVM), and thought that it is about time to feed you some really technical data. So, I spent most of yesterday rendering a Map of CVM to show you the lay of the land, but it is taking a lot longer than I thought. As a result, no blog entry yesterday. :-( Hopefully, I will get it done today, and be able to do a write-up for Monday. Look for it. It'll be like CVM in a nutshell.

By the way, I'm using InkScape to do my rendering of the CVM map (a colleague pointed me to it). I don't know if it's the best, but it certainly does the job. So, I thought I'd give it a mention here in case others are looking for a tool like this too. I'm using InkScape because I wanted to render the CVM map in SVG, so that you'll get to scale it to match whatever resolution you need without sacrificing detail. But alas, I'm finding that my browsers aren't quite able to display the SVG format yet (or maybe I'm not exporting to the right format). If anyone has hints on what SVG format is supported by popular web browsers, please let me know. Otherwise, I will go with a bitmap for ease of viewing and a PDF for finer inspection.

Incidentally, I also want to thank the 2 people who have left comments for me so far. It's nice to know that I'm not just talking to a wall.

So, on to today's topic(s) ...

Why is CVM written in C?
The choice was either C or C++. As I've pointed out in a previous article, CVM's architecture closely matches an object class hierarchy. Using C++ could have been an option. We chose C because the availability of good C compilers for the embedded space far exceeds the availability of good C++ compilers. I remind you that portability is one of the prime objectives of CVM, and is one reason why it is viable in the embedded space. Typically when hardware is introduced, it will at least come with a C compiler. A C++ compiler may or may not come later. We wanted the Java platform to be available on every device. Hence, it was an obvious choice to go with C.

On a second note, we've found that some C++ compilers also generate very inefficient code in terms of footprint (2 to 3 times more footprint). This certainly is not good for any embedded software. Now, before you jump to conclusions, I don't think that this inefficiency necessarily had to do with the C++ language itself. Personally, I'm a fan of C++ as well, and I know how it can let you write really elegant and efficient code (assuming the compiler cooperates), as well as really bad bloated code. My guess at the time was that people in general didn't care enough about C++ to invest in its toolchain (in comparison with C) ... not to say that there aren't very good C++ tool chains out there. As a result, C++ is given a bad name ... which I think is unfortunate.

Mind you, the CVM decision was made some 7 years ago. The inefficient C++ code generation was observed about 3 to 4 years ago. Perhaps, these issues of availability and efficiency have been fixed since.

Some more of my thoughts on portability and performance below ...

Continue Reading...



When does JavaSE become a better choice than JavaME CDC?

Posted by mlam on November 23, 2006 at 02:17 PM | Permalink | Comments (5)

A comment from my last entry on performance, asked, "I was thinking about the fact that devices [increasingly] get more power and more RAM. I thought when will JavaSE be a better choice instead of JavaME/CDC1.1? How much CPU, RAM, cache....do you need?"

Before I answer this, I must first make the disclaimer that my opinions are my own as an engineer, and not necessarily that of my employer, Sun, or even other engineers at Sun. With that said, now let's get into the question ...

JavaSE or JavaME?
While the initial question was motivated by devices capability and performance, there are many more dimensions to this question than first meets the eye. The issues also include compliance with specifications, ubiquity, portability, etc. Let's start with the obvious ...

Device Capability
If your device is like a desktop / server machine (think Pentium class) that is built into an embedded box, then, in general, the JavaSE VM can probably make better use of its capabilities for performance. If its capabilities are a little less, then your mileage may vary. I say Pentium class here just to give you an idea of the type of computing capability involved. I don't mean only an x86 type processor. Obviously, there are JavaSE ports for SPARC and PowerPC as well, and the same rule of thumb applies there.

Device capability isn't only about the choice of processors. Take PowerPC for example. It has embedded variants as well as the more well known desktop and server versions. The processor core is mostly the same (i.e. will execute the same code), but other capabilities are different. The most obvious would be differences in clock speed, and cache. And then, there are other hardware differences (e.g. board level) in capability: cache architecture, L2 cache, main RAM capacity, RAM speed, bus speed, memory and I/O bus architecture, I/O processors, MMU, DMA, TLB size, secondary storage (HardDisks, FlashDisks), etc. ... or, the lack thereof. The more of these features your device has, the more likely JavaSE is a better fit, and vice versa.

Just looking at memory capacity alone, I think JavaSE typically operates with a footprint in the order of 10s to 100s of MBs, or even GBs. CVM operates in the order of 1s to the low 10s of MBs. Of course, a lot of this depends on what your application is doing (for both JavaSE and CVM). But those numbers should give you an idea. So, if your device only has 16MB RAM, CVM will probably be your best bet. If you have 32MB of RAM, it gets a little gray, and depends on what you are trying to achieve. CVM is still usually your best bet for most embedded applications. Low 100MBs, it is still gray but tending more towards JavaSE now. If the device has 1 GB or more, I would be fairly confident that JavaSE is better suited here.

As for cache, it's a lot harder to tell. 0 to 10s KBs, go with CVM. 10s to 100s of KBs, it's a gray area. MBs of cache, you can definitely run JavaSE now, but this doesn't mean that CVM isn't still the better choice in some cases.

For clock speeds, 10s to low 100s of MHz, CVM is your better choice. Low 100s of MHz to 1GHz, it's gray. More than 1GHz, JavaSE is the likely choice. But as Sun has shown not too long ago, CPU performance isn't all about clock speeds (see CoolThreads). So, take the above numbers with a grain of salt. In fact, all the ranges I've given above are just educated guesses based on my experience in this field. They can be used as hints, but a real world case can be different. That's why there's no hard and fast rule as to which fits better in any given case.

Now that we've talked about the obvious stuff, let's get into all the "gotchas" that people may not think about ...

Continue Reading...



Performance: Too much of a good thing?

Posted by mlam on November 22, 2006 at 05:52 PM | Permalink | Comments (7)

This article continues with esoteric knowledge about the phoneME Advanced VM and the JavaME space that developers will need.

If you've looked at the phoneME Advanced VM source code, you'll see that a lot of the names of functions and data structures are prefixed with CVM. CVM is the informal name of Sun's CDC VM, and prefixing labels (especially for global functions and data structures) with CVM is a standard coding convention in this VM code base. This is probably common knowledge to most people who already work with Sun's CDC technology, but I thought I'd mention it anyway just in case. Plus, now I can simply refer to CVM directly instead of having to say phoneME Advanced VM.

So, on to this entry's topic ...

Performance
Usually, no user will ever complain if you offer them more performance in their software. However, performance comes with a price. Usually, it means more complex code that makes better use of the hardware. That can mean a higher memory footprint may be needed to run the software. For a JavaME VM which is targeted at resource-constrained embedded devices, this is definitely a great concern. Hence, any performance work needs to be justified against its cost. What this means is that platform developers can't just go wild with every optimization trick in the book that they know.

Having said that, I want you to know that I am not saying this because CVM's performance is anything to be embarrassed about. As far as we know, CVM is one of the fastest VMs in this space, if not the fastest. To give you an idea of CVM's performance, a few years back, we benchmarked it against JavaSE 1.3 client VM on a subset of SPEC JVM98. We had to use a subset because SPEC JVM98 uses deprecated APIs which have been removed from CDC. Hence, we had to do an internal "port" of the benchmark for this comparison. The comparison was done on a PowerPC PowerMac and a Solaris SPARC machine. CVM came out to be around 80-90% of the performance with only 10% of the static footprint in comparison with JavaSE. You should know that this is old data. JavaSE has improved significantly since, and so has CVM. Note: I'm only sharing this comparison to give you an idea of the level of performance that can be achieved in JavaME. I'm not saying anything about which VM is better. That would be like comparing apples and oranges. More on that later.

So, when we talk about performance, one of the VM's components that people think of first is the dynamic adaptive compiler, also commonly known as the JIT. Below, I will talk about some performance issues around compilation. I will also touch on other areas / topics that are not JIT related but are important as well.

Continue Reading...



Introduction to phoneME Advanced VM Internals

Posted by mlam on November 21, 2006 at 01:17 PM | Permalink | Comments (1)

If you are reading my blog, chances are that you already know about Sun open-sourcing its JavaME software stack in the phoneME project. If not, click here to read more about phoneME.

Some background info
The intent of open sourcing our code is basically to allow you to gain access to it, study it, and perhaps contribute changes of your own. While there is now a mechanism by which you can gain access to the code, gaining access is not the same as being able to understand and contribute effectively to it. The latter requires some special knowledge that until now, for the most part, has been known only to some of Sun's employees, with a few exceptions (e.g. some hardcore VM engineers at customer companies who use our technology). The knowledge that I refer to includes things like coding conventions, terminology/jargon, design philosophies, code organization, and design-tradeoff decisions, to name a few. While I trust that you are all intelligent folk who can figure all these things out in time, I'm guessing that such a task is not what attracted you to this project in the first place.

Hence, I intend to write a series of blog entries (starting with this one) on these topics to make our mutual lives easier in the long run, as well as allowing everyone to get to the fun stuff sooner instead of having to waste time figuring out mundane things. I will also be writing about technical topics like how certain sub-systems work as I feel inspired to. Feel free to leave comments to request topics that you want me to talk about, or ask for clarifications on things I will talk or have talked about. I will take demand into consideration when I choose the order of topics to write about.

Before I start, you might ask how I came into this knowledge that I will share (and why you should trust that I actually know what I am talking about). So, a bit about me: I work on the VM team that created and maintains the VM at the crux of the phoneME Advanced project. I have been with the team since before CDC 1.0 was released. Hence, I've been working with this code base for a long time. The way our team works, we don't have officially divided up parts of the VM that we work on. We basically go where we are needed and do the necessary work. Hence, each VM engineer's knowledge of the code is quite well rounded. However, each of us does have areas that we are more familiar with than others. I should also point out that I am a VM engineer (as opposed to a class library engineer), and therefore, my expertise lies mostly in the VM and some very core system classes. While I am generally knowledgeable about the other classes in the standard libraries, I am not an expert on them. We have other engineers who focus on the libraries. Also, I will only be writing about the phoneME Advanced VM (as opposed to the phoneME Feature VM) because this is my area of expertise.

OK, so let's get into our first topic.

The meat

Earlier, I said that the esoteric knowledge that I speak of is known to few, even amongst customers who use our technology. This is because those customers seldom have a need or incentive to modify the region of code we call shared code. But now with OSS, this will no longer be the case. The greatest degree of innovation and feature enhancement occurs in shared code, which historically has mostly been the domain of Sun engineers only. Our customers, on the other hand, are usually more focused on the region of code we call the HPI or Host Porting Interface. Look here for the CDC Porting Guide which will tell you how to get to the details of the HPI. The actual HPI is documented in the source code if you know where to look. You will also find other interesting documents on that webpage.

a Design Philosophy
The VM is designed to be highly portable, and to maximize reuse of code between ports while maximizing performance as much as possible. This is a founding principle of the VM.

The code reuse is achieved by keeping as much common code as possible within the shared umbrella. Only hardware or OS dependent bindings are kept out of the shared code. These hardware / OS dependent bindings are referred to as target or platform specific code. Click here to see a listing of the src folder of the phoneME Advanced project. In this specific example, there are arm, linux, and linux-arm folders. The linux folder contains code that is common to all linux ports. These are usually implementations of the HPI which are called from the shared code in the shared directory.

The linux-arm folder contains additional customizations that either complete or override implementations in the linux directory. These customizations are, of course, only relevant to linux ARM ports.

The arm folder contains code that is specific to ARM ports. Usually, it appears in the form of utility functions (which could be assembly code in some cases) that are called upon from various ARM ports.

You will find that the code density of the shared folder (and its children) will be the highest followed by the OS folders (e.g. linux), and followed lastly by a tie between the OS-CPU (e.g. linux-arm) and CPU (e.g. arm) folders. This fact also demonstrates how the VM is made more portable. The porting effort usually only requires implementation / modification of the target specific files (which is a significantly smaller portion of the total code).

The decision to do the majority of our work and innovation in the shared code as opposed to the target specific code also supports our decision to maximize performance for all ports. This way, every port can benefit from the bulk of the performance work that is done (in shared code). It is true that there are optimizations that are port specific that we may wish to apply. For these, we usually apply them in the OS-CPU or CPU folders as appropriate.

Another by-product of this code organization is that the code tends to be more readable. You will find that shared code is not littered with #ifdefs for customizations for various OS and CPU architectures. The #ifdefs you will typically see there are for enabling/disabling VM features instead. You will also see that the OS, CPU, and OS-CPU files are more readable because they should not have #ifdefs due to customizations for other architectures.
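A tiny sketch of what a feature #ifdef (as opposed to a platform #ifdef) looks like is shown here. The macro TOY_FEATURE_JIT and the function toy_execution_mode() are made-up names for illustration and are not taken from the CVM sources. In a real build, such a macro would normally be supplied by the build system rather than defined in the file.

```c
/* Hypothetical feature switch. Normally this would come from the build
 * (e.g. a -D compiler option) instead of being defined in the source. */
#define TOY_FEATURE_JIT 1

/* Shared code: the #ifdef selects a VM feature, not an OS or a CPU. */
static const char *toy_execution_mode(void)
{
#ifdef TOY_FEATURE_JIT
    return "interpreter+jit";   /* feature enabled: JIT is built in */
#else
    return "interpreter";       /* feature disabled: interpreter only */
#endif
}
```

Notice that nothing in this function asks which OS or CPU it is being built for. Under the organization described above, that kind of difference lives in the OS, CPU, and OS-CPU folders instead.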

In the src folder, you will see a portlibs folder. portlibs is used to hold code which may be common to various ports but doesn't quite fit in the OS or CPU categories. Some examples of these are commonality due to toolchains (e.g. gcc) or libraries / standards (e.g. posix, ansi). Various ports (in the OS and OS-CPU files) may choose to make use of the code in portlibs, or not, as appropriate.

One way to conceptually understand the organization of the code is as follows: in terms of object-oriented terminology, there is a parent class for the VM. The parent class is expressed in the shared code. Each port of the VM to a specific OS and CPU target is a subclass that makes use of white-box reuse through inheritance. The OS code is the immediate subclass of the shared code. The OS-CPU code is the subclass of the OS code. The code in the CPU folders forms utility libraries that the OS-CPU class may choose to make use of. The portlibs code is another library that the OS-CPU class may choose to make use of.

To summarize, the VM is incarnated as a singleton for a given platform (OS and CPU). However, it is instantiated from an OS-CPU VM class which extends the OS VM class which in turn extends the shared VM class. And this OS-CPU VM class may reuse code also by delegation to libraries in CPU and portlibs code.

Mind you, this is only a conceptual model of the code organization. You will find no reference to a parent and subclass VM in the code. And the conceptual model is also not perfect in all aspects. You may find some areas where the code relates in ways that do not fit this abstraction. Yes, there are exceptions. But this model is the general rule.
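For those who think better in code, the conceptual model can be sketched in C using a table of function pointers, where each "subclass" starts from its parent's table and overrides some entries. This is purely an illustration of the analogy. The names (ToyVMOps, toy_linux_arm_vm, thread_impl, atomic_impl) and the choice of which layer overrides what are assumptions, and the real code base achieves the layering through its folder and file organization, not through a structure like this.

```c
/* Hypothetical "VM class": operations that a port may supply or override. */
typedef struct {
    const char *(*thread_impl)(void);  /* assumed to be an OS-level concern */
    const char *(*atomic_impl)(void);  /* assumed to be refined per OS-CPU */
} ToyVMOps;

/* Each function just reports which layer provided the implementation. */
static const char *shared_thread(void)    { return "shared"; }
static const char *shared_atomic(void)    { return "shared"; }
static const char *linux_thread(void)     { return "linux"; }
static const char *linux_arm_atomic(void) { return "linux-arm"; }

/* Parent "class": everything comes from shared code. */
static ToyVMOps toy_shared_vm(void)
{
    ToyVMOps ops = { shared_thread, shared_atomic };
    return ops;
}

/* OS "subclass": inherits from shared and overrides the thread binding. */
static ToyVMOps toy_linux_vm(void)
{
    ToyVMOps ops = toy_shared_vm();
    ops.thread_impl = linux_thread;
    return ops;
}

/* OS-CPU "subclass": inherits from the OS layer and overrides one more entry. */
static ToyVMOps toy_linux_arm_vm(void)
{
    ToyVMOps ops = toy_linux_vm();
    ops.atomic_impl = linux_arm_atomic;
    return ops;
}
```

A VM built from toy_linux_arm_vm() gets its thread binding from the linux layer and its atomic binding from the linux-arm layer, while anything that neither layer overrides would still come from shared.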

What does this mean to you?
When you plan to look for the code that achieves some functionality, think of where the functionality should belong (i.e. shared, OS, OS-CPU, or in the CPU or portlibs libraries). This will help you locate the code of interest faster.

You will have to think similarly for code that you want to contribute. This code organization is one of the key reasons that this VM has achieved its great ease of portability (which is an important feature for VMs in the mobile and embedded space). The code review process for code contributions will certainly take this into consideration.

As I've mentioned earlier, you may find some exceptions that don't follow this convention. Please don't use that as an excuse to further deviate from the convention. Instead, either the existing exceptions should be fixed to conform (if possible), or there are good technical reasons why those cases will/should not fit the desired mold. Those exceptions may be allowed if the reasons are compelling enough.

Hey, wait a minute!!!
All this stuff about portability sounds nice, but wouldn't all this layering have an impact on performance? The answer is NO! Well, "NO" for cases which we care about in this space. Is the VM as highly performant as it could be if it inlined everything and got rid of the layering? Maybe not, but that's a tradeoff we make for portability and maintainability. Note also that in practice, any performance difference is negligible. That said, the code is not implemented in a naive way in terms of layering. The layering is implemented using various techniques (which I won't go into here) to prevent unnecessary performance loss where it matters. And, of course, the team has done measurements to ensure that this code is competitive ... very competitive ... in terms of performance.

And that leads me to another question: if we're not trying to squeeze out every bit of performance possible (because of the tradeoff we made in our design philosophy), how much performance is enough performance? That, I will answer in my next blog entry.

Have a nice day. :-)




