
My top wish for Java 8: EE applications as first class citizens

Posted by jjviana on August 23, 2010 at 5:38 AM PDT

I have been a Java developer since the 1.02 days. It has been a long and fun ride, and in spite of what people said over the years, the Java language and Java platform didn't die and are in fact stronger than ever. But sometimes I wish it would evolve faster.

Java EE has come a long way, to the point that Java EE 6 can be considered a lightweight development platform. When Glassfish v3 was released, one of its selling points was fast application (re)deployment times. This is an essential feature for any team in the quest for greater developer productivity: if you have to wait 120 seconds for a redeploy and you do it 30 times a day, you spend an entire working hour just waiting for the server to reload your application.

The problem is, even though Glassfish v3 does deploy applications very fast indeed (most of my deployments are under 12 seconds now, for relatively complex web applications), it also suffers from a problem that plagues every single Java EE implementation I have worked with: memory leaks. After only a few redeployments the Glassfish JVM dies with java.lang.OutOfMemoryError and you have to forcefully kill it and start another instance. That ends up neutralizing most of the gains obtained with fast deployments.

To be fair, it is not entirely Glassfish's fault. I have been spending more time than I would like to admit chasing memory leaks, and most of the time the cause lies in some third-party library used by the application. Other times it is related to the interaction between application server implementation details and application code.

Consider, for instance, this situation: a web application or library uses InheritableThreadLocal in order to pass contextual information between application layers. This is the case in the Wicket framework, where the Application object is set in a ThreadLocal variable by a servlet filter at the beginning of each request (and is cleaned up at the end of each request). In Wicket 1.4.9 the developers decided to tweak the implementation by using InheritableThreadLocal instead of ThreadLocal. The reason behind that change was that web applications sometimes spawn threads to work in the background, and these threads could benefit from inheriting a reference to the Application object. Although ThreadLocals are dangerous, this seems like an innocent enough change: if the application code doesn't start any immortal threads, there is no way this change can trigger a memory leak, right?
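A minimal, self-contained sketch (names are illustrative, not Wicket code) of what InheritableThreadLocal does: any thread spawned after the parent calls set() receives its own copy of the value.

```java
public class InheritanceDemo {
    // With a plain ThreadLocal the child thread would see null instead.
    private static final InheritableThreadLocal<String> CONTEXT =
            new InheritableThreadLocal<String>();

    public static void main(String[] args) throws InterruptedException {
        CONTEXT.set("application-instance");

        Thread worker = new Thread(new Runnable() {
            public void run() {
                // The value was copied into this thread when it was created,
                // even though the thread never called set() itself.
                System.out.println("inherited: " + CONTEXT.get());
            }
        });
        worker.start();
        worker.join();
    }
}
```

This is exactly why the change looked harmless: a background worker gets the Application reference for free. The trouble starts only when the spawning thread outlives the application.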

Here is the problem: Glassfish v3 initializes almost everything lazily (that's why it has such a fast startup time). When some resources (like a connection pool) are accessed for the first time, Glassfish will initialize all the infrastructure needed to make the resource work properly. One of the things it will do is to start a timer used for resource administration. This is done in the following code block (ConnectorTimerProxy.java):

private Timer getTimer() {
    synchronized (getTimerLock) {
        if (timer == null) {
            ClassLoader loader = null;
            try {
                // Swap in the connector class loader so the timer thread
                // does not capture the web application's context class loader.
                loader = Thread.currentThread().getContextClassLoader();
                Thread.currentThread().setContextClassLoader(
                        ConnectorRuntime.getRuntime().getConnectorClassLoader());
                timer = new Timer("connector-timer-proxy", true);
            } finally {
                Thread.currentThread().setContextClassLoader(loader);
            }
        }
    }
    return timer;
}

Notice that the code makes sure the timer thread doesn't inherit the context class loader of the web application. But it cannot do anything to prevent it from inheriting the Application instance that Wicket set via InheritableThreadLocal. Thus the coupling of these two innocent pieces of code creates a memory leak in the application: since the Timer created here is never stopped, there will be a reference to the Application instance even after the application is undeployed, which prevents its memory from being reclaimed by the GC. (The Wicket developers reverted the Application instance to a plain ThreadLocal in version 1.4.10, thereby removing this problem.)
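The safer pattern Wicket went back to can be sketched as follows (assumed names, not actual Wicket code): a plain ThreadLocal that is set at the start of each request and removed in a finally block, so a pooled thread never carries the reference over to the next request, and no child thread inherits it.

```java
public class RequestContext {
    // Plain ThreadLocal: threads spawned inside a request inherit nothing.
    private static final ThreadLocal<Object> APPLICATION = new ThreadLocal<Object>();

    public static Object get() {
        return APPLICATION.get();
    }

    // Wraps one request: bind the application object, run the work,
    // and always clear the slot, even if the request throws.
    public static void withApplication(Object app, Runnable work) {
        APPLICATION.set(app);
        try {
            work.run();
        } finally {
            APPLICATION.remove();
        }
    }

    public static void main(String[] args) {
        final Object app = new Object();
        withApplication(app, new Runnable() {
            public void run() {
                System.out.println("bound during request: " + (get() == app));
            }
        });
        System.out.println("after request: " + (get() == null ? "cleared" : "leaked"));
    }
}
```

The finally block is the important part: it guarantees the slot is empty before the servlet container puts the thread back into its pool.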

The point I'm trying to make is this: even though both the Glassfish developers and the Wicket developers are very smart people committed to making a platform that is fast, stable and highly productive, it is still quite easy to introduce memory leaks into a Java EE application by accident. After fixing this particular memory leak we now gladly reap the benefits of fast redeployment times. But as soon as we change a library version or introduce new elements into the application we can (and probably will) run into problems like this one again.

These kinds of memory leaks have always bothered me. They make hot deployment very unreliable, to the point that many sysadmin teams won't allow it in a production environment. I always thought that over time, with better implementations, these bugs would go away, but I am now convinced that the Java platform might need a deep change in order to make that happen.

Java EE applications aren't real applications from the point of view of the Java VM. The Java EE application server tries to act as a "kernel", much like an OS kernel, providing basic infrastructure services for multiple applications. One of the main jobs of a kernel is to make sure applications are properly isolated from one another and that the kernel itself is isolated from the applications. Modern Java EE implementations try to achieve that by manipulating the system ClassLoader and SecurityManager, but it seems this is not enough: situations like the one described above show that it is too easy for application code to get entangled with "kernel" code.

I was hoping JSR 277 (Java Module System) would address some of these issues, but after reading the spec it seems it doesn't. It is probably too late to wish for this kind of change in Java 7, so I will start my Java 8 wish list now. Here is the first item:

  • First class support for running multiple applications on the Java VM. That includes:
    • A well defined way for classes to be unloaded.
    • Support for killing threads spawned by an application when the application is unloaded.
    • Support for memory isolation between applications and for setting limits on how much memory an application (or module) can use.
    • Support for cleaning up ThreadLocal variables before a thread is put back in a thread pool.
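The last item can be partially worked around in user code today, which also shows why platform support would be better: a pool can only clear the ThreadLocals it knows about. A sketch (names are illustrative) using ThreadPoolExecutor's afterExecute hook:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CleaningExecutor extends ThreadPoolExecutor {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<String>();

    public CleaningExecutor() {
        // Single thread, so both tasks below are guaranteed to share it.
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // Runs on the worker thread after each task; this only works for
        // ThreadLocals the pool knows about, which is the whole problem.
        CONTEXT.remove();
    }

    public static void main(String[] args) throws Exception {
        CleaningExecutor pool = new CleaningExecutor();

        // First task pollutes its thread's ThreadLocal and "forgets" to clean up.
        pool.submit(new Runnable() {
            public void run() { CONTEXT.set("stale request state"); }
        }).get();

        // Second task runs on the same pooled thread; the slot was cleared.
        String leftover = pool.submit(new Callable<String>() {
            public String call() { return CONTEXT.get(); }
        }).get();
        System.out.println("leftover: " + leftover);
        pool.shutdown();
    }
}
```

A general, pool-wide cleanup of *all* ThreadLocals, including ones set by third-party libraries, is exactly what the platform currently lacks.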

These are hard problems to solve, but after 7 releases of the platform only hard problems are left. Unloading classes is probably going to be particularly hard: what to do about references to unloaded classes? I believe they would probably need to be replaced with some kind of phantom reference that throws a specific exception when accessed.

Ideally these problems should be solved while still maintaining compatibility with existing applications.

Is that too much to wish for?

Comments

Yet another excellent project, killed by Sun

What you are asking for can be found here: http://labs.oracle.com/projects/barcelona/

It's a lot like older non-protected memory OSes

Java EE containers are in some ways a lot like MS-DOS, Mac OS 6/7/8/9, Symbian, etc ... applications share the same memory space and have access to each other's innards. There's polite agreement not to poke where they don't belong, but no strong enforcement of that. Without pointers and the resulting fun with corrupted stacks and dangling pointers leading apps to trample all over each other's memory the JVM and Java EE don't have it quite so bad - but there are still real issues with access control, resource ownership, etc.

Currently a Java EE app can delve into the innards of the container if it wants. It's not *meant* to, but it's not easily prevented and it makes the container's integrity less trustworthy. An app can break the container or cause it to misbehave. Expecting apps to be well-written is, as demonstrated by the OSes listed above, a great way to get a reputation for being unstable and generally crap.

More importantly, as you note in your post, Java EE apps can cause the container to leak resources, or can disrupt resources the app shares with other apps via the container, like pooled connections.

This (and garbage collection + resource use issues with big JVMs) makes admins reluctant to have many apps hosted in a single container, so you land up doing all sorts of icky inter-JVM work.

I certainly agree that much stronger protection is needed to isolate the container's private core from apps, and apps from each other. I'm not at all convinced that trying to build that isolation entirely in software in the JVM is best, though. Modern successful OSes isolate apps using *hardware* features to do most of the grunt-work and protect them from implementation bugs; apps and the kernel are isolated from each other by default and can only interact by configured exception. Protected mode, separate logical address spaces, and the security ring system in x86 provide rather strong OS/app isolation. Trying to implement similarly capable isolation purely in software in the JVM would be bug-prone and hard to get right, kind of like the half-baked and buggy app isolation in Windows 3.x or even 9x.

Perhaps a more viable future approach would be to let multiple JVMs running on a host integrate more tightly via shared memory, high performance IPC, or whatever, so the container could run in its own JVM and each app could have a private JVM, with only those resources *explicitly* shared between them accessible across JVMs. That way, app cleanup would be as simple as killing the app's JVM and letting the OS+hardware clean up the mess. The exposed surface for problems would be restricted to the part of the container that's explicitly accessible via the shared resources.

Or ... maybe the JVM will need more OS co-operation than that, so the JVM itself can use hardware features to isolate apps running within the JVM. I can't imagine OS designers wanting to let the JVM get its hooks into the kernel's memory mapping and process management, but with the advent of VMs and VM-friendly CPU extensions like VT-X and VT-IO I wonder if the JVM could use those instructions, like the kernel of a guest OS does in a VM, to isolate apps running within the JVM.

Much as I'd love the JVM to be able to isolate apps properly, in my admittedly pretty ignorant opinion I don't see it happening without some kind of major re-design. I imagine a half-assed solution that works about as well as Windows 9x did is possible with the current JVM, but giving the impression apps can be isolated and cleaned up without being able to really do a comprehensive job of it is IMO in many ways worse than the current situation. Without some really powerful, low level isolation features I don't see that happening. Look at how well "shared" OSes have typically achieved app isolation for examples of just how hard the problem is.

I do find the idea of using hardware VT-X support to help the JVM isolate apps in an EE container quite intriguing, actually. I wonder if it's something anyone's done any investigation of.

icky inter-JVM work?

Java provides the resources to make this easy, clean, and flexible; with extremely high performance. If I might suggest, please take a short look here cajo.dev.java.net/sdk.html.

John

The problem is the JVM already goes half the way...

The problem is that the JVM already goes half the way in implementing application isolation. ClassLoaders and SecurityManagers are features that allow the JVM to execute untrusted code together with trusted code in the same VM without security breaches. What is missing is support for resource isolation.
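The class-loading half of that story works only indirectly, at class-loader granularity: application classes become eligible for unloading when they were loaded through a dedicated loader and every reference to the loader, its classes and their instances is dropped. A sketch (the jar path is a placeholder; close() requires Java 7+):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DisposableLoader {
    public static void main(String[] args) throws Exception {
        // Load application classes through their own, discardable loader.
        URLClassLoader appLoader = new URLClassLoader(
                new URL[] { new URL("file:/path/to/app.jar") }, // placeholder
                DisposableLoader.class.getClassLoader());

        // ... run the application via classes loaded from appLoader ...

        // Unloading can only happen once the loader, every class it
        // defined, and every instance of those classes are unreachable.
        appLoader.close(); // Java 7+
        appLoader = null;
        System.gc(); // and even then, only at the GC's discretion
        System.out.println("loader discarded");
    }
}
```

One stray reference (a Timer thread, an InheritableThreadLocal, a registered callback) keeps the whole class-loader graph alive, which is how the leaks in the post happen.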

It is indeed very difficult to "kill" a Java EE application, as parts of it can easily remain in memory. This is something modern OSes are very good at, and although I don't know enough about OS design, I agree that more cooperation between the VM and the host OS could lead to an implementation of application isolation at moderate cost.

But I wonder whether there would be any way to implement that without somehow breaking the SE or EE semantics. One thing I always wondered is why no application server spawns a JVM per hosted application. In this scenario the "kernel" JVM would load the application server's basic services (connection pools, JMS, JavaMail etc.) and the "application" JVM would load the application code (servlets, EJBs, MDBs etc.). The kernel and application JVMs would communicate through a proprietary protocol transparent to the running application (maybe even using shared memory or pipes). When the kernel needs to undeploy the application it would just kill the application JVM and spawn a new one. No memory leaks, no thread leaks, no problems with old class versions still retained in memory.

I wonder why no app server architecture went that way. Probably the performance price is too high.
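The kernel/application split described above can be sketched with ProcessBuilder (a toy, of course: the hard part is the communication protocol, not the process management). "-version" stands in for a real application main class so the sketch is self-contained:

```java
import java.io.File;

public class KernelSketch {
    public static void main(String[] args) throws Exception {
        // Locate the java binary of the running ("kernel") JVM.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";

        // "Deploy": launch the application in its own child JVM.
        Process appJvm = new ProcessBuilder(javaBin, "-version")
                .redirectErrorStream(true)
                .start();
        int exit = appJvm.waitFor();
        System.out.println("application JVM exited: " + exit);

        // "Undeploy" would simply be appJvm.destroy(): the OS reclaims every
        // thread, class and object, so nothing can leak into the kernel JVM.
    }
}
```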

IPC performance

"I wonder why no app server architecture went that way. Probably the performance price is too high."

I tend to suspect so. Most IPC mechanisms are significantly more expensive than in-process thread-based operation. mmap()ed files are slow, shared memory is fast but inflexible and prone to deployment issues, socket ipc is slow, signals are just awful, etc. That's why I find the idea of using the virtualization extensions for this interesting - it permits *very* cheap isolation of memory spaces without needing much help from the core OS's kernel. The "main" container jvm could set up mappings so that it could access the public parts of the application JVMs, and the app JVMs could see the public parts of the container JVM, but apps couldn't see each other's memory or the private memory of the container JVM. Essentially, I suspect one could use the virtualization extensions to provide a much more flexible and controlled form of shared memory with built-in access controls.

Because of Java's pointerless reference semantics, it might not even be *that* hard to retrofit, as apps never hold pointers to container memory directly. There'd need to be a mechanism for throwing a suitable exception (basically a wrapped segmentation fault, since that's what the hardware would generate) when an application tried to follow a reference to an object whose reference was to memory outside the app's mappings. Proper API design would prevent that in most cases anyway, of course, but there's always leakage especially when reflection is taken into account.

I suspect it'd break a lot of current application code though, as lots of Java EE code seems to like delving into the guts of the container. It'd be painful, but well worth it in the long run, to stop that happening.

If you're interested, you may want to have a quick look at how VT-X works, and see if you agree that it might have potential for turning the java ee container kernel into a *real* (VM guest) kernel running *real* EE programs in private JVMs with shared-mapped areas. http://en.wikipedia.org/wiki/X86_virtualization#Hardware_assist http://www.intel.com/technology/itj/2006/v10i3/1-hardware/6-vt-x-vt-i-so...

To turn Java as first class 'cheap web host' language.

I don't accept the fact that it is much cheaper to host a PHP/Ruby web application than a Java one. And I think that the main reasons for it are the facts you pointed out in this post. We need real application/kernel isolation. We have it only partially, and not enough for Java to be widely used in small web applications. Do a little search on web hosting plans and you will see what I am talking about. There are tons of PHP/Ruby/Perl/CGI/Python plans, but Java plans are hard to find and not cheap.

New web programmers, not enterprise ones, may be using other languages because they don't have a choice. It's not about productivity or performance. It's about choice.

Nice point of view, I agree

Nice point of view, I agree with you, now it's about choice.

PHP, Ruby etc aren't isolated either

PHP, Ruby, etc don't provide inter-app isolation either. What they do instead is have such lightweight runtimes, containers and frameworks that you can afford to just spawn one per hosting customer. If it goes runaway, you kill it and let the OS clean up the mess. Because it's just a process, or group of processes, owned by a given user, you can use the usual OS facilities for resource limiting and accounting too.

By contrast, you can't afford to start a JBoss or Glassfish instance per customer. The memory cost is horrifying, and there's a significant background load on the machine as they chug away doing their inscrutable internal housekeeping as well. Because the containers are so immensely fat (relative to PHP/Ruby/whatever running in Apache) you have to either dedicate a whole lot of RAM per customer, or let customers share the same container. As this article notes, sharing the same container just isn't viable at the moment, so the plans require lots more hardware resources and therefore cost lots more.

That's why I think trying to build isolation into the JVM is a bad idea. The JVM would have to become its own little mini-OS, with process (or thread) accounting of memory use and CPU, resource use controls, I/O scheduling, and everything else OSes already do for us.

What's really needed is a way to either share the vast majority of the container's costs between multiple container instances so you can have one (cheap) container per user, or permit a single container instance to host many apps in many *different* JVMs running as different processes on the host OS. Both those approaches permit you to use the OS's existing tools and facilities to manage load and users without having to build it all again on top of the JVM. Making either work, however, is another matter.

Even a way to compile the core container and JVM classes down to OS shared libraries that can be loaded as shared read-only mappings would massively reduce the memory cost. Doubly so if they could make proper use of basic OS facilities like shared read-only mappings of constant strings (ELF .cdata sections) etc. The way each JVM instance currently loads *everything* into process-private read/write memory is massively, grossly wasteful when you have more than one JVM running, as it denies the OS the use of its abilities to share resources between processes.

I wonder if there's potential for (ab)using the virtualization extensions in modern CPUs to help JVMs share resources or a single JVM manage multiple isolated domains? Or even compiling the core class library and container classes to a DLL / shared library that can be read-only shared mapped into each JVM instance by the OS?

I don't know Ruby....

But PHP lacks the "Application" concept that Java EE has, and also lacks the class loading scope and the other security-related concerns that I believe make the JVM heavier.

Not that I am a PHP fan - I have been burned many times by its lack of security, weak semantic definitions and poor performance. I would still prefer Java EE 6 for any non-trivial application.

I agree 100%

I agree 100%.

Back in the days when I owned an ISP I had a hard time hosting customer Java applications, simply because they would start interfering with each other. It was Java 1.4 back then, and I ended up having to give each customer her own private Tomcat instance to make it work.

Today there are a few shared Java hosting plans out there, but I have never used them because every time I want to host a Java EE application I use a VPS (virtual private server) solution. It is more expensive but it gives you the control you need in order to make it work reliably.