
Java 7 Fork/Join Framework Initial Look, and Resources

Posted by editor on April 16, 2012 at 9:08 PM PDT

In some earlier posts, I've talked about Java threads, Java Thread Overhead, and Amdahl's Law and Parallel Processing Speed-Up. My next investigation in this series is Java 7's new Fork/Join Framework. I plan to spend quite a lot of time on this particular investigation. One reason: Java threads have been around for a long time, and there are undoubtedly many resources that discuss them; the Fork/Join Framework, by contrast, is new, so by spending more time investigating its characteristics, I hope to provide a roadmap, or at least points for thought and discussion, for other developers.

This first post is a basic introduction to the framework, pointing out some of the available resources, outlining the basic concept of the framework, etc.

So, to start with, what do the terms "fork" and "join" mean in this context? If you're a Unix/Linux programmer, you very likely know that the fork() system call creates a new process that is a duplicate of the calling process, known as a "child" process. From there, the two processes can work on different segments of a problem. The results of processing the separate segments can then be "joined" to produce the final product. This approach is advantageous if multiple processes can execute simultaneously, for example on a multiprocessor system like the 8-processor Sun servers I worked with starting in 1993, or on a modern dual- or quad-core desktop PC.

A kind of sad side note: the term "fork" is also used when a group of people takes an open source code base and begins separate development on that code. The same term is used, but unfortunately, in open source development there is usually no accompanying "join" that rebuilds the accumulated work from the original effort and the "forked" effort into a unified final product. In the open source world, a "fork" is the proverbial "fork in the road" wherein the path taken never reunites with the path forgone...

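Anyway, Java 7's Fork/Join Framework is Unix/Linux-like in the manner in which it divvies up work, then joins all the subcomponent results into a single unified result. The difference is that the work is split into lightweight tasks that run on a pool of worker threads inside one JVM, rather than into separate operating system processes. The Fork/Join Tutorial states that: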

The fork/join framework is distinct because it uses a work-stealing algorithm. Worker threads that run out of things to do can steal tasks from other threads that are still busy.

This reminds me of Intel's Threading Building Blocks (TBB), a C++ library for achieving parallel processing. I know a lot about TBB, since I was the original editor / community manager for open source TBB back in 2007-2008 when TBB was first open sourced. The "work-stealing" method has a long, and in my view successful, history as a method for achieving efficient parallel processing that's not terribly taxing on the developer.
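To make the fork and join steps a bit more concrete, here is a minimal sketch of a divide-and-conquer sum built on the framework's RecursiveTask and ForkJoinPool classes. The class names ForkJoinSum and SumTask, the parallelSum helper, and the THRESHOLD value are my own assumptions for illustration; they don't come from the tutorial, and a sensible threshold would have to be found by measurement.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSum {

    // Hypothetical task: sums data[from..to) by splitting the range in half.
    static class SumTask extends RecursiveTask<Long> {
        // Assumed cutoff below which we just loop; tune by measurement.
        private static final int THRESHOLD = 10000;

        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                // Small enough: do the work directly on this worker thread.
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = from + (to - from) / 2;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // "fork": queue the left half; an idle worker may steal it
            long rightSum = right.compute(); // work on the right half ourselves
            long leftSum = left.join();      // "join": wait for (or run) the left half
            return leftSum + rightSum;
        }
    }

    // Hypothetical helper: runs the task on a pool sized to the available processors.
    static long parallelSum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // prints 500000500000
    }
}
```

Notice that the code never creates or synchronizes a thread itself. It only says how to split the problem and how to combine the pieces; which worker runs which piece is left to the pool.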

Julien Ponge agrees with me on the "not terribly taxing" statement in his OTN article Fork and Join: Java Can Excel at Painless Parallel Programming Too! Julien's article shows that:

rich primitives can be used and assembled to write high-performance programs that take advantage of multicore processors, all without having to deal with low-level manipulation of threads and shared state synchronization.

And, that last phrase is the key: dealing with low-level manipulation of threads and synchronization... If you've ever developed multithreaded applications in any of the older languages... Put it this way: coping with thread management is bad at the development level. Move to the test/debug level, and it becomes a nightmare. A real-life illustration from my own personal experience:

Big Boss: Your app is producing random results and crashing for our most important client!

Developer: I tested that thoroughly, completely, perfectly, all results were correct in the 18,476 tests I ran on our systems, and there were no crashes!

Big Boss: You're saying our most important customer is lying? Get yourself to their data center ASAP!

... At the customer's data center, the developer realizes the customer's computers are much more high-end than the ones he developed and tested the app on. There is a bug in the thread management, memory overwrites, something that happens on the customer's high-end computers that never happens on the computers on which the multithreaded software was developed and tested... whoa!

Developer (thinking): Now what do I do? Ask the customer if I can develop the fix using their operational system as my test bed? Won't that seem weird to them? But, do I dare do the obvious -- ask Big Boss to buy as high-end computers as our huge-business customer has? But, if miraculously he approved, then I'd have to spend weeks developing a test that simulates Big Customer's data analysis situation. Mightn't they decide to go with IBM's very expensive but proven solution before I can accomplish that???

Those with experience know: managing threads on your own isn't a pretty situation. I've done it, both in creating a custom data center (the safe way, since you can indeed test on the actual computers the software will run on), and in the situation described above, where I was working for a software vendor, and what worked on our computers failed miserably on our big customer's much fancier, higher-end systems...

Yes, I've been very deep in the thread trenches (since 1993!). Or, fallen very deep into the trenches? I consider myself a very good programmer, but dealing with native thread management? Please, at this point in my career, can't I just sit back and watch it all happen for me instead? Yeah, manage the threads for me, Fork/Join Framework. I'll take that!
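For a rough idea of what "manage the threads for me" can look like in practice, here is a small sketch that squares every element of an array in place using RecursiveAction and invokeAll. The names ForkJoinSquare, SquareAction, and squareInPlace, and the THRESHOLD value, are hypothetical choices of mine for illustration. Each subtask touches a disjoint slice of the array, which is why this particular example needs no locks.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinSquare {

    // Hypothetical action: squares values[from..to) in place.
    static class SquareAction extends RecursiveAction {
        // Assumed cutoff for switching to a plain loop; tune by measurement.
        private static final int THRESHOLD = 1000;

        private final long[] values;
        private final int from;
        private final int to;

        SquareAction(long[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                for (int i = from; i < to; i++) {
                    values[i] = values[i] * values[i];
                }
                return;
            }
            int mid = from + (to - from) / 2;
            // invokeAll forks both halves and returns once both have completed.
            // The halves cover disjoint index ranges, so they never write
            // to the same element.
            invokeAll(new SquareAction(values, from, mid),
                      new SquareAction(values, mid, to));
        }
    }

    // Hypothetical helper: the only "thread management" is creating the pool.
    static void squareInPlace(long[] values) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new SquareAction(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4, 5};
        squareInPlace(values);
        System.out.println(java.util.Arrays.toString(values)); // prints [1, 4, 9, 16, 25]
    }
}
```

This does not make shared-state bugs impossible. If the subtasks wrote to overlapping data, the usual synchronization problems would be back. What the framework takes off your hands is creating, scheduling, and balancing the threads.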

Just to whet your appetite, here are a few more links to articles developers have written about the Fork/Join Framework:

I very much look forward to my upcoming experiments with Java 7's Fork/Join Framework. Stay tuned if this topic interests you!


Java.net Weblogs

Since my last blog post, several people have posted new java.net blogs:


Poll

Our current Java.net poll asks "How often do you attend Java conferences?" Voting will be open until Friday, April 20.


Java News

Here are the stories we've recently featured in our Java news section:


Spotlights

Our latest Java.net Spotlight is Heather Van Cura's Recent JSR Updates-JSR 356, 357, 355, 349, 236:

JSR 357, Social Media API, was not approved by the SE/EE EC to continue development in the JCP program. JSR 356, Java API for WebSocket, was approved by the SE/EE EC to continue development in the JCP program...

Previously, we featured Deepak Vohra's JSF 2.0 for the Cloud, Part One:

JavaServer Faces 2.0 provides features ideally suited for the virtualized computing resources of the cloud. Here, in Part One of a two-part article, we look at @ManagedBean annotation, implicit navigation, and resource handling...


Subscriptions and Archives: You can subscribe to this blog using the java.net Editor's Blog Feed. You can also subscribe to the Java Today RSS feed and the java.net blogs feed. You can find historical archives of what has appeared on the front page of Java.net in the java.net home page archive.

-- Kevin Farnham

Twitter: @kevin_farnham
