
Multicore Desktop Application Experimentation, Part 1: Java Threads

Posted by editor on February 24, 2012 at 8:25 PM PST

Multithreaded programming has been possible in Java for a very long time. That's a good thing, since modern computers and even many mobile phones are multicore today. If an application is primarily interactive, its processing speed is fairly irrelevant today -- the limiting time factor is how fast the user can type or move and click the mouse or touch the screen, etc. But, to increase the speed of applications that do a lot of work between user interactions, utilizing those multiple cores is necessary.

Java 7 introduced the Fork/Join Framework, and Project Lambda will bring Lambda Expressions (closures) into Java 8. So, a question developers will soon be facing is: what's the best way for me to take advantage of multiple cores in my specific application? Should I use traditional Java threads? The Fork/Join Framework? Lambda Expressions (once Java 8 is released)?

I've worked on multithreaded development since late 1993 -- though most of that work has been in C, running on SunOS (and later Solaris) multiprocessor machines. An eight-processor SunOS machine was remarkably powerful for 1993. The code consisted of mathematical models and analyses applied to various types of satellite data. The researchers developed their functions as single-threaded code (often in Fortran); my job was to convert it so that it could efficiently utilize all eight Sun processors.

It was an adventure, to say the least!

Today I did some experimentation on my quad-core CentOS 6.2 system (and, just for fun, on my dual-core HP Mini, which runs some kind of minimal Ubuntu, as well) with a basic multithreaded Java application. Essentially, the code creates some threads, passes them a data range, has each thread do some work on all values that are in the passed-in range, then requests back a final result. Here's the actual thread class:

import static java.lang.Math.pow;

class NewThread implements Runnable {
  String name;    // name of thread
  Thread t;       // the underlying Thread object
  int iVal0;      // start of this thread's work range
  int iVal1;      // end of this thread's work range
  double lastVal; // last value computed by run()

  void SetWorkRange(int i0, int i1) {
    iVal0 = i0;
    iVal1 = i1;
    //System.out.println(name + " work range: " + iVal0 + "-" + iVal1);
    return;
  }

  double GetLastValue() {
    return lastVal;
  }

  NewThread(String threadname) {
    name = threadname;
    t = new Thread(this, name);
    System.out.println("New thread: " + t);
    //t.start(); // Start the thread
  }

  // This is the entry point for the thread.
  public void run() {
    System.out.println(name + " starting, working on " + iVal0 + "-" + iVal1);
    try {
      for(int i = 1; i <= 100000; i++) {
        for(int j = iVal0; j <= iVal1; j++) {
          double val0 = i;
          double val1 = j;
          double val2 = val0 * val1;
          double val3 = pow(val2, 0.5);
          lastVal = val3;
        }
      }
    } catch (Exception e) {
      System.out.println(name + " error: " + e);
    }
    System.out.println(name + " exiting.");
  }
}

The starting point for this was an example in Herbert Schildt's excellent "Java: The Complete Reference," which I reviewed a while back.

The class in my sample application is used as follows (a minimal driver sketch appears after the list). For each thread:

    1. A new NewThread is created;
    2. SetWorkRange is called to specify a data range for the thread's execution;
    3. The thread is run;
    4. The main app waits for the thread's processing to complete;
    5. GetLastValue is called to return the last result for the thread's execution.
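
In code, those five steps look roughly like this for a single thread. This is a minimal sketch: the driver class name, the range values, and the printed output are illustrative, and the timing code from my actual test app is omitted.

class SingleThreadDriver {
  public static void main(String[] args) throws InterruptedException {
    NewThread worker = new NewThread("Thread-1");               // 1. create the NewThread
    worker.SetWorkRange(1, 8000);                               // 2. assign its data range
    worker.t.start();                                           // 3. run the thread
    worker.t.join();                                            // 4. wait for it to complete
    System.out.println("Last value: " + worker.GetLastValue()); // 5. fetch the last result
  }
}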

A pretty useless multithreaded application, no doubt! But, it did let me gather some numbers on the relative performance of running computation-centric code with different numbers of threads on my computers.

I analyzed the data range from 1 to 8000, running with 1, 2, 4, and 8 threads, dividing the work equally between each thread. So, in the single-thread case, I called SetWorkRange with 1 and 8000; in the two-thread case, I called SetWorkRange with 1 and 4000 for the first thread, and 4001 and 8000 for the second thread; etc.
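
Here's a minimal sketch of that kind of driver loop (the class and variable names are mine, the thread count is hard-coded for illustration, and the clock/CPU timing code is again omitted):

class MultiThreadDriver {
  public static void main(String[] args) throws InterruptedException {
    final int rangeEnd = 8000;
    final int nThreads = 4;                // 1, 2, 4, or 8 in my runs
    final int chunk = rangeEnd / nThreads; // equal share of the data range per thread

    NewThread[] workers = new NewThread[nThreads];
    for (int i = 0; i < nThreads; i++) {
      workers[i] = new NewThread("Thread-" + (i + 1));
      workers[i].SetWorkRange(i * chunk + 1, (i + 1) * chunk);  // e.g. 1-4000, 4001-8000
      workers[i].t.start();
    }
    for (NewThread w : workers) {
      w.t.join();                          // wait for each thread to finish
      System.out.println(w.name + " last value: " + w.GetLastValue());
    }
  }
}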

Here are the timing results on my CentOS 6.2 quad-core system:

Threads   Clock Time (seconds)   CPU Time (seconds)
1         53.2                   53.2
2         26.7                   53.2
4         13.7                   53.2
8         13.5                   53.4

These results imply essentially perfect scaling up to the number of processor cores in my system, and no real loss (even a tiny clock-time improvement) from running eight threads on my quad-core processor.
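
For example, the four-thread speedup works out to 53.2 / 13.7 ≈ 3.9, just short of the ideal factor of 4 on four cores, while the total CPU time stays essentially constant.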

Here's what happened on my HP Mini (which wasn't exactly designed with high-volume computation-intensive processing in mind):

Threads   Clock Time (seconds)   CPU Time (seconds)
1         1741.8                 1682.0
2         1433.1                 1802.4
4         1235.0                 1860.8
8         1097.1                 1868.2

The itty-bitty Mini found the task daunting, to say the least. The results don't make a lot of sense if I'm right that my HP Mini is a dual-core machine. The fact that the application completed in less time as the number of threads was increased to 4 and 8 is interesting, but my guess is that it says more about the particulars of the HP Mini's hardware and OS than about Java or my test app.

What I like about today's experimentation is that it demonstrates some of the complexity involved in efficiently utilizing the available processing power on any device. Same app, two different devices with widely differing processing capability. The result? A quite different performance-versus-thread-count profile for each device. So, how is a Java developer to write an efficient, multi-platform, "write once, run anywhere" application in this modern multicore world?

I'll be investigating this via a new Java.net project (not yet public) and a series of blogs starting with this one. It's a topic I know a lot about from my early work with multithreaded development in C on Sun multiprocessor computers, and also from my work on Intel's open source Threading Building Blocks project (see my Intel Software Network blogs).

In the end, successful multithreaded programming is all about applying language capabilities to efficiently utilize the available multicore/multiprocessor hardware resources. Java currently offers multiple methodologies for addressing this problem, and new methods will arrive with Java 8. It's going to be exciting to investigate and test what works best in what situations!


Java.net Weblogs

Since my last blog post, several people have posted new java.net blogs:


Poll

Our current Java.net poll asks "Will you use JavaFX for development once it's fully ported to Mac and Linux platforms?" Voting will be open until Friday, March 2.


Articles

Our latest Java.net article is Michael Bar-Sinai's PanelMatic 101.


Java News

Here are the stories we've recently featured in our Java news section:


Spotlights

Our latest Java.net Spotlight is James Sugrue's Which JVM Language Is On Top?:

It’s a well known fact that Java’s prevalence in the software development industry is encouraged by the innovation that surrounds the JVM, and the languages that are built on top of it. Today I’d like to start a poll on what alternate languages you use (or would like to use!) on the JVM...

Previously, we spotlighted the Akka Team Blog's Scalability of Fork Join Pool:

Akka 2.0 message passing throughput scales way better on multi-core hardware than in previous versions, thanks to the new fork join executor developed by Doug Lea. One micro benchmark illustrates a 1100% increase in throughput! The new 48 core server had arrived and we were excited to run the benchmarks on the new hardware, but it was sad to see the initial results. It didn’t scale...


Subscriptions and Archives: You can subscribe to this blog using the java.net Editor's Blog Feed. You can also subscribe to the Java Today RSS feed and the java.net blogs feed. You can find historical archives of what has appeared on the front page of java.net in the java.net home page archive.

-- Kevin Farnham

Twitter: @kevin_farnham