Java 7 Unsafe at Any Speed?

Posted by cayhorstmann on July 29, 2011 at 4:32 PM PDT

Some people are nervous about everything—killer bees, poison oak, martian invaders, socialized medicine, you know the type. I try not to be like that. When JDK 7 went final yesterday, I boldly went into my .bashrc and changed JAVA_HOME to point to jdk1.7.0. Then I read this.

So, apparently, under some conditions, Hotspot messes up. It might crash, but that doesn't bother me so much; I'd notice that. But it might also silently produce the wrong result. I try not to be a scaredy squirrel about these things, but I must say that “rarely happening” bugs in a widely used platform bother me. When Toyota cars had unintended acceleration problems, the NHTSA ultimately concluded that floor mats, sticky pedals, and “pedal misapplication” were the culprits. But what if the electronics had a bug that only happens in a confluence of rare circumstances? They said no, but how can they really know?

It all reminds me of the Pentium bug from 1994. Intel had just released the first Pentium chip, and I immediately went and bought one at considerable expense to the management. Later I learned that a mathematics professor, Dr. Thomas Nicely of Lynchburg College, had run into a curious issue. On a small set of inputs, floating-point division was buggy. For example, 4195835 - 4195835 / 3145727 × 3145727 yielded 256 instead of the expected 0. I tried it out on my new computer. Sure enough, I got 256. I tried it out on an older 486. I got 0.
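For readers who want to try the same check today, here is a minimal sketch in Java. The class name `PentiumCheck` and the method `residual` are made up for illustration; they are not from any library. It assumes ordinary IEEE 754 double arithmetic with round-to-nearest, under which the residual for this pair of inputs should come out as exactly 0.0.

```java
// Illustrative sketch only: PentiumCheck and residual are hypothetical names.
final class PentiumCheck {
    // Computes x - (x / y) * y in double precision.
    // With a correct divider, the result for the inputs below should be 0.0.
    static double residual(double x, double y) {
        return x - (x / y) * y;
    }

    public static void main(String[] args) {
        // The operands from the well-known Pentium division test.
        double r = residual(4195835.0, 3145727.0);
        System.out.println("residual = " + r);
        System.out.println(r == 0.0
                ? "division looks correct for this input"
                : "unexpected nonzero residual");
    }
}
```

A nonzero residual here would be the same kind of silent wrong answer that the flawed chips produced. Note that one passing input says nothing about other inputs, which is exactly why such bugs are hard to find.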

It turned out that Intel had known about the bug but decided to ship the processor anyway. Intel claimed that under normal use, a typical consumer would only notice the problem once every 27,000 years. Unfortunately for Intel, Dr. Nicely had not been a normal user. Intel stonewalled for a while, but eventually they sent out replacement chips for everyone.

Was I a normal user?  In the several months that I ran the defective chip, I noticed no problems. I even computed my tax return on it. Fortunately, I did not make $4195835 that year. But I applied for a replacement—I remember that I had to sign a form claiming to make mathematical computations. (They later gave the replacement to everyone.)

Uwe states that the Hotspot bugs in question are here, here, and here. The first one is easy to check. I checked it and couldn't reproduce it. The second bug requires a file microbenchmarks.jar. I don't know where to get that. The third one is in the “Application client”. The Java EE application client? If so, that's not something you'd want to misbehave in a shiny new release, right? Only the second bug was reported to give incorrect results; the others cause a crash. That bug was reported on July 23, 2011 and is classified as “medium”.

That seems problematic to me. A rare crash is manageable, but silently getting the wrong result is not. I would have expected this to be a high priority bug. Don't they have criteria for this that say “Hotspot produces wrong result ⇒ take it seriously”? As it is, it just looks bad. Oracle had promised to ship JDK 7 by the end of July. Was that considered more important than to fix a showstopper bug? If so, that would seem a rather poor decision. Don't they realize that this leads to blog posts with headlines “Don’t Use Java 7, For Anything” or “Java 7 Unsafe at Any Speed?”

In my introductory textbooks, I include a short article about the Pentium bug. Many times, colleagues who reviewed the book questioned whether this is a good use of space. Isn't that story outdated? And anyway, what does it have to do with computer science? I keep it in there. I imagine that maybe, one day, one of the students will become a manager, need to make a decision about a bug that silently computes the wrong result, and remember the story.

Comments

Meh. General Availability != General Use.

One assumes this bug will be fixed right directly. Mission-critical enterprise server products won't get upgraded rapidly or recklessly. (Believe me, I know. I know personally of one product that only recently got Java 6 and another product that will remain on .NET 3.5 for at least another six months before our customers even begin to pick it up.) People will monitor the bugs (thanks for calling them out) and make a decision about upgrading based on the status of the bugs.

When it comes to shipping hardware (Pentium bug, leaky capacitors), yeah, that's a big deal. Today's rapid software update cycle reduces the impact of problems like this (while causing others). I understand the outrage at shipping such a bug, but, seriously, it's gonna get fixed. Nobody sensible will even jump on a .0 release, right?

(No, I don't work for Oracle. :) )

I realize that this isn't a huge problem--that's why I called it a "tempest in a teapot" in the blurb. The good news is that Oracle has bumped up the priorities of these bugs, and that fixes are in the pipeline.

The bad news is that the priorities are wrong. The scary bug--the one that silently computes the wrong answer--currently has "medium" priority. The two bugs that cause visible crashes have "high" priority.

I wrote the blog as a reminder that silently computing the wrong answer is much worse than crashing. That message, I am afraid, is still not getting to where it needs to get.

[quote=cayhorstmann]

That seems problematic to me. A rare crash is manageable, but silently getting the wrong result is not. I would have expected this to be a high priority bug. Don't they have criteria for this that say “Hotspot produces wrong result ⇒ take it seriously”? As it is, it just looks bad. Oracle had promised to ship JDK 7 by the end of July. Was that considered more important than to fix a showstopper bug? If so, that would seem a rather poor decision. Don't they realize that this leads to blog posts with headlines “Don’t Use Java 7, For Anything” or “Java 7 Unsafe at Any Speed?”

[/quote]

My experience with some people at Oracle is that they take bugs very seriously and work hard to fix them, while some managers at Oracle give a fixed release date higher priority. So it happened recently that GlassFish v3.1.1 was released and praised as a super-stable collection of bug fixes, when in fact it still contains several bugs that prevent WORA: to make an EAR run, one must apply several workarounds to the deployed application's source code. That is unacceptable for ISVs, but Oracle seems to have a policy that this plays no role in its decisions.

Dr. Horstmann, total respect to you! The way you express your thoughts is so clear and straight to the point. Your blogs and books are a great source for learning. Thank you.