


Graham Hamilton

Graham Hamilton's Blog

Multithreaded toolkits: A failed dream?

Posted by kgh on October 19, 2004 at 08:43 AM | Comments (16)

The question came up recently of "should we make Swing truly multithreaded?" My personal answer would be "no", and here's why...

The Failed Dream

There are certain ideas in Computer Science that I think of as the "Failed Dreams" (borrowing a term from Vernor Vinge). The Failed Dreams seem like obvious good ideas. So they get periodically reinvented, and people put a lot of time and thought into them. They typically work well on a research scale and they have the intriguing attribute of almost working on a production scale. Except you can never quite get all the kinks ironed out...

For me, multithreaded GUI toolkits seem to be one of the Failed Dreams. It seems like the obvious right thing to do in a multithreaded environment. Any random thread should be able to update the GUI state of buttons, text fields, etc, etc. Damned straight. It's just a matter of having a few locks, what can be so hard? OK, there are some bugs, but we can fix them, right? Unfortunately it turns out not to be so simple...

From observation, there seems to be an amazing tendency towards deadlocks and race conditions in multithreaded GUIs. I first heard about this issue anecdotally from people who had worked with the Cedar GUI libraries at Xerox PARC in the early 80's. That was a community of extremely smart people who really understood threading, so the assertion that they were having regular deadlock issues within GUI code was intriguing. But maybe that was flawed data or an exceptional situation.

Unfortunately that general pattern has been repeated regularly down the years. People often start off trying for multithreading and then slowly move to an event queue model. "It's best to let the event thread do the GUI work."

We went through this with AWT. AWT was initially exposed as a normal multi-threaded Java library. But as the Java team looked at the experience with AWT and with the deadlocks and races that people had encountered, we began to realize that we were making a promise we couldn't keep.

This analysis culminated in one of the design reviews for Swing in 1997, when we reviewed the state of play in AWT, and the overall industry experience, and we accepted the Swing team's recommendation that Swing should support only very limited multi-threading. With a few narrow exceptions all GUI toolkit work should occur on the event processing thread. Random threads should not try to directly manipulate the GUI state.

Why is this so hard?

John Ousterhout gave a great Usenix talk on Events versus Threads in 1995 that explores some of the tradeoffs between thread-driven and event-driven programming and he correctly points out many reasons why multi-threaded programming is hard and why event driven programming can be simpler. I don't necessarily agree with his analysis for all kinds of programs, but I do agree for GUI programs.

The particular threading problems of GUI toolkits seem to me to arise from the combination of input event processing and abstraction.

The problem of input event processing is that it tends to run in the opposite direction to most GUI activity. In general, GUI operations start at the top of a stack of library abstractions and go "down". I am operating on an abstract idea in my application that is expressed by some GUI objects, so I start off in my application and call into high-level GUI abstractions, that call into lower level GUI abstractions, that call into the ugly guts of the toolkit, and thence into the OS. In contrast, input events start off at the OS layer and are progressively dispatched "up" the abstraction layers, until they arrive in my application code.

Now, since we are using abstractions, we will naturally be doing locking separately within each abstraction. And unfortunately we have the classic lock ordering nightmare: we have two different kinds of activities going on that want to acquire locks in opposite orders. So deadlock is almost inevitable.
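To make the lock-ordering conflict concrete, here is a minimal Java sketch. It is only an illustration: the two locks and the class name `LockOrderDemo` are hypothetical stand-ins for "one lock per abstraction layer", not anything taken from AWT or Swing. A "down" thread takes the application lock and then wants the toolkit lock, while an "up" thread takes them in the opposite order. The sketch uses `tryLock` with a timeout so that it reports the conflict instead of hanging; with plain `lock()` calls both threads would wait forever.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
    // Hypothetical locks, one per abstraction layer (assumption for illustration).
    static final ReentrantLock appLock = new ReentrantLock();
    static final ReentrantLock toolkitLock = new ReentrantLock();

    /** Runs one "down" and one "up" thread; returns how many failed to get their second lock. */
    public static int countBlockedThreads() throws InterruptedException {
        CyclicBarrier bothHoldFirst = new CyclicBarrier(2);
        CyclicBarrier bothHaveTried = new CyclicBarrier(2);
        AtomicInteger blocked = new AtomicInteger();

        // "Down": application code calling into the toolkit.
        Thread down = new Thread(() ->
                attempt(appLock, toolkitLock, bothHoldFirst, bothHaveTried, blocked));
        // "Up": an input event being dispatched towards the application.
        Thread up = new Thread(() ->
                attempt(toolkitLock, appLock, bothHoldFirst, bothHaveTried, blocked));

        down.start();
        up.start();
        down.join();
        up.join();
        return blocked.get();
    }

    static void attempt(ReentrantLock first, ReentrantLock second,
                        CyclicBarrier bothHoldFirst, CyclicBarrier bothHaveTried,
                        AtomicInteger blocked) {
        first.lock();
        try {
            bothHoldFirst.await(); // each thread now holds its first lock
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet(); // plain lock() would wait here forever
            }
            bothHaveTried.await(); // keep the first lock until both have tried
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads stuck on their second lock: " + countBlockedThreads());
    }
}
```

The barriers are only there to force the unlucky interleaving on every run. In a real toolkit the same interleaving happens rarely and unpredictably, which is what makes these bugs so hard to track down.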

This problem will initially surface as a series of specific threading bugs. And people's first reaction is to try to adjust the locking behavior to resolve the specific bugs. Let's release that lock there and then let's use more clever locking over here. Well, that is kind of a fun activity, but it is trying to fight back an oceanic tidal force. The cleverer locking typically turns into a combination of subtle races (due to lack of locking) or clever and intricate deadlocks (due to the clever and intricate locking). We went through a bunch of that in 95-97.

Notice that the problem extends beyond the GUI toolkit layers and also appears between the toolkit layer and the application level. With great difficulty one might try to adopt a single lock for all activity within the GUI layer, but the same problem then resurfaces a level up.

So what's the answer? Well, at some point you have to step back and observe that there is a fundamental conflict here between a thread wanting to go "up" and other threads wanting to go "down", and while you can fix individual point bugs, you can't fix the overall situation.

This led to the solution that the Swing team adopted and which is used by most leading GUI toolkits: run all GUI activity on a single event thread. This means that in some sense all GUI activity becomes event driven, and the "down" threads become just a new kind of event.
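As a rough sketch of that idea, the Java below uses a single-threaded `ExecutorService` as a stand-in for the toolkit's event thread. The class `EventThreadSketch`, the `post` method and the `clickCount` field are hypothetical names chosen for illustration, not Swing APIs. Every "down" call from an arbitrary thread is turned into an event on the queue, so the GUI state is only ever touched by one thread and needs no lock of its own.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class EventThreadSketch {
    // Stand-in for the toolkit's single event thread (assumption for illustration).
    private final ExecutorService eventThread = Executors.newSingleThreadExecutor();

    // "GUI state": a plain field with no lock, only ever touched on the event thread.
    private int clickCount;

    /** A "down" call from any thread becomes just another event on the queue. */
    public void post(Runnable event) {
        eventThread.execute(event);
    }

    /** Starts several caller threads that all update the state via posted events. */
    public int runDemo(int threads, int eventsPerThread) throws Exception {
        Thread[] callers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            callers[t] = new Thread(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    post(() -> clickCount++); // executed later, on the event thread
                }
            });
            callers[t].start();
        }
        for (Thread caller : callers) {
            caller.join();
        }
        // Reading the state is itself an event, queued behind all the updates.
        int result = eventThread.submit(() -> clickCount).get();
        eventThread.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new EventThreadSketch().runDemo(4, 1000));
    }
}
```

No update is lost even though `clickCount++` is not atomic, because the increments never run concurrently: the queue serializes them.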

This demonstrably works. It is possible to write complex GUI apps that work reliably. Hurrah! But it does make managing long running activities tougher. I wrote a smallish Swing program that I use periodically to selectively zap large boring attachments from my email archives. I don't want to hang the GUI while it reads tens of megabytes of emails, and I also want to display a progress monitor, so I ended up having to carefully balance handing off big activities to worker threads and handing GUI activities back to the event thread. It is probably more complicated than it would be if I had a magic multi-threaded library, but it has the significant saving grace that it actually seems to work reliably.
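The balancing act described above might look roughly like the following Java sketch. It is not the actual mail-archive program: the class `BackgroundScan`, the `scan` method and its parameters are hypothetical. The slow work runs on a worker thread, and each progress update is handed back to the GUI thread through an `Executor`, which in a Swing program would be `SwingUtilities::invokeLater`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.function.IntConsumer;
import javax.swing.SwingUtilities;

public class BackgroundScan {
    /**
     * Runs a long task on a worker thread. Progress updates and the final
     * callback are not run by the worker itself; they are handed to uiThread.
     */
    public static Thread scan(int totalMessages, Executor uiThread,
                              IntConsumer showProgress, Runnable onDone) {
        Thread worker = new Thread(() -> {
            for (int i = 1; i <= totalMessages; i++) {
                // ... slow non-GUI work (e.g. reading one message) would go here ...
                final int done = i;
                uiThread.execute(() -> showProgress.accept(done)); // GUI work goes back
            }
            uiThread.execute(onDone);
        }, "scan-worker");
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch finished = new CountDownLatch(1);
        // In a real program showProgress would update a progress bar; printing
        // keeps the sketch runnable without opening a window.
        scan(3, SwingUtilities::invokeLater,
                n -> System.out.println("progress " + n + ", on event thread: "
                        + SwingUtilities.isEventDispatchThread()),
                finished::countDown);
        finished.await();
    }
}
```

Passing the event thread in as an `Executor` is a design choice made for this sketch. It keeps the hand-off explicit and lets the same code be exercised without a GUI.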

Subtleties

Are things really so black and white? Surely there have been people who have used multi-threaded toolkits successfully? Yes, but I think this demonstrates one of the characteristics of the Failed Dreams.

I believe you can program successfully with multi-threaded GUI toolkits if the toolkit is very carefully designed; if the toolkit exposes its locking methodology in gory detail; if you are very smart, very careful, and have a global understanding of the whole structure of the toolkit. If you get one of these things slightly wrong, things will mostly work, but you will get occasional hangs (due to deadlocks) or glitches (due to races). This multithreaded approach works best for people who have been intimately involved in the design of the toolkit.

Unfortunately I don't think this set of characteristics scales to widespread commercial use. What you tend to end up with is normal smart programmers building apps that don't quite work reliably for reasons that are not at all obvious. So the authors get very disgruntled and frustrated and use bad words on the poor innocent toolkit. (Like me when I first started using AWT. Sorry!)

Another wrinkle: it is possible to have multiple simultaneous GUI activities within a Java VM by using multiple event threads. That works provided the different activities are almost entirely isolated, have their own distinct GUIs (no shared components or mixed hierarchies) and provided the very lowest toolkit level can correctly dispatch events to the right event thread with minimal locking. This is useful in (for example) running multiple applets within one JVM. But it isn't a very general solution - most applications need to live within the constraint of only a single event thread.

In this note I've mostly been covering why Swing and other toolkits are essentially single-threaded. Chet recently blogged on some related topics around why multi-threading complicates user programs and normally won't help raw graphics performance.

Also, before I forget, some people are probably remembering that "processes and monitors are duals". Well, yes, it's true. In some sense we are using the event thread to implement a global lock. We could invert things, and create a global lock that is equivalent to the event queue. This would be fairly ugly and would require wide coordination and undermine a lot of abstractions. But the larger problem is that Java developers tend to use multiple locks and if they are to preserve the equivalence with an event queue model, they will need to follow various non-obvious rules about how they interact with these other locks. The event queue model makes the central single lock much more visible and explicit, and on the whole that seems to help people to more reliably follow the model and thus construct GUI programs that work reliably.

Conclusion

I guess the bottom line is that like many others I would really like to see a flexible, powerful, truly multi-threaded GUI toolkit. But I don't know how to get there - at this point there is fairly strong experience that the obvious approaches for multi-threading don't work. Maybe in future years people will come up with a radically new and better approach, but for now the answer seems to be that events are our friends.

- Graham


Comments
Comments are listed in date ascending order (oldest first)

  • Graham,


    There have been several articles written by Sun about this. They are persuasive enough that I believe multi-threaded GUI toolkits are a bad idea, despite their existence and the fact that I've never really studied the issue. You don't need to convince me.

    I also know that Sun is trying to greatly increase the number of Java programmers. There are a number of ways that Swing could be much better for developing multi-threaded apps.

    First off, how does a developer know if they are following Swing's threading rules correctly? I use a custom RepaintManager to catch problems. I don't know if that technique is adequate, but no technique is going to be obvious to Java neophytes. The Swing documentation needs to be revised so that the threading rules are not only clear and easy to find, but easy to check. Perhaps the new Tiger annotations could be used to help here.

    EventQueue/SwingUtilities invokeAndWait() not only uses wait() outside of a while loop, but will deadlock silently, without providing any diagnostic information. I always use my own version instead, so that I get messages telling me that there is a deadlock and who has the lock I'm waiting for.

    Finally, there is no standard framework for developing multi-threaded Swing applications. Some alternatives are slowly emerging, but they all seem to be targeted at relatively narrow kinds of applications. There are several multi-threaded applications (like a rich-client bug tracking tool) that would make Java.net more valuable to developers. If Sun could develop a common framework for them, Java.net and Swing would both be the better for it.

    - Curt

    Posted by: coxcu on October 19, 2004 at 01:56 PM

  • Graham,

    What prompted this blog? Are people actually asking for multithreaded Swing? I thought multi-threaded toolkits were deemed a performance nightmare. I haven't heard of this since AWT was put to bed. I hope this isn't in response to the request of adding something like Foxtrot to Swing. Foxtrot eases Swing's single-threaded nature, but doesn't make Swing multi-threaded. Making Swing multi-threaded is a mistake, I agree, but it's equally a mistake to not add something as brilliant as Foxtrot to Swing. I really hope Sun adds either the proper hooks for Foxtrot to integrate into Swing seamlessly, or they implement the very same thing in Swing, the latter being preferred.

    -Charlie

    Posted by: charliehubbard on October 19, 2004 at 08:32 PM

  • Maybe the presentation went but it would seem as though the slides have threading and event handling confused. I was not under the impression that using one precluded or insisted on using the other.

    On the question of deadlock... the trick to avoiding deadlock is to ensure that you always acquire and release the locks in the same order. That would seem to be a difficult thing to do in a multi-threaded GUI interface where your driving force is an unpredictable user 8^)

    Posted by: kcpeppe on October 20, 2004 at 05:25 AM

  • "What prompted the blog?": Well, people do keep asking for
    multi-threaded Swing. This has come up a number of times
    as part of our planning for Mustang.
    Because it is so superficially attractive, but so subtly
    problematic, I wanted to try
    to coherently write up my own views on why it isn't viable.

    I'm glad to hear that many people understand the issues here!
    This one isn't an issue of resources or relative engineering
    priorities, it is a reflection of an underlying technology
    reality.

    I definitely agree with the comments that we should work to make
    it easier for people to work with multiple threads within
    the mostly single-threaded GUI model!

    thanks - Graham

    Posted by: kgh on October 20, 2004 at 09:21 AM

  • Another poster wrote:
    how does a developer know if they are following Swing's threading rules correctly?
    I recently found a race condition (aka threading bug) in my layout manager and deduced that some code was calling my layout manager outside the event thread, causing this problem. The bug therefore led to NullPointerExceptions in my LayoutManager, but the real problem was that I was unaware that I had broken the one-thread rule somewhere.
    My solution was to add a throw new IllegalStateException("Should be called on AWTEvent thread only!") to the layoutContainer of my layout manager.
    The not-so-funny effect was that I found that almost all my applications had one threading bug or another that I was unaware of. And my throwing this exception basically made each and every application un-runnable. The point is that tracking down this kind of threading problem is really, really hard, since you almost never get good feedback, and when stuff fails, it does not fail fast enough to make me suspect a threading bug.
    I now have a System.err.println() in my layout manager and have slowly been fixing all my applications ever since. I would really like to recommend this approach to everyone who has the source to their layout managers. Perhaps a tip is to put an assertion in the production version, allowing more people to fix their GUIs without affecting production code.

    Cheers!

    Posted by: zander on October 20, 2004 at 10:57 AM

  • +1 to all these ideas! Add asserts to Swing to show developers where their errors are. Add something like Foxtrot to make dealing with the Threading issues easier. Fix SwingUtilities invokeAndWait() to report or avoid deadlocks. All of these would have helped me as I designed my last Swing application!

    Scott

    Posted by: scottganyo on October 20, 2004 at 12:50 PM

  • I agree with Graham that a multithreaded version of Swing isn't possible, having been one of the AWT engineers who failed with that toolkit. I am convinced that the current Swing API cannot be made truly threadsafe any more than the AWT could. There are too many unsafe methods and data, and no one will tolerate removing them from the existing API.

    I still think it's possible to have a multi-threaded GUI if the only access between it and the app were asynchronous events, but doubt anyone would be happy learning how to use that programming model. That's the environment I served my apprenticeship in, but it's not very common elsewhere.

    Posted by: tball on October 20, 2004 at 07:14 PM

  • I admit that I don't really understand the issues involved.. but, to play devil's advocate, what if every call (maybe optionally) to the component APIs queued an event on the event queue?

    For example, if I call button.setText(....) from some thread, instead of modifying the internal state of the button this method queued a change event on the Swing thread which would be done when the UI thread got to it.

    Doesn't that relieve the programmer of having to deal with multithreading (i.e. locks) and allow the UI to remain single-threaded?

    Posted by: dog on October 20, 2004 at 08:31 PM

  • Thanks for this article. I have read the same idea in a couple of different places and it makes sense to me that Swing follows a single-threading model.

    What does not make sense is that, out of the box, adding almost any significant non-GUI code (or even GUI code) to a button click or menu item click event causes the button to remain 'depressed' or the menu to remain visible. This is really non-intuitive. In other toolkits I have worked with, regardless of what code I was writing, buttons and menus always accepted the click, redrew themselves, and processing continued.

    I understand the arguments in favor of using SwingWorker or Foxtrot. However, I believe the Swing toolkit as a whole should default to using SwingWorker or Foxtrot. A JButton that is clicked has no business, ever, remaining depressed while waiting for a response elsewhere. A menu item that is clicked has no business remaining visible while the app does something else. Likewise, the default should be to turn on the hourglass when these events are kicked off.

    My suggestion (I will repost in the Mustang forum) is that some pattern (Adapter?) be applied to facilitate the multi-threading that is necessary for code using Swing (not MT within Swing itself). When a button is clicked, there should be a 'postClick' method that is fired in which non-GUI (data) processing takes place, and a separate method where GUI updates are written...something like that. For buttons and menu items, the redraw/repaint would be queued first, the processing method fired, and then the GUI update method fired. I am just riffing here, don't have the concrete details, but hopefully the point is clear.

    Basically, I think we are asking too much of entry-level Swing developers to have to learn about SwingWorker or other approaches when they want to get something simple done.

    Patrick

    Posted by: pdoubleya on October 22, 2004 at 12:46 AM

  • The Swing team gave a presentation at JavaOne 2004, TS-2853 "Desktop
    Application Architecture II: Using Threads Correctly and Effectively".
    Among other things, a new SwingWorker IV was presented. It aims to
    provide a framework for implementing long-running tasks that interact
    with the Swing UI.

    -Igor

    Posted by: idkush on October 25, 2004 at 12:25 PM

  • (Please don't flame me before reading the entire post).

    Microsoft toolkits (WinForms, Avalon) face similar problems and have similar solutions to Swing, but with some important extra flexibility.

    For example, a nice feature to have would be to allow more control of the message loop. Swing is a little weird in that the main processing loop is completely hidden: an app just continues running after main exits. This business of main being a separate thread from the UI thread has caused developers a lot of problems. Yeah, it's noted in several places what's safe to do on main, but it really should be easier by default.

    It would be better if there were an explicit Swing.RunApplication() call to run the main message pump and a Swing.DoEvents() to process queued events. This would open up easier ways to do UI thread processing a la FoxTrot. Also, it could become the preferred way to run the message pump without breaking backwards compatibility: just run the old-style event thread if no explicit call is made. This is pretty much what WinForms does. And heck, it's even what AWT does for modal dialogs (I know because I was on the AWT team when we switched it from a separate dialog thread to just pumping messages).

    In Longhorn/Avalon (please no flames) they've simplified things further. You have a UIContext, which is basically equivalent to Swing/AWT event thread, but it is possible to synchronously enter/exit the context (sort of like having a context wide lock). And more importantly, you can have multiple UI contexts. You are safe doing things in one context just like you are safe doing things on Swing's event thread. But there can be more than one context per process.

    Posted by: robikhan on November 08, 2004 at 02:56 PM


  • Response to robikhan post:
    Having a "context wide lock" is not a new idea. You have the same mechanism in GTK with the gdk_threads_enter() / gdk_threads_leave() functions. This produces ugly and complicated code. Furthermore, it doesn't protect the programmer from data race conditions (I spent a lot of time trying to debug the "unexpected async reply" Xlib complaints).

    Posted by: nakhli on November 23, 2004 at 05:10 AM

  • Actually, I want to speak up here.

    The real issue here is the lack of coroutines in most modern programming languages.

    Threads are all fine and good, but a thread is overkill for a spelling checker. People use threads for spelling checkers, etc. because Java doesn't provide coroutines.

    In order to emulate a coroutine, you must create some mechanism to store state between calls, or to "store" and "restore" state. I've dealt with volumes of VB 6 and prior programs that used this technique for multitasking, and the code was truly ugly.

    Then you have to indicate whether you're in process or not. Different programs have different ways of doing this. With a real event queue, you can push the state as an event, then pop it off later to restore the state.

    The issue is that this doesn't look very clean: you've got to modify multiple places in the code every time you add a variable, and you've got this extra class hanging around, being created and destroyed for each coroutine that you create.

    So then you look at the alternative, which is a thread. A thread you write your procedure from top to bottom, it executes from top to bottom, and all saving/restoring of the state is done by the platform. Your code is extremely clean, but you have to lock and unlock things or you can break data.

    Now the spellchecker often wants to display UI -- such as a message box saying "fix this occurrence?" This UI often involves UI from another thread, and so the people come asking you for multithreaded UI.

    If you use event passing, there's no standard publish/subscribe mechanism that crosses threads in Java, and so you implement your own system. The spellchecker posts the event, and then something checks the event queue, and you subscribe to the thing that checks the event queue. This could all be in the same class or in separate classes; it all depends on the implementation.

    A coroutine fixes this. You run up to a point in the spelling checker where the spelling checker gives up a time slice. Because the spelling checker is on the event pump thread, and because the programmer is in control of when time is given up, no synchronization objects are required. Because it's on the event pump thread, it can access the UI without any concurrency issues. There's no chance that another thread is using the UI.

    The legitimate use of multithreaded UI is in media file players, videogames, and emulators. And on these, all they need to do is update a surface that will eventually be blitted. Only one thread updates the surface, but another thread may actually be responsible for the blitting.

    This is necessary on operating systems such as BeOS and OS/2 with decent video subsystems, where this can substantially improve the performance of the application. On OS/2, I've personally witnessed a 2x performance improvement in some applications in this genre from using multithreaded UI.

    Posted by: davebac on December 02, 2004 at 11:06 AM

  • And sorry for running it altogether, and BeOS also applies to PalmOS 6+, which uses BeOS' codebase heavily.

    Posted by: davebac on December 02, 2004 at 11:09 AM

  • Graham, thanks for the nice entry.

    I enjoyed the observation:
    "The problem of input event processing is that it tends to run in the opposite direction to most GUI activity. In general, GUI operations start at the top of a stack of library abstractions and go "down". I am operating on an abstract idea in my application that is expressed by some GUI objects, so I start off in my application and call into high-level GUI abstractions, that call into lower level GUI abstractions, that call into the ugly guts of the toolkit, and thence into the OS. In contrast, input events start off at the OS layer and are progressively dispatched "up" the abstraction layers, until they arrive in my application code."

    I had a similar problem when trying to merge a yacc-generated, I/O-driven parser with an event-driven GUI for the visual environment. I ended up using a second-level event loop for the GUI, checked on passes through the input stream parsing. Others have probably done similar things, but it raised the quandary of needing to merge two control loops, each with its own state, and led to an interesting solution.

    Posted by: guthrie on April 29, 2005 at 08:31 AM




