JavaFX's Game of Life

Posted by opinali on May 21, 2010 at 10:17 AM PDT

There is an unwritten tradition that John Conway's Game of Life must be implemented in every programming language and every GUI toolkit. Well, OK, I just invented this tradition, but it's a smart introduction and Life is one of the easiest games / cool animations you can program. But it's not so simple that we can't learn a few important things about JavaFX...

My goal: a good-looking and feature-complete version of the Game of Life (GOL), while keeping the code simple, short, "canonical". I won't resort to low-level optimizations (e.g. reaching to JavaSE APIs), but I may use high-level ones (e.g. good algorithms, careful selection of JavaFX features). How well does JavaFX handle the task when used the way it is intended to be used?

So, let's start. The complete app is short enough that it fits in this blog, in a few small pieces.

class World {
    var SIZE:Integer on replace { reset() }
    var cells:Boolean[];
    var gen:Integer;
    var pop:Integer;
    function reset () { gen = pop = 0; cells = for (i in [0 ..< SIZE * SIZE]) false }
    bound function isLive (x:Integer, y:Integer)   { cells[y * SIZE + x] }
    function flip (x:Integer, y:Integer):Void      { cells[y * SIZE + x] = not cells[y * SIZE + x] }

The World class implements the game's data model and the GOL algorithm. The cells sequence contains true=alive, false=dead; it would ideally be a matrix, but JavaFX Script doesn't support multidimensional sequences so I have to do some index arithmetic. I could have used primitive arrays (with JavaFX Script's nativearray) but that would be impure, as native arrays are only intended for Java integration and don't completely integrate with JavaFX Script.

    function life (x:Integer, y:Integer) {
        var count = if (cells[y * SIZE + x]) then -1 else 0;
        for (yy in [max(y - 1, 0) .. min(y + 1, SIZE - 1)])
            for (xx in [max(x - 1, 0) .. min(x + 1, SIZE - 1)])
                if (cells[yy * SIZE + xx]) ++count;
        count == 3 or count == 2 and isLive(x, y)
    }

Function life() is the finite state machine for an individual cell; nothing JavaFX-specific here. Except that I hate and instead of &&.

Oh, I didn't make the obvious optimization of creating two extra rows and columns to avoid the min/max tests that prevent out-of-bounds errors at border cells without all neighbors, because this would reduce the general seamlessness of working with sequences. (It takes a lot of discipline to resist the urge of micro-optimization... ugh...)
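For comparison, here is a minimal sketch of that padding trick in plain Java rather than JavaFX Script. The class PaddedLife and its method names are hypothetical, not part of the Life program; the only idea taken from the text is the extra dead border that makes the min/max clamping unnecessary.

```java
/** Sketch: Game of Life on a grid padded with a one-cell dead border. */
final class PaddedLife {
    final int size;      // logical world size (SIZE in the article)
    final int stride;    // padded row length: size + 2
    boolean[] cells;     // (size + 2) * (size + 2) cells; the border stays dead

    PaddedLife(int size) {
        this.size = size;
        this.stride = size + 2;
        this.cells = new boolean[stride * stride];
    }

    // Logical (x, y) in [0, size) maps to padded position (x + 1, y + 1).
    int index(int x, int y) { return (y + 1) * stride + (x + 1); }

    void set(int x, int y, boolean live) { cells[index(x, y)] = live; }

    boolean isLive(int x, int y) { return cells[index(x, y)]; }

    // No clamping needed: thanks to the border, all 8 neighbors always exist.
    boolean life(int x, int y) {
        int c = index(x, y);
        int count = 0;
        for (int dy = -stride; dy <= stride; dy += stride)
            for (int dx = -1; dx <= 1; dx++)
                if ((dx != 0 || dy != 0) && cells[c + dy + dx]) count++;
        return count == 3 || (count == 2 && cells[c]);
    }

    // Computes the next generation into a fresh array; border cells are never written.
    void run() {
        boolean[] next = new boolean[cells.length];
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                next[index(x, y)] = life(x, y);
        cells = next;
    }
}
```

The price is the index translation in index(), which is exactly the kind of bookkeeping that makes the sequence-based version less seamless.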

    function run ():Void {
        ++gen;
        pop = 0;
        cells = for (y in [0 ..< SIZE]) for (x in [0 ..< SIZE]) {
            def cell = life(x, y);
            if (cell) ++pop;
            cell
        }
    }

Function run() recomputes the whole world (all cells). The inner for x builds a sequence for each row, and the outer for y concatenates all row sequences into a single big sequence ("auto-flattening"). I didn't worry, because the compiler may optimize this by adding the inner elements directly into a single sequence for the outer loop.

((( Begin Parentheses to investigate the compiler (((

Sequences are immutable; updates are performed by creating a full-new sequence, copying all non-updated elements. The compiler can optimize this too, with temporary mutable representations in methods that perform multiple updates; ideally you trust the compiler by default, and only optimize if necessary (as indicated by profiling). Having said that, my run() function replaces the current sequence by a new one, requiring a single assignment - but I didn't do it to optimize code, I did it because it's more elegant: the code explicitly calculates the entire state N+1 as a function of state N. In fact, run() was a one-liner before I augmented it to update the generation and population counters.

Notice that the GOL algorithm cannot be implemented easily with in-place updates because the new state of each cell depends on the current state of all cells around it. I could have used an in-place algorithm, but that would be uglier and also require some mutable data type like a nativearray.

Another interesting aspect of JavaFX Script is that its sequences are optimized for all basic types. My cells:Boolean[] uses a primitive boolean[] as internal storage, consuming a single byte per element; I've verified this behavior in the profiler. Let's check all these optimizations in the generated bytecode (decompiled):

    @ScriptPrivate
    public void run() {
        World receiver$ = this;
        set$World$gen(get$World$gen() + 1);
        set$World$pop(0);
        BooleanArraySequence jfx$25sb = new BooleanArraySequence();
        int y$ind = 0;
        for (int y$upper = get$World$SIZE(); y$ind < y$upper; ++y$ind) {
            int y = y$ind;
            BooleanArraySequence jfx$26sb = new BooleanArraySequence();
            int x$ind = 0;
            for (int x$upper = get$World$SIZE(); x$ind < x$upper; ++x$ind) {
                int x = x$ind;
                boolean cell = life(x, y);
                if (cell) set$World$pop(get$World$pop() + 1);
                boolean jfx$27tmp = cell;
                jfx$26sb.add(jfx$27tmp);
            }
            Sequence jfx$28tmp = jfx$26sb;
            jfx$25sb.add(jfx$28tmp);
        }
        Sequences.set(this, 1, jfx$25sb);
    }

Oh, crap - the compiler didn't use a single BooleanArraySequence like I expected. Unless my memory fails, javafxc is capable of this optimization, but maybe just for simpler cases. It seems the compiler still has a ways to go. Another missing optimization is preallocation: the maximum number of elements that will be inserted can be statically determined (SIZE for the inner sequences, SIZE*SIZE for the outer), so the compiler should create the sequences with these initial sizes, avoiding growing costs. Finally, every iteration of the outer loop allocates, uses and then discards a temporary sequence (its elements are copied to the outer sequence); this inner sequence could be allocated only once and cleared/recycled in all outer loop iterations. The latter optimization is unnecessary if the compiler could just avoid the inner temporary sequence, but I can see other scenarios where this wouldn't be possible but the reuse of temporary sequences would.
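To make the wished-for result concrete, here is a hand-written Java sketch of what run() could look like with a single preallocated buffer and no inner temporary sequence. This is not javafxc output; the class FlatWorld is hypothetical and uses a plain boolean[] in place of BooleanArraySequence.

```java
/** Sketch: one generation step writing straight into a single preallocated buffer. */
final class FlatWorld {
    final int size;
    boolean[] cells;
    int gen, pop;

    FlatWorld(int size) {
        this.size = size;
        this.cells = new boolean[size * size];
    }

    boolean life(int x, int y) {
        int count = cells[y * size + x] ? -1 : 0;   // don't count the cell itself
        for (int yy = Math.max(y - 1, 0); yy <= Math.min(y + 1, size - 1); yy++)
            for (int xx = Math.max(x - 1, 0); xx <= Math.min(x + 1, size - 1); xx++)
                if (cells[yy * size + xx]) count++;
        return count == 3 || (count == 2 && cells[y * size + x]);
    }

    void run() {
        gen++;
        pop = 0;
        boolean[] next = new boolean[size * size];  // sized once up front, never grows
        int i = 0;
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++) {
                boolean cell = life(x, y);
                if (cell) pop++;
                next[i++] = cell;                   // no per-row temporary to copy from
            }
        cells = next;                               // state N+1 replaces state N in one assignment
    }
}
```

Each generation then costs one allocation of SIZE*SIZE booleans, instead of SIZE+1 growing sequences plus the copies between them.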

There are also other gratuitous inefficiencies in the generated code, like several redundant temporary variables. (One of these, receiver$, is an artifact of traits, already planned to disappear from unnecessary places). Also I wonder if the order of the synthetic $ind and $upper variables in the bytecode may confuse loop optimizations (just like it confused my decompiler). Such small issues won't impact runtime performance as the JIT compiler will just optimize them out; but the redundancies affect startup/warmup performance and also code bloat.

Why am I complaining so much? JavaFX Script is a high-level programming language, in the sense that its mapping to the compiled form (Java bytecode) is not trivial (like it is for Java). And it actively promotes a high-level programming style, both by offering very convenient high-level features such as sequences and binding, and by not offering alternative low-level features (except for the recourse of "native interface" into Java classes). The net result of this design is that the compiler must assume responsibility for all the low-level optimizations that programmers can't do anymore (or, are convinced that it's not good style to do anymore - e.g. explicitly mutating sequences). In my Java code, I always do such things as preallocating collections, recycling expensive objects (notably big collections), or eliminating intermediary collections produced by inner loops.

The javafxc compiler already includes some impressive amount of such high-level optimizations; but we need more. Performance is already pretty good, but there is a lot of potential to be even better; I expect the code generation to keep improving for many updates to come.

))) End Parentheses to investigate the compiler )))

Anyway, let's continue the program...

    function scroll (dx:Integer, dy:Integer):Void {
        cells = for (y in [0 ..< SIZE]) for (x in [0 ..< SIZE]) {
            def yy = y + dy;
            def xx = x + dx;
            yy >= 0 and yy < SIZE and xx >= 0 and xx < SIZE and isLive(xx, yy)
        }
    }
}

World's last function, scroll(), allows me to scroll all cells in any direction. Nothing remarkable here.

def world = World { SIZE: 64 }
def CELL_SZ = 8;

def animSlider = Slider {
    min: 0 max: 1000 value: 50 blockIncrement: 50 layoutInfo: LayoutInfo { width: 120 }
}
def anim = Timeline {
    repeatCount: Timeline.INDEFINITE
    keyFrames: KeyFrame { time: bind animSlider.value * 1ms canSkip: false action: function () { world.run() } }
}

Now we start the game UI. I declare the world object, and the animation timeline that triggers a new generation at fixed delays. A Slider lets the user change this delay; I had to declare it here so I can use binding to automatically adjust the KeyFrame's delay from the slider value.

Notice the value * 1ms calculation, necessary to convert a Double to a Duration. The multiplication is a no-op, as 1ms is Duration's fundamental unit. You can't use a typecast (value as Duration), because the Duration type needs a unit (ms, s, m, or h) and there is no default unit, not even for 0. I like that, and I'd love to see JavaFX Script evolving to embrace user-defined units in its core typesystem; this would make a lot of sense for a high-level language serving business applications stuffed with manipulation of "real-world" data.

def toolbar =  HBox { spacing: 8
    content: [
        Button {
            text: bind if (anim.running) "Stop" else "Go"
            layoutInfo: LayoutInfo { width: 60 }
            action: function () { if (anim.running) anim.stop() else anim.play() }
        }
        Button {
            text: "Clear" layoutInfo: LayoutInfo { width: 60 }
            action: function () { world.reset() }
        }
        animSlider,
        Label { text: bind "({animSlider.value}) Generations: {world.gen} - Population: {world.pop}" }
    ]
    onKeyPressed: function (e:KeyEvent) {
        if (e.code == KeyCode.VK_DOWN)       world.scroll( 0, -1)
        else if (e.code == KeyCode.VK_UP)    world.scroll( 0,  1)
        else if (e.code == KeyCode.VK_LEFT)  world.scroll( 1,  0)
        else if (e.code == KeyCode.VK_RIGHT) world.scroll(-1,  0)
    }
}

I have a top row of controls that let me stop/start the animation, reset it to the initial state, control its speed, scroll the cells in four directions, and show the generation and population stats. The only remarkable part is the if-else cascade in onKeyPressed(), needed because JavaFX Script lacks a switch/case statement. The language already has first-class functions and closures, so adding a map type would allow efficient (hashed) branching for larger numbers of keys, and reasonably compact code too.
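As an illustration of that map-plus-closures idea, here is a small sketch in Java, which does have a hashed map type. Everything in it is hypothetical: the Key enum stands in for the KeyCode.VK_* constants and the Scroller interface stands in for World.scroll(); only the four (dx, dy) pairs come from the handler above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: key handling through a hashed map of closures instead of an if-else cascade. */
final class KeyDispatch {
    // Hypothetical stand-ins for KeyCode.VK_DOWN, VK_UP, VK_LEFT, VK_RIGHT (plus one unmapped key).
    enum Key { DOWN, UP, LEFT, RIGHT, SPACE }

    // Hypothetical stand-in for the scroll(dx, dy) function of the World class.
    interface Scroller { void scroll(int dx, int dy); }

    // Builds the table once; each entry is a closure capturing the world.
    static Map<Key, Runnable> bindings(Scroller world) {
        Map<Key, Runnable> m = new HashMap<>();
        m.put(Key.DOWN,  () -> world.scroll( 0, -1));
        m.put(Key.UP,    () -> world.scroll( 0,  1));
        m.put(Key.LEFT,  () -> world.scroll( 1,  0));
        m.put(Key.RIGHT, () -> world.scroll(-1,  0));
        return m;
    }

    // One hashed lookup per key press; returns false when the key has no binding.
    static boolean dispatch(Map<Key, Runnable> bindings, Key key) {
        Runnable action = bindings.get(key);
        if (action == null) return false;
        action.run();
        return true;
    }
}
```

With only four keys the cascade is just as fast; the table pays off when the number of bindings grows, or when they need to be reconfigured at runtime.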

def life = Group { content: for (yy in [0 ..< world.SIZE]) for (xx in [0 ..< world.SIZE])
    Rectangle { x: xx * CELL_SZ y: yy * CELL_SZ width: CELL_SZ height: CELL_SZ
        fill: bind if (world.isLive(xx, yy)) Color.BEIGE else Color.BLACK
        stroke: Color.BLUE
        onMouseClicked: function (e:MouseEvent) {
            world.flip(xx, yy);
            toolbar.requestFocus();
        }
    }
}

The main "game" region is a grid of Rectangles to show each cell. Once again I use nested Y/X loops, producing the sequence expected by Group.content. For each Rectangle, I've used binding to set its fill color according to the corresponding cell state.

Yup, that design (and even my choice of the Game of Life - mwahahaha!) was a purposeful stress-test of both binding and the scene graph; with the 64x64 world size, this means 4,096 nodes and 4,096 bound properties, so I'm relying a lot on compiler and runtime efficiency.

The handling of mouse clicks, used to toggle cells, is trivial because I can attach the event handler to each Rectangle, so I don't need any picking logic. Notice also that my event handler is a "full closure" that reaches to the xx, yy variables - the indices of the for loops that built the Rectangle sequence.

Finally, in that same mouse handler I force the keyboard focus to the toolbar because that's where I installed the KeyEvent handler for scrolling.

Stage {
    title: "Life" resizable: false
    scene: Scene { content: VBox { content: [ toolbar, life ]}}
}

The Stage and its Scene, with the toolbar on top of the game region. It's complete! Click the image below to launch:

Game of Life screenshot

The resulting functionality and even the look are, IMHO, surprisingly great for a program that's under 100 lines of code. Just google "Game of Life Patterns"; the web is chock-full of GOL resources. Just click the cells; sorry, no import/export of LIF or RLE files yet - may appear in the payware version ;-)  This validated my impression about JavaFX's productivity...

A B[l]inding Puzzler

...but I confess that my first code didn't work; the cells didn't change in the screen, as if world was not being recalculated at all. The bug was here:

    function isLive (x:Integer, y:Integer)   { cells[y * SIZE + x] }
...

    Rectangle { fill: bind if (world.isLive(xx, yy)) Color.BEIGE else Color.BLACK ... }

My fill: bind... was not firing when world.cells changed. The problem is, the only variables captured by the bind expression are world, yy and xx. These are the only data whose updates would trigger reevaluation of my bind expression. The cells sequence is encapsulated by the World class, and it's not directly referenced from the bind expression. This may be a significant puzzler, as someone could start with a more "scriptish" prototype code full of script-scope variables, and later refactor these into classes.

The fix was trivial once I found the problem; just declare bound function isLive..., and it works. Now the binding system knows that the subexpression world.isLive(xx, yy) is also invalidated when the cells sequence is changed, because that field is used inside isLive() and this dependency propagates to bound expressions that invoke isLive(). (Such propagation is not a completely obvious feature; it shows that the binding mechanism is pretty well rounded, with robust dependency tracking.)
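The puzzler can be reproduced with a toy model in Java. This is only a sketch of the general idea (a cached expression that re-evaluates when one of its known dependencies changes); it says nothing about how javafxc actually implements binding, and the Var and Bound classes are hypothetical. In JavaFX Script the compiler derives the dependency list itself; here it is passed by hand, which makes the difference between the two cases explicit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Sketch: a toy dependency-tracked binding, to show why a missed dependency goes stale. */
final class BindingSketch {
    /** A mutable value that notifies its listeners when replaced. */
    static final class Var<T> {
        private T value;
        private final List<Runnable> listeners = new ArrayList<>();
        Var(T value) { this.value = value; }
        T get() { return value; }
        void set(T newValue) {
            value = newValue;
            for (Runnable l : listeners) l.run();
        }
        void onChange(Runnable listener) { listeners.add(listener); }
    }

    /** A cached expression, re-evaluated only when one of the listed dependencies changes. */
    static final class Bound<T> {
        private final Supplier<T> expr;
        private T cached;
        Bound(Supplier<T> expr, Var<?>... deps) {
            this.expr = expr;
            recompute();
            for (Var<?> d : deps) d.onChange(this::recompute);
        }
        private void recompute() { cached = expr.get(); }
        T get() { return cached; }
    }
}
```

A Bound that lists only the index variable models the call to the plain isLive(): replacing the cells leaves its cached value stale. A Bound that also lists the cells variable models the bound function: the same replacement triggers re-evaluation.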

Timeline issues

The KeyFrame.time property is read/write; this is very convenient because I can change the animation speed by just updating this property in my single KeyFrame. Unfortunately, this doesn't work very well. The program starts with a configuration of 50ms; if you click Go and then drag the slider (left = smaller delays / faster animation, right = larger delays / slower animation), the animation will adjust its speed but not smoothly. Sometimes I observe a pause of a few seconds, sometimes a "race" of very fast animation while dragging the slider. The animation engine must be doing some timing/scheduling that becomes temporarily confused when a KeyFrame's time property is changed.

I have tried some alternate implementations - using an intermediary variable with an on replace trigger that stops (or pauses) the timeline, changes the KeyFrame delay and resumes it; and even creating a new Timeline. But the result was always similar.

Performance

This is a simple app, so it shouldn't put a big stress on JavaFX... except perhaps, for my sub-optimal state management, and large counts of nodes and bindings.

Idle test: Empty world, animation stopped. CPU usage is 0 as expected, and GC log shows zero activity. This test looks trivial, but it's good to assert that no part of the system (binding, scene graph) uses polling, busy-waiting or other brain-dead techniques. Some platforms are known for non-zero CPU usage in idle apps, so it's good to show that JavaFX won't do that. ;-)

Dead test: Empty world (no live cells), animation on at 50ms (20 gen/s == 20 fps). CPU usage was a lowly ~1.1% (on a quad-core Q6600; so that's ~4.4% of a single core). Most work is due to the recalculation of the world; we can inspect the GC log:

[GC 15.049: [DefNew: 4421K->5K(4928K), 0.0005543 secs] 13263K->8847K(15872K), 0.0005950 secs]
[GC 16.201: [DefNew: 4421K->5K(4928K), 0.0006040 secs] 13263K->8847K(15872K), 0.0006480 secs]

That's ~3.8MB/s = ~190KB per frame/generation = ~46 bytes per cell update. It's a bit higher than I originally expected; the missing sequence optimization is certainly the cause, as it produces a lot of extra allocation. But even that is not so bad, because Java's excellent GC produces near-zero pauses.
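The arithmetic behind those figures can be redone from the two log lines. A small Java sketch follows; the class and method names are hypothetical, and it assumes that everything allocated between two consecutive young-generation collections equals the young-generation occupancy before the second collection minus the occupancy left after the first.

```java
/** Sketch: deriving the allocation rate from two consecutive young-generation GC log entries. */
final class GcMath {
    // KB allocated per second between a GC at t1Sec and the next GC at t2Sec.
    static double allocatedKbPerSecond(double t1Sec, double t2Sec,
                                       int youngAfterFirstGcKb, int youngBeforeSecondGcKb) {
        return (youngBeforeSecondGcKb - youngAfterFirstGcKb) / (t2Sec - t1Sec);
    }

    static double kbPerGeneration(double kbPerSecond, double generationsPerSecond) {
        return kbPerSecond / generationsPerSecond;
    }

    // Treats 1 KB as 1024 bytes.
    static double bytesPerCell(double kbPerGeneration, int cellCount) {
        return kbPerGeneration * 1024 / cellCount;
    }

    public static void main(String[] args) {
        // Timestamps and DefNew sizes taken from the two log lines above.
        double rate = allocatedKbPerSecond(15.049, 16.201, 5, 4421);
        double perGen = kbPerGeneration(rate, 20);            // 50ms delay = 20 generations/s
        double perCell = bytesPerCell(perGen, 64 * 64);       // 4,096 cells
        System.out.printf("%.0f KB/s, %.1f KB/generation, %.1f bytes/cell%n", rate, perGen, perCell);
    }
}
```

The result lands within a couple of bytes of the ~46 bytes per cell quoted above; the small difference comes from rounding and from whether a K is taken as 1000 or 1024 bytes.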

Dead & Headless test: Similar to the previous test, but I commented out the bind in Rectangle.fill, so the entire GUI layer is a no-op after initial startup. GC behavior was identical, but CPU usage down to 0.89% (of a single core). This means that the binding was costing 0.22% core (or 0.00005% per cell: how's that for precision?). Notice that my code is assigning new values to every cell; the fact that the new values are identical to the old values only saves the effort to repaint the rectangles, but the bind expressions must be reevaluated every time.

Life test: I changed the life() function to just flip all cells in the first few rows. [I only change the return statement, to not remove the effort of calculating all cells with the normal GOL algorithm.] This provides a stable animation test; a real Life run is difficult to benchmark because the number of cell changes in each generation varies chaotically.

The animation engine could not keep up with many rows - the updated cells are only refreshed in some frames. Testing with 2 rows (128 cells) is fine; at 4 rows (256 rects) I could already see skipped frames. Garbage collection was intense, let's see it for 2 rows:

[GC 9.439: [DefNew: 4422K->9K(4928K), 0.0014487 secs] 13267K->8855K(15872K), 0.0014851 secs] 
[GC 9.480: [DefNew: 4425K->6K(4928K), 0.0015159 secs] 13271K->8851K(15872K), 0.0015514 secs]

The animation engine does a lot of allocation when I simply change the Rectangle.fill property to a different Color. We're up to ~120MB/s = ~6MB / generation = ~46KB per updated cell. It seems that the scene graph completely rebuilds its internal node objects ("SGNodes") when some property changes. These preprocessing techniques are essential to accelerate such things as transforms and effects, but in this case I'm just changing a simple Rectangle's internal painting from one solid color to another solid color.

The program is doing full vector rendering; if you check JavaFX Balls, drawing things from geometric elements may be much slower than just blitting a bitmap image. But JavaFX Balls used a complex drawing with curves and gradients; Life only draws pretty dull rectangles without rounded corners, transforms, effects or anything else. I've tested the Prism toolkit too but this time it didn't save JavaFX; basically the same behavior.

I experimented with some optimizations that I didn't originally want to use:

  • The first obvious thing is using a single ImageView for the background of all-dead cells. Then I have one fixed Rectangle per live cell, just hiding it when the cell isn't live.
  • Using ImageView also for the live cells (so, full "bitmap rendering").
  • To show/hide the live cells, I've tried both flipping the visible property, and moving dead cells away from the view (change y to a big negative value - this requires some layout tweaks).
  • Finally, I changed the code so that the entire Group of rectangles is protected by a single bind expression, and only live cells will generate a Rectangle. This replaces all live-cell nodes, if any, at every frame. The advantage is that most cells are usually dead, so the scene graph has fewer nodes.

All these optimizations netted me a maximum of ~2X speedup; I could animate 4 rows of pulsating cells with proper behavior (no visible frame skipping - still, high CPU and GC activity).

It seems that JavaFX's scene graph is already perfect for GUIs with controls, but must improve its support for general animation. Changing the state of a large number of nodes per frame shouldn't have such a high cost. Even adding/removing many nodes should be faster, although I realize this is harder and will accept tradeoffs in coding effort - e.g. carefully breaking the scene graph into many groups, then adding/removing these to the scene, maybe with a hint to let the engine do all preprocessing in parallel and only make the new nodes visible when they're fully realized but not hang the animation until that happens.

This would be perfect for problems like Joeri Skora's Isometric tile rendering in JavaFX; notice that JavaFX 1.3 has pretty good performance for a very big scene graph - even in Use brute force mode, his animation scrolls over a 65,536-node scene with surprisingly good performance. But that's just because the scene is completely static; and the approach of dynamically adding and removing nodes, even with optimizations like quadtrees, suffers from the overhead of changing the scene tree.

I don't expect that JavaFX's scene graph would be optimized for huge scenes (e.g., with a spatially indexed node tree for more efficient clipping). JavaFX is not meant to compete, out of the box, as a high-end game engine. But it should be sufficiently powerful and flexible to allow programmers to add extra tricks and optimizations that become necessary in each application niche. Besides games (a huge business even in its "casual" category), there are other important cases for advanced animation, such as sophisticated data visualization.

JavaFX versus Swing

I've quickly googled "Java Swing Game of Life" and found one program that's very close to mine. (Even closer after I stole its idea of having a slider to change the animation speed.) I've made a few changes to the Swing app to make both comparable - same cell number and size, same optional hack for the stable Life tests.

Code size and clarity: JavaFX wins. The Swing code is ~230 lines, and that's after I've stripped many redundant comments and {}s. Even removing all remaining comments (not fair because the code is clearly not all obvious) and tightening the formatting/indentation even more, it's more than 2X the size of the JavaFX Script code. And that's with fewer features (no scrolling). There is no contest in code size - or, much more importantly, in code clarity, which is more subjective, but if you check the code I don't think there's much room for argument. (This program may not be the best possible Swing code, but I don't think the best would be much better.)

The Swing program does no custom painting, it creates a custom JLabel for each cell and changes its background color - this is nice because it's the closest thing to a "scene graph-based" Swing program: all rendering is performed by the toolkit. Also, the panel objects contain the cell state and the GOL algorithms, using two state variables and two complete passes over all cells to enable in-place updates. That should put the Swing program in performance advantage over JavaFX. (Once again I could optimize my GOL program, notably for in-place update - but I don't want to; I'm focusing on easy to write, easy to read code.)
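The two-variable, two-pass scheme can be sketched as follows. This is not the code of the Swing program mentioned above; the class TwoPassLife and its Cell type are hypothetical and only illustrate the approach the paragraph describes, with the Swing parts left out.

```java
/** Sketch: in-place Game of Life update using two state fields per cell and two passes. */
final class TwoPassLife {
    static final class Cell {
        boolean alive;       // current state, read during pass 1
        boolean nextAlive;   // next state, written during pass 1
    }

    final int size;
    final Cell[] cells;

    TwoPassLife(int size) {
        this.size = size;
        this.cells = new Cell[size * size];
        for (int i = 0; i < cells.length; i++) cells[i] = new Cell();
    }

    Cell at(int x, int y) { return cells[y * size + x]; }

    private int liveNeighbors(int x, int y) {
        int count = 0;
        for (int yy = Math.max(y - 1, 0); yy <= Math.min(y + 1, size - 1); yy++)
            for (int xx = Math.max(x - 1, 0); xx <= Math.min(x + 1, size - 1); xx++)
                if ((xx != x || yy != y) && at(xx, yy).alive) count++;
        return count;
    }

    void step() {
        // Pass 1: compute every next state from the untouched current states.
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++) {
                Cell c = at(x, y);
                int n = liveNeighbors(x, y);
                c.nextAlive = n == 3 || (n == 2 && c.alive);
            }
        // Pass 2: commit. Nothing is allocated per generation.
        for (Cell c : cells) c.alive = c.nextAlive;
    }
}
```

The cell objects are allocated once and reused forever, which is consistent with the near-zero GC activity reported for the Swing program.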

Memory usage: Swing wins. Measuring the Life test, JavaFX uses 54MB working set / 100MB private bytes (8,852KB heap); JavaFX+Prism is better at 51MB / 82MB (8,875KB heap).  The Swing program uses 55MB / 94MB (3,170KB heap, but without any significant allocation/GC activity even in the Life test). JavaFX uses more heap, notably for bindings and closures and the scene graph. JavaFX 1.3 has improved the efficiency of both binding and the scene graph, but if you use both in the range of thousands, there's still enough overhead to care. But JavaFX really loses in its excessive memory allocation when scene graph nodes are updated.

Performance: Swing wins. The Swing program doesn't suffer from the issues I discussed with JavaFX's scene graph; it happily runs the Life test with near-zero CPU and GC activity - like we should expect from a simple, 64x64 Game of Life running on a current computer, and executed by native code (in that case JITted) and a reasonably hardware-accelerated toolkit (which includes Java2D/Swing).

Last Conclusions (and RFE's...)

I am still quite happy with my Life program. It was a pleasure to write, and some of the performance issues (notably with sequences) are easy to fix if I care. Perhaps even the scene graph limitations have a smarter workaround that I didn't try - e.g., rendering all live cells with a dynamically-built Path (I was just too lazy to try that one...) or some other trick.

I'm not happy though with these scene graph limitations; I should be able to write an efficient version of something like the Game of Life without any optimization effort. Adding/removing many nodes from the scene graph is very expensive; I can accept and understand this, it's the core tradeoff of scene graphs (but still, in JavaFX the cost seems unreasonably high). But I neither understand nor accept a big overhead for trivial updates in existing nodes - like changing a solid color, flipping the visibility state, or even just translating. At least in this area, it seems that JavaFX must still improve significantly. Even if JavaFX is now very close to being an excellent and complete platform for some important use cases - control-centric (e.g. business front-ends) and media-centric - the platform is still at the beginning of a steep adoption curve and it can't afford not to serve other niches very well.

As a JavaFX enthusiast, I like to refer to all former Java GUI toolkits (AWT/Java2D/Swing, LCDUI, even SWT) as "legacy" and "obsolete"; but this is clearly not fair while there are programs that I can write in these old toolkits with excellent results, but not in JavaFX. I'm optimistic because JavaFX has already improved a lot since its v1.0; the foundations are very solid and the JavaFX team is now catching up very fast in areas like high-quality controls, layout and styling. The compiler is also maturing fast; the optimization of binding in 1.3 was massive (even though not yet complete) and the sequence optimizations are ongoing.

Finally, we could argue that the scene graph paradigm is not ideal for all graphics applications, but I don't believe that. I see immediate mode rendering as the future Assembly coding of graphics. On the other hand, shader programming is an important piece of modern graphics stacks; JavaFX uses this intensely (notably in Prism), but unfortunately it's only internal. With that support, I could write all the cell rendering easily inside a single canvas node - the Life "world" can be rendered as a big functional texture, and its rendering is a ridiculously-parallelizable task that's perfect for the shader paradigm. Shaders are often a great replacement for important use cases that don't favor scene graphs. So, in my (non-expert) opinion, the big missing piece in the JavaFX stack is not a traditional immediate-mode API, but opening Decora (the desktop runtime's portable shading engine) for applications, with a public shading API.
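To illustrate the "functional texture" idea without any shading API (Decora exposes none to applications, so nothing here uses it), here is a CPU-side sketch in plain Java. The class LifeTexture and its method names are hypothetical. Each pixel's color is a pure function of its coordinates and the cell array, the way a fragment shader would compute it, so the pixels can be filled in parallel in any order.

```java
import java.util.Arrays;

/** Sketch: rendering the Life world as a "functional texture" into an ARGB pixel array. */
final class LifeTexture {
    static final int BEIGE = 0xFFF5F5DC;  // live cell
    static final int BLACK = 0xFF000000;  // dead cell
    static final int BLUE  = 0xFF0000FF;  // grid line

    // Pure per-pixel function: no shared mutable state, like a fragment shader.
    static int shade(boolean[] cells, int size, int cellPx, int px, int py) {
        if (px % cellPx == 0 || py % cellPx == 0) return BLUE;   // 1-pixel grid line
        boolean live = cells[(py / cellPx) * size + (px / cellPx)];
        return live ? BEIGE : BLACK;
    }

    // Fills a (size * cellPx)-square image; every pixel is computed independently.
    static int[] render(boolean[] cells, int size, int cellPx) {
        int side = size * cellPx;
        int[] argb = new int[side * side];
        Arrays.parallelSetAll(argb, i -> shade(cells, size, cellPx, i % side, i / side));
        return argb;
    }
}
```

With the article's 64x64 world and 8-pixel cells, this produces a 512x512 image per frame regardless of how many cells changed, which is the property that makes the approach attractive when many nodes would otherwise be updated at once.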

Attachment: Life.jnlp (969 bytes)

Comments

Life or Death

Thanks for the interesting post. I always knew that the scene graph was slow at updating but I did not know that the complete graph rebuilds when an insert or a trivial change is made. It's plain as day now that the scene graph is not the right structure for some applications. In my own application I have moved on to using JLayer which gives me buffered layered views. I've then created a set of reusable and reconfigurable renderer objects that target specific layered views; these replace the need for having many nodes. Then I've extracted all the layout and styling information out to a configuration manager which is backed by Lucene. A query then activates a given configuration combination to cause the renderers to lay out and paint using a given style. My renderers use Java2D as it is simply the best way to render, and the Graphics2D object can be reused as well. With the JDK 7 improvements in Java2D with XRender I'm sure my renderers are good to go. Scene graphs and Pixel Shaders may be suitable for some types of use cases but Java2D and buffered layered views are still the best for others. If the JavaFX nodes had had more smarts, like the Piccolo nodes did by having the ability to use Java2D within a node, then we would have a different story. At this juncture JavaFX is like the new candy counter at the store, full of neat candy but no really solid food to give you the energy to go the distance.

Several points here

Some apps may indeed be always better to program in immediate-mode APIs; I just think that scenegraph(+shading) will eventually be efficient enough for almost everything. When a lower-level technique gives you only a single-digit performance advantage but has double-digit disadvantages in productivity etc., the trend is always moving to system-level code and very few application niches. This is a never-ending cyclical evolution; my first serious study of graphics programming was the excellent series of articles from Michael Abrash published almost 20 years ago in DDJ - and guess what, nobody uses many of those (then state-of-the-art) techniques directly anymore, except perhaps GPU chip designers or programmers who code graphics drivers or APIs/runtimes like Java2D or JavaFX... the total dominance of the scene-graph paradigm is not "if", it's just "when".

Second point: paradigm is one thing, implementation is another. Maybe the JavaFX scene graph is just not optimized enough and I'm running into some important bug or bottleneck. (BTW it doesn't rebuild the whole scenegraph, it only rebuilds stuff for the changed or added/removed nodes.) Or maybe FX is just too general-purpose, and a domain-specific system (like a dedicated 2D game engine) may always beat it. But I hope they keep improving FX so most developers won't have to master many different UI/graphics APIs.

The JLayer from JDK7 seems to be indeed a very interesting development. We need enough flexibility in the core platform (in that case JavaSE) so programmers can always pick the best architecture and API choices for each job.

People have asked for the ability to use Java2D to manually render some JavaFX components; in my blog I provide the basic directions (I have used a very low-level approach of direct manipulation of the pixel array, but I guess I could have also used a Graphics2D to draw on the BufferedImage). The problem is that this only works for the JavaFX Swing-based toolkit, but not for Prism. Looking forward with Prism, we would need something like a public Shader API.

Profiling away

Hey, great post, love the detail and effort :-). I've been profiling this on Mac OSX with Shark which will profile all the way down into the native libs (very very useful). Anyway, the main culprit here is actually that little Label with the bound text. Shark shows somewhere around 54% of CPU time spent doing the text work (mostly related to positioning the text within the Label, and for building up the String which, believe it or not, uses String.format. There is a bug filed for this). When I comment out the Label, I see a much different profile where 24.2% CPU time is spent doing binding invalidation, and the remaining time doing actual fills.

Which is all to say, the scenegraph is not at fault :-). At least, not for the CPU time. I have not yet dug into the garbage generation, though typically we've seen most of this happening in Java2D with the creation of Graphics2D objects and in other places.

 

I will check that too

I've managed to avoid profiling, to not break my plans for this "high-level" article - but yeah now it's time to see what is really going on. I'm happy if this demo/benchmark becomes useful; and I'm surprised with your initial findings. (Is this specific to OSX? I'm only testing on Win7.)

(I'm sending you the project if you want to reproduce all my tests, it's a big mess right now with several variants of the code commented out, but I'll post it later when it's more clean.)