Skip to main content

Dropping Boxes with JavaFX

Posted by opinali on September 3, 2010 at 12:05 PM PDT

If you want to work for DropBox, they have an interesting programming test which solution must be submitted together with the CV. I’m not considering a position at DropBox, but their test was too fun to ignore: an interesting challenge in algorithms, and another opportunity to exercise JavaFX as any geometric problem surely deserves some GUI.

(Don’t read this blog if you actually plan to apply for a job at DropBox.  I don’t think the company would use this problem as its single method of recruitment; this is more valuable for the candidate.)

The problem is to find the ideal packing of some number of arbitrary “boxes” (rectangular objects), so the total area is minimized. The boxes can be rotated if necessary, and the area of the smaller rectangular area that contains all packed boxes is the solution.

Like most geometric problems, this is easy to solve by sketching some empiric solution and pondering how to translate it to an algorithm. The first, somewhat obvious idea:

  • Sorting the boxes: First place those with bigger areas, to avoid inefficient placement that could happen when trying to place large boxes on a very fragmented free space caused by smallish boxes.

I don’t have a formal justification for this heuristic... it's just a reasonable analogy with similar fragmentation problems, from memory allocation to disk filesystems. Partitioning by size is usually a good idea.

Let's start the implementation with the sorting function and some random utilities:

function area (c: Point2D)     { c.x * c.y }
function area (r: Rectangle2D) { r.width * r.height }

class RectComparator extends java.util.Comparator {
    override public function compare (r1: Object, r2: Object) : Integer {
        (area(r2 as Rectangle2D) - area(r1 as Rectangle2D)) as Integer;
    }
}

function sort (rects: Rectangle2D[]) {
    Sequences.sort(rects, RectComparator{}) as Rectangle2D[];
}

I was surprised that I needed to write a Comparator class like that. I can't just pass an equivalent  JavaFX Script function to the sort() method, because the language doesn't support SAM conversion (like proposed for Java 7). I wonder if some kind of support for Java's generic types would also be needed in that case, so a function (r1: Rectangle2D, r2: Rectangle2D) {...} would be type-safely converted to a Comparator.

Also, I had to write helper functions for areas. I noticed that javafx.geometry is much simpler than the equivalent AWT package. JavaFX APIs generally have a minimalist design, due to footprint especially on its lower profiles; but I wonder if we could have some extra power here, at least for the Desktop profile. Sophisticated manipulation of geometric models is something I see a significant number of JavaFX applications doing. The javafx.scene.shape is much more complete, but its classes are UI components and not adequate for anything else.

The second idea came easily too:

  • Candidate positions: Have a list of possible positions for box placement. In the initial state, the only candidate position is the origin (0, 0): the top-left corner of a virtual bounding box that grows to contain all placed boxes. This list will be updated as each box is positioned.

The figure below shows the virtual bounding box (thick lines); the initial candidate position (red disk); the next box to place (dashed line); and the ideal position for that box (blue arrow). This ideal position is calculated by our brain, not by the algorithm: we're still working on the problem of finding some algorithm that will select this position!

box1

Placing the first box is trivial because there’s only one candidate position, but this trivial special case doesn't help much to further reveal the algorithm.

Let's write down the data structure that will represent each iteration, or step, of the solution:

class State {
    public-init var output: Rectangle2D[] = [];
    public-init var candidates: Point2D[] = Point2D { x: 0 y: 0 };
    public-init var limits: Point2D = Point2D { x: 0 y: 0 };
}

In this class State, I have a sequence of output rectangles (the already-placed boxes), a sequence of candidate positions, and the limits of the virtual bounding box of all placed rectangles. This class will be immutable; functions that operate on its data will be member functions (methods) of State.

box2

I consumed the original placing position, and this reveals two new interesting candidate positions: the northeast (top-right) and southwest (bottom-left) corners of the placed box. The southeast (bottom-right) corner, or other positions (e.g. at the middle of some edge), don't make sense. All boxes must be tightly packed, so each new box should go as much north and west as possible, always touching other boxes' edges (or the origin axes).

This looks like a promising solution, at least a good start. In fact, this solution is already good enough for the simple 3-box problem from DropBox’s challenge. Now I can envision the core algorithms for box placement:

  • Placing a box: For each new box, try placing it in all candidate positions. Try this twice per box - with its original shape, and rotated (inverting its height and width). Discard placements that create intersection with any existing box. Calculate the total area for each attempt, and choose the option that delivers the smaller bounding area.
  • New candidate positions: Remove the position consumed by the placed box. Add the NE and SW corners of the placed box.

Here is the code for the box-placing part:

function flip (r: Rectangle2D) {
    if (r.height == r.width)
        null
    else
        Rectangle2D { width: r.height height: r.width }
}

function intersects (placed: Rectangle2D) {
    for (o in output) if (o.intersects(placed)) return true;
    false
}

function pack (next: Rectangle2D) {
    var bestLimit = null;
    def currArea = area(limits);
    var bestArea = Float.MAX_VALUE;
    var bestRect: Rectangle2D = null;
    var bestCand: Point2D = null;

    for (n in [ next, flip(next) ], c in candidates) {
        def newLimit = Point2D {
            x: max(c.x + n.width, limits.x),
            y: max(c.y + n.height, limits.y)
        }
        def newArea = area(newLimit);

        if (newArea < bestArea) {
            var placed = Rectangle2D {
                minX: c.x minY: c.y width: n.width height: n.height
            };

            if (not intersects(placed)) {
                bestLimit = newLimit;
                bestArea = area(newLimit);
                bestRect = placed;
                bestCand = c;
                if (newArea == currArea) break;
            }
        }
    }

    State {
        output: [ output, bestRect ],
        candidates: nextCandidates(bestCand, bestRect),
        limits: bestLimit
    }
}

Function Scene.pack() is basically half the solution; it depends on a nextCandidates() function that we'll see later. It also uses two new helper functions, flip() and intersect().

Notice that flip() returns null if the rectangle is square - in that case we don't try the same placement twice! The use of this null value is a JavaFX Script trick. The loop for (n in [ next, flip(next) ]) iterates a sequence that contains the "next box" and also its flipped version. But if flip(next) returns null, that sequence will contain only next, because sequences cannot contain nulls; sequence construction silently ignores nulls.

One important decision is the handling of multiple candidate placements that don't increase the current bounding area - an extremely frequent event, for smaller boxes that fit in some existing "hole". My first instinct was trying a best-fit decision, scanning all candidates and having a bias for placements closer to the top-left corner or some other heuristics. But the performance cost of this decision was big, because it implies in a full Cartesian product of boxes X candidate positions; and the result in better overall placement was virtually zero. So in the end I opted for a mix of best/first-fit strategy: keep searching candidates, but abort that search when I find the first placement that is "good enough" - doesn't increase the bounding area - the critical variable that the algorithm must optimize.

box3

Back to our boxes... We placed the second box; so far so good, but look at the next box... this will cause us some trouble.

box4

Now the rule for adding new candidate position has failed; the new set of candidate positions is clearly not sufficient. The ideal position for the next box is just under the last placed box, touching its bottom edge; but with the X coordinate more to the left, touching the right edge of the bigger box.

Our original rule was too simple, it ignores the gaps caused by placements that don't preserve the neat, ladder-like arrangement of the first three figures.

At this point in the problem, there are some possible solutions:

  1. Brute-force. Add to the candidate list the full Cartesian product of the right and bottom coordinates of each placed box X all existing boxes.

This will produce a large number of candidate positions, most of them useless, like this:

box5

This solution works; the problem is scalability. The number of candidate positions will be roughly O(N2) on N = number of boxes, even with some rules to discard a few positions (repeated positions produced by different combinations of boxes; positions already used by some placed box; SE position of any placed box). Positions that land in the left or top edge of any box are especially useless, can't place boxes of any non-zero area. You could trim such useless positions, but that would be expensive, requiring intersection tests with all existing boxes.

  • 2. Rocket-science. We could design a smarter placement algorithm that uses the candidate positions, but is able to "shift" blocks to the left (or to the top) until they touch another box.

box6

The picture shows this idea: the green arrow is the shift-left that will adjust the placement from the initial candidate position. How to implement this? I considered mapping the free space, starting with the bounding rectangle for all previously-placed boxes, then subtracting all these boxes to produce a shape containing all the free space... then partition this shape into a list of rectangular free-slots... so I can finally make intersection tests between my next rectangle and any free slots immediately to its top or left... this would be a potentially incremental process, stopping when there are no more free neighbors or when I intersect some other box.

This solution will also work; the problem is, it's complex. I am basically implementing a small physics engine - just add gravity, momentum, mass centers, some Newton's formulas... and my head is starting to hurt!

Remember that I will have to do this smart-ass stuff for all candidate positions X all "next" boxes! The preprocessing of the initial state (like creating a map of free space) can be performed incrementally from each state to the next; but the remaining effort per iteration is still significant, and the code would be quite complex.  Overall, it doesn't look much better than the brute-force approach.

  • 3. Think, Iterate, Refine. I have actually considered both previous solutions before I found the ideal solution - and as you will see, it can be considered a refinement of the previous idea.

I have changed the algorithm to create the new candidate positions after placed boxes. Now it reads like this:

  • New candidate positions: Remove the position consumed by the placed box.  Add the NE and SW corners of the placed box. For the SW corner, find the intersection with the right-most edge of all existing boxes at its left. For the NE corner, find the intersection with the bottom-most edge of all existing boxes above.

You will notice that this simple scan of intersections is a watered-down version of the "rocket-science" solution: this effectively finds the positions that would be created by shifting boxes to the left or top. The next picture illustrates the result; the green arrow shows the SW-originated candidate position that was shifted left.

box7

At this point I considered the algorithm complete and proceeded to code the GUI, write this blog, analyze the results. But I eventually found a small problem. What happens if, in the state above, the next box to place has a different shape so its ideal position is that inner corner (red circle right above the one pointer by the arrow)? In this case, the position that we just shifted to the left will be unusable because it will be in the middle of the new box's left edge. That position is not even removed from the candidates list, because it was not used as a placement position.

box8

This last figure shows what should happen: while placing the central square box as indicated, we must find any existing candidate positions that are captured by its left edge, and move these positions to the right (green arrow), so they coincide with the right edge of that box, at the same vertical coordinate. This trick will keep the candidate position useful, for example for the smaller square box as indicated by the figure. Similar handling will be applied to candidate positions captured by the top edge of a placed box.

The final algorithm:

  • New candidate positions: Remove the position consumed by the placed box. Add the NE and SW corners of the placed box. For the SW corner, find the intersection with the right-most edge of all existing boxes at its left. For the NE corner, find the intersection with the bottom-most edge of all existing boxes above. Move any position captured by the left edge of the placed box to its right edge. Move any position captured by the top edge of the placed box to its bottom edge. Avoid any duplicates.

So now we can code it...

function findIntersections (ne: Point2D, sw: Point2D) {
    var bestX = 0.0;
    var bestY = 0.0;

    for (o in output) {
        if (o.maxX < sw.x and o.maxX > bestX and
            o.minY < sw.y and o.maxY > sw.y)
            bestX = o.maxX;
               
        if (o.maxY < ne.y and o.maxY > bestY and
                o.minX < ne.x and o.maxX > ne.x)
            bestY = o.maxY;
    }

    Point2D { x: bestX y: bestY }
}

function nextCandidates (usedCand: Integer, next: Rectangle2D) {
    def ne = Point2D { x: next.maxX y: next.minY };
    def sw = Point2D { x: next.minX y: next.maxY };
    def inters = findIntersections(ne, sw);

    [
        for (c in candidates) {
            if (indexof c == usedCand) null
            else if (c.y == next.minY and c.x >= next.minX and c.x < next.maxX)
                Point2D { x: c.x y: next.maxY }
            else if (c.x == next.minX and c.y >= next.minY and c.y < next.maxY)
                Point2D { x: next.maxX y: c.y }
            else c
        }
        ne, sw,
        if (inters.y == ne.y) null else Point2D { y: inters.y x: ne.x }
        if (inters.x == sw.x) null else Point2D { x: inters.x y: sw.y }
    ]
}

I ignore intersections that coincide with the original candidate positions - a common case, that happens when no shift is possible because that position is already touching a placed box or the axes.

I consider the code above "self-documenting" - it reads almost identical to the English statement of the algorithm. In a real project, I'd only add comments to explain the reasoning of the algorithm (why these intersections and moves must be performed, etc.). This is my litmus test for readable code: you don't need comments to explain what it is doing. Higher-level language syntax makes this obviously much easier to achieve.

That's it - our solution is almost complete. It's only missing the "driver" function that packs all boxes from an input sequence:

function pack (input: Rectangle2D[]) {
    var state = State {};
    for (next in sort(input)) state = state.pack(next);
    state
}

Now let's write some test code. I want to create a sequence of random boxes.

def rnd = new java.util.Random(77);

function randomInput (size: Integer) {
    def skew = if (size <= 10) 5 else 10;
    for (i in [1 .. size]) Rectangle2D {
        width:  round(pow(2, (rnd.nextFloat() * skew)))
        height: round(pow(2, (rnd.nextFloat() * skew)))
    }
}

An exponential random distribution produces more interesting input data. Also, I'm forcing the random seed to a fixed value so I can run reproducible benchmarks.

User Interface

Finally, let's make some GUI - a JavaFX program wouldn't be complete without that :-)

def SIZE = 512;
var solution: State = null;
var solutionDebug = false;

function solve (panel: Panel, count: Integer) {
    panel.parent.scene.stage.title = "Boxes: {count}...";
    def input = randomInput(count);
    def start = DateTime {};
    solution = pack(input);
    solutionDebug = false;
    def time = (DateTime {}).instant - start.instant;
    var totalArea = 0;
    for (r in solution.output) totalArea = totalArea + area(r) as Integer;
    var limitArea = area(solution.limits);
    def scale = SIZE / max(solution.limits.x, solution.limits.y);
    panel.parent.scene.stage.title =
        "Boxes: {count} Usage: {100.0 * totalArea / limitArea}% "
        "Time: {time}ms ({(time as Float) / count}ms/box)";
    panel.content = [
        Rectangle { fill: null stroke: Color.YELLOW
            width: solution.limits.x * scale height: solution.limits.y * scale
        }
        for (r in solution.output) Rectangle {
            x: r.minX * scale y: r.minY * scale
            width: r.width * scale height: r.height * scale
            strokeWidth: 0.5 stroke: Color.BLACK fill: Color.rgb(0,
                (r.minX * scale * indexof r) mod 256,
                (r.minY * scale * indexof r) mod 256)
        }
    ]
}

function showDebug (panel: Panel) {
    if (solutionDebug) return;
    solutionDebug = true;
    def scale = SIZE / max(solution.limits.x, solution.limits.y);
    def fill = Color.rgb(255, 0, 0, 0.25);
    insert for (c in solution.candidates) Circle {
        centerX: c.x * scale centerY: c.y * scale radius: 8 fill: fill
    } into panel.content;
}

def stage = Stage {
    width: SIZE height: SIZE title: "DropBox JavaFX - Press SPACE, 1..0, ENTER"
    scene: Scene { content: Panel {
        layoutInfo: LayoutInfo { width: SIZE height: SIZE }
        onKeyTyped: function (e) {
            def panel = e.node as Panel;
            if (e.char.compareTo('1') >= 0 and e.char.compareTo('9') <= 0)
                solve(panel, 100 * Integer.valueOf(e.char))
            else if (e.char.equals('0'))  solve(panel, 1000)
            else if (e.char.equals(' '))  solve(panel, 10)
            else if (e.char.equals('\n')) showDebug(panel);
    }}}
};

stage.scene.content[0].requestFocus();
stage

The GUI is quite simple. If you press any key, the program creates a new input dataset, solves the box placing problem, and creates Rectangle components that show the solution ("placed" boxes). Most of the code is concerned with secondary stuff, like measuring execution time and computing and formatting some interesting statistics. There's also some scaling logic to fit the result neatly in the scene, and some pseudo-random color computing for a clear and interesting display. The dataset will have 10 boxes if you press SPACE, or 100 ... 1.000 boxes for '1 ... '0'.

You may notice the awkward code that maps keys to box counts. I can't do arithmetic like e.char - '0' because e.char is a String; the language has no Character type. Writing e.char.charAt(0) is no help, that returns the same single-char string (this is bad enough to make this Java API useless, so that's an important interop bug). I can't typecast e.char to any integral type. In short, no easy way to manipulate a character as a numeric value. On top of that, e.code is useless for my purposes, because the KeyCode type doesn't expose an ordinal number, so no luck writing e.code - KeyCode.VK_0. Even a range check like e.code.compareTo(KeyCode.VK_0) doesn't work: this compiles, but returns bogus results!. (The language has no enum construct.) Finally, JavaFX Script has no switch/case, and it doesn't support a native map type construct that could allow me to build a KeyCode->Function mapping; programmers depend completely on the if-else-if... syntax for multiple-choice decisions. Now add this to the difficulty of making range checks on typed keys (or any enum-like API), and the result is awkward.

I love JavaFX Script, but it does show a few traits of an Ivory Tower design: a language that didn't yet mature in the trenches of real projects, as it fails in such a common "pragmatic" idiom like mapping a range of keys to numeric values through simple arithmetic. Yes, it's trivial to call a small Java class to compensate for these limitations; but this facility should not make the language designers complacent, sitting on top of basic deficiencies. Even the DSL status is no excuse - my code needs to handle some keyboard input and this certainly belongs to the scope of a GUI DSL; if the language can't do that in a simple and elegant way, it has to be fixed.

DropBox

The picture above - click it to launch a JNLP app - would make Piet Mondrian proud! Notice the semi-transparent red circles showing the full list of candidate positions at the final state: these circles don't appear by default, you must press ENTER after the solution.

The efficiency of the algorithm looks pretty good: packing densities are usually in the 96%-98% range for 100 boxes, and never below 99,4% for 1.000 boxes, for my input data. My naked-eye analysis of a couple dozen solutions didn't show any obvious failure of optimal placement (well, not after all algorithm iterations summarized above). But this is by no means a formal proof; I doubt the algorithm is optimal. I didn't make any research to find my solution, but I know this is an important and well-researched problem.

For one thing, my algorithm uses a mix of first-fit decisions (for performance) and best-fit heuristics (for good placement). Also, my code does local optimization (finding the best placements for each box independently), instead of global optimization (considering all boxes at once). An ideal solution should be both purely best-fit, and globally-optimizing (perhaps through backtracking, or maybe smarter preprocessing like pre-grouping boxes with edges of similar sizes). The ordering or bigger boxes first seems to be a very good heuristic, but it's not sufficient for global optimization.

Performance

I'm basically done with algorithm work, so now it's the right time to think about code efficiency. I fired up the NetBeans profiler and looked at one run for 1.000 boxes. On HotSpot Server:

BoxProfileS

The biggest offender is intersects(). I expected this function to be the most called, but I didn't expect it to use so much time! The problem of course, is that each of the N boxes have at least one candidate position, and each position must be tested for intersection against many previously-placed (up to N-1 at worst-case) boxes.

I added some code to display the final number of candidate positions, and total number of invocations to intersects() and intersection tests inside that method:

  • Iterate rotation, then candidates: 2.740 candidates, 438.019 calls, 54.142.832 tests
  • Iterate candidates, then rotation: 2.719 candidates, 652.823 calls, 84.139.905 tests

If you asked why pack() does have the iteration c in candidates in the innermost loop and n in [ next, flip(next) ] in the outer loop, that's why: it's more efficient to scan the candidates in the inner loop, which really makes sense. Anyway the number calls to intersects() is roughly O(N2), while the number of intersection tests is roughly O(N3). (The latter seems to be closer to O(N3 / 20), but in Big-O analysis this dividend is basically irrelevant.)

One obvious solution is sorting or indexing. If the candidate positions were sorted, I could narrow the search and only perform intersection tests against a relatively small number of boxes. But any single-dimensional ordering, e.g. by distance from origin, would at best allow me to cut intersection tests by a linear factor; moving from O(N3 / 20) to (say) O(N3 / 100) doesn't seem worth the extra code and effort. I clearly need something that cuts time quadratically, so the total intersection number becomes O(N2). Some ideas are Binary Space Partitioning, Quadtrees, or maybe just a simple grid (e.g., I split the space in regions of 10 x 10 units of length, and I keep track of which boxes and/or candidate positions belong to each region). But I have to finish this blog some day, so this improvement is left as an exercise for the reader. ;-)

How fast is the code in real-world units? I benchmarked with the following procedure: start the program; run a single untimed 1.000-box test for warm-up; then run timed runs for 100 ... 2.000 boxes.

Boxes HS Client HS Server Cli / Srv
100 0,10 ms 0,05 ms 2,00 X
200 0,18 ms 0,05 ms 3,60 X
300 0,28 ms 0,13 ms 2,15 X
400 0,36 ms 0,16 ms 2,25 X
500 0,52 ms 0,22 ms 2,36 X
600 0,66 ms 0,29 ms 2,27 X
700 0,86 ms 0,37 ms 2,32 X
800 1,04 ms 0,44 ms 2,36 X
900 1,27 ms 0,54 ms 2,35 X
1.000 1,56 ms 0,66 ms 2,36 X
2.000 6,25 ms 2,40 ms 2,60 X
DropBoxPerf

HotSpot Server is ~2,35X faster than Client, but both degrade performance in a curve of similar shape. A linear regression of the most regular section of the series (for Server) results in the formula 0,0002 * N1,5, better than expected, but our data series is limited and the cost of per-box intersection tests extremely small. At higher box counts (many thousands and up) it's likely that the curve would reveal a higher exponent.

My code is not tuned for performance; for one thing, I could easily get rid of some object allocations - including the full recreation of the output and candidates sequences at each step of the solution. JavaFX Script is not Lisp - its sequences are not linked lists. It's not Clojure either - its sequences are "persistent" but their structure is not optimized for frequent updates of a few items. Anyway, allocation and GC are not a problem here. A 1.000-box solution causes 73 Mb of data to be allocated and collected, in 17 GC events, total time 13ms (0,83% of the total execution time for HotSpot Client), average pause time 0,7ms, heap occupation min = 3,8 Mb / max = 8,4 Mb, and back to 2,2 Mb when the solution is complete (even with the JavaFX GUI up). These numbers look excellent.

But I am a stubborn, low-level, optimization-obsessed hacker, so I had to make the experience of changing some code to avoid the perceived language inefficiencies. I did these changes (MainJ.java in the project):

  • Using a java.util.LinkedList (instead of a sequence) for candidates and output;
  • Abandoning all functional purism: these sequences, and the State object, are updated in-place at each step;
  • Defer some object allocations (newLimit and placed in the pack() function), even at the inconvenience of writing code like intersects(c.x, c.y, n.width, n.height);

As I expected, these changes saved a lot of memory churn. Now the program burns only 13,3 Mb for the 1.000-box test; this is more than 5X better, a really amazing feat so I was all slapping on my own back... and performance (for a 2.000-box test) improved ~5% for HotSpot Client. But it degraded ~20% for Server, which was shocking. JavaFX Script's are really very efficient, but remarkably when they benefit from a superior JIT compiler. Even when you consider that the VM that's available for RIA clients is just HotSpot Client, the 5% speedup is probably not worthy of the much uglier, "optimized" code.

The exercise revealed a little undocumented secret of JavaFX Script: its for loop can iterate Java collections, so I could declare output: LinkedList, and the code for (o in output) would still compile and run perfectly! The major inconvenience was adding typecasts, because generic types don't carry over to the JavaFX side.

Inspection of javafxc-generated bytecode shows a lot of useless overhead (e.g. binding support, missing sequence optimizations that I've already blogged over). This may explain the very big advantage of HotSpot Server: it does a much better job of removing redundancy (this is what advanced code optimization is all about). I expect the Server x Client gap to decrease with time, as javafxc evolves to produce more efficient bytecode.

More Exercises for the Reader

In the final algorithm for new candidate positions, the positions created by intersection with left/top edges are added together with the base SW/NE positions. Why can't we drop these base positions and only add the ones created by the intersections? At least in this figure, the base SW position seems redundant.

For large box counts like 1.000, you'll notice that most runs produce a solution with very skewed shape. Either the total width is equal or little bigger than that of the widest box, or the total height is equal or a little higher than that of the tallest box. Can you tune the algorithm to have a bias for a more "square" result, or for a specific shape? (The easiest way of course is using a fixed-size bounding box, which is an important real-world scenario; but I don't want that).

Evolve the algorithm and the code (including the GUI) to a 3D version, e.g. for filling shipping containers. :-)

AttachmentSize
DropBox.zip15.48 KB

Comments

typo?

"but their test was very fun to ignore", or "but their test was too fun to ignore"?

oops!

That was a mistake; I fixed it... at least in the main article, because the blog platform doesn't let me change the blurb.