Using the builder pattern with subclasses

Posted by emcmanus on October 25, 2010 at 3:31 AM PDT

 Josh Bloch's Effective Java popularized the Builder Pattern as a more palatable way of constructing objects than constructors or factory methods when there are potentially many constructor parameters. The formulation in Effective Java makes for a particularly readable construction, like this:

new Rectangle.Builder().height(250).width(300).color(PINK).build();

The advantage over a constructor invocation like new Rectangle(250, 300, PINK); here is that you don't have to guess whether 250 is the height or the width. More generally, if the constructor allowed you to specify many other parameters — such as position, opacity, transforms, effects, and so on — it would quickly get very messy. The builder pattern stays clean.

But one question that arises is: how does it work in the presence of inheritance? For example, suppose you have an abstract Shape class that represents an arbitrary graphical shape, with a set of properties that are common to all shapes, such as opacity and transforms. And suppose you have a number of concrete subclasses such as Rectangle, Circle, Path and so on, each with its own properties, like Rectangle's height and width.

As a reminder, here's what the builder pattern looks like in the absence of subclassing:

public class Rectangle {
    private final double opacity;
    private final double height;
    ...

    public static class Builder {
        private double opacity;
        private double height;
        ...

        public Builder opacity(double opacity) {
            this.opacity = opacity;
            return this;
        }

        public Builder height(double height) {
            this.height = height;
            return this;
        }
        ...

        public Rectangle build() {
            return new Rectangle(this);
        }
    }

    private Rectangle(Builder builder) {
        this.opacity = builder.opacity;
        this.height = builder.height;
        ...
    }
}

Subclassing

Now what happens if we want to introduce a Shape superclass, and move some of the properties of Rectangle into it? Let's just concentrate on opacity and height. We'll move opacity into Shape (all shapes have opacity) and leave height in Rectangle (a circle for example is defined by its radius rather than its height).

Here's a first attempt. Obviously we are no longer going to be able to keep our constructors private, at least not in Shape, since a subclass needs to be able to invoke its superclass's constructor. So we'll make it protected, and we get this:

// First attempt at Shape/Rectangle separation.  This does not work!
public class Shape {
    private final double opacity;

    public static class Builder {
        private double opacity;

        public Builder opacity(double opacity) {
            this.opacity = opacity;
            return this;
        }

        public Shape build() {
            return new Shape(this);
        }
    }

    protected Shape(Builder builder) {
        this.opacity = builder.opacity;
    }
}

public class Rectangle extends Shape {
    private final double height;

    public static class Builder extends Shape.Builder {
        private double height;

        public Builder height(double height) {
            this.height = height;
            return this;
        }

        @Override
        public Rectangle build() {
            return new Rectangle(this);
        }
    }

    protected Rectangle(Builder builder) {
        super(builder);
        this.height = builder.height;
    }
}

That looks pretty simple. Rectangle.Builder extends Shape.Builder, so it inherits the opacity method, and adds its own height. It overrides the build method to invoke its own Rectangle(Builder) constructor. That constructor passes its Builder to the superclass, so Shape can set the opacity of the new object, and Rectangle sets the height. So what's the problem?

Say we write:

Rectangle r = new Rectangle.Builder().opacity(0.5).height(250).build();

It doesn't compile. The reason why is slightly hidden by the reuse of the name Builder in both Shape.Builder and Rectangle.Builder. Rectangle.Builder inherits its opacity method from Shape.Builder. That method is declared to return Shape.Builder. But Shape.Builder does not have a height method. Even though the actual Rectangle.Builder object that we are using does have a height method, the compiler is using the declared type of opacity(0.5), which as we just saw is Shape.Builder.

Suppose we required our callers to put the methods in order from subclass to superclass (which would be a very unpleasant requirement). So here the caller would have to write:

Rectangle r = new Rectangle.Builder().height(250).opacity(0.5).build();

But that still doesn't help. opacity(0.5) still returns Shape.Builder, so as far as the compiler is concerned the build() at the end is Shape.Builder.build(), which is declared to return a Shape, not a Rectangle.
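
To make the declared-versus-actual type distinction concrete, here is a minimal sketch of the first attempt in a form that does compile, with explicit casts standing in for what the compiler will not infer. The wrapper class FirstAttemptDemo and the getters are hypothetical, added only so the result can be inspected.

```java
class FirstAttemptDemo {
    static class Shape {
        private final double opacity;

        public static class Builder {
            private double opacity;

            public Builder opacity(double opacity) {
                this.opacity = opacity;
                return this;
            }

            public Shape build() {
                return new Shape(this);
            }
        }

        protected Shape(Builder builder) {
            this.opacity = builder.opacity;
        }

        public double getOpacity() { return opacity; }  // hypothetical getter
    }

    static class Rectangle extends Shape {
        private final double height;

        public static class Builder extends Shape.Builder {
            private double height;

            public Builder height(double height) {
                this.height = height;
                return this;
            }

            @Override
            public Rectangle build() {
                return new Rectangle(this);
            }
        }

        protected Rectangle(Builder builder) {
            super(builder);
            this.height = builder.height;
        }

        public double getHeight() { return height; }    // hypothetical getter
    }

    public static void main(String[] args) {
        // opacity(0.5) is declared to return Shape.Builder, so height(...)
        // is only reachable through a cast back to Rectangle.Builder.
        Shape.Builder declared = new Rectangle.Builder().opacity(0.5);
        Rectangle r1 = ((Rectangle.Builder) declared).height(250).build();

        // Subclass-first order compiles, but build() is declared to return Shape.
        Shape s = new Rectangle.Builder().height(250).opacity(0.5).build();
        Rectangle r2 = (Rectangle) s;  // the object is a Rectangle at run time

        System.out.println(r1.getHeight() + " " + r2.getOpacity());
    }
}
```

The casts succeed because the object really is a Rectangle.Builder (and the built object really is a Rectangle); the problem is purely one of static typing, which is why callers should not have to write them.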

A nasty solution

One way out is for Rectangle.Builder to redeclare opacity:

public class Rectangle extends Shape {
...
    public static class Builder extends Shape.Builder {
        ...
        @Override
        public Builder opacity(double opacity) {
            super.opacity(opacity);
            return this;
        }
        ...

That does fix our problem. But it's very nasty. It means that every time you add a property to Shape you have to visit all its subclasses, and all its subclasses' subclasses, and so on, to add a new color(c) or whatever method to all their Builder classes. It's hardly worth using inheritance at all if you end up doing things like that.
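
As a rough illustration of how the duplication spreads, here is a self-contained sketch with two subclasses. The Circle class follows the article's mention of a circle defined by its radius; the wrapper class CovariantOverrideDemo and the getters are hypothetical.

```java
class CovariantOverrideDemo {
    static class Shape {
        private final double opacity;

        public static class Builder {
            private double opacity;
            public Builder opacity(double opacity) { this.opacity = opacity; return this; }
            public Shape build() { return new Shape(this); }
        }

        protected Shape(Builder builder) { this.opacity = builder.opacity; }
        public double getOpacity() { return opacity; }
    }

    static class Rectangle extends Shape {
        private final double height;

        public static class Builder extends Shape.Builder {
            private double height;

            @Override
            public Builder opacity(double opacity) {  // redeclared only to narrow the return type
                super.opacity(opacity);
                return this;
            }

            public Builder height(double height) { this.height = height; return this; }

            @Override
            public Rectangle build() { return new Rectangle(this); }
        }

        protected Rectangle(Builder builder) { super(builder); this.height = builder.height; }
        public double getHeight() { return height; }
    }

    static class Circle extends Shape {
        private final double radius;

        public static class Builder extends Shape.Builder {
            private double radius;

            @Override
            public Builder opacity(double opacity) {  // the same boilerplate again
                super.opacity(opacity);
                return this;
            }

            public Builder radius(double radius) { this.radius = radius; return this; }

            @Override
            public Circle build() { return new Circle(this); }
        }

        protected Circle(Builder builder) { super(builder); this.radius = builder.radius; }
        public double getRadius() { return radius; }
    }

    public static void main(String[] args) {
        Rectangle r = new Rectangle.Builder().opacity(0.5).height(250).build();
        Circle c = new Circle.Builder().opacity(0.25).radius(10).build();
        System.out.println(r.getHeight() + " " + c.getRadius());
    }
}
```

Every new Shape property would need one more override like these in each subclass builder.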

A better solution

There is a better solution, which allows us to do what we want without having to pollute subclasses with their superclass's properties. The main drawback is that it uses mindbending generics declarations similar to Enum<E extends Enum<E>>. But it works. Here's the code:

public class Shape {
    private final double opacity;

    protected static abstract class Init<T extends Init<T>> {
        private double opacity;

        protected abstract T self();

        public T opacity(double opacity) {
            this.opacity = opacity;
            return self();
        }

        public Shape build() {
            return new Shape(this);
        }
    }

    public static class Builder extends Init<Builder> {
        @Override
        protected Builder self() {
            return this;
        }
    }

    protected Shape(Init<?> init) {
        this.opacity = init.opacity;
    }
}

public class Rectangle extends Shape {
    private final double height;

    protected static abstract class Init<T extends Init<T>> extends Shape.Init<T> {
        private double height;

        public T height(double height) {
            this.height = height;
            return self();
        }

        public Rectangle build() {
            return new Rectangle(this);
        }
    }

    public static class Builder extends Init<Builder> {
        @Override
        protected Builder self() {
            return this;
        }
    }

    protected Rectangle(Init<?> init) {
        super(init);
        this.height = init.height;
    }
}

The idea is that instead of hardwiring opacity() to return the type of the Builder that defines it, we introduce a type parameter T and we return T. The self-referential definition Init<T extends Init<T>> is what allows us to make the type of the inherited opacity() in Rectangle.Builder be Rectangle.Builder rather than Shape.Builder.

We can no longer simply return this from opacity(), since at the point where it is defined, this is an Init, not a T. So instead of this, we return self(), and we arrange for self() to be overridden so that it returns the appropriate this. This is pure ceremony to keep the compiler happy: all of the Builder classes have identical definitions of self(). (This is what Angelika Langer calls the getThis() trick, citing Maurice Naftalin and Philip Wadler for the name and Heinz Kabutz for the first publication.)

Why do we need to split our previous Builder class into separate Init and Builder classes? Because we still want the caller to be able to write:

Rectangle r = new Rectangle.Builder().opacity(0.5).height(250).build();

If Builder were the class with the self-referential type (Builder<T extends Builder<T>>) then new Rectangle.Builder() would be missing a type argument. And even if we were prepared to bother every caller with having to supply such an argument, what would it be? We cannot write new Builder<Builder>() because then the second Builder is missing a type argument! That's why we need one class that has the self-referential parameter (so opacity() can return T) and another one that doesn't (so callers can make instances of it).
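
The Init/Builder split also seems to carry over unchanged to deeper hierarchies. The following sketch adds a third level; RoundedRectangle, its cornerRadius property, the getters, and the wrapper class InitDemo are all hypothetical names introduced for illustration.

```java
class InitDemo {
    static class Shape {
        private final double opacity;

        protected abstract static class Init<T extends Init<T>> {
            private double opacity;
            protected abstract T self();
            public T opacity(double opacity) { this.opacity = opacity; return self(); }
            public Shape build() { return new Shape(this); }
        }

        public static class Builder extends Init<Builder> {
            @Override protected Builder self() { return this; }
        }

        protected Shape(Init<?> init) { this.opacity = init.opacity; }
        public double getOpacity() { return opacity; }
    }

    static class Rectangle extends Shape {
        private final double height;

        protected abstract static class Init<T extends Init<T>> extends Shape.Init<T> {
            private double height;
            public T height(double height) { this.height = height; return self(); }
            @Override public Rectangle build() { return new Rectangle(this); }
        }

        public static class Builder extends Init<Builder> {
            @Override protected Builder self() { return this; }
        }

        protected Rectangle(Init<?> init) { super(init); this.height = init.height; }
        public double getHeight() { return height; }
    }

    // Hypothetical third level: only its own property and the two small classes are new.
    static class RoundedRectangle extends Rectangle {
        private final double cornerRadius;

        protected abstract static class Init<T extends Init<T>> extends Rectangle.Init<T> {
            private double cornerRadius;
            public T cornerRadius(double r) { this.cornerRadius = r; return self(); }
            @Override public RoundedRectangle build() { return new RoundedRectangle(this); }
        }

        public static class Builder extends Init<Builder> {
            @Override protected Builder self() { return this; }
        }

        protected RoundedRectangle(Init<?> init) { super(init); this.cornerRadius = init.cornerRadius; }
        public double getCornerRadius() { return cornerRadius; }
    }

    public static void main(String[] args) {
        // Setters from all three levels chain in any order; no casts are needed.
        RoundedRectangle rr = new RoundedRectangle.Builder()
                .opacity(0.5).cornerRadius(8).height(250).build();
        System.out.println(rr.getOpacity() + " " + rr.getHeight() + " " + rr.getCornerRadius());
    }
}
```

Because each inherited setter returns T, which is RoundedRectangle.Builder here, neither Shape nor Rectangle had to change to accommodate the new subclass.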

Better still: static factory instead of constructor

It isn't very nice that users can see both the Builder and Init classes. Is there a way to hide one of them?

The answer is yes, if we say that the way to build a rectangle is this:

Rectangle r = Rectangle.builder().opacity(0.5).height(250).build();

That is, the caller gets a builder from the static method Rectangle.builder() rather than with new Rectangle.Builder(). Here's what the modified code looks like:

public class Shape {
    private final double opacity;

    public static abstract class Builder<T extends Builder<T>> {
        private double opacity;

        protected abstract T self();

        public T opacity(double opacity) {
            this.opacity = opacity;
            return self();
        }

        public Shape build() {
            return new Shape(this);
        }
    }

    private static class Builder2 extends Builder<Builder2> {
        @Override
        protected Builder2 self() {
            return this;
        }
    }

    public static Builder<?> builder() {
        return new Builder2();
    }

    protected Shape(Builder<?> builder) {
        this.opacity = builder.opacity;
    }
}

public class Rectangle extends Shape {
    private final double height;

    public static abstract class Builder<T extends Builder<T>> extends Shape.Builder<T> {
        private double height;

        public T height(double height) {
            this.height = height;
            return self();
        }

        public Rectangle build() {
            return new Rectangle(this);
        }
    }

    private static class Builder2 extends Builder<Builder2> {
        @Override
        protected Builder2 self() {
            return this;
        }
    }

    public static Builder<?> builder() {
        return new Builder2();
    }

    protected Rectangle(Builder<?> builder) {
        super(builder);
        this.height = builder.height;
    }
}

 

This is my preferred version.
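
For anyone who wants to try this version, here is a self-contained sketch of it together with a little calling code, including holding a partly configured builder in a variable of type Rectangle.Builder<?>. The wrapper class PreferredDemo and the getters are hypothetical additions; the rest follows the listing above.

```java
class PreferredDemo {
    static class Shape {
        private final double opacity;

        public abstract static class Builder<T extends Builder<T>> {
            private double opacity;
            protected abstract T self();
            public T opacity(double opacity) { this.opacity = opacity; return self(); }
            public Shape build() { return new Shape(this); }
        }

        private static class Builder2 extends Builder<Builder2> {
            @Override protected Builder2 self() { return this; }
        }

        public static Builder<?> builder() { return new Builder2(); }

        protected Shape(Builder<?> builder) { this.opacity = builder.opacity; }
        public double getOpacity() { return opacity; }
    }

    static class Rectangle extends Shape {
        private final double height;

        public abstract static class Builder<T extends Builder<T>> extends Shape.Builder<T> {
            private double height;
            public T height(double height) { this.height = height; return self(); }
            @Override public Rectangle build() { return new Rectangle(this); }
        }

        private static class Builder2 extends Builder<Builder2> {
            @Override protected Builder2 self() { return this; }
        }

        public static Builder<?> builder() { return new Builder2(); }

        protected Rectangle(Builder<?> builder) { super(builder); this.height = builder.height; }
        public double getHeight() { return height; }
    }

    public static void main(String[] args) {
        Rectangle r = Rectangle.builder().opacity(0.5).height(250).build();

        // A partly configured builder can be kept and reused through the wildcard type.
        Rectangle.Builder<?> halfOpaque = Rectangle.builder().opacity(0.5);
        Rectangle a = halfOpaque.height(250).build();
        Rectangle b = halfOpaque.height(34).build();

        System.out.println(r.getHeight() + " " + a.getHeight() + " " + b.getHeight());
    }
}
```

The wildcard in the return type of builder() matters. If the method were declared to return the raw type Builder, calls on it would use erased signatures, so opacity(0.5) would be typed as the raw Shape.Builder, which has no height method, and the chain would not compile.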

A shorter, smellier variant

There is a way to avoid having to define a second builder class for every class you want to construct, but it is a bit nasty. I won't spell it out like the other cases, but here's the gist:

public class Shape {
    ...
    public static class Builder<T extends Builder<T>> {
        ...
        @SuppressWarnings("unchecked")  // Smell 1
        protected T self() {
            return (T) this;            // Unchecked cast!
        }
        ...
    }

    @SuppressWarnings("rawtypes")       // Smell 2
    public static Builder<?> builder() {
        return new Builder();           // Raw type - no type argument!
    }
    ...
}

public class Rectangle extends Shape {
    ...
    public static class Builder<T extends Builder<T>> extends Shape.Builder<T> {
        ... no need to define self() ...
    }

    @SuppressWarnings("rawtypes")
    public static Builder<?> builder() {
        return new Builder();
    }
    ...
}
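
For completeness, here is one way the elided parts of this variant might be filled in. Treat it as a sketch rather than a recommendation: the wrapper class SmellyDemo and the getters are hypothetical, and the unchecked cast in self() is only safe as long as every builder is created with itself as the effective T.

```java
class SmellyDemo {
    static class Shape {
        private final double opacity;

        public static class Builder<T extends Builder<T>> {
            private double opacity;

            @SuppressWarnings("unchecked")  // Smell 1
            protected T self() {
                return (T) this;            // unchecked cast
            }

            public T opacity(double opacity) { this.opacity = opacity; return self(); }
            public Shape build() { return new Shape(this); }
        }

        @SuppressWarnings("rawtypes")       // Smell 2
        public static Builder<?> builder() {
            return new Builder();           // raw type, no type argument
        }

        protected Shape(Builder<?> builder) { this.opacity = builder.opacity; }
        public double getOpacity() { return opacity; }
    }

    static class Rectangle extends Shape {
        private final double height;

        public static class Builder<T extends Builder<T>> extends Shape.Builder<T> {
            private double height;
            // self() is inherited; no second builder class is needed
            public T height(double height) { this.height = height; return self(); }
            @Override public Rectangle build() { return new Rectangle(this); }
        }

        @SuppressWarnings("rawtypes")
        public static Builder<?> builder() {
            return new Builder();
        }

        protected Rectangle(Builder<?> builder) { super(builder); this.height = builder.height; }
        public double getHeight() { return height; }
    }

    public static void main(String[] args) {
        Rectangle r = Rectangle.builder().opacity(0.5).height(250).build();
        System.out.println(r.getOpacity() + " " + r.getHeight());
    }
}
```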

Some notes

I showed the opacity and height fields as final because fields should be final whenever possible, and because it demonstrates that this pattern works correctly with final fields. But of course it works with non-final fields too.

If the Shape class were abstract, we would omit its builder's build() method, and the static builder() method in the variant that has one, but otherwise everything would be the same.

If you have several hierarchies of classes using this pattern, you might want to extract the self() method into a separate interface or abstract class, such as
public interface Self<T extends Self<T>>.
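
A minimal sketch of what that extraction might look like, assuming the interface form. The wrapper class SelfDemo and the getter are hypothetical. One consequence worth noting: interface methods are public, so self() can no longer be protected.

```java
class SelfDemo {
    // Shared by every builder hierarchy that uses the pattern.
    interface Self<T extends Self<T>> {
        T self();
    }

    static class Shape {
        private final double opacity;

        public abstract static class Builder<T extends Builder<T>> implements Self<T> {
            private double opacity;

            // No abstract self() declaration here: it comes from Self<T>.
            public T opacity(double opacity) {
                this.opacity = opacity;
                return self();
            }

            public Shape build() {
                return new Shape(this);
            }
        }

        private static class Builder2 extends Builder<Builder2> {
            @Override
            public Builder2 self() {  // must be public because Self is an interface
                return this;
            }
        }

        public static Builder<?> builder() {
            return new Builder2();
        }

        protected Shape(Builder<?> builder) {
            this.opacity = builder.opacity;
        }

        public double getOpacity() { return opacity; }
    }

    public static void main(String[] args) {
        Shape s = Shape.builder().opacity(0.5).build();
        System.out.println(s.getOpacity());
    }
}
```

Using an abstract class instead of an interface would let self() stay protected, at the cost of using up the builder's single superclass slot.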

You can have required constructor parameters by putting those parameters in each Builder's constructor, and in the static factory method in that variant. Of course then you lose the ability to name those parameters.
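
As a sketch of the required-parameter idea, here is the Init/Builder variant with height made mandatory for Rectangle. The choice of height as the required property, the wrapper class RequiredParamDemo, and the getters are assumptions made for illustration.

```java
class RequiredParamDemo {
    static class Shape {
        private final double opacity;

        protected abstract static class Init<T extends Init<T>> {
            private double opacity;
            protected abstract T self();
            public T opacity(double opacity) { this.opacity = opacity; return self(); }
            public Shape build() { return new Shape(this); }
        }

        public static class Builder extends Init<Builder> {
            @Override protected Builder self() { return this; }
        }

        protected Shape(Init<?> init) { this.opacity = init.opacity; }
        public double getOpacity() { return opacity; }
    }

    static class Rectangle extends Shape {
        private final double height;

        protected abstract static class Init<T extends Init<T>> extends Shape.Init<T> {
            private final double height;  // required, so set once at construction

            protected Init(double height) {
                this.height = height;
            }

            @Override public Rectangle build() { return new Rectangle(this); }
        }

        public static class Builder extends Init<Builder> {
            public Builder(double height) {  // the required parameter is repeated here
                super(height);
            }

            @Override protected Builder self() { return this; }
        }

        protected Rectangle(Init<?> init) { super(init); this.height = init.height; }
        public double getHeight() { return height; }
    }

    public static void main(String[] args) {
        // height is positional again; only the optional opacity keeps its name.
        Rectangle r = new Rectangle.Builder(250).opacity(0.5).build();
        System.out.println(r.getHeight() + " " + r.getOpacity());
    }
}
```

In the static factory variant the same parameter would presumably be threaded through builder(double height) into the private builder's constructor.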

You can have default values for parameters by providing initializers in the builder (not in the class itself!), for example:

public class Shape {
    ...
    public static abstract class Builder<T extends Builder<T>> {
        private double opacity = 1.0;
    ...
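
Here is a small runnable sketch of the default-value idea, using only Shape. The wrapper class DefaultValueDemo and the getter are hypothetical.

```java
class DefaultValueDemo {
    static class Shape {
        private final double opacity;  // no initializer here: the constructor assigns it

        public abstract static class Builder<T extends Builder<T>> {
            private double opacity = 1.0;  // default used when opacity(...) is never called

            protected abstract T self();

            public T opacity(double opacity) {
                this.opacity = opacity;
                return self();
            }

            public Shape build() {
                return new Shape(this);
            }
        }

        private static class Builder2 extends Builder<Builder2> {
            @Override
            protected Builder2 self() {
                return this;
            }
        }

        public static Builder<?> builder() {
            return new Builder2();
        }

        protected Shape(Builder<?> builder) {
            this.opacity = builder.opacity;
        }

        public double getOpacity() { return opacity; }
    }

    public static void main(String[] args) {
        Shape defaulted = Shape.builder().build();
        Shape explicit = Shape.builder().opacity(0.5).build();
        System.out.println(defaulted.getOpacity() + " " + explicit.getOpacity());
    }
}
```

The initializer has to live in the builder because the constructor always copies the builder's value. An initializer on the final field in Shape itself would conflict with the assignment in the constructor, and on a non-final field it would simply be overwritten.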

Can we do better?

In my preferred solution, every class in the hierarchy has to duplicate the code in blue. Is there a way to reduce this boilerplate code without resorting to smelly hacks? I have not found one, but I'm open to suggestions!

 

Comments

A builder wonderfully done. I can understand the way it ...

A builder wonderfully done. I can understand the way it worked but there was something I could not figure. It was the return type for the method

public static Builder<?> builder().

Why did the code fail to compile when I dropped the wildcard and used the raw type instead, i.e.

public static Builder builder()?

What is the significance of the additional wildcard?

Thanks

I think another useful way to use the builder is create ...

I think another useful way to use the builder is to create inner interfaces as steps: http://rdafbn.blogspot.co.uk/2012/07/step-builder-pattern_28.html

If Shape is concrete class: ShapeBuilder shapeBuilder = ...

If Shape is concrete class:

ShapeBuilder shapeBuilder = new Shape().Builder().opacity(2);
...
Rectangle r = new Rectangle.Builder().height(250).build(shapeBuilder);
...
Rectangle(Builder builder, ShapeBuilder shapeBuilder) {
super(shapeBuilder);
this.height = builder.height;
}

Note: The code in your post is hard to read due to broken HTML.

I often do this: public class BuilderWithSubclasses { ...

I often do this:

public class BuilderWithSubclasses {
  public static void main( final String[] notUsed ) {
    final Square s = new Square.Builder() {{ color = Color.BLUE; size = 2; }}.build();
    System.out.println( s );
  }
  static class Shape {
    protected final Color color;
    protected Shape( final Builder values ) { color = values.color; } 
    public static class Builder {
      public Color color = Color.BLACK;
      public Shape build() { return new Shape( this ); }
    }
    public String toString() { return "[" + color + "]"; }
  }
  static class Square extends Shape {
    protected final int size;
    protected Square( final Builder values ) {
      super( values );
      size = values.size;
    }
    public static class Builder extends Shape.Builder {
      public int size = 1;
      public Square build() { return new Square( this ); }
    }
    public String toString() { return "[" + color + ", " + size + "]"; }
  }
}

Except for the double braces the syntax is quite nice and it is easy to follow. There is a small performance overhead in that 3 classes are loaded, instead of 2, when the first object is built and one class loaded, instead of 0, thereafter.

Using the builder pattern

hi emcmanus,
great post!
Some words to your preferred version:
If you change the visibility of Shape.Builder2 and Rectangle.Builder2 to public you would allow the programmer to use the builder to create several instances of e.g. Rectangles with same opacity but different heights:

Rectangle.Builder2 rectangleBuilderWithPredefinedOpacity = new Rectangle.Builder2().opacity(0.5);

Rectangle rectangle1 = rectangleBuilderWithPredefinedOpacity.height(250).build();
Rectangle rectangle2 = rectangleBuilderWithPredefinedOpacity.height(34).build();
Rectangle rectangle3 = rectangleBuilderWithPredefinedOpacity.height(837).build();
...

Without making these Builder2 classes public, and providing only factory methods which return Builder<?>, you are constrained to build your Rectangles or Shapes in one statement. You can't do something like the example above. If this is not wanted (as in your preferred version), then there's no need to hold the Shape or Rectangle attributes in the Builder class and to do something like:

    protected Rectangle(Builder<?> builder) {
        super(builder);
        this.height = builder.height;
    }

you could hold a Shape instance in your builder and manipulate it directly ...
but like I said - great post!

Hi Florian, I am not sure what would stop you from reusing ...

Hi Florian,

I am not sure what would stop you from reusing builder instances with the code as it stands:

Rectangle.Builder<?> rectangleBuilderWithPredefinedOpacity = Rectangle.builder().opacity(0.5);

As for having a local instance of Rectangle, that would indeed shorten the code somewhat, but as you say it would prevent builder reuse, and it would also not work if some of the properties being set are final.

Éamonn

 

Using the builder pattern

In the Some notes section, you state:

If the Shape class were abstract, we would omit its builder's build() method, and the static builder() method in the variant that has one, but otherwise everything would be the same.

In fact, it is even better. You can avoid creating the Builder2 classes as well (not just the base build() method and the static builder() method):

public abstract class Shape {
    private final double opacity;

    public static abstract class AbstractBuilder<T extends AbstractBuilder<T>> {
        private double opacity;

        protected abstract T self();

        public T opacity(double opacity) {
            this.opacity = opacity;
            return self();
        }
    }

    protected Shape(AbstractBuilder<?> builder) {
        this.opacity = builder.opacity;
    }
}

public class Rectangle extends Shape {
    private final double height;

    public static class Builder extends Shape.AbstractBuilder<Builder> {
        private double height;

        @Override
        protected Builder self() {
            return this;
        }

        public Builder height(double height) {
            this.height = height;
            return self();
        }

        public Rectangle build() {
            return new Rectangle(this);
        }
    }

    private Rectangle(Builder builder) {
        super(builder);
        this.height = builder.height;
    }
}

For clarity, I named the builder class found in the abstract Shape class AbstractBuilder, but it could also carry the name "Builder" (at the cost of a lot more confusion). Also note that Rectangle.Builder has no type parameters. Thus, the height method returns a Rectangle.Builder, and the private Rectangle constructor just takes a Builder.
Of course, if the class hierarchy has more than two levels (e.g., Square extends Rectangle), then you have no choice but to introduce the Builder2 classes and the static builder() methods.
Edit: I apologize for the formatting of the comment. For some reason, I cannot get the code blocks to format correctly.

Hi Eamonn, Nice post =)

Hi Eamonn,
Nice post =) Congratulations!
Well, I got a problem when I tried it. On your preferred solution, the Rectangle constructor is:
  protected Rectangle(Builder<?> builder) {
      super(builder);
      this.opacity = builder.opacity;
  }
But I think it should be:
  protected Rectangle(Builder<?> builder) {
      super(builder);
      this.height = builder.height;
  }

Another thing: I think I found the page of 'Effective Java'. It is this one, isn't it? I put the link here to share =D
Have a nice weekend!
Bye
Luan

Hi Eamonn, Nice post =)

Good catch! Thanks for pointing it out - I've fixed it.

What about a small language

What about a small language enhancement (a la project coin)?

Imagine if the Java language were changed so that the caller of a method was allowed to "pretend" that a void instance method returns (instead of nothing) the object the method was called on?

In other words this code:

Object o = new Builder().someVoidMethod().someOtherVoidMethod().aThirdVoidMethod().build();

would be silently translated to:

Builder tmp = new Builder();
tmp.someVoidMethod();
tmp.someOtherVoidMethod();
tmp.aThirdVoidMethod();
Object o = tmp.build();

If this kind of enhancement existed then builder inheritance wouldn't be a problem, because the caller will always know what the concrete type of the builder is. There are some other advantages as well:

  • The setter methods of the builder are void methods, which means that they are actual JavaBeans properties
  • The builder is less tedious to write, as you can omit the "return this" line from every method
  • The contract is subtly stronger, as the caller knows it is dealing with the exact same instance, as opposed to a different instance of the same class
  • The pattern is useful for tersely constructing mutable java beans as well

What about a small language

Indeed, that is one of the very many changes that were proposed but not accepted for Java 7. Although it has the nice properties you mention, it is really a hack, and in particular would mean that you could not break a chain of calls in the expected way. In other words, you could not break

Object o = new Builder().someVoidMethod().build();

into

Builder b = new Builder().someVoidMethod();
Object o = b.build();

(Unless you also want to stretch the change in meaning of void to cover this case too, which I think would be a bad idea.) What is really wanted, if this problem is important enough to justify a language change, is the ability to declare explicitly that a method returns "this", for example with a syntax like this:

public this opacity(float f) {...}

JavaBeans could be changed so that a setter would be recognized if it returned this as well as if it returned void. There would be no need for an explicit return statement, since the compiler knows what to return.

Or, a more general solution would be a way to write "the type of this", so you could use it in more places than just the return type, although in this case the meaning is subtly different, as you note, because the returned value could be any value that is compatible with the type of this, including null.

I seem to remember a proposal

I seem to remember a proposal for something like that for Java 7. The rule was something like 'any method that returns void returns itself instead'. It didn't gain any traction then; maybe it's a good idea to take a second look.

Actually, instead of the

Actually, instead of the self() method, you just need to cast this to T. It will give you an unchecked cast warning, but it works.

 Well, yes, that's one of the

 Well, yes, that's one of the two smelly things in the smelly variant that I gave at the end. I'd really rather not have to add @SuppressWarnings("unchecked") on every one of the methods that return T, just to avoid defining self() once, though.