Skip to main content

Bean Curd 2: The SQL

Posted by evanx on June 6, 2006 at 6:50 AM PDT

Introduction

"No one appreciates the very special genius of your conversation as a dog does."

DogLawn300.jpg
In Bean Curd 1 I introduced an "explicit properties" approach, where bean properties are declared in an "explicit bean info" class. Actually the bean info class and property descriptors are absorbed into the components bound to the bean, for convenience.

In that example, the bean info was absorbed into a Swing table model, and the property descriptors were absorbed into its table columns.

The property descriptor objects are configured using their own Field. In particular, the field name is used to deduce the default property name, and also further settings are extracted from annotations.

This avoids using string literals to refer to properties. I argue against such references and in favour of "stringless applications" in Explicit Reflection, Refactoring Translations and of course Bean Curd 1.

In this article, we address the worst pile of string literal references in the world. The database query! It's the messiest, most important integration of another language into Java. (Apart from XML, which JAXB handles quite brilliantly.)

In the ORM example presented here, our bean is an "entity bean" e.g. representing a row from a database table. The our property descriptors are declared in "data access objects" which double up as "explicit bean info" classes.

We then use these bean info's and their explicit property descriptors to reference our database tables and columns in "native database queries." These are pure Java representations of SQL queries.

In this case, our queries are readily toolable (eg. using the IDE's prompting, auto-completion and error-detection capabilities), and our ORM is refactorable, e.g. we can freely and safely rename our column mappings without breaking our queries, using IDE refactorings.

Incidently, I used this approach on one of the biggest (and definitely the messiest) databases I have ever encounted. It had a few hundred tables, although I only needed to map around 100 of them for the queries I implemented. The biggest single report was collated from about a dozen distinct queries involving two dozen tables, running concurrently on up to 130 distributed databases across the UK. It was so much fun that it shouldn't have been legal! ;)

DogButch125.jpg

Entity beans

"No matter what you've done wrong, always try to make it look like the dog did it."

Butch, our CIO, recently decided that we gotta rewrite our Top Dog club membership system. Because he was running with a dog who upgraded their system from some legacy language to this is new Java thing, and now we gotta do the same. Because all the puppies are being trained in Java these days, so we gonna be in the dog pound, baby!

So check it out, dog. We implement an "entity bean" for loading data from a database table into our application. For illustration, we throw in some EJB3 annotations, even though this is not a POJO. I haven't made the inevitable move to Glassfish Java Persistence yet, so please excuse any errors, omissions and lies.

@Entity
@Table(name = "dog")
public class ZDog extends MEntityBean {

   protected String dogId;
   protected String dogName;
   protected String username;
   protected String password;   
   protected ZDogHouse dogHouse;
   protected String dogHouseId; // id of the above dogHouse object
   protected List dogRoleList = null;
   ... // other fields
   
   public ZDog() {
   }
  
   public Comparable[] getId() {
      return new Comparable[] {dogId};
   }

   public String getLabel() {
      return dogName;
   }
  
   @Id
   @Column(name = "dog_id")
   public String getDogId() {
      return dogId;
   }

   @ManyToOne(optional = false)
   @JoinColumn(name = "house_id", nullable = false, updatable = false)
   public ZDogHouse getDogHouse() {
      if (dogHouse == null && dogHouseId != null) {
         dogHouse = entityManager.dogHouse.getEntityBean(dogHouseId);
      }  
      return dogHouse;
   }

   public void setDogHouse(ZDogHouse dogHouse) {
      if (dogHouse != null) {
         dogHouseId = dogHouse.getDogHouseId();
      }  
      this.dogHouse = dogHouse;
   }  
  
   @OneToMany(targetEntity = ZDogRole.class, cascade = CascadeType.ALL)
   @ElementJoinColumn(name = "dog_id")
   public List dogRoleList getDogRoleList() {
      if (dogRoleList == null) {
         dogRoleList = entityManager.dogRole.getDogRoleListByDog(dogId);
      }  
      return dogRoleList;
   }
   
   ... // other getters and setters
}

Notice that we use a naming convention where "framework" classes start with the letter M, and application classes with the letter Z. Incidently, the M comes from meme.dev.java.net which is an implementation-in-progress of the approach presented here. And Z is just such a cool letter, so...

Our MEntityBean superclass might declare the getId() and getLabel() methods as abstract. The getId() implementation returns the primary keys of this entity bean. The superclass can then implement a compareTo() method for us, using getId() to get the comparative values.

The getLabel() method returns the string representation used to render the object, e.g. in a JLabel, combo box, or JTable cell renderer. So its toString() method is free to be used for another purpose, e.g. logging.

Notice that we illustrate code for lazily fetching referenced entities, using our "entity manager" which will be introduced later. When using JDBC rather than a sophisticated ORM engine that enhances our classes, this is handy for avoiding null pointer exceptions. Incidently, in this case we should implement caching in our entity manager, e.g. a HashMap using getId() for the keys.

Dog Access Object

"Some days you're the dog; some days you're the fire hydrant."

I like the DAO pattern. It abstracts our persistence mechanism, e.g. JDBC, JDO or EJB3. This is attractive to me because I have a habit of doing my own thing with JDBC, but also have a JDO comfort-zone from a previous project, and JPOX2 looks great. And of course I love the look of the new minty Java Persistence API and Glassfish, with its annotations. Because I like configuring stuff in my favourite language, that my IDE fully understands and helps me every step of the way. That is to say, in Java rather than in XML. If you ever catch me using XML when I could be using Java, please shoot me! ;)

The DAO pattern suggests an interface. Personally I find interfaces a problem for rapid prototyping, because I can't Alt-G into the implementation in Netbeans. You get taken to the interface declaration, which is a dead-end.

Considering that we never follow anything to the letter, let's introduce a concrete "entity info object" to start with, as below. This is our "data access object." Later we might split out an abstract superclass, to allow different implementations.

public class ZDogInfo extends MTable {
   
   @EntityPropertyAnnotation(label = "Social Security Collar Number", length = 14)
   MEntityProperty dogId = createPrimaryColumn("dog_id");
       
   @EntityPropertyAnnotation(label = "Real Name", length = 40, displayLength = 20)
   MEntityProperty dogName = createColumn("dog_name");

   @EntityPropertyAnnotation(label = "Puppyday", format = "yyyy-MM-dd")
   MEntityProperty birthDate = createColumn("birth_date");

   @EntityPropertyAnnotation(label = "Dog House Id", displayWidth = 100)
   MEntityProperty dogHouseId = createForeignId("house_id");

   @EntityPropertyAnnotation(label = "Dog House", displayWidth = 150)
   MEntityProperty dogHouse = createForeignReference(dogHouseId);
     
   @EntityPropertyAnnotation(label = "Roles", cascadeRead = true)
   MEntityProperty dogRoleList = createOneToManyCollection(dogId);

   ... // other explicit properties

   protected void configure() {
      dogHouseId.setReferentialConstraint(entityManager.dogHouse.dogHouseId);
      ... // other configuration settings
   }

   public List getOldDogs(int dogYears) {
      ... // select dogs where age > dogYears
   }
  
   ... // other query methods   
}

So in our entity info class, we explicitly declare our entity bean property descriptors, customised for ORM, as MEntityProperty objects.

Note that ORM frameworks create such objects, e.g. from XML configuration files, or annotations. However, we choose to expose them explicitly in our application, as above.

Dog22.jpgAs in Bean Curd 1, we deduce the property name from the field name, rather than using string literal references. That is, the dogName property descriptor field in ZDogInfo, is bound to the dogName property in the ZDog bean, as exposed by the getDogName() accessor.

In addition to annotations, we have other metadata configuration options, notably a variety of superclass methods with overloaded arguments e.g. createColumn() to which the crucial SQL column name is passed as an argument. Also the bean property descriptors might be further configured directly. For example, in the configure() method above, we invoke setReferentialConstraint() on the property descriptor.

Clearly we have more than enough rope here to configure ourselves. So it's probably best to use standard EJB3 annotations on the entity bean, and then complement those here in the entity info to suit ourselves.

Refactoring the ORM

"Did you ever walk into a room and forget why you walked in? That happens to dogs all the time."

The entity info objects reflect on their MEntityProperty fields. These objects resolve their java.beans.PropertyDescriptor using their own field name as the implicit property name, e.g. dogName.

So when renaming a property in the entity bean i.e. the accessor and mutator methods e.g. getDogName(), we have to take care to rename the explicit property descriptor field in the entity info as well e.g. dogName, and visa versa.

This does not detract significantly from refactorability, since it is a single trivial operation, e.g. using Netbeans' refactorings. And if we neglect to do this, we will be reminded immediately when the application is run, courtesy of an exception.

Entity manager

"The difference between cats and dogs is, dogs come when called, cats take a message and get back to you."

We expose our entity info objects in a central entity manager as below.

public class ZEntityManager extends MEntityManager {
   public final ZDogInfo dog = new ZDogInfo();
   public final ZDogRoleInfo dogRole = new ZDogRoleInfo();
   public final ZDogHouseInfo dogHouse = new ZDogHouseInfo();
   public final ZRegionInfo region = new ZRegionInfo();
   public final ZOrganisationInfo organisation = new ZOrganisationInfo();
   ... // other entity info's   

   public ZEntityManager() {
      configureReferentialConstraint(dog.dogHouseId, dogHouse.dogHouseId);
      configureReferentialConstraint(dogHouse.regionId, region.regionId);
      ...// more configuration
   }
  
   ... // maybe some query methods
}

All the entity info's are exposed here. So this entity manager represents our database schema.

We might perform some further configuration here, e.g. specifying foreign key constraints, as in the above constructor. I find it convenient to configure these constraints here, rather than in annotations in our entity beans, or entity info objects.

When this class is instantiated, all the entity info's are created. Therefore if there is any missing property or misnamed property descriptor, in any entity info, we'll know about it immediately, via an exception as soon as we run the application, if not via a unit test before that.

Native queries

"That fallacy about teaching old dogs new tricks... is so often true."

We use the entity info objects (exposed by our entity manager) and their explicit properties, to reference our database tables and columns in "native queries" as follows.

public class ZFidoQueryManager extends ZEntityManager {
  
   public List findFido(String regionId) {
      MSelectQuery query = createSelectQuery();
      query.setTitle("Fido's in region %s", regionId);
      query.selectAll(dog);
      query.selectExclude(dog.address);
      query.select(organisation.organisationId, organisation.organisationName);
      query.join(dog, organisation, region);
      query.whereEquals(region.regionId, regionId);
      query.whereStartsWithIgnoreCase(dog.dogName, "fido");
      query.whereIsNotNull(organisation.organisationName);
      query.whereIs(organisation.active, true);
      query.orderBy(organisation.organisationId, dog.dogName);
      query.limit(100);
      query.execute();
      query.sectionBy(region.regionId, region.regionLabel);
      query.calculateAverage(dog.age);
      query.calculateTotal(dog.dependents);
      query.writeResult(new MResultPdfWriter("/tmp/fido.pdf"), new ZDoggyStyle());
      List dogList = new ArrayList();
      while (query.next()) {
         ZResultRow resultRow = query.getResultRow(ZResultRow.class);
         result.dog.setOrganisation(resultRow.organisation);
         dogList.add(resultRow.dog);
         logger.finer(resultRow.dog.getDogName());
      }     
      return dogList;
   }
  
   ... // other queries       
}

Notice, no string literals! Those strings that are there, shouldn't be. In particular, the title should be in a resource bundle, and the file name (and "fido") should not be hard-coded like this.

So the really cool thing is that these native queries are readily toolable. For example, we can type "dog.dep" and press Control-Space, and our IDE will auto-complete "dog.dependents." Also we can type "dog." and press Control-Space,
and the IDE shows us all the column names, in neat Java notation. This is great for the induction of new developers. "Who needs ER diagrams? Not us! We got real tools, dog."

Dog54.jpg
Which brings us to the most important point. So we map our database to nice Java names (in our entity beans). The problem is that as soon as we use the mapped names in string queries (eg. OQL, EJBQL, HQL), then we immediately lose refactorability. It becomes impossible to fix up spelling errors and naming inconsistencies, without breaking our queries. "Thaaat's mentil!" And if you ignore a few broken windows, the next thing the whole building is run down, innit.

Our native queries are pretty much Java versions of the equivalent SQL code. Consequently, generating the SELECT SQL from the above select query object is relatively trivial, as we'll see later. This close correlation to SQL also makes our native queries immediately understandable to developers familiar with SQL.

As you can imagine, we use variable-length arguments extensively. Also, we use auto-boxing and method overloading, particularly in the where methods. For example, whereEquals(), whereLessThan() et al, should support comparing a column value to another column, or to a Java number, String, Date or boolean value, e.g. passed as an argument to our query method, e.g. findFido(regionId) in the above example.

We can add additional functionality to our native queries fairly easily. "It's our bone and we gonna chew the hell out of it." For example, we can specify a title, sectioning (eg. by region in the above example), columns to be totalled and averaged (eg. by section, and/or a grand total), a style template, and write the output of the query to HTML, PDF and/or Excel. All this with a few extra lines of code, as in the above example. This is handy for generating "run of the mill" reports in record time. In theory, that is. And "in theory there is no difference between theory and practice" so...

Traversing the result set

"To his dog, every man is Napoleon; hence the constant popularity of dogs."

The trick we use in the above native query is to populate a "result row" object via reflection, where this class might be implemented as follows.

public class ZResultRow implements MResultRow {
   public ZDog dog; 
   public ZDogHouse dogHouse; 
   public ZDogRole dogRole; 
   public ZOrganisation organisation; 
   public ZRegion region; 
   ... // other entity bean declarations
  
   public void resolveReferences(); 
      if (dog != null && dogHouse != null) dog.setDogHouse(dogHouse);
      ... // resolve other references
   }
}

Dog21.jpg
In our query.getResultRow() method, for each of the entity beans declared in ZResultRow above, that are included in the query, we create that entity bean using newInstance(), and set those values which are selected in the query, from the current row, using reflection.

As such, some the above references will be null, and those values which are not selected in the query, will remain unset. However, this is the nature of relational database queries for business intelligence reports, i.e. we are selecting specific limited data from a large dataset. We are not constructing a object graph per se, or navigating through persistent connected objects as one might do using an object database.

In the case of expressions, e.g. SUM(dog.dependents), we can extract these from the current row using a reference to the expression object. Or alternatively, we can use the first operand to reference the expression indirectly, e.g. getIntResultByOperand(dog.dependents). I found that this latter approach sufficed for my purposes.

As illustrated in the resolveReferences() method above, we might "resolve references" manually. Alternatively, we can resolve them when we iterate through the query result set, as shown further above, e.g. resultRow.dog.setDogHouse(resultRow.dogHouse). Or even easier, we can use metadata from our entity info's to resolve them automatically.

This obviates all those nasty JDBC getString("dog_id") methods. And I cannot tell you how happy that makes me. Because we don't like those, and you know we don't like string literal references. So we roll over backwards and jump through hoops, not to use them.

Implementation

"If dogs could talk, that would take a lot of the fun out of being a dog."

Let's chew on the implementation, and see how we might spit out the SQL.

In our entity manager, we might implement methods for the convenience of our queries, as follows.

public class MEntityManager {
   ...
  
   public MCondition or(MCondition ... conditions) {
      return new MOr(conditions);
   }

   public MCondition startsWith(MEntityProperty column, String value) {
      return new MEquals(column, new MStringLiteralValue(value + "%"));
   }

   public MCondition equalsNumeric(MExpression expression, Object value) {
      return new MEquals(expression, new MNumericLiteralValue(value));
   }

   public MCondition is(MExpression expression, Boolean value) {
      return new MIs(expression, value); // "IS TRUE" or "IS FALSE"
   }

   public MCondition in(MExpression expression, MSelectQuery subselect) {
      return new MInSelect(expression, subselect);
   }
  
   public MCondition equalsExpression(MExpression expression, MExpression rightExpression) {
      return new MEquals(expression, rightExpression);
   }
  
   public MExpression sum(MExpression ... expressions) {
      return new MSum(new MAdd(expressions));
   }

  public MExpression max(MExpression expression) {
      return new MMax(expression);
   }
   
   public MExpression monthPart(MExpression expression) {
      return new MMonthPart(expression);
   }
  
   ... // other convenience methods   
}

Let's start digging into the above types, notably MExpression and MCondition.

Expressions

"From the dog's point of view, his master is an elongated and abnormally cunning dog. "

Our MExpression interface is implemented by expressions that we might SELECT, or use in conditional WHERE clauses, or other clauses, e.g. ORDER BY.

public interface MExpression {
   public String emitExpression();
}

Certainly we will want to select database columns. These are declared in our entity info's as "entity properties," and so those implement MExpression, as shown below.

public class MEntityProperty implements MExpression {
   protected MPropertyDescriptor propertyDescriptor; // embedded property descriptor wrapper
   protected String columnName; // SQL name e.g. "dog_id"  
   ...
   public String emitExpression() {
      return columnName;
   }
}

So that was easy. We just emit the SQL column name of the property. "That's too simple, there must be a catch."

Oops, I forgot the table name. Let's correct that in T minus zero, i.e. now.

public class MEntityProperty implements MExpression {
   protected MPropertyDescriptor propertyDescriptor; // embedded property descriptor wrapper
   protected String columnName; // SQL name e.g. "dog_id"
   protected MEntityInfo entityInfo; // parent bean info, e.g. ZDogInfo
      ...
   public String emitExpression() {
      return entityInfo.getTableReferenceName() + "." + columnName;
   }  
}

For each entity that maps to a table, we assign an SQL "table reference name" which by default equals the table name. This enables us to select multiple instances from the same table, using different references, e.g. "FROM region region1, region region2". To achieve this we declare multiple instances of the entity info, with different table reference names, as follows.

public class ZEntityManager extends MEntityManager {
   public final ZDogInfo dog = new ZDogInfo();
   public final ZDogRoleInfo dogRole = new ZDogRoleInfo();
   public final ZDogHouseInfo dogHouse = new ZDogHouseInfo();
   public final ZRegionInfo region = new ZRegionInfo();
   public final ZOrganisationInfo organisation = new ZOrganisationInfo();
   ...
   public final ZRegionInfo regionOfDogHouse = new ZRegionInfo("region_of_dog_house");
   public final ZRegionInfo regionOfOrganisation = new ZRegionInfo("region_of_organisation");
   ...
   public List selectDogsLivingInRegionLabel(String regionLabel) {
      MSelectQuery query = createSelectQuery();
      query.selectAll(dog, dogHouse, organisation);
      query.selectAll(regionOfDogHouse, regionOfOrganisation);
      query.join(dog, organisation, regionOfOrganisation);
      query.join(dogHouse, regionOfDogHouse);
      query.whereEquals(regionOfDogHouse.regionLabel, regionLabel);
      query.orderBy(dog.dogName);
      ...
   }
   ...

In the above example, we introduce regionOfDogHouse and regionOfOrganisation, with distinguishing table reference names, in order to support selecting multiple ZRegion instances in a single query. We would also need to declare these in ZResultRow to get at them.

Composite Expressions

"You can say any foolish thing to a dog, and the dog will give you this look that says, 'My God, you're right! I never would've thought of that!'"

Consider the "sum of" expression implementation below.

public class MSum implements MExpression {
   protected MExpression expression;
  
   public MSum(MExpression expression) {
      this.expression = expression;
   }
  
   public String emitExpression() {
      return "SUM(" + expression.emitExpression() + ")";
   }
}

package_games.png
As you can see, it's not exactly rocket science. We can knock these things off for a dime a dozen, innit. For example, MMax, MMin and MCount. And, if it suits us, have convenience methods to create such expressions in our MEntityManager as shown above, so that our queries look more naturally like SQL.
Which is our goal... Sorry, I might have forgotten to mention that?

Literal values

"Dogs, the foremost snobs in creation, are quick to notice the difference between a well-clad and a disreputable stranger."

We must also support literal expressions. Let's introduce an empty interface to identify our literals.

public interface MLiteralValue extends MExpression {
}

Now let's crunch the numbers.

public class MNumericLiteralValue implements MLiteralValue {
   protected Object value;
  
   public MNumericLiteralValue(Object value) {
      this.value = value;
   }
  
   public String emitExpression() {
      if (value == null) {
        throw new MQueryRuntimeException("null number - mental!");
      }
      return value.toString();
   }
}

For text literals, the main difference is that we must use single quotes, as below.

public class MStringLiteralValue implements MLiteralValue {
   protected String value;
  
   public MStringLiteralValue(String value) {
      this.value = value;
   }
  
   public String emitExpression() {
      return "'" + value + "'";
   }
}

For date literals, we must also use single quotes. Oh, and format the date correctly.

public class MDateLiteralValue implements MLiteralValue {
   protected static MContext context = MContext.getInstance();
  
   protected Date value;

   public MDateLiteralValue(Date value) {
      this.value = value;
   }
  
   public String emitExpression() {
      return "'" + context.queryDateFormat.format(value) + "'";
   }
}

Dog37-125.jpg
We must also support timestamps, and be cognisant of the difference between dates and timestamps, in particular in comparisons. For example, "WHERE timestamp BETWEEN begin_date AND end_date" might be broken because the end date will be converted to a timestamp at the beginning of the day (ie. "00:00" in the hit series "24"). This makes the right comparison exclusive, whereas if the timestamp was a date type, it would inclusive. Yes, you guessed right, I got bitten on the bum by this.

In order to support a different SQL dialect, or even a different query language e.g. EJB QL, our native query objects might delegate to an interface e.g. MQueryLanguageFormatter (via MContext), which might be implemented differently by MPostgresFormatter, MFireflyFormatter, MJavaPersistenceFormatter etcetera.

Conditions

"A dog teaches one fidelity, perseverance, and to turn around three times before lying down."

Now let's consider conditions, which are used primarily in our WHERE clause, but also in the HAVING clause (which can be used together with GROUP BY).

Firstly, we introduce an interface for conditions.

public interface MCondition {
   public String emitCondition();
}

Consider the following implementation of a comparitive condition.

public class MGreaterThanInclusive implements MCondition {
   protected MExpression leftExpression;
   protected MExpression rightExpression;
  
   public MGreaterThanInclusive(MExpression leftExpression, MExpression rightExpression) {
      this.leftExpression = leftExpression;
      this.rightExpression = rightExpression;
   }
  
   public String emitCondition() {
      return leftExpression.emitExpression() + " >= " + rightExpression.emitExpression();
   }
}

The above is a typical condition, with two operands. The MBetween condition takes three expressions. MIsNull, MIsNotNull, MIsTrue, and MIsFalse take one. But their implementation is otherwise much like the above - in a word, trivial.

Let's consider one more comparitive example.

public class MIs implements MCondition {
   protected MExpression expression;
   protected Boolean value;
  
   public MIs(MExpression expression, Boolean value) {
      this.expression = expression;
      this.value = value;
   }
  
   public String emitCondition() {
      if (value == null) return expression.emitExpression() + " IS NULL";
      if (value) return expression.emitExpression() + " IS TRUE";
      return expression.emitExpression() + " IS FALSE";
   }
}

It's probably safer to throw an exception if the boolean value is null, and so we should rather use MIsNull to test for that.

Composite conditions

"Dogs listen to you while you talk about yourself, and keep up an appearance of being interested in the conversation."

In following example, we use varargs to specify multiple conditions to be OR'ed together.

public class MOr implements MCondition {
   protected MCondition[] conditions;
  
   public MOr(MCondition ... conditions) {
      this.conditions = conditions;
   }
  
   public String emitCondition() {
      StringBuffer buffer = new StringBuffer();     
      for (MCondition condition : conditions) {
          if (buffer.length() > 0) buffer.append(" OR ");
          buffer.append(condition.emitCondition());
      }
      return "(" + buffer.toString() + ")";
   }
}

Another composite condition is of course, MAnd. And we will see later that we AND together multiple conditions specified by where() method invocations on our select query.

Select query

"When in a car, dogs suddenly feel the need to bark violently at nothing, right in your ear."

Let's now consider the select query object, woohoo!

public class MSelectQuery extends MSqlStatement {
   protected List selectList = new ArrayList();
   protected List fromList = new ArrayList();
   protected List whereList = new ArrayList();
   protected List groupByList = new ArrayList();
   protected List havingList = new ArrayList();
   protected List orderByList = new ArrayList();
   protected List totalList = new ArrayList();
   protected List sectionByList = new ArrayList();
   ...
  
   public void select(MExpression ... expressions) {
      selectList.addAll(Arrays.asList(expressions));
   }
  
   public void selectExclude(MExpression ... expressions) {
      selectList.removeAll(Arrays.asList(expressions));
   }
  
   public void selectAll(MTable ... tables) {
      for (MTable table : tables) {     
         selectList.addAll(table.getColumnList());
      }
   }
  
   ... // very many more methods
}

There are of course no limits to the conveniences we can incorporate into the above select query object. So we can make SQL and other query languages look relatively pedestrian to boot, besides being untoolable in our IDEs.

For instance, we know that all the expressions in the GROUP BY clause must also be in the SELECT. So we introduce a method selectGroupBy() to catch two birds with one jump, as below.

   public void selectGroupBy(MExpression ... expressions) {
      selectList.addAll(Arrays.asList(expressions));
      groupByList.addAll(Arrays.asList(expressions));
   }

This is especially handy when our GROUP BY expressions are complicated.

print_printer.png
Also, we can introduce decorative functionality into our select query, which is related to printing the result. For instance, our selectTotal() method, indicates that we should total that column when we print the result set. Also, our selectSectionBy() method indicates that we should break the output into sections, and show those columns in the section headings, and not in the lines, e.g. selectSectionBy(regionId, regionLabel).

   public void selectTotal(MExpression ... expressions) {
      selectList.addAll(Arrays.asList(expressions));
      totalList.addAll(Arrays.asList(expressions));
   }
 
   public void selectSectionBy(MExpression ... expressions) {
      selectList.addAll(Arrays.asList(expressions));
      sectionByList.addAll(Arrays.asList(expressions));
   }  

Consequently, our select query object suffices for most reports that involve a single select query only. And without any compromise on the presentation quality, e.g. titles, sections, column headings, formatting and styling.

Throwing SQL out of the house

"A door is what a dog is perpetually on the wrong side of."

But how are we ever going to execute this query? Actually we find that is it quite easy to emit our SQL in our MSelectQuery class. Check it out, dog.

   public String emitQueryString() {
      validate();
      StringBuffer query = new StringBuffer();
      query.append("SELECT ");
      StringBuffer buffer = new StringBuffer();     
      for (MExpression expression : selectList) {
          if (buffer.length() > 0) buffer.append(", ");
          buffer.append(expression.emitExpression());        
      }
      query.append(buffer);
      query.append("\n FROM ");
      buffer = new StringBuffer();     
      for (MSource source : fromList) { // tables and joins
         source.emitFrom(buffer);
      }
      query.append(buffer);
      query.append("\n WHERE ");
      buffer = new StringBuffer();     
      for (MCondition clause : whereList) {
          if (buffer.length() > 0) buffer.append("\n AND ");
          buffer.append(clause.emitCondition());
      }
      query.append(buffer);
      ... // GROUP BY, HAVING, ORDER BY
      if (limit != 0) query.append("\n LIMIT " + limit);
      return query.toString();
   }

Dog31.jpg

Notice that might implement a method validate(), to detect obvious errors in the query before we even try to formulate and execute the SQL. For instance, we might have selected columns from a table that has not been specified in the FROM/JOIN clauses. That's like throwing the frisbee straight into the ground, which is no fun.

In order to support different SQL databases, we would implement the above method in a separate delegate class, e.g. MPostgresSqlWriter. Also, we might support other query languages, e.g. via MGlassfishJavaPersistenceQueryWriter.

Constructing the SQL in the above method is easy, hey! Handling the JOIN clauses turns out to be the hardest part, but is still easy. Let's check it out...

From tables to joins

"Labradors are lousy watchdogs. When there is a stranger about, they bark - with unmitigated joy at the chance to meet somebody new."

We use MSource objects for our FROM/JOIN clauses.

public interface MSource {
   public void emitFrom(StringBuffer buffer);
}

Since we use the entity info objects (exposed in our entity manager) as "table references," their MTable superclass implements MSource.

public class MTable implements MSource {
   protected String tableName; // SQL table name
   protected String tableReferenceName; // e.g. "region_of_dog_house"
   ...  
   public void emitFrom(StringBuffer buffer) {
      if (buffer.length() > 0) buffer.append(", ");
      buffer.append(tableName);
      if (!tableReferenceName.equals(tableName)) {
         buffer.append(" " + tableReferenceName);
      }
   }
}

Finally, join objects are also of the MSource type.

public abstract class MJoin implements MSource {
   protected MTable rightTable;
   protected MCondition joinCondition;
   protected List columnPairList = new ArrayList();
   ...  
   public void add(MEntityProperty leftColumn, MEntityProperty rightColumn) {
      columnPairList.add(new MEntityProperty[] {leftColumn, rightColumn});
   }
  
   public String emitJoinCondition() {
      StringBuffer buffer = new StringBuffer();     
      for (MEntityProperty[] columnPair : columnPairList) {
         if (buffer.length() > 0) buffer.append(" AND ");
         buffer.append(columnPair[0].emitExpression() + " = " + columnPair[1].emitExpression());
      }
      return buffer.toString();
   }
  
   public abstract void emitFrom(StringBuffer buffer);  
}

Remember that we have pre-configured our foreign key constraints e.g. in ZEntityManager further above. Typically, this enables our joins to be expressed very concisely in our select query, e.g. join(dog, dogHouse, region), i.e. without having to specify the join columns.

For more complicated cases of joining using columns from two or more other tables, we provide convenience methods in MSelectQuery for explicitly indicating which column pairs to use in the join, and/or a further join MCondition might be specified.

But typically we can deduce the columns from the foreign constraints. For example, we know to use dogHouseId to handle join(dog, dogHouse), i.e. "FROM dog JOIN dog_house ON (dog.house_id = dog_house.house_id)."

We might emit the JOIN SQL as follows.

public class MLeftJoin extends MJoin {
   ...
   public void emitFrom(StringBuffer buffer) {
      buffer.append("\n LEFT JOIN " + rightTable.getTableName());
      buffer.append(" ON (" + super.emitJoinCondition() + ")");
   }
}  

We take care to make our SQL output look neat, and to be as concise as possible. For example, in the above method, we should try to use a JOIN USING if possible, e.g. "JOIN dog_house USING (house_id)" when the column names are the same on both sides of the join.

So joins are a walk in the park. Which is what I wanna do sooner rather than later, so let's wrap this up sooner. So I'm not going to talk about sub-selects. Because I forget exactly how I handled them before, and don't want to remember because then I'll be tempted to include a discussion on them here and now. Darn! I just remembered. OK, lemme try...

Loose endings

"Things that upset a terrier may pass virtually unnoticed by a Great Dane."

Of course we must support sub-selects, e.g. "WHERE type IN (SELECT type FROM types WHERE ...)
AND id IN (SELECT MAX(id) FROM ...)".

public class MInSelect extends MCondition {
   protected MExpression expression;
   protected MSelectQuery subselect;
  
   public MInSelect(MExpression expression, MSelectQuery subselect) {
      this.expression = expression;
      this.subselect = subselect;
   }
  
   public String emitCondition() {
      return expression.emitExpression() + " IN (" + subselect.emitQueryString() + ")";
   }
}  

That should work! I knew it was easy. Just remembering it was the hard part.

Which reminds me, we should also implement INSERT, UPDATE and DELETE queries. Compared to the MSelectQuery, they are easy. And that's all I'm gonna remember about them right now!

Except to say, UPDATE and DELETE implement FROM, JOIN and WHERE exactly like SELECT of course. So MSelectQuery, MUpdateQuery and their other friend MDeleteQuery, might all extend MQuery, with common support put there for FROM, JOIN and WHERE.

Netbeans, the adhoc query tool

"Whoever said you can't buy happiness forgot little puppies."

Dog45s.jpg
I found that writing native queries e.g. using Netbeans, was very convenient. I could keep adding new methods, which was fun. And also the auto-completion of column names, and neat column names at that, was such a pleasure. Especially when the database is big and messy and unfamiliar, and column names are abbreviated to the point of confusion. "What is ln_d_per again? You right, it is the line discount percentage... I knew that."

In fact, I found it infinitely more convenient than using psql. Especially where there are many joins. Those are very tedious in SQL, compared to the above native queries e.g. join(dog, dogHouse, region). We "pre-configure" the foreign key constraints, so our native select query can iron out the details.

So I wrote and ran adhoc support queries using Netbeans rather than psql. "I love that F6 button, man!" I was formatting the output to print to the console, which worked fine, because the IDE provides scroll bars of course. However, one could get fancy and popup a JTable, and save the preferred column widths and maybe even column order, using the Preferences API.

The bonus was that later I could cut and paste those adhoc queries into servlets, e.g. to be reused by myself, other developers, and support technicians. And with very little extra work, I spun some of those servlets into actual production reports for the client. A future article in this series will delve into that.

I wrote some bash scripts that rsync'ed the updated classes onto central servers in multiple environments/networks in multiple countries (only two, but shhh), with multiple instances per server, i.e. newer "staging" code vs stable "production" versions running on different ports, and restarted all the updated servers, at the push of an Enter button. It was a beautiful thing to behold! :) Hey, I should have written Ant tasks to do this, then I would never have to leave Netbeans... except to read Thunderbird and surf Firefox, of course.

Conclusion

"I see a look in the dog's eyes, a quickly vanishing look of amazed
contempt, and I'm convinced that he thinks I'm nuts."

In Bean Curd 1 I introduced an "explicit bean property" approach with "no string references attached" where property descriptors are declared explicitly in a bean info class, and their field names are used as implicit property names, for beans binding.

These property descriptor objects are configured using their own Field. In particular, the field name is used to deduce the default property name, and also further settings are extracted from annotations.

Dog23s.jpg
This avoids using string literal references, which I also argue against in Explicit Reflection and Refactoring Translations.

In the ORM example presented here, our bean is an "entity bean" e.g. representing a row from a database table. Explicit bean properties are declared in "entity info" objects, which double up as "data access objects." This provides flexibility for mapping and configuration, via overloaded factory methods, in addition to annotations.

We then use these explicit properties for "native queries" i.e. stringless database queries. In this case, our queries are readily toolable (eg. using the IDE's prompting, auto-completion and error-detection capabilities), and our ORM is refactorable, e.g. we can safely rename our column mappings. "It's the dogs!"

Writing queries becomes a breeze (thanks to the toolability, particularly auto-completion) and it becomes easy to keep the mapping clean and consistent (using IDE refactorings freely).

"We in the dog pound now, baby!" :)

Future articles in this series might be "Bean Curd 4: On Form", looking at using our "annotated integrated explicit property" approach for HTML forms, for the purpose of capturing parameters for native query servlets, and "Bean Curd 5: PH&P - Pdf, Html and Poi" for producing reports in PDF, HTML and/or Excel, using annotated property descriptors, native queries and servlets. But first I might do, "Bean Curd 3: Swing Form Binder" just to mix it up.

Credits: Icons: Everaldo.com; Fotos: pixelperfectdigital.com: Kenn Kiser; stock.xchng: Sarah Williams, Anna B.; flickr.com; yotophoto.com.

Related Topics >>