Anemic vs. Obese Domain Objects
Child number two has kept me away from the blogosphere. As I write this, señor Nelson Jr. is busily learning how to access and destroy objects. Next up: garbage collection.
I found this article on JavaBlogs the other morning, and while technically speaking it's an old topic, it seems to be one that none of us ever tires of talking about. If you have tired of talking about it, then you'll probably want to move on to another blog right about now.
Briefly summarized, the article points out that many domain objects are beefed up incorrectly, ostensibly to avoid certain pejorative labels coined by certain well-known, prolific industry visionaries. Specifically, the beef often takes the form of data access methods which, as the poster wonders out loud, are usually supposed to be whisked off to some data access object, are they not? So if you move them, what are you left with? The author examines a few cases of what he considers business methods, explores some of their wrinkles, and ends, rather appropriately in my opinion, with a semi-frustrated, open-ended question:
Aren't domain objects with persistence logic just really Enity [sic] Beans? Didn't we agree Entity Beans are bad? Let's keep the persistence out of the domain object and in the Dao where it belongs.
I think this article raises a point that we all think about when setting out to design a class library or model a domain: what is the behavior, anyway? Most modeling and design exercises I've been involved with typically start with the structure. Members of the design or development team sit around and name domain objects and their attributes following good object-oriented practice. Deep thought ensues. More attributes are added, some are taken away. Classes are refactored. At the end of the day you have a glorified entity-relationship diagram, redubbed a class diagram, or, to use Fowler's term, a pile of anemic domain objects. Then the team sprinkles behavior over the objects, and, just as CodeMonkey indicates, more often than not it is simply pick-it-up-put-it-down logic: Account.findByDate(), Order.update(), etc.
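To make the pick-it-up-put-it-down distinction concrete, here is a rough Java sketch of the split the quoted article asks for: the finder and save methods live on a DAO, and the domain object keeps only what it knows about itself. Everything here is hypothetical and for illustration only (the Account fields, the AccountDao interface, the in-memory implementation standing in for a real database); it is not code from the article.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object: it carries no persistence methods.
final class Account {
    private final String id;
    private final LocalDate openedOn;

    Account(String id, LocalDate openedOn) {
        this.id = id;
        this.openedOn = openedOn;
    }

    String id() {
        return id;
    }

    // A question the account can answer about itself.
    boolean wasOpenedOn(LocalDate date) {
        return openedOn.equals(date);
    }
}

// Hypothetical DAO: the pick-it-up-put-it-down logic lives here.
interface AccountDao {
    void save(Account account);

    List<Account> findByDate(LocalDate openedOn);
}

// In-memory stand-in for a real data store (an assumption for this sketch).
final class InMemoryAccountDao implements AccountDao {
    private final List<Account> store = new ArrayList<>();

    @Override
    public void save(Account account) {
        store.add(account);
    }

    @Override
    public List<Account> findByDate(LocalDate openedOn) {
        List<Account> matches = new ArrayList<>();
        for (Account account : store) {
            if (account.wasOpenedOn(openedOn)) {
                matches.add(account);
            }
        }
        return matches;
    }
}
```

Of course, once the finders move out, Account looks pretty thin, which is exactly the question the article leaves hanging.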
I used to think that getting the structure right was critically important. Think deeply about your objects, make sure they have all the attributes they're going to need, move things around in the class hierarchy appropriately to reduce duplication, and there, you have your structure onto which you can drape the domain's behavior logic. Now I think this is wrongheaded (to paint it in overly black-and-white terms), and I have found that examining the behavior is a much better way to arrive at the structure. If you do this, the business logic falls out naturally.
To start with, think about some obvious but often neglected questions: for any given object with a proposed attribute x, why must that object have attribute x? What little-"p" process needs that x? Why, really, do I have to know an order's creation date? Why do I need to know a student's age?
I've found that the answers take one of three rough forms:
- Because someone somewhere is going to want to build some sort of general-purpose query that will involve x. We can't even begin to predict what this query will be.
- Because we need to make this attribute available for reports that we send to An Important Organization, Standards Body or Governmental Agency. Who knows why; that's just what we do.
- Because there is a business rule that dictates that things with an x in some range or meeting some criterion need to get flagged in a particular way, or exported, or associated with some other kind of thing.
The first item is probably the only case where you simply have to expose your object's attributes via public getters and setters Because They Simply Have To Be There. I've also found that it's pretty rare. It follows, too, that it doesn't really matter whether your domain object has an x; it might as well have a y, or a z, or forty-three other attributes. The point is that if an attribute is exposed purely for dynamic query purposes, then it hardly matters what that attribute is for, since your program isn't going to do anything with it other than show it to the person who constructed the query in the first place. So if you're designing with interfaces, keep these types of getters and setters out of your interface and put them in the underlying implementation.
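Here is a minimal sketch of what keeping query-only accessors out of the interface might look like. The names (Order, PersistentOrder, the salesRegion attribute) and the 30-day payment window are all assumptions I made up for the example; the only point is which methods land on which type.

```java
import java.time.LocalDate;

// Hypothetical interface: only the behavior that other code depends on.
interface Order {
    boolean isOverdue(LocalDate today);
}

// Hypothetical implementation: the query-only accessors live here.
final class PersistentOrder implements Order {
    // Assumed payment window, purely for illustration.
    private static final int PAYMENT_WINDOW_DAYS = 30;

    private final LocalDate createdOn;
    private final String salesRegion;

    PersistentOrder(LocalDate createdOn, String salesRegion) {
        this.createdOn = createdOn;
        this.salesRegion = salesRegion;
    }

    @Override
    public boolean isOverdue(LocalDate today) {
        // Overdue once the payment window has fully elapsed.
        return createdOn.plusDays(PAYMENT_WINDOW_DAYS).isBefore(today);
    }

    // Exposed only so a general-purpose query tool can display the values.
    public LocalDate getCreatedOn() {
        return createdOn;
    }

    public String getSalesRegion() {
        return salesRegion;
    }
}
```

Code written against Order never sees getCreatedOn() or getSalesRegion(); only the query layer, which knows about the concrete class, can reach them.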
The second item is fairly common. In higher education, which is the domain in which I currently find myself, the government has an interest in various kinds of data and reports that, to my eyes, consist of various combinations of obscure, unpronounceable "codes". Your first inclination would be to stop there (well, mine is): OK, the government says we need to provide a CIP code, so we'd better have a getCIPCode() method on our Program object. But stop there, and ask again: why do we need this? Maybe the correct method is, instead, provideGovernmentMandatedData(), or maybe the solution is to move these kinds of obscure, for-reports-only attributes into an outboard descriptor object, e.g. CipCodes.getFor(program).
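A rough sketch of the outboard descriptor idea, assuming a simple in-memory lookup keyed by program name. The register() method, the Optional return type and the "00.0000" value are my own inventions for illustration (the value is a placeholder, not a real CIP code); only the CipCodes.getFor(program) call shape comes from the paragraph above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object: note that it has no getCIPCode() method.
final class Program {
    private final String name;

    Program(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

// Hypothetical outboard descriptor holding the for-reports-only codes.
final class CipCodes {
    // Keyed by program name; a real system would use a proper identifier.
    private static final Map<String, String> CODES_BY_PROGRAM = new HashMap<>();

    private CipCodes() {
        // Static lookup only; not meant to be instantiated.
    }

    static void register(Program program, String cipCode) {
        CODES_BY_PROGRAM.put(program.name(), cipCode);
    }

    // Empty when no code has been registered for the program.
    static Optional<String> getFor(Program program) {
        return Optional.ofNullable(CODES_BY_PROGRAM.get(program.name()));
    }
}
```

The reporting code asks CipCodes for what it needs, and Program stays free of attributes that nothing in the domain logic ever touches.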
The remaining item above is the killer. Suppose you intuitively know your Student object has to have an age attribute. Why? What's it used for? Why do you need to know how old a student is? In a lot of scenarios, the checking of an attribute is really something else in disguise. The business rule, in other words, may be "if the student does not meet state requirements for taking a class, then reject him." If that really is the business rule, then you don't need an age accessor at all, just a meetsStateRequirementsForTakingAClass() method.
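As a sketch, assuming the state requirement boils down to a minimum age (the 16-year threshold is made up, and I've added a date parameter so the rule is easy to test), the Student might look something like this:

```java
import java.time.LocalDate;
import java.time.Period;

// Hypothetical Student: the birth date stays private and there is no getAge().
final class Student {
    // Assumed threshold, for illustration only; the real rule may differ.
    private static final int MINIMUM_AGE_YEARS = 16;

    private final LocalDate birthDate;

    Student(LocalDate birthDate) {
        this.birthDate = birthDate;
    }

    // The business rule itself, rather than the raw attribute behind it.
    boolean meetsStateRequirementsForTakingAClass(LocalDate onDate) {
        int ageInYears = Period.between(birthDate, onDate).getYears();
        return ageInYears >= MINIMUM_AGE_YEARS;
    }
}
```

If the state later changes the rule to involve residency or prior coursework, callers don't change; only the body of the method does.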
In practice, of course, returning to planet Earth, you end up having domain objects that have a mix of getters and setters and more object-oriented business methods. But I've found that arriving at these methods by way of thinking about the processes that are going to need them tends to result in cleaner, less-coupled objects than the alternative approaches that I've tried.
Thanks for reading.