


David Walend's Blog

September 2003 Archives


Design for Reuse: Source Directory Structures for Java Projects

Posted by dwalend on September 25, 2003 at 04:56 AM | Permalink | Comments (12)

I've found I want to reuse code from almost every project I've ever worked on. Plus other people treat my code as library code years longer than I thought possible. So when I create Java code, I produce reusable .jars of code. Structuring the project correctly at the beginning to help reuse seems to be important, but isn't without cost. I have settled on one way, but am not convinced it's the best. (The examples are from JDigraph, a library for representing and working with directed graphs.)

I pack a few complete packages into each .jar I produce. Although it's possible to split a package between multiple .jar files, .jar files work best when one .jar file holds all the .class files from specific packages. The manifest file contains version and security information based on specific packages. Signing a package in a .jar file prevents others from adding code to that package, so splitting a package between .jar files can result in security exceptions. More importantly, if someone asks me about a ClassNotFoundException, I can quickly figure out which .jar file is missing from their classpath.
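When a ClassNotFoundException (or a suspected duplicate class) comes up, it can help to ask the JVM where it loaded a class from. The following is a minimal sketch, not part of JDigraph; the class name JarLocator and its methods are made up for illustration. It relies only on the standard Class.getProtectionDomain() and CodeSource.getLocation() calls.

```java
import java.security.CodeSource;

/** Reports where the JVM loaded a class from. A debugging aid; the name is hypothetical. */
final class JarLocator {

    private JarLocator() {}

    /** Returns the location (.jar or directory URL) a class was loaded from. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (java.lang.String, for example)
        // usually have no CodeSource recorded.
        if (source == null || source.getLocation() == null) {
            return "(no code source recorded)";
        }
        return source.getLocation().toString();
    }

    /** Same lookup by class name; a missing class is reported rather than thrown. */
    static String locationOf(String className) {
        try {
            return locationOf(Class.forName(className));
        } catch (ClassNotFoundException e) {
            return "(not on the classpath: " + className + ")";
        }
    }
}
```

Because each .jar holds whole packages, the package name in the "not on the classpath" message points at exactly one .jar file.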

I have a simple rule for figuring out which packages go in which .jar files: If two packages depend on each other -- java.lang and java.io, for example -- then they should go in the same .jar file. If not, they should be in different ones. Making this decision early seems to be important for making smaller pieces reusable.
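The rule can be applied mechanically to a table of package dependencies. The sketch below is one possible reading of it, under two assumptions that go beyond the text: "depend on each other" is taken to include indirect cycles (A needs B, B needs C, C needs A), and every package appears as a key in the map. The class JarGrouping and the package names used with it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Groups packages that depend on each other, so each group can become one .jar. */
final class JarGrouping {

    private JarGrouping() {}

    /** Collects every package reachable from start by following dependencies. */
    static Set<String> reachable(String start, Map<String, Set<String>> dependsOn) {
        Set<String> seen = new HashSet<>();
        List<String> pending = new ArrayList<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            String current = pending.remove(pending.size() - 1);
            Set<String> next = dependsOn.getOrDefault(current, Collections.<String>emptySet());
            for (String pkg : next) {
                if (seen.add(pkg)) {
                    pending.add(pkg);
                }
            }
        }
        return seen;
    }

    /** Two packages share a group when each can reach the other. */
    static List<Set<String>> group(Map<String, Set<String>> dependsOn) {
        // Sorted collections keep the output order predictable.
        Map<String, Set<String>> reach = new TreeMap<>();
        for (String pkg : dependsOn.keySet()) {
            reach.put(pkg, reachable(pkg, dependsOn));
        }
        List<Set<String>> groups = new ArrayList<>();
        Set<String> placed = new HashSet<>();
        for (String pkg : reach.keySet()) {
            if (placed.contains(pkg)) {
                continue;
            }
            Set<String> group = new TreeSet<>();
            group.add(pkg);
            for (String other : reach.keySet()) {
                if (reach.get(pkg).contains(other) && reach.get(other).contains(pkg)) {
                    group.add(other);
                }
            }
            placed.addAll(group);
            groups.add(group);
        }
        return groups;
    }
}
```

Each returned set is a candidate .jar: packages inside a set cannot be used without each other, while a dependency that runs in only one direction leaves the two packages in separate sets.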

I know two ways to structure source code in Java projects: All-In-One and Many-Subprojects. Both allow me to build the .jars I want to. Neither is perfect. I think Many-Subprojects is best in most cases, but if there's a third, I'd love to hear about it.

All-In-One: The most common way I've seen Java projects set up is with all the source code under a common source root:

The great feature of All-In-One is that everyone can find things in the source code. The approach grows organically as the project grows; there aren't any choices beyond deciding when and where to create a new package. People expect to find this sort of structure. It's a simple structure that works well for small projects with few people working on them.

The All-In-One approach makes it hard to safely break the kit into smaller reusable .jars. Because the project grows organically, nothing enforces dependency directions between packages. If a developer adds a dependency in the wrong direction, the packages have to be in the same .jar file to safely use either package. If someone adds a dependency, javac will ignore the rules I lay down in my build.xml and reach across the directory structure to compile source code that is supposed to be independent; the build.xml loses control of the source path.

The All-In-One approach doesn't scale well. It's fine for three good programmers for about six months, but for twenty developers or for three good developers for two years, it becomes brittle. The build file has to hold the complexity of creating the separate .jar files. I use ant build.xmls for builds. Ant build.xml files are easy to hack under the pressure of the moment, which makes them brittle over time. Most projects that I've worked on have had increasingly complex build.xml files that get very difficult to maintain and follow. Further, as the project grows larger, it gets harder to use a test-first style and harder to refactor. If I want to try out a change in API, I have to update all my code to match the new API before I can run my tests.

Many-Subprojects: An alternative is Many-Subprojects. A project is made of several subprojects, each with its own build file that builds a single .jar file of library code. Each subproject contains only the source code needed for its root. The uber subproject orchestrates the other build files and creates the final product. It looks like this:

The best thing about the Many-Subprojects approach is that it organizes things into smaller reusable pieces from the start. I can even snap off a subproject and convert it to a new project. There's no shared common source root, so javac can enforce the dependency rules I've laid down. And the build.xml keeps tyrannical control of the classpath.

This approach scales well; as I start new packages, I add new subprojects. The build.xml files for the subprojects are generally identical. Some subprojects will need an odd target or two, but the changes stay isolated. The approach encourages refactoring and a test-first approach; I can try out a change in one subproject's test code before propagating the change beyond that subproject.

The worst thing about Many-Subprojects is that the directory structure is less intuitive. Few developers have seen the style before. When I start a new package, I have to decide if the new package is independent enough to rate its own .jar file and subproject immediately. Some people find that task difficult. Setting up the structure initially takes more work.

All-In-One seems fine for projects with few people and a short lifetime. Although I usually only work with a few developers, my projects tend to last for years, grow, and create new projects. Converting an All-In-One project to Many-Subprojects is a thankless task that uncovers ugly cross-dependencies, but converting Many-Subprojects to All-In-One is trivial. I use the Many-Subprojects approach from the start.

Does anyone have a third structure that will give me the best of both worlds?

Defending Autoboxing (or Save Us From the Preprocessor)

Posted by dwalend on September 18, 2003 at 05:14 AM | Permalink | Comments (8)

I plan to use autoboxing in a project, so I'm responding to Erb Cooper's damning blog, "The Terror That Is Autoboxing." I haven't read the spec yet -- only JCP members have had the chance. I think we should reserve judgment at least until we can see what the JSR expert group has come up with. Under the eye-catching headline, Erb's complaint is that autoboxing could create a lot of objects for no good reason, use a lot of memory, and create a lot of garbage. I hope autoboxing will be a bit more sophisticated, and that developers will use the same care they show with Strings.

I want to solve a design problem using autoboxing and generics for JDigraph that I'll otherwise have to solve with a preprocessor, or Jython, or a compiler and ClassLoader hack that will basically repeat what the generics and autoboxing should do for me. I think autoboxing and generics will do the job, but won't know until I get a chance to read what the expert group has decided. I might even have to wait until the JSDK 1.5 betas are out to try it.

JDigraph's measured package has a collection of path optimization algorithms. Path optimization algorithms like Johnson's, Dijkstra's and Floyd-Warshall use a relax() procedure at their core. relax() compares the current cost to traverse between a "from" node and a "to" node to the cost of using a "through" node. The three indexes (fromIndex, throughIndex and toIndex) correspond with the nodes. safeLength() returns the lowest known cost to cross an edge between two nodes. Here's the relax() method from AbstractShortestCEDistances.java.

    private int[][] distances;

    protected void relax(int fromIndex,int throughIndex,int toIndex) 
    {
        int fromThrough = safeLength(fromIndex,throughIndex); 
        if(fromThrough==Integer.MAX_VALUE)
        {
            return;
        }
        
        int throughTo = safeLength(throughIndex,toIndex); 
        if(throughTo==Integer.MAX_VALUE)
        {
            return;
        }
        
        int startLength = Integer.MAX_VALUE;
        
        int fromTo = safeLength(fromIndex,toIndex); 
        if(fromTo > fromThrough+throughTo)
        {
            distances[fromIndex][toIndex]=fromThrough+throughTo;
        }
    }

Right now this code uses an int[][] for distances. I would like to use autoboxing and generics to easily switch this class from ints to doubles or other Number subclasses. A generic should generate a subclass of the parent class with the correct types, which would be very efficient. I'm hoping that autoboxing will let me keep the efficiency of primitive types while letting me access that flexibility. A profiling investigation showed that the algorithm spent most of its time assigning new distances in that last if block, so this method should be a good test of efficiency. It will be easy to measure once the JSDK 1.5 betas start to come out; I've got the timing study ready.
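For comparison, here is a rough sketch of what a generic, boxed relax() might look like. It is not JDigraph code, and it leans on several assumptions: the class name GenericDistances is invented, a null entry stands in for Integer.MAX_VALUE as "no path known", and addition and ordering are passed in because Number itself defines neither. Java's generics cannot take primitive type parameters, so this version stores boxed values such as Integer or Double; whether that is fast enough is exactly what the timing study would have to show.

```java
import java.util.Comparator;
import java.util.function.BinaryOperator;

/** Hypothetical generic counterpart to the int[][] version above. */
final class GenericDistances<N extends Number> {

    private final N[][] distances;        // a null entry means "no path known"
    private final BinaryOperator<N> add;  // how to add two path lengths
    private final Comparator<N> order;    // how to compare two path lengths

    GenericDistances(N[][] distances, BinaryOperator<N> add, Comparator<N> order) {
        this.distances = distances;
        this.add = add;
        this.order = order;
    }

    N safeLength(int fromIndex, int toIndex) {
        return distances[fromIndex][toIndex];
    }

    void relax(int fromIndex, int throughIndex, int toIndex) {
        N fromThrough = safeLength(fromIndex, throughIndex);
        N throughTo = safeLength(throughIndex, toIndex);
        if (fromThrough == null || throughTo == null) {
            return; // no route via the "through" node
        }
        N candidate = add.apply(fromThrough, throughTo);
        N fromTo = safeLength(fromIndex, toIndex);
        if (fromTo == null || order.compare(fromTo, candidate) > 0) {
            distances[fromIndex][toIndex] = candidate;
        }
    }

    /** Floyd-Warshall style sweep: relax every pair through every node. */
    void relaxAll() {
        int n = distances.length;
        for (int through = 0; through < n; through++) {
            for (int from = 0; from < n; from++) {
                for (int to = 0; to < n; to++) {
                    relax(from, through, to);
                }
            }
        }
    }
}
```

Switching from ints to doubles then becomes a matter of construction, for example new GenericDistances<Double>(table, Double::sum, Double::compare) with a Double[][] table, at the cost of one boxed object per stored distance.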

One alternative approach is to use a preprocessor to change the primitives. (And to rename the class so that the same ClassLoader context can use versions for different primitives). But to do that I'd have to guess what all the possible child classes of Number should be at build time. (Plus I'd need to do the same thing to my Heap for Dijkstra's algorithm. Plus all the interfaces involving path length would have to run through the same process to have the right APIs.)

A second alternative is to use string substitution, code generation, and a compiler at runtime to build the .class files I need at runtime. Defining the interfaces for these would be difficult; I'd still have to do some tricks at compile time. That work would repeat what the generics feature does already.

A third alternative is to write the code in Jython. But I expect this numeric-intensive code would be much slower. A disappointing implementation of autoboxing would run at Jython speed. (Like Erb says, "Wouldn't I be just as well off using Python?") However, it would work and I could speed it up when it's important.

I hope and expect that autoboxing will be efficient enough to let this work well, instead of just work.

SomnifugiJMS for User Interfaces and Simple-Enough APIs

Posted by dwalend on September 11, 2003 at 04:56 AM | Permalink | Comments (2)

Somnifugi JMS is an implementation of the Java Message Service built on top of Doug Lea's Channels. This JMS implementation runs inside a single JVM, quickly delivering messages between Java Threads. A few years back, I created Somnifugi JMS to speed up a project where the architects had gone overboard with messaging. I used Somnifugi JMS to prototype and test the next project, and left it in place to keep things fast. The project after that, I used a Somnifugi JMS Topic to communicate from the AWT Thread to other Threads where the controllers and models live. It worked great; I've used it in every Swing project since then, and a few others have started using it this way, too.

I have trouble tracking all the issues in building a Swing interface: Swing's API has about a hundred top-level classes, each with a few dozen methods, plus nine subpackages, AWT, and Java2D to get things just right. Getting a group of people to all use MVC, JavaBeans and Threads the same way is hard. The model always has its own dynamics, and usually changes a few times a day. Add users unfamiliar with the new UI to make the project even harder.

The "Use Topics to communicate from the view to the controllers" pattern is a nice complement to the "Use SwingUtilities.invokeLater() to communicate from the controllers to the view" pattern. Extending the pattern to place model handling in separate Threads is just a matter of adding extra Topics or Queues for the controller to consume. Plus if I need to make the application fit a client-server model, all I have to do is swap the model's Somnifugi JMS Topics for distributed JMS Topics.

I think the power in this pattern comes from breaking up the task into small, easy-to-grasp pieces.

Using a Topic to communicate from the view to the controller simplifies both by decoupling them. The view generates messages whenever any user action happens. The controller digests those messages. The JMS API is small and easy to learn and use. Using Topics simplifies performance decisions about handling Threads and shared Objects. The projects become more predictable, easier to decouple and test, and more pleasant to work on.
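The shape of the pattern can be sketched without a JMS provider on the classpath. In the sketch below a java.util.concurrent.BlockingQueue stands in for the Topic, which is a simplification: a real Topic delivers each message to every subscriber, while a queue hands it to one consumer. The class ViewToControllerChannel, its method names, and the message strings are all invented for illustration; this is not Somnifugi JMS code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** The view publishes user actions; a controller thread consumes them. */
final class ViewToControllerChannel {

    // Shutdown marker, compared by identity so no real message can collide with it.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>();
    private final Thread controller;

    /** updateView is called on the controller thread with each result. */
    ViewToControllerChannel(Consumer<String> updateView) {
        controller = new Thread(() -> {
            try {
                while (true) {
                    String action = messages.take(); // blocks until the view publishes
                    if (action == STOP) {
                        return;
                    }
                    updateView.accept("handled " + action);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "controller");
    }

    void start() {
        controller.start();
    }

    /** Called from the view (the AWT thread in a Swing app); returns immediately. */
    void publish(String action) {
        messages.offer(action);
    }

    /** Lets the controller drain pending messages, then waits for it to finish. */
    void shutdown() {
        messages.offer(STOP);
        try {
            controller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a Swing application the updateView callback would wrap its work in SwingUtilities.invokeLater(), so that the view is only touched from the AWT thread, and listeners on buttons and menus would do nothing more than call publish().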

I think a lot of this gain is because a JMS Topic is easier to use than Java's threading support. The JMS specification is one of the best written specifications to come out of the JCP. The underlying ideas work without bending the universe to match. Each interface fills a clearly outlined role and I don't have to do anything weird to my own code to use them. Nothing requires me to inherit from someone else's superclass, for example. (See Allen Holub's article, "The fragile base-class problem".) I needed a single afternoon to read the spec, and I understood it without strain.

I spoke briefly with Joseph Fialli (one of the authors of the JMS spec) after he gave an invited talk (on JAXB) at a local user group. I described how I was using JMS behind Swing interfaces. He thought the pattern was pretty slick. He was the only person ever to compare it to InfoBus.* After the talk, I found my notes from reading the InfoBus spec in 1999. "InfoBus API looks more complex than just using the Thread API. We'd need to fill in most of the JavaBeans event spec. Not a good fit for our problem." Over the years, I've never seen a project that uses InfoBus, despite the hype it received in the late 1990s. A JSR to update the InfoBus specification was started in 1998 but withdrawn in 1999. I think the InfoBus spec has been abandoned, but some ideas live on in the JavaBeans spec.

I think there's a strong correlation between simple APIs and how likely developers are to use those APIs. JMS survives and thrives because it meets a need that developers recognize with relatively little overhead. JMS exemplifies a "Simple-Enough Principle" for API design. Developers never adopted InfoBus in large numbers because we didn't think InfoBus was any better than what we had before.

Because all the code I write becomes library code, I try to keep this principle in mind when I define APIs. I want to create an API rich enough to do the job, but simple enough for another developer to be able to use without expanding the problem he's working on. Perhaps the advances in Aspect-Oriented code will reduce the typing overhead for JavaBeans to the point where developers can meet the specification. But that's a different blog.

* As I added mark up to this article, I got email from Ted Shab asking about Somnifugi JMS, "We are looking for InfoBus-like functionality, as well as some other related concepts."


