


Evan Summers

Evan Summers's Blog

Bin Bash Java (Chapter 1)

Posted by evanx on May 24, 2006 at 01:37 AM | Comments (4)

Introduction

In the preceding blogs "Java is all you'll ever need" and "A Fool's Errand" I alluded to using Java for "small tasks" eg. file/system tasks, rather than shell scripts. I promised to present some examples along these lines.

This is Chapter 1 of many, and presents a basic design. We'll thrash it out in subsequent chapters.


Motivation

My motivations for trying to move away from shell scripting (or from some other scripting language like Python or Groovy), in favour of writing "tasklets" in Java, are as follows.

  • My personal familiarity and productivity with my favourite Java and Netbeans environment.
  • The unlimited power of available Java libraries and tools. "Yeah baby, bring it on!"
  • Using Netbeans' convenient CVS/Subversion to manage and version tasklets.
  • Using Netbeans to "keep it clean" eg. revising, commenting (javadocs) and refactoring tasklets.
  • Moving towards a portable, cross-platform solution for file and system tasks (compared to shell scripts).


Problem

A typical task is backing up our computer, so let's consider this one of our first goals. OK, this could take a while. Hopefully we'll achieve this goal in subsequent chapters. Doing a good design here would be a great start.

We want our tasklets to be cross-platform. In particular, we want to use our "backup tasklet" to back up our Linux PC, Windows laptop, and Mac OS X media center. So let's keep that in mind.

Of course our "framework" should be reusable for other tasks, besides the backup one. And very convenient to use. So that some day soon, we can write all our file and system tasks in Java, and never look back, woohoo!


Design Overview

Let's create some helper classes, eg. TFileHelper for handling files, and TProcessHelper for handling processes. And we will create a TTaskContext which exposes our "library" to our tasklets, and/or can be used as a superclass by our tasklet classes.

Note that I've chosen the letter T to prefix our "tasking" project classes. This avoids any potential namespace conflicts, and makes it clear which classes are our new ones.

To give us some flexibility, let's create our own classes to represent files and processes, namely TFile and TProcess. We will extend those, eg. deriving TDir and TZipFile from TFile.

Files

First let's design our file helper class. (Note that I'm using abstract declarations to highlight the interface at this stage, rather than the implementation.)

public class TFileHelper {
   public boolean exists(TFile file);
   public boolean isDirectory(TFile file);
   public void remove(TFile file);
   public void removeIfExists(TFile file);
   public void removeDirectory(TDir directory);
   public void moveToTrash(TFile file);
   public void moveToTrashDirectory(TDir directory);
   public TDir createDirectory(TDir directory);
   public void copy(TFile file, TFile destination);
   public void copyOverwrite(TFile file, TFile destination);
   public List<TFile> listDirectory(TDir directory);
   public List<TFile> listDirectoryRecursively(TDir directory);
   public List<TFile> findRecursively(TDir directory, String pattern);

   public TZipFile zipRecursively(TFile ... paths);
   public TZipFile openZipFile(TZipFile zipFile);

   public String getDigestString(TFile file);
   public boolean equalsDigest(TFile file, TFile otherFile);
}

That's just for starters. We can expect this class to grow a lot!

The methods related to zip files might be refactored out later eg. into TArchiveFileHelper, and the digest methods to TDigestHelper, but let's not overdesign this thing quite yet. Later when we wanna support different archives, eg. tar et al, and different digests, eg. MD5 vs CRC32, then we might create generic interfaces for archives and digests, for different implementations. But not today. Rome wasn't built in a day, innit.
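To make the digest methods a bit more concrete, here is a minimal sketch of what getDigestString() and equalsDigest() might do, assuming MD5 and the JDK's own MessageDigest. The class name DigestSketch is made up for illustration, it takes java.nio.file.Path rather than our TFile (which doesn't exist yet), and it uses the JDK's UncheckedIOException as a stand-in for our own runtime exception.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: DigestSketch is not part of the design above.
class DigestSketch {

    // Returns the MD5 digest of the file's contents as a lowercase hex string.
    static String getDigestString(Path file) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1) {
                digest.update(buffer, 0, count);
            }
        } catch (IOException e) {
            // Stand-in for a wrapped runtime exception of our own.
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Two files are considered equal here if their digest strings match.
    static boolean equalsDigest(Path file, Path otherFile) {
        return getDigestString(file).equals(getDigestString(otherFile));
    }
}
```

Swapping "MD5" for another algorithm name is the obvious place where a generic digest interface could slot in later.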

We might have these methods throw wrapped runtime exceptions, eg. TIOException, which is a RuntimeException wrapping IOException. So that's why the methods above are not throwing checked exceptions, in case you were wondering.
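As a rough sketch of that idea, TIOException could be as small as the class below. The constructor shape and the TIOExceptionDemo.remove() helper are assumptions for illustration only, and the helper takes a plain file name rather than a TFile.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// A RuntimeException wrapping an IOException, as described above.
class TIOException extends RuntimeException {
    TIOException(IOException cause) {
        super(cause.getMessage(), cause);
    }
}

// Hypothetical helper showing how a file method might translate exceptions.
class TIOExceptionDemo {
    // Deletes the named file; any IOException is rethrown unchecked.
    static void remove(String fileName) {
        try {
            Files.delete(Paths.get(fileName));
        } catch (IOException e) {
            throw new TIOException(e);
        }
    }
}
```

A tasklet calling remove() then needs no try/catch or throws clause, but can still catch TIOException and inspect getCause() if it cares.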


Processes

Next let's design our process helper class.

public class TProcessHelper {
   public void kill(TProcess ... processes);
   public boolean killAll(TProcess process);
   public TFile pipe(TProcess ... processes);
   public TFile exec(TProcess process, Object ... args);
   public List<TProcess> getProcessList(TProcess ... processes);

   public TProcess fileWriter(TFile outputFile);
   public TProcess fileReader(TFile inputFile);
   public TProcess gzipProcess(TFile gzipFile);
   public TProcess tarProcess(TFile tarFile);
   public TProcess xargProcess(TProcess execProcess);
   public TProcess findProcess(TProcess execProcess);
}

The file reader and writer processes above would be used for redirection, eg. the file reader as the first argument in the pipe(), and/or the file writer as the last, as follows.

   processHelper.pipe(findProcess, grepProcess,
      xargProcess, gzipProcess, fileWriter);  

As you can see, we are loving Java 5 varargs :)
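To show how a varargs pipe() might chain its stages, here is a small in-memory sketch. It is only a stand-in: each stage is a function from lines to lines rather than a real TProcess, and the names PipeSketch, grep() and sort() are invented for this example.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: stages are in-memory functions, not real processes.
class PipeSketch {

    // Feeds the input through each stage in order, like a shell pipeline.
    @SafeVarargs
    static List<String> pipe(List<String> input,
            Function<List<String>, List<String>>... stages) {
        List<String> lines = input;
        for (Function<List<String>, List<String>> stage : stages) {
            lines = stage.apply(lines);
        }
        return lines;
    }

    // Stage that keeps only the lines containing the given text.
    static Function<List<String>, List<String>> grep(String text) {
        return lines -> lines.stream()
                .filter(line -> line.contains(text))
                .collect(Collectors.toList());
    }

    // Stage that sorts the lines in natural order.
    static Function<List<String>, List<String>> sort() {
        return lines -> lines.stream().sorted().collect(Collectors.toList());
    }
}
```

A call such as PipeSketch.pipe(lines, PipeSketch.grep(".java"), PipeSketch.sort()) reads much like the processHelper.pipe() call above. Real processes would need their output and input streams connected instead.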


Context

For convenience, let's introduce a superclass for our tasklets.

public class TTaskContext {
   public TFileHelper fileHelper = new TFileHelper(); 
   public TProcessHelper processHelper = new TProcessHelper(); 
   public TCalendarHelper calendarHelper = new TCalendarHelper(); 

   public TDir toDir(String dirName);
   public TFile toFile(String fileName);
   public TProcess toProcess(String processName);
}

Let's slide this baby into gear and start putting rubber to road.


Example Implementation

So let's imagine what our first backup tasklet implementation might look like, eg. for zipping up our home directory on Linux.

public class BackupTask extends TTaskContext {

   @TArgument TDir targetDir = toDir("/home/evan");
   @TArgument TDir archiveDir = toDir("/backups/evan"); 
   @TArgument String zipBaseFileName = "evan"; 

   public void run() {
      String timestamp = calendarHelper.getNumericTimestamp();
      TZipFile zipFile = fileHelper.zipRecursively(targetDir);
      zipFile.setOutputDirectory(archiveDir);
      zipFile.saveAs(zipBaseFileName + timestamp);
   }
}

Ok, so if we wanna write tasklets like the above, we've still got a bit of work to do. It's gonna rock tho, innit.
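One piece of that work is the zipping itself. Here is a minimal sketch of what might sit behind zipRecursively(), assuming the JDK's java.util.zip and java.nio.file (Java 8 or later). The ZipSketch class and its two-argument signature are assumptions, not the eventual TFileHelper API, and the sketch ignores file permissions and symlinks.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: not the eventual TFileHelper implementation.
class ZipSketch {

    // Zips every regular file under sourceDir into zipFile,
    // naming each entry by its path relative to sourceDir.
    static void zipRecursively(Path sourceDir, Path zipFile) {
        try {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(sourceDir)) {
                files = walk.filter(Files::isRegularFile)
                        .collect(Collectors.toList());
            }
            try (OutputStream out = Files.newOutputStream(zipFile);
                 ZipOutputStream zip = new ZipOutputStream(out)) {
                for (Path file : files) {
                    // Zip entry names use forward slashes on every platform.
                    String name = sourceDir.relativize(file).toString()
                            .replace('\\', '/');
                    zip.putNextEntry(new ZipEntry(name));
                    Files.copy(file, zip);
                    zip.closeEntry();
                }
            }
        } catch (IOException e) {
            // Stand-in for a wrapped runtime exception of our own.
            throw new UncheckedIOException(e);
        }
    }
}
```

The file list is collected before the archive is opened, so the zip file can live inside the directory being zipped without ending up in itself.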

So you spotted those annotations. The idea is that our framework could prompt for those eg. if not specified on the command-line...
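Here is a rough sketch of how a framework might fill in those annotated fields from the command line using reflection. Only the name TArgument comes from the example above. The ArgumentInjector class, the "name=value" argument format and the String-only support are assumptions made to keep the sketch short.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Marks a field that the framework may set from the command line.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface TArgument {
}

// Hypothetical injector: handles String fields and "name=value" arguments only.
class ArgumentInjector {

    // Overrides each @TArgument String field named in args; returns the task.
    static <T> T inject(T task, String... args) {
        Map<String, String> values = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq > 0) {
                values.put(arg.substring(0, eq), arg.substring(eq + 1));
            }
        }
        for (Field field : task.getClass().getDeclaredFields()) {
            if (!field.isAnnotationPresent(TArgument.class)
                    || field.getType() != String.class) {
                continue;
            }
            String value = values.get(field.getName());
            if (value == null) {
                continue; // keep the default; a real framework might prompt here
            }
            try {
                field.setAccessible(true);
                field.set(task, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return task;
    }
}

// Hypothetical tasklet with one annotated field and one plain field.
class DemoTask {
    @TArgument String zipBaseFileName = "evan";
    String notAnArgument = "unchanged";
}
```

Calling ArgumentInjector.inject(new DemoTask(), "zipBaseFileName=home") overrides the default, while fields without the annotation are left alone.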


Future plans

So we gonna use annotations to tell the framework what it needs to know to provide a "tasklet container" with a management console in future maybe. Now we're talking!

I'm getting way ahead of myself here, but let's dream on...

This "management console" could have a Swing interface, and/or a web interface.

A Swing interface could support a file chooser, which would be very handy, and would be quick to throw together using Netbeans/Matisse. Hey, it could even let us invoke remote tasklets, on lists of multiple machines... Hang on, this could grow up to be a network management system! "Heel, boy!"

On the other hand, a web interface would be quick and easy for administering remote boxen, eg. backing up our media center in the lounge using our laptop. Even on a local machine, having a web-based console for tasks would be handy.

You know what, I just decided we gonna embed a web server into our tasklet framework. So that our tasklets are always remoteable, and deployable via a single self-contained jar... OK, I better stop, I'm getting too excited here!


Conclusion

This concludes "Chapter 1." We made a start, which is the most important thing. We came up with a design. Hopefully it'll hold up. We'll find out soon enough.

Like when you are reversing your car towards a wall, you find out very quickly when you've gone too far. Rather that than driving in the wrong direction for hours on end, and not knowing it.

Let's hope for accidents of the former "loud and clear" type, rather than misdirections of the latter, later type.

In the next chapter, we might refine this design before we kick off, so please post comments to that effect.

Then we can start rolling up our sleeves to implement some of the file IO functionality. Uh oh, sounds like work. So we'll google and cut and paste. We are developers after all, with a limited amount of time, so the less code we actually write the better, innit ;)



Comments
Comments are listed in date ascending order (oldest first)

  • A dream come true ;)

    Nice start, I really like the arguments structure allowing a tasklet to be used from another tasklet (and integrating OOP concepts). I really love the Java 5 varargs coding style too, it enables a more developer-friendly interface.

    Just a remark: why put the word "Directory" at the end of your directory-specific methods? Using overloading you can use the same method name for file or folder (just use "remove" instead of "removeFolder").

    Really interesting and waiting for the next chapter, don't hesitate to let us know if you want to start an open-source project.

    Alois

    Posted by: alois_cochard on May 24, 2006 at 02:31 AM

  • Thanks for your comments, Alois! To answer your question, I put "removeDirectory" (maybe "Folder" would be better) rather than differentiating via argument type (eg. TFile vs TDir) just to make it clearer and safer, eg. that it is recursive deletion.

    And I did register a project already, taskingtape.dev.java.net :) But I'm too lazy to split its code out of aptframework.dev.java.net source tree at this early stage. So its package name is aptframework.nano.taskingtape. (The "nano" projects in aptframework are "incubating" (sub)projects that I want to spin out of aptframework at some time in the near future. So they don't depend on packages outside of aptframework.nano, but might depend on aptframework.nano.common, and aptframework.nano.loggerhead, for example.)

    Posted by: evanx on May 24, 2006 at 03:52 AM

  • Dear Evan, Frankly I consider this a joke rather than a serious idea. You should know the famous statement: "Those who do not understand Unix are condemned to reinvent it - poorly". Please don't reinvent the wheel, poorly. By poorly I mean using these "borlandish T-hings", please. Bash + standard unix tools are way more powerful, even ANT is much more powerful - if you cannot live without Java. While the 'backup example' is interesting, your solution has a lot of holes, fixed decades ago by the simple tar utility (what about file permissions, what about symlinks, huge files etc.). This is my opinion. Artur

    Posted by: karaznie on May 24, 2006 at 11:33 AM

  • Thanks for your opinion, Artur. You raise some good points.

    I'm not trying to re-invent Unix, just trying to apply Java to a class of problem, which is "small file and system tasks." This is a class of problem to which Python and Perl have been applied.

    I try to make the point that rather than invent/implement a scripting language, another approach is to implement a support library that achieves similar convenience and functionality, with other advantages, eg. readily toolable using Netbeans and Eclipse, today.

    Bash and shell scripting is great if you know it, but not so great for a student who only knows Java. It's also not so great for CIOs when ex-employees leave scripts behind that have grown into unreadable, unmaintainable and unmanageable monsters, on which your organisation's infrastructure now depends, and which keep jumping out of wardrobes to give young programmers nightmares.

    On Windows, you can't assume that bash (or zsh) is available. Even though on Windows machines it's possible to install Cygwin, it's not necessarily an acceptable requirement that Cygwin be installed. (Actually C# shell will become the preferred shell on Windows Vista I imagine.)

    Having said that, it is possible to bundle and use zsh.exe on windows and that is a great solution for scripting on Windows, together with rsync, zip and a few other utilities. Also, there is msys.

    I argue that these are legacy solutions, and there is space for a modern Java solution, for the convenience of Java programmers. At this stage it's an academic exercise. A fool's errand, as I said ;)

    GNU/unixversal utilities like tar et al, are great, but you usually gotta program them in an extremely limited and highly fragile scripting language (bash).

    In this blog article, I specifically exposed tarProcess, findProcess et al (see TProcessHelper), hinting that we intend to integrate these utilities, maybe by exec'ing them, and/or maybe via portable (Java) libraries where those are available (eg. for ZIP and GZIP).

    In general I think that Java/Netbeans/Matisse is a great tool for building front-ends to such utilities, and also wrapping these utilities into a library that is highly programmable, using Java.

    Finally on the issue of prefix'ing with a T... ;) Hey T is for Task, not TurboC! ;) Altho I was a big TurboC user in my youth, but not Windows/OWL stuff, just DOS stuff, which didn't have any libraries starting with T as I recall! ;)

    Actually I use a prefix letter for all my projects around aptframework.dev.java.net. Greenscreen classes start with G, Meme with M, Vellum with V, Taskingtape with T, other nano projects with N, Webservlet with B (I think, I'm gonna change that project name to Blackboard :) Oh and the aptcomponent classes with Z.

    Actually I like this approach because it's clear which layer and project the class belongs to, and avoids namespace clashes, eg. GContext vs MContext. Like in G classes, there shouldn't be any references to Z classes, because the Z application is built on the G framework.

    I'm a Swing developer, and Swing got me started on this with their J prefix :)

    In this example, what would we call our TFile and TProcess so as not to conflict with java.io.File and java.lang.Process?

    Posted by: evanx on May 25, 2006 at 06:47 AM


