
Interview with lambdaj creator - meet Mario Fusco at the Java.Net Community Corner

Posted by fabriziogiudici on June 1, 2009 at 11:08 AM PDT
Functional programming has been a hot topic recently, to the point that a number of “new” languages such as Scala, Haskell or Erlang have been brought to people's attention (at least, those attending conferences or reading blogs). By the way, the topic is not as new as one might expect: for instance, look at this DDJ article from back in 2005. But, as often happens, new topics require some time to gain the attention of the masses.

Indeed, I think that functional programming has some pretty neat stuff and, carefully used, could be a good way to improve the quality of your code. Functional programming is also the right topic to think of when you hear about “closures in the Java language”. The basic difference is that “functional programming” is the concept, which you should learn and master, while “closures” are just a tool to implement it: one of the possible tools, and for this reason not as important as the concept itself. In fact, a number of solutions for good old plain Java, which of course don't use closures, have been developed (just a random reference to an existing product: Functional Java).

I've been interested in using some functional stuff in Java for some time, but honestly this topic has slipped below other, higher priorities so far. One of the first places where I'd use functional programming is the collections area (lists, maps, sets, whatever), as I often find myself rewriting similar code to extract stuff out of a list, or index a List into a Map, and so on. One of the things I didn't like about solutions such as Functional Java is that they replace the java.util.* classes such as List or Map. I've always placed a high value on the Java Collections library, since it is a shared implementation of common stuff: you use List, Map and so on, and chances are pretty high that whatever third-party library you're going to integrate in your project also uses them. Having to replace them, and introduce adapters, never sounded good to me.

But now it looks like you have another choice. A new library, named lambdaj, offers a neat way to work in a functional style with the existing java.util.* stuff. To learn more about lambdaj I'm interviewing Mario Fusco, the author, from JUG Lugano (but, hey, Mario is italianissimo ;-). Before starting, I'd like to point out to people who are going to attend JavaOne 2009 that Mario will speak in person about lambdaj at the Java.Net Community Corner on Tuesday at 11:30.


Q. Ciao Mario. First, please introduce yourself. Where do you live, where do you work and what are your primary interests?

A. I am a Java programmer and architect with more than ten years of experience. I have worked in Italy, Germany and currently in Switzerland, being involved in (and often leading) many projects in different fields like advertising, e-commerce and insurance.
My main interests in Java technologies are multi-threaded programming and distributed computing, performance optimization, the most important enterprise-level servers and frameworks, functional programming and the emerging Java-compatible languages like Scala, Groovy or JRuby.
I am also one of the leaders of the Java User Group in the town of Lugano, where I have lived for 3 years.

Q. Please introduce lambdaJ. As you're a pretty practical person, I bet you wrote it as a solution to a problem you found in your job. Am I right?

A. Yes, you are. We were working on a project that had a very complex data model, and we found our software was full of pieces of code that did almost the same thing: iterating over collections of our business objects in order to perform basically the same set of more or less complex tasks.

We also found that some of those loops were particularly hard to read. Developers spent more time trying to figure out what a given loop did than writing the loop itself.

For this reason I started writing some utility methods and implementing a small DSL with the purpose of improving code readability. The guys on my team liked this approach and felt comfortable with it almost immediately.

But when they started asking “How could I write this in your collections language?” or “Could you add that feature to your collections thing?” I realized I was writing something that could have a very broad and general field of application. So I decided to refactor it into an independent library outside the original project.

Of course the last step was to find a name for that library, since in my opinion “Mario's collection thing” was not good enough. I chose lambdaj because to develop it I used some functional programming techniques that in turn derive from lambda calculus.

Q. Can you give us some examples of how lambdaJ will improve our code?

A. There are mainly two ideas behind lambdaj. The first is to treat a collection as if it were a single object, by allowing a single method invocation to be propagated to all the objects in the collection, as in the following example:
forEach(personsInFamily).setLastName("Fusco");
The second is to let you define a pointer to a Java method in a statically typed way, so that you can call the API offered by lambdaj using that method as an argument, as follows:
List<Person> sortedPersons = sort(persons, on(Person.class).getAge());
Compare this last statement with the piece of code that you have to write to achieve the same result using the API of the Java Collections framework:
List<Person> sortedPersons = new ArrayList<Person>(persons);
Collections.sort(sortedPersons, new Comparator<Person>() {
    public int compare(Person p1, Person p2) {
        return p1.getAge() - p2.getAge();
    }
});
I think the improvement is evident both in effectiveness while writing the code and in clarity while reading it.
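
To make the first idea more concrete, here is a minimal sketch of how a single call could be propagated to every element of a list using the JDK's own java.lang.reflect.Proxy. It is only an illustration under stated assumptions, not lambdaj's actual implementation: it assumes Person is an interface (JDK proxies cannot wrap classes), and the names ForEachSketch, SimplePerson and forEachOf are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ForEachSketch {

    // Hypothetical domain interface: JDK dynamic proxies can only implement interfaces.
    interface Person {
        String getLastName();
        void setLastName(String lastName);
    }

    static class SimplePerson implements Person {
        private String lastName;
        SimplePerson(String lastName) { this.lastName = lastName; }
        @Override public String getLastName() { return lastName; }
        @Override public void setLastName(String lastName) { this.lastName = lastName; }
    }

    // Returns a proxy that forwards each call to every element of the list.
    // Intended for void methods such as setters: the proxy always returns null.
    @SuppressWarnings("unchecked")
    static <T> T forEachOf(Class<T> type, List<? extends T> items) {
        InvocationHandler handler = (proxy, method, args) -> {
            for (T item : items) {
                method.invoke(item, args);
            }
            return null;
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    public static void main(String[] args) {
        List<Person> personsInFamily = new ArrayList<>();
        personsInFamily.add(new SimplePerson("Rossi"));
        personsInFamily.add(new SimplePerson("Bianchi"));

        // One call on the proxy updates every person in the list.
        forEachOf(Person.class, personsInFamily).setLastName("Fusco");

        for (Person p : personsInFamily) {
            System.out.println(p.getLastName());
        }
    }
}
```

Unlike the lambdaj call shown above, this sketch needs the element type passed explicitly, which keeps the proxy creation trivial.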

Q. It is definitely much better! Let me open a small, personal bitching parenthesis: IMHO this demonstrates that people can improve the way they code more by looking at their skills than by complaining about missing features in the language. Mario has obtained that “done-in-one-line-of-code” thing that seemed to unavoidably need closures. Personal parenthesis closed. Can you give us another example? For instance, a typical thing that I often do is retrieve a List of items (e.g. from a database query) and then copy them into a Map, indexed by a certain property, so I can speed up further operations (such as lookups by a UI).

A. This task, too, can be made very easy by using lambdaj, as follows:
Map<String, Person> personsByLastName = index(persons, on(Person.class).getLastName());
Actually lambdaj can do much more. In this example there's a good chance that the list contains two or more persons with the same last name. In that case the indexed map keeps only one person per last name and you lose the others. Probably what you need instead is to group the persons by their last name. Even in this case lambdaj can help you, as follows:
Group<Person> personsByLastName = group(persons, by(on(Person.class).getLastName()));
Then, to find the persons in the Group with a given last name, you can write:

List<Person> personsInMyFamily = personsByLastName.find("Fusco");

Note that to achieve the same result in plain Java you would have to write something like this:

Map<String, List<Person>> personsByLastName = new HashMap<String, List<Person>>();
for (Person person : persons) {
    String lastName = person.getLastName();
    List<Person> personsWithGivenName = personsByLastName.get(lastName);

    if (personsWithGivenName == null) {
        personsWithGivenName = new ArrayList<Person>();
        personsByLastName.put(lastName, personsWithGivenName);
    }
    personsWithGivenName.add(person);
}
List<Person> personsInMyFamily = personsByLastName.get("Fusco");
Once again the difference is evident, don't you think? Finally, lambdaj groups are even more powerful, since you can group your collection by more than one property. For example, you may want to group your list of persons by their last name and then by their first name. In lambdaj you could do that as follows:
Group<Person> personsByLastName = 
    group(persons, by(on(Person.class).getLastName()), by(on(Person.class).getFirstName()));
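
For comparison, the two points made above (an index keeps one entry per key, and grouping can be nested on a second property) can be sketched in plain Java. This is an illustration only: the class and method names are hypothetical, the nested map is just one possible plain-Java counterpart to a two-level Group, and it uses Map.computeIfAbsent, which requires Java 8 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexVsGroupSketch {

    static class Person {
        final String firstName;
        final String lastName;
        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // One value per key: in this sketch a later person with the same
    // last name overwrites the earlier one.
    static Map<String, Person> indexByLastName(List<Person> persons) {
        Map<String, Person> index = new HashMap<>();
        for (Person p : persons) {
            index.put(p.lastName, p);
        }
        return index;
    }

    // Two-level grouping: last name first, then first name.
    static Map<String, Map<String, List<Person>>> groupByLastThenFirstName(List<Person> persons) {
        Map<String, Map<String, List<Person>>> groups = new HashMap<>();
        for (Person p : persons) {
            groups.computeIfAbsent(p.lastName, k -> new HashMap<>())
                  .computeIfAbsent(p.firstName, k -> new ArrayList<>())
                  .add(p);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Person> persons = new ArrayList<>();
        persons.add(new Person("Mario", "Fusco"));
        persons.add(new Person("Anna", "Fusco"));
        persons.add(new Person("Luca", "Rossi"));

        // The index holds 2 entries for 3 persons: one "Fusco" was lost.
        System.out.println(indexByLastName(persons).size());
        // The group keeps both first names under "Fusco".
        System.out.println(groupByLastThenFirstName(persons).get("Fusco").keySet());
    }
}
```
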
Q. Great! One more example, please. We've shown a couple of “stateless” manipulations. What if I have to compute the sum of a certain property in a list of items?

A. You could achieve this result in lambdaj in two different ways:
int totalAges = sum(persons, on(Person.class).getAge());
or
int totalAges = sumFrom(persons).getAge();
which, as you can see, reflect the two main ideas on which lambdaj is built. Actually I think that the sumFrom() method can be used effectively in another very common case. Suppose you have a list of beans with the properties a, b and c that need to be shown in a table. What you usually do is bind each bean in the list to a given row in the table and populate the cells corresponding to the columns A, B and C by invoking bean.getA(), bean.getB() and bean.getC() respectively. Well, if you now need to display a row with the totals of those values you could do something like this:
Bean totalizerBean = sumFrom(beans);
and then you can populate the totals row of your table just by binding the totalizerBean like any other bean, invoking totalizerBean.getA(), totalizerBean.getB() and totalizerBean.getC().
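
The totalizer idea can be sketched with a JDK dynamic proxy whose getters return the sum of the same getter over all the beans. Again, this is an illustration under assumptions rather than lambdaj's code: Bean is assumed to be an interface, only int getters are supported, and the names SumFromSketch and SimpleBean, as well as the extra Class argument of sumFrom, are hypothetical.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class SumFromSketch {

    // Hypothetical bean interface with two numeric properties.
    interface Bean {
        int getA();
        int getB();
    }

    static class SimpleBean implements Bean {
        private final int a;
        private final int b;
        SimpleBean(int a, int b) { this.a = a; this.b = b; }
        @Override public int getA() { return a; }
        @Override public int getB() { return b; }
    }

    // Returns a "totalizer": calling a getter on it sums that getter over all items.
    @SuppressWarnings("unchecked")
    static <T> T sumFrom(Class<T> type, List<? extends T> items) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type },
                (proxy, method, args) -> {
                    if (method.getReturnType() != int.class) {
                        throw new UnsupportedOperationException(
                                "this sketch only sums int getters: " + method.getName());
                    }
                    int total = 0;
                    for (T item : items) {
                        total += (Integer) method.invoke(item, args);
                    }
                    return total;
                });
    }

    public static void main(String[] args) {
        List<Bean> beans = new ArrayList<>();
        beans.add(new SimpleBean(1, 10));
        beans.add(new SimpleBean(2, 20));
        beans.add(new SimpleBean(3, 30));

        Bean totalizerBean = sumFrom(Bean.class, beans);
        System.out.println(totalizerBean.getA()); // 6
        System.out.println(totalizerBean.getB()); // 60
    }
}
```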

Q. What about performance? And ease of debugging? You're clearly using dynamic bytecode manipulation and I know that some people, when they hear about CGLIB and similar stuff, get worried.

A. Of course there is a trade-off in using lambdaj, and in some cases performance could be an issue. I tried to analyze it by comparing the time lambdaj needs to achieve a given task with the time spent getting the same result iteratively. I empirically found that lambdaj is about 3 times slower each time it needs to invoke a method on an object proxied by lambdaj. This means that in the earlier sorting example you could expect lambdaj to be 6 times slower at sorting the given collection of Person, since it needs to use that mechanism twice for each Person comparison.
Actually this result was measured in the case where Person is implemented as a class, and therefore by using CGLIB. If Person were an interface, lambdaj would automatically switch to the native Java proxy mechanism, and I expect that in this case performance could be a bit better, even though I haven't tested it yet. Anyway, even if that doesn't sound so good, I guess that for most of the software you write, the performance penalty you pay for using lambdaj is really small. And I hope the improvements in your code's readability and maintainability outweigh it.
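
To see where that cost could come from, here is a rough sketch of an on(...)-style method pointer built on the native Java proxy mechanism mentioned above. It is an assumption-laden illustration, not lambdaj's implementation: Person is assumed to be an interface, the recorded method is kept in a static field for brevity (so it is not thread-safe), and only int or reference-typed getters are handled. Note how each comparison performs two reflective invocations, one per compared object.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class MethodPointerSketch {

    interface Person {
        int getAge();
    }

    static class SimplePerson implements Person {
        private final int age;
        SimplePerson(int age) { this.age = age; }
        @Override public int getAge() { return age; }
    }

    // Last method called on a recording proxy (static only to keep the sketch short).
    private static Method lastRecorded;

    // Returns a proxy whose only job is to remember which method was called on it.
    @SuppressWarnings("unchecked")
    static <T> T on(Class<T> type) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type },
                (proxy, method, args) -> {
                    lastRecorded = method;
                    // An int getter must not return null; other primitives are not handled here.
                    return method.getReturnType() == int.class ? Integer.valueOf(0) : null;
                });
    }

    // The second argument is ignored: it exists so that on(...).getX() is
    // evaluated, and therefore recorded, before sort runs.
    static <T> List<T> sort(List<T> items, Object recordedCallResult) {
        final Method getter = lastRecorded;
        List<T> sorted = new ArrayList<>(items);
        // Two reflective invocations per comparison.
        sorted.sort((a, b) -> compareValues(invoke(getter, a), invoke(getter, b)));
        return sorted;
    }

    @SuppressWarnings("unchecked")
    private static int compareValues(Object left, Object right) {
        return ((Comparable<Object>) left).compareTo(right);
    }

    private static Object invoke(Method method, Object target) {
        try {
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Person> persons = new ArrayList<>();
        persons.add(new SimplePerson(40));
        persons.add(new SimplePerson(25));
        persons.add(new SimplePerson(33));

        List<Person> sortedPersons = sort(persons, on(Person.class).getAge());
        for (Person p : sortedPersons) {
            System.out.println(p.getAge()); // 25, 33, 40
        }
    }
}
```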

Q. Fair. From what I see, performance shouldn't be a big problem here. From an architectural point of view, we're not saying we're replacing database queries over tens or hundreds of thousands of records; SQL/JPQL/whateverQL is still the best option there. I'm longing to try lambdaj in blueMarine, and from what I can see in advance it won't hurt my performance in a noticeable way. In any case, the closure guys could probably still have an argument if they absolutely need the maximum speed (but - hey - I'd like to know how much faster a closure-based solution is). And don't forget that lambdaj is a new thing, so Mario and the other authors could find some optimizations in the future. By the way: what are your future plans? Do you have a roadmap with new features for lambdaj?

A. Honestly there isn't any well-defined roadmap. As I said before, lambdaj is the by-product of a project on which my team and I are working. So, at the moment, it contains more or less only the features that we found immediately useful during the development of that project. I'm sure that there are a lot of outstanding features that could be implemented following the same philosophy. That's why I would like to have the help and suggestions of the Java developer community in order to define which features are missing from lambdaj.


Thank you Mario and enjoy JavaOne!
Comments
Comments are listed in date ascending order (oldest first)

LambdaJ looks definitely nice. But what I'm missing (also on its homepage) is some information about its limits. I guess the approach breaks as soon as you hit a final class, which means you couldn't do: ... on(Person.class).getLastName().toLowerCase() ..., right? Somehow I have the feeling it's a (really very cool, but limited) workaround for the real solution - closures.

You're right. Lambdaj doesn't work when you meet a final class, as in your example. But of course you can find some workaround to "help" lambdaj work as you expect. To fix the problem you're pointing out, it's enough to add the method getLowerCaseLastName() to the Person class. In the end, if your business logic needs to do something with the last name in lower case, maybe it is not a bad idea to add that method to your domain model in any case. As for the lack of closures in Java, I wouldn't like to start a flame war here. I like closures and, as you can imagine, I developed some features of my library with them in mind. I'm currently using some languages that support them, like Scala, and I feel very comfortable with them. At the same time I think that Java wasn't conceived and engineered to support closures, and adding them at a later stage, as some are suggesting, could create more problems than it solves, as Joshua Bloch already pointed out in a presentation more than a year ago.

First, I'd like to stress that the comment on closures was marked as a personal parenthesis, so it's mine and not Mario's. And in any case, Mario needs feedback about lambdaj, not yet another discussion about closures ;-) In any case, right or wrong, lambdaj is here as a viable option; closures in Java aren't and, as far as we know, won't be here for years. Second, right, CGLIB and similar stuff have a problem with final. It's something that unfortunately occurs in many other standard frameworks based on the same technique, such as JPA. I wouldn't die over having to remove final for this reason (even though I usually advocate using final as much as possible, you have to accept some trade-offs).