DSLs--Standalone or Embedded?
Posted by cayhorstmann on December 19, 2007 at 09:37 PM | Comments (9)
I am a reviewer for JavaOne. I have about 350 project proposals to plow through and not enough time to do each of them justice.
One submitter proposed a talk on "Doing your own language". The outline talked about parsing, abstract syntax trees, and generating code. I flippantly commented "A DSL is not a DYOL. You use a DSL precisely because you DON'T want to write another parser."
Of course, what I had in mind was the recent trend of embedding DSLs in programming languages such as Groovy and Scala. (See this blog for Groovy.)
The submitter happened to be a reviewer in another JavaOne track, so he was able to see my comment, and he was not happy. He emailed me: "I feel strongly that this is mistaken. Most DSL's originate with domain experts, who become weary of using general-purpose languages to work within their domain concepts and patterns. Many of us are the beneficiaries of such efforts, but DSL's don't spring forth from nothingness."
Now here is my thinking. Technically speaking, the submitter is right. A DSL is simply a domain-specific language, and the term does not imply an implementation strategy. You can use a parser and code generator, as was customary in the past, or you can embed the language in a larger host language, by rigging the metaobject protocol, doing tricks with closures and operator overloading, or whatever.
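To make the embedded route concrete, here is a minimal sketch of a tiny rule language built from operator overloading and closures. Python stands in for the host language only for brevity (the hosts I actually have in mind are Groovy and Scala), and the names Rule, Field, and suspicious are hypothetical, invented for this illustration.

```python
class Rule:
    """A predicate over a record (a dict). Each Rule wraps a closure."""

    def __init__(self, test):
        self.test = test

    def __and__(self, other):
        # Combine two rules; the lambda closes over both operands.
        return Rule(lambda rec: self.test(rec) and other.test(rec))

    def __or__(self, other):
        return Rule(lambda rec: self.test(rec) or other.test(rec))

    def __invert__(self):
        return Rule(lambda rec: not self.test(rec))

    def __call__(self, rec):
        return self.test(rec)


class Field:
    """Hypothetical helper: names a field and turns comparisons into Rules."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Rule(lambda rec: rec[self.name] > value)

    def __eq__(self, value):
        # Overloaded on purpose: returns a Rule, not a bool.
        return Rule(lambda rec: rec[self.name] == value)


amount = Field("amount")
country = Field("country")

# The "DSL" is just a host-language expression; no parser was written.
suspicious = (amount > 10_000) & ~(country == "US")

print(suspicious({"amount": 25_000, "country": "CH"}))
print(suspicious({"amount": 25_000, "country": "US"}))
```

The parentheses around each comparison are forced by the host language's operator precedence. That is the kind of marginal clunkiness you accept in exchange for reusing the host's editor, debugger, and compiler.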
I just think that the embedded approach should be the approach of choice when it is at all feasible. After all, the toolbuilders are busy giving us tools for JRuby, Groovy, and Scala. If your DSL is contained in one of these languages, you get to leverage those tools. If, on the other hand, you start a new language from scratch, you might get a marginally prettier syntax, but you now have to build up a whole tool chain.
Consider JavaFX Script. (Doesn't that just roll off your tongue?) It has a bunch of nice features for building GUIs, such as the bind and dur operators. But it is yet another programming language. People need to learn it, they need to build tools for it, they need to learn how to use those tools. Sadiya Hameed, one of my graduate students, is implementing the key features of JavaFX Script in Scala and Groovy. The syntax is a bit clunkier, but not by much. Either of them would have made a fine host for an FX DSL.
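As a rough illustration of how a binding feature can live in a library rather than in a new language, here is a minimal sketch. It is not JavaFX Script's implementation, nor my student's code. Var, on_change, and bind are hypothetical names, the explicit dependency list is a simplification of this sketch, and Python is used only as a stand-in host.

```python
class Var:
    """An observable value that notifies listeners when it changes."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        # Iterate over a copy so listeners may register new listeners.
        for listener in list(self._listeners):
            listener()

    def on_change(self, listener):
        self._listeners.append(listener)


def bind(compute, *deps):
    """Return a Var that is recomputed whenever any dependency changes."""
    target = Var(compute())

    def refresh():
        target.set(compute())

    for dep in deps:
        dep.on_change(refresh)
    return target


width = Var(200)
margin = Var(10)

# inner tracks width and margin; the closure is re-run on every change.
inner = bind(lambda: width.get() - 2 * margin.get(), width, margin)

initial = inner.get()
width.set(300)
after_width = inner.get()
margin.set(25)
after_margin = inner.get()
print(initial, after_width, after_margin)
```

The call to bind is wordier than a dedicated operator would be, but it is ordinary host-language code that existing tools already understand.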
What do you think? Am I being naive? Is it just too awkward to shoehorn DSL syntax into the Procrustean bed of a host language? Is it crazy to have a program with ten mutually incomprehensible DSLs in Scala or Groovy? Or has the time come to say "no" to building more parsers?
Comments
-
If the talk is well-balanced, it would discuss the different strategies (from scratch and by extending languages such as Groovy or even building on platforms such as the JVM or other runtimes).
But your reference to JavaFX Script is interesting. Effectively, JavaFX Script's functionality could have been implemented in Groovy. However, you have to look at the purpose of the DSL: in JavaFX's case, it's to write a high-performance compiled language, and so some of the overhead in more general-purpose, DSL-capable languages would just slow it down. Such overhead is being optimised away as these technologies evolve (Groovy's getting faster and faster, especially in normal use-cases), but it's just not necessary for JavaFX.
I would therefore suggest that business-oriented DSLs are best written in languages such as Groovy, whereas in some performance-critical scenarios with a limited number of typical use-cases, the talk in question can still be of interest.
The talk could also be of general interest to those who want to improve their general knowledge. For example, understanding how processors work at a hardware and assembly level isn't necessary in modern languages, but such awareness can help write more correct and efficient code (I'm not referring to "optimisation", just avoiding bad design). In the same way, understanding how languages are structured, parsed, and ultimately transformed into executing code can be useful.
The middle ground is of course covered by Antlr, JavaCC, et al., which transform DSLs into an existing language (in most cases, unless you want to access the Abstract Syntax Tree). That's useful when you don't need to do it all. If you design a DSL in a very capable language, then you generally have access to all the power of that language, whereas in your own controlled DSL-based execution environment, you can lock things down to avoid the end-user trying to be too smart and screwing things up.
- Chris
Posted by: chris_e_brown on December 20, 2007 at 12:43 AM
-
I believe you are wrong -- both external and internal DSLs are valid options, and both have their strengths and weaknesses. I prefer internal DSLs, too; but sometimes the domain concepts are very simple and the benefits in readability of an external DSL (which is free from the constraints of the host language) may be worth the additional effort in terms of tooling.
Posted by: stefantilkov on December 20, 2007 at 01:37 AM
-
The decision to create a DSL shouldn't be taken lightly -- it is an investment that should be made only if a reasonable return can be realized (productivity, maintainability, quality, performance, etc.). If the result is "ten mutually incomprehensible DSLs" something has gone very wrong!
I've done a few DSLs over the years, both as standalone languages and as extensions of existing languages. The most successful, in terms of a reasonable return on the investment I mentioned above, have always been the standalone languages. Existing languages impose many syntactic and semantic constraints that make the resulting DSL more awkward than its standalone equivalent.
That being said, I did one project where we were in an extreme rush (large financial penalties if we failed to deliver on time), which meant we didn't really have the time to create a standalone DSL, even though that would have been ideal. Instead we went the language extension route, which got the job done in time. Such choices are always a set of trade-offs, so it is simplistic to state categorically "all DSLs should be {standalone | embedded}!"
Your grad student's experience implementing key JavaFX features in Scala and Groovy is interesting. Perhaps this suggests that FX isn't the wondrous breakthrough that it is billed (by some) as, but more of an incremental improvement in recent languages? Or perhaps it is those relatively minor improvements in syntax that are of greatest value to developers? This question, of course, brings us around full circle. :-)
Posted by: rtenhove on December 20, 2007 at 05:48 AM
-
Cay, I agree with you on this one. It's hard to justify creating a new domain language these days. I used to do it a lot back in the day when object oriented languages weren't mainstream. Giving people something that made their jobs easy was fun (still is). Plus I was a punk kid at the time and it gave me an outlet.
For the last fifteen years I've looked at almost all domain language work and asked, "Why didn't you just give us a good object library for [Ada|Objective C|Python|Java|Groovy|Ruby|Scala]?" It's pretty rare for me to spot the need for a new language. Most domain-specific languages are very procedural in their bent. Something in the zoo of existing procedural languages should do the job.
My last run-in with a domain language was with Graphviz's dot. Someone SWIGged the Graphviz C code. Now I've got the ability to call Graphviz via a Java interface.
I use JavaServer Pages as my example of a successful domain-specific language. First, it has a wide audience. Second, that audience doesn't already know another procedural language. Third, that audience does know a similar presentation language already (HTML), and JSP takes advantage of that. Fourth, it presents a solution to the problem in an abstraction much closer to the problem's domain; JSPs look a lot more like HTML than the equivalent Java code. It's a good fit, and has carried its proponents for like eight years.
I'm reserving judgement on JavaFX Script. As Chet Haase pointed out in his JavaOne animation talk last year, one line of Java can do what one line of JavaFX Script can do. It doesn't seem to buy much. (It has added pressure to make Swing a better library for Java coders.)
I would like to see more declarative, rules-style domain-specific languages. Last time I looked, SQL was the most used of the lot, and nothing's caused the sort of revolution there that Smalltalk and Ada did for procedural programming. Otherwise, why build a parser?
Posted by: dwalend on December 20, 2007 at 07:08 AM
-
Also, it is way over the line for a JavaOne talk reviewer to lobby for his own talk from the inside. What's that about?
Posted by: dwalend on December 20, 2007 at 07:12 AM
-
"Or has the time come to say 'no' to building more parsers?"
"You use a DSL precisely because you DON'T want to write another parser"
I think both statements miss the boat. No sane person defines a DSL and then writes their own parser for it ... that's what lex/yacc/antlr/etc. are for. But you need to understand parsing and ASTs if you are going to do something useful with those tools. I agree with the presenter that the reason you gave for rejecting the presentation makes no sense.
"I just think that the embedded approach should be the approach of choice when it is at all feasible"
Right tool for the right job. Admittedly, I can't think of many jobs where cooking your own DSL makes sense.
Posted by: jadonohu on December 20, 2007 at 10:44 AM
-
I think this last part of this last comment hits the nail on the head: whatever the merits of how you choose to build a DSL (and IF you should build one, of course), it is a very specialized activity, and so not likely to be of interest to JavaOne's developer audience. That is certainly, IMHO, a fair assessment for purposes of winnowing all those JavaOne submissions!
Posted by: rtenhove on December 20, 2007 at 12:36 PM
-
Fowler defines internal, embedded, and external DSLs. For internal DSLs you need quite a flexible and powerful language (which is often dynamic); Lisp and Ruby are perfect examples. Embedded languages represent an obvious mismatch with the host and can thus only be represented within a string token of the host language; SQL, JPA Query, and Expression Language are examples of this. And finally there are external DSLs, which typically include such utilities as SED and AWK and which require parser generators to create.
I am not particularly fond of the latest trend to use embedded DSLs in Java, basically by misusing Java's String token to add stuff that can only be parsed and validated at runtime.
Sure, it has worked reasonably well for SQL, but apart from the issue of SQL injection, it really doesn't scale well. It is easy in a corporate environment to end up with 20+ lines of SQL inside a String token that you've tried to organize and format so it's still readable. Maintaining these is a nightmare, for you constantly have to switch between an SQL debugging tool or executor (no double quotes) and your Java environment (double quotes).
Thus I think it's a mistake to see JPA Query language and JSR-295 binding expressions go that way. I much prefer allowing internal DSLs by having a flexible and powerful language. This is the approach taken by C#, which allows very nice type-safe querying in LINQ. I have no illusions about that, though, for the Java community is far too conservative and divided to ever allow this.
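A minimal sketch of the contrast, assuming a made-up orders table: the query is composed from host-language expressions instead of being written inside a string, and values travel as bound parameters rather than being spliced into the SQL text. Column, Condition, and select are hypothetical names for this illustration, not any library's API. Python is dynamically typed, so the sketch shows the internal-DSL shape but not the compile-time checking that LINQ adds.

```python
import sqlite3


class Condition:
    """A SQL fragment together with its bound parameters."""

    def __init__(self, sql, params):
        self.sql = sql
        self.params = params

    def __and__(self, other):
        return Condition(f"({self.sql} AND {other.sql})",
                         self.params + other.params)


class Column:
    """Hypothetical helper: comparisons produce Conditions, not booleans."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Condition(f"{self.name} = ?", (value,))

    def __ge__(self, value):
        return Condition(f"{self.name} >= ?", (value,))


def select(table, where):
    """Return (sql, params) ready for a DB-API execute call."""
    return f"SELECT * FROM {table} WHERE {where.sql}", where.params


# Throwaway in-memory database so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 120.0), ("bob", 40.0), ("alice", 15.0)])

customer, total = Column("customer"), Column("total")
sql, params = select("orders", (customer == "alice") & (total >= 100))
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

The query text never appears as a hand-written string literal, so there is no switching between quoting conventions, and user-supplied values cannot change the shape of the statement.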
Posted by: mrmorris on December 21, 2007 at 06:42 AM
-
_
Posted by: bowlingtips on June 12, 2008 at 12:32 PM