The Source for Java Technology Collaboration



Michael Champion's Blog

Community: Java Web Services and XML Archives


XML 2003 Reflections - Adam Bosworth Keynote

Posted by mchampion on December 19, 2003 at 06:58 AM | Permalink | Comments (8)

[Another look back at the XML 2003 conference last week. I feel sort of blogspherically incorrect in waiting a week to write down these thoughts, but I wanted to let them bounce around a bit, and look at what others wrote.]

Adam Bosworth of BEA delivered the opening keynote address on Wednesday. He started by reminding us of the dream that XML geeks shared back in 1998: Information should not get lost in presentation. Actual XML practice has to some extent diverged into two separate streams -- documents on one hand, and application data on the other -- but together they have helped take back the world from the "hideous complexity and fragility" of information presented in .DOC, .EXE, etc. files.

I started to summarize the talk itself, then came upon Matt May's excellent near-transcript. I'll just elaborate on a few points that resonated with me.

First, Bosworth notes that one of the negative aspects of the current XML world is that the KISS (Keep It Simple, Stupid) principle is being widely ignored, as XML and Web services technologies are becoming extremely complex. Bosworth has pointedly noted the complexity of W3C XML Schema and XQuery in the past; this time he said something like "there's only one guy in our company who *really* understands WSDL" ... and "we don't need all these layers of coordination/orchestration specs on top of SOAP, we need something like a 'SOAP cookie'." It's heartening to see more and more people pick up on the theme that the XML family of specifications is too complex, since I've been beating this drum for a long time now. It's definitely time for some serious refactoring of the family of XML specs, and that won't happen until more people of Bosworth's stature start telling the unpleasant truths about what a few thousand person-years of Design By Committee will do to a technology that used to be simple.

The video clip from Bosworth's presentation in Jon Udell's weblog has my favorite quote from the talk. After admitting that 50 people would come up with 75 different ways of building distributed systems, he said: "Some will win, some will lose. That's just fine. It's called evolution. It works really well. It's why we're all sitting here today." Maybe a bit of Darwinian Refactoring is what is called for on this stuff.

Another major theme in the keynote was that XML developers are asked to use "APIs from Hell." For example, a programmer working with a purchase order in XML format must deal with events, or child/sibling nodes in a tree, rather than application-level concepts such as products and quantities. Hmm, that's a posting in and of itself, because it ties in with a town hall meeting on storing/querying XML that turned into a discussion of XML APIs. More later.
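To make the "APIs from Hell" complaint concrete, here is a minimal sketch of what summing the quantities in a purchase order looks like with the standard DOM API. The order format, the element names, and the class name are my own hypothetical illustration, not anything Bosworth showed; the point is only how much child/sibling navigation surrounds one line of application logic.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PurchaseOrderDom {
    // Hypothetical purchase order; the vocabulary is an assumption for illustration.
    static final String ORDER =
        "<order>"
      + "<item><product>widget</product><quantity>3</quantity></item>"
      + "<item><product>gadget</product><quantity>2</quantity></item>"
      + "</order>";

    // Adds up every <quantity> found under an <item> child of the root element.
    static int totalQuantity(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        int total = 0;
        // Walk the root's children as siblings, skipping anything that is not an <item> element.
        for (Node item = doc.getDocumentElement().getFirstChild();
             item != null; item = item.getNextSibling()) {
            if (item.getNodeType() != Node.ELEMENT_NODE || !"item".equals(item.getNodeName())) {
                continue;
            }
            // Walk each item's children the same way, looking for <quantity>.
            for (Node field = item.getFirstChild(); field != null; field = field.getNextSibling()) {
                if (field.getNodeType() == Node.ELEMENT_NODE
                        && "quantity".equals(field.getNodeName())) {
                    // The only line that is actually about purchase orders.
                    total += Integer.parseInt(field.getTextContent().trim());
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(totalQuantity(ORDER)); // prints 5
    }
}
```

An application-level API would let the programmer write something closer to "for each item, add its quantity" and never mention nodes at all.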

Probably the most unconventional topic Bosworth spent time on was the importance of getting XML data models and APIs suitable for handling the synchronization of intermittently connected devices to Web-based master databases or applications. He noted that he spends much of his business week using only his Blackberry device. Effective use of such web-enabled but slow and UI-challenged devices will require better synchronization tools: queries are difficult to generate with a handheld UI, and the devices' limited bandwidth (if they are connected at all) means that queries must be highly optimized to return only the information the user really wants. Since this is so far beyond the state of the art, it may be easier for the device to anticipate the types of data the user will want and trickle that information into the device in advance, as bandwidth is available, rather than on demand.

Bosworth presented a slide outlining a "Mobilized Data Model" that might help guide work on this. I suspect that many other listeners were also intrigued, but a bit mystified by this -- a planned demo wasn't ready in time for the conference. It has, however, generated some interesting discussion on the Web, not so much on the still-fuzzy details of the data model, but on the deeper issues. For example, Tim Bray notes that he is skeptical of the basic assumption behind the widespread need for synchronization, since "the trend is clear: anyone who wants to will be able to have a fast pipe that's always on." I'm not sure which side I come down on, but there are definitely two world views here: those who assume that there will be a significant amount of data on the handheld device that must be synchronized with a central repository, and those who assume that in general the remote device will be able to query the central repository for the data it needs for a given task. Bosworth did offer one data point that might argue against Bray's position: even in the best-served places in the world, a GPRS (General Packet Radio Service -- used for "always on" connection to the Internet by mobile devices) request takes 1 second to fulfill at best, and more if any significant amount of data is transmitted. Unless we get a WiFi network that is as extensive as the GPRS network today, it's not clear that the "fat pipe" assumption is realistic.

Another point that Bosworth has explored, not so much in his speech but in his weblog: how can one address the difficult challenge of synchronization without falling into the trap of complexity that will not scale to the Internet, or even work on a device with limited power and bandwidth? He's asking a lot of questions about REST in the weblog and getting lots of answers, but one gets the impression that they are not satisfactory.

Inspired by a posting by Vanessa Williams, I'll put in a plug for the ideas behind JXTASpaces (sort of tuplespaces/Javaspaces using XML rather than objects) as a way of bridging the gap between the web services ideas that Bosworth talked about and the REST stuff that intrigues him. (Williams doesn't make the link to Bosworth in the posting, but has done so privately.) Kimbro Staken picks up the thread and mentions how an XML DBMS that supports XPath can make the template lookup feature of tuplespaces easy to implement if XML documents are the "tuples." I think they are definitely on to something here -- see Robin Cover's summary of some technologies and discussions, including some comments by me. The one point that's most relevant to synchronizing mobile devices is that coordination via "spaces" allows loose coupling in time as well as space -- not only can components in a distributed system employ different platforms, languages, and native data formats, they don't even have to be running at the same time. On the other hand, product ideas such as Ruple went nowhere and even the open source JXTASpaces project is not exactly thriving. Is this just an idea whose day has not yet come, perhaps like hypertext before HTML? Or is there not really a "there" there? I'm pretty sure that if any application domain is tailor made for an XML Spaces approach, it's intermittently connected mobile devices!
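Staken's point about XPath as the template-matching mechanism can be sketched in a few lines. What follows is a toy, in-memory illustration using the standard javax.xml.xpath API; it is not the JXTASpaces or Ruple API, and the class name, method names, and order documents are all hypothetical. A real space would add persistence, blocking reads, and distribution.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XmlSpace {
    // Each "tuple" is simply an XML document held as text.
    private final List<String> tuples = new ArrayList<>();
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    // A producer drops a document into the space and can then disconnect.
    void write(String xmlDocument) {
        tuples.add(xmlDocument);
    }

    // Non-destructive lookup: the template is an XPath expression, and a
    // document matches when the expression evaluates to true against it.
    String read(String template) throws XPathExpressionException {
        for (String doc : tuples) {
            Boolean matches = (Boolean) xpath.evaluate(
                template, new InputSource(new StringReader(doc)), XPathConstants.BOOLEAN);
            if (matches) {
                return doc;
            }
        }
        return null; // nothing matches yet; a real space might block here
    }

    // Destructive lookup: return the first match and remove it from the space.
    String take(String template) throws XPathExpressionException {
        String match = read(template);
        if (match != null) {
            tuples.remove(match);
        }
        return match;
    }

    public static void main(String[] args) throws Exception {
        XmlSpace space = new XmlSpace();
        space.write("<order><customer>acme</customer><total>40</total></order>");
        space.write("<order><customer>initech</customer><total>75</total></order>");
        // A consumer that connects later asks for "any order over 50".
        System.out.println(space.take("/order[total > 50]"));
    }
}
```

The writer and the reader never call each other and never need to be running at the same moment, which is the loose coupling in time described above.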

Finally (just when I thought I was finally done blogging this speech!), Joe Chiusano asks about the apparent contradiction between Bosworth's focus on simplicity and BEA's co-authorship of a somewhat daunting list of Web services specifications. It will be interesting to see if Joe gets an authoritative answer from BEA; my guess is that they are telling it like it is in a recent press release that suggests [at least in my reading between the lines!] that customers are giving them the message that the current way of doing things is too complex.



XML can define agreements, but can also help deal with chaos

Posted by mchampion on July 07, 2003 at 07:05 AM | Permalink | Comments (3)

The contentious world of RSS and the "(not) Echo" project has been featured in a number of java.net weblogs recently by Simon Phipps and Daniel Steinberg. I've been intrigued by RSS for a while because it illustrates both the challenges one faces in the real world in getting agreement on what seems like a simple problem and the ability of XML to provide robust solutions even in the absence of agreement.

One of the biggest issues in the (N)Echo debate is whether a new, presumably improved format is worth the disruption it will cause to established users and software developers. Some are saying "If it ain't broke, don't fix it." -- Tinkering is more likely to break the existing applications such as RSS aggregators than to provide a solid foundation for further development. Stability in formats and protocols is, in this view, what the weblogging world needs to continue to expand and prosper.

This argument would be quite compelling in a world without XML, but is somewhat moot now that XML is pervasive: The whole point of XML's "self-describing" tags [1] is to allow loose coupling between producers and consumers of data. In one widely held view, this means "all that the producers and consumers of information have to agree on is the XML format, and software that supports it can evolve freely." I'd contend, however, that if a community could agree on the data format there might be little need for XML -- a CSV or ASN.1 or Java serialized object format without tags would work more efficiently, be easier to integrate with procedural code, require less network bandwidth, etc. The problem with agreed-upon formats -- besides the difficulty of achieving agreement, of course! -- is their fragility in the face of inevitable change.

The power of XML's tags (namespaced or otherwise) is to allow variation and evolution. The history of RSS bears witness to this very clearly. Even in a world of chronic disagreement, rapid innovation, and several contending "standards," the actual software that syndicates weblogs and aggregates diverse feeds has been remarkably robust. In fact, software to produce and consume (N)Echo appeared almost immediately after . This rapid response wasn't due, AFAIK, to late night hacking but to simple tweaks to the scripts, stylesheets, etc. that had evolved to support the diverse flavors of RSS previously seen.
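Here is a minimal sketch of the kind of "simple tweak" that lets one consumer cope with several feed flavors. It is my own illustration, not code from any actual aggregator: the two sample feeds are simplified, and the namespace URI on the Echo-like feed is a made-up placeholder. The single XPath expression keys on the tag names alone, so it does not care which root element or namespace the producer chose.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FeedTitles {
    // Matches the title of anything called "item" or "entry", in any namespace.
    static final String ENTRY_TITLES =
        "//*[local-name()='item' or local-name()='entry']/*[local-name()='title']";

    // Simplified RSS 2.0 style feed.
    static final String RSS_LIKE =
        "<rss version=\"2.0\"><channel><title>Example</title>"
      + "<item><title>First post</title></item></channel></rss>";

    // Simplified Echo-like feed; the namespace URI is a hypothetical placeholder.
    static final String ECHO_LIKE =
        "<feed xmlns=\"http://example.org/hypothetical-echo\"><title>Example</title>"
      + "<entry><title>First post</title></entry></feed>";

    static List<String> titles(String feedXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // so local-name() sees the name without the namespace
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(feedXml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(ENTRY_TITLES, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titles(RSS_LIKE));
        System.out.println(titles(ECHO_LIKE));
    }
}
```

Supporting a third flavor would typically mean adding one more name to the expression, not rewriting the consumer.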

I don't want to understate the overall business importance of having authoritative "standards" (de jure, de facto, ad hoc, or whatever) in this area. Tim Bray has made a compelling case for this and that seems to have been one of the galvanizing factors in the formation of the (N)Echo community. But whether or not Bray's prototypical Mr. Safe can cope with controversy and diversity, XML's basic technology is definitely up to the job. An eventual "standard" for an RSS-like format will help corporate developers using drag-n-drop IDEs to more easily develop software to produce and process syndication streams, but stasis is not necessary for progress in the XML world. In fact, the evolutionary potential of XML's ability to support loosely coupled applications is its greatest strength.

[1] Sigh, I am not under the illusion that XML instances are "self-describing" in any philosophical sense. XML is "self-describing" only by comparison to alternative formats such as CSV or ASN.1 that require a more rigid data format and do not "tag" individual data items. The XML markup may refer to "namespaces" in which the semantics of specific element names are rigorously defined using an ontology language, or it may simply provide hints exploited by a heuristic algorithm, but either way it supplies additional information that a processor can exploit.



SOA: One acronym to bind them all?

Posted by mchampion on June 24, 2003 at 07:31 PM | Permalink | Comments (6)

I must confess that when I first started hearing about Service Oriented Architectures or SOAs, my reaction was "oh brother, here we go again ... more vague mumbling about 'paradigm shifts' by analysts who have predicted 10 of the last 2 revolutions." The definitions one finds didn't inspire confidence ... usually something along the lines of "a SOA is an architecture in which the components are services and the services interact by invoking other services." Sort of begs the question of what a "service" is.

Another thing that did not inspire confidence was that many people who profoundly disagree about lots of other things all seem to agree that Service Oriented Architectures are a Good Thing. If the term is broad enough to include "everything", is it meaningful enough to exclude anything?

I've overcome a good bit of this cynicism lately. Sure, SOA is a kind of fuzzy concept, but after spending much of the last year trying to rigorously define what a Web service really is and is not, I'm more sympathetic to the recursive use of "service" in these definitions. Likewise, a lot of the discussion blurs the distinction between the definition of SOA and the characteristics of good architectures, service oriented or otherwise. For example, "loosely coupled" and "standards based" are considered two of the pillars of SOA. These seem more like principles of most successful architectures than principles specific to SOA. But there does seem to be a very powerful idea buried down in the current SOA hype: the idea that the central abstraction is this fuzzy thing called "service."

It's hard to define what a "service" is -- it could be some software component written in an OO language and deployed via an application server, or some "glue" exposing a legacy mass of COBOL, or a plain vanilla Web server providing the, uhh, service of serving up documents. But it's a lot less hard to talk about how one invokes a "service" -- you always (I think!) send it messages. So, to implement a "service," take some discrete chunk of functionality and provide a messaging interface to it. All that consumers of the service have to know to use it is the format of the messages they send it and get back from it, the protocol(s) for transmitting the messages, and the "message exchange pattern" of the interaction between the service provider and service consumer. In other words, the critical definition of a service is its interface to its consumers, not any particular characteristic of its design, implementation, deployment, etc. One can develop SOAs with COM, CORBA, BEEP, JMS, proprietary stuff, XML, HTTP, SOAP, and all sorts of combinations of these and many other technologies. One can describe the service interface using some combination of IDL, WSDL, an XML schema language (there are several!), DAML-S or some other ontology language, or if all else fails, a phone conversation.
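The recipe above ("take some discrete chunk of functionality and provide a messaging interface to it") can be sketched in Java as follows. Everything here is a hypothetical illustration: the Service interface, the LegacyInventory stand-in, and the stockRequest/stockReply message formats are invented for the example, and the only message exchange pattern shown is simple request-response.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ServiceSketch {
    // The consumer's entire view of a service: one message in, one message out.
    interface Service {
        String handle(String requestXml) throws Exception;
    }

    // Stand-in for existing functionality (an object, a COBOL program behind glue, ...).
    static class LegacyInventory {
        int unitsInStock(String sku) {
            return "widget".equals(sku) ? 12 : 0;
        }
    }

    // The "messaging interface": parse the request message, call the legacy
    // code, and answer with a reply message. Consumers never see LegacyInventory.
    static Service inventoryService(LegacyInventory legacy) {
        return requestXml -> {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(requestXml)));
            String sku = doc.getDocumentElement().getAttribute("sku");
            // For brevity the sku is not XML-escaped; a real service would escape it.
            return "<stockReply sku=\"" + sku + "\" units=\""
                + legacy.unitsInStock(sku) + "\"/>";
        };
    }

    public static void main(String[] args) throws Exception {
        Service service = inventoryService(new LegacyInventory());
        System.out.println(service.handle("<stockRequest sku=\"widget\"/>"));
    }
}
```

Swapping the legacy class for a database lookup or a remote call would not change anything the consumer knows about, which is the sense in which the interface, not the implementation, defines the service.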

This, in my mind anyway, explains why so many people who disagree on other things tend to think of their architectures as "service oriented." What they're disagreeing about are things like whether to encapsulate the service invocation messages behind an API or expose the messages directly to the service consumer, whether to handle reliability and transaction integrity down in the service infrastructure or up in the application layer, whether to have the service semantics described by human-readable documents or machine-processable ontologies, and so on. But if one looks at the messages, exchange patterns, and data exchanged, a lot of these differences tend to blur into the background.

I think this view of distributed systems architectures helps maintain focus on what is important for scalability, interoperability, evolvability, etc. and prevents us from becoming distracted by extraneous and politically loaded concerns. For example, there is no intrinsic conflict between REST and SOA or SOAP. One can and should debate the specific conditions under which it makes more sense to directly manipulate "resources" by transferring representations of their state using the "services" supplied by HTTP, and when it makes more sense to hide the resources behind a more complex and customized service interface, but let's not call these alternative "paradigms". They're just different flavors of SOA, and developers are free to mix and match them to meet the needs of a distributed application.



When does SOAP add value over simple HTTP+XML?

Posted by mchampion on June 13, 2003 at 11:04 AM | Permalink | Comments (5)

Sean McGrath seems to be the first to link to this weblog, and I'll return the favor by publicly disagreeing with him <grin>. Actually, there's not a whole lot in the XML world that Sean and I disagree about, and I strongly recommend his ITWorld columns. But apparently the benefits of SOAP are one such thing, as he says: "I would argue that it is not the case that for more complex apps, SOAP is better. Sure, there comes a point where you cannot encode parameters into a URI but I don't see why it follows that these more complex web services need to throw out the huge advantages of having a GETable URI."

First, as of SOAP 1.2 (very soon to be a W3C Recommendation) there is no conflict between SOAP and "having a GETable URI." SOAP has a web method feature:

Bindings to HTTP or such other protocols SHOULD use the SOAP Web Method feature to give applications control over the Web methods to be used when sending a SOAP message.

In short, SOAP 1.2 allows and encourages the use of HTTP GET to invoke services that simply return data, thus allowing one to hyperlink to SOAP services, exploit HTTP caches of the results of frequently-accessed services, etc.

But, one might ask (and people very frequently do!), what benefit does one get from SOAP if just GETing or POSTing XML data works just fine in your application? One very reasonable answer is "not much." Sure, once WSDL 1.2 (which will support the web method feature in SOAP) is out we can expect tools to make it very easy to generate programming language code to invoke "GETable URI" SOAP services, but that is in the future sometime. A better answer, I think, is that SOAP provides an architectural foundation for extended services as requirements evolve.

For example, "raw XML over HTTP" works just fine for simple services over a secure intranet (or the public internet when SSL provides adequate security). But what happens as the requirements on the service creep upwards and one must support:

  • Routing and reliable messaging across multi-node networks, such as when one must perform content-based routing from an HTTP gateway to the appropriate back-end service (e.g., the one nearest to the consumer).
  • End to end encryption (from consuming application to service rather than from consuming application to SSL gateway)
  • Integration of legacy services that may not have an HTTP URI
  • Non-HTTP communications protocols and interfaces such as BEEP, MQ and JMS
  • Multipart service transactions that must be committed or rolled-back/compensated as an entity
  • More complex interactions between service suppliers and consumers that need to be described and choreographed.

So, if you are likely to have some of these challenges, building SOAP into the architecture lets you leverage emerging technologies such as WS-Security, WS-Routing, one of the several reliable messaging proposals, WS-Transaction, WSBPEL, etc. that build on the SOAP infrastructure.
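As a rough sketch of what "building SOAP into the architecture" means at the message level, the following wraps an existing XML payload in a SOAP 1.2 envelope using plain DOM calls. The class name and the stockRequest payload are hypothetical; the envelope namespace is the SOAP 1.2 one. The Header element is left empty here, and it is the slot where extensions such as the security, routing, or transaction specs would place their blocks without the payload in the Body having to change.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SoapWrap {
    // SOAP 1.2 envelope namespace.
    static final String SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope";

    // Returns a new document: Envelope containing an empty Header and a Body
    // that holds a copy of the payload's root element.
    static Document wrap(String payloadXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document payload = builder.parse(new InputSource(new StringReader(payloadXml)));

        Document doc = builder.newDocument();
        Element envelope = doc.createElementNS(SOAP12_NS, "env:Envelope");
        Element header = doc.createElementNS(SOAP12_NS, "env:Header"); // extension point
        Element body = doc.createElementNS(SOAP12_NS, "env:Body");
        doc.appendChild(envelope);
        envelope.appendChild(header);
        envelope.appendChild(body);
        // The "raw XML over HTTP" message travels unchanged inside the Body.
        body.appendChild(doc.importNode(payload.getDocumentElement(), true));
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document message = wrap("<stockRequest sku=\"widget\"/>");
        Element body = (Element) message.getElementsByTagNameNS(SOAP12_NS, "Body").item(0);
        System.out.println(body.getFirstChild().getLocalName()); // prints stockRequest
    }
}
```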

If, on the other hand, you are saying to yourself "that's WAY more complicated than anything I want to do," then SOAP may not be for you. Likewise, if you are saying "I need to do that stuff MY way rather than some industry committee's way," then perhaps you are better off putting this stuff into your own application rather than buying an off-the-shelf solution, and SOAP doesn't offer anything you need. But this is something to be analyzed -- it's just as "wrong" to blindly reject SOAP as to blindly accept it.

Still, at a more fundamental level I think Sean and I agree on the basic architectural principle here: "I have learned to really appreciate the power of the 'distributed systems as stateless document exchange,' which I believe lies at the heart of what makes HTTP so stunningly successful." It often makes more sense to think of XML-powered applications as transformation pipelines (see Propylon's PropelX and Software AG's EntireX Mediator as pioneering tools that support this approach) rather than objects to be exposed with APIs. More on this in a future edition ... but in either scenario, SOAP may add enough value to justify its use.



Exploring Where XML and Java Meet

Posted by mchampion on June 10, 2003 at 04:07 PM | Permalink | Comments (1)

For some users, the Java and XML roads come together in a smooth interchange where objects can be serialized as XML and schemas cleanly bound into classes. But for others, they come together in a maze of alternative sidestreets, and there are a lot of interesting things happening there that one misses by simply taking the "databinding interchange." The overall theme of this weblog is to explore some of the sidestreets in the "place" where XML, Java, and related technologies such as DBMS and Web services meet, and sometimes look into the dark alleys leading off them. I plan to use this weblog as a forum in which to point to interesting ideas and generate discussions about them, so please contribute thoughts, corrections, commentary, and suggestions.

To start things off, there has been a spate of articles lately related to something I've wondered about for a long time, namely the relationship of the world of objects to the world of XML. On one hand, the OO principle of "data hiding" suggests treating the details of data formats as implementation details to be hidden behind access methods, whereas the whole point of XML is to expose the data rather than to hide it. Norm Walsh emphasizes the point that "XML is Not Object Oriented," pointing out that:

the constituent elements and attributes of an XML vocabulary are not generally related to each other by inheritance, nor do they naturally correspond to objects with any kind of precision.

On the other hand, XML makes an awfully handy serialization format for Java objects. Also, there is a generic XML object model that has much support in the Java world and provides a useful, if very low-level, set of methods for working with XML documents.

So, as the old joke goes, XML really is a bit of a "floor wax and a dessert topping." The trick is to know which approaches, tools, APIs, etc. are best for which applications. That is a subject we will explore in subsequent editions of this weblog.




