<?xml version="1.0" encoding="utf-8"?>
<feed version="0.3" xmlns="http://purl.org/atom/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
<title>Mark Little&apos;s Blog</title>
<link rel="alternate" type="text/html" href="http://weblogs.java.net/blog/marklittle/" />
<modified>2006-06-09T00:45:50Z</modified>
<tagline></tagline>
<id>tag:weblogs.java.net,2008:/blog/marklittle/237</id>
<generator url="http://www.movabletype.org/" version="3.01D">Movable Type</generator>
<copyright>Copyright (c) 2006, marklittle</copyright>
<entry>
<title>Transactions are your friend</title>
<link rel="alternate" type="text/html" href="http://weblogs.java.net/blog/marklittle/archive/2006/03/transactions_ar_1.html" />
<modified>2006-06-09T00:45:50Z</modified>
<issued>2006-03-02T21:25:45Z</issued>
<id>tag:weblogs.java.net,2006:/blog/marklittle/237.4231</id>
<created>2006-03-02T21:25:45Z</created>
<summary type="text/plain">In this article I&apos;ll describe why you shouldn&apos;t overlook the potential usefulness of transactions within your application, even if you&apos;re not using distributed transactions.</summary>
<author>
<name>marklittle</name>

<email>mark.little@arjuna.com</email>
</author>
<dc:subject>J2EE</dc:subject>
<content type="text/html" mode="escaped" xml:lang="en" xml:base="http://weblogs.java.net/blog/marklittle/">
<![CDATA[<p><a href="http://www.cs.ncl.ac.uk/people/home.php?id=16">After having been doing this for 20 years</a>, I can say that <a href="http://java.sun.com/blueprints/guidelines/designing_enterprise_applications_2e/transactions/transactions4.html">transaction processing</a> has got to be one of the most difficult middleware components to persuade developers to use. There are several reasons for this, but probably the most important is that, unlike something like caching or security, you don't see the benefits transactions bring until there's a failure. Unfortunately (or fortunately, depending on your perspective), failures don't happen that often, so actually demonstrating the utility of using a transaction processing system is made even more difficult. Furthermore, unlike something like security, you're unlikely to be refused access to a resource because you're not using it within the scope of a transaction.</p>

<p>However, thanks to the inefficiencies of natural selection (humanity is not perfect yet) and the <a href="http://www.webservices.org/weblog/mark_little/web_services_transactions_and_heuristics">beauty of entropy</a> (all things decay), failures will always happen and so transactions will always be needed: all we can ever hope to do as technology advances is reduce the probability of a failure occurring. Therefore, as a developer you've got to weigh up the likelihood of a failure (any failure) happening and corrupting your application versus the perceived cost (commercial and the overhead of restoring the system to good health) of using transactions. If you want to take the risk, then don't use transactions; but likewise, don't forget that they do exist to help you.</p>

<p>Now you may think that replication of resources/objects/servers could be used in place of transactions, but <a href="http://www.cs.ncl.ac.uk/research/pubs/inproceedings/papers/161.pdf">that isn't</a> <a href="http://www.cs.ncl.ac.uk/research/pubs/inproceedings/papers/621.pdf">the case</a>. Replication and transactions can be complementary, but they're not a replacement for one another. Transactions guarantee consistency even in the presence of complete system failures, but you won't necessarily get forward progress. However, replication offers (though cannot guarantee) forward progress in the presence of a finite number of failures. So I would argue that if you are replicating updatable data, then you should definitely consider transactions as well.</p>]]>
<![CDATA[<p>Which leads us to another problem with selling the idea of transactions, one that I've <a href="http://markclittle.blogspot.com/2005/05/why-transactions-are-often-under-used.html">blogged on before</a>: the notion that they slow your application down. Combined with the first problem I mentioned, I've often heard this referred to as "I get nothing for something" syndrome: you get the overhead of using transactions, but you just don't see the benefits they bring (which, looked at from some perspectives, is an entirely logical conclusion to draw). Of course transactions slow down your application: I've <a href="http://weblogs.java.net/blog/marklittle/archive/2005/05/transactions_an.html">discussed this before</a>, but if you just think about what they have to do in order to guarantee consistency in the presence of failures, it makes sense: there really is no such thing as a free meal!</p>

<p><a href="http://www.amazon.com/gp/product/1558604154/ref=pd_cpt_gw_1/102-4644813-4248132?%5Fencoding=UTF8&v=glance&n=283155">Transaction processing systems</a> have been the <a href="http://www.amazon.com/gp/product/1558605088/qid=1141313331/sr=1-1/ref=sr_1_1/102-4644813-4248132?s=books&v=glance&n=283155">backbone of significant areas</a> of <a href="http://www.amazon.com/gp/product/1558601902/ref=pd_bxgy_img_b/102-4644813-4248132?%5Fencoding=UTF8">computing infrastructure for decades</a>. A lot of these places (finance, telcos etc.) accepted the performance cost in exchange for reliability because, for them, there was never really a trade-off to be made: if they corrupt data (e.g., lose updates to stock trades), then institutions lose business. Now obviously that's not the case everywhere and there are applications where failures really don't matter (e.g., stateless interactions). But in general, you need to think about the effect of failures on your applications and although transactions are just one of the techniques you could use to help tolerate them, with the JTA they are a core component of J2EE. So rather than come up with ad hoc solutions, it may be better to try to leverage tried-and-tested techniques and associated implementations.</p>

<p>Following on from this is the oft heard statement: "everything I do is within a single VM, so I don't need transactions". This is definitely an education issue, where <i>distributed transactions</i> have become synonymous in the minds of many people with <i>transactions</i>. Most people can see that if they're accessing resources/participants across physically distinct machines or processes, there's a need for transactions to coordinate simultaneous updates to state. In a local (single VM) environment, the need is often overlooked. But it is still there: in many cases, even within the same VM, applications use and modify data from multiple different sources, and in that case, you need the benefits that transaction processing provides. Distribution just makes it more obvious that independent failures can cause problems. But they're still there in the local case; you may just have to look a little harder.</p>

<p>Plus, transactions get a lot of bad press for overheads that really don't exist in all cases. All commercial grade implementations support a number of significant optimisations to improve performance in the 80-20 case. For example, if there's only a single participant in a transaction, then the notorious <a href="http://www.sei.cmu.edu/str/descriptions/dtpc_body.html">two-phase commit</a> protocol goes away and we run with a <a href="http://www.sei.cmu.edu/str/descriptions/dtpc_body.html">single phase</a>. Then there's the <a href="http://www.microsoft.com/windows2000/en/advanced/help/default.asp?url=/windows2000/en/advanced/help/addtccpt_4je6.htm">read-only optimization</a>: if a participant didn't modify any data, then it can drop out of the transaction "early". Plus, there are <a href="http://labs.jboss.com/portal/jbosstm/documentation/papers/HPTS2005.pdf">some implementations</a> that have evolved over decades to offer many other performance features, such as lightweight coordinators, nested transactions and non-durable participants. The intention (mirrored by Microsoft's work with Indigo transactions) is to make transaction implementations so lightweight, with such low overhead, that they'll become a natural part of the infrastructure (and in the case of Microsoft, that'll mean in the operating system). We've already seen them moving into <a href="http://slie.ca/projects/6.895/csail_abstract-sean_lie1.pdf">hardware</a>, so this makes a lot of sense too.</p>
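
<p>As a rough illustration of the two optimisations just described, here is a minimal in-memory Java sketch. Every name in it (Vote, Resource, RecordingResource, OptimisingCoordinator) is hypothetical and invented for this example; it is not the JTA API or any product's implementation, and it leaves out logging and failure recovery entirely. It only shows how a single participant lets the coordinator skip the prepare phase, and how a read-only vote lets a participant drop out before the second phase.</p>

```java
import java.util.Arrays;

// Hypothetical vote values a participant can return from phase one.
enum Vote { COMMIT, ROLLBACK, READ_ONLY }

// Hypothetical participant interface, invented for this sketch (not the JTA API).
interface Resource {
    Vote prepare();            // phase one
    void commit();             // phase two, commit outcome
    void rollback();           // phase two, rollback outcome
    boolean commitOnePhase();  // prepare and commit in one step; true if committed
}

// Simple stand-in participant that records the outcome it was told about.
class RecordingResource implements Resource {
    private final Vote vote;
    String outcome = "none";

    RecordingResource(Vote vote) { this.vote = vote; }

    public Vote prepare() { return vote; }
    public void commit() { outcome = "commit"; }
    public void rollback() { outcome = "rollback"; }
    public boolean commitOnePhase() {
        boolean ok = (vote != Vote.ROLLBACK);
        outcome = ok ? "commit" : "rollback";
        return ok;
    }
}

class OptimisingCoordinator {
    int prepareMessages = 0;  // phase-one messages sent
    int outcomeMessages = 0;  // commit, rollback or one-phase messages sent

    boolean complete(Resource[] resources) {
        if (resources.length == 0) {
            return true;  // nothing to coordinate
        }
        if (resources.length == 1) {
            // One-phase optimisation: a lone participant decides the outcome itself.
            outcomeMessages++;
            return resources[0].commitOnePhase();
        }
        Resource[] pending = new Resource[resources.length];
        int pendingCount = 0;
        boolean allOk = true;
        for (Resource r : resources) {
            prepareMessages++;
            Vote v = r.prepare();
            if (v == Vote.COMMIT) {
                pending[pendingCount++] = r;  // still needs to hear the outcome
            } else if (v == Vote.ROLLBACK) {
                allOk = false;                // this participant has already undone its work
            }
            // READ_ONLY: nothing to commit or undo, so the participant drops out here.
        }
        for (Resource r : Arrays.copyOf(pending, pendingCount)) {
            outcomeMessages++;
            if (allOk) { r.commit(); } else { r.rollback(); }
        }
        return allOk;
    }
}
```

<p>With three participants of which one is read-only, this sketch sends five messages rather than six; with a single participant it sends just one.</p>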

<p>As I've <a href="http://www.amazon.com/gp/product/013035290X/ref=sr_11_1/102-4644813-4248132?%5Fencoding=UTF8">said before</a>, think of transactions like an insurance policy: compared to how much time, money and effort you may lose by not using them, the cost of using them may be well worth it. Obviously there's a tipping point on any graph of cost of using transactions versus advantage they may bring, and that point is going to be very dependent on your application. But consider it nonetheless.</p>]]>
</content>
</entry>
<entry>
<title>When and why are interoperability fests useful?</title>
<link rel="alternate" type="text/html" href="http://weblogs.java.net/blog/marklittle/archive/2006/02/when_and_why_ar_1.html" />
<modified>2006-06-09T00:45:50Z</modified>
<issued>2006-02-01T11:09:42Z</issued>
<id>tag:weblogs.java.net,2006:/blog/marklittle/237.4041</id>
<created>2006-02-01T11:09:42Z</created>
<summary type="text/plain">Interoperability fests/workshops have become very popular recently, particularly in the area of Web Services. However, they are more widely useful and should be an active part of a developer&apos;s testing arsenal whilst building relevant systems, rather than an afterthought as is often the case.</summary>
<author>
<name>marklittle</name>

<email>mark.little@arjuna.com</email>
</author>
<dc:subject>Testing</dc:subject>
<content type="text/html" mode="escaped" xml:lang="en" xml:base="http://weblogs.java.net/blog/marklittle/">
<![CDATA[<p>The world of Web Services has thrown up a <a href="http://www.arjuna.com/news/press/2004-12-13-ContextInterop-PR.pdf">range</a>  <a href="http://www.ebizq.net/news/5586.html">of</a> <a href="http://msdn.microsoft.com/webservices/webservices/building/interop/default.aspx">various</a> <a href="http://www.oasis-open.org/news/oasis_news_11_19_04.php">interoperability</a> <a href="http://www.xmlconference.org/xmlusa/2003/interopdemos_oasis.asp">workshops</a> <a href="http://lists.w3.org/Archives/Public/public-ws-addressing/2005Dec/0034">aka plugfests</a>; not to mention a <a href="http://www.ws-i.org/">whole organisation</a> dedicated to interoperability. You might get the impression that because Web Services are about interoperability as much as internet-scale computing, such things have not been of interest in other distributed systems such as JEE or CORBA. But interoperability events do <a href="http://www.infostor.com/Articles/Article_Display.cfm?ARTICLE_ID=235991">occur</a>  <a href="http://whitepapers.zdnet.co.uk/0,39025945,60039338p-39000527q,00.htm">elsewhere</a>. However, it is true that the approach to interoperability we're seeing  now is markedly different from what we saw in the past: for most of the key players, interoperability is at the forefront of specification and implementation development. If you look at CORBA, it took <a href="http://www.omg.org/gettingstarted/corbafaq.htm">7 years or so</a> for the OMG to address the shortcoming and things are still not perfect; and true heterogeneous JEE-to-JEE interoperability is a thing of the future.
</p>]]>
<![CDATA[<p>CORBA, JEE, DCE and (implicitly) COM/DCOM were all dominated by vendors keen to maintain <a href="http://en.wikipedia.org/wiki/Lock-in">vendor lock-in</a>. Fortunately (or unfortunately, depending on your perspective) this couldn't continue and even before the rise of Web Services we were beginning to see change: the "norm" of sites running homogeneous environments changed, with companies growing by acquisition or wanting to do real vendor-to-vendor (business-to-business) interactions. No longer was the argument "take our XYZ product now and we'll work with you to have eventual interoperability with ABC" sufficient. Many large deals have fallen through because of the lack of interoperability.</p>

<p>However, it was certainly the case that interoperability was still considered of secondary importance. To a degree, that is understandable: you can't worry about interoperability until/unless you have a product. However, I believe strongly that interoperability testing should be considered as important as standard unit testing and QA: it shouldn't be an afterthought.</p>

<p>To explain why, I'll use the Web Services interoperability fests as an example, but as I said before, this isn't (or shouldn't be) technology specific. If you've ever looked at various Web Services specifications, such as <a href="http://developers.sun.com/techtopics/webservices/wscaf/">WS-CAF</a>, <a href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ws-sx">WS-SX</a>, or <a href="http://www.w3.org/2002/ws/addr/">WS-Addressing</a>, they're not exactly easy reading material (JEE and CORBA specifications are similar). Understanding <i>why</i> something is intended to work the way it does is often as difficult as understanding <i>how</i> and is definitely as important. As well as product development, I've been working in standards and specifications for over 10 years, and in that time I've seen that two people can read the same specification and come away from it with completely different perspectives. In most cases that's a problem (read: bug) with the specification that should be caught early on. And this is where previous standards efforts, such as JEE and the OMG, fell down: more often than not, specifications were developed months or years before implementations; for example, in order to ratify a specification, the OMG only requires companies to say they will eventually use it, not that they have implemented it.</p>

<p>Now although the same is true for Web Services (e.g., OASIS requires at least 3 committee member organisations to say they are using a specification, though it doesn't have to be in product), the whole "Web Services are for interoperability" mantra has really taken hold. Before any Web Services specification is standardised, the various committees have at least one workshop where they work through interoperability between heterogeneous implementations and feed those results back into the specification. Obviously depending on the results, this can be an iterative process, but the end result is usually something that offers better out-of-the-box interoperability. Therefore, from a specification development perspective, these workshops are incredibly important and would be beneficial in other arenas.</p>

<p>Obviously not everyone can be present at these official interoperability fests and anyway many implementations arise after standardisation has occurred. Plus, standards are still not perfect and can often be deliberately ambiguous, leading to the possibility of non-interoperable implementations. That's why it is so important to do interoperability testing during development: iron out bugs in the specification or in your understanding of what was meant by the original authors. If it is left until after the product is shipped, then it may be difficult or impossible to make changes without causing problems for end-users. That's another route back to vendor lock-in and bridging protocols. Furthermore, feeding bugs back to the standards bodies will benefit the next version and other users: try arguing with one vendor that your view of an ambiguous specification is correct if they've come to a completely different (and incompatible) conclusion!</p>]]>
</content>
</entry>
<entry>
<title>Transactions and recoverability: what they mean to your applications</title>
<link rel="alternate" type="text/html" href="http://weblogs.java.net/blog/marklittle/archive/2005/05/transactions_an.html" />
<modified>2008-01-02T17:42:16Z</modified>
<issued>2005-05-11T20:53:05Z</issued>
<id>tag:weblogs.java.net,2005:/blog/marklittle/237.2423</id>
<created>2005-05-11T20:53:05Z</created>
<summary type="text/plain">Over the years I&apos;ve seen many complaints about using transactions (e.g., via the JTA) for a number of reasons, including performance degradation, assumptions about the impact on application development etc. You don&apos;t get something for nothing (there really is no such thing as a free lunch), so there&apos;s always a trade-off to be made with transactions: guaranteed completion even in the presence of failures. In this entry I&apos;ll look at why you shouldn&apos;t look to trade off some transaction properties; either use them all or don&apos;t use transactions.</summary>
<author>
<name>marklittle</name>

<email>mark.little@arjuna.com</email>
</author>
<dc:subject>J2EE</dc:subject>
<content type="text/html" mode="escaped" xml:lang="en" xml:base="http://weblogs.java.net/blog/marklittle/">
<![CDATA[<p>The main problem that I hear about using transactions is performance. In order to guarantee atomicity in the presence of failures, the coordinator must first execute the two-phase commit protocol across all participants to achieve consensus. Assuming they all say "yes" during the first (preparation) phase, the coordinator <b>must</b> then make its decision to commit durable (persistent), so that if there's a failure, it can pick up from where it left off.</p>

<p>When each participant receives the first phase message (essentially asking "can you commit the work you've done?"), it must make any changes it is responsible for (e.g., table updates) durable, but in a provisional manner - the participant doesn't know the transaction outcome yet, so it had better not second guess the coordinator (this can lead to what are known as <i>heuristic</i> outcomes). When the coordinator (eventually) sends the second phase message (which is either "commit" or "rollback"), the participant can clear its durable log by making provisional updates permanent (in the case of commit), or deleting them (in the case of rollback). This is obviously a simplified description of what goes on, but it should be sufficient for the purposes of this discussion.</p>

<p>If everyone follows these rules, then you get guaranteed completion of all participants in a transaction, even if that completion is to completely undo the work: atomicity ensures that what one does, they all do, and durability (combined with a suitable failure recovery component) ensures it happens even if there are failures.</p>]]>
<![CDATA[<p>However, the two-phase commit protocol and the durability requirements on the coordinator and participants obviously impose an overhead. If there are N participants in a transaction, then the coordinator sends 2N messages during 2PC in the commit case (there are optimisations such as read-only and one-phase that can help, but we'll consider the worst case scenario). The coordinator must write a log record (and make sure it's flushed to disk and not cached in the operating system buffer) and each participant must do likewise (again, ignoring optimisations). Disk performance has improved a lot over the years, but it's still a physical process (e.g., moving the disk head to the correct place) and hence a major bottleneck.</p>
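
<p>To make the cost concrete, the protocol described above can be sketched as a small in-memory Java model that counts messages and forced log writes. The names Participant, CountingParticipant and Coordinator are hypothetical and made up for this illustration; nothing here is the JTA API, and a counter increment stands in for each real forced disk write. The sketch also ignores timeouts, crashes and the optimisations mentioned above.</p>

```java
// Hypothetical participant interface for this sketch (not javax.transaction).
interface Participant {
    boolean prepare();  // phase one: true means "yes, I can commit"
    void commit();      // phase two: make provisional updates permanent
    void rollback();    // phase two: delete provisional updates
}

// Stand-in participant with a fixed vote that counts its own forced log writes.
class CountingParticipant implements Participant {
    private final boolean vote;
    int forcedLogWrites = 0;
    String state = "active";

    CountingParticipant(boolean vote) { this.vote = vote; }

    public boolean prepare() {
        if (vote) {
            forcedLogWrites++;   // stands in for flushing provisional updates to disk
            state = "prepared";
        }
        return vote;
    }
    public void commit()   { state = "committed"; }
    public void rollback() { state = "rolled back"; }
}

class Coordinator {
    int messagesSent = 0;
    int forcedLogWrites = 0;

    boolean complete(Participant[] participants) {
        // Phase one: ask every participant whether it can commit.
        // Simplification: the sketch keeps asking even after a "no" vote.
        boolean allYes = true;
        for (Participant p : participants) {
            messagesSent++;
            if (!p.prepare()) { allYes = false; }
        }
        // The commit decision must be durable before anyone is told about it.
        if (allYes) {
            forcedLogWrites++;
        }
        // Phase two: tell every participant the outcome.
        for (Participant p : participants) {
            messagesSent++;
            if (allYes) { p.commit(); } else { p.rollback(); }
        }
        return allYes;
    }
}
```

<p>For N = 3 participants that all vote "yes", this model sends 2N = 6 messages and performs four forced log writes: one at the coordinator and one at each participant.</p>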

<p>But consider what you get for this overhead: guaranteed outcomes in the presence of failures and, if you've used distributed transaction support, this happens irrespective of the physical locality of your coordinator and its participants. Think about what you'd have to do if you wanted to do this yourself. Transactions are like an insurance policy for your critical applications: most of the time you don't see the benefit, but it's that "odd" occasion where you'll be really glad you had them.</p>

<p>So this is where the trade-off comes. You trade off some performance for these guarantees. If you don't want the guarantees (and many applications simply don't need transactions), then you shouldn't be using transactions. If you do want the guarantees, then I wouldn't encourage you to do this yourself.</p>

<p>Now if we return to some of those optimisations I mentioned earlier, there are cases where durability isn't needed at all. Some transaction systems allow you to use transactions to control what are often termed "recoverable" entities: these are resources that need the consensus of 2PC, but don't want the failure recovery - if a crash happens, the data is lost and that's just fine for these types of resources. You can mix recoverable and persistent (traditional) participants in the same transaction and the coordinator should correctly write logs only for the persistent resources.</p>
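
<p>A rough sketch of that mixed case, assuming the participant can tell the coordinator whether it needs failure recovery, might look like the following. The isDurable() method and the TxParticipant, SimpleParticipant and SelectiveLoggingCoordinator types are hypothetical, invented purely for illustration, and the counters stand in for real log records and forced disk writes. Treat it as one possible reading of the idea rather than how any particular transaction system does it.</p>

```java
// Hypothetical participant interface; isDurable() is the extra piece of
// information this sketch assumes the coordinator can ask for.
interface TxParticipant {
    boolean isDurable();  // true: persistent, needs recovery; false: "recoverable" only
    boolean prepare();
    void commit();
    void rollback();
}

// Stand-in participant that always votes "yes".
class SimpleParticipant implements TxParticipant {
    private final boolean durable;
    String state = "active";

    SimpleParticipant(boolean durable) { this.durable = durable; }

    public boolean isDurable() { return durable; }
    public boolean prepare() { state = "prepared"; return true; }
    public void commit() { state = "committed"; }
    public void rollback() { state = "rolled back"; }
}

class SelectiveLoggingCoordinator {
    int participantsLogged = 0;  // entries that would go into the coordinator's log
    int forcedLogWrites = 0;     // forced writes the coordinator had to perform

    boolean complete(TxParticipant[] participants) {
        // Every participant takes part in the consensus, durable or not.
        boolean allYes = true;
        for (TxParticipant p : participants) {
            if (!p.prepare()) { allYes = false; }
        }
        if (allYes) {
            // Only persistent participants need to be remembered for recovery.
            int durableCount = 0;
            for (TxParticipant p : participants) {
                if (p.isDurable()) { durableCount++; }
            }
            if (durableCount > 0) {
                participantsLogged += durableCount;
                forcedLogWrites++;
            }
        }
        for (TxParticipant p : participants) {
            if (allYes) { p.commit(); } else { p.rollback(); }
        }
        return allYes;
    }
}
```

<p>In this model a transaction made up entirely of non-durable participants still runs both phases, but the coordinator never touches its log.</p>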

<p>Unfortunately the JTA doesn't support recoverable participants because it is tied to XA. That's not to say you couldn't have an XAResource implementation that was only recoverable, but that the XAResource interface doesn't convey that information to the coordinator. So, even if the coordinator could optimise the log writing in the case of recoverable participants, it can't in a pure JTA environment because it can't tell the difference via the participant interface. That means that this optimisation isn't available in a portable manner.</p>

<p>This is about the only time in a production environment that you should be considering using transactions and not having failure recovery. If you need transactions for your application and you're using a transaction system which doesn't support recovery, or it doesn't have it enabled by default, then don't use it. One argument I've heard from some for disabling recovery or not even implementing it is: "it's faster without recovery". Of course it's faster, because it's not doing anything! You've still got the 2PC overhead of course, but without recovery (or recoverable participants), what's the point? It's like buying an insurance policy that doesn't pay out: you get the pleasure of the overhead, but you'll never see any benefit.</p>

<p>You could argue that for testing purposes you don't need recovery, and there may be a case for that. However, if you're testing, performance shouldn't be an issue anyway. So what does a slightly slower response really matter in this situation, when you actually get to test the entire system as it's going to work in a deployed environment?</p>

<p>Another argument I've heard is: "99% of applications don't need recovery". OK, that's fair (though I might not put it as high as that). But in that case, 99% of applications shouldn't be using transactions, because without recovery they don't get much benefit.</p>

<p>So after all of this, what I've tried to show is that if you want transactions then you need to be prepared to pay the cost. But the benefit in the case where you do need them can be huge. And in that case, make sure you get the whole transaction package from your implementation of choice and be wary of those suppliers who try to tell you that recovery isn't important. If you're sure you don't need recovery, then you shouldn't be considering using transactions either.</p>]]>
</content>
</entry>
<entry>
<title>A session &amp; context concept for Web Services</title>
<link rel="alternate" type="text/html" href="http://weblogs.java.net/blog/marklittle/archive/2005/05/a_session_conte.html" />
<modified>2008-01-02T17:42:16Z</modified>
<issued>2005-05-07T22:44:20Z</issued>
<id>tag:weblogs.java.net,2005:/blog/marklittle/237.2406</id>
<created>2005-05-07T22:44:20Z</created>
<summary type="text/plain">In this entry I&apos;ll take a look at the OASIS WS-Context specification that&apos;s being developed by Sun, Oracle and others. It&apos;s an important component in the Web Service stack and unlike some aspects of WS-Addressing, facilitates the loosely coupled nature of SOA.</summary>
<author>
<name>marklittle</name>

<email>mark.little@arjuna.com</email>
</author>
<dc:subject>Web Services and XML</dc:subject>
<content type="text/html" mode="escaped" xml:lang="en" xml:base="http://weblogs.java.net/blog/marklittle/">
<![CDATA[<p>One of the common features of all middleware systems is support for the <i>session concept</i>.  You don't have to look far to see this: J2EE, CORBA and going back further to systems such as Emerald, Argus and Camelot. A session is a mechanism for correlating multiple messages in order to achieve some application-visible semantic. But strange as it may sound, until recently there wasn't such a concept for Web Services.</p>

<p>In August 2003, Sun, Oracle, Arjuna Technologies, IONA Technologies and Fujitsu, released the <a href="http://www.infoworld.com/article/03/07/28/HNtransactionspec_1.html">Web Services Composite Application Framework</a> family of specifications. Shortly afterwards, these specifications were given to <a href="http://www.oasis-open.org">OASIS</a> to form the <a href="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ws-caf">OASIS WS-CAF</a> technical committee.</p>

<p>In future entries I may look at the other specifications, but the one I'd like to concentrate on now is <a href="http://www.oasis-open.org/committees/download.php/12416/WS-Context.zip">WS-Context</a>.</p>]]>
<![CDATA[<p>You might think that sessions are something so fundamental that you simply can't live without them, so why were Web Services special? Why didn't such a capability exist until 2003? The answer is that it did, but it was done on an ad hoc basis by pretty much every specification or implementation that needed it. What Sun, Oracle et al set out to do with WS-Context was to standardise this for interoperability and reusability.</p>

<p>Then of course there was the infamous ReferenceProperties in <a href="http://www.w3.org/Submission/ws-addressing/">WS-Addressing</a>. This allowed additional information within an address to be encoded in an opaque manner, but which could be (and was) used to create sessions. The problems with this have been discussed many times elsewhere, but I recommend checking out <a href="http://www.orablogs.com/pavlik/archives/The-Session-Concept-in-Web-Services.pdf">this</a> paper for a summary. Fortunately ReferenceProperties were <a href="http://www.w3.org/2002/ws/addr/wd-issues/#i001">removed</a>, though ReferenceParameters still exist.</p>

<p>WS-Context provides a more lightweight, generalized session model, somewhat akin to that of a Web Services cookie. The Web Services architecture is not prescriptive about what happens behind service endpoints. This gives flexibility of implementation, allowing systems to adapt to changes in requirements, technology etc. without directly affecting users. It also means that issues such as whether or not a service maintains state on behalf of users or their (temporally bounded) interactions, has been an implementation choice not typically exposed to users. The WS-Context session model encourages loose coupling of services and their users and keeps any implementation specific choices about state where they belong: behind the service endpoint.</p>]]>
</content>
</entry>

</feed>