David Van Couvering

David Van Couvering's Blog

JPA and Rollbacks - Not Pretty

Posted by davidvc on April 27, 2007 at 05:03 PM | Comments (16)

We all want our interactions with the database to be successful. And most demos and code samples have everything going hunky dory. But what happens if they're not? What if you or the database needs to roll back the transaction? With JPA, if you're not careful, things can get pretty ugly.

Here it is from the horse's mouth:

3.3.2 Transaction Rollback

For both transaction-scoped and extended persistence contexts, transaction rollback causes all pre-existing managed instances and removed instances to become detached. ... Transaction rollback typically causes the persistence context to be in an inconsistent state at the point of rollback. In particular, the state of version attributes and generated state (e.g., generated primary keys) may be inconsistent. Instances that were formerly managed by the persistence context (including new instances that were made persistent in that transaction) may therefore not be reusable in the same manner as other detached objects—for example, they may fail when passed to the merge operation.

So, what the spec says is that on rollback basically you are SOL -- your objects are now detached and broken. A JPA rollback is like a tornado that sweeps through your JPA house and now everything is twisted and ruined, and you have to try and clean it all up.

As someone coming from a database background, where rollbacks are almost synonymous with consistency, this was a bit of a shocker.

I asked both the Glassfish and OpenJPA communities about this, and they both confirmed that yes, this was the case.

I tried to get some guidance on how you can write a reliable JPA app when you have these kinds of semantics. I didn't get a lot of satisfactory answers, but then I got this great answer from Ecmel Ercan on the Glassfish persistence alias, and I think he really nails the right design pattern. Basically, you never keep JPA objects attached or "managed" between transactions.

Ecmel writes:

Hi,

I use JPA in Java SE, and after some trial and error I finally decided to use detached objects, considering the following assumptions:

1. EntityManagerFactory (emf) is thread safe.
2. EntityManager (em) is NOT thread safe but it is cheap to create one whenever you need.
3. NO exception in JPA is recoverable.

So, I use only one emf per application and whenever I need to access persistence context I do the following:

EntityManager em = emf.createEntityManager();
EntityTransaction tx = null;
try {
  tx = em.getTransaction();
  tx.begin();
  ...
  ...
  tx.commit();
} catch(Exception ex) {
  if(tx != null && tx.isActive()) tx.rollback();
} finally {
  em.close();
}

As soon as em.close() is called, your POJOs will become detached so there is no need to copy them explicitly.
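
The pattern above can be wrapped in a small helper so that every unit of work gets its own short-lived entity manager. What follows is only a sketch under stated assumptions: `Emf`, `Em` and `Tx` are hypothetical stand-ins that mirror just the `EntityManagerFactory`, `EntityManager` and `EntityTransaction` methods used above, so the example runs without a JPA provider on the classpath. `TxTemplate` and `FakeEmf` are made-up names as well. Unlike the snippet above, the sketch rethrows after rolling back, so the caller finds out that the work failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-ins (assumptions): they copy only the javax.persistence method names used above.
interface Tx { void begin(); void commit(); void rollback(); boolean isActive(); }
interface Em { Tx getTransaction(); void close(); }
interface Emf { Em createEntityManager(); }

// Hypothetical helper: one entity manager and one transaction per unit of work.
final class TxTemplate {
    private final Emf emf;

    TxTemplate(Emf emf) { this.emf = emf; }

    <T> T inTransaction(Function<Em, T> work) {
        Em em = emf.createEntityManager();
        Tx tx = em.getTransaction();
        try {
            tx.begin();
            T result = work.apply(em);
            tx.commit();
            return result;
        } catch (RuntimeException ex) {
            // Roll back if the transaction is still open, then let the caller see the failure.
            if (tx.isActive()) tx.rollback();
            throw ex;
        } finally {
            // Closing the entity manager is what detaches everything it managed.
            em.close();
        }
    }
}

// Hypothetical fake that records the calls it receives, for trying the helper out.
final class FakeEmf implements Emf {
    final List<String> log = new ArrayList<>();

    public Em createEntityManager() {
        return new Em() {
            private boolean active = false;
            private final Tx tx = new Tx() {
                public void begin() { active = true; log.add("begin"); }
                public void commit() { active = false; log.add("commit"); }
                public void rollback() { active = false; log.add("rollback"); }
                public boolean isActive() { return active; }
            };
            public Tx getTransaction() { return tx; }
            public void close() { log.add("close"); }
        };
    }
}
```

With the real API you would pass an `EntityManagerFactory` instead of the stand-in, and a call would look something like `template.inTransaction(em -> em.find(...))`, with whatever it returns already detached by the time you get it.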

Note that there are consequences of using detached objects all of the time. Every time you merge there is a full copy from your detached object to a separate managed object. Depending on the number of objects you are managing, this can get expensive. Update: a comment by rbygrave warns that using detached objects means you need a field on your JPA objects with the @Version annotation so you can correctly handle optimistic concurrency semantics.
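
To see why the version field matters once objects live outside the persistence context, here is a toy model of the check. It is a sketch, not JPA: `Account` and `Store` are hypothetical classes, the "database" is a map, and the version comparison is done by hand. A real provider performs an equivalent check for entities that have a `@Version` field and reports a conflict with `javax.persistence.OptimisticLockException`; the plain `IllegalStateException` below is just a stand-in for that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity with a hand-rolled version counter.
final class Account {
    final long id;
    int balance;
    long version;

    Account(long id, int balance, long version) {
        this.id = id;
        this.balance = balance;
        this.version = version;
    }

    Account copy() { return new Account(id, balance, version); }
}

// Hypothetical in-memory store standing in for the database plus the provider.
final class Store {
    private final Map<Long, Account> rows = new HashMap<>();

    void insert(Account a) { rows.put(a.id, a.copy()); }

    // Hands out a copy, much like an object that was detached after loading.
    Account load(long id) { return rows.get(id).copy(); }

    // Copies detached state back in, but only if nobody else updated the row meanwhile.
    Account merge(Account detached) {
        Account current = rows.get(detached.id);
        if (current.version != detached.version) {
            throw new IllegalStateException("stale object: expected version "
                    + current.version + " but got " + detached.version);
        }
        current.balance = detached.balance;
        current.version++;
        return current.copy();
    }
}
```

If two callers load the same account and both merge, the second merge fails instead of silently overwriting the first. Without the version comparison, the second caller's stale copy would simply win, which is the lost update rbygrave describes.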

Also, JPA says you cannot rely on lazily loaded attributes being available in a detached object. The default behavior of JPA is eager loading. Update: the exceptions are @OneToMany and @ManyToMany associations, which are lazy by default (thanks greenhorn). Lazy loading for these probably makes sense because otherwise you could be loading very large object graphs. But this means if you're using detached objects, you have to manually "fault in" associated object collections before the objects become detached. Not really fun, but this may be the cost of having any kind of guarantee that you are working with data that is consistent and reliable.
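
"Faulting in" just means touching the lazy data while the object is still managed. The sketch below is a toy model of that idea and nothing more: `LazyRef` is a hypothetical class, the loader is an ordinary `Supplier`, and the exception on late access is my own choice for illustration. With a real provider, what happens when you touch unloaded state on a detached object is provider-specific (Hibernate, for instance, throws its `LazyInitializationException`).

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a lazily loaded association.
final class LazyRef<T> {
    private Supplier<T> loader;   // non-null only while "managed"
    private T value;
    private boolean loaded;

    LazyRef(Supplier<T> loader) { this.loader = loader; }

    // First access while managed runs the loader; later calls reuse the value.
    T get() {
        if (!loaded) {
            if (loader == null) {
                throw new IllegalStateException("not loaded before detach");
            }
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    // Models detachment: after this call, nothing new can be loaded.
    void detach() { loader = null; }
}
```

A collection that was read before `detach()` stays usable afterwards, while one that was never touched is gone for good. In real code the equivalent of `get()` is usually something as small as calling `size()` on the collection before the entity manager is closed.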

Don't get me wrong, I love JPA. I think it takes a whole chunk of coding and maintenance out of your system. But there are definitely some areas where it needs some work, and this is one of them (the other area that really bugs me is you have to manage your own relationships).

An encouraging note from Patrick Linskey over at OpenJPA and a member of the JPA expert group (from my OpenJPA email thread on this): Yeah, it's not ideal, and is something we're hoping to resolve in the next version of the spec.

So, keep your eyes out, let's be careful out there, and let those JPA folks know (loudly if necessary) when you see a problem. Hopefully over time these kinks will be worked out.

Update: A number of folks have concurred this is the best way to work with JPA, including someone saying this is how Spring does it. To me this is astounding, that the right model for working with JPA includes detaching objects at the end of each transaction. As someone who worked for years on network protocols and performance, and has had drilled into me the cost of data copies between protocol layers, this performance hit as a matter of policy is hard to swallow. I do hope this gets fixed soon.


Comments
Comments are listed in date ascending order (oldest first)

  • You guys are doing some very interesting blogs about JPA, thanks. The "rollback tornado" is actually one of the points customers ask about most. While it looks like I've always answered with the right stuff, I'm glad to have a clear reference now. :-)

    And yes, I love JPA too but they really need to improve things in this area.

    Posted by: fabriziogiudici on April 28, 2007 at 01:54 AM

  • Detaching objects at the end of the transaction is exactly the model that the Spring Framework has encouraged for a long time with their TransactionProxyFactoryBean. Scoping the Hibernate session ("persistence context" in JPA?) to the outermost transaction has worked very well for me.

    Posted by: rhasselbaum on April 28, 2007 at 07:09 AM


  • I think there should be a caveat put on this approach (correct me if I'm wrong).


    That is this will work as long as your Entity beans have a @Version property. Otherwise you may lose your optimistic concurrency checking (and potentially get lost updates etc).

    Posted by: rbygrave on April 29, 2007 at 03:59 AM

  • JPA is easing (as is the whole Java EE 5 stack) the development with these technologies and concepts. Still not everything can be approached that easily. Nice post on addressing these issues...
    Otherwise, small precision, @OneToMany and @ManyToMany associations are lazy by default...
    Alex

    Posted by: greenhorn on April 30, 2007 at 11:00 AM


  • This is potentially slightly off topic, but if you are starting to think along these lines (of always using detached beans) - then I'd suggest you have a quick look at Ebean at www.avaje.org (open source).


    The idea of Ebean is to be like JPA but without Session objects (EntityManager) and without the concept of attached/detached beans. If you are writing code like the above, then the way Ebean works will likely closely match your thinking.


    Cheers, Rob.

    Posted by: rbygrave on April 30, 2007 at 11:29 PM

  • I am not sure I really understand the problem. When I use SQL, read some info from a result set and execute some updates, and then roll back, my data (at the object level) should be considered stale and read again in a new transaction. This is pretty much the JPA approach.

    I tend to not like detached objects since they come with a lot of drawbacks.
    The proper approach is to use the long conversation pattern.
    Basically, load objects and update them without flushing them until the last transaction, which will be the actual conversation commit.*
    And you don't have to bother with the pattern too much: it is implemented with extended persistence contexts in SFSB.
    If something goes wrong before the commit, discard your persistence context. This is exactly what you would do in a plain SQL code.

    * Note that some persistence providers will force you not to use a transaction at all except for the last request response challenge if they do not support FlushMode.MANUAL. This is really something that should have ended up in the spec in the first place, but oh well.

    Posted by: epbernard on May 01, 2007 at 08:38 AM

  • I think JDO can help you out here.
    It has sophisticated rollback handling capabilities, including options to configure whether objects are detached upon commit (see option javax.jdo.option.DetachAllOnCommit), and whether values are to be restored to what they were at the beginning of the transaction (see option javax.jdo.option.RestoreValues).
    Exceptions thrown during rollback also include the failed objects, so that you can more intelligently handle the exceptions. Further, some exceptions indicate conditions that can't be retried (derived from javax.jdo.JDOFatalException) versus those that can be retried (javax.jdo.JDOCanRetryException).
    Also, JDO supports but does not require that your objects have a field (or fields) to be used for optimistic consistency checks (or identity, for that matter).
    With regard to fetching, JDO supports flexible fetch plans (javax.jdo.FetchPlan) by combining fetch groups together additively. This allows you to precisely define the boundary of the object graph to be detached. Any field that you try to access once an object is detached throws javax.jdo.JDODetachedFieldAccessException.
    In addition to these options, you'll find reasonable defaults for most of this behavior in JDO. Check it out -- I think you'll be quite satisfied.
    -matthew

    Posted by: matthewadams on May 01, 2007 at 09:09 AM


  • epbernard: What you're saying makes sense, but since we have JPA, doesn't it make sense for JPA to handle this for you, rather than making you do it manually? They handle so many other things for you, why not this too? Note (see following comment from matthewadams) that JDO does this for you (option javax.jdo.option.RestoreValues).

    matthewadams: Well, JDO definitely has some advantages and a little more history behind it. For better or worse, however, J2EE and the major vendors are putting their wood behind JPA, so we need to work with it and try and improve it. Some of the JDO folks are in the JPA expert group, so they can bring their experience to bear...

    Posted by: davidvc on May 01, 2007 at 02:37 PM

  • I fail to understand what you want to achieve :-)
    Do you want JPA to give you back the object graph as it was before?

    But what is "before"? The Tx start? The second request response of your stateful SB? What about detached object "reattached", by cascade?

    And more importantly what for? What are your needs?

    For the record, we implemented that in Hibernate Core a long time ago and decided to never commit it; it was a bad and undefined concept.

    Posted by: epbernard on May 02, 2007 at 09:12 AM


  • epbernard: Well, I was thinking the pretty basic "restore managed objects (and all objects in the persistence context) back to the way they were before the transaction." It's what I would have to do manually, so why can't JPA do this? I wouldn't expect detached objects to become re-attached, nor would I expect any semantics beyond the scope of the current transaction.

    I checked the JDO API docs on RestoreValues to see exactly what they do, and it appears to be what I'm thinking. Since JDO seems to have found a solution for this that works and isn't a "bad and undefined concept," maybe JPA can take a page out of their book. Unless folks using JDO can tell me that this doesn't actually work correctly - I'd love to hear if people out there in the field have used RestoreValues successfully.

    Posted by: davidvc on May 02, 2007 at 10:28 AM

  • It does not especially make sense when you have a long conversation. Why would I want to come back to the state when the transaction started? Why not one step ahead in my conversation? Probably when the transaction starts the object state is already "broken", so coming back to that state does not give any advantage.

    Remember, you cannot claim atomicity if you change the DB state in several transactions. So every change has to be done in one (the last one) transaction.

    FTR, what should be the state of those operations according to the semantic you are pushing?

    //state1
    Customer cust = new Customer();
    cust.set... //set some states
    Order order = new Order();
    //state2
    order.set //set some states

    Actor a = new Actor(..set some state);

    em.persist(a);

    //state3

    transaction.begin()
    cust.set... //set some additional states
    em.persist(cust);
    cust.set... //change some state
    //state 4
    order.set... //change some state (order still unattached)
    em.persist(order);
    //state 5
    order.set... //set some additional states
    //state 6
    transaction.rollback();

    Which state should I roll back to? 1, 2, 3, 4, 5 or 6? Is any one more legitimate than the others from the application point of view?
    - state1 (ie no state)
    - state3 for customer but state5 for order? What if they are related somehow by some correlated data?
    - ...

    Rolling back the persistence state is important and guaranteed, rolling back the object values based on some arbitrary boundaries is a weak contract.

    Posted by: epbernard on May 02, 2007 at 02:30 PM

  • Reposting with some indentation

    It does not especially make sense when you have a long conversation. Why would I want to come back to the state when the transaction started? Why not one step ahead in my conversation? Probably when the transaction starts the object state is already "broken", so coming back to that state does not give any advantage. Remember, you cannot claim atomicity if you change the DB state in several transactions. So every change has to be done in one (the last one) transaction. FTR, what should be the state of those operations according to the semantic you are pushing?
    //state1
    Customer cust = new Customer();
    cust.set... //set some states
    Order order = new Order();
    //state2
    order.set //set some states
    Actor a = new Actor(..set some state);
    em.persist(a);
    //state3
    transaction.begin();
    cust.set... //set some additional states
    em.persist(cust);
    cust.set... //change some state
    //state 4
    order.set... //change some state (order still unattached)
    em.persist(order);
    //state 5
    order.set... //set some additional states
    //state 6
    transaction.rollback();

    Which state should I roll back to? 1, 2, 3, 4, 5 or 6? Is any one more legitimate than the others from the application point of view?
    - state1 (ie no state)
    - state3 for customer but state5 for order? What if they are related somehow by some correlated data?
    - ...


    Rolling back the persistence state is important and guaranteed, rolling back the object values based on some arbitrary boundaries is a weak contract.

    Posted by: epbernard on May 02, 2007 at 02:34 PM


  • epbernard: I see your point, given the example you have above. I guess what strikes me as odd is modifying your state partially outside of the transaction scope and partially inside it.

    If it were me, I would probably set all my state before starting the transaction, and then very quickly do em.persist for all my objects and then commit the transaction. This way I'm holding the transaction for the shortest period of time. I'm trying to think of an example where I would have a real reason to modify object state after I start the transaction...

    If we assume the model I propose works in most cases, then providing a feature where the persistence context is reverted back to where it was just before I call commit seems pretty darn useful.

    Posted by: davidvc on May 02, 2007 at 03:34 PM

  • My example was simplified, think about a service calling several DAOs. All inside one transaction.

    You never change some state based on what you retrieve from a DB query in the same transaction? Because if you do, you have your example ;-)

    Posted by: epbernard on May 02, 2007 at 05:09 PM

  • I responded to this thread at serverside.com. I am reposting it here to see if I can get some answers.

    Discarding the EntityManager is not a problem when the transaction has to go back to the point where it started. However, these are the problems I face, which I guess can be solved only if the object state rollback is done properly in the first place:

    1) Undo operations/Savepoints with JPA: Let's take the case of a long running conversation where several operations (as individual web requests) have been carried out by the user to reach a particular object state (and not yet committed to the database). From here, if I want to undo the changes that I made during the last operation alone, I have no way of doing it with JPA. I guess JDO also does not support this at this point in time. But my understanding is that JDO is more likely to support this because it tracks changes, possibly with field interceptors (and could stack these changes as savepoints).

    2) Incremental UI updates in AJAX with rich client state: AJAX applications with rich client state need only the incremental changes that happened during the request (and not between save/commit). Otherwise we will end up sending the entire (or relevant subset) object tree to the UI every time. This is inefficient and also requires the AJAX applications to render the whole UI (making it merely a client side HTML renderer). Again, this is not supported in JDO either, but could be a matter of exposing the savepoint changes.

    Posted by: mdoraisamy on May 03, 2007 at 03:01 AM

  • epbernard: I think you have some good points. For the sake of the community, I would like to come up with some clear guidelines and example code for what to do on rollback. Notice that many folks have taken the detached object approach. Perhaps you can document with some example code the approach you take.

    mdoraisamy: I don't personally have an answer about savepoints; it sounds like you don't have this as an option. If you really need it you might need to revert to (gack) JDBC.

    But from an architectural perspective, I'd like to ask if there is a way you can re-architect your protocol so that you do not have a conversation across multiple requests. Take a look at the REST model of protocol design - one principle is to not have conversational state, as it makes you run into exactly the problems you are having, and also impacts the scalability of your system. I know it's not always possible to just wave your hands and re-design your entire protocol, but it's something to consider IMHO.

    Posted by: davidvc on May 03, 2007 at 01:52 PM




