JPA and Rollbacks - Not Pretty

Posted by davidvc on April 27, 2007 at 5:03 PM PDT

We all want our interactions with the database to be successful. And most demos and code samples have everything going hunky dory. But what happens when things go wrong? What if you or the database needs to roll back the transaction? With JPA, if you're not careful, things can get pretty ugly.

Here it is from the horse's mouth:


3.3.2 Transaction Rollback

For both transaction-scoped and extended persistence contexts,
transaction rollback causes all pre-existing managed instances and
removed instances to become detached. ... Transaction rollback typically causes the persistence
context to be in an inconsistent state at the point of rollback. In
particular, the state of version attributes and generated state (e.g.,
generated primary keys) may be inconsistent. Instances that were
formerly managed by the persistence context (including new instances
that were made persistent in that transaction) may therefore not be
reusable in the same manner as other detached objects—for example, they may fail when passed to the merge operation.

So, what the spec says is that on rollback basically you are SOL -- your objects are now detached and broken. A JPA rollback is like a tornado that sweeps through your JPA house and now everything is twisted and ruined, and you have to try and clean it all up.

As someone coming from a database background, where rollbacks are almost synonymous with consistency, this was a bit of a shocker.

I asked both the Glassfish and OpenJPA communities about this, and they both confirmed that yes, this was the case.

I tried to get some guidance on how you can write a reliable JPA app when you have these kinds of semantics. I didn't get a lot of satisfactory answers, but then I got this great answer from Ecmel Ercan on the Glassfish persistence alias, and I think he really nails the right design pattern. Basically, you never keep JPA objects attached or "managed" between transactions.

Ecmel writes:


Hi,

I use JPA in Java SE, and after some trial and error I finally
decided to use detached objects, considering the following
assumptions:

1. EntityManagerFactory (emf) is thread safe.


2. EntityManager (em) is NOT thread safe but it is cheap to create one
whenever you need.


3. NO exceptions in JPA is recoverable.

So, I use only one emf per application and whenever I need to access
persistence context I do the following:


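EntityManager em = emf.createEntityManager();
EntityTransaction tx = null;
try {
  tx = em.getTransaction();
  tx.begin();
  ...
  ...
  tx.commit();
} catch(Exception ex) {
  // roll back only if the transaction was started and has not completed;
  // note that ex is swallowed here, so log or rethrow it in real code
  if(tx != null && tx.isActive()) tx.rollback();
} finally {
  // closing the EntityManager detaches everything it was managing
  em.close();
}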

As soon as em.close() is called, your POJOs will become detached so
there is no need to copy them explicitly.
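
If you end up repeating Ecmel's try/catch/finally block everywhere, it can be folded into one helper. What follows is only a sketch under stated assumptions: Tx, Em and Emf are tiny stand-in interfaces I made up so the example compiles without a JPA provider on the classpath (they mirror only the EntityTransaction, EntityManager and EntityManagerFactory methods used above), and TxTemplate and FakeEmf are hypothetical names. Unlike the snippet above, the helper rethrows the exception after rolling back, so the caller finds out that the work failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-ins (assumptions) that mirror only the javax.persistence methods used above.
interface Tx { void begin(); void commit(); void rollback(); boolean isActive(); }
interface Em { Tx getTransaction(); void close(); }
interface Emf { Em createEntityManager(); }

final class TxTemplate {
    private TxTemplate() {}

    /** Runs the work in a fresh entity manager and transaction, and always closes the manager. */
    static <T> T inTransaction(Emf emf, Function<Em, T> work) {
        Em em = emf.createEntityManager();
        Tx tx = null;
        try {
            tx = em.getTransaction();
            tx.begin();
            T result = work.apply(em);
            tx.commit();
            return result;
        } catch (RuntimeException ex) {
            // roll back only if the transaction was started and has not completed
            if (tx != null && tx.isActive()) tx.rollback();
            throw ex; // rethrow so the caller knows the work failed
        } finally {
            em.close(); // anything loaded through em is detached from here on
        }
    }
}

/** Hypothetical in-memory fake that records the call sequence, for demonstration only. */
final class FakeEmf implements Emf {
    final List<String> log = new ArrayList<>();

    @Override public Em createEntityManager() {
        log.add("create");
        return new Em() {
            private boolean active;
            private final Tx tx = new Tx() {
                @Override public void begin() { active = true; log.add("begin"); }
                @Override public void commit() { active = false; log.add("commit"); }
                @Override public void rollback() { active = false; log.add("rollback"); }
                @Override public boolean isActive() { return active; }
            };
            @Override public Tx getTransaction() { return tx; }
            @Override public void close() { log.add("close"); }
        };
    }
}
```

With the fake, a successful call records create, begin, commit, close, and a failing one records create, begin, rollback, close before the exception reaches the caller. In a real application you would drop the stand-ins and pass your single application-wide EntityManagerFactory instead.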

Note that there are consequences of using detached objects all the time. Every time you merge there is a full copy from your detached object to a separate managed object. Depending on the number of objects you are managing, this can get expensive. Update: a comment by rbygrave warns that using detached objects means you need a field annotated with @Version on your JPA objects so that optimistic concurrency is handled correctly.
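
To make the version point concrete, here is a toy model in plain Java. It is not how any JPA provider is implemented: Account and ToyStore are made-up names, the version field merely plays the role of a @Version attribute, and the exception type is a stand-in (a real provider signals this case with OptimisticLockException). It shows the two things mentioned above: merge copies state from the detached object into a separate stored copy, and the version check is what stops a stale detached object from silently overwriting a newer update.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain object standing in for an entity; version plays the role of a @Version field. */
final class Account {
    final long id;
    int balance;
    long version;

    Account(long id, int balance, long version) {
        this.id = id;
        this.balance = balance;
        this.version = version;
    }

    Account copy() { return new Account(id, balance, version); }
}

/** Toy store that mimics merge with a version check. */
final class ToyStore {
    private final Map<Long, Account> rows = new HashMap<>();

    void insert(Account a) { rows.put(a.id, a.copy()); }

    /** Hands out a detached copy, much like an entity after em.close(). */
    Account load(long id) { return rows.get(id).copy(); }

    /** Copies the detached state into the stored row, rejecting stale versions. */
    Account merge(Account detached) {
        Account current = rows.get(detached.id);
        if (current.version != detached.version) {
            throw new IllegalStateException("stale version " + detached.version
                    + ", current is " + current.version);
        }
        current.balance = detached.balance; // the copy that every merge implies
        current.version++;                  // bump so older detached copies are rejected
        return current.copy();
    }
}
```

If two callers load the same account and both try to merge, the first merge succeeds and bumps the version, and the second is rejected. Without the version field the second merge would quietly overwrite the first.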

Also, JPA says you cannot rely on an attribute being available in a detached object if that attribute is lazily loaded. The default behavior of JPA is eager loading (update: except for @OneToMany and @ManyToMany associations, which are lazy by default; thanks, greenhorn). Lazy collections probably make sense, because otherwise you could be loading very large object graphs. But it means that if you're using detached objects, you have to manually "fault in" associated object collections before the objects are detached. Not really fun, but this may be the cost of having any kind of guarantee that you are working with data that is consistent and reliable.
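
Here is what "faulting in" amounts to, again as a toy in plain Java rather than real JPA. ToySession and LazyItems are made-up names, and throwing on late access is my assumption for the sketch; what actually happens when you touch an unloaded lazy attribute on a detached object varies by provider.

```java
import java.util.List;
import java.util.function.Supplier;

/** Stand-in for an open persistence context. */
final class ToySession {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

/** Toy lazy collection: loads on first access, but only while the session is open. */
final class LazyItems {
    private final ToySession session;
    private final Supplier<List<String>> loader;
    private List<String> loaded;

    LazyItems(ToySession session, Supplier<List<String>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<String> get() {
        if (loaded == null) {
            if (!session.isOpen()) {
                throw new IllegalStateException("collection was not loaded before detach");
            }
            loaded = loader.get(); // the "fault in": the data is fetched and kept
        }
        return loaded;
    }
}
```

A collection that was touched while the session was open stays usable after close(), while one that was never touched fails. With real entities the equivalent is typically a call along the lines of order.getItems().size() (hypothetical entity and getter) inside the transaction, before em.close().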

Don't get me wrong, I love JPA. I think it takes a whole chunk of coding and maintenance out of your system. But there are definitely some areas where it needs some work, and this is one of them (the other area that really bugs me is you have to manage your own relationships).

An encouraging note from Patrick Linskey over at OpenJPA, who is also a member of the JPA expert group (from my OpenJPA email thread on this): "Yeah, it's not ideal, and is something we're hoping to resolve in the next version of the spec."

So, keep your eyes open, let's be careful out there, and let those JPA folks know (loudly if necessary) when you see a problem. Hopefully over time these kinks will be worked out.

Update: A number of folks have concurred this is the best way to work with JPA, including someone saying this is how Spring does it. To me this is astounding, that the right model for working with JPA includes detaching objects at the end of each transaction. As someone who worked for years on network protocols and performance, and has had drilled into me the cost of data copies between protocol layers, this performance hit as a matter of policy is hard to swallow. I do hope this gets fixed soon.

Comments

One note on lazy loading: the EclipseLink JPA provider can handle lazy loading on detached objects; it automatically creates a short-lived transaction to initialize them.