
The story of the write cache and half a worm

Posted by davidvc on October 11, 2005 at 12:16 PM EDT

We’ve been doing some performance testing of Derby, and we discovered something that I suspect many of you out there may not be aware of. I know it caught some of us by surprise, and we deal with databases all the time.

First of all, let’s talk about your data. I think most of you agree that when you store your data in a transactional database, you expect it to meet the transactional guarantees of atomicity, consistency, isolation and durability. I mean, why else go through the trouble of using a database? For example, if you commit a transaction and your database says “OK”, and the database crashes in the next moment, then when it comes back up the data should (a) still be there and (b) be consistent (finding half a worm in your apple is worse than finding a whole one).

In another blog I talked about how database systems sometimes don’t provide that guarantee. They either aren’t transactional at all, or by default they don’t write the log record to disk as part of the commit. They take care of it after the commit, in a sort of lazy way that significantly improves throughput. It reminds me of a teenager promising to clean their room “later.” Maybe they will, maybe they won’t. They even call it a “lazy write,” and I get this image of the log subsystem hanging out on the bed reading comic books. The problem with this approach is that “later” never happens if there is a crash before the system gets around to writing the log to disk.
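To make the difference concrete, here is a minimal sketch of the two commit styles. It is a toy, not Derby’s actual logging code: the class `CommitLogSketch` and its methods `commitDurably` and `commitLazily` are names I made up for illustration. The only real API it relies on is `FileChannel.force` from `java.nio`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical toy log writer for illustration; not how Derby implements its log.
final class CommitLogSketch implements AutoCloseable {
    private final FileChannel channel;

    CommitLogSketch(Path logFile) throws IOException {
        // Open (or create) the log file and always write at its end.
        channel = FileChannel.open(logFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Durable style: do not return (do not say "OK") until we have asked
    // the operating system to push the log record out to the storage device.
    void commitDurably(String record) throws IOException {
        append(record);
        channel.force(true); // flush file content and metadata
    }

    // Lazy style: return as soon as the record is handed to the OS.
    // The flush happens "later", if nothing crashes first.
    void commitLazily(String record) throws IOException {
        append(record);
    }

    private void append(String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

The lazy version is faster precisely because it skips the `force` call. Note that `force` only asks the layers underneath to write the data out. Whether they actually do is another matter.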

Well, it turns out that some operating systems and hard drives play the “later” game with you as well. At this level, the game is played by enabling the write cache by default, either in the operating system or within the disk controllers. Linux and Windows have the write cache enabled by default, as do ATA drives and even RAID controllers. Solaris has its write cache turned off by default, and will also try to turn off the write caches on any drives attached to the system. (I have heard from my contacts in Solaris that many ATA drives don’t even let the operating system turn off the write cache: you set the flag to turn the cache off, but the drive controller basically ignores you. ATA vendors do not even certify their disks for recovery with the write cache turned off.)

Minor detail: if your disk crashes or there is a power failure, you’ve lost some of your data. Actually, with a write cache it’s even worse than if the database were doing the caching, because you can also lose consistency (half a worm). Your filesystem or database can become corrupted. At least with the database-level optimizations, where the log is written lazily, you are guaranteed consistency, if not durability.

The vendors know they do this, but it’s not very well publicized. Why quietly enable it by default (or even prevent you from disabling it)? It would seem the right thing to do for customers would be to have the write cache off by default and let them turn it on if they want to. You know, opt-in instead of opt-out.

One can only guess, but my strong suspicion is that the reason behind this approach is, pure and simple, marketing. When you test with the default configuration, your write-intensive applications scream. They blow the competition out of the water.

So, you have two choices. You can either turn off the write cache and suffer the performance drop in the name of consistency and durability, or leave it on and take the risk that your database may become corrupted. You can mitigate the risk using tricks like backup power supplies, but the risk is still there. Not a fun choice to make, but it would be great if this were a conscious choice by the customer instead of one the vendors make for you.

There is another very important point here. Because Solaris turns the write cache off by default while other systems leave it on, it is very easy to get the wrong impression about Solaris running with SCSI drives when comparing it with other operating systems and disk types. So please, when running database or other write-intensive performance comparisons, make sure you have the write cache on or off consistently. Otherwise you’re comparing (wormy) apples to oranges.
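If you are not sure what a given machine is really doing, one rough sanity check is to time small writes that are each followed by a forced flush. The sketch below does that. The class `SyncWriteProbe` and the method `averageSyncWriteMicros` are hypothetical names of mine, and this is a crude probe rather than a real benchmark.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical probe for illustration: times write-then-flush round trips.
final class SyncWriteProbe {

    // Rewrites the same 512-byte block `writes` times, forcing each write out,
    // and returns the average time per write in microseconds.
    static double averageSyncWriteMicros(Path file, int writes) throws IOException {
        if (writes <= 0) {
            throw new IllegalArgumentException("writes must be positive");
        }
        ByteBuffer block = ByteBuffer.allocate(512);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();
            for (int i = 0; i < writes; i++) {
                block.clear();
                long offset = 0;
                while (block.hasRemaining()) {
                    offset += ch.write(block, offset); // always start at offset 0
                }
                ch.force(true); // ask the OS to flush before the next write
            }
            long elapsedNanos = System.nanoTime() - start;
            return elapsedNanos / 1000.0 / writes;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass a path on the disk you want to test; a temp file is only a fallback
        // and may live on a memory-backed filesystem.
        Path file = args.length > 0 ? Path.of(args[0]) : Files.createTempFile("sync-probe", ".dat");
        try {
            double micros = averageSyncWriteMicros(file, 200);
            System.out.printf("average synced write: %.1f microseconds%n", micros);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

As a rule of thumb, a 7200 RPM disk takes about 8.3 milliseconds per rotation, so repeatedly rewriting the same block with a true flush should cost on the order of milliseconds each time. If the probe reports far less than that on a spinning disk, some cache between you and the platters is probably answering “OK” early. Treat the number as a hint rather than proof, and run the same probe on every system you are comparing.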