Truly reliable software?

Posted by joshy on September 21, 2003 at 6:33 PM PDT

I've been thinking. We have lots of software crashing these days. Some due to bugs. Some due to viruses and worms. Some due to hardware failure. And yet software is becoming more common and more important than ever before. So what can we do to make software more reliable? Can it be 100% reliable when it is written by fallible humans?

Mental exercise: how do you make truly 100% reliable software? And I mean hardcore 100%, like people die if it fails or goes offline. Assume money is no object (and saying "write perfect code" doesn't count :).

Here are some rather radical ideas:

  • Complete runtime software loading. Make every module reloadable at runtime, with rollback. No software is perfect, so upgrading without restarting is essential. But how do you write software that only starts up once?
  • Extremely distributed computing. Have three copies of each module running on each server, three servers in each location, and three locations. Average the results before sending them out. It would take a huge disaster to take those puppies out.
  • Multiple implementations. Have three teams develop three codebases that all fit a standard API. If the three disagree at some point, go with the majority and report the error. This protects against viruses that attack flaws in particular implementations.
  • Constant runtime testing. We may never be able to mathematically prove software correct, since any set of constraints complex enough to fully test the software would itself be software, just in another form. However, unit tests and end-to-end tests, constantly executed at runtime, go a long way towards showing the software is up and running properly.
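
To make the "go with the majority and report the error" idea a bit more concrete, here is a rough sketch in Python. Everything in it is made up for illustration: majority_vote and the three square_team_* functions are hypothetical names standing in for three independently written codebases behind one API, and one of them has a bug planted on purpose. It is a toy under those assumptions, not a recipe for a real life-critical system.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple


def majority_vote(
    implementations: Sequence[Callable[[int], Hashable]], value: int
) -> Tuple[Hashable, List[int]]:
    """Run every implementation on the same input and pick the majority answer.

    Returns (result, dissenters), where dissenters lists the indexes of
    implementations that crashed or disagreed with the majority.
    Raises RuntimeError when no strict majority exists.
    """
    outcomes = []
    for impl in implementations:
        try:
            outcomes.append(("ok", impl(value)))
        except Exception as exc:
            # A crash is recorded but never counted as a vote.
            outcomes.append(("error", repr(exc)))

    votes = Counter(o for o in outcomes if o[0] == "ok")
    if not votes:
        raise RuntimeError("every implementation failed")

    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(implementations):
        raise RuntimeError("no majority among implementations")

    dissenters = [i for i, o in enumerate(outcomes) if o != winner]
    return winner[1], dissenters


# Three hypothetical "team" implementations of the same API.
def square_team_a(x: int) -> int:
    return x * x


def square_team_b(x: int) -> int:
    return x ** 2


def square_team_c(x: int) -> int:
    # Deliberate bug for the demo: wrong answer when x is 3.
    return x * x + 1 if x == 3 else x * x


if __name__ == "__main__":
    teams = [square_team_a, square_team_b, square_team_c]
    result, dissenters = majority_vote(teams, 3)
    print(result, dissenters)
```

The caller still gets an answer when one codebase is wrong, and the dissenters list is what you would log or page someone about. Note that the sketch refuses to answer when there is no strict majority, which is exactly the case where "100% reliable" gets hard.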

- joshua