Actors in Java: What are they good for?
Exciting information technologies emerge from new discoveries, and re-emerge from past ones, at a rapid rate. Cutting edge technologies debut with glamor and curiosity appeal, but once the hype fades and we look at what is actually there in the cold light of reason, our pragmatic side asks:
“OK, but what advantages do we really get from this?”
This post takes that kind of look at the actor paradigm as it is implemented for Java. Each part begins with a brief exposition of one feature of the actor paradigm and then shows how that feature supports an effective information processing environment. What is presented here touches on some of the more important features of actor systems but is not exhaustive.
Actors, the atomic functional units of the actor paradigm, are not confined to one machine. A traditional object is usually located by a virtual memory address within a single address space. An actor, by contrast, is located by a path composed of a transport protocol, actor system name, hostname:port, guardian actor name, and worker actor name. Because this form of location is blind to physical placement, actors can be arrayed across machines anywhere on the Internet and moved from one host to another, typically by changing configuration rather than application program code. The advantages of being able to move units of computation freely from one machine to another include:
- Smooth, continuous scaling of work load.
- Smooth, continuous scaling outward to geographically remote locations.
- Smooth, continuous scaling to a growing number of diverse platform types.
- Dynamic transfer of work load from one machine to another.
These advantages of location transparency accrue to cloud based applications, applications that are inherently distributed (e.g. hundreds of field offices), merged or collaborating organizations that find themselves with differing computer hardware and operating systems, and applications that have an acute need for load balancing.
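To make the path structure concrete, here is a minimal sketch that splits an actor path into the parts listed above and shows that relocating an actor changes only the host and port. It uses only `java.net.URI` and is not an actor library API. The path layout follows the Akka-style `protocol://system@host:port/guardian/actor` form, which is an assumption since this post does not name a specific library; `ActorLocation`, the system name `fortune`, and the host names are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative holder for the parts of an actor path; not a library class. */
final class ActorLocation {
    final String protocol;        // transport protocol, e.g. "akka.tcp"
    final String systemName;      // actor system name
    final String host;            // hostname of the machine running the system
    final int port;               // port the actor system listens on
    final List<String> elements;  // guardian name followed by actor names

    private ActorLocation(String protocol, String systemName, String host,
                          int port, List<String> elements) {
        this.protocol = protocol;
        this.systemName = systemName;
        this.host = host;
        this.port = port;
        this.elements = Collections.unmodifiableList(elements);
    }

    /** Parses a path such as akka.tcp://fortune@host1.example.com:2552/user/worker. */
    static ActorLocation parse(String path) {
        URI uri = URI.create(path);
        // The URI path starts with '/', so drop it before splitting into names.
        String[] names = uri.getPath().substring(1).split("/");
        // The text before '@' is what java.net.URI calls the "user info".
        return new ActorLocation(uri.getScheme(), uri.getUserInfo(),
                uri.getHost(), uri.getPort(), Arrays.asList(names));
    }

    /** Same logical actor on a different machine: only host and port differ. */
    ActorLocation movedTo(String newHost, int newPort) {
        return new ActorLocation(protocol, systemName, newHost, newPort, elements);
    }

    @Override
    public String toString() {
        return protocol + "://" + systemName + "@" + host + ":" + port
                + "/" + String.join("/", elements);
    }
}
```

Calling `ActorLocation.parse("akka.tcp://fortune@host1.example.com:2552/user/worker").movedTo("host2.example.com", 2552)` yields a path that differs from the original only in its host, which is why code that addresses the actor by its guardian and worker names does not need to change.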
Resource sharing among computations follows three basic models: shared memory, shared disk, and shared nothing. As its name implies, neither disk nor memory is shared between threads of execution in a shared nothing scheme. Traditional object oriented applications mostly combine the shared memory and shared disk models, which typically bring higher ongoing costs than the shared nothing model. This has been argued in papers such as
How to Build a High-Performance Data Warehouse
By David J. DeWitt, Ph.D.; Samuel Madden, Ph.D.; and Michael Stonebraker, Ph.D.
A very successful real world example is Google's adoption of the shared nothing model as the basis for MapReduce, its data processing and programming model, which has allowed the company to reduce the cost of processing very large data sets.
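Actors follow the shared nothing model: each actor keeps its state private and communicates with other actors only by exchanging messages. Data warehouses are an excellent example of applications that can benefit from this characteristic of the actor model.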
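Within a single program, the shared nothing idea shows up as private state plus message passing. The following is a minimal sketch of that shape in plain Java, not the API of any actor library: `TallyActor` and its `Message` type are hypothetical names, and a `BlockingQueue` stands in for the actor's mailbox. The running total is read and written only by the actor's own thread, so no locks on application state are needed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy actor: private state, one thread, and a mailbox. Not an actor library. */
final class TallyActor {
    /** Immutable message; a non-null reply means "report the total". */
    private static final class Message {
        final int amount;
        final CompletableFuture<Integer> reply;

        Message(int amount, CompletableFuture<Integer> reply) {
            this.amount = amount;
            this.reply = reply;
        }
    }

    // The mailbox is the only point of contact between senders and the actor.
    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    private int total; // touched only by the actor's own thread

    private TallyActor() { }

    /** Creates the actor and starts the single thread that drains its mailbox. */
    static TallyActor start() {
        TallyActor actor = new TallyActor();
        Thread worker = new Thread(actor::processMailbox, "tally-actor");
        worker.setDaemon(true); // do not keep the JVM alive for this sketch
        worker.start();
        return actor;
    }

    /** Fire-and-forget: asks the actor to add an amount to its total. */
    void add(int amount) {
        mailbox.add(new Message(amount, null));
    }

    /** Asks the actor for its total; the answer arrives as a message reply. */
    CompletableFuture<Integer> total() {
        CompletableFuture<Integer> reply = new CompletableFuture<>();
        mailbox.add(new Message(0, reply));
        return reply;
    }

    private void processMailbox() {
        try {
            while (true) {
                Message m = mailbox.take(); // one message at a time, in order
                if (m.reply != null) {
                    m.reply.complete(total);
                } else {
                    total += m.amount;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Any number of threads can call `add` concurrently because they only enqueue immutable messages. A real actor runtime adds scheduling, addressing, and supervision on top of this basic shape.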
Supervision in High Availability Transaction Servers
Actors are supervised by superior actors that can detect their death and start a fresh instance to replace the failed actor. This gives actor systems more precise control over concurrent systems that are designed as high availability transaction servers.
A high availability server consists of a cluster of connected machines that can transfer work load among themselves when one machine fails: if a machine goes down, the work load bound for it is redirected automatically to another machine in the cluster, which then processes the work. What improves the cutover from a failed machine to a working one is the fine granularity of control found in actor systems, combined with an enlarged scope of oversight.
When a machine fails in a non-actor high availability transaction server cluster, transactions in flight at the time of failure are typically lost, with the attendant loss of data consistency. In an actor system, the parent of an actor is aware of the child actor's entire life cycle. If a child actor bearing a transaction dies (on a failing platform), the parent never receives, within its timeout, the message from the child that signifies completion of the transaction. The parent therefore knows to create a fresh instance of the transaction bearing child and restart the transaction. This allows actor based high availability transaction servers to cut over from a failed machine to a working one without losing in flight transactions, provided each transaction can safely be run again.
Redirection of transactions from a failed host to a working host can optionally be configured to be managed automatically and transparently by the actor runtime system. In that type of configuration, no human intervention with commands is required to make the cutover.
This feature of actor systems benefits any high availability transaction server cluster.
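The restart-on-timeout behavior described above can be sketched in plain Java. This is an illustration of the idea only, under stated assumptions: `TransactionSupervisor`, `supervise`, and `failuresSeen` are hypothetical names, a thread pool stands in for the actor runtime, and a `Callable` stands in for the transaction bearing child. Real actor libraries provide their own supervision mechanisms.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Toy supervisor: replaces a child that fails or misses its deadline. */
final class TransactionSupervisor {
    // Daemon threads so that a hung child cannot keep the JVM alive.
    private final ExecutorService children = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "child");
        t.setDaemon(true);
        return t;
    });

    int failuresSeen; // number of child instances that crashed or timed out

    /**
     * Runs a child created by childFactory and waits for its completion result.
     * On a crash or a missed deadline, a fresh child is created and started.
     */
    <T> T supervise(Supplier<Callable<T>> childFactory,
                    long timeoutMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Always a fresh instance: no state carries over from a failed child.
            Future<T> child = children.submit(childFactory.get());
            try {
                return child.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (ExecutionException | TimeoutException e) {
                child.cancel(true); // give up on the failed or silent child
                failuresSeen++;
            } catch (InterruptedException e) {
                child.cancel(true);
                Thread.currentThread().interrupt();
                throw new IllegalStateException("supervisor interrupted", e);
            }
        }
        throw new IllegalStateException("child failed " + maxAttempts + " times");
    }
}
```

A caller passes a factory rather than a single task so that every retry starts from a clean child. Note that rerunning a transaction is only safe when the work is idempotent or the failed attempt left no partial effects, which this sketch does not enforce.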
This has been a quick tour of some of the reasons why you might want to consider the actor paradigm for your Java project. As with all projects, the unique needs and vulnerabilities of the undertaking have to be recognized. Finally, there are no magic technologies.
In the next post of the “Actors in Java” series we will take a break from theory and have some hands-on fun with actual actor code. See “Actors in Java: The Fortune Cookie Application” coming in about a week.