
Introduction to GrizzlyMemcached

Posted by carryel on March 2, 2012 at 12:55 AM PST

Caches are often used to speed up applications by reducing database load.
Memcached is the best-known in-memory key-value store (cache). To use Memcached you need a client, and many clients already exist, including several written in Java.
Although there are already good Memcached clients whose operations have been optimized over a long time, I would like to introduce GrizzlyMemcached, which is based on the Grizzly framework and is therefore very scalable and offers high performance.

Main features

Improved support for bulk operations such as getMulti and setMulti, in addition to the basic Memcached operations

  • Using high performance connection pool
  • Using Grizzly Framework for I/O
  • Using only Memcached binary protocol
  • Supporting setMulti, deleteMulti, getsMulti and casMulti as well as getMulti

Supporting failover/failback of Memcached

  • Using consistent hashing
  • Allowing the Memcached server list to change dynamically
  • Providing an option for enabling/disabling failover/failback

Automatically synchronizing many clients to prevent stale cache data when Memcached servers fail, are removed or are added dynamically

  • Using ZooKeeper
  • Using a barrier to synchronize the Memcached server list

Considerations

I/O Model

It is very important that clients, as well as servers, have a stable and robust I/O base.
The Grizzly NIO framework offers high performance, scalability and stability, and it can be integrated into various modules easily.
So GrizzlyMemcached uses the Grizzly NIO framework for sending, receiving and parsing the packets of Memcached's binary protocol.
The Grizzly NIO framework also provides several I/O strategies.
I chose the same-thread IOStrategy as the default, and it showed good results in my benchmark because GrizzlyMemcached is a client, not a server (you can change the strategy in the configuration if you need to).

Connection Model

Some Memcached clients, such as SpyMemcached and XMemcached, use only one connection for the requests of multiple threads.
If multiple threads share one connection, the client can use the request queue to merge a run of consecutive single get/set operations into one bulk operation such as getMulti, because a bulk operation is much faster and more efficient than many single operations.
However, a single connection can also limit scalability if many or large requests from many threads are queued concurrently.
So other Memcached clients, such as JavaMemcached, use many connections (a connection per thread) and a connection pool.
This is a trade-off: the multi-connection model is more scalable but less efficient than the single-connection model.

In the end I chose the connection-per-thread model because our company (Kakao) has already experienced overload on a single connection. In most of those cases, hundreds of threads were requesting many different kinds of keys simultaneously.
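
The connection-per-thread model with a pool can be sketched roughly as follows. This is only an illustration under my own assumptions, not GrizzlyMemcached's actual pool: the class name SimpleConnectionPool, its methods and the generic connection type C are all hypothetical. Each thread borrows its own connection, uses it exclusively, and gives it back; the pool creates connections lazily up to a fixed maximum.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical bounded pool: one borrowed connection per working thread. */
final class SimpleConnectionPool<C> {
    private final BlockingQueue<C> idle;      // connections not in use right now
    private final Supplier<C> factory;        // opens a new connection (assumed)
    private final int maxSize;
    private final AtomicInteger created = new AtomicInteger();

    SimpleConnectionPool(int maxSize, Supplier<C> factory) {
        this.idle = new ArrayBlockingQueue<>(maxSize);
        this.factory = factory;
        this.maxSize = maxSize;
    }

    /** Returns an idle connection, creates one if allowed, or waits up to the timeout. */
    C borrow(long timeout, TimeUnit unit) throws InterruptedException, TimeoutException {
        C conn = idle.poll();
        if (conn != null) {
            return conn;
        }
        if (reserveSlot()) {
            try {
                return factory.get();
            } catch (RuntimeException e) {
                created.decrementAndGet(); // release the slot if creation failed
                throw e;
            }
        }
        conn = idle.poll(timeout, unit);
        if (conn == null) {
            throw new TimeoutException("no idle connection within timeout");
        }
        return conn;
    }

    /** Hands a borrowed connection back so another thread can use it. */
    void giveBack(C conn) {
        idle.offer(conn);
    }

    int createdCount() {
        return created.get();
    }

    // Atomically claims the right to create one more connection, if below the limit.
    private boolean reserveSlot() {
        for (;;) {
            int n = created.get();
            if (n >= maxSize) {
                return false;
            }
            if (created.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }
}
```

Because no two threads ever share a borrowed connection, a slow or large request only delays its own thread; the price is that consecutive single operations from different threads can no longer be merged into one bulk request.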

Stale cache data

Memcached servers can sometimes fail, be added or be removed, and some Memcached clients can experience temporary network failures.
Of course, clients use a consistent hashing algorithm to choose a Memcached server, which minimizes the side effects of such changes: if a specific server fails, only the keys of the failed server are redistributed to the living servers.
But is the consistent hashing algorithm enough?
If you are using many clients with Memcached, you can't avoid the stale cache data issue. If you need to add Memcached servers in a production environment, all Memcached clients should switch to the same server list at the same time in order to minimize stale data.
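
To make the consistent hashing property above concrete, here is a minimal sketch of a hash ring. It is not GrizzlyMemcached's implementation: the class name HashRing, the MD5-based hash and the number of virtual nodes per server are my own assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative consistent-hash ring (hypothetical, not the library's own class). */
final class HashRing {
    private static final int VIRTUAL_NODES = 160; // assumed number of ring points per server
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    void removeServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            // Remove the point only if it still belongs to this server.
            ring.remove(hash(server + "#" + i), server);
        }
    }

    /** Returns the owner of the key: the first ring point at or after the key's hash. */
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        // Wrap around to the first point when the hash is past the last one.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Uses the first 8 bytes of the MD5 digest as a 64-bit position on the ring.
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With such a ring, adding server C only moves keys to C, and removing a server only moves the keys that server owned. Two clients that hold different server lists, however, can still map the same key to different servers, which is exactly the stale-data window discussed here.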

Assume that A and B are Memcached servers and that there are hundreds of Memcached clients which know only A and B.
If a new Memcached server C joins the existing configuration, then while the new configuration is being applied some clients know A, B and C but others still know only A and B. During that window the two groups can map the same key to different servers, so one group may read a value that the other has already updated elsewhere.
To prevent this issue, I chose central configuration with ZooKeeper.
When the central configuration changes, all GrizzlyMemcached clients detect and receive it (phase 1, the prepare stage). If all GrizzlyMemcached clients receive it successfully, it is applied simultaneously at a specific system time (phase 2, the commit stage).
(I assume that all clients' system clocks are synchronized.)
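
The client-side half of this prepare/commit idea can be sketched as below. This is a simplified illustration under my own assumptions: the class ScheduledServerList and its methods are hypothetical, the ZooKeeper notification and the check that every client has received the new list are left out, and only the "apply at an agreed wall-clock time" step is shown.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder that switches its server list at an agreed commit time. */
final class ScheduledServerList {
    private final AtomicReference<Set<String>> current;
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "server-list-commit");
                t.setDaemon(true); // do not keep the JVM alive just for the timer
                return t;
            });

    ScheduledServerList(Set<String> initial) {
        current = new AtomicReference<>(copyOf(initial));
    }

    /** The server list that requests should use right now. */
    Set<String> current() {
        return current.get();
    }

    /**
     * Phase 1 (prepare): remember the new list without using it yet.
     * Phase 2 (commit): swap it in when the local clock reaches commitAtMillis.
     */
    ScheduledFuture<?> prepare(Set<String> newServers, long commitAtMillis) {
        final Set<String> pending = copyOf(newServers);
        long delay = Math.max(0L, commitAtMillis - System.currentTimeMillis());
        return timer.schedule(() -> current.set(pending), delay, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        timer.shutdownNow();
    }

    private static Set<String> copyOf(Set<String> servers) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(servers));
    }
}
```

If every client receives the same list and the same commit timestamp, and the clocks are synchronized, all clients switch lists at nearly the same moment, which keeps the window of disagreement small.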

Benchmark

Test Information

  • Memcached and client machines
    • CPU: Intel Xeon 3.3G, 8 Processors
    • Memory: 16G
    • OS: Linux CentOS
    • JDK: 1.6
    • Network: 1Gbit
  • Server/Clients versions
    • Memcached(v1.4.13)
    • GrizzlyMemcached, SpyMemcached(v2.7.3), JavaMemcached(v2.6.0) and XMemcached(v1.3.5)

Scenario

  • packets
    • 32, 64, 128, 256 and 512 bytes
  • operations
    • get, set, getMulti and setMulti (supported only by GrizzlyMemcached)
  • threads
    • 1, 50, 100, 200 and 400
  • Etc
    • 200 keys per multi operation, 200 loops per thread

Result

You can see the benchmark code and results here

Examples of Use

Simple use case

// creates a singleton CacheManager
final GrizzlyMemcachedCacheManager manager = new GrizzlyMemcachedCacheManager.Builder().build();

// gets the cache builder
final GrizzlyMemcachedCache.Builder<String, String> builder = manager.createCacheBuilder("user");
// initializes Memcached's list
builder.servers(initialServerSet);
// creates the cache
final MemcachedCache<String, String> userCache = builder.build();

// if you need to add more Memcached
//userCache.addServer(ADDITIONAL_MEMCACHED_ADDRESS);

// cache operation
final boolean result = userCache.set("name", "foo", expirationTimeoutInSec, false);
final String value = userCache.get("name", false);
//...

// clean
manager.removeCache("user");
manager.shutdown();

ZooKeeper use case

// gets the cache manager builder
final GrizzlyMemcachedCacheManager.Builder managerBuilder = new GrizzlyMemcachedCacheManager.Builder();

// setup zookeeper server
final ZooKeeperConfig zkConfig = ZooKeeperConfig.create("cache-manager", DEFAULT_ZOOKEEPER_ADDRESS);
zkConfig.setRootPath(ROOT);
zkConfig.setConnectTimeoutInMillis(3000);
zkConfig.setSessionTimeoutInMillis(30000);
zkConfig.setCommitDelayTimeInSecs(60);
managerBuilder.zooKeeperConfig(zkConfig);

// create a cache manager
final GrizzlyMemcachedCacheManager manager = managerBuilder.build();
final GrizzlyMemcachedCache.Builder<String, String> cacheBuilder = manager.createCacheBuilder("user");
// setup memcached servers
final Set<SocketAddress> memcachedServers = new HashSet<SocketAddress>();
memcachedServers.add(MEMCACHED_ADDRESS1);
memcachedServers.add(MEMCACHED_ADDRESS2);
cacheBuilder.servers(memcachedServers);

// create a user cache
final GrizzlyMemcachedCache<String, String> cache = cacheBuilder.build();

// ZooKeeperSupportCache's basic operations
if (cache.isZooKeeperSupported()) {
   final String serverListPath = cache.getZooKeeperServerListPath();
   final String serverList = cache.getCurrentServerListFromZooKeeper();
   cache.setCurrentServerListOfZooKeeper("localhost:11211,localhost:11212");
}
// ...

// clean
manager.removeCache("user");
manager.shutdown();

You can also find more GrizzlyMemcached examples in the unit tests here

pom.xml

<dependency>
    <groupId>org.glassfish.grizzly</groupId>
    <artifactId>grizzly-memcached</artifactId>
    <version>1.0</version>
</dependency>

GrizzlyMemcached v1.0 was released on 2012/03/21. It lives in a separate repository from the Grizzly project.

Here are the sources and git information.

http://java.net/projects/grizzly/sources/memcached/show

git://java.net/grizzly~memcached (read-only)

Just check out the sources and try it.
Any feedback, questions, thoughts and opinions are welcome!

Grizzly mailing lists: users@grizzly.java.net or dev@grizzly.java.net